In biochemistry, graph structures are widely used to model compounds, proteins, functional interactions, and other entities. Graph classification, the common task of assigning such graphs to categories, relies heavily on the quality of the graph representations. With advances in graph neural networks, message-passing methods have been adopted to iteratively aggregate neighborhood information and produce better graph representations. These methods, though powerful, still have shortcomings. First, pooling-based methods in graph neural networks may ignore the part-whole hierarchies that naturally exist in graph structures, even though these part-whole relationships are valuable for many molecular function prediction tasks. Second, most existing methods do not account for the heterogeneity embedded in graph representations; disentangling this heterogeneity can improve both the performance and the interpretability of models. This paper proposes a graph capsule network for graph classification that automatically learns disentangled feature representations. The method decomposes heterogeneous representations into more fine-grained elements and, at the same time, captures part-whole relationships using capsules. Extensive experiments on several publicly available biochemistry datasets demonstrate the effectiveness of the proposed method compared with nine state-of-the-art graph learning methods.