With the widespread development of the Internet, multi-view text documents have become increasingly common, prompting extensive research on multi-view document modeling. Unlike traditional single-view document modeling, which treats each document independently and represents it with a single topic representation, the views of a multi-view text document exhibit complex correlations that carry both global and local underlying topical information. In this study, we introduce a deep generative model for multi-view document modeling, the Hierarchical Variational Auto-Encoder (HVAE), which combines the strengths of probabilistic generative models for learning interpretable latent information with those of deep neural networks for efficient parameter inference. Specifically, a set of hierarchical topic representations is learned for each multi-view document, capturing both document-level global topical information and view-level local topical information for each view. A two-level hierarchical topic inference network, designed with an aligned variational auto-encoder, serves as the encoder of HVAE and learns these hierarchical topic representations. Multi-view documents are then generated through a two-layer generation network that conditions on both the view-level local and document-level global topic representations. Experiments on three real-world datasets of different scales and across various tasks demonstrate the effectiveness of the proposed method.
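The abstract does not specify the exact network architecture, so the following is only a minimal numpy sketch of the general two-level idea it describes: per-view encoders infer local topic representations, a document-level encoder aggregates them into a global topic representation, and each view is reconstructed from the concatenation of the global and its local representation. All layer sizes, the aggregation-by-concatenation choice, and the helper names (`dense`, `affine`, `encode_decode`) are illustrative assumptions, not the authors' design.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(in_dim, out_dim):
    # Hypothetical helper: a randomly initialized affine layer (W, b).
    return rng.normal(0.0, 0.1, (in_dim, out_dim)), np.zeros(out_dim)

def affine(x, layer):
    W, b = layer
    return x @ W + b

def reparameterize(mu, logvar):
    # Standard VAE reparameterization trick: z = mu + sigma * eps.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

# Toy dimensions (all assumed): a document with V views, each view a
# bag-of-words vector over a vocabulary of size D.
V, D, H, K = 3, 50, 16, 8   # views, vocab size, hidden dim, topic dim

# View-level encoders: one per view, each producing a local topic posterior
# (a hidden layer plus mean and log-variance heads).
view_enc = [(dense(D, H), dense(H, K), dense(H, K)) for _ in range(V)]
# Document-level encoder: aggregates all view latents into a global posterior.
doc_enc = (dense(V * K, H), dense(H, K), dense(H, K))
# Per-view decoders: reconstruct each view from [global z; local z_v].
view_dec = [dense(2 * K, D) for _ in range(V)]

def encode_decode(views):
    """One forward pass of the two-level sketch. `views` is a list of V
    bag-of-words vectors; returns a reconstructed distribution per view."""
    # 1. View-level (local) topic representations.
    local_z = []
    for x, (h_l, mu_l, lv_l) in zip(views, view_enc):
        h = np.tanh(affine(x, h_l))
        local_z.append(reparameterize(affine(h, mu_l), affine(h, lv_l)))
    # 2. Document-level (global) topic representation from all views.
    h_g, mu_g, lv_g = doc_enc
    h = np.tanh(affine(np.concatenate(local_z), h_g))
    z_doc = reparameterize(affine(h, mu_g), affine(h, lv_g))
    # 3. Generate each view from both the global and its local topics.
    recons = []
    for z_v, dec in zip(local_z, view_dec):
        logits = affine(np.concatenate([z_doc, z_v]), dec)
        recons.append(np.exp(logits) / np.exp(logits).sum())  # softmax over vocab
    return recons

views = [rng.poisson(1.0, D).astype(float) for _ in range(V)]
recons = encode_decode(views)
```

A real implementation would train all layers jointly with a reconstruction term per view plus KL regularizers on both the local and global posteriors; this sketch only illustrates how the two levels of latent variables fit together.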