1,112 publications found
Sort by
Robust Collaborative Filtering to Popularity Distribution Shift

In leading collaborative filtering (CF) models, representations of users and items are prone to learn popularity bias in the training data as shortcuts. The popularity shortcut tricks are good for in-distribution (ID) performance but poorly generalized to out-of-distribution (OOD) data, i.e., when popularity distribution of test data shifts w.r.t. the training one. To close the gap, debiasing strategies try to assess the shortcut degrees and mitigate them from the representations. However, there exist two deficiencies: (1) when measuring the shortcut degrees, most strategies only use statistical metrics on a single aspect ( i.e., item frequency on item and user frequency on user aspect), failing to accommodate the compositional degree of a user-item pair; (2) when mitigating shortcuts, many strategies assume that the test distribution is known in advance. This results in low-quality debiased representations. Worse still, these strategies achieve OOD generalizability with a sacrifice on ID performance. In this work, we present a simple yet effective debiasing strategy, PopGo , which quantifies and reduces the interaction-wise popularity shortcut without any assumptions on the test data. It first learns a shortcut model, which yields a shortcut degree of a user-item pair based on their popularity representations. Then, it trains the CF model by adjusting the predictions with the interaction-wise shortcut degrees. By taking both causal- and information-theoretical looks at PopGo, we can justify why it encourages the CF model to capture the critical popularity-agnostic features while leaving the spurious popularity-relevant patterns out. We use PopGo to debias two high-performing CF models (MF [ 28 ], LightGCN [ 19 ]) on four benchmark datasets. On both ID and OOD test sets, PopGo achieves significant gains over the state-of-the-art debiasing strategies ( e.g., DICE [ 71 ], MACR [ 58 ]). Codes and datasets are available at https://github.com/anzhang314/PopGo.

Open Access
Relevant
Multimodal Dialog Systems with Dual Knowledge-enhanced Generative Pretrained Language Model

Text response generation for multimodal task-oriented dialog systems, which aims to generate the proper text response given the multimodal context, is an essential yet challenging task. Although existing efforts have achieved compelling success, they still suffer from two pivotal limitations: 1) overlook the benefit of generative pre-training , and 2) ignore the textual context related knowledge . To address these limitations, we propose a novel dual knowledge-enhanced generative pretrained language model for multimodal task-oriented dialog systems (DKMD), consisting of three key components: dual knowledge selection , dual knowledge-enhanced context learning , and knowledge-enhanced response generation . To be specific, the dual knowledge selection component aims to select the related knowledge according to both textual and visual modalities of the given context. Thereafter, the dual knowledge-enhanced context learning component targets seamlessly integrating the selected knowledge into the multimodal context learning from both global and local perspectives, where the cross-modal semantic relation is also explored. Moreover, the knowledge-enhanced response generation component comprises a revised BART decoder, where an additional dot-product knowledge-decoder attention sub-layer is introduced for explicitly utilizing the knowledge to advance the text response generation. Extensive experiments on a public dataset verify the superiority of the proposed DKMD over state-of-the-art competitors.

Open Access
Relevant
Spatio-Temporal Contrastive Learning Enhanced GNNs for Session-based Recommendation

Session-based recommendation (SBR) systems aim to utilize the user’s short-term behavior sequence to predict the next item without the detailed user profile. Most recent works try to model the user preference by treating the sessions as between-item transition graphs and utilize various graph neural networks (GNNs) to encode the representations of pair-wise relations among items and their neighbors. Some of the existing GNN-based models mainly focus on aggregating information from the view of spatial graph structure, which ignores the temporal relations within neighbors of an item during message passing and the information loss results in a sub-optimal problem. Other works embrace this challenge by incorporating additional temporal information but lack sufficient interaction between the spatial and temporal patterns. To address this issue, inspired by the uniformity and alignment properties of contrastive learning techniques, we propose a novel framework called Session-based RE commendation with S patio- T emporal C ontrastive Learning Enhanced GNNs ( RESTC ). The idea is to supplement the GNN-based main supervised recommendation task with the temporal representation via an auxiliary cross-view contrastive learning mechanism. Furthermore, a novel global collaborative filtering graph (CFG) embedding is leveraged to enhance the spatial view in the main task. Extensive experiments demonstrate the significant performance of RESTC compared with the state-of-the-art baselines. We release our source code at https://github.com/SUSTechBruce/RESTC-Source-code.

Open Access
Relevant
SetRank: A Setwise Bayesian Approach for Collaborative Ranking in Recommender System

The recent development of recommender systems has a focus on collaborative ranking, which provides users with a sorted list rather than rating prediction. The sorted item lists can more directly reflect the preferences for users and usually perform better than rating prediction in practice. While considerable efforts have been made in this direction, the well-known pairwise and listwise approaches have still been limited by various challenges. Specifically, for the pairwise approaches, the assumption of independent pairwise preference is not always held in practice. Also, the listwise approaches cannot efficiently accommodate “ties” and unobserved data due to the precondition of the entire list permutation. To this end, in this paper, we propose a novel setwise Bayesian approach for collaborative ranking, namely SetRank, to inherently accommodate the characteristics of user feedback in recommender systems. SetRank aims to maximize the posterior probability of novel setwise preference structures and three implementations for SetRank are presented. We also theoretically prove that the bound of excess risk in SetRank can be proportional to \(\sqrt {M/N} \) , where M and N are the numbers of items and users, respectively. Finally, extensive experiments on four real-world datasets clearly validate the superiority of SetRank compared with various state-of-the-art baselines.

Open Access
Relevant
Syntactic-Informed Graph Networks for Sentence Matching

Matching two natural language sentences is a fundamental problem in both natural language processing and information retrieval. Preliminary studies have shown that the syntactic structures help improve the matching accuracy, and different syntactic structures in natural language are complementary to sentence semantic understanding. Ideally, a matching model would leverage all syntactic information. Existing models, however, are only able to combine limited (usually one) types of syntactic information due to the complex and heterogeneous nature of the syntactic information. To deal with the problem, we propose a novel matching model, which formulates sentence matching as a representation learning task on a syntactic-informed heterogeneous graph. The model, referred to as SIGN (Syntactic-Informed Graph Network), first constructs a heterogeneous matching graph based on the multiple syntactic structures of two input sentences. Then the graph attention network algorithm is applied to the matching graph to learn the high-level representations of the nodes. With the help of the graph learning framework, the multiple syntactic structures, as well as the word semantics, can be represented and interacted in the matching graph and therefore collectively enhance the matching accuracy. We conducted comprehensive experiments on three public datasets. The results demonstrate that SIGN outperforms the state of the art and also can discriminate the sentences in an interpretable way.

Open Access
Relevant
Hierarchical Transformer with Spatio-temporal Context Aggregation for Next Point-of-interest Recommendation

Next point-of-interest (POI) recommendation is a critical task in location-based social networks, yet remains challenging due to a high degree of variation and personalization exhibited in user movements. In this work, we explore the latent hierarchical structure composed of multi-granularity short-term structural patterns in user check-in sequences. We propose a Spatio-Temporal context AggRegated Hierarchical Transformer (STAR-HiT) for next POI recommendation, which employs stacked hierarchical encoders to recursively encode the spatio-temporal context and explicitly locate subsequences of different granularities. More specifically, in each encoder, the global attention layer captures the spatio-temporal context of the sequence, while the local attention layer performed within each subsequence enhances subsequence modeling using the local context. The sequence partition layer infers positions and lengths of subsequences from the global context adaptively, such that semantics in subsequences can be well preserved. Finally, the subsequence aggregation layer fuses representations within each subsequence to form the corresponding subsequence representation, thereby generating a new sequence of higher-level granularity. The stacking of hierarchical encoders captures the latent hierarchical structure of the check-in sequence, which is used to predict the next visiting POI. Extensive experiments on three public datasets demonstrate that the proposed model achieves superior performance while providing explanations for recommendations.

Relevant
Contextualized Knowledge Graph Embedding for Explainable Talent Training Course Recommendation

Learning and development, or L&D, plays an important role in talent management, which aims to improve the knowledge and capabilities of employees through a variety of performance-oriented training activities. Recently, with the rapid development of enterprise management information systems, many research efforts and industrial practices have been devoted to building personalized employee training course recommender systems. Nevertheless, a widespread challenge is how to provide explainable recommendations with the consideration of different learning motivations from talents. To this end, we propose CKGE, a contextualized knowledge graph (KG) embedding approach for developing an explainable training course recommender system. A novel perspective of CKGE is to integrate both the contextualized neighbor semantics and high-order connections as motivation-aware information for learning effective representations of talents and courses. Specifically, in CKGE, for each entity pair (i.e., the talent-course pair), we first construct a meta-graph, including the neighbors of each entity and the meta-paths between entities as motivation-aware information. Then, we develop a novel KG-based Transformer, which can serialize entities and paths in the meta-graph as a sequential input, with the specially designed relational attention and structural encoding mechanisms to better model the global dependence of KG structured data. Meanwhile, the local path mask prediction can effectively reveal the importance of different paths. As a result, CKGE not only can make precise predictions but also can discriminate the saliencies of meta-paths in characterizing corresponding preferences. Extensive experiments on real-world and public datasets clearly validate the effectiveness and interpretability of CKGE compared with state-of-the-art baselines.

Relevant
Towards Efficient Coarse-grained Dialogue Response Selection

Coarse-grained response selection is a fundamental and essential subsystem for the widely used retrieval-based chatbots, aiming to recall a coarse-grained candidate set from a large-scale dataset. The dense retrieval technique has recently been proven very effective in building such a subsystem. However, dialogue dense retrieval models face two problems in real scenarios: (1) the multi-turn dialogue history is re-computed in each turn, leading to inefficient inference; (2) the index storage of the offline index is enormous, significantly increasing the deployment cost. To address these problems, we propose an efficient coarse-grained response selection subsystem consisting of two novel methods. Specifically, to address the first problem, we propose the H ierarchical D ense R etrieval. It caches rich multi-vector representations of the dialogue history and only encodes the latest user’s utterance, leading to better inference efficiency. Then, to address the second problem, we design the D eep S emantic H ashing to reduce the index storage while effectively saving its recall accuracy notably. Extensive experimental results prove the advantages of the two proposed methods over previous works. Specifically, with the limited performance loss, our proposed coarse-grained response selection model achieves over 5x FLOPs speedup and over 192x storage compression ratio. Moreover, our source codes have been publicly released. 1

Relevant
PARADE: Passage Representation Aggregation forDocument Reranking

Pre-trained transformer models, such as BERT and T5, have shown to be highly effective at ad hoc passage and document ranking. Due to the inherent sequence length limits of these models, they need to process document passages one at a time rather than processing the entire document sequence at once. Although several approaches for aggregating passage-level signals into a document-level relevance score have been proposed, there has yet to be an extensive comparison of these techniques. In this work, we explore strategies for aggregating relevance signals from a document’s passages into a final ranking score. We find that passage representation aggregation techniques can significantly improve over score aggregation techniques proposed in prior work, such as taking the maximum passage score. We call this new approach PARADE. In particular, PARADE can significantly improve results on collections with broad information needs where relevance signals can be spread throughout the document (such as TREC Robust04 and GOV2). Meanwhile, less complex aggregation techniques may work better on collections with an information need that can often be pinpointed to a single passage (such as TREC DL and TREC Genomics). We also conduct efficiency analyses and highlight several strategies for improving transformer-based aggregation.

Open Access
Relevant