Summarizing ‘Enhancing Social Recommendation with Multi-View BERT Network (MVBN)’

Tushar Prakash, Raksha Jalan, Onoe Naoyuki

30th September 2024

Fig: Overview of the proposed framework

Tushar Prakash summarises the paper ‘Enhancing Social Recommendation with Multi-View BERT Network’, co-authored with Raksha Jalan and Naoyuki Onoe. The paper was accepted at the 23rd IEEE International Conference on Data Mining (ICDM).

Introduction

E-commerce and online entertainment platforms increasingly rely on recommendation systems to filter out irrelevant content and offer personalized recommendations. With features like Disney+’s “GroupWatch” and Amazon Prime Video’s “Watch Party,” there is a growing focus on incorporating social relationships into these models. However, the complexity of social dynamics makes it challenging to model user preferences accurately. Traditional approaches, such as Matrix Factorization and deep learning models, attempt to incorporate social relations but often assume that connected users share interests, overlooking the diverse factors that influence preferences. These models also tend to focus on one-sided interactions, so they fail to capture a comprehensive, context-aware picture of the user.

To address these issues, we propose MVBN (Multi-View BERT Network), which has the following key components:

Neighbourhood Sampling: Selects influential neighbours based on social behaviour and interactions to enhance user and item embeddings.
Sequence Masking: Builds sequences of the items a user has interacted with and masks some of them to form a prediction task. Together with sampling, this refines the embeddings so that they capture correlations.
Item and User Views: The model combines an item view (the user’s interaction history and item similarity) with a user view (social connections, user-user similarities, and network data) to generate embeddings that reflect hidden preferences, enabling a more context-aware recommendation system.
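The first two components can be illustrated with a minimal Python sketch. It is not the authors' implementation: the overlap-based neighbour score, the `[MASK]` token, the masking probability, and the function names `sample_neighbours` and `mask_sequence` are all assumptions made for illustration, since this summary does not specify how influence is scored or how items are chosen for masking.

```python
import random

MASK = "[MASK]"  # hypothetical mask token, not taken from the paper


def sample_neighbours(user, friends, interactions, k=2):
    """Return the k friends who share the most interacted items with `user`.

    Item overlap is an assumed stand-in for "influence".
    """
    own = set(interactions.get(user, []))
    scored = [
        (len(own & set(interactions.get(friend, []))), friend)
        for friend in friends.get(user, [])
    ]
    # Highest overlap first; ties broken by name so the result is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [friend for _, friend in scored[:k]]


def mask_sequence(items, mask_prob=0.2, rng=None):
    """Randomly replace items with MASK and return (masked, labels).

    labels[i] holds the original item at masked positions and None elsewhere.
    """
    if rng is None:
        rng = random.Random(0)
    masked, labels = [], []
    for item in items:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(item)
        else:
            masked.append(item)
            labels.append(None)
    # Guarantee at least one prediction target by masking the last item.
    if items and all(label is None for label in labels):
        masked[-1] = MASK
        labels[-1] = items[-1]
    return masked, labels


if __name__ == "__main__":
    interactions = {"u1": ["a", "b", "c"], "u2": ["a", "b"], "u3": ["c"], "u4": ["x"]}
    friends = {"u1": ["u2", "u3", "u4"]}
    print(sample_neighbours("u1", friends, interactions, k=2))
    print(mask_sequence(["i1", "i2", "i3", "i4", "i5"], mask_prob=0.3))
```

The masked sequence is what a BERT-style encoder would see as input, and the labels are the targets it is trained to recover from context on both sides of each masked position.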

By integrating these item and user views, the model provides a richer understanding of both user preferences and social influences. A shared BERT network processes masked sequences through Transformer layers, and multi-task learning is employed to predict masked items while refining user embeddings. This setup improves both the efficiency and overall performance of the recommendation system.
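As a rough illustration of the multi-task setup, the sketch below combines two losses into one training objective. It rests on several assumptions: that the second task is a masked prediction over the user-view sequence, that both tasks use softmax cross-entropy, and that they are combined with a single weight (`weight`, a hypothetical parameter). This summary does not give the paper's exact objectives or weighting.

```python
import math


def cross_entropy(logits, target_index):
    """Negative log-likelihood of the target under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[target_index]


def mean_loss(logit_rows, targets):
    """Average cross-entropy over the masked positions of one view."""
    losses = [cross_entropy(row, t) for row, t in zip(logit_rows, targets)]
    return sum(losses) / len(losses)


def multitask_loss(item_logits, item_targets, user_logits, user_targets, weight=0.5):
    """Masked-item loss plus a weighted user-view loss (assumed form)."""
    item_loss = mean_loss(item_logits, item_targets)
    user_loss = mean_loss(user_logits, user_targets)
    return item_loss + weight * user_loss


if __name__ == "__main__":
    # One masked position per view; scores over 3 candidate items / 2 candidate users.
    item_logits = [[2.0, 0.5, -1.0]]
    user_logits = [[0.1, 1.2]]
    print(multitask_loss(item_logits, [0], user_logits, [1]))
```

Because both views pass through the same encoder, the gradients of both terms update one shared set of parameters, which is the sense in which the tasks are learned jointly.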

Key Results


Tables 3 and 4 of the paper show that MVBN outperforms the baselines on all three datasets, underscoring the effectiveness of combining our embedding layer with BERT and multi-task learning to process user-item interactions and social relations. From these tables, we note:

Methods that use social information tend to outperform those that do not, suggesting that social information enhances recommendation performance.
MVBN’s substantial improvement over existing deep learning-based Social Recommendation models is credited to the custom embedding layer, BERT, and the sequence header, which together capture users’ dynamic interests. BERT’s self-attention mechanism, multi-task learning, and the use of bidirectional context in user-item interactions all play significant roles in this improvement.
MVBN also outperforms sequential recommendation baselines such as BERT4Rec and SASRec, which we attribute to our embedding layer and multi-task learning. This validates our choice of the BERT network.


Conclusion

We introduced the Multi-View BERT Network (MVBN) for Social Recommendation. MVBN uses neighbourhood sampling and sequence headers with BERT to better represent user-item interactions. It considers bidirectional context when making predictions and improves performance by incorporating users’ social links through multi-task learning. In our experiments, the model consistently outperforms the state-of-the-art social recommendation baselines. Future work will explore session-based interactions and social links with transformers.

To know more about Sony Research India’s Research Publications, visit the ‘Publications’ section on our ‘Open Innovation’ page: Open Innovation with Sony R&D – Sony Research India
