30th September 2024
E-commerce and online entertainment platforms increasingly rely on recommendation systems to filter irrelevant content and offer personalized recommendations. With features like Disney+’s “GroupWatch” and Amazon Prime Video’s “Watch Party,” there is a growing focus on incorporating social relationships into these models. However, the complexity of social dynamics makes it challenging to model user preferences accurately. Traditional approaches, such as Matrix Factorization and deep learning models, attempt to incorporate social relations but often assume that connected users share interests, overlooking the diverse factors that influence preferences. These models also tend to focus on one-sided interactions, failing to capture a comprehensive, context-aware understanding.
By integrating these item and user views, the model provides a richer understanding of both user preferences and social influences. A shared BERT network processes masked sequences through Transformer layers, and multi-task learning is employed to predict masked items while refining user embeddings. This setup improves both the efficiency and overall performance of the recommendation system.
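To make the masked-sequence idea concrete, here is a minimal sketch in plain Python of the two ingredients described above: masking items in a user's interaction sequence so the model can be trained to recover them, and combining the masked-item loss with a second, user-side loss into one multi-task objective. It is an illustration under stated assumptions, not the MVBN implementation. The names `MASK_TOKEN`, `mask_sequence`, `cross_entropy` and `multi_task_loss`, the masking probability, and the weighted-sum form of the combined loss are all hypothetical choices made for this example.

```python
import math
import random

MASK_TOKEN = 0  # hypothetical id reserved for the [MASK] token (assumption)


def mask_sequence(items, mask_prob, rng):
    """Replace each item with MASK_TOKEN with probability mask_prob.

    Returns the masked sequence and a {position: original_item} dict
    holding the prediction targets for the masked positions.
    """
    masked, targets = [], {}
    for pos, item in enumerate(items):
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            targets[pos] = item
        else:
            masked.append(item)
    return masked, targets


def cross_entropy(logits, target_index):
    """Negative log-likelihood of target_index under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[target_index]


def multi_task_loss(item_loss, user_loss, user_weight=0.5):
    """Weighted sum of the masked-item loss and an auxiliary user-side loss.

    The weighted-sum form and the default weight are assumptions.
    """
    return item_loss + user_weight * user_loss


rng = random.Random(0)
sequence = [12, 7, 33, 5, 21, 9]  # made-up item ids for one user
masked, targets = mask_sequence(sequence, mask_prob=0.3, rng=rng)
```

In a full model, the masked sequence would pass through the shared Transformer layers, `cross_entropy` would be evaluated at each position in `targets`, and the resulting item loss would be combined with the user-embedding loss through something like `multi_task_loss`.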
Table: Comparison with state-of-the-art methods on XD-Violence Dataset
The results in Tables 3 and 4 show that MVBN outperforms the baselines on all three datasets, underscoring the effectiveness of our BERT and multi-task learning embedding layer for processing user-item interactions and social relations. From these tables, we note:
Fig: Visual comparison in terms of anomaly score curves on sample video for Violence Detection task. Here, yellow regions are the temporal ground-truths.
To learn more about Sony Research India’s Research Publications, visit the ‘Publications’ section on our ‘Open Innovation’ page: Open Innovation with Sony R&D – Sony Research India