Publications

Iteratively Improving Speech Recognition and Voice Conversion

Authors: Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe
INTERSPEECH 2023 | August 2023

Cd-HRNN: Content-Driven HRNN to Improve Session-Based Recommendation System

Authors: Sonal Dabral, Brijraj Singh and Naoyuki Onoe (Sony Research India Pvt. Ltd.)
IJCNN Main Conference 2023 | April 2023

A Multi-Modal Multi-Task Based Approach for Movie Recommendation

Authors: Sriparna Saha (IIT Patna) and Naoyuki Onoe (Sony Research India Pvt. Ltd.)
IJCNN Main Conference 2023 | April 2023

A Meta-Learning Based Generative Model with Graph Attention Network for Multi-Modal Recommender Systems

Authors: Sriparna Saha (IIT Patna) and Naoyuki Onoe (Sony Research India Pvt. Ltd.)
INNS DLIA Workshop / IJCNN 2023 | April 2023

Task-Specific and Graph Convolutional Network Based Multi-Modal Movie Recommendation System in Indian Setting

Authors: Sriparna Saha (IIT Patna) and Naoyuki Onoe (Sony Research India Pvt. Ltd.)
INNS DLIA Workshop / IJCNN 2023 | April 2023

Revisiting Class Imbalance for End-to-end Semi-Supervised Object Detection

Authors: Purbayan Kar, Vishal Chudasama, Pankaj Wasnik and Naoyuki Onoe (Sony Research India Pvt. Ltd.)
Efficient Deep Learning for Computer Vision (ECV) Workshop in CVPR 2023 | April 2023

Nonparallel Emotional Voice Conversion For Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing

Authors: Nirmesh Shah, Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe
Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS) | February 2023

Hierarchical disentangled representation learning for singing voice conversion

Authors: Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji
Machine Learning (cs.LG); Audio and Speech Processing (eess.AS) | February 2023

Graph Network based Approaches for Multi-modal Movie Recommendation System

Authors: Daipayan Chakder**, Prabir Mondal**, Subham Raj**, Sriparna Saha**, Angshuman Ghosh, Naoyuki Onoe
IEEE International Conference on Systems, Man, and Cybernetics (SMC 2022) | November 2022

Semi-supervised Acoustic and Language Modeling for Hindi ASR

Authors: Tarun Sai Bandarupalli*, Shakti Rath*, Nirmesh Shah, Naoyuki Onoe, Sriram Ganapathy*
INTERSPEECH 2022 | September 2022

Towards Developing a Multi-Modal Video Recommendation System

Authors: Sriram Pingali**, Prabir Mondal**, Daipayan Chakder**, Sriparna Saha**, Angshuman Ghosh
International Joint Conference on Neural Networks (IJCNN 2022) | September 2022

Leveraging Symmetrical Convolutional Transformer Networks for Speech to Singing Voice Style Transfer

Authors: Shrutina Agarwal*, Sriram Ganapathy*, Naoya Takahashi
INTERSPEECH 2022 | September 2022

M2FNet: Multi-modal Fusion Network for Emotion Recognition in Conversation

Authors: Vishal Chudasama, Purbayan Kar, Ashish Gudmalwar, Nirmesh Shah, Pankaj Wasnik, Naoyuki Onoe
Conference on Computer Vision and Pattern Recognition (CVPR 2022) | June 2022

A Unified Model for Fingerprint Authentication and Presentation Attack Detection

Authors: Additya Popli***, Saraansh Tandon***, Joshua J. Engelsma#, Naoyuki Onoe, Atsushi Okubo, Anoop Namboodiri***
International Joint Conference on Biometrics (IJCB 2021) | April 2021

End-to-end lyrics Recognition with Voice to Singing Style Transfer

Authors: Sakya Basak*, Shrutina Agarwal*, Sriram Ganapathy*, Naoya Takahashi
International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2021) | February 2021

*Indian Institute of Science, Bangalore
**Indian Institute of Technology Patna
***International Institute of Information Technology Hyderabad
#Michigan State University