
{"id":7568,"date":"2023-03-29T11:02:39","date_gmt":"2023-03-29T11:02:39","guid":{"rendered":"https:\/\/www.sonyresearchindia.com\/a-man-with-a-plan-vishals-first-business-trip-to-japan-copy-copy-copy\/"},"modified":"2023-11-30T10:39:39","modified_gmt":"2023-11-30T10:39:39","slug":"non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs","status":"publish","type":"post","link":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/","title":{"rendered":"Non-parallel Emotional Voice Conversion for Unseen Speaker-emotion Pairs"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"7568\" class=\"elementor elementor-7568\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-cd44eb5 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"cd44eb5\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-9f11b70\" data-id=\"9f11b70\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-215a70e elementor-widget elementor-widget-heading\" data-id=\"215a70e\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">BLOGS<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-28dc161 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" 
data-id=\"28dc161\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-63cf269\" data-id=\"63cf269\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-6837436 elementor-widget elementor-widget-heading\" data-id=\"6837436\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Non-parallel Emotional Voice Conversion for <br>Unseen\nSpeaker-emotion Pairs<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-9bd1630 elementor-widget elementor-widget-text-editor\" data-id=\"9bd1630\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tBy Nirmesh J. 
Shah, Senior Research Scientist at Sony Research India\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7a034cb elementor-widget elementor-widget-text-editor\" data-id=\"7a034cb\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>29<sup>th<\/sup> March 2023<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-acbeaeb elementor-widget elementor-widget-text-editor\" data-id=\"acbeaeb\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-acbeaeb elementor-widget elementor-widget-text-editor\" data-id=\"acbeaeb\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\"><div class=\"elementor-widget-container\"><div class=\"elementor-text-editor elementor-clearfix\"><p>In this blog post, Nirmesh J. Shah sums up the paper titled <a href=\"https:\/\/arxiv.org\/pdf\/2302.10536.pdf\" target=\"_blank\" rel=\"noopener\">\u201cNon-parallel Emotional Voice Conversion for Unseen Speaker-emotion Pairs\u201d<\/a>, co-authored by Mayank Singh, Naoya Takahashi and Naoyuki Onoe, which has been accepted at the <a href=\"https:\/\/ieeexplore.ieee.org\/xpl\/conhome\/10094559\/proceeding\" target=\"_blank\" rel=\"noopener\">International Conference on Acoustics, Speech, and Signal Processing (ICASSP)<\/a>, hosted on Rhodes Island, Greece, from 4 to 10 June 2023.<\/p><\/div><\/div><\/div><p><strong>Collaborative Background:<\/strong> At Sony Research India, we are given opportunities to work and collaborate with experts across the global Sony Group of Companies. 
As experts in developing speech technologies for Indian languages, we explored the opportunity to forge a close collaboration with Dr. Naoya Takahashi, one of the leading experts in the audio\/speech domain at Sony Group Corporation, Japan, to develop an Emotional Voice Conversion system.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d1cd4f9 elementor-hidden-desktop elementor-hidden-tablet elementor-hidden-mobile elementor-widget elementor-widget-text-editor\" data-id=\"d1cd4f9\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"elementor-element elementor-element-acbeaeb elementor-widget elementor-widget-text-editor\" data-id=\"acbeaeb\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\"><div class=\"elementor-widget-container\"><div class=\"elementor-text-editor elementor-clearfix\"><p>This article presents our work on non-parallel emotional voice conversion. It addresses the problem of converting the emotion of speakers for whom we possess only neutral data at both training and test time (i.e., unseen speaker-emotion combinations), and is based on a research paper accepted at ICASSP 2023, \u201cNonparallel Emotional Voice Conversion for Unseen Speaker-Emotion pairs using Dual Domain Adversarial Network &amp; Virtual Domain Pairing\u201d, co-authored by:<\/p><ul><li>Nirmesh Shah (Sony Research India Pvt. Ltd.)<\/li><li>Mayank Singh (Sony Research India Pvt. Ltd.)<\/li><li>Naoya Takahashi (Sony Group Corporation, Japan)<\/li><li>Naoyuki Onoe (Sony Research India Pvt. Ltd.)<\/li><\/ul><\/div><\/div><\/div>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-9b69060 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"9b69060\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-cfbe302\" data-id=\"cfbe302\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-fa4789b elementor-widget elementor-widget-text-editor\" data-id=\"fa4789b\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<h4>What is an Emotional Voice Conversion System and its applications?<\/h4>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7132bf0 elementor-widget elementor-widget-text-editor\" data-id=\"7132bf0\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>An emotional voice conversion (EVC) system converts the emotion of a given speech signal from one style to another, without modifying the linguistic content of the signal. 
EVC technology has potential applications in movie dubbing, conversational assistance, cross-lingual synthesis, etc.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-85bbfff elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"85bbfff\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-8881599\" data-id=\"8881599\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-3818f26 elementor-widget elementor-widget-text-editor\" data-id=\"3818f26\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<h4>Why is it important to develop EVC for unseen speaker-emotion pairs?<\/h4>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6d045fb elementor-widget elementor-widget-text-editor\" data-id=\"6d045fb\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tMost previous EVC approaches can convert the emotion only of speakers whose emotional data is available at training or test time, i.e., for seen speaker-emotion combinations. However, collecting emotional speech from target speakers is often expensive, time-consuming, and sometimes impossible. 
In this paper, we address the problem of converting the emotion of speakers for whom we possess only neutral speech data, by leveraging emotional speech data from other supporting speakers.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-a8149c3 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"a8149c3\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-80dcf9e\" data-id=\"80dcf9e\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-ec264d8 elementor-widget elementor-widget-text-editor\" data-id=\"ec264d8\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<h4>How have we achieved EVC for unseen speaker-emotion pairs?<\/h4>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1833c2e elementor-widget elementor-widget-text-editor\" data-id=\"1833c2e\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tWe first modify the StarGANv2-VC architecture to convert the speaker and emotion styles simultaneously in a unified model. The modified model uses two encoders to learn speaker-style and emotion-style embeddings, along with dual domain source classifiers that classify the source speaker and the source emotion style. 
We then devise training strategies to achieve EVC for Unseen Speaker-Emotion Pairs (i.e., EVC-USEP) by using emotional data from supporting speakers. To achieve this, we propose a Virtual Domain Pairing (VDP) training strategy, which randomly generates combinations of speaker-emotion pairs that are not present in the real data, without compromising the min-max game between the discriminator and the generator in adversarial training. In particular, we propose a fake-pair masking (FPM) strategy to ensure that the discriminator does not overfit because of the fake pairs. We refer to our proposed system as EVC-USEP throughout the paper.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-a398304 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"a398304\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-33 elementor-top-column elementor-element elementor-element-5813c34\" data-id=\"5813c34\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap\">\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t<div class=\"elementor-column elementor-col-33 elementor-top-column elementor-element elementor-element-0731ac8\" data-id=\"0731ac8\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-44a29ed elementor-widget elementor-widget-image\" data-id=\"44a29ed\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"607\" height=\"250\" 
src=\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/StarGANv2_EVC-2-1-q33wxq0sf6xwobwr00mzdh0nkxdbrkasug3ikdh1q8-1.png\" class=\"attachment-full size-full wp-image-7572\" alt=\"\" srcset=\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/StarGANv2_EVC-2-1-q33wxq0sf6xwobwr00mzdh0nkxdbrkasug3ikdh1q8-1.png 607w, https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/StarGANv2_EVC-2-1-q33wxq0sf6xwobwr00mzdh0nkxdbrkasug3ikdh1q8-1-300x124.png 300w\" sizes=\"(max-width: 607px) 100vw, 607px\" style=\"width:100%;height:41.19%;max-width:607px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-13be7a0 elementor-widget elementor-widget-text-editor\" data-id=\"13be7a0\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tFigure 1: Block diagram of the proposed EVC-USEP architecture.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t<div class=\"elementor-column elementor-col-33 elementor-top-column elementor-element elementor-element-274164b\" data-id=\"274164b\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap\">\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-b669483 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"b669483\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-8e2d1b2\" data-id=\"8e2d1b2\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element 
elementor-element-96de67b elementor-widget elementor-widget-text-editor\" data-id=\"96de67b\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<h4>Key Results and Findings<\/h4>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-20f4684 elementor-widget elementor-widget-text-editor\" data-id=\"20f4684\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>We present our results on a Hindi emotional database. Demo audio samples can be found online.<br \/>URL: <a href=\"https:\/\/demosamplesites.github.io\/EVCUP\/\">https:\/\/demosamplesites.github.io\/EVCUP\/<\/a><br \/><br \/>We conducted two subjective tests: a mean opinion score (MOS) test to evaluate the quality of the converted voices, and an ABX test to evaluate emotion conversion. For objective evaluation, we use an emotion classification network to measure the accuracy of emotion conversion, and we report speaker similarity scores. From both objective and subjective evaluations, we confirm that the proposed method successfully converts the emotion of the target speakers, outperforming the baselines in terms of emotion similarity, speaker similarity, and quality of the converted voices, while achieving decent naturalness.<br \/><br \/>Table 1: Subjective and objective evaluation results. 
MOS values for quality are shown along with the margin of error corresponding to a 95% confidence interval.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-4829ba8 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"4829ba8\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-80dbba7\" data-id=\"80dbba7\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-ce60f53 elementor-widget elementor-widget-image\" data-id=\"ce60f53\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"514\" height=\"189\" data-src=\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/Blog-Imge-SRI3.png\" class=\"attachment-full size-full wp-image-7573 lazyload\" alt=\"\" data-srcset=\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/Blog-Imge-SRI3.png 514w, https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/Blog-Imge-SRI3-300x110.png 300w\" data-sizes=\"(max-width: 514px) 100vw, 514px\" style=\"--smush-placeholder-width: 514px; --smush-placeholder-aspect-ratio: 514\/189;width:100%;height:36.77%;max-width:514px\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-a617587 
elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"a617587\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-43867b5\" data-id=\"43867b5\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-f145342 elementor-widget elementor-widget-image\" data-id=\"f145342\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"700\" height=\"250\" data-src=\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/blog-image-sri4-q2ykntky95dpw0cv1hkbboupj8jfgjtyu7tzg6n8cg.png\" class=\"attachment-full size-full wp-image-7574 lazyload\" alt=\"\" data-srcset=\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/blog-image-sri4-q2ykntky95dpw0cv1hkbboupj8jfgjtyu7tzg6n8cg.png 700w, https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/blog-image-sri4-q2ykntky95dpw0cv1hkbboupj8jfgjtyu7tzg6n8cg-300x107.png 300w\" data-sizes=\"(max-width: 700px) 100vw, 700px\" style=\"--smush-placeholder-width: 700px; --smush-placeholder-aspect-ratio: 700\/250;width:100%;height:35.71%;max-width:700px\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-04f7976 elementor-widget elementor-widget-text-editor\" data-id=\"04f7976\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Figure 2: ABX Subjective Evaluation for Emotion 
Similarity.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-23c10e5 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"23c10e5\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-89a7a11\" data-id=\"89a7a11\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-4473282 elementor-widget elementor-widget-text-editor\" data-id=\"4473282\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>To learn more about Sony Research India\u2019s research publications, visit the \u2018Publications\u2019 section on our \u2018Open Innovation\u2019 page:<\/p><p><a href=\"https:\/\/www.sonyresearchindia.com\/open-innovation\/\">Open Innovation with Sony R&amp;D \u2013 Sony Research India<\/a><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>This article presents our work on non-parallel emotion voice conversion and addresses&#8230;<\/p>\n","protected":false},"author":1,"featured_media":11251,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"elementor_header_footer","format":"standard","meta":{"footnotes":""},"categories":[22,17],"tags":[],"class_list":["post-7568","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-all-blogs","category-technology","entry"],"yoast_head":"\n<title>Non-parallel 
Emotional Voice Conversion for Unseen Speaker-emotion Pairs - Sony Research India<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Non-parallel Emotional Voice Conversion for Unseen Speaker-emotion Pairs - Sony Research India\" \/>\n<meta property=\"og:description\" content=\"This article presents our work on non-parallel emotion voice conversion and addresses...\" \/>\n<meta property=\"og:url\" content=\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/\" \/>\n<meta property=\"og:site_name\" content=\"Sony Research India\" \/>\n<meta property=\"article:published_time\" content=\"2023-03-29T11:02:39+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-11-30T10:39:39+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/unnamed-2.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"380\" \/>\n\t<meta property=\"og:image:height\" content=\"190\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"sri_user@2021\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"sri_user@2021\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/\"},\"author\":{\"name\":\"sri_user@2021\",\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/#\/schema\/person\/589cf1e285a7c37cf0cb9feba7ae4338\"},\"headline\":\"Non-parallel Emotional Voice Conversion for Unseen Speaker-emotion Pairs\",\"datePublished\":\"2023-03-29T11:02:39+00:00\",\"dateModified\":\"2023-11-30T10:39:39+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/\"},\"wordCount\":777,\"publisher\":{\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/#organization\"},\"image\":{\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/unnamed-2.jpg\",\"articleSection\":[\"All Blogs\",\"Technology\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/\",\"url\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/\",\"name\":\"Non-parallel Emotional Voice Conversion for Unseen Speaker-emotion Pairs - Sony Research 
India\",\"isPartOf\":{\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/unnamed-2.jpg\",\"datePublished\":\"2023-03-29T11:02:39+00:00\",\"dateModified\":\"2023-11-30T10:39:39+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#primaryimage\",\"url\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/unnamed-2.jpg\",\"contentUrl\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/unnamed-2.jpg\",\"width\":380,\"height\":190},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Non-parallel Emotional Voice Conversion for Unseen Speaker-emotion 
Pairs\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/#website\",\"url\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/\",\"name\":\"Sony Research India\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/#organization\",\"name\":\"sonyresearchindia\",\"url\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/Sony_Logo.png\",\"contentUrl\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/Sony_Logo.png\",\"width\":168,\"height\":31,\"caption\":\"sonyresearchindia\"},\"image\":{\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/#\/schema\/person\/589cf1e285a7c37cf0cb9feba7ae4338\",\"name\":\"sri_user@2021\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/whiteriversmediasolutions.com\/Sony\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/e0c9edcfb42567c720cc449d4b1e0812298e8172a5a7e4296127a0adba7e705b?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/e0c9edcfb42567c720cc449d4b1e0812298e8172a5a7e4296127a0adba7e705b?s=96&d=mm&r=g\",\"caption\":\"sri_user@2021\"},\"sameAs\":[\"http:\/\/whiteriversmediasolutions.com\/staging\/SRI\"]}]}<\
/script>\n","yoast_head_json":{"title":"Non-parallel Emotional Voice Conversion for Unseen Speaker-emotion Pairs - Sony Research India","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/","og_locale":"en_US","og_type":"article","og_title":"Non-parallel Emotional Voice Conversion for Unseen Speaker-emotion Pairs - Sony Research India","og_description":"This article presents our work on non-parallel emotion voice conversion and addresses...","og_url":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/","og_site_name":"Sony Research India","article_published_time":"2023-03-29T11:02:39+00:00","article_modified_time":"2023-11-30T10:39:39+00:00","og_image":[{"width":380,"height":190,"url":"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/unnamed-2.jpg","type":"image\/jpeg"}],"author":"sri_user@2021","twitter_card":"summary_large_image","twitter_misc":{"Written by":"sri_user@2021","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#article","isPartOf":{"@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/"},"author":{"name":"sri_user@2021","@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/#\/schema\/person\/589cf1e285a7c37cf0cb9feba7ae4338"},"headline":"Non-parallel Emotional Voice Conversion for Unseen Speaker-emotion Pairs","datePublished":"2023-03-29T11:02:39+00:00","dateModified":"2023-11-30T10:39:39+00:00","mainEntityOfPage":{"@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/"},"wordCount":777,"publisher":{"@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/#organization"},"image":{"@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#primaryimage"},"thumbnailUrl":"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/unnamed-2.jpg","articleSection":["All Blogs","Technology"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/","url":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/","name":"Non-parallel Emotional Voice Conversion for Unseen Speaker-emotion Pairs - Sony Research 
India","isPartOf":{"@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/#website"},"primaryImageOfPage":{"@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#primaryimage"},"image":{"@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#primaryimage"},"thumbnailUrl":"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/unnamed-2.jpg","datePublished":"2023-03-29T11:02:39+00:00","dateModified":"2023-11-30T10:39:39+00:00","breadcrumb":{"@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#primaryimage","url":"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/unnamed-2.jpg","contentUrl":"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/unnamed-2.jpg","width":380,"height":190},{"@type":"BreadcrumbList","@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/non-parallel-emotional-voice-conversion-for-unseen-speaker-emotion-pairs\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/whiteriversmediasolutions.com\/Sony\/"},{"@type":"ListItem","position":2,"name":"Non-parallel Emotional Voice Conversion for Unseen Speaker-emotion Pairs"}]},{"@type":"WebSite","@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/#website","url":"https:\/\/whiteriversmediasolutions.com\/Sony\/","name":"Sony Research 
India","description":"","publisher":{"@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/whiteriversmediasolutions.com\/Sony\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/#organization","name":"sonyresearchindia","url":"https:\/\/whiteriversmediasolutions.com\/Sony\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/#\/schema\/logo\/image\/","url":"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/Sony_Logo.png","contentUrl":"https:\/\/whiteriversmediasolutions.com\/Sony\/uvaftoap\/2023\/03\/Sony_Logo.png","width":168,"height":31,"caption":"sonyresearchindia"},"image":{"@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/#\/schema\/person\/589cf1e285a7c37cf0cb9feba7ae4338","name":"sri_user@2021","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/whiteriversmediasolutions.com\/Sony\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/e0c9edcfb42567c720cc449d4b1e0812298e8172a5a7e4296127a0adba7e705b?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/e0c9edcfb42567c720cc449d4b1e0812298e8172a5a7e4296127a0adba7e705b?s=96&d=mm&r=g","caption":"sri_user@2021"},"sameAs":["http:\/\/whiteriversmediasolutions.com\/staging\/SRI"]}]}},"_links":{"self":[{"href":"https:\/\/whiteriversmediasolutions.com\/Sony\/wp-json\/wp\/v2\/posts\/7568","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/whiteriversmediasolutions.com\/Sony\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/whiteriversmediasolutions.com\/Sony\/wp-json\/wp\/v2\/types\/p
ost"}],"author":[{"embeddable":true,"href":"https:\/\/whiteriversmediasolutions.com\/Sony\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/whiteriversmediasolutions.com\/Sony\/wp-json\/wp\/v2\/comments?post=7568"}],"version-history":[{"count":50,"href":"https:\/\/whiteriversmediasolutions.com\/Sony\/wp-json\/wp\/v2\/posts\/7568\/revisions"}],"predecessor-version":[{"id":11318,"href":"https:\/\/whiteriversmediasolutions.com\/Sony\/wp-json\/wp\/v2\/posts\/7568\/revisions\/11318"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/whiteriversmediasolutions.com\/Sony\/wp-json\/wp\/v2\/media\/11251"}],"wp:attachment":[{"href":"https:\/\/whiteriversmediasolutions.com\/Sony\/wp-json\/wp\/v2\/media?parent=7568"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/whiteriversmediasolutions.com\/Sony\/wp-json\/wp\/v2\/categories?post=7568"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/whiteriversmediasolutions.com\/Sony\/wp-json\/wp\/v2\/tags?post=7568"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}