Multidimensional Quality Assessment of Speech-to-Speech Translation:

RecordNumber
1392
CallNo
ENG2 759
Author
Yousefian, Setayesh
English Author
Setayesh Yousefian
FarsiTitle
Multidimensional Quality Assessment of Speech-to-Speech Translation: A Case Study of ChatGPT-4o in Everyday Persian-English Conversations
Title
Multidimensional Quality Assessment of Speech-to-Speech Translation:
Degree
Master's
Date
1404/06/16
Collation
184 p.
Supervisor
Dr. Hossein Barati
Advisor
Zahra Amirian
Persian Descriptors
Artificial intelligence, Speech-to-Speech Translation (S2ST), Translation Quality Assessment (TQA), Multidimensional Quality Metrics (MQM)
English Abstract
This research assessed the quality of real-time Speech-to-Speech Translation (S2ST) by an advanced multimodal large language model, Generative Pre-trained Transformer 4 Omni (ChatGPT-4o), in everyday bilingual Persian-English conversations, using a carefully adapted Multidimensional Quality Metrics (MQM) framework for Translation Quality Assessment (TQA). To collect data, five bilingual dialogues were specially scripted and performed by trained speakers, ensuring a diverse range of conversational challenges that reflect real-world use cases. The S2ST text and audio output were analyzed to identify the types, frequencies, and severities of errors and to assess performance acceptability using the MQM scoring model. A qualitative analysis was then conducted to understand the nature of the errors and the potential factors contributing to them. The findings revealed 178 errors distributed across nine dimensions and their associated error types. Most of these errors (144) were classified as minor: they were frequent but less severe. The quality score of ChatGPT-4o's S2ST also passed the predefined acceptability threshold of a calibrated MQM scoring model. The study indicated that immediate focus should be placed on architectural advancement, specifically in the text-to-speech and neural machine translation components of ChatGPT-4o's S2ST system, to improve the real-time performance and accuracy dimensions of the adapted MQM model. This research provides insights for improving S2ST quality and supports the development of human-centered AI communication technologies.
FarsiAbstract
The rapid advancement of artificial intelligence, particularly through the development of speech-to-speech translation systems, has had a considerable impact on global communication. ChatGPT-4o, as an AI model, has become one of the large multimodal language models in this field, and a comprehensive assessment of its quality is essential. This study addressed an important gap, namely the lack of a comprehensive evaluation framework for examining the quality of speech-to-speech translation systems, particularly for the complex and multifaceted Persian-English language pair. The aim of this research was to systematically assess the quality of this software's performance in speech-to-speech translation of everyday bilingual conversations. To this end, an adapted Multidimensional Quality Metrics framework was used to evaluate the performance of speech-to-speech translation systems. To collect data, five bilingual dialogues were specially designed and performed by trained speakers to create a diverse set of real conversational challenges. The translation output then underwent qualitative and quantitative error analysis. The findings show that this AI model obtained an acceptable score in speech-to-speech translation; however, despite its ability to bridge communication gaps, the model makes errors in accuracy, fluency, terminology, textual conventions, style, locale conventions, and real-time performance, and it has not yet fully mastered social, emotional, and paralinguistic subtleties. Based on these findings and the ongoing advances in AI technology, researchers are advised to take future studies beyond everyday conversations and to evaluate the performance of speech-to-speech systems in specialized and sensitive domains such as medicine or law, either independently or in comparison with human interpreting; they should also examine specific error types through advanced use of prompt engineering. In addition, there is a pressing need to develop novel, user-centered evaluation metrics that can more accurately reflect the impact of errors on user experience. Finally, it is hoped that this research has taken an effective step toward the development of intelligent, reliable, and human-centered communication technologies.
DataEntry Person
Setayesh Yousefian
identification number
4021744021
field
English Translation
educational group
English Language and Literature
persian approval page
146698
english letter approval page
146699
number of chapters
5
full text
146700
full text word latex
146701
home pages
146702
chapter one
146703
second chapter
146704
chapter 3
146705
chapter 4
146706
chapter 5
146707
table of contents
146708
sources of references
146709
english descriptors
Artificial intelligence (AI) , Speech-to-Speech Translation (S2ST) , Translation Quality Assessment (TQA) , Multidimensional Quality Metrics (MQM)
Link To Document :
https://lib.ui.ac.ir/dL/search/default.aspx?Term=1392&Field=0&DTC=6
