A Factoid Question Answering System for the Persian Translation of the Holy Quran

Degree level

Master's

Field of study

Computer Engineering - Software

Faculty

Computer Engineering

Defense date

1403/06/07

Page count

88 pp.

Supervisor

Reza Ramezani

Persian keywords

Closed-domain question answering system, Quran translation, factoid question, language model, verse-based question answering, semantic retrieval

Persian abstract

The Holy Quran is regarded as the most important source of Islamic teachings. The text of the Quran is one of the best and most fundamental sources that Muslims consult to find answers to their religious questions. Today, with the spread of internet use and the remarkable advances of artificial intelligence in building chatbots and intelligent question answering (QA) systems, users expect to reach answers to their questions in various domains in the shortest possible time. Nevertheless, religious texts, and the text of the Holy Quran in particular, have unique characteristics that make it difficult for common QA systems to answer some Quranic questions. The same problem exists for Persian translations of the Holy Quran. The text of Quranic translations differs in structure and vocabulary from the textual sources used to train AI models. In addition, existing Quranic systems rely on traditional methods to find answers and do not achieve adequate accuracy. This study attempts to use the ability of language models to process text based on the semantic aspects of natural language in order to present a QA system that finds answers to factoid questions from the Persian translation of the Holy Quran. Because the required data did not exist, a question-and-answer dataset was first created from the translation of the Holy Quran using a hybrid approach that combines automatic, semi-automatic, and manual methods for generating data. The resulting dataset was then used to train the different parts of the QA system. The architecture of the proposed system consists of two main components, verse retrieval and a machine reader, which are responsible for retrieving the verses related to a question and finding the final answer, respectively. Language models are used in the structure of each component. The QA system presented in this study achieved values of 39% and 85% for the recall and F1 metrics, respectively. These results indicate acceptable performance of the proposed system in retrieving related verses and finding answers to Quranic factoid questions from the text of Persian translations of the Holy Quran, compared with other existing methods.
The performance of this QA system is also better than that of large language models at finding answers to Quranic questions. According to the evaluation carried out, the proposed system was more successful than ChatGPT (the GPT-3.5 and GPT-4o models) at finding answers to the questions in the evaluation dataset.
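The two-component architecture described in the abstract (a verse retriever followed by a machine reader) can be illustrated with a minimal retrieve-then-read sketch. Everything in it is an assumption made for illustration: the thesis uses language models in both components, whereas this sketch substitutes a bag-of-words cosine similarity for the retriever and a trivial token filter for the reader. The function names (`embed`, `retrieve`, `read`, `answer`) and the sample sentences are hypothetical placeholders, not Quranic text and not the author's implementation.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a language-model encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, k=2):
    """Retriever stage: rank passages by similarity to the question, keep the top k."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def read(question, passage):
    """Toy reader stage (assumption): keep passage tokens that are not in the question.

    A real reader would be a language model that extracts an answer span.
    """
    q_tokens = set(question.lower().split())
    return " ".join(t for t in passage.split() if t.lower() not in q_tokens)


def answer(question, passages, k=2):
    """Full pipeline: retrieve candidate passages, then read the best one."""
    top = retrieve(question, passages, k)
    return read(question, top[0]) if top else ""


# Placeholder passages standing in for translated verses.
passages = [
    "the library opens at nine",
    "the river flows through the valley",
]
question = "when does the library open"
print(retrieve(question, passages, k=1))
print(answer(question, passages))
```

Separating the two stages lets each be trained and evaluated on its own, which matches the abstract's reporting of one retrieval metric and one answer metric.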

English keywords

Closed-Domain Question Answering, Quran's Translation, Factoid Question, Language Model, Semantic Retrieval

English title

A Factoid Question Answering System on Persian Translation of Holy Quran

Department

Software Engineering

English abstract

The Holy Quran is the most important source of knowledge in Islam. It is one of the main references that Muslims rely on to find answers to their religious questions. Nowadays, with the widespread use of the internet and significant advances in Artificial Intelligence for building chatbots and Question Answering (QA) systems, users expect answers to different types of questions in the shortest possible time. However, religious texts such as the Quran have unique features that make it challenging for common QA systems to answer Quranic questions, because these systems are trained on other kinds of textual resources. The same problem exists for Persian translations of the Holy Quran. Quranic translations have a distinctive style and differ from other Persian textual resources in structure and vocabulary. Furthermore, most existing Quranic systems use traditional approaches, which makes them inefficient at answering users' questions. This study presents a closed-domain QA system for answering factoid questions from the Persian translation of the Holy Quran. Because no suitable dataset was available, a dataset of question-verse-answer triplets was created using a hybrid approach that consists of automatic, semi-automatic, and manual techniques. The resulting dataset was used to train the different components of the QA system. The proposed system comprises two main modules, which are responsible for retrieving related verses and extracting the final answer from those verses, respectively. Language models are used in the structure of both the retriever and reader modules. The QA system presented in this research achieved a recall of 39% and an F1 score of 85%.
These results show the acceptable performance of the proposed system in retrieving related verses and finding accurate answers to factoid questions from the Persian translation of the Quran. According to the evaluation results, the proposed system provides more accurate answers to Quranic questions than widely used LLM-based models such as ChatGPT.
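The abstract reports a recall figure for verse retrieval and an F1 score for answers but does not say how either was computed. The sketch below shows one common way to compute such metrics, assuming recall is measured over the top-k retrieved verses and F1 is the token-overlap F1 often used for extractive QA. The function names, the cut-off `k`, and the verse identifiers are hypothetical and do not come from the thesis.

```python
from collections import Counter


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear among the top-k retrieved items."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant_set)
    return hits / len(relevant_set)


def token_f1(prediction, reference):
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    # Multiset intersection counts each shared token at most as often as it
    # occurs in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical verse identifiers and answers, for illustration only.
print(recall_at_k(["v3", "v1", "v7"], ["v1", "v2"], k=2))
print(token_f1("opens at nine", "at nine"))
```

Under these assumptions the two numbers measure different stages: recall describes how often the retriever surfaces a relevant verse, while F1 describes how closely the extracted answer matches the reference answer.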

Number of chapters

Table of contents (PDF)

34777

Author

Shams Dastjerdi, Negin

Link to this record

https://lib.ui.ac.ir/dl/search/default.aspx?Term=23774&Field=0&DTC=3