Classification of Persian Texts in the Autobiographical Memory Test into Two Classes: Specific and Overgeneral

Degree level

Master's (M.Sc.)

Field of study

Computer Engineering - Software

Faculty

Computer Engineering

Defense date

1403/10/29

Page count

97 pp.

Supervisor

محمدعلي نعمت بخش

Advisor

حميدطاهر نشاط دوست

Keywords (translated from Persian)

Autobiographical Memory Test, specific and general memories, natural language processing, embedding vectors, neural networks

Abstract (translated from Persian)

Psychiatric disorders are important medical problems that affect a person's functioning and may disrupt daily activities and reduce quality of life. One way to identify these disorders is to assess a person's memory. Autobiographical memory comprises a person's life experiences, and assessing it can help in diagnosing disorders such as depression and anxiety. The Autobiographical Memory Test (AMT) is used for this assessment. In this test, the person is shown a set of words with positive, negative, or neutral emotional valence and is asked to recount memories related to those words. The recounted memories are then sorted into four categories: specific memories, extended memories, categoric memories, and semantic associates. To improve and speed up this classification, studies have automated it with language models and machine learning methods in English, Japanese, and other languages. The present study localizes the Autobiographical Memory Test for Persian. Given the complexities of Persian and the breadth of its vocabulary and verbs, classifying memories in this language is more challenging. The goal of this research is to design a solution that automatically assigns the memories recounted in the AMT to one of two classes: specific memories (class 1) and non-specific memories (class 0). To achieve this, a higher-level view of the memories is needed so that the features shared between specific and non-specific memories can be extracted. In other words, the aim is to reduce the dependence on the vocabulary of the memories during classification, so that classification is independent of changes in wording. Using Persian grammar and designing sentence embeddings can help achieve this. In addition, processing the sentence embedding vectors is simpler than processing an embedding vector for every word in the sentence, because of their smaller size and their constraints. The results show that the proposed classifier, built on the embedding vectors, performs better than the BERT-based TookaBERT language model fine-tuned on the memory text data.
This classifier predicts the class of a memory automatically and without depending on the words of the memory, and it achieved better results than the other models. Evaluated with several metrics, the proposed classifier achieved an accuracy of 86%, precision of 87%, recall of 87%, and specificity of 83%. Other classifiers were also trained on a translated dataset of English memories for comparison with the proposed classifier. Combining different models, such as the proposed classifier, a neural network, and a 5-nearest-neighbor (5NN) model, was also examined on the test data; it improved the results only marginally and had no notable effect. The results of the proposed classifier show good progress in automating the classification of AMT memories in Persian relative to the pre-trained language model.
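As a reading aid for the four reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and specificity are conventionally computed for a binary task in which class 1 is "specific" and class 0 is "non-specific". The function name `binary_metrics` and the toy labels are assumptions made for illustration; they are not taken from the thesis, and the thesis data and exact evaluation code are not reproduced here.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and specificity for 0/1 labels.

    Class 1 = specific memory, class 0 = non-specific memory.
    A metric whose denominator is zero is reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # specific, predicted specific
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-specific, predicted non-specific
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-specific, predicted specific
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # specific, predicted non-specific

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),       # also called sensitivity
        "specificity": ratio(tn, tn + fp),  # recall of the non-specific class
    }


if __name__ == "__main__":
    # Toy labels, invented purely for illustration.
    gold = [1, 1, 1, 0, 0, 0, 1, 0]
    pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(gold, pred))
```

Specificity is the counterpart of recall for class 0, which is why it can differ from the other three figures when the two classes are not classified equally well.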

Keywords (English)

Autobiographical Memory Test, Specific and Overgeneral Memory, Natural Language Processing, Embedded Vector, Neural Networks

Title (English)

Classification of Persian Texts in the Autobiographical Memory Test into Two Classes: Specific and Overgeneral

Department

Software Engineering

Abstract (English)

Psychiatric disorders are significant medical issues that affect an individual's functioning and may disrupt daily activities and reduce quality of life. One method for identifying these disorders is the evaluation of an individual's memory. Autobiographical memory consists of a person's life experiences, and assessing it can help in diagnosing disorders such as depression and anxiety. The Autobiographical Memory Test (AMT) is used to evaluate this type of memory. In this test, individuals are presented with a set of emotionally positive, negative, or neutral words and are asked to recall memories associated with these words. The recalled memories are then classified into four categories: specific memories, extended memories, categoric memories, and semantic associates. To improve and accelerate the classification process, research has been conducted on automating memory classification using language models and machine learning methods in languages such as English and Japanese. The present study localizes the Autobiographical Memory Test for the Persian language. Given the complexities of Persian, including its extensive vocabulary and diverse verb structures, classifying memories in this language is more challenging. The goal of this research is to develop a solution that automatically assigns the memories recalled in the AMT to one of two classes: specific memories (Class 1) and non-specific memories (Class 0). To achieve this objective, a higher-level view of the memories must be adopted in order to extract the features shared between specific and non-specific memories. In other words, the aim is to reduce dependency on the vocabulary of the memories in the classification process, so that classification is independent of word variations. The use of Persian grammar and the design of sentence embeddings can help achieve this goal.
Moreover, processing sentence embedding vectors is simpler than processing word-level embeddings because they are smaller and more constrained. The research findings indicate that the proposed classifier, built on embedded vectors, outperforms the BERT-based TookaBERT language model fine-tuned on the textual memory data. The proposed classifier predicts the memory class automatically, without relying on specific words, and achieved better results than the other models. Evaluated using several metrics, the classifier achieved an accuracy of 86%, precision of 87%, recall of 87%, and specificity of 83%. Other classifiers were also trained on a translated dataset of English autobiographical memories for comparison and evaluation. A combination of different models, such as the proposed classifier, a neural network, and a 5-nearest-neighbor (5NN) model, was also tested on the test data; it improved the results slightly but had no significant effect on overall performance. The results of the proposed classifier demonstrate considerable progress in automating the classification of Autobiographical Memory Test responses in Persian compared with the pre-trained language model.
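The abstract does not say how the models were combined, so the following is only a hedged sketch of one common option: a 5-nearest-neighbor classifier over sentence embedding vectors whose prediction is merged with two other classifiers' predictions by simple majority vote. The Euclidean distance, the voting rule, the function names `knn_predict` and `majority_vote`, and the two-dimensional toy vectors are all assumptions made for illustration, not the method used in the thesis.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=5):
    """Label a query vector by majority vote among its k nearest training vectors.

    Distance is Euclidean (an assumption; the abstract does not name a metric).
    """
    order = sorted(
        range(len(train_vectors)),
        key=lambda i: math.dist(train_vectors[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def majority_vote(predictions):
    """Combine 0/1 predictions from an odd number of classifiers."""
    return 1 if 2 * sum(predictions) > len(predictions) else 0


if __name__ == "__main__":
    # Toy 2-D "embeddings", invented for illustration: class 0 near the origin,
    # class 1 near (5, 5). Real sentence embeddings have many more dimensions.
    vectors = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    labels = [0, 0, 0, 1, 1, 1]

    knn_label = knn_predict(vectors, labels, (5.5, 5.5), k=5)
    # The other two predictions stand in for the proposed classifier and the
    # neural network; here they are hard-coded placeholders.
    print(majority_vote([knn_label, 1, 0]))
```

With three voters, the ensemble can only change a prediction when the two other models outvote the first one, which is consistent with a combination yielding only a marginal improvement when the individual models mostly agree.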

Number of chapters

Table of contents (PDF)

120186

Author

موذني قهدريجاني، ابوالفضل

Link to this record

https://lib.ui.ac.ir/dl/search/default.aspx?Term=24368&Field=0&DTC=3