Predicting Co-authorship among Researchers in Knowledge and Information Science Based on Link Prediction

Degree level

Master's

Field of study

Knowledge and Information Science - Information Management

Faculty

Education and Psychology

Defense date

1404/07/22

Pagination

133 p.

Supervisors

Mehrdad CheshmehSohrabi (مهرداد چشمه سهرابي), Ali Mansouri (علي منصوري)

Advisor

Dr. Maryam Lotfi Shahreza (دكتر مريم لطفي شهرضا)

Persian keywords

Co-authorship network, Scientific collaboration, Link prediction, Knowledge graph, Knowledge and information science

Persian abstract

This study aimed to predict co-authorship among researchers in knowledge and information science by applying link prediction to the knowledge graph of the field. In terms of purpose it is applied research, and in terms of approach it is an exploratory survey. The research population comprises articles published over a ten-year period (2015-2024) in knowledge and information science journals ranked Q1 to Q3 in the Scopus citation database. During data analysis, and in particular during dimensionality reduction, articles by authors without collaborators, authors with fewer than 3 articles, authors with fewer than 3 joint collaborations, and authors who had published nothing from 2020 onward were removed from the population, because the study focuses on a stable network and on authors active in scientific collaboration. After preprocessing, a heterogeneous knowledge graph of the field was designed, comprising the entities authors, articles, journals, organizational affiliations, keyword clusters, and countries, together with some of their attributes and the relationships between them. The data were processed with the HeteroGNN module, a deep learning method, to generate vector representations of the network's entities. The HeteroGNN output vectors were then passed as input to the LinkPredictor module; with the threshold set to 0.5 at the evaluation stage, potential co-authorship relations scoring above this value were identified. Examination of the identified relations showed that the authors share many commonalities in entities and attributes such as organizational affiliation, country, number of previous collaborations, number of published articles, journals in which the author has published, journal quartile and SJR, author keywords, previous shared collaborators, and the citation counts and language of previous articles, which can inform decisions about forming research teams. To evaluate the accuracy of the prediction model, the data were split into training (70%), validation (15%), and test (15%) sets.
The model was trained for 100 epochs, and Hits@3, Hits@5, Hits@10, Precision, Recall, F1-score, and AUC were computed on the test set, giving 0.0098, 0.0163, 0.0327, 0.8207, 1.000, 0.9015, and 0.9416, respectively. To report the stability of the model's performance on different data, 5-fold cross-validation was also used; computing all evaluation metrics for every fold confirmed that the model separates positive from negative samples in a well-balanced way. The mean values of Hits@3, Hits@5, Hits@10, Precision, Recall, F1-score, and AUC under this procedure were 0.0078, 0.0133, 0.0273, 0.8424, 0.9808, 0.9062, and 0.9525, respectively. The designed model and the potential relations it predicts are expected to serve as a basis for recommending the most suitable scientific collaborators in knowledge and information science and for forming new research teams, and to support decision-making on the development of scientific collaboration and on science policy.

English keywords

Co-authorship Network, Scientific Collaboration, Link Prediction, Knowledge Graph, Knowledge and Information Science

English title

Predicting the Co-authorship of Researchers in the Field of Knowledge and Information Science Based on Link Prediction

Department

Knowledge and Information Science

English abstract

The present study aims to predict co-authorship among researchers in the field of knowledge and information science based on link prediction in the knowledge graph of this domain. In terms of purpose, this research falls into the category of applied studies, and in terms of approach, it is an exploratory survey. The research population consists of articles published over a ten-year period (2015-2024) in journals ranked Q1-Q3 in the Scopus database in the field of knowledge and information science. During the data analysis stage, particularly in the dimensionality reduction process, articles authored by researchers without collaborators, authors with fewer than three articles, authors with fewer than three co-authorships, and authors who had not published an article since 2020 were excluded from the study population. This was done to focus on a stable network and on active authors engaged in scientific collaboration. After preprocessing the data, a heterogeneous knowledge graph for this field was designed, consisting of entities such as authors, articles, journals, affiliations, keyword clusters, and countries, along with several features of these entities and the relationships between them. The data were processed with the HeteroGNN module, a deep learning method, to generate vector representations of the network entities. The output vectors from HeteroGNN were then used as input to the LinkPredictor module. By setting a threshold of 0.5 in the evaluation stage, potential co-authorship relationships with scores above this value were identified. The analysis of the identified relationships revealed that authors share significant similarities in several entities and features, including affiliation, country, number of prior collaborations, number of published articles, journals in which they had published, journal quartile ranking and SJR, authors' keywords, previous mutual collaborators, citation counts, and language of previous publications. These factors can be used in decision-making when forming research teams.
To evaluate the accuracy of the prediction model for researchers in knowledge and information science, the dataset was divided into three subsets: training (70%), validation (15%), and testing (15%). The model was trained for 100 epochs, and the evaluation metrics Hits@3, Hits@5, Hits@10, Precision, Recall, F1-score, and AUC were calculated for the test set, yielding 0.0098, 0.0163, 0.0327, 0.8207, 1.0000, 0.9015, and 0.9416, respectively. Furthermore, 5-fold cross-validation was applied to report the stability of the model across different data samples. The calculation of evaluation metrics for all folds confirmed the model's strong ability to distinguish between positive and negative data. The average values of Hits@3, Hits@5, Hits@10, Precision, Recall, F1-score, and AUC in this method were 0.0078, 0.0133, 0.0273, 0.8424, 0.9808, 0.9062, and 0.9525, respectively. It is expected that the designed model and the predicted potential relationships in this study will serve as a basis for recommending suitable scientific collaborators in the field of knowledge and information science, and will also be applied in decision-making processes related to the development of scientific collaborations and science policy-making.
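
The evaluation step described in the abstract (scoring candidate author pairs, keeping those above 0.5, and reporting Precision, Recall, F1-score and Hits@K) can be illustrated with a small sketch. This is not the thesis code: the function names and the toy scores are hypothetical, and the Hits@K definition used here (the share of true pairs scoring above the K-th highest-scoring negative pair) is one common convention that the abstract does not confirm. The last line only checks that the reported test-set F1-score is consistent with the reported Precision and Recall.

```python
from typing import Sequence, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def threshold_metrics(
    scores: Sequence[float], labels: Sequence[int], threshold: float = 0.5
) -> Tuple[float, float, float]:
    """Precision, recall and F1 when pairs scoring above `threshold` count as links."""
    predicted = [s > threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


def hits_at_k(pos_scores: Sequence[float], neg_scores: Sequence[float], k: int) -> float:
    """Assumed definition: fraction of true pairs scoring above the k-th best negative."""
    if len(neg_scores) < k:
        return 1.0
    kth_negative = sorted(neg_scores, reverse=True)[k - 1]
    return sum(1 for s in pos_scores if s > kth_negative) / len(pos_scores)


# Hypothetical toy scores: three true co-author pairs and three negative pairs.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
print(threshold_metrics(scores, labels))
print(hits_at_k([0.9, 0.8, 0.6], [0.7, 0.4, 0.2], k=1))

# Consistency check on the reported test-set values (Precision 0.8207, Recall 1.0000).
print(round(f1_score(0.8207, 1.0), 4))  # 0.9015, matching the reported F1-score
```

Under this assumed Hits@K definition, a single highly scored negative pair is enough to push the score down, so low Hits@K values can coexist with a high AUC.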

Number of chapters

Table of contents (PDF)

148145

Author

Bidian, Fereshteh (بيديان، فرشته)

Link to this record

https://lib.ui.ac.ir/dl/search/default.aspx?Term=25175&Field=0&DTC=3