-
Record number
25216
-
Call number
STA2 299
-
Author
Karimi, Maryam
-
Title
Exploring supervised learning via penalized probabilistic principal component analysis
-
Degree level
Master's
-
Field of study
Economic Statistics
-
Faculty
Mathematics and Statistics
-
Defense date
1404/07/30
-
Page count
135 pp.
-
Supervisor
Iraj Kazemi
-
Advisor
Hooshang Talebi Habibabadi
-
Keywords (Persian)
Dimension selection, Penalization, Probabilistic principal component analysis, Singular value decomposition, Profile likelihood, Computational Bayesian approach
-
Abstract (Persian)
Dimension reduction in the analysis of high-dimensional data requires effective learning approaches. In many cases, the reduction must be determined by appropriate data analysis methods, and the number of reduced dimensions must be estimated accurately. This estimation is usually posed as a constrained optimization problem in which the estimated dimension is determined by maximizing a specific criterion in the penalized profile likelihood function within the framework of probabilistic principal component analysis. Unlike other methods based on penalized maximization, which depend on tuning an optimal penalty parameter, this approach proposes an averaging strategy. The strategy makes it possible to select a suitable number of reduced dimensions over a wide range of penalty parameters. The effectiveness of the proposed method is evaluated with several criteria in simulation studies and in the analysis of real data from genetics. The results show that sufficient knowledge of the subject under study can have a considerable effect on the efficiency of the method; this includes the advanced statistical topics needed for exploring dimension learning. The practical results show that gene expression data have a higher intrinsic dimension than was assumed in previous research. Overall, when the model assumptions are moderately violated, the proposed method performs better than other existing methods.
-
Keywords (English)
Dimensionality selection, Penalization, Singular value decomposition, Profile likelihood, Computational Bayes approach
-
Title (English)
Exploring supervised learning via penalized probabilistic principal component analysis
-
Department
Statistics
-
Abstract (English)
Dimensionality reduction in high-dimensional datasets necessitates the application of effective learning strategies to address computational and interpretative challenges. In many scenarios, this reduction process must be explicitly guided by rigorous data analysis techniques, where an appropriate estimate of the reduced dimensionality is essential. This estimation task can be formally characterized as solving a constrained optimization problem, in which the reduced dimension is determined by maximizing a penalized likelihood function within the framework of probabilistic principal component analysis (PPCA). Unlike traditional penalized maximization methods that rely on identifying an optimal penalty tuning parameter, the proposed approach employs an averaging strategy. This strategy identifies the estimated dimension as the most consistent choice across a broad spectrum of feasible penalty parameters, thereby providing a robust alternative to parameter-specific approaches. The effectiveness of this methodology may be systematically evaluated through simulation studies and empirical analyses, with applications particularly relevant in genetic research. These assessments highlight how domain-specific knowledge significantly improves the efficiency of the proposed approach by integrating nuanced statistical considerations into the dimensionality reduction process. Additionally, empirical investigations suggest that gene expression data inherently exhibit a higher degree of intrinsic dimensionality than previously recognized in earlier research efforts. Notably, the proposed method demonstrates superior adaptability and performance, especially under conditions where model assumptions are only partially satisfied. This capability underscores its potential as a reliable and advanced alternative to conventional dimension reduction techniques.
Dimension reduction in high-dimensional data requires efficient learning strategies. In many cases, the reduction process must be explicitly determined based on appropriate data analysis methods, and the number of reduced dimensions needs to be estimated. This estimation can, in practice, be formulated as solving a constrained optimization problem, where the estimated dimension corresponds to the maximization of a penalized likelihood function within the framework of probabilistic principal component analysis. Unlike other penalized maximization approaches that require an optimal penalty tuning parameter, the proposed method adopts an averaging strategy, in which the estimated dimension emerges as a desirable choice across a wide range of admissible penalty parameters. The effectiveness of the proposed approach can be assessed through various criteria in simulation studies and real data analyses in the field of genetics. The results of these studies demonstrate that sufficient knowledge of the specific subject under investigation significantly enhances the efficiency of the method, which also encompasses advanced statistical considerations for exploring dimensionality learning. Furthermore, empirical findings reveal that gene expression data possess higher inherent dimension than previously assumed in earlier research. Overall, the proposed method exhibits superior performance compared to existing approaches, particularly when the model assumptions are moderately violated.
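The abstract describes choosing the PPCA dimension by maximizing a penalized likelihood and then aggregating the choice over many penalty values. The Python sketch below illustrates that general idea only; it is not the thesis's actual procedure. It uses the standard closed-form PPCA profile log-likelihood in terms of the eigenvalues of the sample covariance. The penalty (penalty weight times a free-parameter count), the penalty grid, the majority-vote aggregation, and all function names are assumptions made for illustration.

```python
import numpy as np

def ppca_profile_loglik(eigvals, n, k):
    """Maximized PPCA log-likelihood for k components.

    eigvals: eigenvalues of the (biased) sample covariance, in descending order.
    The noise variance is profiled out as the mean of the discarded eigenvalues.
    """
    d = eigvals.size
    sigma2 = eigvals[k:].mean()
    return -0.5 * n * (
        d * np.log(2.0 * np.pi)
        + np.log(eigvals[:k]).sum()
        + (d - k) * np.log(sigma2)
        + d
    )

def select_dimension(X, gammas):
    """Pick a dimension per penalty weight, then return the most frequent pick.

    Assumed penalty: gamma * (number of free covariance parameters).
    This voting rule is a hypothetical stand-in for the averaging strategy.
    """
    n, d = X.shape
    cov = np.cov(X, rowvar=False, bias=True)
    eigvals = np.linalg.eigvalsh(cov)[::-1]          # descending order
    ks = np.arange(1, d)                             # candidate dimensions 1..d-1
    loglik = np.array([ppca_profile_loglik(eigvals, n, k) for k in ks])
    n_par = d * ks - ks * (ks - 1) // 2 + 1          # loadings (up to rotation) + noise variance
    picks = [int(ks[np.argmax(loglik - g * n_par)]) for g in gammas]
    values, counts = np.unique(picks, return_counts=True)
    return int(values[np.argmax(counts)]), picks

# Simulated data (illustrative): 3 strong latent directions in 10 dimensions.
rng = np.random.default_rng(0)
n, d, k_true = 500, 10, 3
Q, _ = np.linalg.qr(rng.standard_normal((d, k_true)))   # orthonormal loadings
W = Q * np.sqrt(np.array([25.0, 16.0, 9.0]))            # scale each column
X = rng.standard_normal((n, k_true)) @ W.T + rng.standard_normal((n, d))

k_hat, picks = select_dimension(X, np.linspace(0.5, 10.0, 20))
print("selected dimension:", k_hat)
```

Aggregating over a grid avoids committing to a single penalty weight; how the thesis actually combines the per-penalty choices is not stated in this record.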
-
Number of chapters
4