Designing a drug recommendation system for COVID-19 patients using statistical methods and machine learning

Degree level

Master's (M.Sc.)

Field of study

Biomedical Engineering - Bioelectrics

Faculty

Engineering

Defense date

4 Bahman 1404 (Solar Hijri calendar)

Page count

148 p.

Supervisor

Hamidreza Marateb (حميدرضا مراتب)

Keywords (translated from Persian)

COVID-19, recommender system, machine learning, causal inference, treatment

Abstract (translated from Persian)

The COVID-19 pandemic posed unprecedented challenges for healthcare systems. One of the most important was choosing an effective drug treatment for each patient given the heterogeneity of clinical responses; relying solely on uniform protocols for all patients is therefore inadequate, which underscores the need for personalized medicine. Accordingly, using data from 2285 patients hospitalized at Bellvitge Hospital (the COVID-SEMI study), this research developed a clinical decision-support framework based on artificial intelligence and causal inference that covers two key objectives at once: outcome prediction and causal estimation of treatment effects. For modeling, in addition to classical classifiers such as XGBoost and random forest, deep neural networks were used: a Long Short-Term Memory (LSTM) network to exploit the sequential structure of the data, and a Multi-Layer Perceptron (MLP) as a simpler alternative. Because machine learning models learn only correlational patterns and do not by themselves guarantee valid causal inference, the causal layer of the study combines machine/deep learning with causal inference methods. To address time-varying confounders, target trial emulation and the parametric g-formula were used, and heterogeneous treatment effects, namely the Individual Treatment Effect (ITE) and the Conditional Average Treatment Effect (CATE), were estimated for five main drugs: steroids, hydroxychloroquine, remdesivir, Kaletra, and tocilizumab. The personalized treatment system was designed on the basis of these estimates. A key point is that the causal inference framework does not depend on a specific network architecture: when the LSTM was replaced with an MLP, goodness of fit and especially calibration remained adequate (the ECE was below 0.01 for both architectures). This indicates that the g-formula-based causal component can run on top of different predictive models, provided that the outcome and treatment models are sufficiently well calibrated and stably trained.
The values reported here are for the LSTM model, which overall performed slightly better than the MLP. The model performed well for next-day outcomes: at the person-day level, the AUC-ROC was 0.803±0.015 for next-day discharge and 0.795±0.010 for next-day death. By comparison, the AUC-ROC was 0.717±0.023 for next-day admission to the intensive care unit, and 0.749±0.033 for death by the end of hospitalization, evaluated at the patient level within a hazard-function modeling framework. From an explainability perspective, SHAP analysis showed that age is the most important variable in the model's decisions, followed by markers of inflammation and organ function, including C-reactive protein level, creatinine, and lymphocytes. The study shows that combining outcome prediction with causal inference can yield decision-oriented indicators such as preventable mortality. Based on counterfactual estimates, the share of avoidable deaths among all deaths ranged from 39.3% to 49.4% (the highest value, 49.4%, corresponding to corticosteroids), which suggests that the system could potentially reduce a substantial portion of adverse outcomes. A data-driven personalized treatment system may therefore considerably improve the efficiency and effectiveness of clinical decision-making.
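
For readers unfamiliar with the causal quantities named in the abstract, the block below restates their standard textbook definitions. The notation is an assumption made here for illustration and is not taken from the thesis: $Y$ is the outcome, $A_t$ the treatment and $L_t$ the time-varying covariates at time $t = 0, \dots, K$, overbars denote histories, and $X$ is a vector of baseline covariates.

```latex
% Individual and conditional average treatment effects (binary treatment)
\mathrm{ITE}_i = Y_i(1) - Y_i(0), \qquad
\mathrm{CATE}(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]

% g-formula for a fixed treatment strategy \bar{a} = (a_0, \dots, a_K)
\mathbb{E}\left[ Y^{\bar{a}} \right]
  = \sum_{\bar{l}}
    \mathbb{E}\left[\, Y \mid \bar{A}_K = \bar{a}_K,\ \bar{L}_K = \bar{l}_K \,\right]
    \prod_{t=0}^{K} f\left( l_t \mid \bar{a}_{t-1},\ \bar{l}_{t-1} \right)
```

The g-formula identity holds under the usual consistency, sequential exchangeability, and positivity assumptions. In the parametric version, the conditional expectation and the covariate densities are replaced by fitted models, and the sum is typically approximated by Monte Carlo simulation.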

Keywords (English)

COVID-19, recommender system, machine learning, causal inference, treatment

Title (English)

Designing a drug recommendation system for COVID-19 patients using statistical methods and machine learning

Department

Biomedical Engineering

Abstract (English)

The COVID-19 pandemic created unprecedented challenges for healthcare systems, one of the most important being the selection of an effective drug therapy for each patient given the heterogeneity of clinical responses. Uniform protocols applied to all patients are therefore not sufficient, which highlights the need for personalized medicine. In this context, using data from 2,285 patients hospitalized at Bellvitge Hospital (the COVID-SEMI study), this research developed a clinical decision-support framework based on AI and causal inference that simultaneously addresses two key objectives: outcome prediction and causal estimation of treatment effects. For modeling, in addition to classical classification models such as XGBoost and Random Forest, deep neural networks were used. Specifically, a Long Short-Term Memory (LSTM) network was employed to leverage sequential structure, and a Multi-Layer Perceptron (MLP) was used as a simpler, fully feed-forward alternative. Because machine learning models learn only correlational patterns and do not by themselves guarantee valid causal inference, the causal layer of the study was designed by combining machine learning/deep learning with causal inference methods. Accordingly, to overcome the challenge of time-varying confounders, target trial emulation and the parametric g-formula were used, and heterogeneous treatment effects, namely the Individual Treatment Effect (ITE) and the Conditional Average Treatment Effect (CATE), were estimated for five main drugs: steroids, hydroxychloroquine, remdesivir, Kaletra, and tocilizumab. A personalized treatment system was designed on the basis of these estimates. A key point is that the causal inference framework does not depend on a specific network architecture: even when the LSTM was replaced with an MLP, goodness of fit and especially calibration remained adequate (the ECE was below 0.01 for both architectures). This indicates that the g-formula-based causal component can be implemented on top of different predictive models, provided that the outcome and treatment models are sufficiently well calibrated and stably trained.
The values reported here correspond to the LSTM model, which overall performed slightly better than the MLP. The model showed good performance for next-day outcomes: at the person-day level, the AUC-ROC was 0.803±0.015 for next-day discharge and 0.795±0.010 for next-day death. In contrast, the AUC-ROC was 0.717±0.023 for next-day ICU admission, and 0.749±0.033 for death by the end of hospitalization, evaluated at the patient level within a hazard-function modeling framework. From an interpretability perspective, SHAP analysis showed that age was the most important variable in the model's decisions, followed by markers of inflammation and organ function, including C-reactive protein level, creatinine, and lymphocyte count. This study demonstrated that combining outcome prediction with causal inference can produce decision-oriented indicators such as preventable mortality. Based on counterfactual estimates, the share of preventable deaths among all deaths ranged from 39.3% to 49.4% (the highest value, 49.4%, corresponding to corticosteroids), suggesting that the system could potentially reduce a substantial portion of adverse outcomes. A data-driven personalized treatment system may therefore considerably enhance the efficiency and effectiveness of clinical decision-making.
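
The abstract reports an expected calibration error (ECE) below 0.01 without saying how it was computed. As a minimal sketch, assuming the common equal-width binning definition for a binary outcome, the ECE could be calculated as follows. The function name, the choice of 10 bins, and the toy data are assumptions made here for illustration and are not taken from the thesis.

```python
import numpy as np

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Equal-width-bin ECE for binary outcomes (illustrative sketch).

    ECE = sum over bins of (fraction of samples in the bin)
          * |observed event rate - mean predicted probability|.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges only, so that a probability of exactly 1.0
    # is placed in the last bin.
    bin_ids = np.digitize(y_prob, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        gap = abs(y_true[mask].mean() - y_prob[mask].mean())
        ece += mask.mean() * gap
    return float(ece)

# Toy data (hypothetical): predictions of 0.1 and 0.9 whose observed
# event rates are 0.0 and 1.0, so each bin is off by 0.1.
toy_ece = expected_calibration_error([0, 0, 1, 1], [0.1, 0.1, 0.9, 0.9])
```

A value near zero means that predicted probabilities match observed event frequencies within each bin. The result depends on the number of bins and on the binning scheme, so ECE values are only comparable when computed the same way.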

Number of chapters

External advisor (outside the university)

Martin Vokvitz (مارتين وكويتز)

Table of contents (PDF)

157596

Author

Esmaeili Farsani, Mohaddeseh (اسماعيلي فارساني، محدثه)

Link to this record

https://lib.ui.ac.ir/dl/search/default.aspx?Term=25723&Field=0&DTC=3