Optimal Machine Learning Model Selection for Predicting Cancer Drug Response Using Genomic Data

Korea Artificial Intelligence Association (한국인공지능학회)
Korean Journal of Artificial Intelligence (인공지능연구)
Vol. 13 No. 1
2025.03

35 - 41 (7 pages)
DOI : 10.24225/kjai.2025.13.1.35

In this paper, we investigate the optimal machine learning model for predicting drug response in cancer cells by leveraging genomic data, with an emphasis on clinical applicability. Utilizing the Cancer Drug Sensitivity Genomics dataset, we integrated diverse genetic characteristics, including gene mutations, copy number variations, and gene expression levels, along with drug response data. A structured data preprocessing pipeline was implemented, including mode replacement for tissue and cancer types, K-Nearest Neighbors imputation for genetic features, and Random Forest Regressor for handling missing numerical values. Regression models, such as Random Forest, K-Nearest Neighbors, Decision Tree, and CatBoost, were trained and evaluated for predictive performance. Experimental results revealed that the CatBoost model outperformed others, achieving a mean squared error of 1.5618, mean absolute error of 0.9355, and an R² score of 0.7855, with the Random Forest model showing comparable performance. These findings highlight the CatBoost model as a robust tool for predicting cancer drug response. Furthermore, this research underscores its potential integration into clinical decision-making systems by enabling personalized drug selection based on patient-specific genetic profiles. Future research may extend this approach to incorporate additional omics data and validate the model's utility in real-world clinical scenarios.
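The three scores quoted in the abstract (mean squared error, mean absolute error, and R²) follow standard definitions, and a small worked example may help when comparing them across models. The sketch below is illustrative only: the function name `regression_metrics` and the toy values are assumptions made for this example and are not taken from the paper's code or data.

```python
def regression_metrics(y_true, y_pred):
    """Return (mse, mae, r2) for paired observed and predicted values.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean squared error: average of squared residuals.
    mse = sum(e * e for e in errors) / n
    # Mean absolute error: average of absolute residuals.
    mae = sum(abs(e) for e in errors) / n

    # R^2 = 1 - SS_res / SS_tot, where SS_tot measures spread around the mean.
    y_bar = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return mse, mae, r2


# Toy values (made up), standing in for observed and predicted drug response.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
mse, mae, r2 = regression_metrics(y_true, y_pred)
print(f"MSE={mse:.4f}  MAE={mae:.4f}  R2={r2:.4f}")
```

Lower MSE and MAE indicate smaller prediction errors, while an R² closer to 1 indicates that the model explains more of the variance in the observed response. MSE penalizes large errors more heavily than MAE, so reporting both gives a fuller picture of model behavior.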

1. Introduction

2. Related research

3. Research Methods

4. Conclusion

References
