Prediction of Netflix Content Viewing Time Using a Machine Learning Classification Model

고선제; 김일주

doi:10.37727/jkdas.2024.26.5.1357

The over-the-top (OTT) market has grown rapidly since COVID-19 and has become the center of the media industry. In this environment, the competitiveness of an OTT platform depends largely on the quality and quantity of its content. Using the per-title viewing time data recently released by Netflix, a leading OTT platform, this study develops machine learning models that predict the viewing time of each title from its characteristics, with the aim of providing an indicator that can inform content production and acquisition decisions. Content characteristics collected from IMDB and TMDB served as predictor variables, and classification models were built with the Random Forest, Support Vector Machine (SVM), and XGBoost algorithms. Model performance was evaluated using the confusion matrix and the area under the receiver operating characteristic (ROC) curve (AUC). All three algorithms performed well in predicting viewing time, with XGBoost performing best. By predicting the success or failure of a title in advance, such models are expected to support OTT platforms in deciding whether to license content or produce originals.
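The two evaluation measures named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the abstract does not state how viewing time was turned into classes, so the binary labels (1 = high viewing time, 0 = low), the predicted scores, and the 0.5 decision threshold below are hypothetical toy values chosen only to show the computation. The function names `confusion_matrix` and `roc_auc` are likewise illustrative.

```python
from typing import Dict, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]        # stand-in for a classifier's predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

cm = confusion_matrix(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(cm, round(auc, 3))
```

The confusion matrix depends on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is one reason the two measures are commonly reported together.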
