K-nn을 이용한 Hot Deck 기반의 결측치 대체
Imputation of Missing Data Based on Hot Deck Method Using K-nn
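- Korea Society of IT Services (한국IT서비스학회)
- Journal of the Korea Society of IT Services (한국IT서비스학회지)
- Journal of the Korea Society of IT Services, Vol. 13, No. 4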
- 2014.12, pp. 359 - 375 (17 pages)
Researchers cannot avoid missing data when collecting data, because some respondents, whether arbitrarily or not, leave questions unanswered in studies and experiments. Missing data not only inflate and distort standard deviations but also make parameter estimation harder and reduce the reliability of research results. Although hot deck imputation is widely used, it has received little attention from researchers because it handles missing data in ambiguous ways. Hot deck can be complemented with K-nn, a machine learning method that can form donor groups closest to the properties of the records with missing data. With this role of K-nn in mind, this study imputes missing data with a hot deck method based on K-nn. The proposed method was compared with listwise deletion and with mean, mode, linear regression, and SVM imputation on nominal and ratio data types, and it yielded imputed values closest to the original values. Simulations with different numbers of neighbors and different distance measures were carried out, which improved K-nn performance. This study thus revisits hot deck imputation, which has failed to attract researchers' attention. The results should help in selecting non-parametric methods that are less likely to be affected by the structure of missing data and its causes.
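
The idea of K-nn based hot deck imputation can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which distance measures or donor-selection rule were used, so Euclidean distance over the observed columns and a random draw from the k nearest complete records are assumptions made here, and the function name `knn_hot_deck_impute` is hypothetical. The sketch covers numeric (ratio) data only; nominal data would need a different distance.

```python
import math
import random


def knn_hot_deck_impute(rows, k=3, seed=0):
    """Fill None cells in each incomplete row with values copied from one
    donor drawn at random from its k nearest complete rows."""
    rng = random.Random(seed)
    # Donors are the fully observed records (assumption: at least one exists).
    donors = [r for r in rows if None not in r]
    if not donors:
        raise ValueError("no complete rows available as donors")

    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        def distance(donor):
            # Euclidean distance over the recipient's observed columns only.
            return math.sqrt(sum((donor[j] - row[j]) ** 2 for j in observed))

        # Donor pool: the k complete rows closest to the recipient.
        pool = sorted(donors, key=distance)[:k]
        donor = rng.choice(pool)
        # Hot deck step: copy the donor's actual values into the gaps.
        filled.append([donor[j] if v is None else v for j, v in enumerate(row)])
    return filled


data = [[1.0, 10.0], [2.0, 20.0], [9.0, 90.0], [1.1, None]]
result = knn_hot_deck_impute(data, k=1)
```

With `k=1` the donor pool holds only the single nearest complete row, so the missing cell in the last record is copied from the first row. Larger values of `k` widen the pool and make the draw random, which is the trade-off the abstract's simulations over the number of neighbors refer to.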
Abstract
1. Introduction
2. Missing Data
3. Hot Deck and K-nn Processing
4. Experimental Design and Analysis of Results
5. Conclusion and Limitations of the Study
References