Web-based Consumer Involvement Indices and Vegetable Consumption: The Quantification of Unstructured Information and an Exploration of a Causal Relationship
- 조인호(Inho Cho) 김동환(DongHan Kim) 전채남(ChaeNam Chun)
- Journal of The Korean Data Analysis Society (JKDAS)
- Vol.18 No.3
- Indexing status: KCI-listed
- 1259 - 1270 (12 pages)
Managing text-based information is crucial when trying to extract valuable information from documents. This research studied the quantification of unstructured text and its forecasting power. To examine how unstructured information can feed predictive models, documents generated on Naver (a Korean web portal) from January 2009 to September 2015 were used to investigate and predict changes in the consumption of four types of vegetables in South Korea. Several methods were proposed to quantify the text-based unstructured information: search-keyword volume, buzzword volume, degree-centrality-weighted term frequency, IDF-weighted term frequency, and skewness-weighted term frequency. These methods tracked the keywords in each document, together with their numerical weights, to score individual terms that co-occurred with each type of vegetable. Statistical analyses were then conducted to test for stationarity and cointegration. In addition, a vector error correction (VEC) model was used to estimate the relationship between the consumption of each type of vegetable and the quantified text-based unstructured information. Finally, Granger causality tests were conducted to verify that the effect runs from the quantified unstructured information to consumption rather than the reverse.
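As an illustration of one of the quantification methods named above, the following is a minimal sketch of an IDF-weighted term frequency score for terms that co-occur with a vegetable keyword. The abstract does not give the exact formula used in the paper, so the weighting here (term frequency in keyword-bearing documents multiplied by log(N / document frequency)) is an assumption, and the function name `idf_weighted_score` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def idf_weighted_score(documents, keyword):
    """Sum tf * idf over terms that co-occur with `keyword`.

    `documents` is a list of token lists for one period (e.g. one month).
    This weighting is an assumed, textbook-style TF-IDF, not necessarily
    the formula used in the paper.
    """
    n_docs = len(documents)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    # Term frequency, counted only in documents that mention the keyword.
    co_tf = Counter()
    for tokens in documents:
        if keyword in tokens:
            co_tf.update(t for t in tokens if t != keyword)

    # Weight each co-occurring term by idf = log(N / df) and sum.
    return sum(tf * math.log(n_docs / doc_freq[term])
               for term, tf in co_tf.items())


# Hypothetical toy data: three tokenized documents from one month.
docs = [
    ["cabbage", "price", "kimchi"],
    ["cabbage", "kimchi", "recipe"],
    ["weather", "price"],
]
score = idf_weighted_score(docs, "cabbage")
```

Computing one such score per month would yield a time series that could then be tested against a consumption series, which is the role the quantified indices play in the analysis described above.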