Academic journal

Automatic acquisition of “noun+verb” idiomatic compounds in Korean


Current work in computational linguistics pays close attention to lexical semantics because it can improve both the coverage and the accuracy of language processing systems. In particular, multiword expressions are regarded as an important component for improving the performance of language applications, and handling them is especially crucial in multilingual processing such as machine translation. Among the many kinds of multiword expressions, the present study investigates “noun+verb” idiomatic compounds in Korean. These compounds consist of a verb plus its syntactic object, and the meaning of the combination is not equivalent to the sum of the meanings of its parts. To acquire these compounds fully automatically, the study exploits a syntax-annotated corpus (i.e., a treebank) and three Korean lexical hierarchies. It extracts syntactic patterns from the development corpus (the Sejong Korean Treebank), calculates the selectional preferences each verb has for its objects, and identifies idiosyncratic items with reference to the three lexical hierarchies (CoreNet, KorLex, and U-WIN). The result comprises 548 idiomatic compounds, 70% of which are evaluated as satisfactory. (Nanyang Technological University)
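The selectional-preference step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which measure the paper uses, so this example assumes a Resnik-style selectional association as a stand-in. The verb glosses, the semantic class labels (`FOOD`, `AGE`), the counts, and the function names are all hypothetical; in the actual study the pairs would come from the treebank and the classes from the lexical hierarchies.

```python
import math
from collections import Counter, defaultdict


def selectional_association(pairs):
    """Resnik-style selectional association (assumed measure, not confirmed by the paper).

    pairs: iterable of (verb, object_class) tuples.
    Returns {verb: {object_class: association score}}.
    """
    pairs = list(pairs)
    total = len(pairs)
    class_counts = Counter(cls for _, cls in pairs)  # prior counts of each class
    by_verb = defaultdict(Counter)
    for verb, cls in pairs:
        by_verb[verb][cls] += 1

    scores = {}
    for verb, counts in by_verb.items():
        n = sum(counts.values())
        # Each class's contribution to KL(P(class | verb) || P(class)).
        contrib = {
            cls: (k / n) * math.log((k / n) / (class_counts[cls] / total))
            for cls, k in counts.items()
        }
        strength = sum(contrib.values())  # the verb's overall preference strength
        scores[verb] = {
            cls: (c / strength if strength > 0 else 0.0)
            for cls, c in contrib.items()
        }
    return scores


def idiom_candidates(scores, threshold=0.0):
    """Return (verb, object_class) pairs whose association falls below the threshold."""
    return sorted(
        (verb, cls)
        for verb, per_class in scores.items()
        for cls, score in per_class.items()
        if score < threshold
    )


# Hypothetical toy data: "eat" mostly takes FOOD objects, but once takes AGE.
toy_pairs = (
    [("eat", "FOOD")] * 8
    + [("eat", "AGE")] * 1
    + [("cook", "FOOD")] * 5
    + [("ask", "AGE")] * 5
)
toy_scores = selectional_association(toy_pairs)
toy_candidates = idiom_candidates(toy_scores)
```

In this toy setting, an object class that a verb takes far less often than its prior frequency would predict receives a negative score and is flagged as a candidate. A real pipeline would still need a further check against the lexical hierarchies and human evaluation, as the abstract describes.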

Abstract

1. Introduction

2. Background

3. Methodology

4. Acquisition

5. Result

6. Conclusion

References
