Grammatical illusions in BERT: Attraction effects of subject-verb agreement and reflexive-antecedent dependencies
- Institute for the Study of Language and Information, Kyung Hee University
- Linguistic Research
- Volume 40, Issue 2
- 2023.06, pp. 317-352 (36 pages)
- DOI: 10.17250/khisli.40.2.202306.007
The phenomenon of attraction effects, whereby a verb erroneously retrieves a syntactically inaccessible but feature-matching noun, is a type of grammatical illusion (Phillips, Wagers, and Lau 2011) that can occur in long-distance subject-verb agreement in human sentence processing (Wagers et al. 2009). In contrast, reflexive-antecedent dependencies have been claimed to lack attraction effects when the reflexive and the antecedent mismatch in features (Dillon et al. 2013). However, other studies have observed attraction effects in reflexive-antecedent dependencies when the number of feature mismatches between the reflexive and the antecedent increases (Parker and Phillips 2017). These findings suggest that cues are weighted differently depending on the predictability of the dependency, and that these cues are combined according to different cue-combination schemes, such as a linear or a non-linear cue-combination rule (Parker 2019). These linguistic phenomena can be used to analyze how linguistic features are accessed and combined within the internal states of Deep Neural Network (DNN) language models. In the linguistic representations of BERT (Devlin et al. 2018), one of the pre-trained DNN language models, various types of linguistic information are encoded in each layer (Jawahar et al. 2019) and combined while passing through the layers. By measuring Masked Language Model (MLM) performance, this study finds that in BERT both subject-verb agreement and reflexive-antecedent dependencies show attraction effects and follow the linear cue-combination rule. These results, which diverge from human sentence processing, suggest that BERT's self-attention mechanism may not capture differences in the predictability of a dependency as effectively as memory retrieval mechanisms in humans. These findings have important implications for developing more understandable and interpretable explainable-AI (xAI) systems that better capture the complexities of human language processing.
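The contrast between linear and non-linear cue-combination rules mentioned above can be illustrated with a minimal sketch. The cue names, weights, and match values below are hypothetical illustrations, not the paper's actual materials: under a linear (additive) rule, a distractor that matches only some retrieval cues still gains activation, which is what produces attraction; under a non-linear (conjunctive) rule, partial matches contribute nothing.

```python
# Minimal sketch of linear vs. non-linear cue combination in cue-based
# retrieval (hypothetical cues and weights, illustrating the distinction
# discussed in Parker 2019; not the study's actual model).

def linear_activation(cue_matches, weights):
    # Linear (additive) rule: each matching cue contributes independently,
    # so a partially matching distractor receives some activation.
    return sum(w * m for w, m in zip(weights, cue_matches))

def nonlinear_activation(cue_matches, weights):
    # Non-linear (conjunctive) rule: activation is the product of cue terms,
    # so any single mismatch zeroes out the item's activation.
    product = 1.0
    for w, m in zip(weights, cue_matches):
        product *= w * m
    return product

# Hypothetical retrieval cues at the verb: [+subject], [+plural],
# equally weighted. Match values: 1 = cue matches the noun, 0 = mismatch.
weights = [1.0, 1.0]
target = [1, 0]      # grammatical subject: structurally correct, wrong number
distractor = [0, 1]  # attractor noun: structurally inaccessible, right number

print(linear_activation(target, weights),
      linear_activation(distractor, weights))     # 1.0 1.0 -> attraction possible
print(nonlinear_activation(target, weights),
      nonlinear_activation(distractor, weights))  # 0.0 0.0 -> no partial-match boost
```

Under the linear rule the partially matching distractor ties with the true subject, so it can sometimes win the retrieval competition (an attraction error); under the conjunctive rule neither partially matching item is retrievable at all.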
1. Introduction
2. Background
3. Experiment
4. General discussion
5. Conclusion
References