Volume.9 No.1 June 2024
CORPUS LINGUISTICS RESEARCH
Vol.9 No.1
pp.1-12
This study aims to extract grammatical collocation patterns from the learner corpus constructed by the National Institute of the Korean Language (2015–2019) and to investigate the usage patterns of grammatical collocations among Korean language learners. Grammatical collocations are particularly interesting because they involve two or more linguistic units that function as a single, integrated entity and because these units include forms responsible for grammatical functions. In particular, grammatical collocations are not only a challenging aspect for foreign learners of Korean but also a crucial component in language instruction. Therefore, extracting grammatical collocations from a large-scale learner corpus and analyzing their characteristics is of utmost importance. To achieve this, this study extracted grammatical collocations according to proficiency levels from a morphologically analyzed learner corpus consisting of approximately 2.6 million word tokens and examined their characteristics. Furthermore, to highlight the distinctive features of grammatical collocations in the learner corpus, a comparative analysis was conducted with a native Korean corpus. Through this analysis, the study quantitatively and qualitatively examined the usage patterns of grammatical collocations among Korean learners based on their proficiency levels, while also explicitly identifying distinctions between learners' grammatical collocations and those of native speakers.
pp.13-30
This study presents a corpus-processing methodology that lays the foundation for comprehensively observing how learners of Korean use Korean affixes. As basic data for preparing the corpus processing methodology, we will use the
pp.31-50
This study examines the semantic distinction between malda and beorida through corpus analysis. The non-substitutability of the two verbs is largely due to the syntactic and semantic constraints of malda, which are more restrictive than those of beorida. Despite similar conceptual meanings and syntactic combinations, differences emerge in morphological usage and modal nuances perceived by speakers. These findings suggest that malda and beorida form a challenging synonym pair for both teaching and learning, requiring careful semantic analysis. The identified constraints and differences may inform more effective materials for Korean language learners.
pp.51-66
The press not only delivers a wide range of news to the public but also plays a crucial role in shaping public opinion on various issues and situations. Depending on their interests, newspapers may interpret the same issue differently. One of the major topics consistently covered by the South Korean press is North Korea. Since the division of the Korean Peninsula, issues related to North Korea have remained a focal point in South Korean society. This study analyzes and discusses the lexical characteristics of North Korea-related news coverage according to the political orientation of newspapers. Politically charged high-frequency words were selected from both progressive and conservative newspapers. An analysis of the usage examples of these words reveals that progressive and conservative newspapers tend to view the same topic from differing perspectives when reporting on North Korea.
pp.67-89
This study examined the semantic distinctions among the synonymous Korean adverbs gojak (‘고작’), gikkeot (‘기껏’), gyewu (‘겨우’), bulgwa (‘불과’), and gikkeot-haeya (‘기껏해야’) through a quantitative analysis based on co-occurrence data from the Sejong Corpus. Using hierarchical clustering and correspondence analysis, the study visualized the degree of semantic proximity among these adverbs. The hierarchical clustering results show that gikkeot and gikkeot-haeya form the closest semantic cluster, followed by gojak and bulgwa. In contrast, gyewu emerged as a semantically independent adverb, forming a distinct cluster. Correspondence analysis further confirmed these patterns by illustrating that gojak, gikkeot, and gikkeot-haeya are located near the origin and share similar directional vectors, indicating overlapping co-occurrence profiles. Meanwhile, bulgwa and gyewu are clearly separated in different quadrants of the plot, reflecting their distinct semantic and syntactic properties. By integrating co-occurrence patterns with statistical analysis, this study supplements intuition-based and dictionary-driven synonym classifications. The findings affirm that a corpus-based approach is effective in distinguishing subtle semantic differences among synonymous adverbs. Further research is needed to expand this analysis to a wider range of adverbs and to incorporate pragmatic and discourse-level factors into the investigation.
pp.91-110
pp.111-137
This study analyzes cultural biases in major large language models from the United States, South Korea, and China (GPT-4, CLOVA X, and Qwen1.5) through story generation tasks using culture-specific names. Morphological analysis of the generated stories revealed that all models exhibited certain cultural biases. GPT-4 did not show negative biases toward Korean and Chinese cultures but tended to prefer traditional and rural settings when describing these cultures. In contrast, CLOVA X and Qwen1.5, which are specialized for their respective national languages, portrayed their own cultures in modern and positive terms while using a relatively higher proportion of negative adjectives and unrealistic settings when describing Western contexts. These findings are significant because they go beyond the conventional focus on biases in Western-centric models toward non-Western contexts: they show that East Asian-based models can also exhibit similar biases when representing Western cultures. This research suggests that current language models have fundamental limitations in achieving cultural neutrality and highlights the importance of balanced learning and reflection of diverse cultural contexts as a crucial challenge in language model development.
pp.138-154
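The Society's journal, "Corpus Linguistics Research" (코퍼스언어학연구), covers research based on corpora of various languages, including English, Korean, Chinese, Japanese, French, and German. Without being tied to any particular language or research method, it seeks to promote diverse corpus-based research, including experimental, analytical, theoretical, and applied studies. ......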