ALL ISSUES
A Study of Grammatical Collocation Constructions Based on a Learner Corpus
CORPUS LINGUISTICS RESEARCH :: Vol.9 No.1 pp.1-12
Abstract: This study extracts grammatical collocation patterns from the learner corpus constructed by the National Institute of Korean Language (2015–2019) and investigates how Korean language learners use grammatical collocations. Grammatical collocations are of particular interest because they involve two or more linguistic units that function as a single, integrated entity and because these units include forms that carry grammatical functions. They are not only a challenging area for foreign learners of Korean but also a crucial component of language instruction, so extracting them from a large-scale learner corpus and analyzing their characteristics is important. To this end, the study extracted grammatical collocations by proficiency level from a morphologically analyzed learner corpus of approximately 2.6 million word tokens and examined their characteristics. To highlight the distinctive features of grammatical collocations in the learner corpus, a comparative analysis was also conducted against a native Korean corpus. Through this analysis, the study quantitatively and qualitatively examined learners' use of grammatical collocations by proficiency level and explicitly identified how learners' grammatical collocations differ from those of native speakers.
A Study on Data Processing Methodology for Analyzing Korean Learners' Use of Affixes
CORPUS LINGUISTICS RESEARCH :: Vol.9 No.1 pp.13-30
Abstract: The purpose of this study is to present a methodology for processing corpus data in order to lay the foundation for comprehensively observing how Korean learners use Korean affixes. As basic data for preparing the corpus processing methodology, we will use the
A Study on Lexical Differentiation of the Auxiliary Verbs -eo beorida (‘-어 버리다’) and -go malda (‘-고 말다’) for Vocabulary Education
CORPUS LINGUISTICS RESEARCH :: Vol.9 No.1 pp.31-50
Abstract: This study examines the semantic distinction between malda and beorida through corpus analysis. The two verbs cannot be freely substituted for each other, largely because the syntactic and semantic constraints on malda are more restrictive than those on beorida. Despite similar conceptual meanings and syntactic combinations, differences emerge in morphological usage and in the modal nuances perceived by speakers. These findings suggest that malda and beorida form a challenging synonym pair for both teaching and learning, requiring careful semantic analysis. The identified constraints and differences may inform more effective materials for Korean language learners.
A Study of Vocabulary in North Korea-Related News Coverage According to the Political Orientation of Newspapers
CORPUS LINGUISTICS RESEARCH :: Vol.9 No.1 pp.51-66
Abstract: The press not only delivers a wide range of news to the public but also plays a crucial role in shaping public opinion on various issues and situations. Depending on their interests, newspapers may interpret the same issue differently. One of the major topics consistently covered by the South Korean press is North Korea. Since the division of the Korean Peninsula, issues related to North Korea have remained a focal point in South Korean society. This study analyzes and discusses the lexical characteristics of North Korea-related news coverage according to the political orientation of newspapers. Politically charged high-frequency words were selected from both progressive and conservative newspapers. An analysis of usage examples of these words reveals that progressive and conservative newspapers tend to view the same topic from differing perspectives when reporting on North Korea.
A Study on Distinguishing Synonymous Adverbs Using Co-occurring Words: Focusing on gojak (‘고작’), gikkeot (‘기껏’), gyewu (‘겨우’), bulgwa (‘불과’), and gikkeot-haeya (‘기껏해야’)
CORPUS LINGUISTICS RESEARCH :: Vol.9 No.1 pp.67-89
Abstract: This study examined the semantic distinctions among the synonymous Korean adverbs gojak (‘고작’), gikkeot (‘기껏’), gyewu (‘겨우’), bulgwa (‘불과’), and gikkeot-haeya (‘기껏해야’) through a quantitative analysis based on co-occurrence data from the Sejong Corpus. Using hierarchical clustering and correspondence analysis, the study visualized the degree of semantic proximity among these adverbs. The hierarchical clustering results show that gikkeot and gikkeot-haeya form the closest semantic cluster, followed by gojak and bulgwa. In contrast, gyewu emerged as a semantically independent adverb, forming a distinct cluster. Correspondence analysis further confirmed these patterns by illustrating that gojak, gikkeot, and gikkeot-haeya are located near the origin and share similar directional vectors, indicating overlapping co-occurrence profiles. Meanwhile, bulgwa and gyewu are clearly separated in different quadrants of the plot, reflecting their distinct semantic and syntactic properties. By integrating co-occurrence patterns with statistical analysis, this study supplements intuition-based and dictionary-driven synonym classifications. The findings affirm that a corpus-based approach is effective in distinguishing subtle semantic differences among synonymous adverbs. Further research is needed to expand this analysis to a wider range of adverbs and to incorporate pragmatic and discourse-level factors into the investigation.
A Study on the Detailed Classification of Korean Four-Character Idioms Based on the Sejong Corpus
CORPUS LINGUISTICS RESEARCH :: Vol.9 No.1 pp.91-110
Measuring Cultural Bias in Large Language Models
CORPUS LINGUISTICS RESEARCH :: Vol.9 No.1 pp.111-137
Abstract: This study analyzes cultural biases in major large language models from the United States, South Korea, and China (GPT-4, CLOVA X, and Qwen1.5) through story generation tasks using culture-specific names. Morphological analysis of the generated stories revealed that all models exhibited certain cultural biases. GPT-4 did not show negative biases toward Korean and Chinese cultures but tended to prefer traditional and rural settings when describing these cultures. In contrast, CLOVA X and Qwen1.5, which are specialized for their respective national languages, portrayed their own cultures in modern and positive terms while using a relatively higher proportion of negative adjectives and unrealistic settings when describing Western contexts. These findings are significant because they go beyond the conventional focus on the biases of Western-centric models toward non-Western contexts: they show that East Asian-based models can exhibit similar biases when representing Western cultures. This research suggests that current language models have fundamental limitations in achieving cultural neutrality and highlights balanced training on, and reflection of, diverse cultural contexts as a crucial challenge in language model development.
Bylaws of the Korean Association for Corpus Linguistics and Other Regulations
CORPUS LINGUISTICS RESEARCH :: Vol.9 No.1 pp.138-154
An Analysis of Narrative Characteristics through Corpus Analysis: A Comparison of Byeong-mo Gu's ‘A Spoonful of Time’ and ‘To the Ivory Gate’
CORPUS LINGUISTICS RESEARCH :: Vol.8 No.2 pp.1-14
Abstract: This paper analyzes the narrative characteristics of a writer by comparing two literary works by the same author. Two works by Byeong-mo Gu were chosen for this purpose: ‘A Spoonful of Time’ and ‘To the Ivory Gate’. A corpus was compiled from these two texts, and all words were POS-tagged. AntConc was then used to analyze the corpus data. Three types of linguistic factors were examined: high-frequency words, n-grams, and pronouns. The analysis revealed the following: (i) the high-frequency words reflected the subject matter of each work, (ii) the n-gram analysis foregrounded the atmosphere the author intended for each work, and (iii) pronouns were rarely used to refer to characters. Although corpus analysis proved valid in some respects for analyzing literary works, the paper recommends constructing a database of literary works organized by period and conducting further studies based on that database.
A Comparative Analysis of Syntactic Complexity between Scholars and AI-based Machine Translation Systems
CORPUS LINGUISTICS RESEARCH :: Vol.8 No.2 pp.15-37
Abstract: This study compares syntactic complexity in English prose between the writings of Chinese scholars and the corresponding translations generated by AI-based machine translation systems. A corpus of 100 English abstracts written by Chinese scholars and 300 English abstracts translated by ChatGPT 4.0, Google Bard and Microsoft Bing was constructed. These texts were analysed using 14 measures of syntactic complexity as defined by the L2 Syntactic Complexity Analyzer (Lu, 2010). The analysis revealed that when comparing the original Chinese-English texts with the outputs of machine translation systems, significant differences were found in 13 of the 14 syntactic measures. When the translations from ChatGPT 4.0, Bard and Bing were compared with one another, significant differences were found in 10 of the 14 measures. This research advances the understanding of machine translation systems and has relevant implications for pedagogy and assessment in the field.