City University of Hong Kong
Department of Chinese, Translation and Linguistics
Research Degree Forum
Word Frequency Distribution for Electronic English Learner’s Dictionaries: Based on BNC XML
Presented by
Mr. LI Hanhong
PhD candidate, Department of Chinese, Translation and Linguistics, City University of Hong Kong
Date: 24 Nov 2009, Tuesday
Time: 4:30 - 5:30pm
Venue: B7603 (Lift 3, 7/F, Blue Zone), Academic Building, CityU
Abstract
Word frequency information has become an indispensable part of electronic English learner’s dictionaries. A review of the five current major electronic English learner’s dictionaries shows that their word frequency information is based mainly on raw frequency counts and does not consider how word frequency is distributed across different text genres. With the further development of modern corpora and of studies in text genres, a new view of word frequency information is available to dictionary users: word frequency distribution across genres. Our current research attempts to present word frequency information based on the existing genre tagging in the British National Corpus XML Edition (BNC XML 2007). By reorganizing the genres in BNC XML, we explore the word frequency distribution across the written genres (academic prose, non-academic prose, biography, fiction, news, public, religion, pops, and commerce) and the spoken genres (conversation, education, business, public, and leisure). Using the genre tagging in BNC XML, we produce an electronic English Frequency Dictionary containing frequency information across different genres and propose it for future electronic learner’s dictionaries. The distribution of word frequency across genres can help EFL learners understand the register and cultural implications of English words, and even better capture the differences between English synonyms. Further research shows that, when selecting core words for EFL learners, combining word frequency with its distribution across genres achieves better coverage than raw frequency alone.
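To illustrate the kind of per-genre frequency information described above, a minimal Python sketch follows. It is only an illustration under stated assumptions: the genre sizes and counts are invented toy values (not BNC figures), the function names are hypothetical, and it is not presented as the speaker's actual method.

```python
from typing import Dict


def per_million(counts: Dict[str, int], genre_sizes: Dict[str, int]) -> Dict[str, float]:
    """Normalise raw counts to occurrences per million words in each genre."""
    return {
        genre: counts.get(genre, 0) * 1_000_000 / size
        for genre, size in genre_sizes.items()
    }


def genre_range(counts: Dict[str, int], min_count: int = 1) -> int:
    """Number of genres in which the word occurs at least min_count times."""
    return sum(1 for c in counts.values() if c >= min_count)


# Toy values for illustration only (assumed, not taken from the BNC).
genre_sizes = {"academic": 2_000_000, "fiction": 4_000_000, "conversation": 1_000_000}
counts = {"academic": 400, "fiction": 40, "conversation": 0}

profile = per_million(counts, genre_sizes)
spread = genre_range(counts)

for genre, freq in profile.items():
    print(f"{genre}: {freq:.1f} per million words")
print(f"occurs in {spread} of {len(genre_sizes)} genres")
```

A word with a high per-million figure in one genre and near zero elsewhere would be marked as genre-specific, whereas a word spread evenly across genres would be a stronger candidate for a core vocabulary list.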
Speaker
Mr. LI Hanhong is currently a PhD candidate in the Department of Chinese, Translation and Linguistics. His research interests mainly include corpus linguistics, lexicography, SLA, and core vocabulary research.
~ CTL Staff and Research Degree Students only ~