College of Liberal Arts and Social Sciences

Department of Linguistics and Translation

翻譯及語言學系

Seminar: Gu Yueguo on 19th December 2002 (Thursday)

Posted - 16/12/2002 : 13:53:02

Gu Yueguo

Department of Chinese, Translation and Linguistics

Institute of Chinese Linguistics

Language Information Sciences Research Centre

Seminar

by

Gu Yueguo

Chinese Academy of Social Sciences

guyueguo@vip.sina.com

Segmenting and Annotating Spoken Corpus of Situated Discourse

Time: 4:30 - 6:00 pm

Date: 19th December 2002 (Thursday)

Venue: B7603 (CTL Multi-purpose Room), City University of Hong Kong

Abstract

This talk discusses research into segmenting and annotating situated discourse, with special reference to Chinese. At the time of this talk, a spoken Chinese corpus of situated discourse, under the auspices of the Chinese Academy of Social Sciences, has already reached 650 hours of audio and 150 hours of video recordings, with orthographic transcripts of about 21,000,000 characters. Situated discourse refers to spontaneous, naturally occurring discourse with the following features of situatedness:
- It is situated to an actual social situation;
- It is situated to actual users;
- It is situated to an inter-subjective world of discourse;
- It is situated to social activities with actual goals;
- It is situated to spatial and temporal setting;
- It is situated to the cognitive capacity of actual users;
- It is situated to performance contingencies of actual users who are engaged in spontaneous talking with little pre-planning.
It is no small undertaking to segment and annotate such a corpus. The project team working on the corpus is currently divided into two groups, one research-oriented and the other application-oriented. The research-oriented group focuses on segmenting sound and video streams into chunks and annotating the chunks in terms of pragmatics and discourse analysis. The application-oriented group, on the other hand, attempts detailed phonetic/acoustic annotations of small samples (about 15 hours), with the goal of providing the project partner team under the Chinese Academy of Sciences with acoustic information for speech synthesis and production.
The talk will therefore address:
- issues pertinent to segmenting sound and video streams into chunks;
- issues pertinent to annotating the chunks pragmatically;
- issues pertinent to annotating the chunks discourse-wise;
- issues pertinent to phonetically/acoustically transcribing, in machine-readable codes, naturally occurring spontaneous discourse rather than read speech recorded under ideal studio conditions;
- computer programs developed or used for the segmenting and annotating purposes.
The talk will be delivered in English.
Enquiries: 2788-8705

All Are Welcome

Enquiry: LTenquiry@cityu.edu.hk