Topic: Seminar 學術講座: Gu Yueguo on 19th December 2002 (Thursday)
Posted - 16/12/2002 : 13:53:02
Gu Yueguo
Department of Chinese, Translation and Linguistics
Institute of Chinese Linguistics
Language Information Sciences Research Centre
Seminar
by
Gu Yueguo
Chinese Academy of Social Sciences
guyueguo@vip.sina.com
Segmenting and Annotating Spoken Corpus of Situated Discourse
Time: 4:30 - 6:00 pm
Date: 19th December 2002 (Thursday)
Venue: B7603 (CTL Multi-purpose Room), City University of Hong Kong
Abstract
This talk discusses research into segmenting and annotating situated
discourse, with special reference to Chinese. By the time of this talk,
a spoken Chinese corpus of situated discourse, built under the auspices of
the Chinese Academy of Social Sciences, has reached 650 hours of audio and
150 hours of video recordings, with orthographic transcripts of about
21,000,000 characters. Situated discourse refers to spontaneous,
naturally occurring discourse with the following features of situatedness:
[*]It is situated to an actual social situation;
[*]It is situated to actual users;
[*]It is situated to an inter-subjective world of discourse;
[*]It is situated to social activities with actual goals;
[*]It is situated to spatial and temporal setting;
[*]It is situated to the cognitive capacity of actual users;
[*]It is situated to performance contingencies of actual users who are
engaged in spontaneous talking with little pre-planning.
It is no small undertaking to segment and annotate such a corpus.
Currently the project team working on the corpus is divided into two groups,
with one being research-oriented, and the other application-oriented. The
research-oriented group focuses on segmenting sound and video
streams into chunks and annotating the chunks in terms of pragmatics and
discourse analysis. The application-oriented group, on the other hand,
undertakes detailed phonetic/acoustic annotation of small samples
(about 15 hours), with the goal of supplying the project's partner team under
the Chinese Academy of Sciences with acoustic information for speech
synthesis and production.
The talk will therefore address:
[*]issues pertinent to segmenting sound and video streams into chunks;
[*]issues pertinent to annotating the chunks pragmatically;
[*]issues pertinent to annotating the chunks discourse-wise;
[*]issues pertinent to phonetically/acoustically transcribing, in
machine-readable codes, naturally occurring spontaneous discourse rather
than read speech recorded under ideal studio conditions;
[*]computer programs developed or used for the segmenting and annotating
purposes.
The talk will be delivered in English.
Enquiries: 2788-8705
All Are Welcome
Enquiry: LTenquiry@cityu.edu.hk