
Sub Centers



 

1. What is Korea Terminology Research Center for Language and Knowledge Engineering (KORTERM)?

Korea Terminology Research Center for Language and Knowledge Engineering (KORTERM) was established in 1998 upon the authorization of the Ministry of Culture and Tourism. The center began the terminology standardization project in 1998, and completed it in 2007.

The terminology standardization project was conducted to resolve the confusion caused by inconsistent use of terminology. Today a single term may denote different things in different fields, and, as a consequence of information overload, technical terms are increasingly used in private, idiosyncratic ways. The project investigated how terms are actually used in corpora from various specialized fields in order to establish a list of terms and to derive their standard forms.

Before collecting terminology, the center obtained agreements from the associations representing each discipline; it then collected the terms and put them through a refinement process. The center completed the terminology standardization project in consultation with the terminology committees of the relevant disciplines.

2. Purpose of the Research

Under the new Chapter 17 of the Fundamental Law for Korean Language adopted in 2005 ("On the standardization of technical terms, etc."), "the state must standardize, systematize, and distribute terminologies used in each discipline to enable the citizens to easily understand and use them conveniently."

This research, “Terminology Management (Terminology Standardization-based Project)”, was undertaken with the following objectives:
  • First: Organize information on the terminology used in each specialized discipline, which has grown explosively in the knowledge and information era and is increasingly stored in database form. Above all, compile a multilingual (Korean and English) terminology list for each specialized area.
  • Second: Carry out the standardization project to resolve the confusion in terminology caused by inconsistent and idiosyncratic usage following the information overflow.
  • Third: Compile a list of terms by investigating their actual use in specialized corpora, and construct a management model that can generate standard forms of the terms.
  • Fourth: Contribute to terminology management and to the circulation of standardized terminology by developing, and offering as a service, an integrated search system for cross-searching the terminology already compiled, together with tools that provide other terminology-related information.
  • Fifth: Raise social awareness of terminology management by drawing up work plans that comply with ISO international standards; through scholarly research; by gathering opinions, exchanging information, and presenting methodologies at symposiums and forums; and through education and publicity using various publications.
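
The third objective (investigating actual use in corpora, then deriving standard forms) can be illustrated with a small sketch. This is not the center's management model, which the text does not describe; it only assumes, for illustration, that competing variants of a term have already been grouped under a concept label and that the most frequently attested variant is put forward as a candidate for the discipline's terminology committee to review. The function name, the concept label, and the toy data are all hypothetical.

```python
from collections import Counter

def candidate_standard_forms(variant_groups, corpus_tokens):
    """For each concept, propose the most frequently attested variant.

    variant_groups: dict mapping a concept label to a list of competing forms
    corpus_tokens:  list of tokens from a specialized corpus
    Concepts with no attested variant are left out of the result.
    """
    freq = Counter(corpus_tokens)
    candidates = {}
    for concept, variants in variant_groups.items():
        attested = [v for v in variants if freq[v] > 0]
        if attested:
            # Frequency is only a starting point; a committee would still decide.
            candidates[concept] = max(attested, key=lambda v: freq[v])
    return candidates

# Hypothetical toy data, not taken from any KORTERM corpus.
variant_groups = {
    "DATABASE": ["database", "data-base", "databank"],
    "UNSEEN": ["no-such-term"],
}
corpus_tokens = "the database stores terms ; a data-base or database may be shared".split()

result = candidate_standard_forms(variant_groups, corpus_tokens)
print(result)
```

Picking by raw frequency is the simplest possible rule; a fuller model would also weigh the source, the field, and the committee's judgment.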



3. Center History

     Date  Content
     1998.08  Authorized by the Ministry of Culture and Tourism
     1998.12.04 ~ 05  2nd Forum on EAFTerm Held

     1998.12.07 ~ 10  Tutorial on Terminology
     1998.12.12  1st Terminology Language Engineering Symposium Held
     1999.07 ~ 08  KORTERM Summer Colloquium '99
     1999.08  First Issue of “Terminology News” Published
     1999.09.10  KORTERM Seminar: "Terminology and Knowledge Representation: application to the translation in medical domain" (Prof. Henry Zingle, Univ. of Nice-LILLA)
     1999.11.05  NLPRS'99 Workshop MAL'99 (Multi-lingual Information Processing and Asian Language processing)
     1999.11.19  The 1st International Roundtable on Terminology Held
     1999.11.20  2nd Terminology Language Engineering Symposium Held
     1999.12  Appointed as the ISO/TC37 Terminology Standardization Management Organization
     2000.02  “Terminology Studies 1” Published
     2000.04  KORTERM Yanbian Research Institute Established (KORTERMYUST)
     2000.05.29  WTRC2000 (LREC2000 workshop) Held
     2000.07 ~ 08  KORTERM Summer Colloquium 2000 Held
     2000.11  "Terminology Studies 2" Published
     2000.12.09  The 2nd International Roundtable on Terminology Held
     2001.10.19  “High-Capacity Voice (Sound)/Language/Image DB Construction and Standardization” Project Presentation
     2001.11.24  4th Terminology Language Engineering Symposium Held
     2002.02.03 ~ 05  ISO/TC37/SC4 Preliminary Meeting Held
     2002.05.27 ~ 28  ISO/TC37/SC4 Conference Held; “International Standard for Terminology and Language Resource Management” Workshop Held


    4. Steps and Contents of the Terminology Standardization Project

    Step 1 (1998 ~ 2000)
    Research Objective: Constructing the Development Environment and Compiling Foundational Data
    Research Contents:
    - Compile terminologies (in Economics and National Sciences)
    - Compose a list of the terminologies in 3 languages
    - Manage and standardize the system of the terminology development environment
    - Formalize terminologies and expand the educational base

    Step 2 (2001 ~ 2003)
    Research Objective: Expanded Compilation of Terminologies and Practical Application
    Research Contents:
    - Compile terminologies (in Economics and National Sciences)
    - Compose a list of the terminologies in 3 languages
    - Expand and maintain terminologies in national sciences and technology
    - Broaden the areas of terminologies, and upgrade their quality
    - Strengthen applicability with the language industry
    - Verify high reliability, and distribute

    Step 3 (2004 ~ 2007)
    Research Objective: Developing the Basis for Automating the Development and Maintenance of Terminologies
    Research Contents:
    - Compile terminologies (in Technology and Engineering)
    - Expand and maintain science and technology terminologies
    - Complete a large-scale integrated database for the terminologies in science and technology
    - Complete a blueprint of terminology education
    - Develop application products for the development and maintenance of the terminologies

    Step 4 (2008 ~ )
    Research Objective: Maintenance and Expansion
    Research Contents:
    - Localize the terminologies
    - Distribute the integrated terminology database
    - Proceed with continuous expansion and maintenance of terminologies


    5. Project Introduction

    For more detailed information on the Terminology Standardization Project, go to [Research] --> [Major Projects] --> [KORTERM]

    1. What is the Bank of Resource for language and Annotation (BORA)?

    The Bank of Resource for language and Annotation (BORA) is a special research materials bank. The Korea Science and Engineering Foundation supports such banks so that researchers can efficiently obtain research material whose management and distribution it recognizes as requiring state-level sponsorship. Under this designation, BORA was established to extensively and systematically develop, manage, and distribute the language resources essential to developing the language and knowledge infrastructure software that will underpin the coming information era.

    A language resource bank, as the name suggests, provides language resources. These still-unfamiliar resources include raw (primitive) and annotated (analysis) corpora of spoken and written language, all kinds of electronic dictionaries, wordnets, ontologies, and other language-related resources, processed and developed in diverse forms from the utterances and texts that people produce when they use language.

    2. Purpose of BORA

    In today’s world of knowledge and information, language engineering technology, which enables efficient search and processing of information through the machine processing of language and speeds up the distribution of information, is emerging as a factor in a nation’s industrial competitiveness. Language resources are infrastructure essential to the development of language engineering technology, and serve as an important resource for language-based systems such as machine translation and interpretation, information search, Korean language education systems, and guide systems. For example, to develop a syntax analyzer, an essential component of most natural language processing systems, co-occurrence information is extracted from a primitive corpus to train a morphological analyzer probability model, and co-occurrence information extracted from a morphologically analyzed corpus must also be used. A number of different language resources are thus required to develop a single piece of software. Recognizing the importance of language resources, the Bank of Resources for Language and Annotation was established to develop, manage, and distribute these resources systematically and extensively.

    1) Uses of Language Resources
    Language resources are an essential resource for all language-based systems, such as machine translation and interpretation, information search, Korean language education systems, and guide systems. For example, to develop a syntax analyzer, an essential component of most natural language processing systems, co-occurrence information is extracted from a primitive corpus and used to train a morphological analyzer probability model, and additional co-occurrence information is extracted from a morphologically analyzed corpus and used as well. A number of language resources are thus required to develop a single piece of software.
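
    As a rough illustration of what "extracting co-occurrence information from a corpus" can mean, the sketch below counts how often two words appear within a small window of each other. It is a minimal example under stated assumptions, not the Bank's actual tooling: the function name, the window size, and the toy corpus are hypothetical, and whitespace tokenization is a simplification (Korean text would normally be segmented by a morphological analyzer first).

```python
from collections import Counter

def cooccurrence_counts(sentences, window=2):
    """Count unordered word pairs that occur within `window` tokens of each other."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()  # simplification: whitespace tokenization
        for i, left in enumerate(tokens):
            # Pair the current token with the next `window` tokens only.
            for right in tokens[i + 1 : i + 1 + window]:
                counts[tuple(sorted((left, right)))] += 1
    return counts

# Hypothetical two-sentence corpus for illustration.
toy_corpus = [
    "the bank holds language resources",
    "the bank distributes language resources",
]
counts = cooccurrence_counts(toy_corpus, window=2)
print(counts[("language", "resources")])
```

    Counts of this kind are the raw material from which probability models are later estimated; the same loop could run over morpheme tags instead of surface words when the input is a morphologically analyzed corpus.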

    2) Types of Resources Available
    For Korean, English, Chinese, Japanese, and other languages, the Bank currently holds primitive and analytic corpora, electronic dictionaries divided according to parts of speech, specialized dictionaries for proper nouns, compound words, and case frames, and various natural language processing software based on these resources. Most notably, the recently created basic Korean lexicon list and the concept-based multilingual wordnet are advanced language resources that meet global needs. These resources are not finished products, however; they require constant improvement and maintenance to increase their value.

    3. History of the BORA

     Date Major Developments  
     2002   Signed Contract regarding Supply of Language Resource through the ELDA
     2002   Signed Contract regarding the Project on Supplying Language Resource through Korea Science and Engineering Foundation
     2002   Appointed as the International ISO/TC37/SC4 Language Resource Application Management Organization
     2003   Established Bank of Resources for Language and Annotation. Designated by the State as a special research material bank sponsored by Korea Science and Engineering Foundation.
     2003   East Asia Forum on Terminology Held
     2003   Conference on Standardizing North and South Korea’s Language Information Industry Held (in Beijing)
     2004   Elected as the host of IJCNLP (International Joint Conference on Natural Language Processing)
     2005   IJCNLP Held (Jeju)
     2005.11.   International ISO/TC37 SC4 Work Group Conference Held (Jeju)
     2006.01.   GWC (Global Wordnet Conference) Held (Jeju)
     2007.05.   Elected as the Directing Center for Frontier Semantic Web Research Centers
     2007.11.   ISWC (International Semantic Web Conference) Held (Busan); Elected as the Program Chair


    4. Project Introduction

    For more detailed information on the Bank of Resource for language and Annotation, go to [Research] --> [Major Projects] --> [BORA]


    Developing an Artificial Brain with Knowledge Learning and Language Ability

    Founded in 1998, KAIST’s Cognitive Information Lab is committed to comprehending human intellectual activities and developing an information system that will enable computers to assist in human learning activities. The Lab’s research objectives are to learn about human intelligence and its knowledge processing mechanisms; to develop knowledge information processing technology that functions similarly to the manner in which humans think, based on a brain information processing mechanism; and to study the core foundational technology for the development of an artificial brain.

    With the development of artificial brains, humans would be able to leave relatively simple knowledge and information processing tasks to artificial brains, and focus on creative activities in the arts and sciences. Because this development would enable more intelligent knowledge information systems that can infer relationships between pieces of learned information, it would also make possible conversational intelligent software agents based on an integrated model of human abilities. These could include, for example, question-answer assistants, lecture preparation assistants, essay and report graders, and other knowledge information processing applications that intelligently infer relationships between pieces of information.

    To make the artificial brain a reality, an Autonomous Mental Development model, which would enable machines to acquire knowledge automatically, is necessary. Based on the brain’s information processing mechanism, an Autonomous Mental Development model learns artificial knowledge ranging from unstructured, illogical situation data to structured, logical knowledge information. This development process both requires and produces a number of language resources.

    To learn from both unstructured, illogical language resources and structured, logical language resources, the model must understand those resources, for example by recognizing concepts and the relations between them. The Autonomous Mental Development model learns new knowledge and information, including knowledge of concept relations, definitions of newly recognized terminology, and relations between sentences, through causality inference, hierarchy inference, and other inference processes based on language understanding. This new hierarchical knowledge and information is in turn stored by the artificial brain’s memory model, which is based on the brain’s information processing, and is used in developing active learning systems, comprehending newly input information, and building artificial intelligence applications.

    Research Area
    - Human Languages Engineering
    Give computers language ability so that they can communicate with businesses and people
    Meaning-Unit Comprehension: Morphological Analysis, Terminology Recognition, Foreign Word Recognition, Named Entity Recognition, Word Sense Disambiguation
    Dependence Structure Comprehension: Syntactic Analysis, Dependency Structure Analysis, Noun Phrase/Verb Phrase Recognition
    Concept Relations Comprehension: Hierarchy Relation, Polysemy Relation, Causality Relation
    Application: Knowledge Learning and Extraction, Information Search, Document Classification


    - Artificial Brain
    Study a superior computer information-processing model that imitates the information-processing model of the human brain
    Causality Inference: Inference of Causality Relations between Terms, Inference of Causality Relations between Sentences
    Hierarchy Relation Inference: Terminology Hierarchy Relation Inference, Specificity Comparison, Automatic Ontology Construction, Ontology Mapping
    Knowledge Expression: Concept Relations Expression, Knowledge Acquisition and Inference
    Application: Knowledge-based Question Answering, Intelligent Agents


    - Knowledge Mining
    Summarizing, Grading Essays, Student-tailored Education, etc.

    - Information Search and Multimedia Search

    Application of Language Processing in Brain Science
    - Knowledge Accumulation : Resource -> Artificial Knowledge
    Research Area: Terminology Extraction, Comprehension of Relationships Between Events, Study of Knowledge Expression

    - Knowledge Search: Knowledge Exploration: Knowledge Search and Inference
    Research Area: High-capacity Data Search, Meaning Comprehension, Information Search, Causality Inference

    - Answer Creation: Resource -> Language Expression
    Research Area: Definition and Explanation Formation, Video-clip Generation, Automatic Abstraction, PowerPoint Generation