Project Introduction (BORA)

I. Introduction to the BORA (Bank of Resource for Language and Annotation)

1. Purpose of the BORA

In today’s knowledge- and information-based world, language engineering technology is emerging as a major factor in a nation’s industrial competitiveness, because it enables both the efficient querying and processing of information by machine and the rapid distribution of that information. Language resources are the foundation on which language engineering technology is built, and they are critical for all language-based systems, including machine translation and interpretation, information search engines, Korean language education systems, and guidance systems. For example, a parser is a core program used in most natural language processing systems. Developing a parser requires extracting co-occurrence information from a raw corpus, training a probability model for the morphological analyzer, and extracting further co-occurrence information from a morphologically annotated corpus. As this shows, developing even a single piece of software requires a number of different language resources. In recognition of the importance of language resources, the Bank of Resource for Language and Annotation was established to develop, manage, and distribute such resources in a more organized and far-reaching way.
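
As an illustration of the co-occurrence extraction step mentioned above, the following is a minimal sketch, not the method used by the bank. The function name `cooccurrence_counts`, the `window` parameter, and the toy English corpus are all hypothetical. It also assumes whitespace tokenization, which would not be adequate for Korean text, where a morphological analyzer would normally supply the tokens.

```python
from collections import Counter


def cooccurrence_counts(sentences, window=2):
    """Count how often two tokens occur within `window` positions of each other.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for sentence in sentences:
        # Whitespace tokenization is a simplifying assumption.
        tokens = sentence.split()
        for i, left in enumerate(tokens):
            # Pair the current token with the next `window` tokens.
            for right in tokens[i + 1 : i + 1 + window]:
                # Sort each pair so (a, b) and (b, a) share one counter entry.
                counts[tuple(sorted((left, right)))] += 1
    return counts


# Toy corpus, invented for this example.
corpus = [
    "the bank distributes language resources",
    "the bank maintains language resources",
]
counts = cooccurrence_counts(corpus)
```

Counts of this kind could then feed a statistical model, for example by ranking candidate analyses according to how often their word pairs were observed together in the corpus.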

2. History of the BORA

 Date   Major Developments
 2002   Signed contract on the supply of language resources through ELDA
 2002   Signed contract on the project for supplying language resources through the Korea Science and Engineering Foundation
 2002   Appointed as the International ISO/TC37/SC4 Language Resource Application Management Organization
 2003   Established the Bank of Resource for Language and Annotation. Designated by the State as a special research material bank sponsored by the Korea Science and Engineering Foundation.
 2003   East Asia Forum on Terminology Held
 2003   Conference on Standardizing North and South Korea’s Language Information Industry Held (in Beijing)
 2004   Elected as the host of IJCNLP (International Joint Conference on Natural Language Processing)
 2005   IJCNLP Held (Jeju)
 2005.11.   International ISO/TC37/SC4 Working Group Conference Held (Jeju)
 2006.01.   GWC (Global Wordnet Conference) Held (Jeju)
 2007.05.   Elected as the Directing Center for Frontier Semantic Web Research Centers
 2007.11.   ISWC (International Semantic Web Conference) Held (Busan) – Elected as the Program Chair

3. Special Properties of the BORA

BORA is Korea’s only research center specialized in the construction, management, and distribution of language resources. In recognition of the importance of language resources, substantial state-level interest and support have been directed toward securing, distributing, and continuously upgrading these resources in order to foster information technology development and strengthen national competitiveness.

The bank manages language resources in three steps: 1) securing the resources, 2) continuously maintaining them, and 3) distributing them to researchers.

A. Secure New High-Quality Language Resources

Language resources are the basis of all core information technology. However, because of their scale and specialized nature, it is extremely difficult for an individual researcher to assemble such resources independently. The bank therefore acquires the resources and processes them to meet users’ demands. The scarce resources currently held by the bank include a Korean cursive-script video DB, a proper noun dictionary, a compound noun dictionary, a verb-case frame argument structure dictionary, an adjective syntax dictionary, a high-capacity technology-service resource, the multi-lingual CoreNet, and many others.

B. Continuous Resource Maintenance

Language resources must be upgraded continuously. New words and new knowledge emerge as society changes, so broadening the knowledge base and keeping the resources current are crucial tasks. Moreover, because there are limits to what an individual researcher can achieve in building and maintaining a high-capacity DB, professional management and maintenance by a specialized organization is vitally important. To meet these needs, the bank and its researchers manage the resources efficiently, drawing on more than 10 years of research achievements.

C. Language Resource Distribution

As language resources serve as the basis for core information technology, their distribution helps lay the cornerstone for the development of cutting-edge information technology. The bank shares its resources through a variety of distribution channels while guarding against the leakage of the latest information.
Within Korea, the bank distributes its resources free of charge to students and scholars engaged in the study of natural language processing, and responds to their inquiries and requests in a timely manner, thus energizing the field of research. In addition, by building multi-lingual resources, the bank eliminates the need to import expensive foreign language resources, helping to reduce expenses for researchers.
Our distribution of language resources outside Korea is an opportunity to show the world the advanced level of Korea’s language processing technology, to establish the bank as the source the world relies on for Korean-language resources, and to secure profitability through the provision of differentiated services.

4. The Roles and Functions of the BORA

A. Integrate, collect, manage, and compile into databases the diverse language resources available in Korea and abroad.
B. Secure high-quality language resources by making use of expert knowledge of specialists in the field.
C. Strengthen research and development by distributing the language resources to private researchers, corporations, and research organizations.
D. Enable users to easily access and use the resources by operating a web-based language resource bank.
E. Maximize public awareness of the importance of language resources as research material.
F. Contribute to the acceleration of software development by reducing the time and cost needed for building the resources.
G. Provide customized resources by adjusting and processing language resources to meet user requests.
H. Promote further study on language resources by holding various national and international workshops.

5. Expected Impacts of the BORA

5.1 Completion of Relevant Research Aspects

A. Avoid research duplication by sharing the resources with other domestic research organizations.
B. Raise Korea’s international status by providing high-quality Korean language resources to foreign researchers, demonstrating the advanced level of Korea’s information technology.
C. Standardize the design of language-based systems.
D. Shorten research procedures for private researchers by automatically processing high-capacity language resources through the bank.
E. Make research and strategy planning more efficient by enabling the rapid querying and sharing of the language resources.

5.2 Economic and Technological Aspects

A. Reduce costs through the systematic construction, management, and distribution of language resource information.
B. Prevent overlapping tasks in nationwide data research and analysis projects by constructing a language-related information system.
C. Lay the foundation for internet semantic technology, which will be crucial for future society, through the development of language resource contents adapted to the on-line environment.
D. Increase the added value and competitiveness of natural language processing systems as computers are increasingly used for artificial intelligence applications; the wider use of language resource applications will serve as the basis for the businesses of the future (ubiquitous technology, e-business).
E. Facilitate the revitalization and utilization of the market for the next generation of semantic information technology by providing the foundation for cyber information technology development.

5.3 National Aspects

A. Secure infrastructure for the development of language information processing systems.
B. Enable the searching and utilization of a diverse range of information through developed software tools.
C. Help ordinary users to enjoy internet culture more conveniently by promoting natural language-based internet technology.
D. Provide cutting-edge text-mining technologies to the areas of industry where they are required, such as bioinformatics and medical informatics.