ISO/DIS 24617-15

Language resource management — Semantic annotation framework (SemAF) — Part 15: Measurable quantitative information extraction (MQIE)
Measurable quantitative information (MQI) describes one of the basic properties associated with the magnitude aspect of quantity, and it is very common in ordinary language. The main characteristic of MQI, as described in ISO 24617-11[1], is that quantitative information is presented as measures, each expressed as a pair of a numerically expressed quantity and a unit. Such information is even more abundant in scientific publications and technical reports, to the extent that it constitutes an essential part of the communicative segments of language in general. Processing such information is thus required for any successful language resource management.

This proposal aims to develop a general but mandatory framework to extract MQI according to the metamodel defined in ISO 24617-11. It specifies a framework for Measurable Quantitative Information Extraction (MQIE), consisting of the identification of basic elements (entities, numerals, units, and relators) and links (measure links and comparison links). For example, from the sentence “Body mass index must between 20-40kg/m”, it extracts the entity “Body mass index”, the numerals “20” and “40”, the unit “kg/m”, and the relator “between”, as well as the measure link between the entity and the measure and the comparison link between the measures. The framework consists of five mandatory steps: 1) pre-processing, 2) basic element identification, 3) link identification, 4) measure normalization, and 5) verification and filtering.

MQIE has three main features. First, it provides a general guideline for identifying and extracting MQI in order to carry out the IR and NLP tasks needed by different applications. Second, it is compatible with ISO 24617-11 MQI in concept definition, element definition, and representation. Third, it outputs flexible and open representation formats that allow for various types of tasks conforming to non-ISO standards, guidelines, or recommendations, as long as these are compatible with the ISO language resource management (LRM) standards. MQIE is designed to be widely applied in information retrieval (IR) and natural language processing (NLP) tasks for the identification, aggregation, computation, and analysis of MQI. The choice of such tasks depends on the type of application and use case.
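As an informal illustration of the element and link types named in the abstract, the following Python sketch extracts them from the example sentence with a single regular expression. It is not the method defined by the draft standard: the pattern, the `Measure` and `Extraction` classes, the `extract_mqi` function, and the decision to attach one measure link per numeral are all assumptions made for this example, and the pattern only handles range expressions shaped like the one quoted above.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern for "<entity> must [be] between <low>-<high><unit>".
# It covers only range expressions shaped like the abstract's example.
RANGE_PATTERN = re.compile(
    r"^(?P<entity>.+?)\s+must\s+(?:be\s+)?(?P<relator>between)\s+"
    r"(?P<low>\d+(?:\.\d+)?)\s*-\s*(?P<high>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>[^\s\d]\S*)$"
)


@dataclass
class Measure:
    numeral: str  # numeral exactly as written in the text
    unit: str     # unit exactly as written in the text


@dataclass
class Extraction:
    entity: str
    relator: str
    measures: list
    measure_links: list     # (entity, index into measures)
    comparison_links: list  # (relator, measure index, measure index)


def extract_mqi(sentence):
    """Return an Extraction for a matching sentence, or None otherwise."""
    # Drop surrounding whitespace and a trailing full stop before matching.
    match = RANGE_PATTERN.match(sentence.strip().rstrip("."))
    if match is None:
        return None
    entity = match.group("entity")
    unit = match.group("unit")
    measures = [
        Measure(match.group("low"), unit),
        Measure(match.group("high"), unit),
    ]
    # Assumption: the entity is linked to each bound of the range.
    measure_links = [(entity, index) for index in range(len(measures))]
    # The relator connects the two measures that bound the range.
    comparison_links = [(match.group("relator"), 0, 1)]
    return Extraction(
        entity, match.group("relator"), measures, measure_links, comparison_links
    )


result = extract_mqi("Body mass index must between 20-40kg/m")
print(result)
```

Steps 1, 4, and 5 of the framework (pre-processing, measure normalization, and verification and filtering) are outside the scope of this sketch; a sentence that does not fit the pattern simply yields `None`.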
SDO:
ISO
Language:
English
ICS Code(s):
01.020
Status:
Draft
Publication Date:
1969-12-30
Standard Number:
ISO/DIS 24617-15