Abstract

Formal Concept Analysis is a symbolic learning technique derived from algebra and order theory. The technique has been applied to a broad range of knowledge representation and exploration tasks in a number of domains. Most recorded applications of Formal Concept Analysis deal with a small number of objects and attributes, in which case the complexity of the algorithms used for indexing and retrieving data is not a significant issue. However, when Formal Concept Analysis is applied to the exploration of large numbers of objects and attributes, the size of the data makes issues of complexity and scalability crucial.

This paper presents the results of experiments carried out with a set of 4,000 medical discharge summaries in which 1,962 attributes from the Unified Medical Language System (UMLS) were recognized. In this domain, the objects are the medical documents (4,000) and the attributes are the UMLS terms extracted from the documents (1,962). When Formal Concept Analysis is used to iteratively analyze and visualize these data, complexity and scalability become critically important.

Although the amount of data used in this experiment is small compared with the size of primary memory in modern computers, the results are still important because the probability distributions that determine the efficiencies are likely to remain stable as the size of the data is increased.

Our work presents two outcomes. First, we present a methodology for exploring knowledge in text documents using Formal Concept Analysis by employing conceptual scales created as the result of direct manipulation of a line diagram. The conceptual scales lead to small derived purified contexts that are represented using nested line diagrams. Second, we present an algorithm for the fast determination of purified contexts from a compressed representation of the large formal context. Our work draws on existing encoding and compression techniques to show how rudimentary data analysis can lead to substantial efficiency improvements in knowledge visualization.
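To make the idea of deriving a small purified context from a large formal context concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes that each document is stored as an integer bitmask over the attributes, that a conceptual scale is simply a chosen subset of attributes, and that "purified" means merging objects whose rows become identical once restricted to the scale. The function names, document identifiers, and attribute names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def derive_purified_context(context, attribute_index, scale):
    """Restrict a formal context to the attributes of a scale and merge duplicates.

    context:         dict mapping object id -> int bitmask of its attributes
    attribute_index: dict mapping attribute name -> bit position
    scale:           iterable of attribute names that form the scale
    Returns a dict mapping each distinct restricted row (as a bitmask)
    to the list of objects that share it.
    """
    # Build one mask covering every attribute in the scale.
    mask = 0
    for name in scale:
        mask |= 1 << attribute_index[name]

    # A single bitwise AND per object yields its row in the derived context.
    groups = defaultdict(list)
    for obj, row in context.items():
        groups[row & mask].append(obj)
    return dict(groups)


def decode(row, attribute_index):
    """Translate a bitmask row back into a sorted list of attribute names."""
    return sorted(
        name for name, bit in attribute_index.items() if (row >> bit) & 1
    )


# Hypothetical toy data: four documents, four attributes.
attribute_index = {"asthma": 0, "fever": 1, "cough": 2, "diabetes": 3}
context = {
    "doc1": 0b0111,  # asthma, fever, cough
    "doc2": 0b0110,  # fever, cough
    "doc3": 0b1110,  # fever, cough, diabetes
    "doc4": 0b1001,  # asthma, diabetes
}

purified = derive_purified_context(context, attribute_index, ["fever", "cough"])
for row, objects in purified.items():
    print(decode(row, attribute_index), objects)
```

In this toy example the four documents collapse into two rows under the scale, which illustrates why a derived context can stay small enough for a nested line diagram even when the underlying context is large. A real implementation would also need to handle the compressed encoding of the full context, which this sketch leaves out.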
