Abstract

Over the past decade, technical communication (TC) has focused primarily on the system-assisted generation and use of standardized, structured, and classified content for dynamic output solutions. Machine learning (ML) approaches now offer an opportunity to integrate unstructured data into existing knowledge bases without having to manually organize information into topic-based content enriched with semantic metadata. To make artificial intelligence (AI) more accessible to technical writers and content managers, cloud-based machine learning as a service (MLaaS) solutions provide a starting point for domain-specific ML modelling while relieving the modelling process of extensive coding, data processing and storage demands. Information architects can therefore focus on information extraction tasks and on ways to include pre-existing knowledge from other systems in the ML modelling process. In this paper, the capability and performance of a cloud-based ML service, IBM Watson, are analysed to assess its value for semantic context analysis. The ML model is based on a supervised learning method and features deep learning (DL) and natural language processing (NLP) techniques. The subject of the analysis is a corpus of scientific publications on coronavirus disease 2019 (COVID-19). The analysis focuses on extracting information about preventive measures and the effects of the pandemic on healthcare workers.

Highlights

  • This paper illustrates the introduction of artificial intelligence (AI) into the field of technical communication (TC) and examines the potential of machine learning (ML) to gain insight into data and to enhance or replace manual feature extraction and classification tasks, providing a deeper semantic analysis of large amounts of unstructured data. The implementation of an ML model for semantic context analysis and insight into unstructured data is examined for its potential to automate and extend manual analysis processes. For this purpose, an ML model with deep learning (DL) techniques was implemented using Watson, a cloud-based machine learning as a service (MLaaS) solution by IBM

  • The domain-specific feature extraction was based on the semantic tagging of content according to a customized supervised learning model created in IBM Watson Knowledge Studio (WKS)

  • The customized model for the Covid-19 domain was deployed to the Watson Natural Language Understanding (WNLU) service to actively conduct feature extractions and to measure the success of a domain-specific model for context analysis tasks in contrast to the performance of a standard model provided by IBM Watson
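As a rough illustration of the comparison described in the last highlight, the sketch below builds two request bodies for the Watson NLU analyze endpoint: one that relies on the standard model and one that points entity and relation extraction at a custom model. It is not taken from the paper. The helper name `build_analyze_payload`, the sample sentence, and the model ID placeholder are hypothetical, and the choice to request categories and concepts only for the standard model is an assumption based on the section outline.

```python
import json


def build_analyze_payload(text, custom_model_id=None):
    """Build a JSON-serializable body for a Watson NLU analyze request.

    Hypothetical helper: without a model ID, entities and relations are
    extracted with the standard model; with one, both features reference
    the custom model deployed from Watson Knowledge Studio.
    """
    entities = {}
    relations = {}
    if custom_model_id is not None:
        # Point entity and relation extraction at the custom model.
        entities["model"] = custom_model_id
        relations["model"] = custom_model_id

    features = {"entities": entities, "relations": relations}
    if custom_model_id is None:
        # Assumption: categories and concepts are requested from the
        # standard model only.
        features["categories"] = {}
        features["concepts"] = {}

    return {"text": text, "features": features}


# Hypothetical sample sentence and placeholder model ID.
sample = "Healthcare workers should wear protective equipment."
standard_body = build_analyze_payload(sample)
custom_body = build_analyze_payload(sample, "YOUR-WKS-MODEL-ID")

print(json.dumps(custom_body, indent=2))
```

Sending both bodies for the same documents and comparing the returned entities and relations side by side would be one way to contrast the two models. Authentication and the HTTP call itself are left out here.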


Summary

Introduction and background

This paper illustrates the introduction of AI into the field of TC and examines the potential of ML to gain insight into data and to enhance or replace manual feature extraction and classification tasks, providing a deeper semantic analysis of large amounts of unstructured data. The implementation of an ML model for semantic context analysis and insight into unstructured data is examined for its potential to automate and extend manual analysis processes. For this purpose, an ML model with DL techniques was implemented using Watson, a cloud-based MLaaS solution by IBM. In the context of technical information analysis, generation and delivery, AI can be applied to the automated extraction of metadata and knowledge, to automated content classification, or even to automated content generation.

ML modelling with IBM Watson
Data corpus collection
Preparing model assets
Model performance evaluation
Deploying the model to Watson NLU for context analysis
Analysis features
Performance comparison of the standard model and the custom model
Category and concept extraction with the Watson standard model
Comparing entity and relation extractions of both models
Findings
Conclusion and outlook
