Abstract

The Internet is growing exponentially, and its expanding user base has triggered an unprecedented wave of digital data on the Web. Much of this data is relevant and can be turned into actionable insights, but its sheer volume and unstructured format make it difficult to handle, so it often fails to meet the requirements of professionals and end users. For the biodiversity domain, this paper proposes a conceptual data science approach that extracts and structures data automatically, making sense of biodiversity-rich data and multiple-record documents while saving time and effort. The major drawback of manual extraction and storage of biodiversity data is that it introduces errors (such as spelling mistakes and skipped data fields) that are difficult to correct during the processing stage, so the resulting data cannot meet research demands. These drawbacks can be addressed by applying a data science approach within the system; such an automated approach is fast, flexible, reliable, and accurate. The extraction approach does, however, require regular monitoring and analysis of the Hypertext Markup Language (HTML) structure, documents, and links of the target sources. Such a large data set contains many errors and noisy characters; a data cleaning algorithm is used to remove them so that the data is error-free and ready for further systematic research. Because of the wide variety of data formats, achieving interoperability is a daunting task, since some datasets do not follow their own schema structure. To address this, semantic interoperability has proved helpful, as it allows data to be exchanged through web services between independent, loosely coupled systems. This paper presents an overview of semantic interoperability and case studies of projects that implemented it for biodiversity data sharing.
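
The extraction and cleaning steps described above can be illustrated with a minimal sketch. It assumes the target source publishes records as a simple HTML table; the paper does not specify its extraction or cleaning algorithm, so the names `RecordTableParser`, `clean_field`, and `extract_records`, as well as the sample markup, are hypothetical and serve only as an illustration.

```python
from html.parser import HTMLParser
import re
import unicodedata


class RecordTableParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def clean_field(text):
    """Normalize Unicode, drop noisy control/format characters, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = "".join(
        ch for ch in text
        if unicodedata.category(ch)[0] != "C" or ch in "\t\n\r"
    )
    return re.sub(r"\s+", " ", text).strip()


def extract_records(html):
    """Return one dict per data row, keyed by the cleaned header cells."""
    parser = RecordTableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header = [clean_field(c) for c in parser.rows[0]]
    records = []
    for row in parser.rows[1:]:
        cells = [clean_field(c) for c in row]
        # Pad short rows so a skipped field shows up as an empty string.
        cells += [""] * (len(header) - len(cells))
        records.append(dict(zip(header, cells)))
    return records


# Illustrative sample only (not data from the paper): stray spaces,
# a non-breaking space, a zero-width space, and a missing field.
SAMPLE_HTML = """
<table>
  <tr><th>Scientific name</th><th>Locality</th></tr>
  <tr><td>  Panthera   tigris\u00a0</td><td>Sundarbans\u200b</td></tr>
  <tr><td>Elephas maximus</td></tr>
</table>
"""

records = extract_records(SAMPLE_HTML)
for record in records:
    print(record)
```

Padding short rows makes skipped fields visible as empty values instead of letting them pass unnoticed, which is one of the manual-entry errors the abstract mentions. Because the parser depends on the table layout, a change in the source's HTML structure would require updating it, which is why regular monitoring of the target sources matters.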

