
With the success of collaborative knowledge-building portals such as Wikipedia, Stack Overflow, Quora, and GitHub, a class of researchers is driven towards understanding the dynamics of knowledge building on these portals. Even though collaborative knowledge building portals are known to be better than expert-driven knowledge repositories, limited research has been performed to understand the knowledge building dynamics in the former. This is mainly due to two reasons: first, the unavailability of a standard data representation format, and second, the lack of proper tools and libraries to analyze the knowledge building dynamics. We describe Knowledge Data Analysis and Processing Platform (KDAP), a programming toolkit that is easy to use and provides high-level operations for analysis of knowledge data. We propose Knowledge Markup Language (Knol-ML), a generic representation format for the data of collaborative knowledge building portals. KDAP can process the massive data of crowdsourced portals like Wikipedia and Stack Overflow efficiently. As a part of this toolkit, a data dump of various collaborative knowledge building portals is published in Knol-ML format. The combination of Knol-ML and the proposed open-source library will help the knowledge building community to perform benchmark analysis.

Link of the repository: Verma et al. (2020)
Video Tutorial: Verma et al. (2020)
Supplementary Material: Verma et al. (2020)


  • With progress in computational power, research in various domains is primarily based on the availability of data and appropriate tools for analysis

  • The significant changes are as follows: 1) We evaluate our toolkit based on scalability and generalizability on a new dataset of the Joy of Computing Wiki Portal (JOCWiki); 2) We modify the Wiki-based compression method to store the revision history dataset in Knowledge Markup Language (Knol-ML) format; 3) We discuss the limitations of our toolkit in terms of Knol-ML representation and the generalizability of Knowledge Data Analysis and Processing Platform (KDAP) methods; 4) We provide more details on the crowdsourced portals in the Background section (Section 2)

  • Methods implemented on wiki and question and answer (QnA)-based containers can be divided into three categories: knowledge generation methods, which generate new knowledge data; knowledge conversion methods, which convert data into Knol-ML format; and knowledge analytic methods, which compute specific knowledge building-related analyses without manipulating the underlying structure
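The three method categories in the last highlight can be pictured with a small sketch. It is not the KDAP API and does not follow the Knol-ML schema: the function names (`convert`, `generate`, `edits_per_editor`), the record fields, and the dictionary used as a stand-in for a converted article are all hypothetical, chosen only to show how generation, conversion, and read-only analysis differ.

```python
from collections import Counter
from copy import deepcopy

# Hypothetical portal-specific revision records (illustrative field names only).
RAW_REVISIONS = [
    {"user": "alice", "ts": "2020-01-01T10:00:00Z", "body": "First draft."},
    {"user": "bob", "ts": "2020-01-02T09:30:00Z", "body": "First draft, expanded."},
    {"user": "alice", "ts": "2020-01-03T15:45:00Z", "body": "Expanded and edited."},
]


def convert(raw_revisions, title):
    """Conversion: map portal-specific records to one uniform structure.

    The dictionary returned here is only a stand-in for a common format;
    it is not the actual Knol-ML representation.
    """
    return {
        "title": title,
        "instances": [
            {"id": i, "editor": r["user"], "timestamp": r["ts"], "text": r["body"]}
            for i, r in enumerate(raw_revisions, start=1)
        ],
    }


def generate(article, editor, timestamp, text):
    """Generation: return a copy of the article with one new instance added."""
    updated = deepcopy(article)
    updated["instances"].append(
        {
            "id": len(updated["instances"]) + 1,
            "editor": editor,
            "timestamp": timestamp,
            "text": text,
        }
    )
    return updated


def edits_per_editor(article):
    """Analytics: count instances per editor without modifying the article."""
    return Counter(inst["editor"] for inst in article["instances"])


article = convert(RAW_REVISIONS, "Example article")
extended = generate(article, "carol", "2020-01-04T08:00:00Z", "Final pass.")
counts = edits_per_editor(extended)
```

Keeping the analytic function free of side effects mirrors the description above: it reads the converted structure but leaves it unchanged, so several analyses can share the same data.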




With progress in computational power, research in various domains is primarily based on the availability of data and appropriate tools for analysis. Open access to libraries and data enhances the ease and pace of research [1]. Even a simple task like matrix inversion requires multiple lines of code when written from scratch in Python.

The majority of online knowledge portals are collaborative in nature, but they do not represent all knowledge building portals. For instance, a blog article is written and maintained by a single user, making the article non-collaborative in nature. We first define the online crowdsourced portals based on the degree of freedom provided to users on each knowledge instance. We classify these portals into two categories. In the upcoming subsections, we define the online crowdsourced portals and each of their categories.
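To make the remark about matrix inversion concrete, the following is a minimal sketch of Gauss-Jordan elimination with partial pivoting in standard-library Python. The function name `invert_matrix` and the tolerance `tol` are illustrative assumptions, not taken from the paper. Numerical libraries expose the same operation as a single call (for example, `numpy.linalg.inv` in NumPy).

```python
def invert_matrix(matrix, tol=1e-12):
    """Invert a square matrix (list of lists) by Gauss-Jordan elimination."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square")

    # Build the augmented matrix [A | I].
    aug = [
        [float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]

    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot_row = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot_row][col]) < tol:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]

        # Scale the pivot row so the pivot becomes 1.
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]

        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]

    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]


inverse = invert_matrix([[4, 7], [2, 6]])
```

For the 2x2 input shown, the result is approximately `[[0.6, -0.7], [-0.2, 0.4]]`, up to floating-point rounding.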
