Abstract

Knowledge representation has gained in relevance as data from the ubiquitous digitization of behaviors amass and academia and industry seek methods to understand and reason about the information they encode. Success in this pursuit has emerged with data from natural language, where skip-grams and other linear connectionist models of distributed representation have surfaced scrutable relational structures that have also served as artifacts of anthropological interest. Natural language is, however, only a fraction of the big data deluge. Here we show that latent semantic structure can be informed by behavioral data and that domain knowledge can be extracted from this structure through visualization and a novel mapping of the text descriptions of elements onto this behaviorally informed representation. In this study, we use the course enrollment histories of 124,000 students at a public university to learn vector representations of its courses. From these course-selection-informed representations, a notable 88% of course attribute information was recovered, as well as 40% of course relationships constructed from prior domain knowledge and evaluated by analogy (e.g., Math 1B is to Honors Math 1B as Physics 7B is to Honors Physics 7B). To aid in interpretation of the learned structure, we create a semantic interpolation, translating course vectors to a bag-of-words of their respective catalog descriptions via regression. We find that representations learned from enrollment histories resolved courses to a level of semantic fidelity exceeding that of their catalog descriptions, revealing nuanced content differences between similar courses, as well as accurately describing departments for which the dataset contained no course descriptions. We end with a discussion of the possible mechanisms by which this semantic structure may be informed and implications for the nascent research and practice of data science.
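To make the pipeline concrete, here is a minimal sketch of the approach the abstract summarizes: a skip-gram model trained on enrollment sequences, an analogy check of the Math 1B : Honors Math 1B kind, and a regression from course vectors to a bag-of-words of catalog descriptions. This is an illustrative reconstruction, not the authors' released code; gensim and scikit-learn are assumed, and every course ID, description, and hyperparameter below is invented for the example.

    # Illustrative sketch (not the authors' code): skip-gram course vectors
    # from toy enrollment sequences, an analogy check, and a regression
    # "semantic interpolation" onto bag-of-words catalog descriptions.
    import numpy as np
    from gensim.models import Word2Vec
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import Ridge

    # Each "sentence" is one student's enrollment history, in course order.
    enrollments = [
        ["MATH1A", "MATH1B", "PHYS7A", "PHYS7B"],
        ["MATH1A", "H_MATH1B", "PHYS7A", "H_PHYS7B"],
        ["MATH1B", "PHYS7B", "CS61A", "CS61B"],
        ["H_MATH1B", "H_PHYS7B", "CS61A"],
    ] * 250  # repeat so the toy corpus is large enough to train on

    # Skip-gram (sg=1), exactly as word2vec is applied to text, but with
    # courses as tokens and students' histories as sentences.
    model = Word2Vec(sentences=enrollments, vector_size=32, window=4,
                     min_count=1, sg=1, epochs=20, seed=7)

    # Analogy: MATH1B is to H_MATH1B as PHYS7B is to ... ?
    print(model.wv.most_similar(positive=["H_MATH1B", "PHYS7B"],
                                negative=["MATH1B"], topn=3))

    # Semantic interpolation: regress course vectors onto tf-idf bags of
    # words of catalog descriptions, then read off the top predicted words
    # for a course whose description was held out of the catalog.
    catalog = {
        "MATH1A": "differential calculus limits derivatives",
        "MATH1B": "integral calculus infinite series differential equations",
        "PHYS7A": "mechanics kinematics forces energy",
        "PHYS7B": "electricity magnetism thermodynamics waves",
        "CS61A":  "programming abstraction recursion interpreters",
    }
    courses = list(catalog)
    vec = TfidfVectorizer()
    Y = vec.fit_transform([catalog[c] for c in courses]).toarray()
    X = np.stack([model.wv[c] for c in courses])
    reg = Ridge(alpha=1.0).fit(X, Y)  # multi-output ridge regression

    pred = reg.predict(model.wv["H_PHYS7B"].reshape(1, -1))[0]
    words = np.array(vec.get_feature_names_out())
    print(words[np.argsort(pred)[::-1][:5]])  # top predicted words

With real enrollment histories, held-out attribute and analogy evaluations of this kind are what would yield figures like the 88% and 40% quoted above; the toy corpus here only demonstrates the mechanics.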

Highlights

  • The emergence of data science [1] and the application of word vector models for representation learning [2, 3, 4] have, together, focused attention on surfacing structure from big data in ways that are scrutable and show signs of being able to contribute to domain knowledge [5, 6].

  • This representational structure, illuminated by data and studied through the instrument of a learned embedding analysis, is analogous to the physical structures studied with instruments from the natural sciences and is part of a larger universe of explorable structure expanding at the speed of data collection.

  • A question of natural concern to the developing notion of data science is whether truths can be learned from behavioral data through this particular lens of a representation analysis.

Introduction

The emergence of data science [1] and the application of word vector models for representation learning [2, 3, 4] have, together, focused attention on surfacing structure from big data in ways that are scrutable and show signs of being able to contribute to domain knowledge [5, 6]. These neural models, stemming from cognitive theories of distributed representation [32], have been shown to encode a surprising portion of linguistic relationships learned directly from text [7]. They contribute to a quickly growing field around computational text analysis and natural language processing.
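As a concrete illustration of the linguistic-relationship finding cited above, the snippet below solves a classic word analogy by vector arithmetic. This is a minimal sketch, not code from the paper; it assumes gensim is installed and downloads a small pretrained GloVe model on first use.

    # Vector-offset analogy with pretrained word vectors:
    # the nearest neighbor of (king - man + woman) should be "queen".
    import gensim.downloader as api

    vectors = api.load("glove-wiki-gigaword-50")  # one-time ~66 MB download
    print(vectors.most_similar(positive=["king", "woman"],
                               negative=["man"], topn=3))
    # Expected top result: ("queen", <cosine similarity>)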
