Semantic Space Model Research Articles

Information Retrieval (IR) approaches, such as Latent Semantic Indexing (LSI) and Vector Space Model (VSM), are commonly applied to recover software traceability links. Recently, an approach based on developers' eye gazes was proposed to retrieve traceability links. This paper presents a comparative study on IR and eye-gaze based approaches. In addition, it reports on the possibility of using eye gaze links as an alternative benchmark in comparison to commits. The study conducted asked developers to perform bug-localization tasks on the open source subject system JabRef. The iTrace environment, which is an eye tracking enabled Eclipse plugin, was used to collect eye gaze data. During the data collection phase, an eye tracker was used to gather the source code entities (SCE's), developers looked at while solving these tasks. We present an algorithm that uses the collected gaze dataset to produce candidate traceability links related to the tasks. In the evaluation phase, we compared the results of our algorithm with the results of an IR technique, in two different contexts. In the first context, precision and recall metric values are reported for both IR and eye gaze approaches based on commits. In the second context, another set of developers were asked to rate the candidate links from each of the two techniques in terms of how useful they were in fixing the bugs. The eye gaze approach outperforms standard LSI and VSM approaches and reports a 55 % precision and 67 % recall on average for all tasks when compared to how the developers actually fixed the bug. In the second context, the usefulness results show that links generated by our algorithm were considered to be significantly more useful (to fix the bug) than those of the IR technique in a majority of tasks. We discuss the implications of this radically different method of deriving traceability links. Techniques for feature location/bug localization are commonly evaluated on benchmarks formed from commits as is done in the evaluation phase of this study. Although, commits are a reasonable source, they only capture entities that were eventually changed to fix a bug or resolve a feature. We investigate another type of benchmark based on eye tracking data, namely links generated from the bug-localization tasks given to the developers in the data collection phase. The source code entities relevant to subjected bugs recommended from IR methods are evaluated on both commits and links generated from eye gaze. The results of the benchmarking phase show that the use of eye tracking could form an effective (complementary) benchmark and add another interesting perspective in the evaluation of bug-localization techniques.

Read full abstract

Corpus-based semantic space models, which primarily rely on lexical co-occurrence statistics, have proven effective in modeling and predicting human behavior in a number of experimental paradigms that explore semantic memory representation. The most widely studied extant models, however, are strongly influenced by orthographic word frequency (e.g., Shaoul & Westbury, Behavior Research Methods, 38, 190-195, 2006). This has the implication that high-frequency closed-class words can potentially bias co-occurrence statistics. Because these closed-class words are purported to carry primarily syntactic, rather than semantic, information, the performance of corpus-based semantic space models may be improved by excluding closed-class words (using stop lists) from co-occurrence statistics, while retaining their syntactic information through other means (e.g., part-of-speech tagging and/or affixes from inflected word forms). Additionally, very little work has been done to explore the effect of employing morphological decomposition on the inflected forms of words in corpora prior to compiling co-occurrence statistics, despite (controversial) evidence that humans perform early morphological decomposition in semantic processing. In this study, we explored the impact of these factors on corpus-based semantic space models. From this study, morphological decomposition appears to significantly improve performance in word-word co-occurrence semantic space models, providing some support for the claim that sublexical information-specifically, word morphology-plays a role in lexical semantic processing. An overall decrease in performance was observed in models employing stop lists (e.g., excluding closed-class words). Furthermore, we found some evidence that weakens the claim that closed-class words supply primarily syntactic information in word-word co-occurrence semantic space models.

Read full abstract

Semantic Space Model Research Articles

Related Topics

Articles published on Semantic Space Model

Deceptive text detection using continuous semantic space models

Eye movements in software traceability link recovery

텐서공간모델 기반 시멘틱 검색 기법

Cross-domain deception detection using support vector networks

Disentangling narrow and coarse semantic networks in the brain: The role of computational models of word meaning.

Applying an exemplar model to an implicit rule-learning task: Implicit learning of semantic structure.

A lectometric analysis of aggregated lexical variation in written Standard English with Semantic Vector Space models

Meaning change in a petri dish: constructions, semantic vector spaces, and motion charts

An improved focused crawler based on Semantic Similarity Vector Space Model

A Semantic Aspect-Based Vector Space Model to Identify the Event Evolution Relationship within Topics

Performance impact of stop lists and morphological decomposition on word-word corpus-based semantic space models.

Hidden processes in structural representations: A reply to Abbott, Austerweil, and Griffiths (2015).

Encoding sequential information in semantic space models: comparing holographic reduced representation and random permutation.

Blog Topic Detection Based on the Model

Fuzzy control GA with a novel hybrid semantic similarity strategy for text clustering

A novel semantic level text classification by combining NLP and Thesaurus concepts

Utilizing Conceptual Indexing to Enhance the Effectiveness of Vector Space Model

Lexical acquisition and semantic space models: Learning the semantics of unknown words

A semantic space approach to the computational semantics of noun compounds

Latent Semantic Engineering – A new conceptual user-centered design approach

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Semantic Space Model Research Articles

Related Topics

Articles published on Semantic Space Model

Deceptive text detection using continuous semantic space models

Eye movements in software traceability link recovery

텐서공간모델 기반 시멘틱 검색 기법

Cross-domain deception detection using support vector networks

Disentangling narrow and coarse semantic networks in the brain: The role of computational models of word meaning.

Applying an exemplar model to an implicit rule-learning task: Implicit learning of semantic structure.

A lectometric analysis of aggregated lexical variation in written Standard English with Semantic Vector Space models

Meaning change in a petri dish: constructions, semantic vector spaces, and motion charts

An improved focused crawler based on Semantic Similarity Vector Space Model

A Semantic Aspect-Based Vector Space Model to Identify the Event Evolution Relationship within Topics

Performance impact of stop lists and morphological decomposition on word-word corpus-based semantic space models.

Hidden processes in structural representations: A reply to Abbott, Austerweil, and Griffiths (2015).

Encoding sequential information in semantic space models: comparing holographic reduced representation and random permutation.

Blog Topic Detection Based on the Model

Fuzzy control GA with a novel hybrid semantic similarity strategy for text clustering

A novel semantic level text classification by combining NLP and Thesaurus concepts

Utilizing Conceptual Indexing to Enhance the Effectiveness of Vector Space Model

Lexical acquisition and semantic space models: Learning the semantics of unknown words

A semantic space approach to the computational semantics of noun compounds

Latent Semantic Engineering – A new conceptual user-centered design approach