Characterizing search activities on stack overflow

Abstract

To solve programming issues, developers commonly search on Stack Overflow to seek potential solutions. However, there is a gap between the knowledge developers are interested in and the knowledge they are able to retrieve using search engines. To help developers efficiently retrieve relevant knowledge on Stack Overflow, prior studies proposed several techniques to reformulate queries and generate summarized answers. However, few studies performed a large-scale analysis using real-world search logs. In this paper, we characterize how developers search on Stack Overflow using such logs. By doing so, we identify the challenges developers face when searching on Stack Overflow and seek opportunities for the platform and researchers to help developers efficiently retrieve knowledge. To characterize search activities on Stack Overflow, we use search log data based on requests to Stack Overflow's web servers. We find that the most common search activity is reformulating the immediately preceding queries. Related work looked into query reformulations when using generic search engines and found 13 types of query reformulation strategies. Compared to their results, we observe that 71.78% of the reformulations can be fitted into those reformulation strategies. In terms of how queries are structured, 17.41% of the search sessions only search for fragments of source code artifacts (e.g., class and method names) without specifying the names of programming languages, libraries, or frameworks. Based on our findings, we provide actionable suggestions for Stack Overflow moderators and outline directions for future research. For example, we encourage Stack Overflow to set up a database that includes the relations between all computer programming terminologies shared on Stack Overflow, e.g., method name, data structure name, design pattern, and IDE name. 
By doing so, Stack Overflow could improve the performance of search engines by considering related programming terminologies at different levels of granularity.
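
The reformulation finding above can be illustrated with a minimal sketch that labels how each query in a session differs from the immediately preceding one. The label names (`add_terms`, `remove_terms`, and so on) and the token-set comparison are assumptions made for illustration; they are not the 13 strategies from the related work, which the abstract does not enumerate.

```python
from typing import List


def classify_reformulation(previous: str, current: str) -> str:
    """Label how `current` differs from `previous` by comparing token sets.

    The labels are hypothetical and only illustrate the idea of
    reformulating the immediately preceding query.
    """
    prev_tokens = set(previous.lower().split())
    curr_tokens = set(current.lower().split())

    if prev_tokens == curr_tokens:
        return "unchanged"          # same terms, possibly reordered
    if prev_tokens < curr_tokens:
        return "add_terms"          # e.g. adding a language name
    if curr_tokens < prev_tokens:
        return "remove_terms"       # e.g. dropping an overly specific term
    if prev_tokens & curr_tokens:
        return "substitute_terms"   # some terms kept, others swapped
    return "new_query"              # no overlap with the previous query


def label_session(queries: List[str]) -> List[str]:
    """Label every consecutive pair of queries in one search session."""
    return [
        classify_reformulation(prev, curr)
        for prev, curr in zip(queries, queries[1:])
    ]
```

For example, `label_session(["sort list", "python sort list", "python sort array"])` labels the first step as adding a term and the second as substituting one.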

Similar Papers
  • Research Article
  • Cite count: 17
  • 10.1109/access.2023.3238813
An Empirical Study of Web Services Topics in Web Developer Discussions on Stack Overflow
  • Jan 1, 2023
  • IEEE Access
  • Khalid Mahmood + 3 more

Web Services (WSs) are gaining worldwide popularity due to reliable and fast intercommunication services for the development of web and mobile applications. WSs are provided to client application developers through web Application Programming Interfaces (APIs), such as the YouTube API, Twitter API, and Facebook API. Due to the popularity of WSs, developers frequently discuss issues of WSs-based applications on online forums such as Stack Overflow (SO). This study aims to highlight the problems faced by client developers in the development process of WSs-based applications using the dataset of SO. The comprehension of developers’ conversations on SO can give insight into the frequency, difficulty, and popularity of different WSs-related problems of developers. We downloaded 12,746 posts from SO relevant to WSs-related issues for this article. We used the topic modeling technique (LDA) to extract various topics from the SO dataset. The topics are labeled and organized into categories and sub-categories according to relationships among them. The difficulty and popularity of each topic have been analyzed. Our investigation yields several findings. First, developers focus on six topics related to WSs on SO: Client APIs development, Data Processing, Web services Authorization, Framework Support, Web APIs, and Mobile Applications. Second, the advantages and disadvantages of web applications topic (Fused_Popularity=0.39) from the Client APIs development category has the highest prevalence, followed by the Database (DB) and Data Processing in Applications topic (Fused_Popularity=0.38) from the Data Processing category. Third, most WSs-related topics in all categories are evolving rapidly on SO, i.e., new questions are added daily about WSs development, deployment, and authorization. Fourth, questions of type “How” are primarily asked in the Framework support, Client APIs development, and Web APIs categories, although many questions in other categories are of the kind “What”. It is also observed that WSs developers used SO not only to ask How and What types of questions but also to ask information-seeking questions (i.e., in the Data processing and Client APIs development categories). Fifth, the topics relevant to the Web APIs (Fused_Popularity=10.8) and Client APIs development (Fused_Popularity=9.35) categories of WSs are very popular on SO. Sixth, the questions relevant to the Web APIs (Fused_Difficulty=3) and Client APIs development (Fused_Difficulty=2.25) categories are more difficult than those in the other four categories. The results of our research may be helpful for the following WSs stakeholders: WSs Client application developers, WSs Educators, and WSs researchers. WSs Educators and researchers can gain more knowledge of new methods and discover novel techniques to make challenging WSs topics easy to understand. WSs framework developers can use our extracted WSs topics and categories to learn the preferences of WSs developers, which may support them in upgrading existing frameworks or developing new ones.

  • Conference Article
  • Cite count: 7
  • 10.1145/3196321.3196348
Recommending frequently encountered bugs
  • May 28, 2018
  • Yun Zhang + 4 more

Developers introduce bugs during software development, which reduces software reliability. Many of these bugs occur commonly and have been experienced by many other developers. Informing developers, especially novice ones, about commonly occurring bugs in a domain of interest (e.g., Java) can help them comprehend programs and avoid similar bugs in the future. Unfortunately, information about commonly occurring bugs is not readily available. To address this need, we propose a novel approach named RFEB, which recommends frequently encountered bugs (FEBugs) that may affect many other developers. RFEB analyzes Stack Overflow, which is the largest software engineering-specific Q&A community. Among the many questions posted on Stack Overflow, many provide descriptions and solutions of different kinds of bugs. Unfortunately, the search engine that comes with Stack Overflow is not able to identify FEBugs well. To address this limitation, RFEB takes an integrated and iterative approach that considers both the relevance and the popularity of Stack Overflow questions to identify FEBugs. To evaluate the performance of RFEB, we perform experiments on a dataset from Stack Overflow that contains more than ten million posts. We compared our model with Stack Overflow's search engine on 10 domains, and the experiment results show that RFEB achieves an average NDCG@10 score of 0.96, which improves on Stack Overflow's search engine by 20%.
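
As a reminder of what an NDCG@10 score measures, here is a minimal sketch of the metric. It assumes the linear-gain variant of DCG and computes the ideal ordering from the returned list only; the abstract does not say which variant or normalization RFEB's evaluation uses, so both choices are assumptions.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain of the first k results (linear gain)."""
    # rank is 0-based, so the discount for the first result is log2(2) = 1.
    return sum(
        rel / math.log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG@k divided by the DCG@k of the best possible ordering.

    `relevances` holds graded relevance labels in the order the search
    engine returned the results. Returns 0.0 when nothing is relevant.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(relevances, k) / ideal
```

A ranking that already lists results from most to least relevant scores 1.0; pushing relevant results further down the list lowers the score.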

  • Research Article
  • 10.5281/zenodo.4683732
Broken external links on Stack Overflow
  • Oct 10, 2020
  • arXiv (Cornell University)
  • Jiakun Liu + 6 more

This is the dataset, coding guides, and scripts for our paper: Broken external links on Stack Overflow.

  • Research Article
  • Cite count: 11
  • 10.1109/tse.2021.3086494
Broken External Links on Stack Overflow
  • Sep 1, 2022
  • IEEE Transactions on Software Engineering
  • Jiakun Liu + 6 more

Stack Overflow hosts valuable programming-related knowledge with 11,926,354 links that reference third-party websites. Links to resources hosted outside the Stack Overflow websites extend the Stack Overflow knowledge base substantially. However, with the rapid development of programming-related knowledge, many resources hosted on the Internet are no longer available. Based on our analysis of the Stack Overflow data released on Jun. 2, 2019, 14.2% of the links on Stack Overflow are broken. Broken links on Stack Overflow can prevent viewers from obtaining the desired programming-related knowledge and can damage the reputation of Stack Overflow, as viewers might regard posts with broken links as obsolete. In this paper, we characterize the broken links on Stack Overflow. 65% of the broken links in our sampled questions are used to show examples, e.g., code examples. 70% of the broken links in our sampled answers are used to provide supporting information, e.g., explaining a certain concept or describing a step to solve a problem. Only 1.67% of the posts with broken links are highlighted as such by viewers in the posts' comments. Only 5.8% of the posts with broken links had the broken links removed. Viewers cannot fully rely on vote scores to detect broken links, as broken links are common across posts with different vote scores. Websites whose hosted resources can be maintained by their users are referenced by broken links the most on Stack Overflow; a prominent example of such websites is GitHub. The posts and comments related to web technologies, i.e., JavaScript, HTML, CSS, and jQuery, are associated with more broken links. Based on our findings, we shed light on future directions and provide recommendations for practitioners and researchers.

  • Conference Article
  • Cite count: 63
  • 10.1109/icse43902.2021.00116
Automated Query Reformulation for Efficient Search Based on Query Logs From Stack Overflow
  • May 1, 2021
  • Kaibo Cao + 4 more

As a popular Q&A site for programming, Stack Overflow is a treasure for developers. However, the number of questions and answers on Stack Overflow makes it difficult for developers to efficiently locate the information they are looking for. There are two gaps leading to poor search results: the gap between the user's intention and the textual query, and the semantic gap between the query and the post content. Therefore, developers have to constantly reformulate their queries by correcting misspelled words, adding limitations to certain programming languages or platforms, etc. As query reformulation is tedious for developers, especially for novices, we propose an automated software-specific query reformulation approach based on deep learning. With query logs provided by Stack Overflow, we construct a large-scale query reformulation corpus, including the original queries and corresponding reformulated ones. Our approach trains a Transformer model that can automatically generate candidate reformulated queries when given the user's original query. The evaluation results show that our approach outperforms five state-of-the-art baselines, and achieves a 5.6% to 33.5% boost in terms of ExactMatch and a 4.8% to 14.4% boost in terms of GLEU.

  • Research Article
  • Cite count: 39
  • 10.1109/tse.2020.3016006
Chatbot4QR: Interactive Query Refinement for Technical Question Retrieval
  • May 11, 2021
  • IEEE Transactions on Software Engineering
  • Neng Zhang + 5 more

Technical Q&A sites (e.g., Stack Overflow (SO)) are important resources for developers to search for knowledge about technical problems. Search engines provided in Q&A sites and information retrieval approaches (e.g., word embedding-based) have limited capabilities to retrieve relevant questions when queries are imprecisely specified, such as missing important technical details (e.g., the user’s preferred programming languages). Although many automatic query expansion approaches have been proposed to improve the quality of queries by expanding queries with relevant terms, the information missed in a query is not identified. Moreover, without user involvement, the existing query expansion approaches may introduce unexpected terms and lead to undesired results. In this paper, we propose an interactive query refinement approach for question retrieval, named Chatbot4QR, which can assist users in recognizing and clarifying technical details missed in queries and thus retrieve more relevant questions for users. Chatbot4QR automatically detects missing technical details in a query and generates several clarification questions (CQs) to interact with the user to capture their overlooked technical details. To ensure the accuracy of CQs, we design a heuristic-based approach for CQ generation after building two kinds of technical knowledge bases: a manually categorized result of 1,841 technical tags in SO and the multiple version-frequency information of the tags. We develop a Chatbot4QR prototype that uses 1.88 million SO questions as the repository for question retrieval. To evaluate Chatbot4QR, we conduct six user studies with 25 participants on 50 experimental queries. The results are as follows. (1) On average 60.8 percent of the CQs generated for a query are useful for helping the participants recognize missing technical details. (2) Chatbot4QR can rapidly respond to the participants after receiving a query within approximately 1.3 seconds. (3) The refined queries contribute to retrieving more relevant SO questions than nine baseline approaches. For more than 70 percent of the participants who have preferred techniques on the query tasks, Chatbot4QR significantly outperforms the state-of-the-art word embedding-based retrieval approach with an improvement of at least 54.6 percent in terms of two measurements: Pre@k and NDCG@k. (4) For 48-88 percent of the assigned query tasks, the participants obtain more desired results after interacting with Chatbot4QR than directly searching from Web search engines (e.g., the SO search engine and Google) using the original queries.

  • Conference Article
  • Cite count: 57
  • 10.1109/icsme.2018.00057
Effective Reformulation of Query for Code Search Using Crowdsourced Knowledge and Extra-Large Data Analytics
  • Sep 1, 2018
  • Mohammad Masudur Rahman + 1 more

Software developers frequently issue generic natural language queries for code search while using code search engines (e.g., GitHub native search, Krugle). Such queries often do not lead to any relevant results due to vocabulary mismatch problems. In this paper, we propose a novel technique that automatically identifies relevant and specific API classes from the Stack Overflow Q&A site for a programming task written as a natural language query, and then reformulates the query for improved code search. We first collect candidate API classes from Stack Overflow using pseudo-relevance feedback and two term weighting algorithms, and then rank the candidates using Borda count and semantic proximity between query keywords and the API classes. The semantic proximity has been determined by an analysis of 1.3 million questions and answers of Stack Overflow. Experiments using 310 code search queries report that our technique suggests relevant API classes with 48% precision and 58% recall, which are 32% and 48% higher respectively than those of the state-of-the-art. Comparisons with two state-of-the-art studies and three popular search engines (e.g., Google, Stack Overflow, and GitHub native search) report that our reformulated queries (1) outperform the queries of the state-of-the-art, and (2) significantly improve the code search results provided by these contemporary search engines.
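
The Borda count step mentioned above can be sketched as follows. This is a generic Borda count over several ranked candidate lists, not the paper's full ranking: the semantic proximity component is left out, and the scoring convention (a list of length n gives n points to its first item, down to 1 for its last) and the alphabetical tie-break are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def borda_count(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Merge several best-first rankings into one using Borda count.

    A candidate at 0-based position i in a ranking of length n earns
    n - i points from that ranking; candidates missing from a ranking
    earn nothing from it. Ties are broken alphabetically.
    """
    scores: Dict[str, int] = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += n - position
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical candidate API classes as ranked by two term weighting schemes.
by_scheme_a = ["HashMap", "ArrayList", "Iterator"]
by_scheme_b = ["ArrayList", "HashMap", "Scanner"]
merged = borda_count([by_scheme_a, by_scheme_b])
```

Here `HashMap` and `ArrayList` each collect 5 points and the other two classes 1 point each, so the merged list starts with the two classes both schemes agree on.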

  • Conference Article
  • Cite count: 1
  • 10.1145/3539618.3591966
ConQueR: Contextualized Query Reduction using Search Logs
  • Jul 18, 2023
  • Hye-Young Kim + 5 more

Query reformulation is a key mechanism to alleviate the linguistic chasm of query in ad-hoc retrieval. Among various solutions, query reduction effectively removes extraneous terms and specifies concise user intent from long queries. However, it is challenging to capture hidden and diverse user intent. This paper proposes Contextualized Query Reduction (ConQueR) using a pre-trained language model (PLM). Specifically, it reduces verbose queries with two different views: core term extraction and sub-query selection. One extracts core terms from an original query at the term level, and the other determines whether a sub-query is a suitable reduction for the original query at the sequence level. Since they operate at different levels of granularity and complement each other, they are finally aggregated in an ensemble manner. We evaluate the reduction quality of ConQueR on real-world search logs collected from a commercial web search engine. It achieves up to 8.45% gains in exact match scores over the best competing model.

  • Research Article
  • Cite count: 55
  • 10.1007/s10664-018-9671-0
Automatic query reformulation for code search using crowdsourced knowledge
  • Jan 21, 2019
  • Empirical Software Engineering
  • Mohammad M Rahman + 2 more

Traditional code search engines (e.g., Krugle) often do not perform well with natural language queries. They mostly apply keyword matching between query and source code. Hence, they need carefully designed queries containing references to relevant APIs for the code search. Unfortunately, preparing an effective search query is not only challenging but also time-consuming for developers, according to existing studies. In this article, we propose a novel query reformulation technique, RACK, that suggests a list of relevant API classes for a natural language query intended for code search. Our technique offers such suggestions by exploiting keyword-API associations from the questions and answers of Stack Overflow (i.e., crowdsourced knowledge). We first motivate our idea using an exploratory study with 19 standard Java API packages and 344K Java-related posts from Stack Overflow. Experiments using 175 code search queries randomly chosen from three Java tutorial sites show that our technique recommends correct API classes within the Top-10 results for 83% of the queries, with 46% mean average precision and 54% recall, which are 66%, 79% and 87% higher respectively than those of the state-of-the-art. Reformulations using our suggested API classes improve 64% of the natural language queries and their overall accuracy improves by 19%. Comparisons with three state-of-the-art techniques demonstrate that RACK outperforms them in query reformulation by a statistically significant margin. Investigation using three web/code search engines shows that our technique can significantly improve their results in the context of code search.

  • Conference Article
  • Cite count: 4
  • 10.1109/cspa48992.2020.9068697
Applications and use Cases of Multilevel Granularity for Network Traffic Classification
  • Feb 1, 2020
  • Faiz Zaki + 2 more

Network traffic classification is a fundamental process in network management and security. It allows network administrators to classify traffic based on various levels of classification granularity, such as the source type or application. Existing literature focuses on analyzing the entire network traffic classification process with emphasis on the classification techniques. However, besides classification techniques, the literature lacks coverage of classification granularity, which deserves proper attention due to its increasing application in modern networks. Understanding the various levels of classification granularity and their use cases allows for more optimized traffic classification. As such, this paper aims to explore the different levels of classification granularity and their use cases. We studied papers published between 2013 and 2019 in order to investigate the different levels of granularity and use cases in the literature. As a result, this paper groups the classification granularity into a systematic multilevel taxonomy to assist in attaining a deeper understanding of their applications. Finally, to motivate future research, we elaborate on the current challenges and future directions for network traffic classification.

  • Conference Article
  • Cite count: 26
  • 10.1145/3338906.3341186
AnswerBot: an answer summary generation tool based on stack overflow
  • Aug 12, 2019
  • Liang Cai + 6 more

Software Q&A sites (like Stack Overflow) play an essential role in developers' day-to-day work for problem-solving. Although search engines (like Google) are widely used to obtain a list of relevant posts for technical problems, we observed that redundant relevant posts and the sheer amount of information hinder developers from digesting and identifying the useful answers. In this paper, we propose a tool, AnswerBot, which automatically generates an answer summary for a technical problem. AnswerBot consists of three main stages: (1) relevant question retrieval, (2) useful answer paragraph selection, and (3) diverse answer summary generation. We implement it in the form of a search engine website. To evaluate AnswerBot, we first build a repository that includes a large number of Java questions and their corresponding answers from Stack Overflow. Then, we conduct a user study that evaluates the answer summaries generated by AnswerBot and two baselines (based on the Google and Stack Overflow search engines) for 100 queries. The results show that the answer summaries generated by AnswerBot are more relevant, useful, and diverse. Moreover, we also substantially improved the efficiency of AnswerBot (from 309 to 8 seconds per query).

  • Conference Article
  • Cite count: 31
  • 10.1109/compsac.2016.210
Towards Correlating Search on Google and Asking on Stack Overflow
  • Jun 1, 2016
  • Chunyang Chen + 1 more

Search engines and Question and Answer (Q&A) sites are the two commonly used ways for developers to seek information on the web. In this paper, we ask whether the questions developers ask on Q&A sites correlate with the information developers search for using search engines. We report on our empirical study to investigate the correlations of the 185 popular technical terms developers search on Google and ask on Stack Overflow using search statistics obtained from Google Trends over a 574-week span and question statistics derived from the Stack Overflow Data Dump over a 300-week span. Our study shows that technical terms searched and asked have a strong correlation over time. Searching for and asking about newer, specific technical terms have a stronger correlation than older, general technical terms. We have developed a web interface for accessing our dataset and empirical results, available at http://comparetrend.appspot.com/. Inspired by our empirical results, we present future directions that can harness Stack Overflow as sampled data for supporting time-aware search and semantic search.

  • Research Article
  • Cite count: 116
  • 10.1007/s11390-015-1576-4
Multi-Factor Duplicate Question Detection in Stack Overflow
  • Sep 1, 2015
  • Journal of Computer Science and Technology
  • Yun Zhang + 3 more

Stack Overflow is a popular on-line question and answer site for software developers to share their experience and expertise. Among the numerous questions posted on Stack Overflow, two or more of them may express the same point and thus are duplicates of one another. Duplicate questions make Stack Overflow site maintenance harder, waste resources that could have been used to answer other questions, and cause developers to unnecessarily wait for answers that are already available. To reduce the problem of duplicate questions, Stack Overflow allows questions to be manually marked as duplicates of others. Since there are thousands of questions submitted to Stack Overflow every day, manually identifying duplicate questions is difficult work. Thus, there is a need for an automated approach that can help in detecting these duplicate questions. To address this need, in this paper, we propose an automated approach named DupPredictor that takes a new question as input and detects potential duplicates of this question by considering multiple factors. DupPredictor extracts the title and description of a question and also the tags that are attached to the question. These pieces of information (title, description, and a few tags) are mandatory information that a user needs to input when posting a question. DupPredictor then computes the latent topics of each question by using a topic model. Next, for each pair of questions, it computes four similarity scores by comparing their titles, descriptions, latent topics, and tags. These four similarity scores are finally combined into a new similarity score that comprehensively considers the multiple factors. To examine the benefit of DupPredictor, we perform an experiment on a Stack Overflow dataset which contains a total of more than two million questions. The result shows that DupPredictor can achieve a recall-rate@20 score of 63.8%. We compare our approach with the standard search engine of Stack Overflow, and DupPredictor improves its recall-rate@10 score by 40.63%. We also compare our approach with approaches that only use title, description, topic, and tag similarity and Runeson et al.’s approach that has been used to detect duplicate bug reports, and DupPredictor improves their recall-rate@10 scores by 27.2%, 97.4%, 746.0%, 231.1%, and 16.4% respectively.
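
The combination of four similarity scores and the recall-rate@k measure described above can be sketched roughly as follows. The weighted sum and the weights passed in are assumptions, since the abstract only says the scores are "combined"; recall-rate@k is taken in its usual sense, the fraction of duplicate questions whose known original appears among the top k candidates. All function and key names are hypothetical.

```python
from typing import Dict, List, Sequence

COMPONENTS = ("title", "description", "topic", "tag")


def combined_score(similarities: Dict[str, float],
                   weights: Dict[str, float]) -> float:
    """Weighted sum of the four per-component similarity scores."""
    return sum(weights[name] * similarities[name] for name in COMPONENTS)


def top_k_candidates(candidates: Dict[str, Dict[str, float]],
                     weights: Dict[str, float],
                     k: int) -> List[str]:
    """Return the ids of the k candidates with the highest combined score."""
    ranked = sorted(
        candidates,
        key=lambda qid: combined_score(candidates[qid], weights),
        reverse=True,
    )
    return ranked[:k]


def recall_rate_at_k(results: Sequence[Sequence[str]],
                     true_duplicates: Sequence[str],
                     k: int) -> float:
    """Fraction of queries whose true duplicate is in the top k results."""
    if not true_duplicates:
        return 0.0
    hits = sum(
        1 for ranked, truth in zip(results, true_duplicates)
        if truth in ranked[:k]
    )
    return hits / len(true_duplicates)
```

With equal weights of 0.25, a candidate whose title, description, topic, and tag similarities are 0.9, 0.8, 0.7, and 1.0 gets a combined score of 0.85.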

  • Research Article
  • Cite count: 47
  • 10.1016/j.infsof.2020.106367
PostFinder: Mining Stack Overflow posts to support software developers
  • Jun 25, 2020
  • Information and Software Technology
  • Riccardo Rubei + 4 more

  • Book Chapter
  • Cite count: 50
  • 10.1007/bfb0013980
Sometimes “Tomorrow” is “Sometime”
  • Jan 1, 1994
  • José Luiz Fiadeiro + 1 more

We address the hierarchical (vertical) decomposition, or abstract implementation, of object specification in temporal logic. Whereas previous approaches to refinement in the context of temporal logic such as those developed by Lamport and by Barringer, Kuiper and Pnueli are based on a single logic that accommodates different levels of action granularity, our approach is based on relating different logics corresponding to different levels of granularity. More precisely, we map abstract actions (propositions) to concrete objects (theories) and, through inference rules that relate the different logics, derive properties of the abstracted actions from the behaviour of the corresponding objects. In this way, we keep a tighter control of action granularity and interference, enabling us to maintain the use of the “next” operator and make the development of reactive systems more tractable.
