806 results on '"Stack overflow"'
Search Results
2. A framework for generating recommendations based on trust in an informal e-learning environment.
- Author
-
Rehman, Amjad, Ahmed, Adeel, Alahmadi, Tahani Jaser, Mirdad, Abeer Rashad, Al Ghofaily, Bayan, and Saleem, Khalid
- Abstract
Rapid advancement in information technology promotes the growth of new online learning communities in an e-learning environment that overloads information and data sharing. When a new learner asks a question, how a system recommends the answer is the problem of the learner's cold start. In this article, our contributions are: (i) We proposed a Trust-aware Deep Neural Recommendation (TDNR) framework that addresses learner cold-start issues in informal e-learning by modeling complex nonlinear relationships. (ii) We utilized latent Dirichlet allocation for tag modeling, assigning tag categories to newly posted questions and ranking experts related to specific tags for active questioners based on hub and authority scores. (iii) We enhanced recommendation accuracy in the TDNR model by introducing a degree of trust between questioners and responders. (iv) We incorporated the questioner-responder relational graph, derived from structural preference information, into our proposed model. We evaluated the proposed model on the Stack Overflow dataset using mean absolute precision (MAP), root mean squared error (RMSE), and F-measure metrics. Our significant findings are that TDNR is a hybrid approach that provides more accurate recommendations compared to rating-based and social-trust-based approaches, the proposed model can facilitate the formation of informal e-learning communities, and experiments show that TDNR outperforms the competing methods by an improved margin. The model's robustness, demonstrated by superior MAE, RMSE, and F-measure metrics, makes it a reliable solution for addressing information overload and user sparsity in Stack Overflow. By accurately modeling complex relationships and incorporating trust degrees, TDNR provides more relevant and personalized recommendations, even in cold-start scenarios. This enhances user experience by facilitating the formation of supportive learning communities and ensuring new learners receive accurate recommendations. 
[ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
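The abstract of record 2 mentions ranking experts by hub and authority scores. A minimal sketch of the standard HITS iteration that produces such scores is given below. The toy graph, the node names, and the edge direction (questioner pointing to responder) are assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt

def hits(edges, iterations=50):
    """Return (hub, authority) score dicts for a directed graph of (source, target) pairs."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a node: sum of the hub scores of the nodes pointing to it.
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        norm = sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub of a node: sum of the authority scores of the nodes it points to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}
    return hub, auth

# Hypothetical toy graph: edge (q, r) means questioner q was answered by responder r.
edges = [("q1", "r1"), ("q2", "r1"), ("q3", "r1"), ("q3", "r2")]
hub, auth = hits(edges)
ranked_experts = sorted(auth, key=auth.get, reverse=True)
```

In this toy graph the responder answering the most questioners ends up with the highest authority score; how the TDNR framework combines such scores with trust degrees is not specified in the abstract.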
3. Enhanced Multi-Label Question Tagging on Stack Overflow: A Two-Stage Clustering and DeBERTa-Based Approach
- Author
-
Isun Chehreh, Farzaneh Saadati, Ebrahim Ansari, and Bahram Sadeghi Bigham
- Subjects
automatic question tagging, stack overflow, smpnet, deberta, Telecommunication, TK5101-6720
- Abstract
This paper introduces a novel method for automatically classifying questions with multiple labels, using data specifically sourced from Stack Overflow. Traditional tagging methods frequently face challenges due to the complexity and semantic diversity of these questions, resulting in inconsistent and sometimes inaccurate results. The process starts with preprocessing to remove any unwanted elements. Next, we convert the questions into meaningful representations using SMPNet. The semantic vectors obtained are then processed using UMAP to help us understand the overall structure of the data and make it easier to cluster similar items. After dimensionality reduction with UMAP, we use the K-Means method to group the questions into clusters, with the best number of groups determined by the Silhouette Score. Finally, a fine-tuned DeBERTa model is trained for each cluster to accurately predict the appropriate tags. Our approach significantly outperforms traditional methods, achieving 2% improvement over the best baseline. This strategy improves model efficiency by narrowing the focus to specific subsets of data.
- Published
- 2024
- Full Text
- View/download PDF
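The abstract of record 3 describes choosing the number of K-Means clusters by Silhouette Score. The sketch below illustrates only that selection step, in plain Python on hypothetical 2-D points. The embedding (SMPNet), UMAP reduction, and DeBERTa stages are omitted, and the deterministic seeding is an assumption made to keep the example reproducible.

```python
from math import dist

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with deterministic seeding (evenly spaced input points)."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def silhouette_score(points, labels):
    """Mean silhouette over all points; singleton clusters contribute 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
scores = {k: silhouette_score(points, kmeans(points, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
```

Here `best_k` is the candidate with the highest mean silhouette. A real pipeline would more likely rely on a library implementation and randomized restarts than on this hand-rolled version.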
4. Richen: Automated enrichment of Git documentation with usage examples and scenarios.
- Author
-
Shen, Chaochao, Yang, Wenhua, Jia, Haitao, Pan, Minxue, and Zhou, Yu
- Subjects
*COMPUTER software development, *EMPIRICAL research, *DOCUMENTATION, *CROWDS, *COMPUTER software
- Abstract
As the predominant modern version control system, Git has become an indispensable tool for both commercial and open‐source software projects. It substantially improves software development effectiveness and efficiency through its distributed version control system, fostering seamless collaboration among teams and across locations. However, research has found that many developers have doubts about using Git commands, while the official Git documentation is rather scanty, that is, lacking sufficient explanations and examples. To help developers learn and use Git commands, we propose the first approach (Richen) for enriching Git documentation with usage examples and scenarios by leveraging crowd knowledge from Stack Overflow. Richen retrieves Git‐related posts from Stack Overflow, extracts relevant Q&A pairs, and selects representative command usages, including usage examples and scenarios, for different Git commands. Experimental results have shown that Richen can extract informative and concise command usages for Git commands. Compared with alternative methods adapted from API usage mining, the command usages obtained by Richen have significant advantages in terms of relevance, readability, and usability. Furthermore, we have shown through an empirical study that the command usages extracted by Richen can better help developers complete Git command‐related tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. A framework for generating recommendations based on trust in an informal e-learning environment
- Author
-
Amjad Rehman, Adeel Ahmed, Tahani Jaser Alahmadi, Abeer Rashad Mirdad, Bayan Al Ghofaily, and Khalid Saleem
- Subjects
Stack overflow, Trust, Neural networks, Recommender systems, HITS algorithm, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Rapid advancement in information technology promotes the growth of new online learning communities in an e-learning environment that overloads information and data sharing. When a new learner asks a question, how a system recommends the answer is the problem of the learner’s cold start. In this article, our contributions are: (i) We proposed a Trust-aware Deep Neural Recommendation (TDNR) framework that addresses learner cold-start issues in informal e-learning by modeling complex nonlinear relationships. (ii) We utilized latent Dirichlet allocation for tag modeling, assigning tag categories to newly posted questions and ranking experts related to specific tags for active questioners based on hub and authority scores. (iii) We enhanced recommendation accuracy in the TDNR model by introducing a degree of trust between questioners and responders. (iv) We incorporated the questioner-responder relational graph, derived from structural preference information, into our proposed model. We evaluated the proposed model on the Stack Overflow dataset using mean absolute precision (MAP), root mean squared error (RMSE), and F-measure metrics. Our significant findings are that TDNR is a hybrid approach that provides more accurate recommendations compared to rating-based and social-trust-based approaches, the proposed model can facilitate the formation of informal e-learning communities, and experiments show that TDNR outperforms the competing methods by an improved margin. The model’s robustness, demonstrated by superior MAE, RMSE, and F-measure metrics, makes it a reliable solution for addressing information overload and user sparsity in Stack Overflow. By accurately modeling complex relationships and incorporating trust degrees, TDNR provides more relevant and personalized recommendations, even in cold-start scenarios. This enhances user experience by facilitating the formation of supportive learning communities and ensuring new learners receive accurate recommendations.
- Published
- 2024
- Full Text
- View/download PDF
6. Comparing emotions in ChatGPT answers and human answers to the coding questions on Stack Overflow
- Author
-
Somayeh Fatahi, Julita Vassileva, and Chanchal K. Roy
- Subjects
generative AI, large language models, natural language processing, emotion analysis, Stack Overflow, ChatGPT, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Introduction: Recent advances in generative Artificial Intelligence (AI) and Natural Language Processing (NLP) have led to the development of Large Language Models (LLMs) and AI-powered chatbots like ChatGPT, which have numerous practical applications. Notably, these models assist programmers with coding queries, debugging, solution suggestions, and providing guidance on software development tasks. Despite known issues with the accuracy of ChatGPT’s responses, its comprehensive and articulate language continues to attract frequent use. This indicates potential for ChatGPT to support educators and serve as a virtual tutor for students. Methods: To explore this potential, we conducted a comprehensive analysis comparing the emotional content in responses from ChatGPT and human answers to 2000 questions sourced from Stack Overflow (SO). The emotional aspects of the answers were examined to understand how the emotional tone of AI responses compares to that of human responses. Results: Our analysis revealed that ChatGPT’s answers are generally more positive compared to human responses. In contrast, human answers often exhibit emotions such as anger and disgust. Significant differences were observed in emotional expressions between ChatGPT and human responses, particularly in the emotions of anger, disgust, and joy. Human responses displayed a broader emotional spectrum compared to ChatGPT, suggesting greater emotional variability among humans. Discussion: The findings highlight a distinct emotional divergence between ChatGPT and human responses, with ChatGPT exhibiting a more uniformly positive tone and humans displaying a wider range of emotions. This variance underscores the need for further research into the role of emotional content in AI and human interactions, particularly in educational contexts where emotional nuances can impact learning and communication.
- Published
- 2024
- Full Text
- View/download PDF
7. Using Graph Neural Network to Analyse and Detect Annotation Misuse in Java Code
- Author
-
Yang, Jingbo, Ji, Xin, Wu, Wenjun, Ren, Jian, Zhang, Kui, Zhang, Wenya, Wang, Qingliang, Dong, Tingting, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Zhang, Xiankun, editor, and Zhang, Qinhu, editor
- Published
- 2024
- Full Text
- View/download PDF
8. How Is Software Reuse Discussed in Stack Overflow?
- Author
-
AlOmar, Eman Abdullah, Peruma, Anthony, Mkaouer, Mohamed Wiem, Newman, Christian, Ouni, Ali, Verma, Dinesh, editor, Madni, Azad M., editor, Hoffenson, Steven, editor, and Xiao, Lu, editor
- Published
- 2024
- Full Text
- View/download PDF
9. JARAD: An Approach for Java API Mention Recognition and Disambiguation in Stack Overflow
- Author
-
Liang, Qingmi, Jin, Yi, Xie, Qi, Kuang, Li, Sheng, Yu, Akan, Ozgur, Editorial Board Member, Bellavista, Paolo, Editorial Board Member, Cao, Jiannong, Editorial Board Member, Coulson, Geoffrey, Editorial Board Member, Dressler, Falko, Editorial Board Member, Ferrari, Domenico, Editorial Board Member, Gerla, Mario, Editorial Board Member, Kobayashi, Hisashi, Editorial Board Member, Palazzo, Sergio, Editorial Board Member, Sahni, Sartaj, Editorial Board Member, Shen, Xuemin, Editorial Board Member, Stan, Mircea, Editorial Board Member, Jia, Xiaohua, Editorial Board Member, Zomaya, Albert Y., Editorial Board Member, Gao, Honghao, editor, Wang, Xinheng, editor, and Voros, Nikolaos, editor
- Published
- 2024
- Full Text
- View/download PDF
10. Quality Prediction of a Stack Overflow Question Using Machine Learning
- Author
-
Mehta, Tanvi, Multaikar, Samruddhi, Patil, Srushti, Gawande, Namrata, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Sharma, Harish, editor, Chakravorty, Antorweep, editor, Hussain, Shahid, editor, and Kumari, Rajani, editor
- Published
- 2024
- Full Text
- View/download PDF
11. An investigation of web use during programming
- Author
-
Alghamdi, Omar, Clinch, Sarah, and Jay, Caroline
- Subjects
Computer science education, Stack Overflow, Online code snippets, Problematic code, Human memory, Professional practice in software engineering, Program comprehension
- Abstract
Websites are a key resource consulted by programmers during their coding tasks, providing essential information, including code snippets. However, the implications of website use are poorly understood with respect to both programmers' cognition and their code outputs. Programmers' (human) memory has also been shown to be an important resource in coding tasks, and there is some evidence from psychology that website use may inhibit memory. Studies of online code repositories also suggest that problematic code propagates through the Web. To date there has been little research on programmers' memory implications from using the Web, nor on programmers' experiences of encountering problematic online code, and whether coding with the Web leads to the adoption of problematic online code in programmers' own code. This thesis sets out to contribute to understandings of the role of websites in programmers' coding activities, and the possible implications of their usage. Three studies provide qualitative and quantitative data describing participants' use of the Web when coding, including its role, follow-on activities and consequences (perceived and actual). The results confirm that the Web and human memory are essential resources used by programmers when coding, and that they make frequent use of search engines and online code when using the Web. Programmers perceived little impact of this web use on their memory, but recognised the prevalence of problematic online code. Through an observed coding task and analysis of resulting source code, we find evidence that encounters with problematic online code can have negative consequences for programmers' code outputs. The results advance the current understanding of Web usage for coding and how it affects programmers' memory and code.
- Published
- 2023
12. Representation Learning for Stack Overflow Posts: How Far Are We?
- Author
-
He, Junda, Zhou, Xin, Xu, Bowen, Zhang, Ting, Kim, Kisub, Yang, Zhou, Thung, Ferdian, Irsan, Ivana Clairine, and Lo, David
- Subjects
CONVOLUTIONAL neural networks, LANGUAGE models, SOFTWARE engineering, TRANSFORMER models, RESEARCH personnel, EVIDENCE gaps
- Abstract
The tremendous success of Stack Overflow has accumulated an extensive corpus of software engineering knowledge, thus motivating researchers to propose various solutions for analyzing its content. The performance of such solutions hinges significantly on the selection of representation models for Stack Overflow posts. As the volume of literature on Stack Overflow continues to burgeon, it highlights the need for a powerful Stack Overflow post representation model and drives researchers' interest in developing specialized representation models that can adeptly capture the intricacies of Stack Overflow posts. The state-of-the-art (SOTA) Stack Overflow post representation models are Post2Vec and BERTOverflow, which are built upon neural networks such as convolutional neural network and transformer architecture (e.g., BERT). Despite their promising results, these representation methods have not been evaluated in the same experimental setting. To fill the research gap, we first empirically compare the performance of the representation models designed specifically for Stack Overflow posts (Post2Vec and BERTOverflow) in a wide range of related tasks (i.e., tag recommendation, relatedness prediction, and API recommendation). The results show that Post2Vec cannot further improve the SOTA techniques of the considered downstream tasks, and BERTOverflow shows surprisingly poor performance. To find more suitable representation models for the posts, we further explore a diverse set of transformer-based models, including (1) general domain language models (RoBERTa, Longformer, and GPT2) and (2) language models built with software engineering related textual artifacts (CodeBERT, GraphCodeBERT, seBERT, CodeT5, PLBart, and CodeGen). This exploration shows that models like CodeBERT and RoBERTa are suitable for representing Stack Overflow posts. However, it also illustrates the "No Silver Bullet" concept, as none of the models consistently wins against all the others. 
Inspired by the findings, we propose SOBERT, which employs a simple yet effective strategy to improve the representation models of Stack Overflow posts by continuing the pre-training phase with the textual artifact from Stack Overflow. The overall experimental results demonstrate that SOBERT can consistently outperform the considered models and increase the SOTA performance significantly for all the downstream tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Building Status in an Online Community
- Author
-
Smirnova, Inna, Reitzig, Markus, and Sorenson, Olav
- Subjects
Human Resources and Industrial Relations, Commerce, Management, Tourism and Services, Strategy, Management and Organisational Behaviour, status attainment, action ambiguity, online communities, stack overflow, experiment, Business and Management, Marketing, Business & Management, Human resources and industrial relations, Strategy, management and organisational behaviour
- Abstract
We argue that the actions for which actors receive recognition vary as they move up the hierarchy. When actors first enter a community, the community rewards them for their easier-to-evaluate contributions to the community. Eventually, however, as these actors rise in status, further increases in stature come increasingly from engaging in actions that are more difficult to evaluate or even impossible to judge. These dynamics produce a positive feedback loop, in which those who have already been accorded some stature garner even greater status through quality-ambiguous actions. We present evidence from Stack Overflow, an online community, and from two online experiments consistent with these expected patterns. Funding: All authors would like to acknowledge funding from the Austrian Science Fund [Grant P 25768-G16]. Supplemental Material: The online appendix is available at https://doi.org/10.1287/orsc.2021.1559 .
- Published
- 2022
14. Enhancing User Experience on Q&A Platforms: Measuring Text Similarity Based on Hybrid CNN-LSTM Model for Efficient Duplicate Question Detection
- Author
-
Muhammad Faseeh, Murad Ali Khan, Naeem Iqbal, Faiza Qayyum, Asif Mehmood, and Jungsuk Kim
- Subjects
Duplicate question identification, stack overflow, deep learning (DL), word embeddings, natural language processing (NLP), question-and-answer (QA) platforms, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
This research introduces an innovative approach for identifying duplicate questions within the Stack Overflow community, a challenging task in NLP. Leveraging deep learning techniques, our proposed methodology combines Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks to capture both local and long-term dependencies in textual data. We employ word embeddings, specifically Google’s Word2Vec and GloVe, to enhance text representation. Extensive experiments on the Stack Overflow dataset demonstrate the effectiveness of our approach, achieving an impressive accuracy of 87.09% and a recall rate of 87.%. The integration of CNN and LSTM models significantly streamlines preprocessing, making it a valuable tool for detecting duplicate questions. Future directions include extending the model to multiple languages and exploring alternative word embedding techniques. Our approach presents promising applications beyond Stack Overflow, offering solutions for identifying similar questions on various QA platforms.
- Published
- 2024
- Full Text
- View/download PDF
15. TopicAns: Topic-informed Architecture for Answer Recommendation on Technical Q&A Site.
- Author
-
Yang, Yuanhang, He, Wei, Gao, Cuiyun, Xu, Zenglin, Xia, Xin, and Liu, Chuanyi
- Subjects
QUESTION & answer websites, LANGUAGE models, SOFTWARE engineering, SOFTWARE engineers, PROBLEM solving
- Abstract
Technical Q&A sites, such as Stack Overflow and Ask Ubuntu, have been widely utilized by software engineers to seek support for development challenges. However, not all the raised questions get instant feedback, and the retrieved answers can vary in quality. The users can hardly avoid spending much time before solving their problems. Prior studies propose approaches to automatically recommend answers for the question posts on technical Q&A sites. However, the lengthiness and the lack of background knowledge issues limit the performance of answer recommendation on these sites. The irrelevant sentences in the posts may introduce noise to the semantics learning and prevent neural models from capturing the gist of texts. The lexical gap between question and answer posts further misleads current models to make failure recommendations. To this end, we propose a novel neural network named TopicAns for answer selection on technical Q&A sites. TopicAns aims at learning high-quality representations for the posts in Q&A sites with a neural topic model and a pre-trained model. This involves three main steps: (1) generating topic-aware representations of Q&A posts with the neural topic model, (2) incorporating the corpus-level knowledge from the neural topic model to enhance the deep representations generated by the pre-trained language model, and (3) determining the most suitable answer for a given query based on the topic-aware representation and the deep representation. Moreover, we propose a two-stage training technique to improve the stability of our model. We conduct comprehensive experiments on four benchmark datasets to verify our proposed TopicAns's effectiveness. Experiment results suggest that TopicAns consistently outperforms state-of-the-art techniques by over 30% in terms of Precision@1. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
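Record 15 reports its results in terms of Precision@1. As a reminder of what that metric measures, a minimal sketch follows. It assumes exactly one correct answer per question, and the question and answer identifiers are hypothetical.

```python
def precision_at_1(ranked_candidates, correct_answers):
    """Fraction of questions whose top-ranked candidate is the correct answer."""
    hits = sum(
        1
        for qid, ranking in ranked_candidates.items()
        if ranking and ranking[0] == correct_answers[qid]
    )
    return hits / len(ranked_candidates)

# Hypothetical rankings produced by some answer-selection model.
ranked = {"q1": ["a3", "a1"], "q2": ["a7", "a2"], "q3": ["a5", "a9"], "q4": ["a4"]}
gold = {"q1": "a3", "q2": "a2", "q3": "a5", "q4": "a4"}
score = precision_at_1(ranked, gold)
```

Only the first position of each ranking matters, so a model that places the right answer second (as for `q2` here) earns no credit for that question.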
16. What causes exceptions in machine learning applications? Mining machine learning-related stack traces on Stack Overflow
- Author
-
Ghadesi, Amin, Lamothe, Maxime, and Li, Heng
- Published
- 2024
- Full Text
- View/download PDF
17. Cybersecurity discussions in Stack Overflow: a developer-centred analysis of engagement and self-disclosure behaviour.
- Author
-
Díaz Ferreyra, Nicolás E., Vidoni, Melina, Heisel, Maritta, and Scandariato, Riccardo
- Abstract
Stack Overflow (SO) is a popular platform among developers seeking advice on various software-related topics, including privacy and security. As for many knowledge-sharing websites, the value of SO depends largely on users' engagement, namely their willingness to answer, comment or post technical questions. Still, many of these questions (including cybersecurity-related ones) remain unanswered, putting the site's relevance and reputation into jeopardy. Hence, it is important to understand users' participation in privacy and security discussions to promote engagement and foster the exchange of such expertise. Objective: Based on prior findings on online social networks, this work elaborates on the interplay between users' engagement and their privacy practices in SO. Particularly, it analyses developers' self-disclosure behaviour regarding profile visibility and their involvement in discussions related to privacy and security. Method: We followed a mixed-methods approach by (i) analysing SO data from 1239 cybersecurity-tagged questions along with 7048 user profiles, and (ii) conducting an anonymous online survey (N=64). Results: About 33% of the questions we retrieved had no answer, whereas more than 50% had no accepted answer. We observed that proactive users tend to disclose significantly less information in their profiles than reactive and unengaged ones. However, no correlations were found between these engagement categories and privacy-related constructs such as perceived control or general privacy concerns. Implications: These findings contribute to (i) a better understanding of developers' engagement towards privacy and security topics, and (ii) to shape strategies promoting the exchange of cybersecurity expertise in SO. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
18. Fine-Grained Modeling of ROP Vulnerability Exploitation Process under Stack Overflow Based on Petri Nets.
- Author
-
Zhang, Liumei, Zhang, Wei, Wang, Yichuan, Xia, Bowen, and Han, Yu
- Subjects
PETRI nets, COMPUTER security vulnerabilities
- Abstract
Software vulnerability discovery is currently a hot topic, and buffer overflow remains a prevalent security vulnerability. One of the key issues in vulnerability discovery and analysis is how to quickly analyze buffer overflow vulnerabilities and select critical exploitation paths. Existing modeling methods for vulnerability exploitation cannot accurately reflect the fine-grained execution process of stack overflow exploitation paths. This paper, based on the discussion of buffer overflow exploitation techniques, proposes a fine-grained modeling and analysis method based on Petri nets for the selection and execution of exploitation processes, specifically focusing on the return-oriented programming in stack overflow. Through qualitative analysis, we compared the simulated time of the software with the execution time of existing exploitation tools, achieving timeout-based simulation experiments. We validated the model's effectiveness using symbolic execution and dynamic analysis techniques. The results indicate that this model performs well for vulnerable programs with Position Independent Executable (PIE) protection enabled and has an advantage in selecting exploitation paths, enabling timeout-based simulation. This method provides a reference for rapidly constructing exploitation implementations. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
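Record 18 models an exploitation process with Petri nets. For readers unfamiliar with the formalism, the sketch below shows only the basic token-firing rule. The place and transition names are hypothetical and do not reproduce the model in the paper.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in transition["inputs"].items())

def fire(marking, transition):
    """Return the new marking after consuming input tokens and producing output tokens."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, n in transition["inputs"].items():
        new_marking[place] = new_marking.get(place, 0) - n
    for place, n in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking

# Hypothetical two-step net: both preconditions must hold before the first step can fire.
take_control = {
    "inputs": {"buffer_overflowed": 1, "gadgets_located": 1},
    "outputs": {"return_address_controlled": 1},
}
run_chain = {
    "inputs": {"return_address_controlled": 1},
    "outputs": {"chain_executed": 1},
}

marking = {"buffer_overflowed": 1, "gadgets_located": 1}
after_first = fire(marking, take_control)
after_second = fire(after_first, run_chain)
```

Each firing moves tokens from input places to output places, so a sequence of firings traces one path through the modeled process.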
19. Automatic Structuring of Topics for Natural Language Generation in Community Question Answering in Programming Domain
- Author
-
Rvanova, Lyudmila, Kovalchuk, Sergey, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Mikyška, Jiří, editor, de Mulatier, Clélia, editor, Paszynski, Maciej, editor, Krzhizhanovskaya, Valeria V., editor, Dongarra, Jack J., editor, and Sloot, Peter M.A., editor
- Published
- 2023
- Full Text
- View/download PDF
20. Towards Quality Improvement and Prediction of Closed Questions on Stack Overflow
- Author
-
Opu, Md. Nahidul Islam, Roy, Animesh Chandra, Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, and Uddin, Mohammad Shorif, editor
- Published
- 2023
- Full Text
- View/download PDF
21. An Empirical Study on How the Developers Discussed About Pandas Topics
- Author
-
Joy, Sajib Kumar Saha, Ahmed, Farzad, Mahamud, Al Hasib, Mandal, Nibir Chandra, Akan, Ozgur, Editorial Board Member, Bellavista, Paolo, Editorial Board Member, Cao, Jiannong, Editorial Board Member, Coulson, Geoffrey, Editorial Board Member, Dressler, Falko, Editorial Board Member, Ferrari, Domenico, Editorial Board Member, Gerla, Mario, Editorial Board Member, Kobayashi, Hisashi, Editorial Board Member, Palazzo, Sergio, Editorial Board Member, Sahni, Sartaj, Editorial Board Member, Shen, Xuemin, Editorial Board Member, Stan, Mircea, Editorial Board Member, Jia, Xiaohua, Editorial Board Member, Zomaya, Albert Y., Editorial Board Member, Satu, Md. Shahriare, editor, Moni, Mohammad Ali, editor, Kaiser, M. Shamim, editor, and Arefin, Mohammad Shamsul, editor
- Published
- 2023
- Full Text
- View/download PDF
22. Stack Tag - Predicting the Stack Overflow Questions’ Tags Using Gated Recurrent Unit Networks
- Author
-
Prakash, Varun, Raghav, Sagar, Sood, Shubham, Pandey, Mrinal, Arora, Mamta, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Abraham, Ajith, editor, Pllana, Sabri, editor, Casalino, Gabriella, editor, Ma, Kun, editor, and Bajaj, Anu, editor
- Published
- 2023
- Full Text
- View/download PDF
23. What Makes a Good Answer? Analyzing the Content Structure of Answers to Stack Overflow’s Most Popular Question
- Author
-
Morales-Navarro, Luis, Barany, Amanda, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Damşa, Crina, editor, and Barany, Amanda, editor
- Published
- 2023
- Full Text
- View/download PDF
24. The Stack and Subroutines
- Author
-
LaMeres, Brock J.
- Published
- 2023
- Full Text
- View/download PDF
25. A Study of E-commerce Platform Issues Shared by Developers on Stack Overflow
- Author
-
Nugroho, Yusuf Sulistyo, Islam, Syful, Gunawan, Dedi, Kurniawan, Yogiek Indra, Hossain, Md. Javed, Kabir, Mohammed Humayun, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Dutta, Paramartha, editor, Chakrabarti, Satyajit, editor, Bhattacharya, Abhishek, editor, Dutta, Soumi, editor, and Piuri, Vincenzo, editor
- Published
- 2023
- Full Text
- View/download PDF
26. Cybersecurity discussions in Stack Overflow: a developer-centred analysis of engagement and self-disclosure behaviour
- Author
-
Díaz Ferreyra, Nicolás E., Vidoni, Melina, Heisel, Maritta, and Scandariato, Riccardo
- Published
- 2024
- Full Text
- View/download PDF
27. Trend Analysis of Large Language Models through a Developer Community: A Focus on Stack Overflow.
- Author
-
Son, Jungha and Kim, Boyoung
- Subjects
*LANGUAGE models, *TREND analysis, *ELECTRONIC data processing, *SPECTRUM allocation
- Abstract
In the rapidly advancing field of large language model (LLM) research, platforms like Stack Overflow offer invaluable insights into the developer community's perceptions, challenges, and interactions. This research aims to analyze LLM research and development trends within the professional community. Through the rigorous analysis of Stack Overflow, employing a comprehensive dataset spanning several years, the study identifies the prevailing technologies and frameworks underlining the dominance of models and platforms such as Transformer and Hugging Face. Furthermore, a thematic exploration using Latent Dirichlet Allocation unravels a spectrum of LLM discussion topics. As a result of the analysis, twenty keywords were derived, and a total of five key dimensions, "OpenAI Ecosystem and Challenges", "LLM Training with Frameworks", "APIs, File Handling and App Development", "Programming Constructs and LLM Integration", and "Data Processing and LLM Functionalities", were identified through intertopic distance mapping. This research underscores the notable prevalence of specific Tags and technologies within the LLM discourse, particularly highlighting the influential roles of Transformer models and frameworks like Hugging Face. This dominance not only reflects the preferences and inclinations of the developer community but also illuminates the primary tools and technologies they leverage in the continually evolving field of LLMs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
28. Demystifying Practices, Challenges and Expected Features of Using GitHub Copilot.
- Author
-
Zhang, Beiqi, Liang, Peng, Zhou, Xiyu, Ahmad, Aakash, and Waseem, Muhammad
- Subjects
COMPUTER software developers, PROGRAMMING languages, JAVASCRIPT programming language, ELECTRONIC data processing, SOURCE code, PYTHON programming language, COMPUTER software development
- Abstract
With the advances in machine learning, there is a growing interest in AI-enabled tools for autocompleting source code. GitHub Copilot, also referred to as the "AI Pair Programmer", has been trained on billions of lines of open source GitHub code, and is one of such tools that has been increasingly used since its launch in June 2021. However, little effort has been devoted to understanding the practices, challenges, and expected features of using Copilot in programming for auto-completed source code from the point of view of practitioners. To this end, we conducted an empirical study by collecting and analyzing the data from Stack Overflow (SO) and GitHub Discussions. More specifically, we searched and manually collected 303 SO posts and 927 GitHub discussions related to the usage of Copilot. We identified the programming languages, Integrated Development Environments (IDEs), technologies used with Copilot, functions implemented, benefits, limitations, and challenges when using Copilot. The results show that when practitioners use Copilot: (1) The major programming languages used with Copilot are JavaScript and Python, (2) the main IDE used with Copilot is Visual Studio Code, (3) the most common used technology with Copilot is Node.js, (4) the leading function implemented by Copilot is data processing, (5) the main purpose of users using Copilot is to help generate code, (6) the significant benefit of using Copilot is useful code generation, (7) the main limitation encountered by practitioners when using Copilot is difficulty of integration, and (8) the most common expected feature is that Copilot can be integrated with more IDEs. Our results suggest that using Copilot is like a double-edged sword, which requires developers to carefully consider various aspects when deciding whether or not to use it. 
Our study provides empirically grounded foundations that could inform software developers and practitioners, as well as provide a basis for future investigations on the role of Copilot as an AI pair programmer in software development. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. Identification of mobile development issues using semantic topic modeling of Stack Overflow posts.
- Author
-
Gurcan, Fatih
- Subjects
MOBILE app development, MOBILE apps, SYSTEMS software, DATA management, RESEARCH personnel
- Abstract
Background: Increasing demands for mobile apps and services have recently led to an intensification of mobile development activities. With the proliferation of mobile development, there has been a major transformation in the architectures, paradigms, knowledge domains and skills of traditional software systems towards mobile development. Therefore, mobile developers experience a wide spectrum of issues specific to development processes of mobile apps and services. Methods: In this article, we conducted a semantic content analysis based on topic modeling using mobile-related questions on Stack Overflow, a popular Q&A site for developers. With the aim of providing an understanding of the issues and challenges faced by mobile developers, we used a semi-automated methodology based on latent Dirichlet allocation (LDA), a probabilistic and generative approach for topic modeling. Results: Our findings revealed that mobile developers' questions focused on 36 topics in six main categories, including "Development", "UI settings", "Tools", "Data Management", "Multimedia", and "Mobile APIs". Besides, we investigated the temporal trends of the discovered issues and their relationships with mobile technologies. Our findings also revealed which issues are the most popular and which issues are the most difficult for mobile development. The methodology and findings of this study have valuable implications for mobile development stakeholders including tool builders, developers, researchers, and educators. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. Retrieving API Knowledge from Tutorials and Stack Overflow Based on Natural Language Queries.
- Author
-
DI WU, XIAO-YUAN JING, HONGYU ZHANG, YANG FENG, HAOWEN CHEN, YUMING ZHOU, and BAOWEN XU
- Subjects
NATURAL languages, APPLICATION program interfaces, BASE pairs, HELP-seeking behavior, KNOWLEDGE gap theory
- Abstract
When encountering unfamiliar APIs, developers tend to seek help from API tutorials and Stack Overflow (SO). API tutorials help developers understand the API knowledge in a general context, while SO often explains the API knowledge in a specific programming task. Thus, tutorials and SO posts together can provide more API knowledge. However, it is non-trivial to retrieve API knowledge from both API tutorials and SO posts based on natural language queries. Two major problems are irrelevant API knowledge in two different resources and the lexical gap between the queries and documents. In this article, we regard a fragment in tutorials and a Question and Answering (Q&A) pair in SO as a knowledge item (KI). We generate ⟨API, FRA⟩ pairs (FRA stands for fragment) from tutorial fragments and APIs and build ⟨API, QA⟩ pairs based on heuristic rules of SO posts. We fuse ⟨API, FRA⟩ pairs and ⟨API, QA⟩ pairs to generate API knowledge (AK for short) datasets, where each data item is an ⟨API, KI⟩ pair. We propose a novel approach, called PLAN, to automatically retrieve API knowledge from both API tutorials and SO posts based on natural language queries. PLAN contains three main stages: (1) API knowledge modeling, (2) query mapping, and (3) API knowledge retrieving. It first utilizes a deep-transfer-metric-learning-based relevance identification (DTML) model to effectively find relevant ⟨API, KI⟩ pairs containing two different knowledge items (⟨API, QA⟩ pairs and ⟨API, FRA⟩ pairs) simultaneously. Then, PLAN generates several potential APIs as a way to reduce the lexical gap between the query and ⟨API, KI⟩ pairs. According to potential APIs, we can select relevant ⟨API, KI⟩ pairs to generate potential results. Finally, PLAN returns a list of ranked ⟨API, KI⟩ pairs that are related to the query. We evaluate the effectiveness of PLAN with 270 queries on Java and Android AK datasets containing 10,072 ⟨API, KI⟩ pairs. 
Our experimental results show that PLAN is effective and outperforms the state-of-the-art approaches. Our user study further confirms the effectiveness of PLAN in locating useful API knowledge. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
31. An empirical study of challenges in machine learning asset management
- Author
-
Zhao, Zhimin, Chen, Yihao, Bangash, Abdul Ali, Adams, Bram, and Hassan, Ahmed E.
- Published
- 2024
- Full Text
- View/download PDF
32. Common challenges of deep reinforcement learning applications development: an empirical study
- Author
-
Morovati, Mohammad Mehdi, Tambon, Florian, Taraghi, Mina, Nikanjam, Amin, and Khomh, Foutse
- Published
- 2024
- Full Text
- View/download PDF
33. Identification of mobile development issues using semantic topic modeling of Stack Overflow posts
- Author
-
Fatih Gurcan
- Subjects
Mobile development issues, Mobile app development, Topic modeling, Stack Overflow, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Background: Increasing demands for mobile apps and services have recently led to an intensification of mobile development activities. With the proliferation of mobile development, there has been a major transformation in the architectures, paradigms, knowledge domains and skills of traditional software systems towards mobile development. Therefore, mobile developers experience a wide spectrum of issues specific to development processes of mobile apps and services. Methods: In this article, we conducted a semantic content analysis based on topic modeling using mobile-related questions on Stack Overflow, a popular Q&A site for developers. With the aim of providing an understanding of the issues and challenges faced by mobile developers, we used a semi-automated methodology based on latent Dirichlet allocation (LDA), a probabilistic and generative approach for topic modeling. Results: Our findings revealed that mobile developers’ questions focused on 36 topics in six main categories, including “Development”, “UI settings”, “Tools”, “Data Management”, “Multimedia”, and “Mobile APIs”. Besides, we investigated the temporal trends of the discovered issues and their relationships with mobile technologies. Our findings also revealed which issues are the most popular and which issues are the most difficult for mobile development. The methodology and findings of this study have valuable implications for mobile development stakeholders including tool builders, developers, researchers, and educators.
- Published
- 2023
- Full Text
- View/download PDF
34. Understanding the Role of Stack Overflow in Supporting Software Development Tasks: A Research Perspective.
- Author
-
Yang, Wenhua and Shen, Chaochao
- Subjects
COMPUTER software development, SOFTWARE engineering, KNOWLEDGE gap theory, ASSOCIATION rule mining, DEBUGGING, SOFTWARE engineers
- Abstract
Stack Overflow is a Q&A website that is popular among developers and extensively used in software engineering (SE) research. A significant body of research has examined how Stack Overflow can assist with software development tasks, such as recommending APIs. However, while researchers have recognized the importance of Stack Overflow in SE research related to software development tasks, the specific ways in which it is utilized and the reasons for its widespread usage in research have not been thoroughly explored. To address these knowledge gaps, we conducted the first study to understand the role of Stack Overflow in assisting with SE research regarding software development tasks by systematically examining relevant and high-quality research works. Meanwhile, we carried out a qualitative survey to gain insight into why researchers choose to utilize Stack Overflow in SE research and to solicit suggestions for the better use of Stack Overflow in research. The study identifies trends in the research area, prominent researchers and organizations, and the types of tasks that utilize Stack Overflow in research, with coding and debugging being the most common. Moreover, it examines how Stack Overflow data is utilized in SE research regarding software development tasks, including searching, training models, and mining associations. Our qualitative survey of researchers indicates that the popularity of Stack Overflow stems from its comprehensive explanations of technical topics that are often not found in documentation or manuals. The findings provide a comprehensive understanding of the role of Stack Overflow in SE research regarding software development tasks, and offer actionable implications for both researchers and stakeholders of Stack Overflow to facilitate future research and improvements. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
35. Multi-Feature Fusion Based Structural Deep Neural Network for Predicting Answer Time on Stack Overflow.
- Author
-
Guo, Shi-Kai, Wang, Si-Wen, Li, Hui, Fan, Yu-Long, Liu, Ya-Qing, and Zhang, Bin
- Subjects
K-nearest neighbor classification, RANDOM forest algorithms
- Abstract
Stack Overflow provides a platform for developers to seek suitable solutions by asking questions and receiving answers on various topics. However, many questions are usually not answered quickly enough. Since the questioners are eager to know the specific time interval at which a question can be answered, it becomes an important task for Stack Overflow to feedback the answer time to the question. To address this issue, we propose a model for predicting the answer time of questions, named Predicting Answer Time (i.e., PAT model), which consists of two parts: a feature acquisition and fusion model, and a deep neural network model. The framework uses a variety of features mined from questions in Stack Overflow, including the question description, question title, question tags, the creation time of the question, and other temporal features. These features are fused and fed into the deep neural network to predict the answer time of the question. As a case study, post data from Stack Overflow are used to assess the model. We use traditional regression algorithms as the baselines, such as Linear Regression, K-Nearest Neighbors Regression, Support Vector Regression, Multilayer Perceptron Regression, and Random Forest Regression. Experimental results show that the PAT model can predict the answer time of questions more accurately than traditional regression algorithms, and shorten the error of the predicted answer time by nearly 10 hours. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
36. Machine-Learning Approach to Automated Doubt Identification on Stack Overflow Comments to Guide Programming Learners.
- Author
-
Tian Hao Chen, Eng Lieh Ouh, Kar Way Tan, and Siaw Ling Lo
- Subjects
MACHINE learning, DATA analysis, INFORMATION technology, ARTIFICIAL intelligence, DIGITAL technology, ARTIFICIAL neural networks, TECHNOLOGICAL innovations
- Abstract
Stack Overflow is a popular Q&A platform for developers to find solutions to programming problems. However, due to the varying quality of user-generated answers, there is a need for ways to help users find high-quality answers. While Stack Overflow's community-based approach can be effective, important technical aspects of the answer need to be captured, and users' comments might contain doubts regarding these aspects. In this paper, we showed the feasibility of using a machine learning model to identify doubts and conducted data analysis. We found that highly reputed users tend to raise more doubts; most answers have doubt in the first comment, and many answers have unsolved doubt in the last comment; high-score and low-score answers are equally likely to contain doubts in comments. Our classifier and findings can provide users with a new perspective on determining answers' helpfulness and allow expert users to easily locate doubts to address. [ABSTRACT FROM AUTHOR]
- Published
- 2023
37. Review of Stack-Based Binary Exploitation Techniques
- Author
-
Jain, Vanita, Singh, Bhanupratap, Swapnil, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Noor, Arti, editor, Sen, Abhijit, editor, and Trivedi, Gaurav, editor
- Published
- 2022
- Full Text
- View/download PDF
38. A qualitative study of architectural design issues in DevOps.
- Author
-
Shahin, Mojtaba, Rezaei Nasab, Ali, and Ali Babar, Muhammad
- Subjects
-
*ARCHITECTURAL design, *ARCHITECTURAL practice, *QUALITATIVE research, *DESIGN software, *EXPERIMENTAL design, *SOFTWARE architecture
- Abstract
Software architecture is critical in succeeding with Development and Operations (DevOps). However, designing software architectures that enable and support DevOps (DevOps‐driven software architectures) is a challenge for organizations. We assert that one of the essential steps towards characterizing DevOps‐driven architectures is to understand architectural design issues raised in DevOps. At the same time, some of the architectural issues that emerge in the DevOps context (and their corresponding architectural practices or tactics) may stem from the context (i.e., domain) and characteristics of software organizations. To this end, we conducted a mixed‐methods study that consists of a qualitative case study of two teams in a company during their DevOps transformation and a content analysis of Stack Overflow and DevOps Stack Exchange posts to understand architectural design issues in DevOps. Our study found eight specific and contextual architectural design issues faced by the two teams and classified architectural design issues discussed in Stack Overflow and DevOps Stack Exchange into 11 groups. Our aggregated results reveal that the main characteristics of DevOps‐driven architectures are being loosely coupled and prioritizing deployability, testability, supportability, and modifiability over other quality attributes. Finally, we discuss some concrete implications for research and practice. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
39. I Know What You Are Searching for: Code Snippet Recommendation from Stack Overflow Posts.
- Author
-
ZHIPENG GAO, XIN XIA, LO, DAVID, GRUNDY, JOHN, XINDONG ZHANG, and ZHENCHANG XING
- Abstract
Stack Overflow has been heavily used by software developers to seek programming-related information. More and more developers use Community Question and Answer forums, such as Stack Overflow, to search for code examples of how to accomplish a certain coding task. This is often considered to be more efficient than working from source documentation, tutorials, or full worked examples. However, due to the complexity of these online Question and Answer forums and the very large volume of information they contain, developers can be overwhelmed by the sheer volume of available information. This makes it hard to find and/or even be aware of the most relevant code examples to meet their needs. To alleviate this issue, in this work, we present a query-driven code recommendation tool, named Que2Code, that identifies the best code snippets for a user query from Stack Overflow posts. Our approach has two main stages: (i) semantically equivalent question retrieval and (ii) best code snippet recommendation. During the first stage, for a given query question formulated by a developer, we first generate paraphrase questions for the input query as a way of query boosting and then retrieve the relevant Stack Overflow posted questions based on these generated questions. In the second stage, we collect all of the code snippets within questions retrieved in the first stage and develop a novel scheme to rank code snippet candidates from Stack Overflow posts via pairwise comparisons. To evaluate the performance of our proposed model, we conduct a large-scale experiment to evaluate the effectiveness of the semantically equivalent question retrieval task and best code snippet recommendation task separately on Python and Java datasets in Stack Overflow. We also perform a human study to measure how real-world developers perceive the results generated by our model. 
Both the automatic and human evaluation results demonstrate the promising performance of our model, and we have released our code and data to assist other researchers. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
40. Assessing the Alignment between the Information Needs of Developers and the Documentation of Programming Languages: A Case Study on Rust.
- Author
-
COGO, FILIPE ROSEIRO, XIN XIA, and HASSAN, AHMED E.
- Subjects
PROGRAMMING languages, INFORMATION needs, DATABASE design, DOCUMENTATION
- Abstract
Programming language documentation refers to the set of technical documents that provide application developers with a description of the high-level concepts of a language (e.g., manuals, tutorials, and API references). Such documentation is essential to support application developers in effectively using a programming language. One of the challenges faced by documenters (i.e., personnel that design and produce documentation for a programming language) is to ensure that documentation has relevant information that aligns with the concrete needs of developers, defined as the missing knowledge that developers acquire via voluntary search. In this article, we present an automated approach to support documenters in evaluating the differences and similarities between the concrete information need of developers and the current state of documentation (a problem that we refer to as the topical alignment of a programming language documentation). Our approach leverages semi-supervised topic modelling that uses domain knowledge to guide the derivation of topics. We initially train a baseline topic model from a set of Rust-related Q&A posts. We then use this baseline model to determine the distribution of topic probabilities of each document of the official Rust documentation. Afterwards, we assess the similarities and differences between the topics of the Q&A posts and the official documentation. Our results show a relatively high level of topical alignment in Rust documentation. Still, information about specific topics is scarce in both the Q&A websites and the documentation, particularly related topics with programming niches such as network, game, and database development. For other topics (e.g., related topics with language features such as structs, patterns and matchings, and foreign function interface), information is only available on Q&A websites while lacking in the official documentation. 
Finally, we discuss implications for programming language documenters, particularly how to leverage our approach to prioritize topics that should be added to the documentation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
41. Empirical Study of the Evolution of Python Questions on Stack Overflow
- Author
-
Gopika Syam, Sangeeta Lal, and Tao Chen
- Subjects
Python programming, Software Development, Stack Overflow, Topic Modelling, Computer software, QA76.75-76.765
- Abstract
Background: Python is a popular and easy-to-use programming language. It is constantly expanding, with new features and libraries being introduced daily for a broad range of applications. This dynamic expansion needs a robust support structure for developers to effectively utilise the language. Aim: In this study we conduct an in-depth analysis focusing on several research topics to understand the theme of Python questions and identify the challenges that developers encounter, using the questions posted on Stack Overflow. Method: We perform a quantitative and qualitative analysis of Python questions in Stack Overflow. Topic Modelling is also used to determine the most popular and difficult topics among developers. Results: The findings of this study revealed a recent surge in questions about scientific computing libraries pandas and TensorFlow. Also, we observed that the discussion of Data Structures and Formats is more popular in the Python community, whereas areas such as Installation, Deployment, and IDE are still challenging. Conclusion: This study can direct the research and development community to put more emphasis on tackling the actual issues that Python programmers are facing.
- Published
- 2023
- Full Text
- View/download PDF
42. Prompt enhance API recommendation: visualize the user’s real intention behind this query
- Author
-
Wang, Yong, Chen, Linjun, Gao, Cuiyun, Fang, Yingtao, and Li, Yong
- Published
- 2024
- Full Text
- View/download PDF
43. Empirical Study of the Evolution of Python Questions on Stack Overflow.
- Author
-
Syam, Gopika, Lal, Sangeeta, and Tao Chen
- Subjects
PYTHON programming language, PYTHONS, PROGRAMMING languages, EMPIRICAL research, DATA structures
- Abstract
Background: Python is a popular and easy-to-use programming language. It is constantly expanding, with new features and libraries being introduced daily for a broad range of applications. This dynamic expansion needs a robust support structure for developers to effectively utilise the language. Aim: In this study we conduct an in-depth analysis focusing on several research topics to understand the theme of Python questions and identify the challenges that developers encounter, using the questions posted on Stack Overflow. Method: We perform a quantitative and qualitative analysis of Python questions in Stack Overflow. Topic Modelling is also used to determine the most popular and difficult topics among developers. Results: The findings of this study revealed a recent surge in questions about scientific computing libraries pandas and TensorFlow. Also, we observed that the discussion of Data Structures and Formats is more popular in the Python community, whereas areas such as Installation, Deployment, and IDE are still challenging. Conclusion: This study can direct the research and development community to put more emphasis on tackling the actual issues that Python programmers are facing. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
44. Trend Analysis of Large Language Models through a Developer Community: A Focus on Stack Overflow
- Author
-
Jungha Son and Boyoung Kim
- Subjects
large language model, Transformer, Hugging Face, Stack Overflow, developer community, Information technology, T58.5-58.64
- Abstract
In the rapidly advancing field of large language model (LLM) research, platforms like Stack Overflow offer invaluable insights into the developer community’s perceptions, challenges, and interactions. This research aims to analyze LLM research and development trends within the professional community. Through the rigorous analysis of Stack Overflow, employing a comprehensive dataset spanning several years, the study identifies the prevailing technologies and frameworks underlining the dominance of models and platforms such as Transformer and Hugging Face. Furthermore, a thematic exploration using Latent Dirichlet Allocation unravels a spectrum of LLM discussion topics. As a result of the analysis, twenty keywords were derived, and a total of five key dimensions, “OpenAI Ecosystem and Challenges”, “LLM Training with Frameworks”, “APIs, File Handling and App Development”, “Programming Constructs and LLM Integration”, and “Data Processing and LLM Functionalities”, were identified through intertopic distance mapping. This research underscores the notable prevalence of specific Tags and technologies within the LLM discourse, particularly highlighting the influential roles of Transformer models and frameworks like Hugging Face. This dominance not only reflects the preferences and inclinations of the developer community but also illuminates the primary tools and technologies they leverage in the continually evolving field of LLMs.
- Published
- 2023
- Full Text
- View/download PDF
45. Linguistic Analysis of Stack Overflow Data: Native English vs Non-native English Speakers
- Author
-
Morin, Janneke, Ghosh, Krishnendu, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Kamp, Michael, editor, Koprinska, Irena, editor, Bibal, Adrien, editor, Bouadi, Tassadit, editor, Frénay, Benoît, editor, Galárraga, Luis, editor, Oramas, José, editor, Adilova, Linara, editor, Krishnamurthy, Yamuna, editor, Kang, Bo, editor, Largeron, Christine, editor, Lijffijt, Jefrey, editor, Viard, Tiphaine, editor, Welke, Pascal, editor, Ruocco, Massimiliano, editor, Aune, Erlend, editor, Gallicchio, Claudio, editor, Schiele, Gregor, editor, Pernkopf, Franz, editor, Blott, Michaela, editor, Fröning, Holger, editor, Schindler, Günther, editor, Guidotti, Riccardo, editor, Monreale, Anna, editor, Rinzivillo, Salvatore, editor, Biecek, Przemyslaw, editor, Ntoutsi, Eirini, editor, Pechenizkiy, Mykola, editor, Rosenhahn, Bodo, editor, Buckley, Christopher, editor, Cialfi, Daniela, editor, Lanillos, Pablo, editor, Ramstead, Maxwell, editor, Verbelen, Tim, editor, Ferreira, Pedro M., editor, Andresini, Giuseppina, editor, Malerba, Donato, editor, Medeiros, Ibéria, editor, Fournier-Viger, Philippe, editor, Nawaz, M. Saqib, editor, Ventura, Sebastian, editor, Sun, Meng, editor, Zhou, Min, editor, Bitetta, Valerio, editor, Bordino, Ilaria, editor, Ferretti, Andrea, editor, Gullo, Francesco, editor, Ponti, Giovanni, editor, Severini, Lorenzo, editor, Ribeiro, Rita, editor, Gama, João, editor, Gavaldà, Ricard, editor, Cooper, Lee, editor, Ghazaleh, Naghmeh, editor, Richiardi, Jonas, editor, Roqueiro, Damian, editor, Saldana Miranda, Diego, editor, Sechidis, Konstantinos, editor, and Graça, Guilherme, editor
- Published
- 2021
- Full Text
- View/download PDF
46. SONAS: A System to Obtain Insights on Web APIs from Stack Overflow
- Author
-
Wang, Naixuan, Cao, Jian, Qi, Qing, Gu, Qi, Qian, Shiyou, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Sun, Yuqing, editor, Liu, Dongning, editor, Liao, Hao, editor, Fan, Hongfei, editor, and Gao, Liping, editor
- Published
- 2021
- Full Text
- View/download PDF
47. Optimizing a Gamified Design Through Reinforcement Learning - a Case Study in Stack Overflow
- Author
-
Martin, Jonathan, Torres, Diego, Fernandez, Alejandro, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Naiouf, Marcelo, editor, Rucci, Enzo, editor, Chichizola, Franco, editor, and De Giusti, Laura, editor
- Published
- 2021
- Full Text
- View/download PDF
48. System Model to Effectively Understand Programming Error Messages Using Similarity Matching and Natural Language Processing
- Author
-
Desai, Veena, Ajawan, Pratijnya, Betadur, Balaji, Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Gunjan, Vinit Kumar, editor, Suganthan, P. N., editor, Haase, Jan, editor, and Kumar, Amit, editor
- Published
- 2021
- Full Text
- View/download PDF
49. Norm Violation in Online Communities – A Study of Stack Overflow Comments
- Author
-
Cheriyan, Jithin, Savarimuthu, Bastin Tony Roy, Cranefield, Stephen, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Aler Tubella, Andrea, editor, Cranefield, Stephen, editor, Frantz, Christopher, editor, Meneguzzi, Felipe, editor, and Vasconcelos, Wamberto, editor
- Published
- 2021
- Full Text
- View/download PDF
50. What Refactoring Topics Do Developers Discuss? A Large Scale Empirical Study Using Stack Overflow
- Author
-
Chaima Abid, Khouloud Gaaloul, Marouane Kessentini, and Vahid Alizadeh
- Subjects
Empirical study, stack overflow, refactoring, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Due to the growing complexity of software systems, there has been a dramatic increase in research and industry demand on refactoring. Refactoring research nowadays addresses challenges beyond code transformation to include, but not limited to, scheduling the opportune time to carry refactoring, recommending specific refactoring activities, detecting refactoring opportunities and testing the correctness of applied refactorings. Very few studies focused on the challenges that practitioners face when refactoring software systems and what should be the current refactoring research focus from the developers’ perspective. Without such knowledge, tool builders invest in the wrong direction, and researchers miss many opportunities for improving the practice of refactoring. In this paper, we collected data from the popular online Q&A site, Stack Overflow, and analyzed posts to identify what do developers ask about refactoring. We clustered these questions to find the different refactoring related topics using one of the most popular topic modeling algorithms, Latent Dirichlet Allocation (LDA). We found that developers are asking about design patterns, design and user interface refactoring, web services, parallel programming, and mobile apps. We also identified what popular refactoring challenges are the most difficult and the current important topics and questions related to refactoring. Moreover, we discovered gaps between existing research on refactoring and the challenges developers face. To the best of our knowledge, this paper represents the first Stack Overflow study to identify the refactoring topics discussed by developers. Our study can help researchers to focus on practical refactoring problems, practitioners know more about current challenges and build better refactoring tools, and educators revise curriculum to target current needs on refactoring.
- Published
- 2022
- Full Text
- View/download PDF