A Multinomial Naïve Bayesian (MNB) Network to Automatically Recommend Topics for GitHub Repositories

Authors :
Claudio Di Sipio
Phuong T. Nguyen
Riccardo Rubei
Davide Di Ruscio
Source :
Proceedings of the Evaluation and Assessment in Software Engineering, EASE
Publication Year :
2020

Abstract

GitHub has become a valuable service for storing and managing software source code. Over the last year, 10M new developers have joined the GitHub community, contributing to more than 44M repositories. To help developers increase the reachability of their repositories, in 2017 GitHub introduced the ability to classify them by means of topics. However, assigning wrong topics to a repository can make it harder for other developers to find and approach it, thus preventing them from contributing to its development. In this paper we investigate the application of Multinomial Naïve Bayesian (MNB) networks to automatically classify GitHub repositories. By analyzing the README file(s) of the repository to be classified and the source code implementing it, the conceived approach is able to recommend GitHub topics. To the best of our knowledge, this is the first supervised approach addressing the considered problem. Consequently, since no suitable baseline exists for comparison, we validated the approach by considering different metrics, aiming to study various quality aspects.
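The general idea of recommending topics with a multinomial Naïve Bayes classifier over README text can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the Laplace smoothing constant `alpha`, the function names `train` and `recommend`, and the toy README snippets and topic labels are all assumptions made for illustration, and the sketch assigns a single topic per training README for simplicity.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(labeled_readmes, alpha=1.0):
    """Fit a multinomial Naive Bayes model.

    labeled_readmes: list of (readme_text, topic) pairs (hypothetical toy format).
    alpha: Laplace smoothing constant (an assumption, not taken from the paper).
    """
    doc_counts = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, topic in labeled_readmes:
        tokens = tokenize(text)
        doc_counts[topic] += 1
        term_counts[topic].update(tokens)
        vocab.update(tokens)

    total_docs = sum(doc_counts.values())
    log_prior = {}
    log_likelihood = {}
    for topic, n_docs in doc_counts.items():
        log_prior[topic] = math.log(n_docs / total_docs)
        # Smoothed estimate of P(term | topic) over the whole vocabulary.
        denom = sum(term_counts[topic].values()) + alpha * len(vocab)
        log_likelihood[topic] = {
            term: math.log((term_counts[topic][term] + alpha) / denom)
            for term in vocab
        }
    return {"vocab": vocab, "log_prior": log_prior, "log_likelihood": log_likelihood}


def recommend(model, readme_text, k=3):
    """Return the k topics with the highest posterior log-score for a README."""
    # Terms never seen during training carry no information and are skipped.
    tokens = [t for t in tokenize(readme_text) if t in model["vocab"]]
    scores = {}
    for topic, prior in model["log_prior"].items():
        likelihood = model["log_likelihood"][topic]
        scores[topic] = prior + sum(likelihood[t] for t in tokens)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical toy training data: (README text, topic).
TRAIN = [
    ("A neural network library for deep learning on GPU tensors", "machine-learning"),
    ("Train and evaluate classification models with gradient descent", "machine-learning"),
    ("A fast HTTP server and router for building REST web APIs", "web"),
    ("Responsive web frontend components with HTML and CSS", "web"),
    ("Command line tool to parse and query SQL database dumps", "database"),
    ("Embedded key value database with SQL query support", "database"),
]

model = train(TRAIN)
print(recommend(model, "Deep learning models trained on GPU", k=2))
```

Returning a ranked list of `k` topics rather than a single label mirrors the recommendation setting described in the abstract, where a repository may carry several topics. A realistic pipeline would also need the source-code features mentioned in the abstract, which this sketch omits.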

Details

ISBN :
978-1-4503-7731-7
ISBNs :
9781450377317
Database :
OpenAIRE
Journal :
Proceedings of the Evaluation and Assessment in Software Engineering
Accession number :
edsair.doi.dedup.....ad6ccd544d048cc27d1bb5e3dea00181
Full Text :
https://doi.org/10.1145/3383219.3383227