A Novel Model for Automatic Identification of Open Source Software License Terms

Authors :
Zhiqiang Wang
Guoqiang Xiao
Zili Zhang
Sheng Wu
Source :
2021 IEEE 4th International Conference on Computer and Communication Engineering Technology (CCET).
Publication Year :
2021
Publisher :
IEEE, 2021.

Abstract

Open source software has become an important driver of software technology innovation and software industry development. Open source software is used and distributed under an open source license, which regulates its use and protects its intellectual property. However, the diversity of licenses makes it difficult for developers to understand license content correctly and increases the risk of illegally combining open source components. Existing methods for alleviating this problem mainly analyse license text manually. In this paper, a novel method is proposed for the automatic identification of open source software license terms. The method consists of three key components: license modeling for license term extraction, a topic model based on Latent Dirichlet Allocation (LDA) to mine topics in licenses, and a topic-term mapping module that constructs the relationships between topics and terms. To evaluate the effectiveness and practicability of our model, we build a new Open Source License Dataset for model training and testing. Experimental results demonstrate that our approach identifies license terms satisfactorily.
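
The abstract names the components but not their implementation, so the following is only a minimal sketch of how a Latent Dirichlet Allocation topic model and a topic-to-term mapping could fit together, not the authors' method. Everything in it is an assumption made for illustration: the toy license snippets in `DOCS`, the hypothetical term vocabulary `TERM_KEYWORDS`, the collapsed Gibbs sampler used for inference, the hyperparameters, and the keyword-overlap rule used to label topics.

```python
import random
from collections import Counter

# Toy license-like snippets, invented for illustration (not the paper's dataset).
DOCS = [
    "redistribute source code must retain copyright notice",
    "redistribute binary form must reproduce copyright notice",
    "patent license granted contributor patent claims",
    "patent litigation terminates patent license granted",
    "modified source code must be distributed under same license",
    "derivative works distributed under same license source",
]

# Hypothetical license terms and indicative keywords (an assumption).
TERM_KEYWORDS = {
    "include-copyright": {"copyright", "notice", "retain", "reproduce"},
    "patent-grant": {"patent", "granted", "claims", "litigation"},
    "copyleft": {"same", "derivative", "modified", "distributed"},
}


def fit_lda(docs, n_topics=3, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (topic-word counts, vocab)."""
    rng = random.Random(seed)
    tokens = [d.split() for d in docs]
    vocab = sorted({w for doc in tokens for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * n_topics for _ in tokens]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(tokens):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    # Resample each token's topic from its conditional distribution.
    for _ in range(iters):
        for d, doc in enumerate(tokens):
            for i, w in enumerate(doc):
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word, vocab


def map_topics_to_terms(topic_word, term_keywords, top_n=5):
    """Label each topic with the term whose keywords best overlap its top words."""
    mapping = {}
    for k, counts in enumerate(topic_word):
        top = {w for w, c in counts.most_common(top_n) if c > 0}
        best = max(term_keywords, key=lambda t: len(top & term_keywords[t]))
        mapping[k] = (best, sorted(top))
    return mapping


if __name__ == "__main__":
    topic_word, _ = fit_lda(DOCS)
    for topic, (term, words) in map_topics_to_terms(topic_word, TERM_KEYWORDS).items():
        print(topic, term, words)
```

On a corpus this small the learned topics are unstable and depend on the seed, so the printed labels should be read as a demonstration of the data flow (text, then topics, then terms) rather than as a meaningful classification.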

Details

Database :
OpenAIRE
Journal :
2021 IEEE 4th International Conference on Computer and Communication Engineering Technology (CCET)
Accession number :
edsair.doi...........5aaa2c2122067c2a7081a94b9e1e0fb7
Full Text :
https://doi.org/10.1109/ccet52649.2021.9544240