1. Applying conceptual clustering to extract...
- Author
- رجب کیانی شاهوند, احمد شعبانی, عاصفه عاصمی, and مرتضی محمدی استا
- Subjects
SCIENTIFIC communication, TERMS & phrases, ENCYCLOPEDIAS & dictionaries, TEXT mining, NOUN phrases (Grammar)
- Abstract
Scientific communication encompasses the various types and forms of communication that use communication methods and tools to exchange scientific knowledge and information. Understanding and improving scientific and research communication requires identifying its terms and concepts. The main objective of this research is therefore to identify and conceptually cluster key terms in the field of scientific communication using text mining techniques. The study is quantitative in approach and applied in purpose, and it used several text mining techniques to identify and cluster these key terms. The research population consisted of the abstracts of 558 articles on scientific communication retrieved from databases such as Web of Science and Scopus; the whole population was studied (census sampling). First, all noun phrases were extracted using available libraries. Each compound phrase was decomposed into its constituent words, and the GloVe vectors of those words were averaged, yielding one numerical vector per compound phrase. For terms absent from the GloVe dictionary, the researchers wrote an equivalent expression using words that the dictionary does contain. These vectors were then clustered with the K-means method. The findings showed that, of 17,930 extracted keywords, 13,651 were noun phrases. In addition, 16% of the terms in the field of scientific communication were single words and 84% were compound. After vectors were created for the compound terms and clustering was performed, 40 conceptual clusters were formed from 792 phrases or terms in the field of scientific communication. After adjusting the clusters and removing weak ones, the researchers finally identified 22 clusters in the field of scientific communication.
The results of this research identify the concepts and components of scientific communication in the form of conceptual clusters and their elements. One of the most significant findings was the assignment of numerical vectors to compound phrases based on the vectors of their constituent words. These vectors were then used to cluster and categorize phrases, as well as to improve and correct some clusters. This method takes semantic aspects and learning into account in the clustering and categorization of concepts, and it will aid the precise analysis of key terms and phrases in various fields. [ABSTRACT FROM AUTHOR]
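The vectorization and clustering steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `TOY_VECTORS` is a hypothetical two-dimensional stand-in for the GloVe dictionary, the example phrases are invented, and `phrase_vector` and `kmeans` are illustrative helper names. Phrases with no known word simply return `None` here, whereas the study rewrote such terms with equivalent expressions.

```python
import random

# Hypothetical toy embeddings standing in for GloVe (values are made up).
TOY_VECTORS = {
    "peer": [0.9, 0.1],
    "review": [0.8, 0.2],
    "journal": [0.85, 0.15],
    "citation": [0.1, 0.9],
    "index": [0.2, 0.8],
    "impact": [0.15, 0.95],
}


def phrase_vector(phrase, vectors):
    """Average the vectors of the known words in a phrase; None if none are known."""
    known = [vectors[w] for w in phrase.lower().split() if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain K-means: assign each point to its nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Invented example phrases, one averaged vector per compound phrase.
phrases = ["peer review", "journal review", "citation index", "impact index"]
vectors = [phrase_vector(p, TOY_VECTORS) for p in phrases]
labels = kmeans(vectors, k=2)

for phrase, label in zip(phrases, labels):
    print(label, phrase)
```

With real data, the toy dictionary would be replaced by pretrained GloVe vectors and the number of clusters chosen to suit the corpus; the abstract reports 40 initial clusters, later reduced to 22.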
- Published
- 2024