Recurrent-neural-network-based Boolean factor analysis and its application to word clustering
- Source :
- IEEE Transactions on Neural Networks. 20(7)
- Publication Year :
- 2009
Abstract
- The objective of this paper is to introduce a neural-network-based algorithm for word clustering as an extension of the neural-network-based Boolean factor analysis algorithm (Frolov et al., 2007). It is shown that this extended algorithm supports even the more complex model of signals that are supposed to be related to textual documents. It is hypothesized that every topic in textual data is characterized by a set of words which coherently appear in documents dedicated to that topic. The appearance of each word in a document is coded by the activity of a particular neuron. In accordance with the Hebbian learning rule implemented in the network, sets of coherently appearing words (treated as factors) create tightly connected groups of neurons, thereby revealing them as attractors of the network dynamics. The found factors are eliminated from the network memory by the Hebbian unlearning rule, which facilitates the search for other factors. Topics related to the found sets of words can be identified based on the words' semantics. To make the method complete, a special technique based on a Bayesian procedure has been developed for two purposes: first, to provide a complete description of factors in terms of component probability, and second, to enhance the accuracy of signal classification, that is, of determining whether a given signal contains the factor. Since it is assumed that every word may contribute to several topics, the proposed method might be related to the method of fuzzy clustering. In this paper, we show that the results of Boolean factor analysis and fuzzy clustering are not contradictory, but complementary. To demonstrate the capabilities of this approach, the method is applied to two types of textual data on neural networks in two different languages. The obtained topics and corresponding words show a good level of agreement despite the fact that identical topics in Russian and English conferences contain different sets of keywords.
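The learn, search, unlearn cycle described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: the documents, the fixed number of active neurons `K`, the unlearning strength `ETA`, the integer-scaled covariance weights, and every function name are assumptions chosen only to keep the example small and deterministic. Words are represented by their index in a hypothetical 8-word vocabulary.

```python
# Toy sketch of Hebbian learning, attractor search and Hebbian unlearning.
# Everything here (data, K, ETA, function names) is an illustrative assumption,
# not the procedure of the paper.

DOCS = [                       # rows: documents, columns: word present (1) or absent (0)
    [1, 1, 1, 0, 0, 0, 0, 0],  # three documents on a first topic (words 0, 1, 2)
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],  # three documents on a second topic (words 4, 5, 6)
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 1, 0],  # one document covering both topics
    [0, 0, 0, 1, 0, 0, 0, 0],  # two documents with an isolated word each
    [0, 0, 0, 0, 0, 0, 0, 1],
]
K = 3      # assumed number of simultaneously active neurons
ETA = 27   # assumed unlearning strength


def hebbian_weights(docs):
    """Covariance-like Hebbian weights, scaled by the document count to stay integer."""
    n_docs, n_words = len(docs), len(docs[0])
    totals = [sum(d[i] for d in docs) for i in range(n_words)]
    w = [[0] * n_words for _ in range(n_words)]
    for i in range(n_words):
        for j in range(n_words):
            if i != j:  # no self-connections
                co_occurrence = sum(d[i] * d[j] for d in docs)
                w[i][j] = n_docs * co_occurrence - totals[i] * totals[j]
    return w


def settle(w, seed, k=K, max_steps=20):
    """Run the dynamics from one seed neuron, keeping the k most excited neurons active."""
    state = {seed}
    for _ in range(max_steps):
        inputs = [sum(w[i][j] for j in state) for i in range(len(w))]
        ranked = sorted(range(len(w)), key=lambda i: (-inputs[i], i))
        new_state = set(ranked[:k])
        if new_state == state:  # fixed point reached
            break
        state = new_state
    return frozenset(state)


def cohesion(w, state):
    """Total connection strength inside a group of neurons."""
    return sum(w[i][j] for i in state for j in state if i < j)


def find_factor(w):
    """Try every neuron as a seed and keep the most tightly connected end state."""
    best = None
    for seed in range(len(w)):
        candidate = settle(w, seed)
        if best is None or cohesion(w, candidate) > cohesion(w, best):
            best = candidate
    return best


def unlearn(w, factor, eta=ETA):
    """Weaken the connections inside a found factor so the next search finds another."""
    for i in factor:
        for j in factor:
            if i != j:
                w[i][j] -= eta


def extract_factors(docs, n_factors):
    w = hebbian_weights(docs)
    found = []
    for _ in range(n_factors):
        factor = find_factor(w)
        found.append(sorted(factor))
        unlearn(w, factor)
    return found


FACTORS = extract_factors(DOCS, 2)
print(FACTORS)  # [[0, 1, 2], [4, 5, 6]]
```

In this toy data the two word groups are recovered one after the other. The second group only becomes the strongest attractor once the connections inside the first have been weakened, which is the role the abstract assigns to unlearning.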
- Subjects :
- Fuzzy clustering
Computer Networks and Communications
Computer science
Fuzzy set
Fuzzy logic
Text mining
Artificial Intelligence
Computer Simulation
Boolean function
Cluster analysis
Mathematical Computing
Language
Models, Statistical
Artificial neural network
General Medicine
Content-addressable memory
Computer Science Applications
Semantics
Recurrent neural network
Boolean network
Hebbian theory
Unsupervised learning
Algorithm design
Neural Networks, Computer
Software
Natural language processing
Algorithms
Details
- ISSN :
- 19410093
- Volume :
- 20
- Issue :
- 7
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Neural Networks
- Accession number :
- edsair.doi.dedup.....3d3f8027a34667fdf91afe3a12e82a6b