Evolution of document networks
- Source :
- Proceedings of the National Academy of Sciences of the United States of America, vol. 101
- Publication Year :
- 2004
Abstract
- How does a network of documents grow without centralized control? This question is becoming crucial as we try to explain the emergent scale-free topology of the World Wide Web and use link analysis to identify important information resources. Existing models of growing information networks have focused on the structure of links but neglected the content of nodes. Here I show that the current models fail to reproduce a critical characteristic of information networks, namely the distribution of textual similarity among linked documents. I propose a more realistic model that generates links by using both popularity and content. This model yields remarkably accurate predictions of both degree and similarity distributions in networks of web pages and scientific literature.
- Subjects :
- Structure (mathematical logic)
Internet
Multidisciplinary
Information retrieval
Models, Statistical
Computer science
Control (management)
Reproducibility of Results
Scientific literature
Documentation
National Academy of Sciences, U.S.
Popularity
Databases, Bibliographic
United States
Web page
Similarity (psychology)
Colloquium
Nerve Net
Periodicals as Topic
Topology (chemistry)
Link analysis
Details
- ISSN :
- 0027-8424
- Volume :
- 101
- Database :
- OpenAIRE
- Journal :
- Proceedings of the National Academy of Sciences of the United States of America
- Accession number :
- edsair.doi.dedup.....360b8f2418505aa46388ff13f1a9b152