89 results on '"zipf's law"'
Search Results
2. Intra WBAN routing using Zipf’s law and intelligent transmission power switching approach (ZITA)
- Author
-
Roy, Moumita, Chowdhury, Chandreyee, Ahmed, Ghufran, Aslam, Nauman, Chattopadhyay, Samiran, and Islam, Saif Ul
- Published
- 2022
- Full Text
- View/download PDF
3. A Fast Indexing Algorithm Optimization with User Behavior Pattern
- Author
-
Wang, Zhu, Luo, Tiejian, Xu, Yanxiang, Cheng, Fuxing, Zhang, Xin, Wang, Xiang, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Zu, Qiaohong, editor, Hu, Bo, editor, and Elçi, Atilla, editor
- Published
- 2013
- Full Text
- View/download PDF
4. On Zipf’s law and the bias of Zipf regressions
- Author
-
Schluter, Christian
- Published
- 2021
- Full Text
- View/download PDF
5. Music, New Aesthetic and Complexity
- Author
-
Adams, David, Grigolini, Paolo, Akan, Ozgur, Series editor, Bellavista, Paolo, Series editor, Cao, Jiannong, Series editor, Dressler, Falko, Series editor, Ferrari, Domenico, Series editor, Gerla, Mario, Series editor, Kobayashi, Hisashi, Series editor, Palazzo, Sergio, Series editor, Sahni, Sartaj, Series editor, Shen, Xuemin (Sherman), Series editor, Stan, Mircea, Series editor, Xiaohua, Jia, Series editor, Zomaya, Albert, Series editor, Coulson, Geoffrey, Series editor, and Zhou, Jie, editor
- Published
- 2009
- Full Text
- View/download PDF
6. Generalized Thermodynamics Underlying the Laws of Zipf and Benford
- Author
-
Altamirano, Carlo, Robledo, Alberto, Akan, Ozgur, Series editor, Bellavista, Paolo, Series editor, Cao, Jiannong, Series editor, Dressler, Falko, Series editor, Ferrari, Domenico, Series editor, Gerla, Mario, Series editor, Kobayashi, Hisashi, Series editor, Palazzo, Sergio, Series editor, Sahni, Sartaj, Series editor, Shen, Xuemin (Sherman), Series editor, Stan, Mircea, Series editor, Xiaohua, Jia, Series editor, Zomaya, Albert, Series editor, Coulson, Geoffrey, Series editor, and Zhou, Jie, editor
- Published
- 2009
- Full Text
- View/download PDF
7. Scaling Behavior of Chinese City Size Distribution
- Author
-
Zhu, Xiaowu, Xiong, Aimin, Li, Liangsheng, Liu, Maoxin, Chen, Xiaosong, Akan, Ozgur, Series editor, Bellavista, Paolo, Series editor, Cao, Jiannong, Series editor, Dressler, Falko, Series editor, Ferrari, Domenico, Series editor, Gerla, Mario, Series editor, Kobayashi, Hisashi, Series editor, Palazzo, Sergio, Series editor, Sahni, Sartaj, Series editor, Shen, Xuemin (Sherman), Series editor, Stan, Mircea, Series editor, Xiaohua, Jia, Series editor, Zomaya, Albert, Series editor, Coulson, Geoffrey, Series editor, and Zhou, Jie, editor
- Published
- 2009
- Full Text
- View/download PDF
8. Modelling of Population Migration to Reproduce Rank-Size Distribution of Cities in Japan
- Author
-
Kuninaka, Hiroto, Matsushita, Mitsugu, Akan, Ozgur, Series editor, Bellavista, Paolo, Series editor, Cao, Jiannong, Series editor, Dressler, Falko, Series editor, Ferrari, Domenico, Series editor, Gerla, Mario, Series editor, Kobayashi, Hisashi, Series editor, Palazzo, Sergio, Series editor, Sahni, Sartaj, Series editor, Shen, Xuemin (Sherman), Series editor, Stan, Mircea, Series editor, Xiaohua, Jia, Series editor, Zomaya, Albert, Series editor, Coulson, Geoffrey, Series editor, and Zhou, Jie, editor
- Published
- 2009
- Full Text
- View/download PDF
9. Emotion Recognition of Pop Music Based on Maximum Entropy with Priors
- Author
-
He, Hui, Chen, Bo, Guo, Jun, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Goebel, Randy, editor, Siekmann, Jörg, editor, Wahlster, Wolfgang, editor, Theeramunkong, Thanaruk, editor, Kijsirikul, Boonserm, editor, Cercone, Nick, editor, and Ho, Tu-Bao, editor
- Published
- 2009
- Full Text
- View/download PDF
10. Zipf’s Law, Hyperbolic Distributions and Entropy Loss
- Author
-
Harremoës, P., Topsoe, F., Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Ahlswede, Rudolf, editor, Bäumer, Lars, editor, Cai, Ning, editor, Aydinian, Harout, editor, Blinovsky, Vladimir, editor, Deppe, Christian, editor, and Mashurian, Haik, editor
- Published
- 2006
- Full Text
- View/download PDF
11. Quantitative Analysis of Zipf’s Law on Web Cache
- Author
-
Shi, Lei, Gu, Zhimin, Wei, Lin, Shi, Yun, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Pan, Yi, editor, Chen, Daoxu, editor, Guo, Minyi, editor, Cao, Jiannong, editor, and Dongarra, Jack, editor
- Published
- 2005
- Full Text
- View/download PDF
12. Chaotic dynamics of generating Markov partitions, and linguistic sequences mimicking Zipf's law
- Author
-
Katsikas, Anastassis A., Nicolis, John S., Siekmann, J., editor, Goos, G., editor, Hartmanis, J., editor, Becker, J. D., editor, Eisele, I., editor, and Mündemann, F. W., editor
- Published
- 1991
- Full Text
- View/download PDF
13. The detection of natural cities in the Netherlands—Nocturnal satellite imagery and Zipf’s law
- Author
-
Bergs, Rolf
- Published
- 2018
- Full Text
- View/download PDF
14. A new feature extraction scheme in wavelet transform for stego image classification
- Author
-
Laimeche, Lakhdar, Merouani, Hayet Farida, and Mazouzi, Smaine
- Published
- 2018
- Full Text
- View/download PDF
15. Testing Heaps’ law for cities using administrative and gridded population data sets
- Author
-
Simini, Filippo and James, Charlotte
- Published
- 2019
- Full Text
- View/download PDF
16. Scale-free distribution as an economic invariant: a theoretical approach
- Author
-
Chakrabarti, Anindya S.
- Published
- 2017
- Full Text
- View/download PDF
17. The power laws: Zipf and inverse Zipf for automated segmentation and classification of masses within mammograms
- Author
-
Hamoud, Meriem, Merouani, Hayet Farida, and Laimeche, Lakhdar
- Published
- 2015
- Full Text
- View/download PDF
18. Scaling laws in emotion-associated words and corresponding network topology
- Author
-
Takehara, Takuma, Ochiai, Fumio, and Suzuki, Naoto
- Published
- 2015
- Full Text
- View/download PDF
19. Monterey Mirror: an experiment in interactive music performance combining evolutionary computation and Zipf’s law
- Author
-
Manaris, Bill, Hughes, Dana, and Vassilandonakis, Yiorgos
- Published
- 2015
- Full Text
- View/download PDF
20. Communication-efficient algorithms for parallel latent Dirichlet allocation
- Author
-
Yan, Jian-Feng, Zeng, Jia, Gao, Yang, and Liu, Zhi-Qiang
- Published
- 2015
- Full Text
- View/download PDF
21. An Introduction to the Novel Challenges in Information Retrieval for Social Media
- Author
-
Giacomo Inches and Fabio Crestani
- Subjects
Information retrieval, Zipf's law, Computer science, Content analysis, Burstiness, business.industry, Human–computer information retrieval, Cosine similarity, Social media, The Internet, business, Heap (data structure)
- Abstract
The importance of the Internet as a communication medium is reflected in the large number of documents being generated every day by users of the different services that take place online. This has caused a massive change in the documents being reached and retrieved. In this article we study how Information Retrieval models should change to reflect the changes that are happening to the documents being processed. We analyse the properties of the online user-generated documents of some of the most established services over the Internet (e.g. Kongregate, Twitter, Myspace and Slashdot) and compare them with a consolidated collection of standard information retrieval documents (e.g. Wall Street Journal, Associated Press, Financial Times). We study the statistical properties of these collections (e.g. Zipf’s Law and Heaps’ Law) and investigate other important features, such as document similarity, term burstiness, emoticons and part-of-speech analysis. We highlight the applicability and limits of traditional content analysis techniques to the new online user-generated documents and show the need for specific processing for those documents in order to provide effective content analysis.
- Published
- 2014
- Full Text
- View/download PDF
22. Dimensionality Reduction for Information Retrieval Using Vector Replacement of Rare Terms
- Author
-
Marian Vajtersic and Tobias Berka
- Subjects
Reduction (complexity), Matrix (mathematics), Information retrieval, Zipf's law, Computer science, Dimensionality reduction, Singular value decomposition, Search engine indexing, Benchmark (computing), Vector space model
- Abstract
Dimensionality reduction by algebraic methods is an established technique to address a number of problems in information retrieval. In this chapter, we introduce a new approach to dimensionality reduction for text retrieval. According to Zipf’s law, the majority of indexing terms occur only in a small number of documents. Our new algorithm exploits this observation to compute a dimensionality reduction. It replaces rare terms by computing a vector which expresses their semantics in terms of common terms. This process produces a projection matrix, which can be applied to a corpus matrix and individual document and query vectors. We give an accurate mathematical and algorithmic description of our algorithms and present an initial experimental evaluation on two benchmark corpora. These experiments indicate that our algorithm can deliver a substantial reduction in the number of features, from 8,742 to 500 and from 47,236 to 392 features, while preserving or even improving the retrieval performance.
- Published
- 2014
- Full Text
- View/download PDF
23. A New Approach to Decrease Invalidate Rate of Weak Consistency Methods in Web Proxy Caching
- Author
-
Zhou Zhou, Qingyun Liu, Chao Zheng, Chen Chen, and Hongzhou Sha
- Subjects
Scheme (programming language), Hardware_MEMORYSTRUCTURES, Hotspot (Wi-Fi), Weak consistency, Resource (project management), Zipf's law, Computer science, Distributed computing, Web page, Bandwidth (computing), Cache, computer, computer.programming_language
- Abstract
With the growing demand for accelerating large-scale web access, web proxy caches are widely used. To make full use of the computing resources and bandwidth of proxy cache nodes, weak cache consistency is the best choice in most cases. Traditional refreshing methods such as Adaptive TTL cause a high invalidate rate for web pages. We introduce a new, effective way to decrease the invalidate rate of frequently queried objects in a weak consistency scheme. Based on Zipf's law, our method gives hotspot objects higher priority during the cache refresh process, which reduces the invalidate rate of hotspot objects by paying less attention to less frequently queried objects.
- Published
- 2014
- Full Text
- View/download PDF
24. A Cost-Aware Strategy for Merging Differential Stores in Column-Oriented In-Memory DBMS
- Author
-
Jens Krüger, Hasso Plattner, Joos-Hendrik Böse, Alexander Zeier, Florian Hübner, and Cafer Tosun
- Subjects
Data store, Database, Zipf's law, Computer science, Merge algorithm, computer.software_genre, computer, Merge (version control)
- Abstract
Fast execution of analytical and transactional queries in column-oriented in-memory DBMS is achieved by combining a read-optimized data store with a write-optimized differential store. To maintain high read performance, both structures must be merged from time to time. In this paper we describe a new merge algorithm that applies full and partial merge operations based on their costs and improvement of read performance. We show by simulation that our algorithm reduces merge costs significantly for workloads found in enterprise applications, while improving read performance at the same time.
- Published
- 2012
- Full Text
- View/download PDF
25. Statistical Metrics for Individual Password Strength
- Author
-
Joseph Bonneau
- Subjects
Password, Structure (mathematical logic), ComputingMilieux_MANAGEMENTOFCOMPUTINGANDINFORMATIONSYSTEMS, Cognitive password, Zipf's law, Computer science, Contrast (statistics), NIST, Data mining, Semantics, computer.software_genre, computer, Password strength
- Abstract
We propose several possible metrics for measuring the strength of an individual password or any other secret drawn from a known, skewed distribution. In contrast to previous ad hoc approaches which rely on textual properties of passwords, we consider the problem without any knowledge of password structure. This enables rating the strength of a password given a large sample distribution without assuming anything about password semantics. We compare the results of our generic metrics against those of the NIST metrics and other previous "entropy-based" metrics for a large password dataset, which suggest over-fitting in previous metrics.
- Published
- 2012
- Full Text
- View/download PDF
26. A Zipf-Like Distant Supervision Approach for Multi-document Summarization Using Wikinews Articles
- Author
-
Felipe Bravo-Marquez and Manuel Manriquez
- Subjects
Support vector machine, Information retrieval, Zipf's law, Ranking, Computer science, Borda count, Multi-document summarization, Tuple, Automatic summarization, Event (probability theory)
- Abstract
This work presents a sentence ranking strategy based on distant supervision for the multi-document summarization problem. Due to the difficulty of obtaining large training datasets formed by document clusters and their respective human-made summaries, we propose building a training and a testing corpus from Wikinews. Wikinews articles are modeled as "distant" summaries of their cited sources, considering that first sentences of Wikinews articles tend to summarize the event covered in the news story. Sentences from cited sources are represented as tuples of numerical features and labeled according to a relationship with the given distant summary that is based on the Zipf law. Ranking functions are trained using linear regressions and ranking SVMs, which are also combined using Borda count. Top ranked sentences are concatenated and used to build summaries, which are compared with the first sentences of the distant summary using ROUGE evaluation measures. Experimental results obtained show the effectiveness of the proposed method and that the combination of different ranking techniques outperforms the quality of the generated summary.
- Published
- 2012
- Full Text
- View/download PDF
27. A Generalized Algorithm for Publish/Subscribe Overlay Design and Its Fast Implementation
- Author
-
Roman Vitenberg, Chen Chen, and Hans-Arno Jacobsen
- Subjects
Speedup, Theoretical computer science, Zipf's law, Computational complexity theory, Computer science, Distributed computing, Search engine indexing, Scalability, Overlay network, Data structure, Rendering (computer graphics)
- Abstract
It is a challenging and fundamental problem to construct the underlying overlay network to support efficient and scalable information distribution in topic-based publish/subscribe systems. Existing overlay design algorithms aim to minimize the node fan-out while building topic-connected overlays, in which all nodes interested in the same topic are organized in a directly connected dissemination sub-overlay. However, most state-of-the-art algorithms suffer from high computational complexity, such as O(|V|^4 |T|), where V is the node set and T is the topic set. We devise a general indexing data structure that provides a significantly faster implementation, with O(|V|^2 |T|) running time, for different state-of-the-art algorithms. The generality of the indexing data structure is due to the fact that it enables edge lookup by both node degree and edge contribution, a central metric in all existing algorithms. When tested on typical pub/sub workloads, the speedup observed was by a factor of over 1 000, thereby rendering the algorithms more suitable for practical use. For example, under a typically Zipf distributed pub/sub workload, with 1 000 nodes and 100 topics, our new implementation completes in 3.823 seconds, while the previous alternative takes over 555 minutes.
- Published
- 2012
- Full Text
- View/download PDF
28. Word Familiarity Distributions to Understand Heaps’ Law of Vocabulary Growth of the Internet Forums
- Author
-
Masao Kubo, Takashi Matsubara, and Hiroshi Sato
- Subjects
World Wide Web, Word lists by frequency, Vocabulary, Zipf's law, Computer science, business.industry, Law, media_common.quotation_subject, Lexical analysis, Single person, The Internet, business, media_common
- Abstract
In this study, lexical analysis is applied to the log data of conversations on Internet forums. It is well known that many regularities in documents have been found, for example, Zipf's law and Heaps' law. This type of analysis has been applied to documents in various media. However, few studies apply this analysis to documents that have been developed by many authors, for example, the log data of conversations on Internet forums. Usually, the relationship between document size and these regularities is not important, because the size of such documents is determined by its author, which is normally only a single person. However, the size of the communication log of an Internet forum is an emergent property for people who are interested in the forum. We believe that it is important to understand the dynamics of conversations. Owing to the investigation in this study, the following trend has been found: the number of posted messages is small if the vocabulary growth parameter β of Heaps' law is not within a preferred range. Additionally, this study proposes a new explanation based on the multiple-author environment to understand the differences in this parameter β. Traditionally, such documents written by more than one person, for example, web sites and programming languages, are analyzed from the single-author point of view. This traditional approach is very important but not sufficient because it cannot address the differences in vocabulary among the authors.
- Published
- 2011
- Full Text
- View/download PDF
29. Using Agentization for Exploring Firm and Labor Dynamics
- Author
-
Robert L. Axtell and Omar A. Guerrero
- Subjects
Stylized fact, Zipf's law, media_common.quotation_subject, Economics, Wage, Rationality, Mathematical economics, Rendering (computer graphics), media_common
- Abstract
Agentization is the process of rendering neoclassical models into computational ones. This methodological tool can be used to analyze and test neoclassical theories under a more flexible computational framework. This paper presents agentization and its methodological framework. We propose that, by classifying the assumptions of a neoclassical model, it is possible to systematically analyze their influence on the predictions of a theory. Furthermore, agentization allows the researcher to explore the potentials and limitations of theories. We present an example by agentizing the model of Gabaix (1999) for the emergence of Zipf laws. We show that the agentized model is able to reproduce the main features of the Gabaix process, without holding neoclassical assumptions such as equilibrium, rationality, agent homogeneity, and centralized anonymous interactions. Additionally, the model generates stylized facts such as tent-shaped firm growth rate distributions and the employer-size wage premium. These regularities are not considered in the neoclassical model. Thus, agentization allows the researcher to explore the boundaries and potentials of the theory.
- Published
- 2011
- Full Text
- View/download PDF
30. Rules of Thumb for Information Acquisition from Large and Redundant Data
- Author
-
Wolfgang Gatterbauer
- Subjects
Zipf's law, Computer science, Pareto principle, Sampling (statistics), 0102 computer and information sciences, 02 engineering and technology, computer.software_genre, 01 natural sciences, Power law, Rule of thumb, Redundancy (information theory), 010201 computation theory & mathematics, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Data mining, Information bias, Invariant (mathematics), computer, Algorithm
- Abstract
We develop an abstract model of information acquisition from redundant data. We assume a random sampling process from data which contain information with bias and are interested in the fraction of information we expect to learn as a function of (i) the sampled fraction (recall) and (ii) varying bias of information (redundancy distributions). We develop two rules of thumb with varying robustness. We first show that, when information bias follows a Zipf distribution, the 80-20 rule or Pareto principle surprisingly does not hold, and we rather expect to learn less than 40% of the information when randomly sampling 20% of the overall data. We then analytically prove that for large data sets, randomized sampling from power-law distributions leads to "truncated distributions" with the same power-law exponent. This second rule is very robust and also holds for distributions that deviate substantially from a strict power law. We further give one particular family of power-law functions that remain completely invariant under sampling. Finally, we validate our model with two large Web data sets: link distributions to web domains and tag distributions on delicious.com.
- Published
- 2011
- Full Text
- View/download PDF
31. On the Analysis of Queues with Heavy Tails: A Non-Extensive Maximum Entropy Formalism and a Generalisation of the Zipf-Mandelbrot Distribution
- Author
-
Salam A. Assi and Demetres D. Kouvatsos
- Subjects
Computer Science::Performance, Combinatorics, Queueing theory, Self-similarity, Zipf's law, Burstiness, Principle of maximum entropy, Physical system, Statistical physics, Mandelbrot set, Queue, Mathematics
- Abstract
A critique of a non-extensive maximum entropy (NME) formalism is undertaken in conjunction with its application to the analysis of queues with heavy tails that are often observed in performance evaluation studies of heterogeneous networks exhibiting traffic burstiness, self-similarity and/or long-range dependence (LRD). The credibility of the NME formalism, as a method of inductive inference, for the study of non-extensive systems with long-range interactions is explored in terms of four consistency axioms of extensive systems with short-range interactions. Focusing on a general physical system and, as a special case, a single server queue with finite capacity, it is shown that the NME state probability is characterised by a generalisation of the Zipf-Mandelbrot (Z-M) type distribution depicting heavy tails and asymptotic power law behaviour. Typical numerical experiments are employed to illustrate the adverse combined impact of traffic burstiness and self-similarity on the behaviour of the queue. A reference to open issues relating to the NME formalism and open queueing networks is included.
- Published
- 2011
- Full Text
- View/download PDF
32. Origins of Scaling in Genetic Code
- Author
-
Daniel Polani, Mikhail Prokopenko, and Oliver Obst
- Subjects
Theoretical computer science, Zipf's law, business.industry, Transition (fiction), media_common.quotation_subject, Ambiguity, Biology, Genetic code, Principle of least effort, Codon usage bias, Horizontal gene transfer, Artificial intelligence, business, Indexicality, media_common
- Abstract
The principle of least effort in communications has been shown, by Ferrer i Cancho and Sole, to explain emergence of power laws (e.g., Zipf's law) in human languages. This paper applies the principle and the information-theoretic model of Ferrer i Cancho and Sole to genetic coding. The application of the principle is achieved via equating the ambiguity of signals used by "speakers" with codon usage, on the one hand, and the effort of "hearers" with needs of amino acid translation mechanics, on the other hand. The re-interpreted model captures the case of the typical (vertical) gene transfer, and confirms that Zipf's law can be found in the transition between referentially useless systems (i.e., ambiguous genetic coding) and indexical reference systems (i.e., zero-redundancy genetic coding). As with linguistic symbols, arranging genetic codes according to Zipf's law is observed to be the optimal solution for maximising the referential power under the effort constraints. Thus, the model identifies the origins of scaling in genetic coding -- via a trade-off between codon usage and needs of amino acid translation. Furthermore, the paper extends the model to multiple inputs, reaching out toward the case of horizontal gene transfer (HGT) where multiple contributors may share the same genetic coding. Importantly, the extended model also leads to a sharp transition between ambiguous HGT and zero-redundancy HGT. Zipf's law is also observed to be the optimal solution in the HGT case.
- Published
- 2011
- Full Text
- View/download PDF
33. Using Cellular Automata on a Graph to Model the Exchanges of Cash and Goods
- Author
-
Ranaivo Mahaleo Razakanirina and Bastien Chopard
- Subjects
Random graph, Strongly connected component, Stochastic cellular automaton, Zipf's law, Topological graph theory, Complex network, Mathematical economics, Cellular automaton, Mathematics, Mobile automaton
- Abstract
This paper investigates the behaviors and the properties of a "Give and Take" cellular automaton on a graph. Using an economic metaphor, this model implements the exchange of cash for goods among the nodes of a graph G, with a local pricing mechanism. During the time evolution of this model, the strongly connected components (SCC) emerge, mimicking the creation of independent sub-markets. In the steady state, each SCC is characterized by a unique price obeying the law of supply and demand for that sub-market. We also show that the distributions of cash and goods are proportional to the indegree of the cells, reproducing a Zipf's law of wealth distribution in the case of a scale-free graph topology.
- Published
- 2010
- Full Text
- View/download PDF
34. Graphical Drop Caps Indexing
- Author
-
Hassan Chouaib, Florence Cloppet, and Nicole Vincent
- Subjects
Pixel, Zipf's law, Computer science, Drop (liquid), Search engine indexing, Data mining, computer.software_genre, tf–idf, computer
- Abstract
This paper presents a method for graphical drop caps indexing. Drop caps are extracted from old books. Finding a method classifying them according to styles defined by the historian is of considerable interest. The developed method is a statistical approach, where all possible patterns included in a pixel mask are processed in order to extract indexes that characterize the image. Then these indexes are used to classify a query drop cap by searching its most similar drop caps in the indexed base.
- Published
- 2010
- Full Text
- View/download PDF
35. Segmenting and Indexing Old Documents Using a Letter Extraction
- Author
-
Michel Ménard, Sloven Dubois, Jean-Marc Ogier, Mickael Coustaty, Laboratoire Informatique, Image et Interaction - EA 2118 (L3I), Université de La Rochelle (ULR), Jean-Marc Ogier, Wenyin Liu, Josep Llados, and ANR-06-MDCA-0012,Navidomass,NAVigation In DOcument MASSes(2006)
- Subjects
Connected component, Zipf's law, Computer science, Search engine indexing, Process (computing), [INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV], 02 engineering and technology, 16. Peace & justice, computer.software_genre, 01 natural sciences, LNCS, [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, 010101 applied mathematics, Market segmentation, 0202 electrical engineering, electronic engineering, information engineering, Selection (linguistics), 020201 artificial intelligence & image processing, Segmentation, Data mining, 0101 mathematics, computer
- Abstract
This paper presents a new method to extract areas of interest in drop caps and particularly the most important shape: the letter itself. This method relies on a combination of an Aujol and Chambolle algorithm and a segmentation using a Zipf law, and can be enhanced as a three-step process: 1) decomposition in layers; 2) segmentation using a Zipf law; 3) selection of the connected components.
- Published
- 2010
- Full Text
- View/download PDF
36. Continuous Gibrat’s Law and Gabaix’s Derivation of Zipf’s Law
- Author
-
Yannick Malevergne, Didier Sornette, and Alexander Saichev
- Subjects
symbols.namesake, Geometric Brownian motion, Wiener process, Gibrat's law, Zipf's law, Stochastic process, Scale (chemistry), symbols, Asset (economics), Mathematical economics, Terminology
- Abstract
In this chapter, we describe in detail the continuous version of Gibrat’s law and explain its close connection with the geometric Brownian motion (GBM), underlying any scale independent stochastic process. Due to the importance of the GBM for many economical, physical, biological and sociological applications, we focus our attention on the basic key properties of GBM. Some more subtle statistical properties of the GBM necessary for a deep understanding of the behavior of its realizations and, ultimately, the corresponding power distributions, are discussed in the following chapters. Although the GBM adequately simulates stochastic processes occurring in various scientific fields, here and for the remainder of the book, we use the terminology of firm’s asset values.
- Published
- 2009
- Full Text
- View/download PDF
37. Firm’s Sudden Deaths
- Author
-
Didier Sornette, Yannick Malevergne, and Alexander Saichev
- Subjects
Market capitalization, Zipf's law, Survival function, Value (economics), Economics, Failure rate, Asset (economics), Monetary economics, Sudden death, Total asset
- Abstract
There are a priori two exit mechanisms for firms: Firms disappear when their asset values become smaller than some minimum level. This is based on the standard idea, justified by the existence of a minimum efficient size, that there is a minimum firm size below which the firm cannot exist. This idea has been considered in several models of firm growth (see, e.g., de Wit, 2005 and references therein). An alternative approach, suggested for instance by Gabaix (1999), considers that firms cannot decline below a minimum size and remain in business at this size until they start growing again. In addition to the exit of a firm resulting from its value decreasing below a certain level, it sometimes happens that a firm encounters financial troubles while its asset value is still fairly high. One could cite the striking examples of Enron Corp. and Worldcom, whose market capitalizations were supposedly high (actually the result of inflated total asset value of about $11 billion for Worldcom and probably much higher for Enron) when they went bankrupt. Beyond these anecdotal examples, there is a large empirical literature on firm entries and exits that suggests the need for taking into account the existence of failure of large firms. For example, while it has been established that a first-order characterization for firm death involves lower failure rates for larger firms (Dunne et al., 1988, 1989), Bartelsman et al. (2003) also state that, for sufficiently old firms, there seems to be no difference in the firm failure rate across size categories. In previous chapters, we have examined the consequences and impact on Zipf’s law of the first exit mechanism. The present chapter is devoted to the study of the second mechanism.
- Published
- 2009
- Full Text
- View/download PDF
38. Deviations from Gibrat’s Law and Implications for Generalized Zipf’s Laws
- Author
-
Alexander Saichev, Didier Sornette, and Yannick Malevergne
- Subjects
Geometric Brownian motion, Class (set theory), Diffusion equation, Zipf's law, Gibrat's law, Diffusion process, Law, Initial value problem, Brownian motion, Mathematics
- Abstract
Introducing a mechanism in which firms die already introduces a deviation from Gibrat’s law for small s-values. Killing firms upon first touching the level s1 > 0 actually means that the corresponding firms’ asset values S(t) do not strictly obey Gibrat’s law of proportionate growth. Indeed, when S(t) becomes close to s1, the possibility of touching s1 arises, and the rate R(t, Δ) given by (2.1) depends significantly on s1. In the present chapter, we will discuss in detail another general class of models in which the stochastic growth process deviates from Gibrat’s law in different ways. Specifically, we will suppose that S(t) is a diffusion process, obeying the stochastic equation $$dS(t) = a[S(t)]\,dt + b[S(t)]\,dW(t), \qquad S(t = 0) = s_0,$$ (6.1) so that the corresponding pdf f(s; t) satisfies the diffusion equation (2.39) and the initial condition (2.40). Recall that Gibrat’s law of proportionate growth implies in particular that the coefficients a(s) and b(s) of the stochastic equation (6.1) are given by relations (2.41), i.e., are proportional to s. However, a large and recent empirical literature suggests that Gibrat’s law does not hold, in particular for small firms (Reid, 1992; Audretsch, 1995; Harhoff et al., 1998; Weiss, 1998; Audretsch et al., 1999; Almus and Nerlinger, 2000; Calvo, 2006). See, however, Lotti et al. (2003, 2007) for a dissenting view.
- Published
- 2009
- Full Text
- View/download PDF
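Note on entry 38: the stochastic equation (6.1) in the abstract above can be explored numerically. The following is a minimal sketch, assuming an Euler-Maruyama discretization (a standard scheme, not something the abstract specifies) and assuming the Gibrat case takes the form a(s) = MU * s and b(s) = SIGMA * s. The names `euler_maruyama`, `MU`, and `SIGMA`, and the parameter values, are hypothetical and chosen only for illustration.

```python
import math
import random


def euler_maruyama(a, b, s0, t_end, n_steps, rng):
    """Simulate one path of dS = a(S) dt + b(S) dW on [0, t_end]."""
    dt = t_end / n_steps
    s = s0
    path = [s]
    for _ in range(n_steps):
        # Brownian increment over one step: normal with mean 0, variance dt
        dw = rng.gauss(0.0, math.sqrt(dt))
        s = s + a(s) * dt + b(s) * dw
        path.append(s)
    return path


# Assumed Gibrat case: drift and volatility both proportional to s.
# MU and SIGMA are illustrative values, not taken from the source.
MU, SIGMA = 0.05, 0.2


def gibrat_a(s):
    return MU * s


def gibrat_b(s):
    return SIGMA * s


rng = random.Random(0)  # fixed seed so the run is reproducible
path = euler_maruyama(gibrat_a, gibrat_b, s0=1.0, t_end=1.0, n_steps=1000, rng=rng)
```

A deviation from Gibrat's law could then be sketched by passing coefficient functions a(s) and b(s) that are not proportional to s.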
39. Useful Properties of Realizations of the Geometric Brownian Motion
- Author
-
Yannick Malevergne, Didier Sornette, and Alexander Saichev
- Subjects
Geometric Brownian motion, Zipf's law, Diffusion process, Basis (linear algebra), Flow (mathematics), Order (ring theory), Statistical physics, Power law, Martingale representation theorem, Mathematics
- Abstract
In the previous chapter, we introduced the mean density of firms’ asset values (3.18), taking into account the flow of firms’ births {t_i}. We provided a preliminary analysis of the properties of the mean density of firm sizes. In order to understand more deeply the roots of the power laws (3.24) and (3.25), and at the same time the basis of Zipf’s law (3.27) and (3.28), we discuss in detail in this chapter the statistical properties of the realizations of the GBM (2.11).
- Published
- 2009
- Full Text
- View/download PDF
40. U. S. Defense Market Concentration: An Analysis of the Period 1996–2006
- Author
-
Wayne Zandbergen
- Subjects
Market structure, Zipf's law, Economics, Monetary economics, Market share, Diffusion (business), Small business, Market concentration, health care economics and organizations, Period (music)
- Abstract
The defense market in the United States has undergone a significant amount of merger activity over the past 20 years. Several sources claim that market concentration is increasing. This paper examines several measures of the structure of the U. S. defense market from 1996–2006. Firm size is established as being Zipf distributed, with an exponent that is stable during this period. Other measures also show that significant market concentration has not resulted from these mergers. Simple computational methods used to generate similar distributions do not explain this observation, suggesting that market entry conditions, firm growth rates, and diffusion of sales associated with purchased firms may be factors in maintaining market structure.
- Published
- 2009
- Full Text
- View/download PDF
41. Efficient Clustering of Web-Derived Data Sets
- Author
-
Luís Sarmento, Lyle H. Ungar, Alexander Kehlenbeck, and Eugénio Oliveira
- Subjects
Connected component, Clustering high-dimensional data, Zipf's law, Correlation clustering, Scalability, Canopy clustering algorithm, Data mining, Cluster analysis, Derived Data, Mathematics
- Abstract
Many data sets derived from the web are large, high-dimensional, sparse and have a Zipfian distribution of both classes and features. On such data sets, current scalable clustering methods such as streaming clustering suffer from fragmentation, where large classes are incorrectly divided into many smaller clusters, and computational efficiency drops significantly. We present a new clustering algorithm based on connected components that addresses these issues and so works well on web-type data.
- Published
- 2009
- Full Text
- View/download PDF
42. Rank-Size Distribution of Notes in Harmonic Music: Hierarchic Shuffling of Distributions
- Author
-
Manuel Beltrán del Río and Germinal Cocho
- Subjects
Distribution (mathematics), Shuffling, Zipf's law, Statistics, Harmonic (mathematics), Pareto distribution, Beta distribution, Power law, Rank-size distribution, Mathematics
- Abstract
We trace the rank-size distribution of notes in harmonic music, which in previous work we suggested was much better represented by the two-parameter, first-class Beta distribution than by the customary power law, to the ranked mixing of distributions dictated by the harmonic and instrumental nature of the piece. The same representation is shown to arise in other fields through the same type of ranked shuffling of distributions. As examples, we include the codon content of intergenic DNA sequences and the ranked distribution of sizes of trees in a given area. We show that the proposed fits become more accurate as the number of distributions that are mixed and ranked increases.
- Published
- 2009
- Full Text
- View/download PDF
43. Group Method of Documentary Collections Using Genetic Algorithms
- Author
-
S. Jose Luis Castillo, León González Sotos, and José Raúl Fernández del Castillo
- Subjects
Information retrieval, Zipf's law, Similarity (network science), Point (typography), Computer science, Genetic algorithm, Group method, Evolutionary algorithm, Data mining, Cluster analysis
- Abstract
We present a method of grouping documents with genetic algorithms, in which the groups are created from the tokens representing each document. The system selects the tokens starting from the Goffman point, using Zipf's law to choose a suitable transition area. The experiments are carried out with the Reuters 21578 collection, and the genetic algorithm uses new operators designed to find the affinity and similarity of the documents without prior knowledge of other characteristics. The proposed method is an alternative to traditional clustering methods, and the results show that the genetic algorithm is robust and clusters the documents in the collection efficiently.
- Published
- 2009
- Full Text
- View/download PDF
44. Spatial and Commuting Networks
- Author
-
Peter Nijkamp, Roberto Patuelli, Franz-Josef Bade, and Aura Reggiani
- Subjects
Transport engineering, Wright, Zipf's law, Urban agglomeration, Economies of agglomeration, Perspective (graphical), Relevance (information retrieval), Economic geography, Information theory, Location theory
- Abstract
A wide literature is devoted to the study of the relevance of space, encompassing several fields and disciplines, such as geography, economics, epidemiology, environmental and regional sciences. For example, space-time modelling has been a relevant focus of research in spatial economics starting from Hagerstrand (1967) and Wilson (1967, 1970). While the former paid attention to the modelling of spatial diffusion phenomena, the latter unified movements of spatial flows under the umbrella of statistical and information theory, by means of spatial interaction models. In these models, the relevance of spatial structure emerged in the associated cost/impedance functions. In parallel, starting from Zipf (1932) and Simon (1955), the importance of spatial structures (homogeneous or heterogeneous) has been discussed extensively in the literature, by focusing on the relationships between urban growth, agglomeration economies, and commuting costs (see, among others, Krugman 1991; Rossi-Hansberg and Wright 2006). A point of concern is that, in these spatial (growth and interaction) models, the effects of spatial topology and connectivity are only implicitly included, but never explicitly considered and discussed.
- Published
- 2009
- Full Text
- View/download PDF
45. Selectivity Estimation for Exclusive Query Translation in Deep Web Data Integration
- Author
-
Weiyi Meng, Fangjiao Jiang, and Xiaofeng Meng
- Subjects
Correlative, Word lists by frequency, Information retrieval, Zipf's law, Computer science, Attribute domain, Of the form, Data mining, Categorical variable, Predicate (grammar), Data integration
- Abstract
In Deep Web data integration, some Web database interfaces express exclusive predicates of the form $Q_e = P_i$ ($P_i \in \{P_1, P_2, \ldots, P_m\}$), which permit only one predicate to be selected at a time. Accurately and efficiently estimating the selectivity of each $Q_e$ is of critical importance to optimal query translation. In this paper, we mainly focus on selectivity estimation for infinite-value attributes, which is more difficult than for key attributes and categorical attributes. First, we compute the attribute correlation and retrieve approximate random attribute-level samples by submitting queries on the least correlative attribute to the actual Web database. Then we estimate the Zipf equation based on the word ranks in the sample and the actual selectivity of several words from the actual Web database. Finally, the selectivity of any word on the infinite-value attribute can be derived from the Zipf equation. An experimental evaluation of the proposed selectivity estimation method is provided, and the experimental results show that it is highly accurate.
- Published
- 2009
- Full Text
- View/download PDF
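Note on entry 45: the abstract above describes estimating a Zipf equation from word ranks and the known selectivity of a few probed words, then deriving the selectivity of any other word from its rank. The following is a minimal sketch of that idea, assuming the Zipf equation has the form f(r) = C / r^alpha and assuming an ordinary least-squares fit in log-log space; the paper's actual estimation procedure is not given in the abstract. The functions `fit_zipf` and `predict_selectivity` and the probe data are hypothetical.

```python
import math


def fit_zipf(ranks, selectivities):
    """Least-squares fit of log f = log C - alpha * log r; returns (C, alpha)."""
    xs = [math.log(r) for r in ranks]
    ys = [math.log(f) for f in selectivities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    alpha = -slope                       # Zipf exponent
    c = math.exp(mean_y - slope * mean_x)  # scale constant C
    return c, alpha


def predict_selectivity(c, alpha, rank):
    """Selectivity implied by the fitted Zipf equation for a word of given rank."""
    return c / rank ** alpha


# Made-up probe data: ranks taken from a sample, selectivities assumed to come
# from probing the database. Here they follow f(r) = 0.1 / r exactly.
probe_ranks = [1, 10, 100, 1000]
probe_selectivities = [0.1 / r for r in probe_ranks]

c, alpha = fit_zipf(probe_ranks, probe_selectivities)
estimate = predict_selectivity(c, alpha, 50)
```

Fitting in log-log space keeps the sketch short; it is known to bias the exponent on noisy data, so a real estimator may prefer a different fitting method.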
46. Modeling Parallel System Workloads with Temporal Locality
- Author
-
Tran Ngoc Minh and Lex Wolters
- Subjects
Theoretical computer science, Similarity (geometry), Zipf's law, Computer science, Locality, Key (cryptography), Parallelism (grammar), Process (computing), Locality of reference, Workload, Parallel computing
- Abstract
In parallel systems, similar jobs tend to arrive within bursty periods. This fact leads to the existence of the locality phenomenon, a persistent similarity between nearby jobs, in real parallel computer workloads. This important phenomenon deserves to be taken into account and used as a characteristic of any workload model. Regrettably, this property has received little if any attention from researchers, and synthetic workloads used for performance evaluation to date often do not have locality. With respect to this research trend, Feitelson has suggested a general repetition approach to model locality in synthetic workloads [6]. Using this approach, Li et al. recently introduced a new method for modeling temporal locality in workload attributes such as run time and memory [14]. However, because they assume that each job in the synthetic workload requires a single processor, parallelism has not been taken into account in their study. In this paper, we propose a new model for parallel computer workloads based on their result. In our research, we first improve their model to better control the locality of a run time process and then model the parallelism. The key idea for modeling the parallelism is to control the cross-correlation between the run time and the number of processors. Experimental results show that not only is the cross-correlation controlled well by our model, but the marginal distribution can also be fitted well. Furthermore, the locality feature is also obtained in our model.
- Published
- 2009
- Full Text
- View/download PDF
47. Firm Size Distribution in Fortune Global 500
- Author
-
Liujun Chen, Qinghua Chen, and Kai Liu
- Subjects
Ideal (set theory), Zipf's law, Simple (abstract algebra), Log-normal distribution, Econometrics, Distribution (economics), Mathematics
- Abstract
By analyzing the data of Fortune Global 500 firms from 1996 to 2008, we found that their ranks and revenues always obey the same distribution, which implies that worldwide firm structure has been stable for a long time. The fitting results show that the simple Zipf distribution is not an ideal model for global firms, while SCL and FSS fit better, and the lognormal fit is the best. We then propose a simple explanation.
- Published
- 2009
- Full Text
- View/download PDF
48. Reducing Splaying by Taking Advantage of Working Sets
- Author
-
Jussi Kujala, Timo Aho, and Tapio Elomaa
- Subjects
Theoretical computer science, Zipf's law, Logarithm, Computer science, Binary search tree, Working set, Locality of reference, Splay tree, Data structure, Algorithm, Access time
- Abstract
Access requests to keys stored in a data structure often exhibit locality of reference in practice. Such a regularity can be modeled, e.g., by working sets. In this paper we study to what extent the existence of working sets can be taken advantage of in splay trees. In order to reduce the number of costly splay operations, we monitor information on the current working set and its change. We introduce a simple algorithm which attempts to splay only when necessary. Under worst-case analysis the algorithm guarantees an amortized logarithmic bound. In empirical experiments it is 5% more efficient than randomized splay trees and at most 10% more efficient than the original splay tree. We also briefly analyze the usefulness of the commonly used Zipf's distribution as a general model of locality of reference.
- Published
- 2008
- Full Text
- View/download PDF
49. Scaling Record Linkage to Non-uniform Distributed Class Sizes
- Author
-
Lars Schmidt-Thieme and Steffen Rendle
- Subjects
Class (computer programming), Zipf's law, Scale (descriptive set theory), Data mining, Markov logic network, Scaling, Record linkage, Blocking (computing), Metamodeling, Mathematics
- Abstract
Record linkage is a central task when information from different sources is integrated. Record linkage models use so-called blockers to reduce the search space by discarding obviously different record pairs. In practice, important problems have Zipf-distributed class sizes, with some large classes where blocking is no longer applicable. Therefore we propose two novel meta-algorithms for scaling arbitrary record linkage models to such data sets. The first one parallelizes problems by creating overlapping subproblems, and the second one reduces the search space for large classes effectively. Our evaluation shows that both scaling techniques are effective and are able to scale state-of-the-art models to challenging datasets.
- Published
- 2008
- Full Text
- View/download PDF
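Note on entry 49: the abstract above relies on the notion of a blocker, which restricts comparisons to record pairs that share a blocking key. The following is a minimal sketch of plain blocking only, not of the two meta-algorithms the authors propose. The function `candidate_pairs`, the `first_letter` key, and the toy records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records, blocking_key):
    """Return pairs of record ids whose records share the same blocking key."""
    blocks = defaultdict(list)
    for rec_id, rec in records.items():
        blocks[blocking_key(rec)].append(rec_id)
    pairs = []
    for ids in blocks.values():
        # Only records inside the same block are compared with each other.
        pairs.extend(combinations(sorted(ids), 2))
    return pairs


# Toy records (made up for illustration).
records = {
    1: "smith john",
    2: "smith jon",
    3: "smyth john",
    4: "doe jane",
    5: "doe jayne",
}


def first_letter(name):
    """Hypothetical blocking key: the first character of the name."""
    return name[0]


pairs = candidate_pairs(records, first_letter)
all_pairs = list(combinations(sorted(records), 2))
```

Here blocking leaves 4 candidate pairs out of 10 possible ones. A block of k records still produces k(k-1)/2 pairs, which is why a few very large classes, as in Zipf-distributed class sizes, can dominate the cost.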
50. Long-range correlations and generalized Lévy walks in DNA sequences
- Author
-
Rosario N. Mantegna, M. H. R. Stanley, Shlomo Havlin, Sergey V. Buldyrev, H. E. Stanley, Chung-Kang Peng, Michael Simons, and Ary L. Goldberger
- Subjects
Zipf's law, Range (statistics), Redundancy (engineering), Statistical physics, Random walk, DNA sequencing, Plot (graphics), Mathematics
- Abstract
There is a mounting body of evidence suggesting that the noncoding regions of DNA are rather special for at least two reasons: 1. They display long-range power-law correlations, as opposed to the exponentially decaying correlations previously believed. 2. They display features common to hierarchically structured languages, specifically a linear Zipf plot and a non-zero redundancy.
- Published
- 2008
- Full Text
- View/download PDF