201. An evolutionary factor analysis computation for mining website structures
- Author
B. Palacios, María del Rocío Martínez-Torres, Federico Barrero, Sergio Toral
Universidad de Sevilla, Departamento de Administración de Empresas y Comercialización e Investigación de Mercados (Marketing); Universidad de Sevilla, Departamento de Ingeniería Electrónica; Ministerio de Educación y Ciencia (MEC), España; Junta de Andalucía
- Subjects
Structure (mathematical logic), Computer science, General Engineering, Evolutionary computation, Genetic algorithms, Field (computer science), Link analysis, Computer Science Applications, Domain (software engineering), Web mining, Artificial Intelligence, Web page, Data mining, Factor analysis, Website structure
- Abstract
This paper explores website link structure by treating websites as interconnected graphs and analyzing their features as social networks. Two networks are extracted to represent each website: a domain network containing the subdomains or external domains linked through the website, and a page network containing the webpages browsed from the root domain. Factor analysis provides the statistical methodology to extract the main website profiles in terms of their internal structure. However, because of the large number of indicators, selecting a representative subset of indicators becomes unaffordable. This paper proposes a genetic search for an optimum subset of indicators, using a multiobjective fitness function based on factor analysis results. The optimum solution provides a coherent and relevant categorization of website profiles and highlights the possibilities of genetic algorithms as a tool for discovering new knowledge in the field of web mining.
Funding: Ministerio de Educación y Ciencia, DPI2007-60128; Junta de Andalucía, Consejería de Innovación, Ciencia y Empresa, P07-TIC-02621
- Published
2012