Author: "Arias Vicente, Marta" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Arias Vicente, Marta"' showing total 131 results

1. ML approach to build an NBA model to optimize client engagement

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Muñiz, Alejandro, and Gomez Jorba, Gerard
Published: 2024

2. Causal discovery and prediction: methods and algorithms

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Gavaldà Mestre, Ricard, Arias Vicente, Marta, and Blondel, Gilles
Abstract: (English) This thesis focuses on the discovery of causal relations and on the prediction of causal effects. Regarding causal discovery, this thesis introduces a novel and generic method to learn causal graphs by performing a sequence of interventions, where each intervention is applied on a single value of the intervened variables, while minimizing the overall cost of the sequence of intervened and observed variables during the discovery process. Regarding causal effect prediction, this thesis introduces a comprehensive causal reasoning method for models recurrent in time. In this thesis, all causal models are assumed to contain hidden confounders that have an influence on observed variables in the causal model, except when explicitly referring to causal models without hidden confounders as a sub-case. Also all variables are assumed to be in a finite domain. Contributions to the Discovery of Causal Relations Our method for the discovery of causal relations introduces several novelties. Firstly, we use interventions on a single value of the intervened variables. All previous methods require interventions on several values of the intervened variables in order to measure correlation or conditional independence among variables. By seeing do-calculus as a tool to predict systematically and numerically the effect of all the interventions that are possible, without having to actually perform them, we have moved the search space out of the real world, and eliminated the need for systematic correlation and independence testing in the real world. We assume that computational cost is not a concern, if we compare it with the cost of actually experimenting in the real world. Secondly, we accept any set of candidate graphs as input to our method. Previous knowledge may or may not be in the form of an equivalence class of graphs, and the set of candidate graphs may or may not have any particular parametrical characteristic. 
Some candidate graphs may have been discarded previously, (Spanish, translated) This thesis studies the learning of causal relations and the prediction of causal effects. Regarding the learning of causal relations, this thesis presents a novel and generic method for learning causal graphs by performing a sequence of interventions, where each intervention is applied to a single value of the intervened variables, minimizing the total cost of the sequence of intervened and observed variables during the learning process. Regarding the prediction of causal effects, this thesis introduces a causal reasoning method for models recurrent in time. In this thesis, we assume that causal models contain hidden variables that influence the observed variables of the model, except when explicit reference is made to causal models without hidden variables as a sub-case. We also assume that all variables are in a finite domain. Contribution to the learning of causal relations Our method for learning causal relations introduces several novelties. First, we use interventions on a single value of the intervened variables. All previous methods require interventions on several values of the intervened variables in order to measure correlation or conditional independence among variables. By using do-calculus as a tool to predict systematically and numerically the effect of all possible interventions, without having to actually perform them, we move the search space out of the real world and eliminate the need for systematic correlation and conditional independence testing in the real world. We assume that we have unlimited computational resources, and that having them is preferable to the cost of experimenting in the real world. 
Second, we accept any set of candidate graphs. Prior knowledge of some parts of the model may or may not have the for, Postprint (published version)
Published: 2023

3. ML empowered vulnerability pattern detection

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Calvo Ibáñez, Albert, Arias Vicente, Marta, and Sánchez i Casals, Alexandre
Abstract: This report presents SIEVA, an AI-powered software that utilizes internet traffic logs to provide a taxonomy for classifying cyberattacks and intrusions using the MITRE framework. The software is divided into three parts: a database for storing logs, an AI engine for classifying and mapping the logs, and a graphical user interface for visualizing the results. The AI engine will implement a natural language processing machine learning classifier to classify the logs, and experimentation with named entity recognition techniques will be conducted to gain a better understanding of the logs. The ultimate goal of the project is to provide a user-friendly and efficient way of visualizing potential threats to a network using internet traffic logs.
Published: 2023

4. Optimizing energy market participation with batteries

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, i.LECO, Arias Vicente, Marta, Mihaylov, Mihail, and Cheaib, Alaa
Abstract: Because the energy sector is in transition, there are goals for lowering energy costs through the use of renewables and batteries. This presents challenges to the system, and one solution is the establishment of energy communities that can make electricity provision cleaner and more secure. Another is to examine how energy flexibility elements, or elements on the consumption side, can make the system more efficient and cheaper, which this work does for day-ahead bidding with batteries. Traditional day-ahead bidding methods have become costly, mainly when the forecasted energy consumption differs from the actual consumption, a difference that is penalized with an imbalance cost. This thesis is part of a larger project (Layered Energy System) that is to be deployed in Spain. Applying such changes to the electricity system first requires becoming familiar with and understanding Spain's context. The first part of this thesis provides research to understand the Spanish regulatory framework, how the market works, and the status of these technologies in Spain. Following that, the primary work of this thesis is to explore how day-ahead market bids could be improved through the use of batteries for better planning and error assumptions. It reviews several day-ahead bidding strategies in the context of energy and batteries, then selects a subset of three of the studied strategies and implements them, comparing their performance on actual electricity data. Finally, it selects the one that best fits various scenarios and requirements. A particular objective function is minimized subject to the battery constraints that involve the variables. A linear program finds the values that best fit those variables at every time step $t$ of a single day. The methodology is an improvement over traditional predictive models. 
After comparing different strategies, results show that strategy one, namely "Stochastic Chance-constraint
Published: 2021

5. Teex: a toolbox for the evaluation of explanations

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Yunzhe Jia, Arias Vicente, Marta, and Antoñanzas Acero, Jesús Maria
Abstract: In the machine learning (ML) community, models are developed, trained and deployed for many applications. Text-to-speech, product and media recommendation, medical assistance, environmental protection and many more are examples of current ML applications. But, more often than not, given the quality requirements for the applications, these models can become very complex. So complex, in fact, that the decisions they take are usually not understandable by humans. These are called black box models. So, given the clear problem of not trusting models' decisions because of the relevance of their impact and their low transparency, explanation methods / explainers were born with the objective of distilling the factors that black box models take into account when making decisions into 'explanations', which humans can understand. There are many categorizations into which explanation methods fall: for example, the type of explanations they produce, the models they work on, their mechanisms for extracting information, or whether they try to characterize a model's whole behaviour (global explanations) or individual predictions (local explanations). Given the current rise of the field of Explainable AI (XAI), which is driven by necessity, researchers need a tool to easily and swiftly evaluate the performance of state-of-the-art explainer methods. On top of current evaluation techniques such as performing subjective human experiments or manually comparing the quality of explanations, we present a toolbox that adds another layer of credibility to part of XAI research. The toolbox is aimed at the automatic evaluation of local explanations via comparison to ground-truth explanations. Version 1.0 contains several evaluation metrics for different explanation types: saliency maps, decision rules and feature and word importance vectors. 
Moreover, the library also provides real-world and artificial data with available ground truth explanations so that users can easily benchmark l
Published: 2021

6. Extracting information from images to improve real estate marketplaces' experience

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Benchekroun, Youssef, and Bosch i Mustarós, Eduard
Published: 2021

7. Feature engineering, dimensionality reduction and interpretability through autoencoders for structured data

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Arratia Quesada, Argimiro Alejandro, and Bofarull Cabello, Antoni
Abstract: Machine Learning is the area of Artificial Intelligence where algorithms learn from data. Therefore, making a good selection of features is essential for the models to perform their tasks in the best possible way. We employ a denoising autoencoder architecture and extend it to take advantage of the aggregation of features from different contexts using several dilated convolutions. We apply sparse group Lasso regularization to cluster them and automatically identify which ones are the most relevant, and we also apply it to the bottleneck neurons to determine whether we can further reduce the dimensionality. Besides reconstruction, we include an extra output from the bottleneck that performs classification. Multi-task learning leverages context-specific information that improves the quality of the encoding. Deep Learning models have commonly been considered black boxes. However, due to the significant difference in performance compared to interpretable models such as linear regression, this has not been a problem in contexts where understanding the models is not as relevant as obtaining good results. In this project, we study the interpretability of models by using the Shapley value method and its extensions. In the practical part, we have empirically studied the proposed model. The results show that the network architecture can identify the most relevant dilation. On the one hand, we can perform a global interpretation of the model by looking at the weights as we do in linear regression. The advantage over other models is that we group the weights by kernels of the dilated convolutions. On the other hand, through the input-output importance matrix using Shapley Values, we can identify which parts of an instance are most relevant to reconstruct its output.
Published: 2021

8. Deep learning based Recommender System for an online retailer

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arratia Quesada, Argimiro Alejandro, Arias Vicente, Marta, and Breve Ramírez, Manuel Alejandro
Abstract: Since Wide and Deep Learning for Recommender Systems appeared in 2016, multiple model architectures have been built around the idea of jointly training wide and deep neural networks, as this architecture allows the model to learn both memorization and generalization, which are critical for recommender systems. Could this kind of architecture change forever the way recommendation systems predict the preference of a user with respect to an item? To answer this question from our own experience, we explore, design, and reproduce a deep learning-based recommender system model and train it with Camper's e-commerce dataset. We wanted to validate how good a wide and deep model can be, and how much it could improve on the accuracy of different baseline models. We ran two different experiments. The first model was trained to predict the rating, on a [1,5] scale, that a user would give to a certain category of shoes, whereas the second model was trained to determine whether or not a user would have an interaction with a specific category of shoes. Our experimental results reveal that wide and deep models perform slightly better than, but similarly to, other deep learning models; however, for small to medium-sized datasets, or for datasets that do not have the most suitable feature variables for a recommendation problem, it would be better to use classic algorithms. Wide and deep models have a nice theoretical basis, but in practice the results only improve under certain circumstances and with huge amounts of data, and even then the improvement may not be that significant. Our results are an invitation not to neglect or ignore the nature of the data. 
Although deep learning models are considerably improving multiple algorithms, they do not always perform better than simpler and well-known machine learning models which require less dat
Published: 2021

9. Smart rehabilitation

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Perez-Uribe, Andrés, and Sendino García, Víctor
Abstract: This thesis is born from a collaboration project between the HEIG-VD and the CHUV hospital in Lausanne, Switzerland. We study the problem of human grasp recognition from first-person RGB video input data. Grasping is the action of seizing and firmly holding an object, and many different grasp types exist. The objective is to use grasp recognition to automate the monitoring of the rehabilitation sessions of patients with upper-limb neurological disorders. We compared three different approaches based on Deep Learning. Firstly, a naive image model that is trained with the entire images. Secondly, a video model that, apart from the spatial features, also takes advantage of the temporal dimension. Lastly, an image model that is trained with images cropped around the hands, so it focuses only on the part that determines the grasp. We used the Yale Grasping Dataset for training the models. To enhance the interpretability of the results we proposed a coarse-grained grasp grouping based on the Feix grasp taxonomy. We also captured our own small first-person video grasp dataset to test the applicability of the models to our setup, which differs from the training dataset in the camera location and angle. Considering the intrinsic challenges of the data, such as the frequent hand-object occlusions, and the dataset difficulties, like its real-world setting and the low video quality, the results are relatively good. Nevertheless, they are insufficient for deploying a satisfactory system at the hospital, and they highlight the difficulty of grasp recognition from just egocentric RGB data. It would be interesting to further research other data modalities such as depth data or to study the problem from the perspective of hand pose estimation and object detection. It is also clear that the field lacks a larger, more modern dataset.
Published: 2021

10. A study of Deep Learning techniques for sequence-based problems

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Arratia Quesada, Argimiro Alejandro, and Quintana Valenzuela, Diego
Abstract: Transformer Networks are a new type of Deep Learning architecture first introduced in 2017. By only applying attention mechanisms, the transformer network can model relations between text sequences and has outperformed other models in natural language processing tasks, such as language translation. In this work, we explore the capabilities of the transformer architecture to model sub-sequences of a time series, and we use this model to produce forecasts of longer horizons. We implement a transformer network model on a time series dataset that describes the daily aggregated sales of Camper, a shoes and apparel store. This model aims to capture the relation between two sub-sequences from the series and produces a forecast of a third sub-sequence in the future. We explore the different parts of the model and their relation to its performance, as well as the impact of modifying the shape of the input sequences used in training and inference. We use this model to forecast one year of data, and we compare these results with those of other, more classical approaches frequently used in time series forecasting, such as Autoregressive Integrated Moving Average (ARIMA) and Long Short-Term Memory (LSTM) networks. We further examine the capabilities of the model to exploit other features from the dataset, such as descriptors of the sales and temporal features from the target. Finally, we look at the attention maps produced by the attention mechanism implemented in the model and discuss its capability to explain the forecasts it produces. Our implementation shows that the model can exploit temporal features and produce forecasts that improve on the proposed benchmarks in most scenarios, and that the attention plots produced provide some explainability guidelines that could be further explored.
Published: 2021

11. Weakly-supervised object detection using explanation methods

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Lim Jin Sean, Nick, and Adell Ripollés, Víctor
Abstract: In this thesis we explore the object detection task with weak supervision. We propose and evaluate an alternative method to generate bounding boxes directly from image explanations on architectures based on both convolutions and transformers that does not rely on object proposals and is more efficient in terms of memory consumption than the current state of the art. Finally, motivated by a use case with environmental data, we explore an architecture based on vision transformers that does not require any kind of labels.
Published: 2021

12. Data engineering for cost reduction, efficiency improvement, and business intelligence for an e-commerce company

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Puig Ramirez, Joaquim, and Carrillo Alza, Alex
Published: 2021

13. Análisis de expectativas : Plan de Comunicación de una empresa

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Montes Martínez, José Luis, and Yang, Ying Ana
Published: 2021

14. Data analysis of socio-economic and financial factors from a public world-wide source

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, and Calvo Fantova, Santiago
Abstract: Many socio-economic studies are nowadays trying to accomplish a complete description of how the different elements of our society and world are connected. This work is an attempt to build an architecture that provides an explanation of the connections and impacts that exist among the different indicators (from now on also referred to as sectors) of a state or population (Agriculture, Climate Change, Economy & Growth, Energy & Mining, Education, Health, Poverty, Science & Technology, Social Development, and others). We focus our effort on researching not only the correlations that may exist between these indicators throughout the different countries analyzed, but also the causality that relates them. With causality (we deploy a Bayesian Network architecture for each country to accomplish this task), we will be able to describe the impact and influence that one indicator may have on the others. This could lead to an accurate, powerful and global knowledge of the functioning of our world and of each single country in particular, along with a vision of the dependencies between the different indicators that describe a country. Finally, we also propose a clustering model where each individual is a representation of the Bayesian Network obtained for each country. With this model, we provide N groupings of countries, each with its Bayesian Network representation, which will give us a global vision of the functioning of our world represented by the causal relationships between the different indicators that can be found in our countries or populations.
Published: 2020

15. Knowledge-based segmentation to improve accuracy and explainability in non-technical losses detection

Author: Universitat Politècnica de Catalunya. Doctorat en Computació, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. ALBCOM - Algorismia, Bioinformàtica, Complexitat i Mètodes Formals, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, Calvo Ibáñez, Albert, Coma Puig, Bernat, Carmona Vargas, Josep, and Arias Vicente, Marta
Abstract: Utility companies have a great interest in identifying energy losses. Here, we focus on Non-Technical Losses (NTL), which refer to losses caused by utility theft or meter errors. Typically, utility companies resort to machine learning solutions to automate and optimise the identification of such losses. This paper extends an existing NTL-detection framework: by including knowledge-based NTL segmentation, we have detected some opportunities for improving the accuracy and the explanations provided to the utility company. Our improved models focus on specific types of NTL and therefore, the explanations provided are easier to interpret, allowing stakeholders to make more informed decisions. The improvements and results presented in the article may benefit other industrial frameworks., This work has been supported by MINECO and FEDER funds under grants TIN2017-86727-C2-1-R and TIN2017-89244-R, and the recognition 2017SGR-856 (MACDA) from AGAUR (Generalitat de Catalunya)., Peer Reviewed, Sustainable Development Goals::9 - Industry, Innovation and Infrastructure, Postprint (published version)
Published: 2020

16. Disseny i implementació d'un sistema ETL en el context d'una fintech

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Lao Monreal, Sergi, and Gomez Esteve, Miquel
Published: 2020

17. Massive data processing for data analysis and visualization to help understand sector trends and monitor sales

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Tandonnet, Charles, and Gazel-Anthoine, Paul
Published: 2020

18. Creating a model for expected Goals in football using qualitative player information

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Fernández, Javier, and Madrero Pardo, Pau
Abstract: The field of sports analytics has been growing a lot in recent years. Sports like baseball and basketball were among the first to embrace it, but football has also taken big steps in that direction. One of the causes is that data analysis allows for the development of new advanced metrics which can provide a competitive advantage. This project presents a new version of one of these advanced metrics applied to football, the Expected Goals. The metric estimates how likely it is for a shot to end up becoming a goal. We present two different approaches for building the predictors: one that uses qualitative player information and another that is player-agnostic. We then reflect on the importance of the calibration of the probabilities yielded by the models, as well as their possible interpretations, and present some of the applications that can be used to evaluate team and player performance. We also show the impact each feature has on the models to make their outputs interpretable and to demonstrate that the addition of the qualitative player information is important for the performance of the model.
Published: 2020

19. Visual Search: finding similar images

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, and Camli, Gorkem
Abstract: The Visual Search task focuses on finding visually similar images given a query image and returning the results in ranked order, with the most similar images ranked first. The main contributions of this thesis are implementing an end-to-end system to perform the visual search that can be used in further research or applications, and conducting experiments on different types of feature extraction and dimensionality reduction methods to understand which ones are more likely to give better search relevance and quality results.
Published: 2020

20. Automatic organizing of user travel items

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Cugat, Josep, and Torres Bellido, Bernat
Published: 2020

21. Roots of Trumpism: Homophily and Social Feedback in Donald Trump Support on Reddit

Author: Universitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics, ISI Foundation, Bonchi, Francesco, Monti, Corrado, De Francisci Morales, Gianmarco, Arias Vicente, Marta, and Massachs Güell, Joan
Abstract: We study the emergence of support for Donald Trump in Reddit’s political discussion. With almost 800k subscribers, “r/The_Donald” is one of the largest communities on Reddit, and one of the main hubs for Trump supporters. It was created in 2015, shortly after Donald Trump began his presidential campaign. By using only data from 2012, we predict the likelihood of being a supporter of Donald Trump in 2016, the year of the last US presidential elections. To characterize the behavior of Trump supporters, we draw from three different sociological hypotheses: homophily, social influence, and social feedback. We operationalize each hypothesis as a set of features for each user, and train classifiers to predict their participation in r/The_Donald. We find that homophily-based and social feedback-based features are the most predictive signals. Conversely, we do not observe a strong impact of social influence mechanisms. We also perform an introspection of the best-performing model to build a “persona” of the typical supporter of Donald Trump on Reddit. We find evidence that the most prominent traits include a predominance of masculine interests, a conservative and libertarian political leaning, and links with politically incorrect and conspiratorial content., Outgoing
Published: 2020

22. Characterizing transactional databases for frequent itemset mining

Author: Lezcano Ríos, Christian Gerardo, Arias Vicente, Marta, Universitat Politècnica de Catalunya. Doctorat en Computació, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, and Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge
Subjects: Bases de dades, Databases, Transactional databases, Informàtica::Intel·ligència artificial::Aprenentatge automàtic [Àrees temàtiques de la UPC], Machine learning, Aprenentatge automàtic, Frequent itemset mining, Mineria de dades, Data mining, Data characterization
Abstract: This paper presents a study of the characteristics of transactional databases used in frequent itemset mining. Such characterizations have typically been used to benchmark and understand the data mining algorithms working on these databases. The aim of our study is to give a picture of how diverse and representative these benchmarking databases are, both in general and in the context of particular empirical studies found in the literature. Our proposed list of metrics contains many of the existing metrics found in the literature, as well as new ones. Our study shows that our list of metrics is able to capture much of the datasets’ inner complexity and thus provides a good basis for the characterization of transactional datasets. Finally, we provide a set of representative datasets based on our characterization that may be used as a benchmark safely. Both authors have been partially supported by TIN2017-89244-R from MINECO (Spain’s Ministerio de Economia, Industria y Competitividad) and the recognition 2017SGR-856 (MACDA) from AGAUR (Generalitat de Catalunya). Christian Lezcano is supported by Paraguay’s Foreign Postgraduate Scholarship Programme Don Carlos Antonio López (BECAL).
Published: 2019

23. Challenging the generalization capabilities of Graph Neural Networks for network modeling

Author: Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya. CBA - Sistemes de Comunicacions i Arquitectures de Banda Ampla, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, Suárez-Varela Maciá, José Rafael, Carol Bosch, Sergi, Rusek, Krzysztof, Almasan Puscas, Felician Paul, Arias Vicente, Marta, Barlet Ros, Pere, and Cabellos Aparicio, Alberto
Abstract: Today, network operators still lack functional network models able to make accurate predictions of end-to-end Key Performance Indicators (e.g., delay or jitter) at limited cost. Recently, a novel Graph Neural Network (GNN) model called RouteNet was proposed as a cost-effective alternative to estimate the per-source/destination pair mean delay and jitter in networks. Thanks to its GNN architecture that operates over graph-structured data, RouteNet revealed an unprecedented ability to learn and model the complex relationships among topology, routing and input traffic in networks. As a result, it was able to make performance predictions with accuracy similar to that of resource-hungry packet-level simulators, even in network scenarios unseen during training. In this demo, we will challenge the generalization capabilities of RouteNet with more complex scenarios, including larger topologies., This work was supported by the Spanish MINECO under contract TEC2017-90034-C2-1-R (ALLIANCE), the Catalan Institution for Research and Advanced Studies (ICREA) and the AGH University of Science and Technology grant, under contract no. 15.11.230.400. The research was also supported in part by PL-Grid Infrastructure., Peer Reviewed, Postprint (author's final draft)
Published: 2019

24. Graph Neural Networks and its applications

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, and Rodríguez Esmerats, Pau
Abstract: This project will explore some of the most prominent Graph Neural Network variants and apply them to two tasks: approximation of the community detection Girvan-Newman algorithm and compiled code snippet classification.
Published: 2019

25. A benchmark for graph neural networks for computer network modeling

Author: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Cabellos Aparicio, Alberto, Arias Vicente, Marta, and Carol Bosch, Sergi
Abstract: Today, network operators still lack functional network models able to make accurate predictions of end-to-end Key Performance Indicators (e.g., delay). This thesis introduces a benchmark for computer network modeling using the RouteNet Graph Neural Network, as well as a routing creation algorithm.
Published: 2019

26. Optimization of the search engine ElasticSearch

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Orange, Ecole d’Ingénieurs d’Informatique et Système d’Information en Santé, Arias Vicente, Marta, Soto-Romero, Georges, Naamani, Karim, and Coviaux, Quentin
Abstract: This thesis presents the work done in the Search on Demand team at Orange. It covers the optimization of the search engine Elasticsearch, the ways to bring data into it by means of an ETL, and how relevance can be tuned using Lucene's inverted indices.
Published: 2019

27. Synthetic dataset generation with itemset-based generative models

Author: Universitat Politècnica de Catalunya. Doctorat en Computació, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, Lezcano Ríos, Christian Gerardo, and Arias Vicente, Marta
Abstract: This paper proposes three different data generators, tailored to transactional datasets, based on existing itemset-based generative models. All these generators are intuitive and easy to implement and show satisfactory performance. The quality of each generator is assessed by means of three different methods that capture how well the original dataset structure is preserved., Both authors have been partially supported by TIN2017-89244-R from MINECO (Spain’s Ministerio de Economia, Industria y Competitividad) and the recognition 2017SGR-856 (MACDA) from AGAUR (Generalitat de Catalunya). Christian Lezcano is supported by Paraguay’s Foreign Postgraduate Scholarship Programme Don Carlos Antonio López (BECAL)., Peer Reviewed, Postprint (author's final draft)
Published: 2019

28. Characterizing transactional databases for frequent itemset mining

Author: Universitat Politècnica de Catalunya. Doctorat en Computació, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, Lezcano Ríos, Christian Gerardo, and Arias Vicente, Marta
Abstract: This paper presents a study of the characteristics of transactional databases used in frequent itemset mining. Such characterizations have typically been used to benchmark and understand the data mining algorithms working on these databases. The aim of our study is to give a picture of how diverse and representative these benchmarking databases are, both in general and in the context of particular empirical studies found in the literature. Our proposed list of metrics contains many of the existing metrics found in the literature, as well as new ones. Our study shows that our list of metrics is able to capture much of the datasets’ inner complexity and thus provides a good basis for the characterization of transactional datasets. Finally, we provide a set of representative datasets based on our characterization that may be used as a benchmark safely., Both authors have been partially supported by TIN2017-89244-R from MINECO (Spain’s Ministerio de Economia, Industria y Competitividad) and the recognition 2017SGR-856 (MACDA) from AGAUR (Generalitat de Catalunya). Christian Lezcano is supported by Paraguay’s Foreign Postgraduate Scholarship Programme Don Carlos Antonio López (BECAL)., Peer Reviewed, Postprint (published version)
Published: 2019

29. Implementació de la funcionalitat offline d'un sistema Point-Of-Sale

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Alonso Bohigas, Gerard, and Solís Gilabert, Roger
Published: 2019

30. Analysis on distance metrics approaches in graphs and their applications

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Arratia Quesada, Argimiro Alejandro, and Cebollero Ruiz, Laura
Published: 2019

31. Plataforma NIRS PAT para la industria 4.0 con Tensorflow

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, IRIS, Arias Vicente, Marta, Rosales Lavielle, Alejandro Alberto, and Chamizo Álvarez, Víctor
Abstract: Using the Tensorflow deep learning platform, classification and quantification models have been obtained from spectroscopic data. These have been compared with the traditional analytical methods of chemometrics, to see whether they are the way forward in Industry 4.0.
Published: 2019

32. Learning complex games through self play - Pokémon battles

Author: Cabellos Aparicio, Alberto, Arias Vicente, Marta, Giró Nieto, Xavier, and Llobet Sanchez, Miquel
Abstract: In this project we analyze the feasibility of using reinforcement learning and self-play to train an agent to play Pokémon Battles. The game is analyzed in depth, and its unique properties and challenges are revealed. The project surveys different reinforcement learning libraries.
Published: 2018

33. Automated construction and analysis of political networks via open government and media sources

Author: García-Olano, Diego, Arias Vicente, Marta (ORCID: 0000-0001-7359-1815), Larriba Pey, Josep (ORCID: 0000-0002-7070-9256), Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, and Universitat Politècnica de Catalunya. DAMA-UPC - Data Management Group
Subjects: ComputingMilieux_THECOMPUTINGPROFESSION, Text mining, Informàtica::Sistemes d'informació [Àrees temàtiques de la UPC], Network science, Open data, Mineria de dades, Data mining, Political science, Ciències polítiques
Abstract: We present a tool to generate real world political networks from user provided lists of politicians and news sites. Additional output includes visualizations, interactive tools and maps that allow a user to better understand the politicians and their surrounding environments as portrayed by the media. As a case study, we construct a comprehensive list of current Texas politicians, select news sites that convey a spectrum of political viewpoints covering Texas politics, and examine the results. We propose a “Combined” co-occurrence distance metric to better reflect the relationship between two entities. A topic modeling technique is also proposed as a novel, automated way of labeling communities that exist within a politician’s “extended” network.
Published: 2016

34. Semblant cerca semblant?: la formació de grups de treball en la pràctica de la programació

Author: Sanou Gozalo, Eduard, Arias Vicente, Marta, Ferrer Cancho, Ramon, Hernández Fernández, Antonio, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. Institut de Ciències de l'Educació, and Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge
Subjects: Ensenyament de la programació, Grau d'informàtica, Electronic data processing -- Study and teaching (Higher), Ensenyament i aprenentatge::Ensenyament universitari [Àrees temàtiques de la UPC], Programació per parelles, Informàtica -- Ensenyament universitari -- Problemes, exercicis, etc, Formació d'equips, Treball en parelles, Neuroeducació, Ensenyament i aprenentatge::Metodologies docents [Àrees temàtiques de la UPC]
Abstract: In a course of the computer science degree, the programming project has changed from individual work to teamwork, in principle in pairs (pair programming). Students have full freedom to team up, with minimum intervention from professors. The analysis of the pairs formed indicates that students do not tend to associate with students with a similar academic performance, maybe because general cognitive parameters do not govern the choice of academic partners.
Published: 2016

35. Identifiability and transportability in dynamic causal networks

Author: Blondel, Gilles, Arias Vicente, Marta, Gavaldà Mestre, Ricard, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, and Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge
Subjects: Graph theory, Belief networks, Dynamic Bayesian networks, Informàtica::Informàtica teòrica [Àrees temàtiques de la UPC], Standard causal graphs, Grafs, Teoria de, Dynamic causal networks
Abstract: In this paper we propose a causal analog to the purely observational Dynamic Bayesian Networks, which we call Dynamic Causal Networks. We provide a sound and complete algorithm for identification of Dynamic Causal Networks, namely, for computing the effect of an intervention or experiment, based on passive observations only, whenever possible. We note the existence of two types of confounder variables that affect in substantially different ways the identification procedures, a distinction with no analog in either Dynamic Bayesian Networks or standard causal graphs. We further propose a procedure for the transportability of causal effects in Dynamic Causal Network settings, where the result of causal experiments in a source domain may be used for the identification of causal effects in a target domain.
Published: 2016

36. Does like seek like?: the formation of working groups in a programming project

Author: Sanou Gozalo, Eduard, Hernández Fernández, Antonio, Arias Vicente, Marta, and Ferrer Cancho, Ramon
Abstract: In a course of the degree of computer science, the programming project has changed from individual to teamed work, tentatively in couples (pair programming). Students have full freedom to team up with minimum intervention from teachers. The analysis of the couples made indicates that students do not tend to associate with students with a similar academic performance, maybe because general cognitive parameters do not govern the choice of academic partners. Pair programming seems to give great results, so the efforts of future research in this field should focus precisely on how these pairs are formed, underpinning the mechanisms of human social interactions., Peer Reviewed
Published: 2017

37. Learning definite Horn formulas from closure queries

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, Arias Vicente, Marta, Balcázar Navarro, José Luis, and Tîrnauca, Cristina
Abstract: A definite Horn theory is a set of n-dimensional Boolean vectors whose characteristic function is expressible as a definite Horn formula, that is, as a conjunction of definite Horn clauses. The class of definite Horn theories is known to be learnable under different query learning settings, such as learning from membership and equivalence queries or learning from entailment. We propose yet a different type of query: the closure query. Closure queries are a natural extension of membership queries and also a variant, appropriate in the context of definite Horn formulas, of the so-called correction queries. We present an algorithm that learns conjunctions of definite Horn clauses in polynomial time, using closure and equivalence queries, and show how it relates to the canonical Guigues–Duquenne basis for implicational systems. We also show how the different query models mentioned relate to each other by either showing full-fledged reductions by means of query simulation (where possible), or by showing their connections in the context of particular algorithms that use them for learning definite Horn formulas., Peer Reviewed, Postprint (author's final draft)
Published: 2017

38. Does like seek like? The formation of working groups in a programming project

Author: Universitat Politècnica de Catalunya. Institut de Ciències de l'Educació, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, Sanou Gozalo, Eduard, Hernández Fernández, Antonio, Arias Vicente, Marta, and Ferrer Cancho, Ramon
Abstract: In a course of the degree of computer science, the programming project has changed from individual to teamed work, tentatively in couples (pair programming). Students have full freedom to team up with minimum intervention from teachers. The analysis of the working groups made indicates that students do not tend to associate with students with a similar academic performance, perhaps because general cognitive parameters do not drive the choice of academic partners. Pair programming seems to give great results, so the efforts of future research in this field should focus precisely on how these pairs are formed, underpinning the mechanisms of human social interactions., Peer Reviewed, Postprint (published version)
Published: 2017

39. Classifier selection with permutation tests

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, Arias Vicente, Marta, Arratia Quesada, Argimiro Alejandro, and Duarte López, Ariel
Abstract: This work presents a content-based recommender system for machine learning classifier algorithms. Given a new data set, a recommendation of which classifier is likely to perform best is made based on classifier performance over similar known data sets. This similarity is measured according to a data set characterization that includes several state-of-the-art metrics taking into account physical structure, statistics, and information theory. A novelty with respect to prior work is the use of a robust approach based on permutation tests to directly assess whether a given learning algorithm is able to exploit the attributes in a data set to predict class labels, and compare it to the more commonly used F-score metric for evaluating classifier performance. To evaluate our approach, we have conducted extensive experiments including 8 of the main machine learning classification methods with varying configurations and 65 binary data sets, leading to over 2331 experiments. Our results show that using the information from the permutation test clearly improves the quality of the recommendations., Peer Reviewed, Postprint (author's final draft)
Published: 2017

40. Identifiability and transportability in dynamic causal networks

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, Blondel, Gilles, Arias Vicente, Marta, and Gavaldà Mestre, Ricard
Abstract: In this paper, we propose a causal analog to the purely observational dynamic Bayesian networks, which we call dynamic causal networks. We provide a sound and complete algorithm for the identification of causal effects in dynamic causal networks, namely for computing the effect of an intervention or experiment given a dynamic causal network and probability distributions of passive observations of its variables, whenever possible. We note the existence of two types of hidden confounder variables that affect in substantially different ways the identification procedures, a distinction with no analog in either dynamic Bayesian networks or standard causal graphs. We further propose a procedure for the transportability of causal effects in dynamic causal network settings, where the result of causal experiments in a source domain may be used for the identification of causal effects in a target domain., Peer Reviewed, Postprint (author's final draft)
Published: 2017

41. Sentiment analysis on Twitter

Author: ServiZurich, Arias Vicente, Marta, Balcázar Navarro, José Luis, Tolos Rigueiro, Marta, and Proscia, Rocco
Abstract: In recent years more and more people have been connecting with Social Networks. One of the most used is Twitter. This huge amount of information is attracting the interest of companies. One reason is that this huge source of information can be used to detect public opinion about their brands and thus improve their business values. In order to transform the information present in the Social Networks into knowledge, several steps are required. This project aims to describe them and provide tools that are able to perform this task. The first problem is how to retrieve the data. Several ways are available, each one with its own pros and cons. After that, it is necessary to study and define proper queries in order to retrieve the information needed. Once the data is retrieved, you may need to filter and explore your data. For this task a Topic Model Algorithm (LDA) has been studied and analyzed. LDA has shown positive results when it is tuned in the proper way and combined with appropriate visualization techniques. The difference between a Topic Model Algorithm and other Clustering/Segmentation techniques is that Topic Models allow each “document” (instance) to belong to more than one topic (cluster). LDA doesn’t natively work well on Twitter due to the very short length of the tweets. An investigation in the literature has revealed a solution to this problem. Another problem that is common in clustering is how to validate the Algorithm and how to choose the proper number of topics (clusters); for this problem several metrics in the literature have been explored. Afterwards, Sentiment Analysis techniques can be applied in order to measure the opinion of the users. The literature presents several approaches and ways of solving this problem. This work is focused on solving the Polarity Detection task with three classes, that is, classifying whether a tweet expresses a positive, a negative or a neutral sentiment. Here, reaching accurate results can be challenging, due to the mes
Published: 2017

42. Does training affect match performance? A study using data mining and tracking devices

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, Fernández, Javier, Medina Leal, Daniel, Gómez, Antonio, Arias Vicente, Marta, and Gavaldà Mestre, Ricard
Abstract: FIFA has recently allowed the use of electronic performance and tracking systems (EPTS) in professional football competition, providing teams with novel and more accurate data. Physical performance has not yet received much attention from the research community, due to the difficulty of accessing this information with the same devices during training and competition. This study provides a methodology based on machine learning and statistical methods to relate the physical performance variation of players during time-framed training sessions to their performance in the following matches. The analysis is carried out over F.C. Barcelona B data from the 2015-2016 season, and emphasizes exploiting the design characteristics of the structured training methodology implemented within the club. The use of summarized physical variation data has revealed a remarkable relation between higher magnitudes of variation in 3-week time frames during training and higher physical values in the following matches. With increased data availability, this and new approaches could provide a new frontier in physical performance analysis. This is, to our knowledge, the first study to relate training and match performance through the same EPTS devices in professional football., Peer Reviewed, Postprint (published version)
Published: 2016

43. From training to match performance: A predictive and explanatory study on novel tracking data

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, Fernández, Javier, Medina, Daniel, Gómez, Antonio, Arias Vicente, Marta, and Gavaldà Mestre, Ricard
Abstract: The recent FIFA approval of the use of Electronic Performance and Tracking Systems (EPTS) during competition has made available novel data on players' physical performance. The analysis of this kind of information will provide teams with competitive advantages by giving them a deeper understanding of the relation between training and match load, and of individual players' fitness characteristics. To make sense of this physical data, which is inherently complex, machine learning algorithms that exploit both non-linear and linear relations among variables could be of great help in building predictive and explanatory models. The increasing availability of information also brings the need, and the challenge, to interpret these models successfully, so that the findings can be translated into information that fast-paced practitioners, such as physical coaches, can apply quickly. For the 2015-2016 season, F.C. Barcelona collected physical information from both training sessions and matches using EPTS devices. This study focuses primarily on evaluating to what extent it is possible to predict match performance from training and match physical information. Different machine learning algorithms are applied to build predictive regression models, in combination with feature selection techniques and Principal Component Analysis (PCA) for dimensionality reduction. Physical variables are segmented into three groups (locomotor, metabolic and mechanical variables), reaching successful prediction rates in 11 out of 17 total variables, based on a threshold determined by expert physical coaches. A normalized root mean square error metric is proposed that makes the results easier for practitioners to understand. The second part of this study focuses on understanding the predictor variables that best explain each of the 17 analyzed match variables. It was found that specific variables can act as representatives of the set of high, Peer Reviewed, Postprint (published version)
Published: 2016

44. Geospatial search engine

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Poyato, Ricard, and Sendra Garcia, Pol
Published: 2016

45. Graph and matrix algorithms for visualizing high dimensional data

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Gavaldà Mestre, Ricard, Arias Vicente, Marta, and Shankaranarayanan Venkataraman, Abhinav
Abstract: Motivated by the problem of understanding data from the medical domain, we consider algorithms for visually representing high-dimensional data so that "similar" entities appear close together. We will study, implement and compare several algorithms based on graph and on matrix representation
Published: 2016

46. GeoSRS: a hybrid social recommender system for geolocated data

Author: Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, Capdevila Pujol, Joan, Arias Vicente, Marta, and Arratia Quesada, Argimiro Alejandro
Abstract: We present GeoSRS, a hybrid recommender system for a popular location-based social network (LBSN), in which users are able to write short reviews of the places of interest they visit. Using state-of-the-art text mining techniques, our system recommends locations to users, drawing on the whole set of text reviews in addition to their geographical location. To evaluate our system, we have collected our own data sets by crawling the social network Foursquare. To do this efficiently, we propose the use of a parallel version of the Quadtree technique, which may be applicable to crawling or exploring other spatially distributed sources. Finally, we study the performance of GeoSRS on our collected data set and conclude that, by combining sentiment analysis and text modeling, GeoSRS generates more accurate recommendations. The performance of the system improves as more reviews become available, which further motivates the use of large-scale crawling techniques such as the Quadtree., Preprint
Published: 2016

47. Prototipo de clustering orientado motor de búsqueda

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Sogeti High Tech, Arias Vicente, Marta, and Blanco-Hermida Sanz, Eric-Joel
Abstract: Project carried out at a company. Study of the possibilities of clustering and machine learning algorithms applied to the social network Twitter. Using supervised learning algorithms, distinguish tweets that talk about the company Orange from those that do not. Sentiment analysis on the tweets to see whether they are talking positively o
Published: 2016

48. From training to match performance: an exploratory and predictive analysis on F.C. Barcelona GPS data

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Gavaldà Mestre, Ricard, and Fernández, Javier
Abstract: An exploratory and predictive analysis on GPS data is presented. Physical performance variables from professional football players are analysed in a holistic approach that involves data exploration, analysis of adaptation through clustering, and predictive models for estimating future performance.
Published: 2016

49. Recommender system as viral lever

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Arias Vicente, Marta, Valverde Arredondo, Fernando, and Megias Duran, Iván
Published: 2016

50. Automated construction and analysis of political networks via open government and media sources

Author: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya. LARCA - Laboratori d'Algorísmia Relacional, Complexitat i Aprenentatge, Universitat Politècnica de Catalunya. DAMA-UPC - Data Management Group, García-Olano, Diego, Arias Vicente, Marta, and Larriba Pey, Josep
Abstract: We present a tool to generate real-world political networks from user-provided lists of politicians and news sites. Additional output includes visualizations, interactive tools and maps that allow a user to better understand the politicians and their surrounding environments as portrayed by the media. As a case study, we construct a comprehensive list of current Texas politicians, select news sites that convey a spectrum of political viewpoints covering Texas politics, and examine the results. We propose a “Combined” co-occurrence distance metric to better reflect the relationship between two entities. A topic modeling technique is also proposed as a novel, automated way of labeling communities that exist within a politician’s “extended” network., Peer Reviewed, Postprint (author's final draft)
Published: 2016
