Author: "Klabjan, Diego" / Topic: machine learning (stat.ml) - Searchworks@Jio Institute Digital Library Search Results

1. Decentralized Randomly Distributed Multi-agent Multi-armed Bandit with Heterogeneous Rewards

Author: Xu, Mengfan and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: We study a decentralized multi-agent multi-armed bandit problem in which multiple clients are connected by time dependent random graphs provided by an environment. The reward distributions of each arm vary across clients and rewards are generated independently over time by an environment based on distributions that include both sub-exponential and sub-gaussian distributions. Each client pulls an arm and communicates with neighbors based on the graph provided by the environment. The goal is to minimize the overall regret of the entire system through collaborations. To this end, we introduce a novel algorithmic framework, which first provides robust simulation methods for generating random graphs using rapidly mixing Markov chains or the random graph model, and then combines an averaging-based consensus approach with a newly proposed weighting technique and the upper confidence bound to deliver a UCB-type solution. Our algorithms account for the randomness in the graphs, removing the conventional doubly stochasticity assumption, and only require the knowledge of the number of clients at initialization. We derive optimal instance-dependent regret upper bounds of order $\log{T}$ in both sub-gaussian and sub-exponential environments, and a nearly optimal mean-gap independent regret upper bound of order $\sqrt{T}\log T$ up to a $\log T$ factor. Importantly, our regret bounds hold with high probability and capture graph randomness, whereas prior works consider expected regret under assumptions and require more stringent reward distributions., Comment: 14 pages, under review
Published: 2023
Full Text: View/download PDF

2. Dynamic Cell Structure via Recursive-Recurrent Neural Networks

Author: Qian, Xin, Kennedy, Matthew, and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Computer Science - Neural and Evolutionary Computing, Machine Learning (stat.ML), Neural and Evolutionary Computing (cs.NE), Machine Learning (cs.LG)
Abstract: In a recurrent setting, conventional approaches to neural architecture search find and fix a general model for all data samples and time steps. We propose a novel algorithm that can dynamically search for the structure of cells in a recurrent neural network model. Based on a combination of recurrent and recursive neural networks, our algorithm is able to construct customized cell structures for each data sample and time step, allowing for a more efficient architecture search than existing models. Experiments on three common datasets show that the algorithm discovers high-performance cell architectures and achieves better prediction accuracy compared to the GRU structure for language modelling and sentiment analysis.
Published: 2022

3. Pareto Regret Analyses in Multi-objective Multi-armed Bandit

Author: Xu, Mengfan and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: We study Pareto optimality in multi-objective multi-armed bandit by providing a formulation of adversarial multi-objective multi-armed bandit and defining its Pareto regrets that can be applied to both stochastic and adversarial settings. The regrets do not rely on any scalarization functions and reflect Pareto optimality compared to scalarized regrets. We also present new algorithms assuming both with and without prior information of the multi-objective multi-armed bandit setting. The algorithms are shown optimal in adversarial settings and nearly optimal up to a logarithmic factor in stochastic settings simultaneously by our established upper bounds and lower bounds on Pareto regrets. Moreover, the lower bound analyses show that the new regrets are consistent with the existing Pareto regret for stochastic settings and extend an adversarial attack mechanism from bandit to the multi-objective one., 19 pages; accepted at ICML 2023 and to be published in Proceedings of Machine Learning Research (PMLR)
Published: 2022

4. Keyword-based Topic Modeling and Keyword Selection

Author: Wang, Xingyu, Zhang, Lida, and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Machine Learning (stat.ML), Information Retrieval (cs.IR), Computer Science - Information Retrieval, Machine Learning (cs.LG)
Abstract: Certain type of documents such as tweets are collected by specifying a set of keywords. As topics of interest change with time it is beneficial to adjust keywords dynamically. The challenge is that these need to be specified ahead of knowing the forthcoming documents and the underlying topics. The future topics should mimic past topics of interest yet there should be some novelty in them. We develop a keyword-based topic model that dynamically selects a subset of keywords to be used to collect future documents. The generative process first selects keywords and then the underlying documents based on the specified keywords. The model is trained by using a variational lower bound and stochastic gradient optimization. The inference consists of finding a subset of keywords where given a subset the model predicts the underlying topic-word matrix for the unknown forthcoming documents. We compare the keyword topic model against a benchmark model using viral predictions of tweets combined with a topic model. The keyword-based topic model outperforms this sophisticated baseline model by 67%.
Published: 2021

5. Classification Models for Partially Ordered Sequences

Author: Ger, Stephanie, Klabjan, Diego, and Utke, Jean
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Many models such as Long Short Term Memory (LSTMs), Gated Recurrent Units (GRUs) and transformers have been developed to classify time series data with the assumption that events in a sequence are ordered. On the other hand, fewer models have been developed for set based inputs, where order does not matter. There are several use cases where data is given as partially-ordered sequences because of the granularity or uncertainty of time stamps. We introduce a novel transformer based model for such prediction tasks, and benchmark against extensions of existing order invariant models. We also discuss how transition probabilities between events in a sequence can be used to improve model performance. We show that the transformer-based equal-time model outperforms extensions of existing set models on three data sets.
Published: 2021
Full Text: View/download PDF

6. Open Set Domain Adaptation by Extreme Value Theory

Author: Xu, Yiming and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Statistics - Machine Learning, Computer Science - Artificial Intelligence, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Common domain adaptation techniques assume that the source domain and the target domain share an identical label space, which is problematic since when target samples are unlabeled we have no knowledge on whether the two domains share the same label space. When this is not the case, the existing methods fail to perform well because the additional unknown classes are also matched with the source domain during adaptation. In this paper, we tackle the open set domain adaptation problem under the assumption that the source and the target label spaces only partially overlap, and the task becomes when the unknown classes exist, how to detect the target unknown classes and avoid aligning them with the source domain. We propose to utilize an instance-level reweighting strategy for domain adaptation where the weights indicate the likelihood of a sample belonging to known classes and to model the tail of the entropy distribution with Extreme Value Theory for unknown class detection. Experiments on conventional domain adaptation datasets show that the proposed method outperforms the state-of-the-art models.
Published: 2020

7. Inverse Classification with Limited Budget and Maximum Number of Perturbed Samples

Author: Koo, Jaehoon, Klabjan, Diego, and Utke, Jean
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Most recent machine learning research focuses on developing new classifiers for the sake of improving classification accuracy. With many well-performing state-of-the-art classifiers available, there is a growing need for understanding interpretability of a classifier necessitated by practical purposes such as to find the best diet recommendation for a diabetes patient. Inverse classification is a post modeling process to find changes in input features of samples to alter the initially predicted class. It is useful in many business applications to determine how to adjust a sample input data such that the classifier predicts it to be in a desired class. In real world applications, a budget on perturbations of samples corresponding to customers or patients is usually considered, and in this setting, the number of successfully perturbed samples is key to increase benefits. In this study, we propose a new framework to solve inverse classification that maximizes the number of perturbed samples subject to a per-feature-budget limits and favorable classification classes of the perturbed samples. We design algorithms to solve this optimization problem based on gradient methods, stochastic processes, Lagrangian relaxations, and the Gumbel trick. In experiments, we find that our algorithms based on stochastic processes exhibit an excellent performance in different budget settings and they scale well.
Published: 2020

8. Efficient Architecture Search for Continual Learning

Author: Gao, Qiang, Luo, Zhipeng, and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Continual learning with neural networks is an important learning framework in AI that aims to learn a sequence of tasks well. However, it is often confronted with three challenges: (1) overcome the catastrophic forgetting problem, (2) adapt the current network to new tasks, and meanwhile (3) control its model complexity. To reach these goals, we propose a novel approach named as Continual Learning with Efficient Architecture Search, or CLEAS in short. CLEAS works closely with neural architecture search (NAS) which leverages reinforcement learning techniques to search for the best neural architecture that fits a new task. In particular, we design a neuron-level NAS controller that decides which old neurons from previous tasks should be reused (knowledge transfer), and which new neurons should be added (to learn new knowledge). Such a fine-grained controller allows one to find a very concise architecture that can fit each new task well. Meanwhile, since we do not alter the weights of the reused neurons, we perfectly memorize the knowledge learned from previous tasks. We evaluate CLEAS on numerous sequential classification tasks, and the results demonstrate that CLEAS outperforms other state-of-the-art alternative methods, achieving higher classification accuracy while using simpler neural architectures., 12 pages, 11 figures
Published: 2020

9. Neural Network Retraining for Model Serving

Author: Klabjan, Diego and Zhu, Xiaofeng
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, ComputingMethodologies_PATTERNRECOGNITION, ComputingMilieux_THECOMPUTINGPROFESSION, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: We propose incremental (re)training of a neural network model to cope with a continuous flow of new data in inference during model serving. As such, this is a life-long learning process. We address two challenges of life-long retraining: catastrophic forgetting and efficient retraining. If we combine all past and new data it can easily become intractable to retrain the neural network model. On the other hand, if the model is retrained using only new data, it can easily suffer catastrophic forgetting and thus it is paramount to strike the right balance. Moreover, if we retrain all weights of the model every time new data is collected, retraining tends to require too many computing resources. To solve these two issues, we propose a novel retraining model that can select important samples and important weights utilizing multi-armed bandits. To further address forgetting, we propose a new regularization term focusing on synapse and neuron importance. We analyze multiple datasets to document the outcome of the proposed retraining methods. Various experiments demonstrate that our retraining methodologies mitigate the catastrophic forgetting problem while boosting model performance.
Published: 2020

10. Open-Set Recognition with Gaussian Mixture Variational Autoencoders

Author: Cao, Alexander, Luo, Yuan, and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, ComputingMethodologies_PATTERNRECOGNITION, Statistics - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Computer Science - Neural and Evolutionary Computing, Machine Learning (stat.ML), General Medicine, Neural and Evolutionary Computing (cs.NE), Machine Learning (cs.LG)
Abstract: In inference, open-set classification is to either classify a sample into a known class from training or reject it as an unknown class. Existing deep open-set classifiers train explicit closed-set classifiers, in some cases disjointly utilizing reconstruction, which we find dilutes the latent representation's ability to distinguish unknown classes. In contrast, we train our model to cooperatively learn reconstruction and perform class-based clustering in the latent space. With this, our Gaussian mixture variational autoencoder (GMVAE) achieves more accurate and robust open-set classification results, with an average F1 improvement of 29.5%, through extensive experiments aided by analytical results., Comment: 12 pages including 8 figures and 4 tables, plus 6 pages of supplementary material
Published: 2020
Full Text: View/download PDF

11. Conditional Hierarchical Bayesian Tucker Decomposition for Genetic Data Analysis

Author: Sandler, Adam, Klabjan, Diego, and Luo, Yuan
Subjects: Methodology (stat.ME), FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Statistics - Methodology, Machine Learning (cs.LG)
Abstract: We develop methods for reducing the dimensionality of large data sets, common in biomedical applications. Learning about patients using genetic data often includes more features than observations, which makes direct supervised learning difficult. One method of reducing the feature space is to use latent Dirichlet allocation to group genetic variants in an unsupervised manner. Latent Dirichlet allocation describes a patient as a mixture of topics corresponding to genetic variants. This can be generalized as a Bayesian tensor decomposition to account for multiple feature variables. Our most significant contributions are with hierarchical topic modeling. We design distinct methods of incorporating hierarchical topic modeling, based on nested Chinese restaurant processes and Pachinko Allocation Machine, into Bayesian tensor decomposition. We apply these models to examine patients with one of four common types of cancer (breast, lung, prostate, and colorectal) and siblings with and without autism spectrum disorder. We linked the genes with their biological pathways and combine this information into a tensor of patients, counts of their genetic variants, and the genes' membership in pathways. We find that our trained models outperform baseline models, with respect to coherence, by up to 40%., 38 pages, 8 figures, 5 tables
Published: 2019

12. Autoencoders and Generative Adversarial Networks for Imbalanced Sequence Classification

Author: Ger, Stephanie, Jambunath, Yegna Subramanian, and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, ComputingMethodologies_PATTERNRECOGNITION, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Generative Adversarial Networks (GANs) have been used in many different applications to generate realistic synthetic data. We introduce a novel GAN with Autoencoder (GAN-AE) architecture to generate synthetic samples for variable length, multi-feature sequence datasets. In this model, we develop a GAN architecture with an additional autoencoder component, where recurrent neural networks (RNNs) are used for each component of the model in order to generate synthetic data to improve classification accuracy for a highly imbalanced medical device dataset. In addition to the medical device dataset, we also evaluate the GAN-AE performance on two additional datasets and demonstrate the application of GAN-AE to a sequence-to-sequence task where both synthetic sequence inputs and sequence outputs must be generated. To evaluate the quality of the synthetic data, we train encoder-decoder models both with and without the synthetic data and compare the classification model performance. We show that a model trained with GAN-AE generated synthetic data outperforms models trained with synthetic data generated both with standard oversampling techniques such as SMOTE and Autoencoders as well as with state of the art GAN-based models.
Published: 2019

13. Scale Invariant Power Iteration

Author: Kim, Cheolmin, Kim, Youngseok, and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Optimization and Control (math.OC), FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
Abstract: Power iteration has been generalized to solve many interesting problems in machine learning and statistics. Despite its striking success, theoretical understanding of when and how such an algorithm enjoys good convergence property is limited. In this work, we introduce a new class of optimization problems called scale invariant problems and prove that they can be efficiently solved by scale invariant power iteration (SCI-PI) with a generalized convergence guarantee of power iteration. By deriving that a stationary point is an eigenvector of the Hessian evaluated at the point, we show that scale invariant problems indeed resemble the leading eigenvector problem near a local optimum. Also, based on a novel reformulation, we geometrically derive SCI-PI which has a general form of power iteration. The convergence analysis shows that SCI-PI attains local linear convergence with a rate being proportional to the top two eigenvalues of the Hessian at the optimum. Moreover, we discuss some extended settings of scale invariant problems and provide similar convergence results for them. In numerical experiments, we introduce applications to independent component analysis, Gaussian mixtures, and non-negative matrix factorization. Experimental results demonstrate that SCI-PI is competitive to state-of-the-art benchmark algorithms and often yield better solutions.
Published: 2019
Full Text: View/download PDF

14. Convergence Analyses of Online ADAM Algorithm in Convex Setting and Two-Layer ReLU Neural Network

Author: Fang, Biyi and Klabjan, Diego
Subjects: Computer Science::Machine Learning, FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Computer Science - Data Structures and Algorithms, Data Structures and Algorithms (cs.DS), Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Nowadays, online learning is an appealing learning paradigm, which is of great interest in practice due to the recent emergence of large scale applications such as online advertising placement and online web ranking. Standard online learning assumes a finite number of samples while in practice data is streamed infinitely. In such a setting gradient descent with a diminishing learning rate does not work. We first introduce regret with rolling window, a new performance metric for online streaming learning, which measures the performance of an algorithm on every fixed number of contiguous samples. At the same time, we propose a family of algorithms based on gradient descent with a constant or adaptive learning rate and provide very technical analyses establishing regret bound properties of the algorithms. We cover the convex setting showing the regret of the order of the square root of the size of the window in the constant and dynamic learning rate scenarios. Our proof is applicable also to the standard online setting where we provide the first analysis of the same regret order (the previous proofs have flaws). We also study a two layer neural network setting with ReLU activation. In this case we establish that if initial weights are close to a stationary point, the same square root regret bound is attainable. We conduct computational experiments demonstrating a superior performance of the proposed algorithms.
Published: 2019
Full Text: View/download PDF

15. Nested multi-instance classification

Author: Stec, Alexander, Klabjan, Diego, and Utke, Jean
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: There are classification tasks that take as inputs groups of images rather than single images. In order to address such situations, we introduce a nested multi-instance deep network. The approach is generic in that it is applicable to general data instances, not just images. The network has several convolutional neural networks grouped together at different stages. This primarily differs from other previous works in that we organize instances into relevant groups that are treated differently. We also introduce a method to replace instances that are missing which successfully creates neutral input instances and consistently outperforms standard fill-in methods in real world use cases. In addition, we propose a method for manual dropout when a whole group of instances is missing that allows us to use richer training data and obtain higher accuracy at the end of training. With specific pretraining, we find that the model works to great effect on our real world and pub-lic datasets in comparison to baseline methods, justifying the different treatment among groups of instances.
Published: 2018

16. Forecasting Crime with Deep Learning

Author: Stec, Alexander and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: The objective of this work is to take advantage of deep neural networks in order to make next day crime count predictions in a fine-grain city partition. We make predictions using Chicago and Portland crime data, which is augmented with additional datasets covering weather, census data, and public transportation. The crime counts are broken into 10 bins and our model predicts the most likely bin for a each spatial region at a daily level. We train this data using increasingly complex neural network structures, including variations that are suited to the spatial and temporal aspects of the crime prediction problem. With our best model we are able to predict the correct bin for overall crime count with 75.6% and 65.3% accuracy for Chicago and Portland, respectively. The results show the efficacy of neural networks for the prediction problem and the value of using external datasets in addition to standard crime data.
Published: 2018

17. Improved Classification Based on Deep Belief Networks

Author: Koo, Jaehoon and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science::Machine Learning, Statistics::Machine Learning, Computer Science - Machine Learning, ComputingMethodologies_PATTERNRECOGNITION, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: For better classification generative models are used to initialize the model and model features before training a classifier. Typically it is needed to solve separate unsupervised and supervised learning problems. Generative restricted Boltzmann machines and deep belief networks are widely used for unsupervised learning. We developed several supervised models based on DBN in order to improve this two-phase strategy. Modifying the loss function to account for expectation with respect to the underlying generative model, introducing weight bounds, and multi-level programming are applied in model development. The proposed models capture both unsupervised and supervised objectives effectively. The computational study verifies that our models perform better than the two-phase training approach.
Published: 2018

18. Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations

Author: Wang, Xingyu and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science::Computer Science and Game Theory, Computer Science - Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: This paper considers the problem of inverse reinforcement learning in zero-sum stochastic games when expert demonstrations are known to be not optimal. Compared to previous works that decouple agents in the game by assuming optimality in expert strategies, we introduce a new objective function that directly pits experts against Nash Equilibrium strategies, and we design an algorithm to solve for the reward function in the context of inverse reinforcement learning with deep neural networks as model approximations. In our setting the model and algorithm do not decouple by agent. In order to find Nash Equilibrium in large-scale games, we also propose an adversarial training algorithm for zero-sum stochastic games, and show the theoretical appeal of non-existence of local optima in its objective function. In our numerical experiments, we demonstrate that our Nash Equilibrium and inverse reinforcement learning algorithms address games that are not amenable to previous approaches using tabular representations. Moreover, with sub-optimal expert demonstrations our algorithms recover both reward functions and strategies with good quality., 31 pages, to be presented at ICML 2018
Published: 2018

19. Layer Flexible Adaptive Computational Time

Author: Zhang, Lida, Ebrahimi, Abdolghani, and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, I.2.6, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Deep recurrent neural networks perform well on sequence data and are the model of choice. However, it is a daunting task to decide the structure of the networks, i.e. the number of layers, especially considering different computational needs of a sequence. We propose a layer flexible recurrent neural network with adaptive computation time, and expand it to a sequence to sequence model. Different from the adaptive computation time model, our model has a dynamic number of transmission states which vary by step and sequence. We evaluate the model on a financial data set and Wikipedia language modeling. Experimental results show the performance improvement of 7\% to 12\% and indicate the model's ability to dynamically change the number of layers along with the computational steps., Comment: 11 pages, 5 figures
Published: 2018
Full Text: View/download PDF

20. Unified recurrent neural network for many feature types

Author: Stec, Alexander, Klabjan, Diego, and Utke, Jean
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: There are time series that are amenable to recurrent neural network (RNN) solutions when treated as sequences, but some series, e.g. asynchronous time series, provide a richer variation of feature types than current RNN cells take into account. In order to address such situations, we introduce a unified RNN that handles five different feature types, each in a different manner. Our RNN framework separates sequential features into two groups dependent on their frequency, which we call sparse and dense features, and which affect cell updates differently. Further, we also incorporate time features at the sequential level that relate to the time between specified events in the sequence and are used to modify the cell's memory state. We also include two types of static (whole sequence level) features, one related to time and one not, which are combined with the encoder output. The experiments show that the modeling framework proposed does increase performance compared to standard cells.
Published: 2018
Full Text: View/download PDF

21. Dynamic Prediction Length for Time Series with Sequence to Sequence Networks

Author: Harmon, Mark and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Recurrent neural networks and sequence to sequence models require a predetermined length for prediction output length. Our model addresses this by allowing the network to predict a variable length output in inference. A new loss function with a tailored gradient computation is developed that trades off prediction accuracy and output length. The model utilizes a function to determine whether a particular output at a time should be evaluated or not given a predetermined threshold. We evaluate the model on the problem of predicting the prices of securities. We find that the model makes longer predictions for more stable securities and it naturally balances prediction accuracy and length.
Published: 2018
Full Text: View/download PDF

22. A Stochastic Large-scale Machine Learning Algorithm for Distributed Features and Observations

Author: Fang, Biyi and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: As the size of modern data sets exceeds the disk and memory capacities of a single computer, machine learning practitioners have resorted to parallel and distributed computing. Given that optimization is one of the pillars of machine learning and predictive modeling, distributed optimization methods have recently garnered ample attention, in particular when either observations or features are distributed, but not both. We propose a general stochastic algorithm where observations, features, and gradient components can be sampled in a double distributed setting, i.e., with both features and observations distributed. Very technical analyses establish convergence properties of the algorithm under different conditions on the learning rate (diminishing to zero or constant). Computational experiments in Spark demonstrate a superior performance of our algorithm versus a benchmark in early iterations of the algorithm, which is due to the stochastic components of the algorithm., Comment: 11 figures, 41 pages
Published: 2018
Full Text: View/download PDF

23. Improving the Expected Improvement Algorithm

Author: Qin, Chao, Klabjan, Diego, and Russo, Daniel
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: The expected improvement (EI) algorithm is a popular strategy for information collection in optimization under uncertainty. The algorithm is widely known to be too greedy, but nevertheless enjoys wide use due to its simplicity and ability to handle uncertainty and noise in a coherent decision theoretic framework. To provide rigorous insight into EI, we study its properties in a simple setting of Bayesian optimization where the domain consists of a finite grid of points. This is the so-called best-arm identification problem, where the goal is to allocate measurement effort wisely to confidently identify the best arm using a small number of measurements. In this framework, one can show formally that EI is far from optimal. To overcome this shortcoming, we introduce a simple modification of the expected improvement algorithm. Surprisingly, this simple change results in an algorithm that is asymptotically optimal for Gaussian best-arm identification problems, and provably outperforms standard EI by an order of magnitude., Submitted to NIPS 2017
Published: 2017

24. Skill-Based Differences in Spatio-Temporal Team Behavior in Defence of The Ancients 2

Author: Drachen, Anders, Yancey, Matthew, Maguire, John, Chu, Derrek, Wang, Iris Yuhui, Mahlmann, Tobias, Schubert, Matthias, and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Statistics - Machine Learning, Machine Learning (stat.ML)
Abstract: Multiplayer Online Battle Arena (MOBA) games are among the most played digital games in the world. In these games, teams of players fight against each other in arena environments, and the gameplay is focused on tactical combat. Mastering MOBAs requires extensive practice, as is exemplified in the popular MOBA Defence of the Ancients 2 (DotA 2). In this paper, we present three data-driven measures of spatio-temporal behavior in DotA 2: 1) Zone changes; 2) Distribution of team members and: 3) Time series clustering via a fuzzy approach. We present a method for obtaining accurate positional data from DotA 2. We investigate how behavior varies across these measures as a function of the skill level of teams, using four tiers from novice to professional players. Results indicate that spatio-temporal behavior of MOBA teams is related to team skill, with professional teams having smaller within-team distances and conducting more zone changes than amateur teams. The temporal distribution of the within-team distances of professional and high-skilled teams also generally follows patterns distinct from lower skill ranks.
Published: 2016

25. Evaluating the Performance of Offensive Linemen in the NFL

Author: Byanna, Nikhil and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Statistics - Machine Learning, Machine Learning (stat.ML)
Abstract: How does one objectively measure the performance of an individual offensive lineman in the NFL? The existing literature proposes various measures that rely on subjective assessments of game film, but has yet to develop an objective methodology to evaluate performance. Using a variety of statistics related to an offensive lineman's performance, we develop a framework to objectively analyze the overall performance of an individual offensive lineman and determine specific linemen who are overvalued or undervalued relative to their salary. We identify eight players across the 2013-2014 and 2014-2015 NFL seasons that are considered to be overvalued or undervalued and corroborate the results with existing metrics that are based on subjective evaluation. To the best of our knowledge, the techniques set forth in this work have not been utilized in previous works to evaluate the performance of NFL players at any position, including offensive linemen.
Published: 2016

26. Predictive Analytics Using Smartphone Sensors for Depressive Episodes

Author: Jeong, Taeheon, Klabjan, Diego, and Starren, Justin
Subjects: FOS: Computer and information sciences, Computer Science - Computers and Society, Statistics - Machine Learning, Computers and Society (cs.CY), Computer Science - Human-Computer Interaction, Machine Learning (stat.ML), Human-Computer Interaction (cs.HC)
Abstract: The behaviors of patients with depression are usually difficult to predict because the patients demonstrate the symptoms of a depressive episode without a warning at unexpected times. The goal of this research is to build algorithms that detect signals of such unusual moments so that doctors can be proactive in approaching already diagnosed patients before they fall in depression. Each patient is equipped with a smartphone with the capability to track its sensors. We first find the home location of a patient, which is then augmented with other sensor data to identify sleep patterns and select communication patterns. The algorithms require two to three weeks of training data to build standard patterns, which are considered normal behaviors; and then, the methods identify any anomalies in day-to-day data readings of sensors. Four smartphone sensors, including the accelerometer, the gyroscope, the location probe and the communication log probe are used for anomaly detection in sleeping and communication patterns., HIAI 2016, Expanding the Boundaries of Health Informatics using AI, Phoenix, AZ
Published: 2016

27. Going Out of Business: Auction House Behavior in the Massively Multi-Player Online Game

Author: Drachen, Anders, Riley, Joseph, Baskin, Shawna, and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Computers and Society, Statistics - Machine Learning, Computers and Society (cs.CY), Computer Science - Human-Computer Interaction, ComputingMilieux_PERSONALCOMPUTING, Machine Learning (stat.ML), Human-Computer Interaction (cs.HC)
Abstract: The in-game economies of massively multi-player online games (MMOGs) are complex systems that have to be carefully designed and managed. This paper presents the results of an analysis of auction house data from the MMOG Glitch, across a 14 month time period, the entire lifetime of the game. The data comprise almost 3 million data points, over 20,000 unique players and more than 650 products. Furthermore, an interactive visualization, based on Sankey flow diagrams, is presented which shows the proportion of the different clusters across each time bin, as well as the flow of players between clusters. The diagram allows evaluation of migration of players between clusters as a function of time, as well as churn analysis. The presented work provides a template analysis and visualization model for progression-based or temporal-based analysis of player behavior broadly applicable to games.
Published: 2016
Full Text: View/download PDF

28. Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories

Author: Harmon, Mark, Ebrahimi, Abdolghani, Lucey, Patrick, and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: In this paper, we predict the likelihood of a player making a shot in basketball from multiagent trajectories. Previous approaches to similar problems center on hand-crafting features to capture domain specific knowledge. Although intuitive, recent work in deep learning has shown this approach is prone to missing important predictive features. To circumvent this issue, we present a convolutional neural network (CNN) approach where we initially represent the multiagent behavior as an image. To encode the adversarial nature of basketball, we use a multi-channel image which we then feed into a CNN. Additionally, to capture the temporal aspect of the trajectories we "fade" the player trajectories. We find that this approach is superior to a traditional FFN model. By using gradient ascent to create images using an already trained CNN, we discover what features the CNN filters learn. Last, we find that a combined CNN+FFN is the best performing network with an error rate of 39%.
Published: 2016
Full Text: View/download PDF

29. Optimization for Large-Scale Machine Learning with Distributed Features and Observations

Author: Nathan, Alexandros and Klabjan, Diego
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: As the size of modern data sets exceeds the disk and memory capacities of a single computer, machine learning practitioners have resorted to parallel and distributed computing. Given that optimization is one of the pillars of machine learning and predictive modeling, distributed optimization methods have recently garnered ample attention in the literature. Although previous research has mostly focused on settings where either the observations, or features of the problem at hand are stored in distributed fashion, the situation where both are partitioned across the nodes of a computer cluster (doubly distributed) has barely been studied. In this work we propose two doubly distributed optimization algorithms. The first one falls under the umbrella of distributed dual coordinate ascent methods, while the second one belongs to the class of stochastic gradient/coordinate descent hybrid methods. We conduct numerical experiments in Spark using real-world and simulated data sets and study the scaling properties of our methods. Our empirical evaluation of the proposed algorithms demonstrates the out-performance of a block distributed ADMM method, which, to the best of our knowledge is the only other existing doubly distributed optimization algorithm.
Published: 2016
Full Text: View/download PDF

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

29 results on '"Klabjan, Diego"'

1. Decentralized Randomly Distributed Multi-agent Multi-armed Bandit with Heterogeneous Rewards

2. Dynamic Cell Structure via Recursive-Recurrent Neural Networks

3. Pareto Regret Analyses in Multi-objective Multi-armed Bandit

4. Keyword-based Topic Modeling and Keyword Selection

5. Classification Models for Partially Ordered Sequences

6. Open Set Domain Adaptation by Extreme Value Theory

7. Inverse Classification with Limited Budget and Maximum Number of Perturbed Samples

8. Efficient Architecture Search for Continual Learning

9. Neural Network Retraining for Model Serving

10. Open-Set Recognition with Gaussian Mixture Variational Autoencoders

11. Conditional Hierarchical Bayesian Tucker Decomposition for Genetic Data Analysis

12. Autoencoders and Generative Adversarial Networks for Imbalanced Sequence Classification

13. Scale Invariant Power Iteration

14. Convergence Analyses of Online ADAM Algorithm in Convex Setting and Two-Layer ReLU Neural Network

15. Nested multi-instance classification

16. Forecasting Crime with Deep Learning

17. Improved Classification Based on Deep Belief Networks

18. Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations

19. Layer Flexible Adaptive Computational Time

20. Unified recurrent neural network for many feature types

21. Dynamic Prediction Length for Time Series with Sequence to Sequence Networks

22. A Stochastic Large-scale Machine Learning Algorithm for Distributed Features and Observations

23. Improving the Expected Improvement Algorithm

24. Skill-Based Differences in Spatio-Temporal Team Behavior in Defence of The Ancients 2

25. Evaluating the Performance of Offensive Linemen in the NFL

26. Predictive Analytics Using Smartphone Sensors for Depressive Episodes

27. Going Out of Business: Auction House Behavior in the Massively Multi-Player Online Game

28. Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories

29. Optimization for Large-Scale Machine Learning with Distributed Features and Observations

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Journal

Database

Publisher

29 results on '"Klabjan, Diego"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources