Author: "Elkan, Charles" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Elkan, Charles"' showing total 248 results

1. Gamified crowd-sourcing of high-quality data for visual fine-tuning

Author: Yadav, Shashank, Tomar, Rohan, Jain, Garvit, Ahooja, Chirag, Chaudhary, Shubham, and Elkan, Charles
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper introduces Gamified Adversarial Prompting (GAP), a framework that crowd-sources high-quality data for visual instruction tuning of large multimodal models. GAP transforms the data collection process into an engaging game, incentivizing players to provide fine-grained, challenging questions and answers that target gaps in the model's knowledge. Our contributions include (1) an approach to capture question-answer pairs from humans that directly address weaknesses in a model's knowledge, (2) a method for evaluating and rewarding players that successfully incentivizes them to provide high-quality submissions, and (3) a scalable, gamified platform that succeeds in collecting this data from over 50,000 participants in just a few weeks. Our implementation of GAP has significantly improved the accuracy of a small multimodal model, namely MiniCPM-Llama3-V-2.5-8B, increasing its GPT score from 0.147 to 0.477 on our dataset, approaching the benchmark set by the much larger GPT-4V. Moreover, we demonstrate that the data generated using MiniCPM-Llama3-V-2.5-8B also enhances its performance across other benchmarks, and exhibits cross-model benefits. Specifically, the same data improves the performance of QWEN2-VL-2B and QWEN2-VL-7B on the same multiple benchmarks.
Published: 2024

2. MLSys: The New Frontier of Machine Learning Systems

Author: Ratner, Alexander, Alistarh, Dan, Alonso, Gustavo, Andersen, David G., Bailis, Peter, Bird, Sarah, Carlini, Nicholas, Catanzaro, Bryan, Chayes, Jennifer, Chung, Eric, Dally, Bill, Dean, Jeff, Dhillon, Inderjit S., Dimakis, Alexandros, Dubey, Pradeep, Elkan, Charles, Fursin, Grigori, Ganger, Gregory R., Getoor, Lise, Gibbons, Phillip B., Gibson, Garth A., Gonzalez, Joseph E., Gottschlich, Justin, Han, Song, Hazelwood, Kim, Huang, Furong, Jaggi, Martin, Jamieson, Kevin, Jordan, Michael I., Joshi, Gauri, Khalaf, Rania, Knight, Jason, Konečný, Jakub, Kraska, Tim, Kumar, Arun, Kyrillidis, Anastasios, Lakshmiratan, Aparna, Li, Jing, Madden, Samuel, McMahan, H. Brendan, Meijer, Erik, Mitliagkas, Ioannis, Monga, Rajat, Murray, Derek, Olukotun, Kunle, Papailiopoulos, Dimitris, Pekhimenko, Gennady, Rekatsinas, Theodoros, Rostamizadeh, Afshin, Ré, Christopher, De Sa, Christopher, Sedghi, Hanie, Sen, Siddhartha, Smith, Virginia, Smola, Alex, Song, Dawn, Sparks, Evan, Stoica, Ion, Sze, Vivienne, Udell, Madeleine, Vanschoren, Joaquin, Venkataraman, Shivaram, Vinayak, Rashmi, Weimer, Markus, Wilson, Andrew Gordon, Xing, Eric, Zaharia, Matei, Zhang, Ce, and Talwalkar, Ameet
Subjects: Computer Science - Machine Learning, Computer Science - Databases, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Software Engineering, Statistics - Machine Learning
Abstract: Machine learning (ML) techniques are enjoying rapidly increasing adoption. However, designing and implementing the systems that support ML models in real-world deployments remains a significant obstacle, in large part due to the radically different development and deployment profile of modern ML methods, and the range of practical concerns that come with broader adoption. We propose to foster a new systems machine learning research community at the intersection of the traditional systems and ML communities, focused on topics such as hardware systems for ML, software systems for ML, and ML optimized for metrics beyond predictive accuracy. To do this, we describe a new conference, MLSys, that explicitly targets research at the intersection of systems and machine learning with a program committee split evenly between experts in systems and ML, and an explicit focus on topics at the intersection of the two.
Published: 2019

3. Achieving Fluency and Coherency in Task-oriented Dialog

Author: Gangadharaiah, Rashmi, Narayanaswamy, Balakrishnan, and Elkan, Charles
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: We consider real-world task-oriented dialog settings, where agents need to generate both fluent natural language responses and correct external actions like database queries and updates. We demonstrate that, when applied to customer support chat transcripts, Sequence to Sequence (Seq2Seq) models often generate short, incoherent and ungrammatical natural language responses that are dominated by words that occur with high frequency in the training data. These phenomena do not arise in synthetic datasets such as bAbI, where we show Seq2Seq models are nearly perfect. We develop techniques to learn embeddings that succinctly capture relevant information from the dialog history, and demonstrate that nearest neighbor based approaches in this learned neural embedding space generate more fluent responses. However, we see that these methods are not able to accurately predict when to execute an external action. We show how to combine nearest neighbor and Seq2Seq methods in a hybrid model, where nearest neighbor is used to generate fluent responses and Seq2Seq type models ensure dialog coherency and generate accurate external actions. We show that this approach is well suited for customer support scenarios, where agents' responses are typically script-driven, and correct external actions are critically important. The hybrid model on the customer support data achieves a 78% relative improvement in fluency scores, and a 130% improvement in accuracy of external calls.
Comment: Workshop on Conversational AI, NIPS 2017, Long Beach, CA, USA
Published: 2018

4. End-to-End Offline Goal-Oriented Dialog Policy Learning via Policy Gradient

Author: Zhou, Li, Small, Kevin, Rokhlenko, Oleg, and Elkan, Charles
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Learning
Abstract: Learning a goal-oriented dialog policy is generally performed offline with supervised learning algorithms or online with reinforcement learning (RL). Additionally, as companies accumulate massive quantities of dialog transcripts between customers and trained human agents, encoder-decoder methods have gained popularity as agent utterances can be directly treated as supervision without the need for utterance-level annotations. However, one potential drawback of such approaches is that they myopically generate the next agent utterance without regard for dialog-level considerations. To resolve this concern, this paper describes an offline RL method for learning from unannotated corpora that can optimize a goal-oriented policy at both the utterance and dialog level. We introduce a novel reward function and use both on-policy and off-policy policy gradient to learn a policy offline without requiring online user interaction or an explicit state space definition.
Comment: Workshop on Conversational AI, NIPS 2017, Long Beach, CA, USA
Published: 2017

5. Visualizing the Consequences of Evidence in Bayesian Networks

Author: Champion, Clifford and Elkan, Charles
Subjects: Computer Science - Artificial Intelligence
Abstract: This paper addresses the challenge of viewing and navigating Bayesian networks as their structural size and complexity grow. Starting with a review of the state of the art of visualizing Bayesian networks, an area which has largely been passed over, we improve upon existing visualizations in three ways. First, we apply a disciplined approach to the graphic design of the basic elements of the Bayesian network. Second, we propose a technique for direct, visual comparison of posterior distributions resulting from alternative evidence sets. Third, we leverage a central mathematical tool in information theory, to assist the user in finding variables of interest in the network, and to reduce visual complexity where unimportant. We present our methods applied to two modestly large Bayesian networks constructed from real-world data sets. Results suggest the new techniques can be a useful tool for discovering information flow phenomena, and also for qualitative comparisons of different evidence configurations, especially in large probabilistic networks.
Comment: 9 pages, 11 figures
Published: 2017

6. Predicting Surgery Duration with Neural Heteroscedastic Regression

Author: Ng, Nathan, Gabriel, Rodney A, McAuley, Julian, Elkan, Charles, and Lipton, Zachary C
Subjects: Statistics - Machine Learning, Computer Science - Learning, Computer Science - Neural and Evolutionary Computing
Abstract: Scheduling surgeries is a challenging task due to the fundamental uncertainty of the clinical environment, as well as the risks and costs associated with under- and over-booking. We investigate neural regression algorithms to estimate the parameters of surgery case durations, focusing on the issue of heteroscedasticity. We seek to simultaneously estimate the duration of each surgery, as well as a surgery-specific notion of our uncertainty about its duration. Estimating this uncertainty can lead to more nuanced and effective scheduling strategies, as we are able to schedule surgeries more efficiently while allowing an informed and case-specific margin of error. Using surgery records from a large United States health system, we demonstrate potential improvements on the order of 20% (in terms of minutes overbooked) compared to current scheduling techniques. Moreover, we demonstrate that surgery durations are indeed heteroscedastic. We show that models that estimate case-specific uncertainty better fit the data (log likelihood). Additionally, we show that the heteroscedastic predictions can more optimally trade off between over- and under-booking minutes, especially when idle minutes and scheduling collisions confer disparate costs.
Published: 2017

7. Learning to Diagnose with LSTM Recurrent Neural Networks

Author: Lipton, Zachary C., Kale, David C., Elkan, Charles, and Wetzel, Randall
Subjects: Computer Science - Learning
Abstract: Clinical medical data, especially in the intensive care unit (ICU), consist of multivariate time series of observations. For each patient visit (or episode), sensor data and lab test results are recorded in the patient's Electronic Health Record (EHR). While potentially containing a wealth of insights, the data is difficult to mine effectively, owing to varying length, irregular sampling and missing data. Recurrent Neural Networks (RNNs), particularly those using Long Short-Term Memory (LSTM) hidden units, are powerful and increasingly popular models for learning from sequence data. They effectively model varying length sequences and capture long range dependencies. We present the first study to empirically evaluate the ability of LSTMs to recognize patterns in multivariate time series of clinical measurements. Specifically, we consider multilabel classification of diagnoses, training a model to classify 128 diagnoses given 13 frequently but irregularly sampled clinical measurements. First, we establish the effectiveness of a simple LSTM network for modeling clinical data. Then we demonstrate a straightforward and effective training strategy in which we replicate targets at each sequence step. Trained only on raw time series, our models outperform several strong baselines, including a multilayer perceptron trained on hand-engineered features.
Published: 2015

8. A Critical Review of Recurrent Neural Networks for Sequence Learning

Author: Lipton, Zachary C., Berkowitz, John, and Elkan, Charles
Subjects: Computer Science - Learning, Computer Science - Neural and Evolutionary Computing
Abstract: Countless learning tasks require dealing with sequential data. Image captioning, speech synthesis, and music generation all require that a model produce outputs that are sequences. In other domains, such as time series prediction, video analysis, and musical information retrieval, a model must learn from inputs that are sequences. Interactive tasks, such as translating natural language, engaging in dialogue, and controlling a robot, often demand both capabilities. Recurrent neural networks (RNNs) are connectionist models that capture the dynamics of sequences via cycles in the network of nodes. Unlike standard feedforward neural networks, recurrent networks retain a state that can represent information from an arbitrarily long context window. Although recurrent neural networks have traditionally been difficult to train, and often contain millions of parameters, recent advances in network architectures, optimization techniques, and parallel computation have enabled successful large-scale learning with them. In recent years, systems based on long short-term memory (LSTM) and bidirectional (BRNN) architectures have demonstrated ground-breaking performance on tasks as varied as image captioning, language translation, and handwriting recognition. In this survey, we review and synthesize the research that over the past three decades first yielded and then made practical these powerful learning models. When appropriate, we reconcile conflicting notation and nomenclature. Our goal is to provide a self-contained explication of the state of the art together with a historical perspective and references to primary research.
Published: 2015

9. Efficient Elastic Net Regularization for Sparse Linear Models

Author: Lipton, Zachary C. and Elkan, Charles
Subjects: Computer Science - Learning
Abstract: This paper presents an algorithm for efficient training of sparse linear models with elastic net regularization. Extending previous work on delayed updates, the new algorithm applies stochastic gradient updates to non-zero features only, bringing weights current as needed with closed-form updates. Closed-form delayed updates for the $\ell_1$, $\ell_{\infty}$, and rarely used $\ell_2$ regularizers have been described previously. This paper provides closed-form updates for the popular squared norm $\ell^2_2$ and elastic net regularizers. We provide dynamic programming algorithms that perform each delayed update in constant time. The new $\ell^2_2$ and elastic net methods handle both fixed and varying learning rates, and both standard stochastic gradient descent (SGD) and forward backward splitting (FoBoS). Experimental results show that on a bag-of-words dataset with 260,941 features, but only 88 nonzero features on average per training example, the dynamic programming method trains a logistic regression classifier with elastic net regularization over 2000 times faster than otherwise.
Published: 2015
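
Illustrative note on the delayed-update idea in the abstract above. This is a minimal sketch of the simplest case only, under assumptions not stated in the record: a fixed learning rate `eta`, a squared-norm penalty written as `(lam / 2) * w**2`, and a feature that is zero for a run of consecutive examples, so that only the regularizer acts on its weight. The function names are hypothetical, and the paper's dynamic programming treatment of varying learning rates and elastic net is not reproduced here.

```python
def eager_l2_decay(weight, eta, lam, skipped_steps):
    """Apply the regularization-only SGD step once per skipped example."""
    for _ in range(skipped_steps):
        weight -= eta * lam * weight  # gradient of (lam / 2) * w**2 is lam * w
    return weight


def lazy_l2_decay(weight, eta, lam, skipped_steps):
    """Closed-form equivalent: one multiplication instead of a loop."""
    return weight * (1.0 - eta * lam) ** skipped_steps


# A weight whose feature was absent from 500 consecutive examples.
w_eager = eager_l2_decay(0.8, eta=0.1, lam=0.01, skipped_steps=500)
w_lazy = lazy_l2_decay(0.8, eta=0.1, lam=0.01, skipped_steps=500)
```

Both functions return the same value up to floating-point rounding, which is why a weight can be left untouched until its feature next appears and then brought current in a single step.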

10. Differential Privacy and Machine Learning: a Survey and Review

Author: Ji, Zhanglong, Lipton, Zachary C., and Elkan, Charles
Subjects: Computer Science - Learning, Computer Science - Cryptography and Security, Computer Science - Databases
Abstract: The objective of machine learning is to extract useful information from data, while privacy is preserved by concealing information. Thus it seems hard to reconcile these competing interests. However, they frequently must be balanced when mining sensitive data. For example, medical research represents an important application where it is necessary both to extract useful information and protect patient privacy. One way to resolve the conflict is to extract general characteristics of whole populations without disclosing the private information of individuals. In this paper, we consider differential privacy, one of the most popular and powerful definitions of privacy. We explore the interplay between machine learning and differential privacy, namely privacy-preserving machine learning algorithms and learning-based data release mechanisms. We also describe some theoretical results that address what can be learned differentially privately and upper bounds of loss functions for differentially private algorithms. Finally, we present some open questions, including how to incorporate public data, how to deal with missing data in private datasets, and whether, as the number of observed samples grows arbitrarily large, differentially private machine learning algorithms can be achieved at no cost to utility as compared to corresponding non-differentially private algorithms.
Published: 2014

11. Thresholding Classifiers to Maximize F1 Score

Author: Lipton, Zachary Chase, Elkan, Charles, and Narayanaswamy, Balakrishnan
Subjects: Statistics - Machine Learning, Computer Science - Information Retrieval, Computer Science - Learning
Abstract: This paper provides new insight into maximizing F1 scores in the context of binary classification and also in the context of multilabel classification. The harmonic mean of precision and recall, F1 score is widely used to measure the success of a binary classifier when one class is rare. Micro average, macro average, and per instance average F1 scores are used in multilabel classification. For any classifier that produces a real-valued output, we derive the relationship between the best achievable F1 score and the decision-making threshold that achieves this optimum. As a special case, if the classifier outputs are well-calibrated conditional probabilities, then the optimal threshold is half the optimal F1 score. As another special case, if the classifier is completely uninformative, then the optimal behavior is to classify all examples as positive. Since the actual prevalence of positive examples typically is low, this behavior can be considered undesirable. As a case study, we discuss the results, which can be surprising, of applying this procedure when predicting 26,853 labels for Medline documents.
Published: 2014
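
Illustrative note on the threshold claim in the abstract above. The sketch below assumes well-calibrated probabilities and approximates expected F1 by a ratio of expectations, 2 * E[true positives] / (number predicted positive + E[number of positives]). That approximation, the function name, and the synthetic scores are assumptions made for illustration, not taken from the record. Under them, the best cutoff falls at half the best achievable F1.

```python
import random


def best_expected_f1(probs):
    """Return (best_f1, best_k, ranked): labeling the best_k highest-probability
    examples as positive maximizes the approximate expected F1."""
    ranked = sorted(probs, reverse=True)
    total = sum(ranked)  # expected number of positives overall
    best_f1, best_k, running = 0.0, 0, 0.0
    for k, p in enumerate(ranked, start=1):
        running += p  # expected true positives among the top k
        f1 = 2.0 * running / (k + total)
        if f1 > best_f1:
            best_f1, best_k = f1, k
    return best_f1, best_k, ranked


random.seed(0)
# Synthetic calibrated probabilities, skewed so that positives are rare.
probs = [random.random() ** 3 for _ in range(1000)]
best_f1, best_k, ranked = best_expected_f1(probs)
threshold = best_f1 / 2.0
```

Every example labeled positive has probability at least `threshold`, and every example labeled negative has probability at most `threshold`, so the cutoff that achieves the best score sits at half of that score.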

12. Predicting accurate probabilities with a ranking loss

Author: Menon, Aditya, Jiang, Xiaoqian, Vembu, Shankar, Elkan, Charles, and Ohno-Machado, Lucila
Subjects: Computer Science - Learning, Statistics - Machine Learning
Abstract: In many real-world applications of machine learning classifiers, it is essential to predict the probability of an example belonging to a particular class. This paper proposes a simple technique for predicting probabilities based on optimizing a ranking loss, followed by isotonic regression. This semi-parametric technique offers both good ranking and regression performance, and models a richer set of probability distributions than statistical workhorses such as logistic regression. We provide experimental results that show the effectiveness of this technique on real-world applications of probability prediction.
Comment: ICML 2012
Published: 2012
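
Illustrative note on the second stage described in the abstract above. The sketch shows isotonic regression by the standard pool-adjacent-violators procedure, applied to labels sorted by ranker score. The scores, labels, and function name are hypothetical, and the ranking-loss stage that would produce the scores is assumed to have run already; this is not the authors' implementation.

```python
def isotonic_fit(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [sum of values, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)  # every member of a block gets the block mean
    return fitted


scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]  # hypothetical ranker outputs
labels = [0, 1, 0, 1, 0, 1]               # hypothetical binary labels
order = sorted(range(len(scores)), key=lambda i: scores[i])
calibrated = isotonic_fit([labels[i] for i in order])
```

The fitted values are non-decreasing in the ranker score and can be read as probability estimates for the examples taken in score order.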

13. Dyadic Prediction Using a Latent Feature Log-Linear Model

Author: Menon, Aditya Krishna and Elkan, Charles
Subjects: Computer Science - Learning
Abstract: In dyadic prediction, labels must be predicted for pairs (dyads) whose members possess unique identifiers and, sometimes, additional features called side-information. Special cases of this problem include collaborative filtering and link prediction. We present the first model for dyadic prediction that satisfies several important desiderata: (i) labels may be ordinal or nominal, (ii) side-information can be easily exploited if present, (iii) with or without side-information, latent features are inferred for dyad members, (iv) it is resistant to sample-selection bias, (v) it can learn well-calibrated probabilities, and (vi) it can scale to very large datasets. To our knowledge, no existing method satisfies all the above criteria. In particular, many methods assume that the labels are ordinal and ignore side-information when it is present. Experimental results show that the new method is competitive with state-of-the-art methods for the special cases of collaborative filtering and link prediction, and that it makes accurate predictions on nominal data.
Published: 2010

14. Benchmarking of the 2010 BioCreative Challenge III text-mining competition by the BioGRID and MINT interaction databases

Author: Krallinger, Martin, Vazquez, Miguel, Leitner, Florian, Salgado, David, Chatr-aryamontri, Andrew, Winter, Andrew, Perfetto, Livia, Briganti, Leonardo, Licata, Luana, Iannuccelli, Marta, Castagnoli, Luisa, Cesareni, Gianni, Tyers, Mike, Schneider, Gerold, Rinaldi, Fabio, Leaman, Robert, Gonzalez, Graciela, Matos, Sergio, Kim, Sun, Wilbur, W, Rocha, Luis, Shatkay, Hagit, Tendulkar, Ashish V, Agarwal, Shashank, Liu, Feifan, Wang, Xinglong, Rak, Rafal, Noto, Keith, Elkan, Charles, Lu, Zhiyong, Dogan, Rezarta, Fontaine, Jean-Fred, Andrade-Navarro, Miguel A, and Valencia, Alfonso
Abstract: Background: Determining usefulness of biomedical text mining systems requires realistic task definition and data selection criteria without artificial constraints, measuring performance aspects that go beyond traditional metrics. The BioCreative III Protein-Protein Interaction (PPI) tasks were motivated by such considerations, trying to address aspects including how the end user would oversee the generated output, for instance by providing ranked results, textual evidence for human interpretation or measuring time savings by using automated systems. Detecting articles describing complex biological events like PPIs was addressed in the Article Classification Task (ACT), where participants were asked to implement tools for detecting PPI-describing abstracts. Therefore the BCIII-ACT corpus was provided, which includes a training, development and test set of over 12,000 PPI relevant and non-relevant PubMed abstracts labeled manually by domain experts and recording also the human classification times. The Interaction Method Task (IMT) went beyond abstracts and required mining for associations between more than 3,500 full text articles and interaction detection method ontology concepts that had been applied to detect the PPIs reported in them. Results: A total of 11 teams participated in at least one of the two PPI tasks (10 in ACT and 8 in the IMT) and a total of 62 persons were involved either as participants or in preparing data sets/evaluating these tasks. Per task, each team was allowed to submit five runs offline and another five online via the BioCreative Meta-Server. From the 52 runs submitted for the ACT, the highest Matthews Correlation Coefficient (MCC) score measured was 0.55 at an accuracy of 89% and the best AUC iP/R was 68%. Most ACT teams explored machine learning methods, some of them also used lexical resources like MeSH terms, PSI-MI concepts or particular lists of verbs and nouns, some integrated NER approaches. For the IMT, a total of 42 runs were evaluated by comparing systems against manually generated annotations done by curators from the BioGRID and MINT databases. The highest AUC iP/R achieved by any run was 53%, the best MCC score 0.55. In case of competitive systems with an acceptable recall (above 35%) the macro-averaged precision ranged between 50% and 80%, with a maximum F-Score of 55%. Conclusions: The results of the ACT task of BioCreative III indicate that classification of large unbalanced article collections reflecting the real class imbalance is still challenging. Nevertheless, text-mining tools that report ranked lists of relevant articles for manual selection can potentially reduce the time needed to identify half of the relevant articles to less than 1/4 of the time when compared to unranked results. Detecting associations between full text articles and interaction detection method PSI-MI terms (IMT) is more difficult than might be anticipated. This is due to the variability of method term mentions, errors resulting from pre-processing of articles provided as PDF files, and the heterogeneity and different granularity of method term concepts encountered in the ontology. However, combining the sophisticated techniques developed by the participants with supporting evidence strings derived from the articles for human interpretation could result in practical modules for biological annotation workflows.
Published: 2011

15. Predicting labels for dyadic data

Author: Menon, Aditya Krishna and Elkan, Charles
Subjects: Computer Science, Information Storage and Retrieval, Statistics for Engineering, Physics, Computer Science, Chemistry and Earth Sciences, Statistics, general, Computing Methodologies, Data Mining and Knowledge Discovery, Artificial Intelligence (incl. Robotics), Dyadic prediction, Collaborative filtering, Link prediction, Social networks, Within-network classification, Relational learning
Abstract: In dyadic prediction, the input consists of a pair of items (a dyad), and the goal is to predict the value of an observation related to the dyad. Special cases of dyadic prediction include collaborative filtering, where the goal is to predict ratings associated with (user, movie) pairs, and link prediction, where the goal is to predict the presence or absence of an edge between two nodes in a graph. In this paper, we study the problem of predicting labels associated with dyad members. Special cases of this problem include predicting characteristics of users in a collaborative filtering scenario, and predicting the label of a node in a graph, which is a task sometimes called within-network classification or relational learning. This paper shows how to extend a recent dyadic prediction method to predict labels for nodes and labels for edges simultaneously. The new method learns latent features within a log-linear model in a supervised way, to maximize predictive accuracy for both dyad observations and item labels. We compare the new approach to existing methods for within-network classification, both experimentally and analytically. The experiments show, surprisingly, that learning latent features in an unsupervised way is superior for some applications to learning them in a supervised way.
Published: 2010

16. Learning gene regulatory networks from only positive and unlabeled data

Author: Cerulo, Luigi, Elkan, Charles, and Ceccarelli, Michele
Abstract: Background: Recently, supervised learning methods have been exploited to reconstruct gene regulatory networks from gene expression data. The reconstruction of a network is modeled as a binary classification problem for each pair of genes. A statistical classifier is trained to recognize the relationships between the activation profiles of gene pairs. This approach has been proven to outperform previous unsupervised methods. However, the supervised approach raises open questions. In particular, although known regulatory connections can safely be assumed to be positive training examples, obtaining negative examples is not straightforward, because definite knowledge is typically not available that a given pair of genes do not interact. Results: A recent advance in research on data mining is a method capable of learning a classifier from only positive and unlabeled examples, that does not need labeled negative examples. Applied to the reconstruction of gene regulatory networks, we show that this method significantly outperforms the current state of the art of machine learning methods. We assess the new method using both simulated and experimental data, and obtain major performance improvement. Conclusions: Compared to unsupervised methods for gene network inference, supervised methods are potentially more accurate, but for training they need a complete set of known regulatory connections. A supervised method that can be trained using only positive and unlabeled data, as presented in this paper, is especially beneficial for the task of inferring gene regulatory networks, because only an incomplete set of known regulatory connections is available in public databases such as RegulonDB, TRRD, KEGG, Transfac, and IPA.
Published: 2010

17. The Transporter Classification Database: recent advances

Author: Saier, Milton H, Yen, Ming Ren, Noto, Keith, Tamang, Dorjee G, and Elkan, Charles
Subjects: Biological Sciences, Bioinformatics and Computational Biology, Generic health relevance, Artificial Intelligence, Databases, Protein, Membrane Transport Proteins, Phylogeny, Sequence Homology, Amino Acid, Environmental Sciences, Information and Computing Sciences, Developmental Biology, Biological sciences, Chemical sciences, Environmental sciences
Abstract: The Transporter Classification Database (TCDB), freely accessible at http://www.tcdb.org, is a relational database containing sequence, structural, functional and evolutionary information about transport systems from a variety of living organisms, based on the International Union of Biochemistry and Molecular Biology-approved transporter classification (TC) system. It is a curated repository for factual information compiled largely from published references. It uses a functional/phylogenetic system of classification, and currently encompasses about 5000 representative transporters and putative transporters in more than 500 families. We here describe novel software designed to support and extend the usefulness of TCDB. Our recent efforts render it more user friendly, incorporate machine learning to input novel data in a semiautomatic fashion, and allow analyses that are more accurate and less time consuming. The availability of these tools has resulted in recognition of distant phylogenetic relationships and tremendous expansion of the information available to TCDB users.
Published: 2009

18. Learning the k in k-means

Author: Hamerly, Greg and Elkan, Charles
Abstract: When clustering a dataset, the right number k of clusters to use is often not obvious, and choosing k automatically is a hard algorithmic problem. In this paper we present a new algorithm for choosing k that is based on a new statistical test for the hypothesis that a subset of data follows a Gaussian distribution. The algorithm runs k-means with increasing k until the test fails to reject the hypothesis that the data assigned to each k-means center are Gaussian. We present results from experiments on synthetic and real-world data showing that the algorithm works well, and better than a recent method based on the BIC penalty for model complexity.
Pre-2018 CSE ID: CS2002-0716
Published: 2002

19. Alternatives to the k-means algorithm that find better clusterings

Author: Hamerly, Greg and Elkan, Charles
Abstract: We investigate here the behavior of the standard k-means clustering algorithm and several alternatives to it: the k-harmonic means algorithm due to Zhang and colleagues, fuzzy k-means, Gaussian expectation-maximization, and two new variants of k-harmonic means. Our aim is to find which aspects of these algorithms contribute to finding good clusterings, as opposed to converging to a low-quality local optimum. We describe each algorithm in a unified framework that introduces separate cluster membership and data weight functions. We then show that the algorithms do behave very differently from each other on simple low-dimensional synthetic datasets, and that the k-harmonic means method is superior. Having a soft membership function is essential for finding high-quality clusterings, but having a non-constant data weight function is useful also.
Pre-2018 CSE ID: CS2002-0702
Published: 2002

20. Sources of Success for Information Extraction Methods

Author: Kauchak, David, Smarr, Joseph, and Elkan, Charles
Abstract: In this paper, we examine an important recent rule-based information extraction (IE) technique named Boosted Wrapper Induction (BWI), by conducting experiments on a wider variety of tasks than previously studied, including tasks using several collections of natural text documents. We provide a systematic analysis of how each algorithmic component of BWI, in particular boosting, contributes to its success. We show that the benefit of boosting arises from the ability to reweight examples to learn specific rules (resulting in high precision) combined with the ability to continue learning rules after all positive examples have been covered (resulting in high recall). As a quantitative indicator of the regularity of an extraction task, we propose a new measure that we call SWI ratio. We show that this measure is a good predictor of IE success. Based on these results, we analyze the strengths and limitations of current rule-based IE methods in general. Specifically, we explain limitations in the information made available to these methods, and in the representations they use. We also discuss how confidence values returned during extraction are not true probabilities. In this analysis, we investigate the benefits of including grammatical and semantic information for natural text documents, as well as parse tree and attribute-value information for XML and HTML documents. We show experimentally that incorporating even limited grammatical information can improve the regularity of and hence performance on natural text extraction tasks. We conclude with proposals for enriching the representational power of rule-based IE methods to exploit these and other types of regularities.
Pre-2018 CSE ID: CS2002-0696
Published: 2002

21. Learning and Making Decisions When Costs and Probabilities are Both Unknown

Author: Zadrozny, Bianca and Elkan, Charles
Abstract: In many machine learning domains, misclassification costs are different for different examples, in the same way that class membership probabilities are example-dependent. In these domains, both costs and probabilities are unknown for test examples, so both cost estimators and probability estimators must be learned. This paper first discusses how to make optimal decisions given cost and probability estimates, and then presents decision tree learning methods for obtaining well-calibrated probability estimates. The paper then explains how to obtain unbiased estimators for example-dependent costs, taking into account the difficulty that in general, probabilities and costs are not independent random variables, and the training examples for which costs are known are not representative of all examples. The latter problem is called sample selection bias in econometrics. Our solution to it is based on Nobel prize-winning work due to the economist James Heckman. We show that the methods we propose are successful in a comprehensive comparison with MetaCost that uses the well-known and difficult dataset from the KDD'98 data mining contest. Pre-2018 CSE ID: CS2001-0664
Published: 2001

22. Optimal Thresholding of Classifiers to Maximize F1 Measure

Author: Lipton, Zachary C., Elkan, Charles, Naryanaswamy, Balakrishnan, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Kobsa, Alfred, Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Goebel, Randy, Series editor, Tanaka, Yuzuru, Series editor, Wahlster, Wolfgang, Series editor, Siekmann, Jörg, Series editor, Calders, Toon, editor, Esposito, Floriana, editor, Hüllermeier, Eyke, editor, and Meo, Rosa, editor
Published: 2014
Full Text: View/download PDF

23. Nowcasting with Numerous Candidate Predictors

Author: Duncan, Brendan, Elkan, Charles, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Kobsa, Alfred, Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Goebel, Randy, Series editor, Tanaka, Yuzuru, Series editor, Wahlster, Wolfgang, Series editor, Siekmann, Jörg, Series editor, Calders, Toon, editor, Esposito, Floriana, editor, Hüllermeier, Eyke, editor, and Meo, Rosa, editor
Published: 2014
Full Text: View/download PDF

24. Policy Iteration Based on a Learned Transition Model

Author: Ramavajjala, Vivek, Elkan, Charles, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Goebel, Randy, editor, Siekmann, Jörg, editor, Wahlster, Wolfgang, editor, Flach, Peter A., editor, De Bie, Tijl, editor, and Cristianini, Nello, editor
Published: 2012
Full Text: View/download PDF

25. Learning and Inference in Probabilistic Classifier Chains with Beam Search

Author: Kumar, Abhishek, Vembu, Shankar, Menon, Aditya Krishna, Elkan, Charles, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Goebel, Randy, editor, Siekmann, Jörg, editor, Wahlster, Wolfgang, editor, Flach, Peter A., editor, De Bie, Tijl, editor, and Cristianini, Nello, editor
Published: 2012
Full Text: View/download PDF

26. Reinforcement Learning with a Bilinear Q Function

Author: Elkan, Charles, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Goebel, Randy, editor, Siekmann, Jörg, editor, Wahlster, Wolfgang, editor, Sanner, Scott, editor, and Hutter, Marcus, editor
Published: 2012
Full Text: View/download PDF

27. Link Prediction via Matrix Factorization

Author: Menon, Aditya Krishna, Elkan, Charles, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Goebel, Randy, editor, Siekmann, Jörg, editor, Wahlster, Wolfgang, editor, Gunopulos, Dimitrios, editor, Hofmann, Thomas, editor, Malerba, Donato, editor, and Vazirgiannis, Michalis, editor
Published: 2011
Full Text: View/download PDF

28. Preserving Privacy in Data Mining via Importance Weighting

Author: Elkan, Charles, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Goebel, Randy, editor, Siekmann, Jörg, editor, Wahlster, Wolfgang, editor, Dimitrakakis, Christos, editor, Gkoulalas-Divanis, Aris, editor, Mitrokotsa, Aikaterini, editor, Verykios, Vassilios S., editor, and Saygin, Yücel, editor
Published: 2011
Full Text: View/download PDF

29. Learning to Find Relevant Biological Articles without Negative Training Examples

Author: Noto, Keith, Saier, Milton H., Jr., Elkan, Charles, Carbonell, Jaime G., editor, Siekmann, Jörg, editor, Wobcke, Wayne, editor, and Zhang, Mengjie, editor
Published: 2008
Full Text: View/download PDF

30. Finding Transport Proteins in a General Protein Database

Author: Das, Sanmay, Saier, Milton H., Jr., Elkan, Charles, Carbonell, Jaime G., editor, Siekmann, Jörg, editor, Kok, Joost N., editor, Koronacki, Jacek, editor, Lopez de Mantaras, Ramon, editor, Matwin, Stan, editor, Mladenič, Dunja, editor, and Skowron, Andrzej, editor
Published: 2007
Full Text: View/download PDF

31. Deriving TF-IDF as a Fisher Kernel

Author: Elkan, Charles, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Consens, Mariano, editor, and Navarro, Gonzalo, editor
Published: 2005
Full Text: View/download PDF

32. Learning Rules to Improve a Machine Translation System

Author: Kauchak, David, Elkan, Charles, Goos, Gerhard, editor, Hartmanis, Juris, editor, van Leeuwen, Jan, editor, Carbonell, Jaime G., editor, Siekmann, Jörg, editor, Lavrač, Nada, editor, Gamberger, Dragan, editor, Blockeel, Hendrik, editor, and Todorovski, Ljupčo, editor
Published: 2003
Full Text: View/download PDF

33. Can we model the probability of presence of species without absence data?

Author: Li, Wenkai, Guo, Qinghua, and Elkan, Charles
Published: 2011
Full Text: View/download PDF

34. Reasoning about unknown, counterfactual, and nondeterministic actions in first-order logic

Author: Elkan, Charles, Carbonell, Jaime G., editor, Siekmann, Jörg, editor, Goos, G., editor, Hartmanis, J., editor, van Leeuwen, J., editor, and McCalla, Gordon, editor
Published: 1996
Full Text: View/download PDF

35. LPMEME: A statistical method for inductive logic programming

Author: Bhatia, Karan, Elkan, Charles, Carbonell, Jaime G., editor, Siekmann, Jörg, editor, Goos, G., editor, Hartmanis, J., editor, van Leeuwen, J., editor, and McCalla, Gordon, editor
Published: 1996
Full Text: View/download PDF

36. Adaptive Inference

Author: Segre, Alberto, Elkan, Charles, Scharstein, Daniel, Gordon, Geoffrey, Russell, Alexander, Meyrowitz, Alan L., editor, and Chipman, Susan, editor
Published: 1993
Full Text: View/download PDF

37. Differential privacy based on importance weighting

Author: Ji, Zhanglong and Elkan, Charles
Published: 2013
Full Text: View/download PDF

38. Beam search algorithms for multilabel learning

Author: Kumar, Abhishek, Vembu, Shankar, Menon, Aditya Krishna, and Elkan, Charles
Published: 2013
Full Text: View/download PDF

39. Optimal Thresholding of Classifiers to Maximize F1 Measure

Author: Lipton, Zachary C., primary, Elkan, Charles, additional, and Naryanaswamy, Balakrishnan, additional
Published: 2014
Full Text: View/download PDF

40. One-Class Remote Sensing Classification From Positive and Unlabeled Background Data

Author: Li, Wenkai, primary, Guo, Qinghua, additional, and Elkan, Charles, additional
Published: 2021
Full Text: View/download PDF

41. Fast recognition of musical genres using RBF networks

Author: Turnbull, Douglas and Elkan, Charles
Subjects: Subject cataloging -- Research, Indexing -- Research, Object-oriented databases -- Research, Music -- Research, Music -- Identification and classification, Neural networks -- Research, PRECIS (Indexing system) -- Research, Object-oriented database, Neural network, Business, Computers, Electronics, Electronics and electrical industries
Abstract: This paper explores the automatic classification of audio tracks into musical genres. Our goal is to achieve human-level accuracy with fast training and classification. This goal is achieved with radial basis function (RBF) networks by using a combination of unsupervised and supervised initialization methods. These initialization methods yield classifiers that are as accurate as RBF networks trained with gradient descent (which is hundreds of times slower). In addition, feature subset selection further reduces training and classification time while preserving classification accuracy. Combined, our methods succeed in creating an RBF network that matches the musical classification accuracy of humans. The general algorithmic contribution of this paper is to show experimentally that RBF networks initialized with a combination of methods can yield good classification performance without relying on gradient descent. The simplicity and computational efficiency of our initialization methods produce classifiers that are fast to train as well as fast to apply to novel data. We also present an improved method for initializing the k-means clustering algorithm which is useful for both unsupervised and supervised initialization methods. Index Terms--Radial basis function network, musical genre, initialization method, feature subset selection.
Published: 2005

42. Policy Iteration Based on a Learned Transition Model

Author: Ramavajjala, Vivek, primary and Elkan, Charles, additional
Published: 2012
Full Text: View/download PDF

43. Learning and Inference in Probabilistic Classifier Chains with Beam Search

Author: Kumar, Abhishek, primary, Vembu, Shankar, additional, Menon, Aditya Krishna, additional, and Elkan, Charles, additional
Published: 2012
Full Text: View/download PDF

44. Link Prediction via Matrix Factorization

Author: Menon, Aditya Krishna, primary and Elkan, Charles, additional
Published: 2011
Full Text: View/download PDF

45. Improved disk-drive failure warnings

Author: Hughes, Gordon F., Murray, Joseph F., Kreutz-Delgado, Kenneth, and Elkan, Charles
Subjects: Reliability (Engineering) -- Research, Computer peripherals industry -- Research, Disk drives -- Research, Hard disk drive, Business, Electronics, Electronics and electrical industries
Abstract: New methods for improving disk-drive failure prediction are proposed and discussed.
Published: 2002