Author: "Kannan, Anitha" - Searchworks@Jio Institute Digital Library Search Results

Search for author "Kannan, Anitha": 211 results

1. Extrinsically-Focused Evaluation of Omissions in Medical Summarization

Author: Schumacher, Elliot, Rosenthal, Daniel, Nair, Varun, Price, Luladay, Tso, Geoffrey, and Kannan, Anitha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: The goal of automated summarization techniques (Paice, 1990; Kupiec et al., 1995) is to condense text by focusing on the most critical information. Generative large language models (LLMs) have been shown to be robust summarizers, yet traditional metrics struggle to capture resulting performance (Goyal et al., 2022) in more powerful LLMs. In safety-critical domains such as medicine, more rigorous evaluation is required, especially given the potential for LLMs to omit important information in the resulting summary. We propose MED-OMIT, a new omission benchmark for medical summarization. Given a doctor-patient conversation and a generated summary, MED-OMIT categorizes the chat into a set of facts and identifies which are omitted from the summary. We further propose to determine fact importance by simulating the impact of each fact on a downstream clinical task: differential diagnosis (DDx) generation. MED-OMIT leverages LLM prompt-based approaches which categorize the importance of facts and cluster them as supporting or negating evidence to the diagnosis. We evaluate MED-OMIT on a publicly-released dataset of patient-doctor conversations and find that MED-OMIT captures omissions better than alternative metrics.
Published: 2023

2. Injecting knowledge into language generation: a case study in auto-charting after-visit care instructions from medical dialogue

Author: Eremeev, Maksim, Valmianski, Ilya, Amatriain, Xavier, and Kannan, Anitha
Subjects: Computer Science - Computation and Language
Abstract: Factual correctness is often the limiting factor in practical applications of natural language generation in high-stakes domains such as healthcare. An essential requirement for maintaining factuality is the ability to deal with rare tokens. This paper focuses on rare tokens that appear in both the source and the reference sequences, and which, when missed during generation, decrease the factual correctness of the output text. For high-stakes domains that are also knowledge-rich, we show how to use knowledge to (a) identify which rare tokens that appear in both source and reference are important and (b) uplift their conditional probability. We introduce the "utilization rate" that encodes knowledge and serves as a regularizer by maximizing the marginal probability of selected tokens. We present a study in a knowledge-rich domain of healthcare, where we tackle the problem of generating after-visit care instructions based on patient-doctor dialogues. We verify that, in our dataset, specific medical concepts with high utilization rates are underestimated by conventionally trained sequence-to-sequence models. We observe that correcting this with our approach to knowledge injection reduces the uncertainty of the model as well as improves factuality and coherence without negatively impacting fluency.
Comment: ACL 2023 (main conference)
Published: 2023

3. Generating medically-accurate summaries of patient-provider dialogue: A multi-stage approach using large language models

Author: Nair, Varun, Schumacher, Elliot, and Kannan, Anitha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: A medical provider's summary of a patient visit serves several critical purposes, including clinical decision-making, facilitating hand-offs between providers, and as a reference for the patient. An effective summary is required to be coherent and accurately capture all the medically relevant information in the dialogue, despite the complexity of patient-generated language. Even minor inaccuracies in visit summaries (for example, summarizing "patient does not have a fever" when a fever is present) can be detrimental to the outcome of care for the patient. This paper tackles the problem of medical conversation summarization by discretizing the task into several smaller dialogue-understanding tasks that are sequentially built upon. First, we identify medical entities and their affirmations within the conversation to serve as building blocks. We study dynamically constructing few-shot prompts for tasks by conditioning on relevant patient information and use GPT-3 as the backbone for our experiments. We also develop GPT-derived summarization metrics to measure performance against reference summaries quantitatively. Both our human evaluation study and metrics for medical correctness show that summaries generated using this approach are clinically accurate and outperform the baseline approach of summarizing the dialog in a zero-shot, single-prompt setting.
Published: 2023

4. CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants

Author: Sun, Albert Yu, Nair, Varun, Schumacher, Elliot, and Kannan, Anitha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: A wave of new task-based virtual assistants has been fueled by increasingly powerful large language models (LLMs), such as GPT-4 (OpenAI, 2023). A major challenge in deploying LLM-based virtual conversational assistants in real world settings is ensuring they operate within what is admissible for the task. To overcome this challenge, the designers of these virtual assistants rely on an independent guardrail system that verifies the virtual assistant's output aligns with the constraints required for the task. However, relying on commonly used, prompt-based guardrails can be difficult to engineer correctly and comprehensively. To address these challenges, we propose CONSCENDI. We use CONSCENDI to exhaustively generate training data with two key LLM-powered components: scenario-augmented generation and contrastive training examples. When generating conversational data, we generate a set of rule-breaking scenarios, which enumerate a diverse set of high-level ways a rule can be violated. This scenario-guided approach produces a diverse training set and provides chatbot designers greater control. To generate contrastive examples, we prompt the LLM to alter conversations with violations into acceptable conversations to enable fine-grained distinctions. We then use this data, generated by CONSCENDI, to train a smaller model. We find that CONSCENDI results in guardrail models that improve over baselines in multiple dialogue domains.
Comment: To appear in NAACL 2024
Published: 2023

5. Dialogue-Contextualized Re-ranking for Medical History-Taking

Author: Zhu, Jian, Valmianski, Ilya, and Kannan, Anitha
Subjects: Computer Science - Computation and Language, Computer Science - Information Retrieval
Abstract: AI-driven medical history-taking is an important component in symptom checking, automated patient intake, triage, and other AI virtual care applications. As history-taking is extremely varied, machine learning models require a significant amount of data to train. To overcome this challenge, existing systems are developed using indirect data or expert knowledge. This leads to a training-inference gap as models are trained on different kinds of data than what they observe at inference time. In this work, we present a two-stage re-ranking approach that helps close the training-inference gap by re-ranking the first-stage question candidates using a dialogue-contextualized model. For this, we propose a new model, global re-ranker, which cross-encodes the dialogue with all questions simultaneously, and compare it with several existing neural baselines. We test both transformer and S4-based language model backbones. We find that relative to the expert system, the best performance is achieved by our proposed global re-ranker with a transformer backbone, resulting in a 30% higher normalized discounted cumulative gain (nDCG) and a 77% higher mean average precision (mAP).
Comment: Code and pre-trained S4 checkpoints will be available after publication
Published: 2023
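
The two ranking metrics named in the abstract above can be illustrated with a minimal sketch. It uses the common textbook definitions (log2 rank discount for nDCG, binary relevance for average precision); the exact formulation used in the paper is not given in this record, so the function names and the discount choice here are assumptions for illustration only.

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: each item's relevance is divided by
    # log2(rank + 1), with ranks starting at 1 (so index 0 uses log2(2)).
    return sum(rel / math.log2(idx + 2) for idx, rel in enumerate(relevances))

def ndcg(relevances):
    # Normalize by the DCG of the ideal (descending-relevance) ordering.
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def average_precision(relevances):
    # Mean of precision@rank taken at every rank holding a relevant item.
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(rankings):
    # mAP: average precision averaged over all ranked lists (e.g. dialogues).
    return sum(average_precision(r) for r in rankings) / len(rankings)
```

Each input list holds the relevance labels of the candidate questions in the order a re-ranker returned them, so a better ordering of the same candidates raises both scores.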

6. DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents

Author: Nair, Varun, Schumacher, Elliot, Tso, Geoffrey, and Kannan, Anitha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Large language models (LLMs) have emerged as valuable tools for many natural language understanding tasks. In safety-critical applications such as healthcare, the utility of these models is governed by their ability to generate outputs that are factually accurate and complete. In this work, we present dialog-enabled resolving agents (DERA). DERA is a paradigm made possible by the increased conversational abilities of LLMs, namely GPT-4. It provides a simple, interpretable forum for models to communicate feedback and iteratively improve output. We frame our dialog as a discussion between two agent types - a Researcher, who processes information and identifies crucial problem components, and a Decider, who has the autonomy to integrate the Researcher's information and makes judgments on the final output. We test DERA against three clinically-focused tasks. For medical conversation summarization and care plan generation, DERA shows significant improvement over the base GPT-4 performance in both human expert preference evaluations and quantitative metrics. In a new finding, we also show that GPT-4's performance (70%) on an open-ended version of the MedQA question-answering (QA) dataset (Jin et al. 2021, USMLE) is well above the passing level (60%), with DERA showing similar performance. We release the open-ended MEDQA dataset at https://github.com/curai/curai-research/tree/main/DERA.
Published: 2023

7. Learning functional sections in medical conversations: iterative pseudo-labeling and human-in-the-loop approach

Author: Wang, Mengqian, Valmianski, Ilya, Amatriain, Xavier, and Kannan, Anitha
Subjects: Computer Science - Computation and Language
Abstract: Medical conversations between patients and medical professionals have implicit functional sections, such as "history taking", "summarization", "education", and "care plan." In this work, we are interested in learning to automatically extract these sections. A direct approach would require collecting large amounts of expert annotations for this task, which is inherently costly due to the contextual inter- and intra-variability between these sections. This paper presents an approach that tackles the problem of learning to classify medical dialogue into functional sections without requiring a large number of annotations. Our approach combines pseudo-labeling and human-in-the-loop. First, we bootstrap using weak supervision with pseudo-labeling to generate dialogue turn-level pseudo-labels and train a transformer-based model, which is then applied to individual sentences to create noisy sentence-level labels. Second, we iteratively refine sentence-level labels using a cluster-based human-in-the-loop approach. Each iteration requires only a few dozen annotator decisions. We evaluate the results on an expert-annotated dataset of 100 dialogues and find that while our models start with 69.5% accuracy, we can iteratively improve it to 82.5%. The code used to perform all experiments described in this paper can be found here: https://github.com/curai/curai-research/tree/main/functional-sections.
Comment: Changed the GitHub link as it was invalid
Published: 2022

8. Development and Evaluation of an iPad App for Measuring the Cost of a Nutritious Diet

Author: Palermo, Claire, Perera-Schulz, Dharani, Kannan, Anitha, Truby, Helen, Shiell, Alan, Emilda, Sindhu, and Quenette, Steve
Subjects: Information technology, T58.5-58.64, Public aspects of medicine, RA1-1270
Abstract: Background: Monitoring food costs informs governments of the affordability of healthy diets. Many countries have adopted a standardized healthy food basket. The Victorian Healthy Food Basket contains 44 food items necessary to meet the nutritional requirements of four different Australian family types for a fortnight. Objective: The aim of this study was to describe the development of a new iPad app as core to the implementation of the Victorian Healthy Food Basket. The app significantly automates the data collection. We evaluate if the new technology enhanced the quality and efficacy of the research. Methods: Time taken for data collection and entry was recorded. Semi-structured evaluative interviews were conducted with five field workers during the pilot of the iPad app. Field workers were familiar with previous manual data collection methods. Qualitative process evaluation data was summarized against key evaluation questions. Results: Field workers reported that using the iPad for data collection resulted in increased data accuracy, time savings, and efficient data management, and was preferred over manual collection. Conclusions: Portable digital devices may be considered to improve and extend data collection in the field of food cost monitoring.
Published: 2014
Full Text: View/download PDF

9. OSLAT: Open Set Label Attention Transformer for Medical Entity Retrieval and Span Extraction

Author: Li, Raymond, Valmianski, Ilya, Deng, Li, Amatriain, Xavier, and Kannan, Anitha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Medical entity span extraction and linking are critical steps for many healthcare NLP tasks. Most existing entity extraction methods either have a fixed vocabulary of medical entities or require span annotations. In this paper, we propose a method for linking an open set of entities that does not require any span annotations. Our method, Open Set Label Attention Transformer (OSLAT), uses the label-attention mechanism to learn candidate-entity contextualized text representations. We find that OSLAT can not only link entities but is also able to implicitly learn spans associated with entities. We evaluate OSLAT on two tasks: (1) span extraction trained without explicit span annotations, and (2) entity linking trained without span-level annotation. We test the generalizability of our method by training two separate models on two datasets with low entity overlap and comparing cross-dataset performance.
Comment: 18 pages, 2 figures, Camera-Ready for ML4H 2022 (Proceedings Track)
Published: 2022

10. MEDCOD: A Medically-Accurate, Emotive, Diverse, and Controllable Dialog System

Author: Compton, Rhys, Valmianski, Ilya, Deng, Li, Huang, Costa, Katariya, Namit, Amatriain, Xavier, and Kannan, Anitha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: We present MEDCOD, a Medically-Accurate, Emotive, Diverse, and Controllable Dialog system with a unique approach to the natural language generator module. MEDCOD has been developed and evaluated specifically for the history taking task. It integrates the advantage of a traditional modular approach to incorporate (medical) domain knowledge with modern deep learning techniques to generate flexible, human-like natural language expressions. Two key aspects of MEDCOD's natural language output are described in detail. First, the generated sentences are emotive and empathetic, similar to how a doctor would communicate to the patient. Second, the generated sentence structures and phrasings are varied and diverse while maintaining medical consistency with the desired medical concept (provided by the dialogue manager module of MEDCOD). Experimental results demonstrate the effectiveness of our approach in creating a human-like medical dialogue system. Relevant code is available at https://github.com/curai/curai-research/tree/main/MEDCOD.
Comment: 9 pages. Accepted at Machine Learning for Health (ML4H) 2021
Published: 2021

11. Adding more data does not always help: A study in medical conversation summarization with PEGASUS

Author: Nair, Varun, Katariya, Namit, Amatriain, Xavier, Valmianski, Ilya, and Kannan, Anitha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Medical conversation summarization is integral in capturing information gathered during interactions between patients and physicians. Summarized conversations are used to facilitate patient hand-offs between physicians, and as part of providing care in the future. Summaries, however, can be time-consuming to produce and require domain expertise. Modern pre-trained NLP models such as PEGASUS have emerged as capable alternatives to human summarization, reaching state-of-the-art performance on many summarization benchmarks. However, many downstream tasks still require at least moderately sized datasets to achieve satisfactory performance. In this work we (1) explore the effect of dataset size on transfer learning medical conversation summarization using PEGASUS and (2) evaluate various iterative labeling strategies in the low-data regime, following their success in the classification setting. We find that model performance saturates with increase in dataset size and that the various active-learning strategies evaluated all show equivalent performance consistent with simple dataset size increase. We also find that naive iterative pseudo-labeling is on par with or slightly worse than no pseudo-labeling. Our work sheds light on the successes and challenges of translating low-data regime techniques in classification to medical conversation summarization and helps guide future work in this space. Relevant code available at https://github.com/curai/curai-research/tree/main/medical-summarization-ML4H-2021.
Comment: Accepted to Machine Learning for Healthcare Workshop, NeurIPS 2021
Published: 2021

12. Medically Aware GPT-3 as a Data Generator for Medical Dialogue Summarization

Author: Chintagunta, Bharath, Katariya, Namit, Amatriain, Xavier, and Kannan, Anitha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: In medical dialogue summarization, summaries must be coherent and must capture all the medically relevant information in the dialogue. However, learning effective models for summarization requires large amounts of labeled data, which is especially hard to obtain. We present an algorithm to create synthetic training data with an explicit focus on capturing medically relevant information. We utilize GPT-3 as the backbone of our algorithm and scale 210 human labeled examples to yield results comparable to using 6400 human labeled examples (~30x) leveraging low-shot learning and an ensemble method. In detailed experiments, we show that this approach produces high quality training data that can further be combined with human labeled data to get summaries that are strongly preferable to those produced by models trained on human data alone both in terms of medical accuracy and coherency.
Comment: Accepted to Machine Learning for Healthcare 2021
Published: 2021

13. Medical symptom recognition from patient text: An active learning approach for long-tailed multilabel distributions

Author: Mottaghi, Ali, Sarma, Prathusha K, Amatriain, Xavier, Yeung, Serena, and Kannan, Anitha
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: We study the problem of medical symptoms recognition from patient text, for the purposes of gathering pertinent information from the patient (known as history-taking). A typical patient text is often descriptive of the symptoms the patient is experiencing and a single instance of such a text can be "labeled" with multiple symptoms. This makes learning a medical symptoms recognizer challenging on account of i) the lack of availability of voluminous annotated data as well as ii) the large unknown universe of multiple symptoms that a single text can map to. Furthermore, patient text is often characterized by a long tail in the data (i.e., some labels/symptoms occur more frequently than others, e.g., "fever" vs. "hematochezia"). In this paper, we introduce an active learning method that leverages the underlying structure of a continually refined, learned latent space to select the most informative examples to label. This enables the selection of the most informative examples that progressively increase the coverage on the universe of symptoms via the learned model, despite the long tail in data distribution.
Published: 2020

14. Dr. Summarize: Global Summarization of Medical Dialogue by Exploiting Local Structures

Author: Joshi, Anirudh, Katariya, Namit, Amatriain, Xavier, and Kannan, Anitha
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Understanding a medical conversation between a patient and a physician poses a unique natural language understanding challenge since it combines elements of standard open-ended conversation with very domain-specific elements that require expertise and medical knowledge. Summarization of medical conversations is a particularly important aspect of medical conversation understanding since it addresses a very real need in medical practice: capturing the most important aspects of a medical encounter so that they can be used for medical decision making and subsequent follow-ups. In this paper we present a novel approach to medical conversation summarization that leverages the unique and independent local structures created when gathering a patient's medical history. Our approach is a variation of the pointer generator network where we introduce a penalty on the generator distribution, and we explicitly model negations. The model also captures important properties of medical conversations such as medical knowledge coming from standardized medical ontologies better than when those concepts are introduced explicitly. Through evaluation by doctors, we show that our approach is preferred on twice the number of summaries to the baseline pointer generator model and captures most or all of the information in 80% of the conversations making it a realistic alternative to costly manual summarization by medical experts.
Comment: Accepted for publication in Findings of EMNLP at EMNLP 2020
Published: 2020

15. COVID-19 in differential diagnosis of online symptom assessments

Author: Kannan, Anitha, Chen, Richard, Venkataraman, Vignesh, Tso, Geoffrey J., and Amatriain, Xavier
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: The COVID-19 pandemic has magnified an already existing trend of people looking for healthcare solutions online. One class of solutions are symptom checkers, which have become very popular in the context of COVID-19. Traditional symptom checkers, however, are based on manually curated expert systems that are inflexible and hard to modify, especially in a quickly changing situation like the one we are facing today. That is why all existing COVID-19 solutions are manual symptom checkers that can only estimate the probability of this disease and cannot contemplate alternative hypotheses or come up with a differential diagnosis. While machine learning offers an alternative, the lack of reliable data does not make it easy to apply to COVID-19 either. In this paper we present an approach that combines the strengths of traditional AI expert systems and novel deep learning models. In doing so we can leverage prior knowledge as well as any amount of existing data to quickly derive models that best adapt to the current state of the world and latest scientific knowledge. We use the approach to train a COVID-19 aware differential diagnosis model that can be used for medical decision support for both doctors and patients. We show that our approach is able to accurately model new incoming data about COVID-19 while still preserving accuracy on conditions that had been modeled in the past. While our approach shows evident and clear advantages for an extreme situation like the one we are currently facing, we also show that its flexibility generalizes beyond this concrete, but very important, example.
Comment: Accepted at the Machine Learning for Health (ML4H) at NeurIPS 2020 - Extended Abstract
Published: 2020

16. Effective Transfer Learning for Identifying Similar Questions: Matching User Questions to COVID-19 FAQs

Author: McCreery, Clara H., Katariya, Namit, Kannan, Anitha, Chablani, Manish, and Amatriain, Xavier
Subjects: Computer Science - Information Retrieval, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: People increasingly search online for answers to their medical questions but the rate at which medical questions are asked online significantly exceeds the capacity of qualified people to answer them. This leaves many questions unanswered or inadequately answered. Many of these questions are not unique, and reliable identification of similar questions would enable more efficient and effective question answering schema. COVID-19 has only exacerbated this problem. Almost every government agency and healthcare organization has tried to meet the informational need of users by building online FAQs, but there is no way for people to ask their question and know if it is answered on one of these pages. While many research efforts have focused on the problem of general question similarity, these approaches do not generalize well to domains that require expert knowledge to determine semantic similarity, such as the medical domain. In this paper, we show how a double fine-tuning approach of pretraining a neural network on medical question-answer pairs followed by fine-tuning on medical question-question pairs is a particularly useful intermediate task for the ultimate goal of determining medical question similarity. While other pretraining tasks yield an accuracy below 78.7% on this task, our model achieves an accuracy of 82.6% with the same number of training examples, an accuracy of 80.0% with a much smaller training set, and an accuracy of 84.5% when the full corpus of medical question-answer data is used. We also describe a currently live system that uses the trained model to match user questions to COVID-related FAQs.
Comment: arXiv admin note: substantial text overlap with arXiv:1910.04192
Published: 2020

17. The accuracy vs. coverage trade-off in patient-facing diagnosis models

Author: Kannan, Anitha, Fries, Jason Alan, Kramer, Eric, Chen, Jen Jen, Shah, Nigam, and Amatriain, Xavier
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: A third of adults in America use the Internet to diagnose medical concerns, and online symptom checkers are increasingly part of this process. These tools are powered by diagnosis models similar to clinical decision support systems, with the primary difference being the coverage of symptoms and diagnoses. To be useful to patients and physicians, these models must have high accuracy while covering a meaningful space of symptoms and diagnoses. To the best of our knowledge, this paper is the first in studying the trade-off between the coverage of the model and its performance for diagnosis. To this end, we learn diagnosis models with different coverage from EHR data. We find a 1% drop in top-3 accuracy for every 10 diseases added to the coverage. We also observe that complexity for these models does not affect performance, with linear models performing as well as neural networks.
Published: 2019
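
For readers unfamiliar with the top-3 accuracy figure quoted in the abstract above, the following is a minimal sketch of top-k accuracy in plain Python. The function name, the score layout (one list of per-diagnosis scores per patient), and the toy numbers in the usage note are assumptions for illustration; they are not taken from the paper.

```python
def top_k_accuracy(scores, true_labels, k=3):
    # scores[i][j] is the model's score for diagnosis j on example i;
    # true_labels[i] is the index of the correct diagnosis for example i.
    correct = 0
    for row, truth in zip(scores, true_labels):
        # Indices of all diagnoses, sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if truth in ranked[:k]:
            correct += 1
    return correct / len(true_labels)
```

With hypothetical scores `[[0.05, 0.5, 0.25, 0.2], [0.7, 0.1, 0.15, 0.05]]` and true labels `[0, 0]`, only the second example has its true diagnosis among the three highest scores. Widening the label space gives each example more competitors for those k slots, which is one way to picture the accuracy-coverage trade-off the abstract describes.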

18. Classification as Decoder: Trading Flexibility for Control in Medical Dialogue

Author: Shleifer, Sam, Chablani, Manish, Kannan, Anitha, Katariya, Namit, and Amatriain, Xavier
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Generative seq2seq dialogue systems are trained to predict the next word in dialogues that have already occurred. They can learn from large unlabeled conversation datasets, build a deeper understanding of conversational context, and generate a wide variety of responses. This flexibility comes at the cost of control, a concerning tradeoff in doctor/patient interactions. Inaccuracies, typos, or undesirable content in the training data will be reproduced by the model at inference time. We trade a small amount of labeling effort and some loss of response variety in exchange for quality control. More specifically, a pretrained language model encodes the conversational context, and we finetune a classification head to map an encoded conversational context to a response class, where each class is a noisily labeled group of interchangeable responses. Experts can update these exemplar responses over time as best practices change without retraining the classifier or invalidating old training data. Expert evaluation of 775 unseen doctor/patient conversations shows that only 12% of the discriminative model's responses are worse than what the doctor ended up writing, compared to 18% for the generative model.
Comment: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract. arXiv admin note: substantial text overlap with arXiv:1910.03476
Published: 2019

19. Domain-Relevant Embeddings for Medical Question Similarity

Author: McCreery, Clara, Katariya, Namit, Kannan, Anitha, Chablani, Manish, and Amatriain, Xavier
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language, Statistics - Machine Learning
Abstract: The rate at which medical questions are asked online far exceeds the capacity of qualified people to answer them, and many of these questions are not unique. Identifying same-question pairs could enable questions to be answered more effectively. While many research efforts have focused on the problem of general question similarity for non-medical applications, these approaches do not generalize well to the medical domain, where medical expertise is often required to determine semantic similarity. In this paper, we show how a semi-supervised approach of pre-training a neural network on medical question-answer pairs is a particularly useful intermediate task for the ultimate goal of determining medical question similarity. While other pre-training tasks yield an accuracy below 78.7% on this task, our model achieves an accuracy of 82.6% with the same number of training examples, and an accuracy of 80.0% with a much smaller training set.
Comment: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract
Published: 2019

20. Open Set Medical Diagnosis

Author: Prabhu, Viraj, Kannan, Anitha, Tso, Geoffrey J., Katariya, Namit, Chablani, Manish, Sontag, David, and Amatriain, Xavier
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Machine-learned diagnosis models have shown promise as medical aides but are trained under a closed-set assumption, i.e. that models will only encounter conditions on which they have been trained. However, it is practically infeasible to obtain sufficient training data for every human condition, and once deployed such models will invariably face previously unseen conditions. We frame machine-learned diagnosis as an open-set learning problem, and study how state-of-the-art approaches compare. Further, we extend our study to a setting where training data is distributed across several healthcare sites that do not allow data pooling, and experiment with different strategies of building open-set diagnostic ensembles. Across both settings, we observe consistent gains from explicitly modeling unseen conditions, but find the optimal training strategy to vary across settings.
Comment: Abbreviated version to appear at Machine Learning for Healthcare (ML4H) Workshop at NeurIPS 2019
Published: 2019

21. Classification As Decoder: Trading Flexibility For Control In Neural Dialogue

Author: Shleifer, Sam, Chablani, Manish, Katariya, Namit, Kannan, Anitha, and Amatriain, Xavier
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Generative seq2seq dialogue systems are trained to predict the next word in dialogues that have already occurred. They can learn from large unlabeled conversation datasets, build a deep understanding of conversational context, and generate a wide variety of responses. This flexibility comes at the cost of control. Undesirable responses in the training data will be reproduced by the model at inference time, and longer generations often don't make sense. Instead of generating responses one word at a time, we train a classifier to choose from a predefined list of full responses. The classifier is trained on (conversation context, response class) pairs, where each response class is a noisily labeled group of interchangeable responses. At inference, we generate the exemplar response associated with the predicted response class. Experts can edit and improve these exemplar responses over time without retraining the classifier or invalidating old training data. Human evaluation of 775 unseen doctor/patient conversations shows that this tradeoff improves responses. Only 12% of our discriminative approach's responses are worse than the doctor's response in the same conversational context, compared to 18% for the generative model. A discriminative model trained without any manual labeling of response classes achieves equal performance to the generative model.
Published: 2019

22. Contributors

Author: Amatriain, Xavier, primary, Balaji, Yogesh, additional, Bekiranov, Stefan, additional, Belagiannis, Vasileios, additional, Benyoussef, Anas-Alexis, additional, Carneiro, Gustavo, additional, Chablani, Manish, additional, Chen, Cheng, additional, Cho, Hyun Jae, additional, Chou, Jingyuan, additional, Cochener, Béatrice, additional, Conze, Pierre-Henri, additional, Dawoud, Youssef, additional, Do, Thanh-Toan, additional, Dou, Qi, additional, Farshad, Azade, additional, Fu, Chi-Wing, additional, Guha Roy, Abhijit, additional, Guo, Pengfei, additional, Heng, Pheng-Ann, additional, Hoang, Hieu, additional, Jiang, Shanshan, additional, Jin, Yueming, additional, Kannan, Anitha, additional, Kim, Jieum, additional, Lamard, Mathieu, additional, Le, Ngan, additional, Le Callet, Patrick, additional, Le Guilcher, Alexandre, additional, Li, Xiaomeng, additional, Ling, Suiyi, additional, Liu, Quande, additional, Massin, Pascale, additional, Matta, Sarah, additional, Mobiny, Aryan, additional, Nascimento, Jacinto C., additional, Navab, Nassir, additional, Nguyen, Cuong C., additional, Nguyen, Hien Van, additional, Pastor, Andreas, additional, Patel, Vishal M., additional, Paul, Angshuman, additional, Pölsterl, Sebastian, additional, Prabhu, Viraj, additional, Quellec, Gwenolé, additional, Ravuri, Murali, additional, Ricquebourg, Vincent, additional, Rottier, Jean-Bernard, additional, Sankaranarayanan, Swami, additional, Shen, Thomas C., additional, Siddiqui, Shayan, additional, Sontag, David, additional, Summers, Ronald M., additional, Suo, Qiuling, additional, Tang, Yu-Xing, additional, Tran, Minh-Triet, additional, Vo-Ho, Viet-Khoa, additional, Wachinger, Christian, additional, Wang, Puyang, additional, Xing, Lei, additional, Yamazaki, Kashu, additional, Yeganeh, Yousef, additional, Yu, Lequan, additional, Yuan, Pengyu, additional, Zang, Chongzhi, additional, Zhang, Aidong, additional, and Zhou, Jinyuan, additional
Published: 2023
Full Text: View/download PDF

23. Few-shot learning for dermatological disease diagnosis

Author: Prabhu, Viraj, primary, Kannan, Anitha, additional, Ravuri, Murali, additional, Chablani, Manish, additional, Sontag, David, additional, and Amatriain, Xavier, additional
Published: 2023
Full Text: View/download PDF

24. Prototypical Clustering Networks for Dermatological Disease Diagnosis

Author: Prabhu, Viraj, Kannan, Anitha, Ravuri, Murali, Chablani, Manish, Sontag, David, and Amatriain, Xavier
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: We consider the problem of image classification for the purpose of aiding doctors in dermatological diagnosis. Dermatological diagnosis poses two major challenges for standard off-the-shelf techniques: First, the data distribution is typically extremely long tailed. Second, intra-class variability is often large. To address the first issue, we formulate the problem as low-shot learning, where once deployed, a base classifier must rapidly generalize to diagnose novel conditions given very few labeled examples. To model diverse classes effectively, we propose Prototypical Clustering Networks (PCN), an extension to Prototypical Networks that learns a mixture of prototypes for each class. Prototypes are initialized for each class via clustering and refined via an online update scheme. Classification is performed by measuring similarity to a weighted combination of prototypes within a class, where the weights are the inferred cluster responsibilities. We demonstrate the strengths of our approach in effective diagnosis on a realistic dataset of dermatological conditions.
Published: 2018

25. Learn from Your Neighbor: Learning Multi-modal Mappings from Sparse Annotations

Author: Kalyan, Ashwin, Lee, Stefan, Kannan, Anitha, and Batra, Dhruv
Subjects: Statistics - Machine Learning, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Learning
Abstract: Many structured prediction problems (particularly in vision and language domains) are ambiguous, with multiple outputs being correct for an input - e.g. there are many ways of describing an image, multiple ways of translating a sentence; however, exhaustively annotating the applicability of all possible outputs is intractable due to exponentially large output spaces (e.g. all English sentences). In practice, these problems are cast as multi-class prediction, with the likelihood of only a sparse set of annotations being maximized - unfortunately penalizing for placing beliefs on plausible but unannotated outputs. We make and test the following hypothesis - for a given input, the annotations of its neighbors may serve as an additional supervisory signal. Specifically, we propose an objective that transfers supervision from neighboring examples. We first study the properties of our developed method in a controlled toy setup before reporting results on multi-label classification and two image-grounded sequence modeling tasks - captioning and question generation. We evaluate using standard task-specific metrics and measures of output diversity, finding consistent improvements over standard maximum likelihood training and other baselines.
Comment: To be presented at ICML 2018; 10 pages, 5 figures
Published: 2018

26. Learning from the experts: From expert systems to machine-learned diagnosis models

Author: Ravuri, Murali, Kannan, Anitha, Tso, Geoffrey J., and Amatriain, Xavier
Subjects: Computer Science - Artificial Intelligence
Abstract: Expert diagnostic support systems have been extensively studied. The practical applications of these systems in real-world scenarios have been somewhat limited due to well-understood shortcomings, such as lack of extensibility. More recently, machine-learned models for medical diagnosis have gained momentum, since they can learn and generalize patterns found in very large datasets like electronic health records. These models also have shortcomings - in particular, there is no easy way to incorporate prior knowledge from existing literature or experts. In this paper, we present a method to merge both approaches by using expert systems as generative models that create simulated data on which models can be learned. We demonstrate that such a learned model not only preserves the original properties of the expert systems but also addresses some of their limitations. Furthermore, we show how this approach can also be used as the starting point to combine expert knowledge with knowledge extracted from other data sources, such as electronic health records.
Published: 2018

27. Tackling Over-pruning in Variational Autoencoders

Author: Yeung, Serena, Kannan, Anitha, Dauphin, Yann, and Fei-Fei, Li
Subjects: Computer Science - Learning
Abstract: Variational autoencoders (VAE) are directed generative models that learn factorial latent variables. As noted by Burda et al. (2015), these models exhibit the problem of factor over-pruning where a significant number of stochastic factors fail to learn anything and become inactive. This can limit their modeling power and their ability to learn diverse and meaningful latent representations. In this paper, we evaluate several methods to address this problem and propose a more effective model-based approach called the epitomic variational autoencoder (eVAE). The so-called epitomes of this model are groups of mutually exclusive latent factors that compete to explain the data. This approach helps prevent inactive units since each group is pressured to explain the data. We compare the approaches with qualitative and quantitative results on MNIST and TFD datasets. Our results show that eVAE makes efficient use of model capacity and generalizes better than VAE.
Published: 2017

28. Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model

Author: Lu, Jiasen, Kannan, Anitha, Yang, Jianwei, Parikh, Devi, and Batra, Dhruv
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: We present a novel training framework for neural sequence models, particularly for grounded dialog generation. The standard training paradigm for these models is maximum likelihood estimation (MLE), or minimizing the cross-entropy of the human responses. Across a variety of domains, a recurring problem with MLE trained generative neural dialog models (G) is that they tend to produce 'safe' and generic responses ("I don't know", "I can't tell"). In contrast, discriminative dialog models (D) that are trained to rank a list of candidate human responses outperform their generative counterparts in terms of automatic metrics, diversity, and informativeness of the responses. However, D is not useful in practice since it cannot be deployed to have real conversations with users. Our work aims to achieve the best of both worlds -- the practical usefulness of G and the strong performance of D -- via knowledge transfer from D to G. Our primary contribution is an end-to-end trainable generative visual dialog model, where G receives gradients from D as a perceptual (not adversarial) loss of the sequence sampled from G. We leverage the recently proposed Gumbel-Softmax (GS) approximation to the discrete distribution -- specifically, an RNN augmented with a sequence of GS samplers, coupled with the straight-through gradient estimator to enable end-to-end differentiability. We also introduce a stronger encoder for visual dialog, and employ a self-attention mechanism for answer encoding along with a metric learning loss to aid D in better capturing semantic similarities in answer responses. Overall, our proposed model outperforms state-of-the-art on the VisDial dataset by a significant margin (2.67% on recall@10). The source code can be downloaded from https://github.com/jiasenlu/visDial.pytorch.
Comment: 11 pages, 3 figures
Published: 2017

29. LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation

Author: Yang, Jianwei, Kannan, Anitha, Batra, Dhruv, and Parikh, Devi
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Learning
Abstract: We present LR-GAN: an adversarial image generation model which takes scene structure and context into account. Unlike previous generative adversarial networks (GANs), the proposed GAN learns to generate image background and foregrounds separately and recursively, and stitch the foregrounds on the background in a contextually relevant manner to produce a complete natural image. For each foreground, the model learns to generate its appearance, shape and pose. The whole model is unsupervised, and is trained in an end-to-end manner with gradient descent methods. The experiments demonstrate that LR-GAN can generate more natural images with objects that are more human recognizable than DCGAN.
Comment: 21 pages, 22 figures, published as a conference paper at ICLR 2017, code available on GitHub
Published: 2017

30. Transformation-Based Models of Video Sequences

Author: van Amersfoort, Joost, Kannan, Anitha, Ranzato, Marc'Aurelio, Szlam, Arthur, Tran, Du, and Chintala, Soumith
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: In this work we propose a simple unsupervised approach for next frame prediction in video. Instead of directly predicting the pixels in a frame given past frames, we predict the transformations needed for generating the next frame in a sequence, given the transformations of the past frames. This leads to sharper results, while using a smaller prediction model. In order to enable a fair comparison between different video frame prediction models, we also propose a new evaluation protocol. We use generated frames as input to a classifier trained with ground truth sequences. This criterion guarantees that models scoring high are those producing sequences which preserve discriminative features, as opposed to merely penalizing any deviation, plausible or not, from the ground truth. Our proposed approach compares favourably against more sophisticated ones on the UCF-101 data set, while also being more efficient in terms of the number of parameters and computational cost.
Published: 2017

31. Detecting diabetic retinopathy using a hybrid ensemble XL machine model with dual weighted-Kernel ELM and improved mayfly optimization

Author: Kannan, Anitha, primary, Palanivel, ShanmugaPrabha, additional, Karthikeyan, SashiRekha, additional, Mholds, VigilsonPrem, additional, and Joseph, JeganAmarnath, additional
Published: 2024
Full Text: View/download PDF

32. Study Navigator: An Algorithmically Generated Aid for Learning from Electronic Textbooks

Author: Agrawal, Rakesh, Gollapudi, Sreenivas, Kannan, Anitha, and Kenthapadi, Krishnaram
Abstract: We present "study navigator," an algorithmically-generated aid for enhancing the experience of studying from electronic textbooks. The study navigator for a section of the book consists of helpful "concept references" for understanding this section. Each concept reference is a pair consisting of a concept phrase explained elsewhere and the link to the section in which it has been explained. We propose a novel reader model for textbooks and an algorithm for generating the study navigator based on this model. We also present an extension of the study navigator specialized to accommodate information processing preference of the student. Specifically, this specialization allows a student to control the balance between references to sections that help refresh material already studied vs. sections that provide more advanced information. We also present two user studies that demonstrate the efficacy of the proposed system across textbooks on different subjects from different grades.
Published: 2014

33. Structured Query Reformulations in Commerce Search

Author: Gollapudi, Sreenivas, Ieong, Samuel, and Kannan, Anitha
Subjects: Computer Science - Information Retrieval, Computer Science - Databases
Abstract: Recent work in commerce search has shown that understanding the semantics in user queries enables more effective query analysis and retrieval of relevant products. However, due to lack of sufficient domain knowledge, user queries often include terms that cannot be mapped directly to any product attribute. For example, a user looking for "designer handbags" might start with such a query because she is not familiar with the manufacturers, the price ranges, and/or the material that gives a handbag designer appeal. Current commerce search engines treat terms such as "designer" as keywords and attempt to match them to contents such as product reviews and product descriptions, often resulting in poor user experience. In this study, we propose to address this problem by reformulating queries involving terms such as "designer", which we call modifiers, to queries that specify precise product attributes. We learn to rewrite the modifiers to attribute values by analyzing user behavior and leveraging structured data sources such as the product catalog that serves the queries. We first produce a probabilistic mapping between the modifiers and attribute values based on user behavioral data. These initial associations are then used to retrieve products from the catalog, over which we infer sets of attribute values that best describe the semantics of the modifiers. We evaluate the effectiveness of our approach based on a comprehensive Mechanical Turk study. We find that users agree with the attribute values selected by our approach in about 95% of the cases and they prefer the results surfaced for our reformulated queries to ones for the original queries 87% of the time.
Comment: A shorter version appeared in CIKM 2012
Published: 2012

34. Mining Videos from the Web for Electronic Textbooks

Author: Agrawal, Rakesh, Christoforaki, Maria, Gollapudi, Sreenivas, Kannan, Anitha, Kenthapadi, Krishnaram, Swaminathan, Adith, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Kobsa, Alfred, editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Weikum, Gerhard, editor, Goebel, Randy, editor, Tanaka, Yuzuru, editor, Wahlster, Wolfgang, editor, Siekmann, Jörg, editor, Glodeanu, Cynthia Vera, editor, Kaytoue, Mehdi, editor, and Sacarea, Christian, editor
Published: 2014
Full Text: View/download PDF

35. An efficient novel framework for determining accuracy to retrieve thesaurus for information classification by implementing SVM classifier in comparison with KNN classifier

Author: Pepakayala, Sri Devi Satya Dileep, primary and Kannan, Anitha, additional
Published: 2023
Full Text: View/download PDF

36. Injecting knowledge into language generation: a case study in auto-charting after-visit care instructions from medical dialogue

Author: Eremeev, Maksim, primary, Valmianski, Ilya, additional, Amatriain, Xavier, additional, and Kannan, Anitha, additional
Published: 2023
Full Text: View/download PDF

37. Classification and analysis of customer data using a novel criterion based random forest algorithm to improve retention rate over SVM algorithm in terms of prediction rate

Author: Pepakayala, Sai Surya, primary and Kannan, Anitha, additional
Published: 2023
Full Text: View/download PDF

38. Generating medically-accurate summaries of patient-provider dialogue: A multi-stage approach using large language models

Author: Nair, Varun, primary, Schumacher, Elliot, additional, and Kannan, Anitha, additional
Published: 2023
Full Text: View/download PDF

39. Evaluation of Block Allograft Efficacy in Lateral Alveolar Ridge Augmentation

Author: Balaji, Thodur Madapusi, primary, Jagannathan, Raghunathan, additional, Bose, Bhuvaneswari Birla, additional, Natarajan, Prabhu Manickam, additional, Kannan, Anitha Logaranjani, additional, and Jebaraj, Juala Catherine, additional
Published: 2022
Full Text: View/download PDF

40. Chapter 13 - Few-shot learning for dermatological disease diagnosis

Author: Prabhu, Viraj, Kannan, Anitha, Ravuri, Murali, Chablani, Manish, Sontag, David, and Amatriain, Xavier
Published: 2023
Full Text: View/download PDF

41. Accounting for Non-genetic Factors Improves the Power of eQTL Studies

Author: Stegle, Oliver, Kannan, Anitha, Durbin, Richard, Winn, John, Istrail, Sorin, editor, Pevzner, Pavel, editor, Waterman, Michael S., editor, Vingron, Martin, editor, and Wong, Limsoon, editor
Published: 2008
Full Text: View/download PDF

42. A Bayesian Model That Links Microarray mRNA Measurements to Mass Spectrometry Protein Measurements

Author: Kannan, Anitha, Emili, Andrew, Frey, Brendan J., Istrail, Sorin, editor, Pevzner, Pavel, editor, Waterman, Michael S., editor, Speed, Terry, editor, and Huang, Haiyan, editor
Published: 2007
Full Text: View/download PDF

43. A Generative Model of Dense Optical Flow in Layers

Author: Kannan, Anitha, Frey, Brendan, Jojic, Nebojsa, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, and MacLean, W. James, editor
Published: 2006
Full Text: View/download PDF

44. People Research Data Commons Feedback from Consultations

Author: Kannan, Anitha
Subjects: health research data, people research data commons, australian research data commons, health research data commons, digital research infrastructure
Abstract: This is a report on the findings of consultations held in 2022 with the Australian health research community on the People Research Data Commons (People RDC). The People RDC is a new Thematic Research Data Commons, a model that sees the ARDC bring together its expertise, services and capabilities, and work closely in partnership with the research community on longer term, large-scale programs to create enduring digital infrastructure. The focus of the initial consultations was on “What the People RDC needs to deliver” and “Why the challenges addressed by the People RDC would be transformative for research.” The next stage of the People RDC consultation, design and development will include NCRIS facilities, data custodians and research infrastructure providers. The discussions will cover the question of “How can the People RDC deliver infrastructure to address the identified challenges by building on existing capabilities, both within the ARDC and across the sector?”
Published: 2022
Full Text: View/download PDF

45. Mining Videos from the Web for Electronic Textbooks

Author: Agrawal, Rakesh, primary, Christoforaki, Maria, additional, Gollapudi, Sreenivas, additional, Kannan, Anitha, additional, Kenthapadi, Krishnaram, additional, and Swaminathan, Adith, additional
Published: 2014
Full Text: View/download PDF

46. Electronic Textbooks and Data Mining

Author: Agrawal, Rakesh, Gollapudi, Sreenivas, Kannan, Anitha, Kenthapadi, Krishnaram, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Gao, Hong, editor, Lim, Lipyeow, editor, Wang, Wei, editor, Li, Chuan, editor, and Chen, Lei, editor
Published: 2012
Full Text: View/download PDF

47. Enriching Education through Data Mining

Author: Agrawal, Rakesh, Gollapudi, Sreenivas, Kannan, Anitha, Kenthapadi, Krishnaram, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Kuznetsov, Sergei O., editor, Mandal, Deba P., editor, Kundu, Malay K., editor, and Pal, Sankar K., editor
Published: 2011
Full Text: View/download PDF

48. Fast Transformation-Invariant Component Analysis

Author: Kannan, Anitha, Jojic, Nebojsa, and Frey, Brendan J.
Published: 2008
Full Text: View/download PDF

49. Effective Transfer Learning for Identifying Similar Questions

Author: McCreery, Clara H., Katariya, Namit, Kannan, Anitha, Chablani, Manish, and Amatriain, Xavier
Published: 2020
Full Text: View/download PDF

50. Holistic Approach to Data Linkage at Monash University

Author: Andrew, Nadine, primary, Mac Manus, Chris, additional, Padmanabhan, Komathy, additional, Lucas, Mark, additional, and Kannan, Anitha, additional
Published: 2020
Full Text: View/download PDF
