Author: "Chen, Shan" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Chen, Shan"' showing total 7,375 results

1. ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?

Author: Chen, Canyu, Yu, Jian, Chen, Shan, Liu, Che, Wan, Zhongwei, Bitterman, Danielle, Wang, Fei, and Shu, Kai
Subjects: Computer Science - Computation and Language
Abstract: Large Language Models (LLMs) hold great promise to revolutionize current clinical systems for their superior capacities on medical text processing tasks and medical licensing exams. Meanwhile, traditional ML models such as SVM and XGBoost have still been mainly adopted in clinical prediction tasks. An emerging question is: can LLMs beat traditional ML models in clinical prediction? Thus, we build a new benchmark ClinicalBench to comprehensively study the clinical predictive modeling capacities of both general-purpose and medical LLMs, and compare them with traditional ML models. ClinicalBench embraces three common clinical prediction tasks, two databases, 14 general-purpose LLMs, 8 medical LLMs, and 11 traditional ML models. Through extensive empirical investigation, we discover that both general-purpose and medical LLMs, even with different model scales, diverse prompting or fine-tuning strategies, still cannot beat traditional ML models in clinical prediction yet, shedding light on their potential deficiency in clinical reasoning and decision-making. We call for caution when practitioners adopt LLMs in clinical applications. ClinicalBench can be utilized to bridge the gap between LLMs' development for healthcare and real-world clinical practice.
Comment: The first two authors contributed equally. 10 pages for main paper, 66 pages including appendix. Project website: https://clinicalbench.github.io
Published: 2024

2. Position Paper On Diagnostic Uncertainty Estimation from Large Language Models: Next-Word Probability Is Not Pre-test Probability

Author: Gao, Yanjun, Myers, Skatje, Chen, Shan, Dligach, Dmitriy, Miller, Timothy A, Bitterman, Danielle, Chen, Guanhua, Mayampurath, Anoop, Churpek, Matthew, and Afshar, Majid
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Large language models (LLMs) are being explored for diagnostic decision support, yet their ability to estimate pre-test probabilities, vital for clinical decision-making, remains limited. This study evaluates two LLMs, Mistral-7B and Llama3-70B, using structured electronic health record data on three diagnosis tasks. We examined three current methods of extracting LLM probability estimations and revealed their limitations. We aim to highlight the need for improved techniques in LLM confidence estimation.
Comment: Accepted to GenAI4Health Workshop at NeurIPS 2024
Published: 2024

3. Mapping Bias in Vision Language Models: Signposts, Pitfalls, and the Road Ahead

Author: Sasse, Kuleen, Chen, Shan, Pond, Jackson, Bitterman, Danielle, and Osborne, John
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: As Vision Language Models (VLMs) gain widespread use, their fairness remains under-explored. In this paper, we analyze demographic biases across five models and six datasets. We find that portrait datasets like UTKFace and CelebA are the best tools for bias detection, finding gaps in performance and fairness between LLaVa and CLIP models. However, scene-based datasets like PATA and VLStereoSet fail to be useful benchmarks for bias due to their construction. As for pronoun-based datasets like VisoGender, we receive mixed signals as only some subsets of the data are useful in providing insights. To alleviate this problem, we introduce a more difficult version of VisoGender to serve as a more rigorous evaluation. Based on these results, we call for more effective and carefully designed datasets to ensure VLMs are both fair and reliable.
Comment: Under Review at NAACL 2025
Published: 2024

4. WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation

Author: Matos, João, Chen, Shan, Placino, Siena, Li, Yingya, Pardo, Juan Carlos Climent, Idan, Daphna, Tohyama, Takeshi, Restrepo, David, Nakayama, Luis F., Pascual-Leone, Jose M. M., Savova, Guergana, Aerts, Hugo, Celi, Leo A., Wong, A. Ian, Bitterman, Danielle S., and Gallifant, Jack
Subjects: Computer Science - Computation and Language
Abstract: Multimodal/vision language models (VLMs) are increasingly being deployed in healthcare settings worldwide, necessitating robust benchmarks to ensure their safety, efficacy, and fairness. Multiple-choice question and answer (QA) datasets derived from national medical examinations have long served as valuable evaluation tools, but existing datasets are largely text-only and available in a limited subset of languages and countries. To address these challenges, we present WorldMedQA-V, an updated multilingual, multimodal benchmarking dataset designed to evaluate VLMs in healthcare. WorldMedQA-V includes 568 labeled multiple-choice QAs paired with 568 medical images from four countries (Brazil, Israel, Japan, and Spain), covering original languages and validated English translations by native clinicians, respectively. Baseline performance for common open- and closed-source models is provided in the local language and English translations, and with and without images provided to the model. The WorldMedQA-V benchmark aims to better match AI systems to the diverse healthcare environments in which they are deployed, fostering more equitable, effective, and representative applications.
Comment: submitted for review, total of 14 pages
Published: 2024

5. Wait, but Tylenol is Acetaminophen... Investigating and Improving Language Models' Ability to Resist Requests for Misinformation

Author: Chen, Shan, Gao, Mingye, Sasse, Kuleen, Hartvigsen, Thomas, Anthony, Brian, Fan, Lizhou, Aerts, Hugo, Gallifant, Jack, and Bitterman, Danielle
Subjects: Computer Science - Computation and Language
Abstract: Background: Large language models (LLMs) are trained to follow directions, but this introduces a vulnerability to blindly comply with user requests even if they generate wrong information. In medicine, this could accelerate the generation of misinformation that impacts human well-being. Objectives/Methods: We analyzed compliance to requests to generate misleading content about medications in settings where models know the request is illogical. We investigated whether in-context directions and instruction-tuning of LLMs to prioritize logical reasoning over compliance reduced misinformation risk. Results: While all frontier LLMs complied with misinformation requests, both prompt-based and parameter-based approaches can improve the detection of logic flaws in requests and prevent the dissemination of medical misinformation. Conclusion: Shifting LLMs to prioritize logic over compliance could reduce risks of exploitation for medical misinformation.
Comment: Submitted for Review
Published: 2024

6. AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow

Author: Yu, Huizi, Zhou, Jiayan, Li, Lingyao, Chen, Shan, Gallifant, Jack, Shi, Anye, Li, Xiang, Hua, Wenyue, Jin, Mingyu, Chen, Guang, Zhou, Yang, Li, Zhao, Gupte, Trisha, Chen, Ming-Li, Azizi, Zahra, Zhang, Yongfeng, Assimes, Themistocles L., Ma, Xin, Bitterman, Danielle S., Lu, Lin, and Fan, Lizhou
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Simulated patient systems play a crucial role in modern medical education and research, providing safe, integrative learning environments and enabling clinical decision-making simulations. Large Language Models (LLMs) could advance simulated patient systems by replicating medical conditions and patient-doctor interactions with high fidelity and low cost. However, ensuring the effectiveness and trustworthiness of these systems remains a challenge, as they require a large, diverse, and precise patient knowledgebase, along with a robust and stable knowledge diffusion to users. Here, we developed AIPatient, an advanced simulated patient system with AIPatient Knowledge Graph (AIPatient KG) as the input and the Reasoning Retrieval-Augmented Generation (Reasoning RAG) agentic workflow as the generation backbone. AIPatient KG samples data from Electronic Health Records (EHRs) in the Medical Information Mart for Intensive Care (MIMIC)-III database, producing a clinically diverse and relevant cohort of 1,495 patients with high knowledgebase validity (F1 0.89). Reasoning RAG leverages six LLM-powered agents spanning tasks including retrieval, KG query generation, abstraction, checker, rewrite, and summarization. This agentic framework reaches an overall accuracy of 94.15% in EHR-based medical Question Answering (QA), outperforming benchmarks that use either no agent or only partial agent integration. Our system also presents high readability (median Flesch Reading Ease 77.23; median Flesch Kincaid Grade 5.6), robustness (ANOVA F-value 0.6126, p>0.1), and stability (ANOVA F-value 0.782, p>0.1). The promising performance of the AIPatient system highlights its potential to support a wide range of applications, including medical education, model evaluation, and system integration.
Comment: 42 pages, 6 figures, 7 tables
Published: 2024

7. Dense Point Clouds Matter: Dust-GS for Scene Reconstruction from Sparse Viewpoints

Author: Chen, Shan, Zhou, Jiale, and Li, Lei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: 3D Gaussian Splatting (3DGS) has demonstrated remarkable performance in scene synthesis and novel view synthesis tasks. Typically, the initialization of 3D Gaussian primitives relies on point clouds derived from Structure-from-Motion (SfM) methods. However, in scenarios requiring scene reconstruction from sparse viewpoints, the effectiveness of 3DGS is significantly constrained by the quality of these initial point clouds and the limited number of input images. In this study, we present Dust-GS, a novel framework specifically designed to overcome the limitations of 3DGS in sparse viewpoint conditions. Instead of relying solely on SfM, Dust-GS introduces an innovative point cloud initialization technique that remains effective even with sparse input data. Our approach leverages a hybrid strategy that integrates an adaptive depth-based masking technique, thereby enhancing the accuracy and detail of reconstructed scenes. Extensive experiments conducted on several benchmark datasets demonstrate that Dust-GS surpasses traditional 3DGS methods in scenarios with sparse viewpoints, achieving superior scene reconstruction quality with a reduced number of input images.
Published: 2024

8. Recurrent evolution and selection shape structural diversity at the amylase locus.

Author: Bolognini, Davide, Halgren, Alma, Lou, Runyang, Raveane, Alessandro, Rocha, Joana, Guarracino, Andrea, Soranzo, Nicole, Chin, Chen-Shan, Garrison, Erik, and Sudmant, Peter
Subjects: Humans, Agriculture, Amylases, Evolution, Molecular, Gene Dosage, Gene Duplication, Genetic Loci, Genome, Human, Haplotypes, History, Ancient, Mutation Rate, Polymorphism, Single Nucleotide, Selection, Genetic, Hunting, Gene Deletion, DNA, Ancient
Abstract: The adoption of agriculture triggered a rapid shift towards starch-rich diets in human populations. Amylase genes facilitate starch digestion, and increased amylase copy number has been observed in some modern human populations with high-starch intake, although evidence of recent selection is lacking. Here, using 94 long-read haplotype-resolved assemblies and short-read data from approximately 5,600 contemporary and ancient humans, we resolve the diversity and evolutionary history of structural variation at the amylase locus. We find that amylase genes have higher copy numbers in agricultural populations than in fishing, hunting and pastoral populations. We identify 28 distinct amylase structural architectures and demonstrate that nearly identical structures have arisen recurrently on different haplotype backgrounds throughout recent human history. AMY1 and AMY2A genes each underwent multiple duplication/deletion events with mutation rates up to more than 10,000-fold the single-nucleotide polymorphism mutation rate, whereas AMY2B gene duplications share a single origin. Using a pangenome-based approach, we infer structural haplotypes across thousands of humans identifying extensively duplicated haplotypes at higher frequency in modern agricultural populations. Leveraging 533 ancient human genomes, we find that duplication-containing haplotypes (with more gene copies than the ancestral haplotype) have rapidly increased in frequency over the past 12,000 years in West Eurasians, suggestive of positive selection. Together, our study highlights the potential effects of the agricultural revolution on human genomes and the importance of structural variation in human adaptation.
Published: 2024

9. When Raw Data Prevails: Are Large Language Model Embeddings Effective in Numerical Data Representation for Medical Machine Learning Applications?

Author: Gao, Yanjun, Myers, Skatje, Chen, Shan, Dligach, Dmitriy, Miller, Timothy A, Bitterman, Danielle, Churpek, Matthew, and Afshar, Majid
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: The introduction of Large Language Models (LLMs) has advanced data representation and analysis, bringing significant progress in their use for medical questions and answering. Despite these advancements, integrating tabular data, especially numerical data pivotal in clinical contexts, into LLM paradigms has not been thoroughly explored. In this study, we examine the effectiveness of vector representations from last hidden states of LLMs for medical diagnostics and prognostics using electronic health record (EHR) data. We compare the performance of these embeddings with that of raw numerical EHR data when used as feature inputs to traditional machine learning (ML) algorithms that excel at tabular data learning, such as eXtreme Gradient Boosting. We focus on instruction-tuned LLMs in a zero-shot setting to represent abnormal physiological data and evaluate their utility as feature extractors to enhance ML classifiers for predicting diagnoses, length of stay, and mortality. Furthermore, we examine prompt engineering techniques on zero-shot and few-shot LLM embeddings to measure their impact comprehensively. Although findings suggest that raw data features still prevail in medical ML tasks, zero-shot LLM embeddings demonstrate competitive results, suggesting a promising avenue for future research in medical applications.
Comment: Accepted to Findings of EMNLP 2024
Published: 2024

10. Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks

Author: Gallifant, Jack, Chen, Shan, Moreira, Pedro, Munch, Nikolaj, Gao, Mingye, Pond, Jackson, Celi, Leo Anthony, Aerts, Hugo, Hartvigsen, Thomas, and Bitterman, Danielle
Subjects: Computer Science - Computation and Language
Abstract: Medical knowledge is context-dependent and requires consistent reasoning across various natural language expressions of semantically equivalent phrases. This is particularly crucial for drug names, where patients often use brand names like Advil or Tylenol instead of their generic equivalents. To study this, we create a new robustness dataset, RABBITS, to evaluate performance differences on medical benchmarks after swapping brand and generic drug names using physician expert annotations. We assess both open-source and API-based LLMs on MedQA and MedMCQA, revealing a consistent performance drop ranging from 1-10%. Furthermore, we identify a potential source of this fragility as the contamination of test data in widely used pre-training datasets. All code is accessible at https://github.com/BittermanLab/RABBITS, and a HuggingFace leaderboard is available at https://huggingface.co/spaces/AIM-Harvard/rabbits-leaderboard.
Comment: submitted for review, total 15 pages
Published: 2024

11. Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias

Author: Chen, Shan, Gallifant, Jack, Gao, Mingye, Moreira, Pedro, Munch, Nikolaj, Muthukkumar, Ajay, Rajan, Arvind, Kolluri, Jaya, Fiske, Amelia, Hastings, Janna, Aerts, Hugo, Anthony, Brian, Celi, Leo Anthony, La Cava, William G., and Bitterman, Danielle S.
Subjects: Computer Science - Computation and Language
Abstract: Large language models (LLMs) are increasingly essential in processing natural languages, yet their application is frequently compromised by biases and inaccuracies originating in their training data. In this study, we introduce Cross-Care, the first benchmark framework dedicated to assessing biases and real world knowledge in LLMs, specifically focusing on the representation of disease prevalence across diverse demographic groups. We systematically evaluate how demographic biases embedded in pre-training corpora like The Pile influence the outputs of LLMs. We expose and quantify discrepancies by juxtaposing these biases against actual disease prevalences in various U.S. demographic groups. Our results highlight substantial misalignment between LLM representation of disease prevalence and real disease prevalence rates across demographic subgroups, indicating a pronounced risk of bias propagation and a lack of real-world grounding for medical applications of LLMs. Furthermore, we observe that various alignment methods minimally resolve inconsistencies in the models' representation of disease prevalence across different languages. For further exploration and analysis, we make all data and a data visualization tool available at: www.crosscare.net.
Comment: Submitted for review, data visualization tool available at: www.crosscare.net
Published: 2024

12. Improving Clinical NLP Performance through Language Model-Generated Synthetic Clinical Data

Author: Chen, Shan, Gallifant, Jack, Guevara, Marco, Gao, Yanjun, Afshar, Majid, Miller, Timothy, Dligach, Dmitriy, and Bitterman, Danielle S.
Subjects: Computer Science - Computation and Language
Abstract: Generative models have shown potential for producing data en masse. This study explores the enhancement of clinical natural language processing performance by utilizing synthetic data generated from advanced language models. Promising results show feasible applications in such a high-stakes domain.
Comment: submitted to review
Published: 2024

13. Uniform vorticity depletion and inviscid damping for periodic shear flows in the high Reynolds number regime

Author: Beekie, Rajendra, Chen, Shan, and Jia, Hao
Subjects: Mathematics - Analysis of PDEs
Abstract: We study the dynamics of the two dimensional Navier-Stokes equations linearized around a shear flow on a (non-square) torus which possesses exactly two non-degenerate critical points. We obtain linear inviscid damping and vorticity depletion estimates for the linearized flow that are uniform with respect to the viscosity, and enhanced dissipation type decay estimates. The main task is to understand the associated Rayleigh and Orr-Sommerfeld equations, under the natural assumption that the linearized operator around the shear flow in the inviscid case has no discrete eigenvalues. The key difficulty is to understand the behavior of the solution to Orr-Sommerfeld equations in three distinct regimes depending on the spectral parameter: the non-degenerate case when the spectral parameter is away from the critical values, the intermediate case when the spectral parameter is close to but still separated from the critical values, and the most singular case when the spectral parameter is inside the viscous layer.
Comment: 70 pages; comments welcome; Several typos and small technical glitches fixed
Published: 2024

14. Research on a subway fire evacuation model based on system dynamics

Author: Lin, Xiaofei, Chen, Shan, Ji, Baojing, Fan, Haiyan, Pan, Ziwei, and Zhai, Huaiyuan
Published: 2024
Full Text: View/download PDF

15. On ceremonial paintings by the Yao people (瑤族) and their acculturation from Taoist Shuilu paintings (道教水陆画)

Author: Chen, Shan and Zhuang, Peina
Published: 2024
Full Text: View/download PDF

16. Analysis and validation of theoretical equations for a seismic isolation system with a multi-level friction damper

Author: Chang Chien, Chia-Shang, Lu, Lyan-Ywan, Chen, Shan-Ru, and Guo, Mei-Ting
Published: 2024
Full Text: View/download PDF

17. Psychological Resilience Mediates the Association Between Childhood Maltreatment and Self-Harm Phenotype in Chinese Early Adolescents

Author: Li, Yuan, Li, Yong-Han, He, Yang, Chen, Shan-Shan, Chang, Jun-Jie, Yuan, Meng-Yuan, Cao, Lei-Lei, Wang, Shao-Jie, Wang, Geng-Fu, and Su, Pu-Yu
Published: 2024
Full Text: View/download PDF

18. Effects of C content and tempering temperature on impact-abrasive wear resistance of high-C martensitic steel

Author: Liu, Tian-long, Zhang, Xin-yue, Cui, Xiao-bo, Chen, Shan-shan, Sun, Xiao-yan, Long, Jun, and Zheng, Zhi-bin
Published: 2024
Full Text: View/download PDF

19. The complete sequence and comparative analysis of ape sex chromosomes

Author: Makova, Kateryna D., Pickett, Brandon D., Harris, Robert S., Hartley, Gabrielle A., Cechova, Monika, Pal, Karol, Nurk, Sergey, Yoo, DongAhn, Li, Qiuhui, Hebbar, Prajna, McGrath, Barbara C., Antonacci, Francesca, Aubel, Margaux, Biddanda, Arjun, Borchers, Matthew, Bornberg-Bauer, Erich, Bouffard, Gerard G., Brooks, Shelise Y., Carbone, Lucia, Carrel, Laura, Carroll, Andrew, Chang, Pi-Chuan, Chin, Chen-Shan, Cook, Daniel E., Craig, Sarah J. C., de Gennaro, Luciana, Diekhans, Mark, Dutra, Amalia, Garcia, Gage H., Grady, Patrick G. S., Green, Richard E., Haddad, Diana, Hallast, Pille, Harvey, William T., Hickey, Glenn, Hillis, David A., Hoyt, Savannah J., Jeong, Hyeonsoo, Kamali, Kaivan, Pond, Sergei L. Kosakovsky, LaPolice, Troy M., Lee, Charles, Lewis, Alexandra P., Loh, Yong-Hwee E., Masterson, Patrick, McGarvey, Kelly M., McCoy, Rajiv C., Medvedev, Paul, Miga, Karen H., Munson, Katherine M., Pak, Evgenia, Paten, Benedict, Pinto, Brendan J., Potapova, Tamara, Rhie, Arang, Rocha, Joana L., Ryabov, Fedor, Ryder, Oliver A., Sacco, Samuel, Shafin, Kishwar, Shepelev, Valery A., Slon, Viviane, Solar, Steven J., Storer, Jessica M., Sudmant, Peter H., Sweetalana, Sweeten, Alex, Tassia, Michael G., Thibaud-Nissen, Françoise, Ventura, Mario, Wilson, Melissa A., Young, Alice C., Zeng, Huiqing, Zhang, Xinru, Szpiech, Zachary A., Huber, Christian D., Gerton, Jennifer L., Yi, Soojin V., Schatz, Michael C., Alexandrov, Ivan A., Koren, Sergey, O’Neill, Rachel J., Eichler, Evan E., and Phillippy, Adam M.
Published: 2024
Full Text: View/download PDF

20. Low NDRG2, regulated by the MYC/MIZ-1 complex and methylation, predicts poor outcomes in DLBCL patients

Author: Wu, Shuang, Zhang, Jie, Chen, Shan, Zhou, Xinyi, Liu, Yankui, Hua, Haiying, Qi, Xiaowei, Mao, Yong, Young, Ken H., and Lu, Tingxun
Published: 2024
Full Text: View/download PDF

21. Comorbid asthma is associated with rhinitis severity in children exposed to air pollutants

Author: Ho, Sai-Wai, Lue, Ko-Huang, Chen, Shan-Ming, and Ku, Min-Sho
Published: 2024
Full Text: View/download PDF

22. The effect of college students’ school sports policy attitudes on physical quality: the mediating role of physical exercise and gender difference

Author: Song, Di, Chen, Shan‑ping, Shang, Yao, Xie, Li‑Jun, Liu, Li-ping, and Zhang, Zhong-jiang
Published: 2024
Full Text: View/download PDF

23. The Effect of Physical Exercise on Subjective Well-Being in Chinese Middle School Students: The Mediation Roles of Peer Relationships and Self-Actualization

Author: Shang, Yao, Chen, Shan-Ping, and Xie, Hao-Dong
Published: 2024
Full Text: View/download PDF

24. Enhanced mechanical squeezing in an optomechanical system via backward stimulated Brillouin scattering

Author: Chen, Shan-Shan, Zhang, Na-Na, Guo, Yong-Rui, Yang, Huan, and Ma, Yong
Subjects: Quantum Physics
Abstract: We investigate theoretically the enhancement of mechanical squeezing in a multimode optomechanical system by introducing a coherent phonon-photon interaction via the backward stimulated Brillouin scattering (BSBS) process. The coherent photon-phonon interaction, where two optical modes couple to a Brillouin acoustic mode with a large decay rate, provides an extra channel for the cooling of a Duffing mechanical oscillator. The squeezing degree and the robustness to the thermal noises of the Duffing mechanical mode can be enhanced greatly. When the Duffing nonlinearity is weak, the squeezing degree of the mechanical mode in the presence of BSBS can be improved by more than one order of magnitude compared with the absence of BSBS. Our scheme may be extended to other quantum systems to study novel quantum effects.
Published: 2023

25. The impact of using an AI chatbot to respond to patient messages

Author: Chen, Shan, Guevara, Marco, Moningi, Shalini, Hoebers, Frank, Elhalawani, Hesham, Kann, Benjamin H., Chipidza, Fallon E., Leeman, Jonathan, Aerts, Hugo J. W. L., Miller, Timothy, Savova, Guergana K., Mak, Raymond H., Lustberg, Maryam, Afshar, Majid, and Bitterman, Danielle S.
Subjects: Computer Science - Computation and Language
Abstract: Documentation burden is a major contributor to clinician burnout, which is rising nationally and is an urgent threat to our ability to care for patients. Artificial intelligence (AI) chatbots, such as ChatGPT, could reduce clinician burden by assisting with documentation. Although many hospitals are actively integrating such systems into electronic medical record systems, AI chatbots' utility and impact on clinical decision-making have not been studied for this intended use. We are the first to examine the utility of large language models in assisting clinicians in drafting responses to patient questions. In our two-stage cross-sectional study, 6 oncologists responded to 100 realistic synthetic cancer patient scenarios and portal messages developed to reflect common medical situations, first manually, then with AI assistance. We find AI-assisted responses were longer and less readable, but provided acceptable drafts without edits 58% of the time. AI assistance improved efficiency 77% of the time, with low harm risk (82% safe). However, 7.7% of unedited AI responses could cause severe harm. In 31% of cases, physicians thought AI drafts were human-written. AI assistance led to more patient education recommendations and fewer clinical actions than manual responses. Results show promise for AI to improve clinician efficiency and patient care through assisting documentation, if used judiciously. Monitoring model outputs and human-AI interaction remains crucial for safe implementation.
Comment: 4 figures and tables in main, submitted for review
Published: 2023

26. Measuring Pointwise $\mathcal{V}$-Usable Information In-Context-ly

Author: Lu, Sheng, Chen, Shan, Li, Yingya, Bitterman, Danielle, Savova, Guergana, and Gurevych, Iryna
Subjects: Computer Science - Computation and Language
Abstract: In-context learning (ICL) is a new learning paradigm that has gained popularity along with the development of large language models. In this work, we adapt a recently proposed hardness metric, pointwise $\mathcal{V}$-usable information (PVI), to an in-context version (in-context PVI). Compared to the original PVI, in-context PVI is more efficient in that it requires only a few exemplars and does not require fine-tuning. We conducted a comprehensive empirical analysis to evaluate the reliability of in-context PVI. Our findings indicate that in-context PVI estimates exhibit similar characteristics to the original PVI. Specific to the in-context setting, we show that in-context PVI estimates remain consistent across different exemplar selections and numbers of shots. The variance of in-context PVI estimates across different exemplar selections is insignificant, which suggests that in-context PVI estimates are stable. Furthermore, we demonstrate how in-context PVI can be employed to identify challenging instances. Our work highlights the potential of in-context PVI and provides new insights into the capabilities of ICL.
Comment: EMNLP 2023 Findings
Published: 2023

27. Associations between palliative-care consultations and end-of-life quality in cancer patients’ last 6 months

Author: Chen, Shan Ting, Chen, San Chi, Lee, Hsing Jung, and Chen, Chen Hsiu
Published: 2024
Full Text: View/download PDF

28. Research Progress on Alginate Nanocarrier Delivery of Lipophilic Active Substances and Its Application

Author: XU Wei, LI Huixue, LIU Xiaoying, SUN Yapeng, ZHANG Runfeng, CHEN Shan
Subjects: alginate, nanocarrier, lipophilic active substances, application, Food processing and manufacture, TP368-456
Abstract: Lipophilic active substances are of great interest to the field of food and health products because of their natural advantages and potential functional properties. However, they are difficult to effectively digest and absorb in the human body due to their hydrophobicity, sensitivity to environmental factors (such as oxygen, light, and temperature) and partial or complete degradation in the digestive tract after ingestion. Using alginate nano-delivery carriers with strong stability and barrier effect can effectively solve the above problems while prolonging the release of lipophilic active substances. This paper reviews the structure of alginate and its functional properties such as gelation properties, pH response and biological activity, with a focus on the current status of research on alginate nanocarriers, such as nanoemulsions, nanoparticles, nanogels and nanofibers for the delivery of lipophilic active substances as well as the advantages and disadvantages of different nanocarriers. Finally, this article summarizes the application status of alginate in the food field, aiming to provide a reference for the application of lipophilic active substances in the food field.
Published: 2024
Full Text: View/download PDF

29. Large Language Models to Identify Social Determinants of Health in Electronic Health Records

Author: Guevara, Marco, Chen, Shan, Thomas, Spencer, Chaunzwa, Tafadzwa L., Franco, Idalid, Kann, Benjamin, Moningi, Shalini, Qian, Jack, Goldstein, Madeleine, Harper, Susan, Aerts, Hugo JWL, Savova, Guergana K., Mak, Raymond H., and Bitterman, Danielle S.
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Social determinants of health (SDoH) have an important impact on patient outcomes but are incompletely collected from the electronic health records (EHR). This study researched the ability of large language models to extract SDoH from free text in EHRs, where they are most commonly documented, and explored the role of synthetic clinical text for improving the extraction of these scarcely documented, yet extremely valuable, clinical data. 800 patient notes were annotated for SDoH categories, and several transformer-based models were evaluated. The study also experimented with synthetic data generation and assessed for algorithmic bias. Our best-performing models were fine-tuned Flan-T5 XL (macro-F1 0.71) for any SDoH, and Flan-T5 XXL (macro-F1 0.70). The benefit of augmenting fine-tuning with synthetic data varied across model architecture and size, with smaller Flan-T5 models (base and large) showing the greatest improvements in performance (delta F1 +0.12 to +0.23). Model performance was similar on the in-hospital system dataset but worse on the MIMIC-III dataset. Our best-performing fine-tuned models outperformed zero- and few-shot performance of ChatGPT-family models for both tasks. These fine-tuned models were less likely than ChatGPT to change their prediction when race/ethnicity and gender descriptors were added to the text, suggesting less algorithmic bias (p<0.05). At the patient-level, our models identified 93.8% of patients with adverse SDoH, while ICD-10 codes captured 2.0%. Our method can effectively extract SDoH information from clinic notes, performing better compared to GPT zero- and few-shot settings. These models could enhance real-world evidence on SDoH and aid in identifying patients needing social support.
Comment: Peer-reviewed version published at NPJ Digital Medicine: https://www.nature.com/articles/s41746-023-00970-0
Published: 2023
Full Text: View/download PDF

30. Publisher Correction: Synergistic inhibition effect of Chlorella sp. and benzotriazole on the corrosion of Q235 carbon steel in alkaline artificial seawater

Author: Chen, Shan, Zhang, Shen, Yuan, Mingzhe, and Zhang, Ping
Published: 2024
Full Text: View/download PDF

31. Mutation of the SUMOylation site of Aurora-B disrupts spindle formation and chromosome alignment in oocytes

Author: Chen, Shan-Shan, Li, Li, Yao, Bo, Guo, Jia-Lun, Lu, Ping-Shuang, Zhang, Hao-Lin, Zhang, Kun-Huan, Zou, Yuan-Jing, Luo, Nan-Jian, Sun, Shao-Chen, Hu, Lin-Lin, and Ren, Yan-Ping
Published: 2024
Full Text: View/download PDF

32. Synergistic inhibition effect of Chlorella sp. and benzotriazole on the corrosion of Q235 carbon steel in alkaline artificial seawater

Author: Chen, Shan, Zhang, Shen, Yuan, Mingzhe, and Zhang, Ping
Published: 2024
Full Text: View/download PDF

33. Use of colloids and crystalloids for perioperative clinical infusion management in cardiac surgery patients and postoperative outcomes: a meta-analysis

Author: Chen, Shan-Dong, Ma, Yu-Tong, Wei, Hui-Xia, Ou, Xin-Rong, Liu, Jia-Yi, Tian, Ya-Lan, Zhang, Chao, Xu, Yun-Jin, and Kong, Yao
Published: 2024
Full Text: View/download PDF

34. Biofeedback combined with percutaneous electrical pudendal nerve stimulation for the treatment of low anterior rectal resection syndrome: a study protocol for a randomized controlled trial

Author: Cao, Gaoyang, Zhang, Xinjie, Wang, Fei, Man, Da, Wu, Lijie, Pan, Xuchu, and Chen, Shan
Published: 2024
Full Text: View/download PDF

35. Chlamydia psittaci detected at a live poultry wholesale market in central China

Author: Zhang, Rusheng, Fu, Huiyuan, Luo, Can, Huang, Zheng, Pei, Ruiqing, Di, Yu, Zhu, Caiying, Peng, Jiayi, Hu, Huiqi, Chen, Shan, Chen, Jingfang, Chen, Lamei, Xu, Mingzhong, Yang, Xuewen, and Yang, Rengui
Published: 2024
Full Text: View/download PDF

36. Sleep initiation patterns and sleep quality among toddlers in the southeast of China: initial study results

Author: Lin, Xiaoxia, Chen, Xianrui, Chen, Yanhui, Xu, Ping, and Chen, Shan
Published: 2024
Full Text: View/download PDF

37. Origin and dispersal history of Hepatitis B virus in Eastern Eurasia

Author: Sun, Bing, Andrades Valtueña, Aida, Kocher, Arthur, Gao, Shizhu, Li, Chunxiang, Fu, Shuang, Zhang, Fan, Ma, Pengcheng, Yang, Xuan, Qiu, Yulan, Zhang, Quanchao, Ma, Jian, Chen, Shan, Xiao, Xiaoming, Damchaabadgar, Sodnomjamts, Li, Fajun, Kovalev, Alexey, Hu, Chunbai, Chen, Xianglong, Wang, Lixin, Li, Wenying, Zhou, Yawei, Zhu, Hong, Krause, Johannes, Herbig, Alexander, and Cui, Yinqiu
Published: 2024
Full Text: View/download PDF

38. De novo variants of IRF2BPL result in developmental epileptic disorder

Author: Wang, Yong, Ke, Zhongling, Li, Yufen, Qiu, Mingqi, Liu, Jing, Yang, Zuozhen, Wen, Shu, Liang, Mengmeng, and Chen, Shan
Published: 2024
Full Text: View/download PDF

39. Reconfiguration of low-voltage distributed power sources within electric power's distribution network based on improved particle swarm-fish swarm fusibility algorithm

Author: Xu, Xiaowei, Nie, Ding, Xu, Wenhua, Xiang, Enxin, Chen, Shan, Nie, Yongjie, Fu, Xiao, Xu, Wan, and Han, Yiming
Published: 2024
Full Text: View/download PDF

40. Molecular mechanism of bovine Gasdermin D-mediated pyroptosis

Author: Ge, Zhendong, Xu, Jinxia, Yang, Ke, Wu, Longjian, Chen, Shan, Chen, Biao, Tian, Jiangyao, Zhang, Jinpeng, Xu, Ahui, Huang, Bei, Song, Houhui, and Yang, Yang
Published: 2024
Full Text: View/download PDF

41. Dipterocarpoidae genomics reveal their demography and adaptations to Asian rainforests

Author: Wang, Rong, Liu, Chao-Nan, Segar, Simon T., Jiang, Yu-Ting, Zhang, Kai-Jian, Jiang, Kai, Wang, Gang, Cai, Jing, Chen, Lu-Fan, Chen, Shan, Cheng, Jing, Compton, Stephen G., Deng, Jun-Yin, Ding, Yuan-Yuan, Du, Fang K., Hu, Xiao-Di, Hu, Xing-Hua, Kang, Ling, Li, Dong-Hai, Lu, Ling, Li, Yuan-Yuan, Tang, Liang, Tong, Xin, Wang, Zheng-Shi, Xu, Wei-Wei, Yang, Yang, Zang, Run-Guo, Zu, Zhuo-Xin, Zhang, Yuan-Ye, and Chen, Xiao-Yong
Published: 2024
Full Text: View/download PDF

42. Circulating exosomal mir-16-2-3p is associated with coronary microvascular dysfunction in diabetes through regulating the fatty acid degradation of endothelial cells

Author: Liu, Yihai, Zhong, Chongxia, Chen, Shan, Xue, Yanan, Wei, Zhonghai, Dong, Li, and Kang, Lina
Published: 2024
Full Text: View/download PDF

43. SLA-ORECS: an SLA-oriented framework for reallocating resources in edge-cloud systems

Author: Lan, Shizhan, Duan, Zhuoxi, Lu, Song, Tan, Bin, Chen, Shi, Liang, Yeyu, and Chen, Shan
Published: 2024
Full Text: View/download PDF

44. Large language models to identify social determinants of health in electronic health records

Author: Guevara, Marco, Chen, Shan, Thomas, Spencer, Chaunzwa, Tafadzwa L., Franco, Idalid, Kann, Benjamin H., Moningi, Shalini, Qian, Jack M., Goldstein, Madeleine, Harper, Susan, Aerts, Hugo J. W. L., Catalano, Paul J., Savova, Guergana K., Mak, Raymond H., and Bitterman, Danielle S.
Published: 2024
Full Text: View/download PDF

45. Advancing STEAM education: a comprehensive assessment of competence

Author: Chen, Shan and Ding, Yuanzhao
Published: 2024
Full Text: View/download PDF

46. Tunable microwave absorption properties of Ni particles/carbon composites by calcination temperature

Author: Wei, Yupeng, Zhang, Meng, Li, Changze, Wang, Xudong, Zhang, Jiankang, Chen, Shan, Lin, Jingpeng, and Xiao, Rongzhen
Published: 2024
Full Text: View/download PDF

47. Free shipping policy for imported cross-border e-commerce platforms

Author: Han, Shuihua, Chen, Shan, Yang, Kai, Li, Hongcan, Yang, Fanjing, and Luo, Zongwei
Published: 2024
Full Text: View/download PDF

48. Sirpα on tumor-associated myeloid cells restrains antitumor immunity in colorectal cancer independent of its interaction with CD47

Author: Huang, Chunliu, Wang, Xuefei, Wang, Yingzhao, Feng, Yongyi, Wang, Xiumei, Chen, Shan, Yan, Peidong, Liao, Jing, Zhang, Qi, Mao, Chengzhou, Li, Yang, Wang, Lixiang, Wang, Xinyu, Yi, Wei, Cai, Weibin, Chen, Shoudeng, Hong, Ni, He, Weiling, Chen, Jun, and Jin, Wenfei
Published: 2024
Full Text: View/download PDF

49. Evaluation of ChatGPT Family of Models for Biomedical Reasoning and Classification

Author: Chen, Shan, Li, Yingya, Lu, Sheng, Van, Hoang, Aerts, Hugo JWL, Savova, Guergana K., and Bitterman, Danielle S.
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Recent advances in large language models (LLMs) have shown impressive ability in biomedical question-answering, but have not been adequately investigated for more specific biomedical applications. This study investigates the performance of LLMs such as the ChatGPT family of models (GPT-3.5s, GPT-4) in biomedical tasks beyond question-answering. Because no patient data can be passed to the OpenAI API public interface, we evaluated model performance with over 10000 samples as proxies for two fundamental tasks in the clinical domain - classification and reasoning. The first task is classifying whether statements of clinical and policy recommendations in scientific literature constitute health advice. The second task is causal relation detection from the biomedical literature. We compared LLMs with simpler models, such as bag-of-words (BoW) with logistic regression, and fine-tuned BioBERT models. Despite the excitement around viral ChatGPT, we found that fine-tuning for two fundamental NLP tasks remained the best strategy. The simple BoW model performed on par with the most complex LLM prompting. Prompt engineering required significant investment., Comment: 28 pages, 2 tables and 4 figures. Submitting for review
Published: 2023
Full Text: View/download PDF

50. Natural language processing to automatically extract the presence and severity of esophagitis in notes of patients undergoing radiotherapy

Author: Chen, Shan, Guevara, Marco, Ramirez, Nicolas, Murray, Arpi, Warner, Jeremy L., Aerts, Hugo JWL, Miller, Timothy A., Savova, Guergana K., Mak, Raymond H., and Bitterman, Danielle S.
Subjects: Computer Science - Computation and Language
Abstract: Radiotherapy (RT) toxicities can impair survival and quality-of-life, yet remain under-studied. Real-world evidence holds potential to improve our understanding of toxicities, but toxicity information is often only in clinical notes. We developed natural language processing (NLP) models to identify the presence and severity of esophagitis from notes of patients treated with thoracic RT. We fine-tuned statistical and pre-trained BERT-based models for three esophagitis classification tasks: Task 1) presence of esophagitis, Task 2) severe esophagitis or not, and Task 3) no esophagitis vs. grade 1 vs. grade 2-3. Transferability was tested on 345 notes from patients with esophageal cancer undergoing RT. Fine-tuning PubmedBERT yielded the best performance. The best macro-F1 was 0.92, 0.82, and 0.74 for Task 1, 2, and 3, respectively. Selecting the most informative note sections during fine-tuning improved macro-F1 by over 2% for all tasks. Silver-labeled data improved the macro-F1 by over 3% across all tasks. For the esophageal cancer notes, the best macro-F1 was 0.73, 0.74, and 0.65 for Task 1, 2, and 3, respectively, without additional fine-tuning. To our knowledge, this is the first effort to automatically extract esophagitis toxicity severity according to CTCAE guidelines from clinic notes. The promising performance provides proof-of-concept for NLP-based automated detailed toxicity monitoring in expanded domains., Comment: 17 pages, 6 tables, 1 figure, submitting to JCO-CCI for review
Published: 2023
Full Text: View/download PDF
