Database: 3 selected / Publication Type: Reports - Searchworks@Jio Institute Digital Library Search Results

1. Papilusion at DAGPap24: Paper or Illusion? Detecting AI-generated Scientific Papers

Author: Andreev, Nikita, Shirnin, Alexander, Mikhailov, Vladislav, and Artemova, Ekaterina
Subjects: Computer Science - Computation and Language
Abstract: This paper presents Papilusion, an AI-generated scientific text detector developed within the DAGPap24 shared task on detecting automatically generated scientific papers. We propose an ensemble-based approach and conduct ablation studies to analyze the effect of the detector configurations on the performance. Papilusion is ranked 6th on the leaderboard, and we improved our performance after the competition ended, achieving an F1-score of 99.46 (+9.63) on the official test set.
Comment: to appear in "The 4th Workshop on Scholarly Document Processing @ ACL 2024" proceedings
Published: 2024

2. Pen-and-paper Rituals in Service Interaction: Combining High-touch and High-tech in Financial Advisory Encounters

Author: Dolata, Mateusz, Agotai, Doris, Schubiger, Simon, and Schwabe, Gerhard
Subjects: Computer Science - Human-Computer Interaction
Abstract: Advisory services are ritualized encounters between an expert and an advisee. An empathetic, high-touch relationship between those two parties was identified as the key aspect of a successful advisory encounter. To facilitate the high-touch interaction, advisors established rituals which stress the unique, individual character of each client and each single encounter. Simultaneously, organizations like banks or insurers rolled out tools and technologies for use in advisory services to offer a uniform experience and consistent quality across branches and advisors. As a consequence, advisors were caught between the high-touch and high-tech aspects of an advisory service. This manuscript presents a system that accommodates high-touch rituals and practices and combines them with high-tech collaboration. The proposed solution augments pen-and-paper practices with digital content and affords new material performances coherent with the existing rituals. The evaluation in realistic mortgage advisory services unveils the potential of mixed reality approaches for application in professional, institutional settings. The blow-by-blow analysis of the conversations reveals how an advisory service can become equally high-tech and high-touch thanks to a careful ritual-oriented system design. As a consequence, this paper presents a solution to the tension between the high-touch and high-tech tendencies in advisory services.
Published: 2024

3. LLM-Powered Ensemble Learning for Paper Source Tracing: A GPU-Free Approach

Author: Chen, Kunlong, Wang, Junjun, Chen, Zhaoqun, Chen, Kunjin, and Chen, Yitian
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: We participated in the KDD CUP 2024 paper source tracing competition and achieved 3rd place. This competition tasked participants with identifying the reference sources (i.e., ref-sources, as referred to by the organizers of the competition) of given academic papers. Unlike most teams that addressed this challenge by fine-tuning pre-trained neural language models such as BERT or ChatGLM, our primary approach utilized closed-source large language models (LLMs). With recent advancements in LLM technology, closed-source LLMs have demonstrated the capability to tackle complex reasoning tasks in zero-shot or few-shot scenarios. Consequently, in the absence of GPUs, we employed closed-source LLMs to directly generate predicted reference sources from the provided papers. We further refined these predictions through ensemble learning. Notably, our method was the only one among the award-winning approaches that did not require the use of GPUs for model training. Code available at https://github.com/Cklwanfifa/KDDCUP2024-PST.
Published: 2024

4. Flexible Trilayer Cellulosic Paper Separators engineered with BaTiO$_3$ ferroelectric fillers for High Energy Density Sodium-ion Batteries

Author: Sapra, Simranjot K., Das, Mononita, Raja, M. Wasim, Chang, Jeng-Kuei, and Dhaka, Rajendra S.
Subjects: Condensed Matter - Materials Science, Physics - Applied Physics, Physics - Chemical Physics
Abstract: We design a full cell configuration having Na$_{3}$V$_{2}$(PO$_{4}$)$_{3}$ as cathode and pre-sodiated hard carbon as an anode with Cellulosic Paper Separators and compare the electrochemical performance of these ceramic-impregnated polymer-coated cellulose paper separators with a commercial glass fiber separator. Notably, the paper-based multilayer separators provide desirable characteristics such as excellent electrolyte wettability, thermal stability up to 200 °C, and ionic conductivity, which are essential for the efficient operation of SIBs. The cellulose separator is coated by a layer of polyvinylidene fluoride polymer, followed by a second layer of styrene butadiene rubber (SBR) polymer in which ferroelectric BaTiO$_{3}$ fillers are integrated, which interact with the polymer hosts through Lewis acid-base interactions and improve the conduction mechanism for the Na$^{+}$ ions. The final lamination is performed by varying the SBR concentrations (0.5, 0.75, and 1.0 w/v%). The incorporated polymer matrices improve the flexibility, adhesion and dispersion of the nanoparticles and affinity of the electrolyte to the electrode. The morphology of the paper separators shows the uniform interconnected fibers with the porous structure. Interestingly, we find that the paper separator with 0.75 w/v% content of SBR exhibits decreased interfacial resistance and improved electrochemical performance, having retention of 62% and nearly 100% Coulombic efficiency up to 240 cycles, as compared to other concentrations. Moreover, we observe an energy density of around 376 Wh kg$^{-1}$ (considering cathode weight), which is found to be comparable to that of the commercially available glass fiber separator. Our results demonstrate the potential of these multilayer paper separators towards achieving sustainability and safety in energy storage systems.
Comment: submitted
Published: 2024

5. Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance

Author: Lin, Guanyu, Feng, Tao, Han, Pengrui, Liu, Ge, and You, Jiaxuan
Subjects: Computer Science - Computation and Language
Abstract: As scientific research proliferates, researchers face the daunting task of navigating and reading vast amounts of literature. Existing solutions, such as document QA, fail to provide personalized and up-to-date information efficiently. We present Paper Copilot, a self-evolving, efficient LLM system designed to assist researchers, based on thought-retrieval, user profile and high performance optimization. Specifically, Paper Copilot can offer personalized research services, maintaining a real-time updated database. Quantitative evaluation demonstrates that Paper Copilot saves 69.92% of time after efficient deployment. This paper details the design and implementation of Paper Copilot, highlighting its contributions to personalized academic support and its potential to streamline the research process.
Published: 2024

6. Jet observables in heavy ion collisions: a white paper

Author: Budhraja, Ankita, van Leeuwen, Marco, and Milhano, José Guilherme
Subjects: High Energy Physics - Phenomenology, High Energy Physics - Experiment
Abstract: This paper presents an overview of a survey of jet substructure observables used to study modifications of jets induced by interaction with a Quark Gluon Plasma. We further outline ideas that were presented and discussed at the "New jet quenching tools to explore equilibrium and non-equilibrium dynamics in heavy-ion collisions" workshop, which was held in February 2024 at the ECT* in Trento, Italy. The goal of this white paper is to provide a brief report on the study of jet quenching observables conducted earlier and to present new ideas that could be relevant for future explorations.
Comment: 12 pages, 2 figures, NA3:Jet-QGP group white paper
Published: 2024

7. Recommended receiver papers for ALMA users

Author: Bakx, Tom and Conway, John
Subjects: Astrophysics - Instrumentation and Methods for Astrophysics, Astrophysics - Earth and Planetary Astrophysics, Astrophysics - Astrophysics of Galaxies, Astrophysics - Solar and Stellar Astrophysics
Abstract: The Atacama Large Millimetre/submillimetre Array (ALMA) receivers and technical papers are cited fewer than once in every six publications. This citation shortage is impeding the development of future (sub)millimetre instruments. In an effort to facilitate the correct citations of ALMA receivers and technical papers, this memo provides a comprehensive list of papers for the scientific community. This list was produced in discussion with the scientific and instrumentalist community, based on a June 2024 survey at the European Southern Observatory workshop on the ALMA Wideband Sensitivity Upgrade, as well as with the ALMA technical staff. The authors now encourage the community to enhance their already-excellent ALMA science with the appropriate references to ensure future (sub)millimetre instrumentation can keep addressing the key questions about our Universe.
Comment: 8 pages; 1 figure; 3 tables; continuously updated as more papers become available. Suggestions are warmly welcomed. ALMA memos solely contain the opinion of the authors in discussion with the astronomer and instrument builder community, and do not reflect the official policy of the ALMA telescope
Published: 2024

8. Temporal Graph Neural Network-Powered Paper Recommendation on Dynamic Citation Networks

Author: Shen, Junhao, Haqqani, Mohammad Ausaf Ali, Hu, Beichen, Huang, Cheng, Xie, Xihao, Lee, Tsengdar, and Zhang, Jia
Subjects: Computer Science - Information Retrieval, Computer Science - Machine Learning
Abstract: Due to the rapid growth of scientific publications, identifying all related reference articles in the literature has become increasingly challenging yet highly demanding. Existing methods primarily assess candidate publications from a static perspective, focusing on the content of articles and their structural information, such as citation relationships. There is a lack of research regarding how to account for the evolving impact among papers on their embeddings. Toward this goal, this paper introduces a temporal dimension to paper recommendation strategies. The core idea is to continuously update a paper's embedding when new citation relationships appear, enhancing its relevance for future recommendations. Whenever a citation relationship is added to the literature upon the publication of a paper, the embeddings of the two related papers are updated through a Temporal Graph Neural Network (TGN). A learnable memory update module based on a Recurrent Neural Network (RNN) is utilized to study the evolution of the embedding of a paper in order to predict its reference impact in a future timestamp. Such a TGN-based model learns a pattern of how people's views of the paper may evolve, aiming to guide paper recommendations more precisely. Extensive experiments on an open citation network dataset, including 313,278 articles from PaperWithCode (https://paperswithcode.com/about), have demonstrated the effectiveness of the proposed approach.
Comment: 10 pages, 4 figures, accepted by SDU@AAAI-2024. The AAAI Workshop on Scientific Document Understanding (2024)
Published: 2024

9. Analysis of the ICML 2023 Ranking Data: Can Authors' Opinions of Their Own Papers Assist Peer Review in Machine Learning?

Author: Su, Buxin, Zhang, Jiayao, Collina, Natalie, Yan, Yuling, Li, Didong, Cho, Kyunghyun, Fan, Jianqing, Roth, Aaron, and Su, Weijie J.
Subjects: Statistics - Applications, Computer Science - Digital Libraries, Computer Science - Computer Science and Game Theory, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We conducted an experiment during the review process of the 2023 International Conference on Machine Learning (ICML) that requested authors with multiple submissions to rank their own papers based on perceived quality. We received 1,342 rankings, each from a distinct author, pertaining to 2,592 submissions. In this paper, we present an empirical analysis of how author-provided rankings could be leveraged to improve peer review processes at machine learning conferences. We focus on the Isotonic Mechanism, which calibrates raw review scores using author-provided rankings. Our analysis demonstrates that the ranking-calibrated scores outperform raw scores in estimating the ground truth "expected review scores" in both squared and absolute error metrics. Moreover, we propose several cautious, low-risk approaches to using the Isotonic Mechanism and author-provided rankings in peer review processes, including assisting senior area chairs' oversight of area chairs' recommendations, supporting the selection of paper awards, and guiding the recruitment of emergency reviewers. We conclude the paper by addressing the study's limitations and proposing future research directions.
Comment: See more details about the experiment at https://openrank.cc/
Published: 2024
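
Illustrative note on record 9: the abstract says only that the Isotonic Mechanism "calibrates raw review scores using author-provided rankings" and gives no formula. The sketch below assumes the calibration is a least-squares projection of the raw scores onto score vectors that are non-increasing along the author's ranking, computed with the pool-adjacent-violators procedure. The function name `isotonic_calibrate` and its arguments are hypothetical, and `ranking` is assumed to be a full permutation of the paper indices, listed from the author's best paper to worst.

```python
from typing import List, Sequence


def isotonic_calibrate(raw_scores: Sequence[float], ranking: Sequence[int]) -> List[float]:
    """Least-squares adjust raw scores so they respect the author's ranking.

    `ranking` lists paper indices from the author's best to worst (assumed to be
    a full permutation). The output is non-increasing along that order.
    """
    # Reorder the scores from the author's best paper to worst.
    ordered = [float(raw_scores[i]) for i in ranking]

    # Pool Adjacent Violators: each block stores [sum, count]. Merge the last
    # two blocks while the later mean exceeds the earlier mean, since that
    # would contradict the author's ordering.
    blocks: List[List[float]] = []
    for value in ordered:
        blocks.append([value, 1])
        while len(blocks) > 1 and blocks[-1][0] / blocks[-1][1] > blocks[-2][0] / blocks[-2][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    # Write each block's mean back to its papers, in original index order.
    calibrated = [0.0] * len(raw_scores)
    position = 0
    for total, count in blocks:
        mean = total / count
        for _ in range(int(count)):
            calibrated[ranking[position]] = mean
            position += 1
    return calibrated
```

When the raw scores already agree with the ranking, this sketch returns them unchanged; scores that contradict the ranking are averaged within the conflicting group.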

10. CodeRefine: A Pipeline for Enhancing LLM-Generated Code Implementations of Research Papers

Author: Trofimova, Ekaterina, Sataev, Emil, and Jowhari, Abhijit Singh
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: This paper presents CodeRefine, a novel framework for automatically transforming research paper methodologies into functional code using Large Language Models (LLMs). Our multi-step approach first extracts and summarizes key text chunks from papers, analyzes their code relevance, and creates a knowledge graph using a predefined ontology. Code is then generated from this structured representation and enhanced through a proposed retrospective retrieval-augmented generation approach. CodeRefine addresses the challenge of bridging theoretical research and practical implementation, offering a more accurate alternative to LLM zero-shot prompting. Evaluations on diverse scientific papers demonstrate CodeRefine's ability to improve code implementation from the paper, potentially accelerating the adoption of cutting-edge algorithms in real-world applications.
Published: 2024

11. Extraction of Research Objectives, Machine Learning Model Names, and Dataset Names from Academic Papers and Analysis of Their Interrelationships Using LLM and Network Analysis

Author: Nishio, S., Nonaka, H., Tsuchiya, N., Migita, A., Banno, Y., Hayashi, T., Sakaji, H., Sakumoto, T., and Watabe, K.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Machine learning is widely utilized across various industries. Identifying the appropriate machine learning models and datasets for specific tasks is crucial for the effective industrial application of machine learning. However, this requires expertise in both machine learning and the relevant domain, leading to a high learning cost. Therefore, research focused on extracting combinations of tasks, machine learning models, and datasets from academic papers is critically important, as it can facilitate the automatic recommendation of suitable methods. Conventional information extraction methods from academic papers have been limited to identifying machine learning models and other entities as named entities. To address this issue, this study proposes a methodology for extracting tasks, machine learning methods, and dataset names from scientific papers and analyzing the relationships between these pieces of information using an LLM, an embedding model, and network clustering. The proposed method's expression extraction performance, when using Llama3, achieves an F-score exceeding 0.8 across various categories, confirming its practical utility. Benchmarking results on financial domain papers have demonstrated the effectiveness of this method, providing insights into the use of the latest datasets, including those related to ESG (Environmental, Social, and Governance) data.
Comment: 10 pages, 8 figures
Published: 2024

12. Design and Development of Paper-based Spirometry Device and its Smart-phone based Lung Condition Monitoring and Analysis Software

Author: Miglani, Yatin, P, Muddukrishna, Pande, Harshvardhan, Varma, Manoj M., and Toley, Bhushan
Subjects: Electrical Engineering and Systems Science - Signal Processing, Physics - Applied Physics
Abstract: We describe the design, construction and characterisation of paper-based devices to perform spirometry, a standard test for lung function assessment. In this device the instantaneous flowrate of the incoming breath from a person is transformed to a specific acoustic frequency. In this manner, the time course of the person's breath profile is mapped to a time-varying acoustic signal. The captured acoustic signal can be converted to standard spirometry curves and relevant parameters can be extracted as we describe in this paper. We compared our device with commercially available devices and showed that these paper-based devices provide similar performance. These devices have the advantage of low cost and simplicity of operation compared to currently available commercial products.
Published: 2024

13. Handwritten Code Recognition for Pen-and-Paper CS Education

Author: Islam, Md Sazzad, Doumbouya, Moussa Koulako Bala, Manning, Christopher D., and Piech, Chris
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computers and Society, Computer Science - Human-Computer Interaction
Abstract: Teaching Computer Science (CS) by having students write programs by hand on paper has key pedagogical advantages: It allows focused learning and requires careful thinking compared to the use of Integrated Development Environments (IDEs) with intelligent support tools or "just trying things out". The familiar environment of pens and paper also lessens the cognitive load of students with no prior experience with computers, for whom the mere basic usage of computers can be intimidating. Finally, this teaching approach opens learning opportunities to students with limited access to computers. However, a key obstacle is the current lack of teaching methods and support software for working with and running handwritten programs. Optical character recognition (OCR) of handwritten code is challenging: Minor OCR errors, perhaps due to varied handwriting styles, easily make code not run, and recognizing indentation is crucial for languages like Python but is difficult to do due to inconsistent horizontal spacing in handwriting. Our approach integrates two innovative methods. The first combines OCR with an indentation recognition module and a language model designed for post-OCR error correction without introducing hallucinations. This method, to our knowledge, surpasses all existing systems in handwritten code recognition. It reduces error from 30% in the state of the art to 5% with minimal hallucination of logical fixes to student programs. The second method leverages a multimodal language model to recognize handwritten programs in an end-to-end fashion. We hope this contribution can stimulate further pedagogical research and contribute to the goal of making CS education universally accessible. We release a dataset of handwritten programs and code to support future research at https://github.com/mdoumbouya/codeocr
Published: 2024

14. The State of Reproducibility Stamps for Visualization Research Papers

Author: Isenberg, Tobias
Subjects: Computer Science - Graphics, Computer Science - Digital Libraries, Computer Science - Human-Computer Interaction
Abstract: I analyze the evolution of papers certified by the Graphics Replicability Stamp Initiative (GRSI) to be reproducible, with a specific focus on the subset of publications that address visualization-related topics. With this analysis I show that, while the number of papers is increasing overall and within the visualization field, we still have to improve quite a bit to escape the replication crisis. I base my analysis on the data published by the GRSI as well as publication data for the different venues in visualization and lists of journal papers that have been presented at visualization-focused conferences. I also analyze the differences between the involved journals as well as the percentage of reproducible papers in the different presentation venues. Furthermore, I look at the authors of the publications and, in particular, their affiliation countries to see where most reproducible papers come from. Finally, I discuss potential reasons for the low reproducibility numbers and suggest possible ways to overcome these obstacles. This paper is reproducible itself, with source code and data available from github.com/tobiasisenberg/Visualization-Reproducibility as well as a free paper copy and all supplemental materials at osf.io/mvnbj.
Comment: 9 pages plus appendix; 12 figures plus 14 figures in the appendix
Published: 2024

15. SoK: Fighting Counterfeits with Cyber-Physical Synergy Based on Physically-Unclonable Identifiers of Paper Surface

Author: Nakra, Anirudh, Wu, Min, and Wong, Chau-Wai
Subjects: Computer Science - Cryptography and Security
Abstract: Counterfeit products cause severe harm to public safety and health by penetrating untrusted supply chains. Numerous anti-counterfeiting techniques have been proposed, among which the use of inherent, unclonable irregularities of paper surfaces has shown considerable potential as a high-performance economical solution. Prior works do not consider supply chains cohesively, either focusing on creating or improving unclonable identifiers or on securing digital records of products. This work aims to systematically unify these two separate but connected research areas by comprehensively analyzing the needs of supply chains. We construct a generalized paper-based authentication framework and identify important shortcomings and promising ideas in the existing literature. Next, we do a stage-wise security analysis of our consolidated framework by drawing inspiration from works in signal processing, cryptography, and biometric systems. Finally, we examine key representative scenarios that illustrate the range of practical and technical challenges in real-world supply chains, and we outline the best practices to guide future research.
Published: 2024

16. Predicting citation impact of research papers using GPT and other text embeddings

Author: Vital Jr., Adilson, Silva, Filipi N., Oliveira Jr., Osvaldo N., and Amancio, Diego R.
Subjects: Computer Science - Digital Libraries
Abstract: The impact of research papers, typically measured in terms of citation counts, depends on several factors, including the reputation of the authors, journals, and institutions, in addition to the quality of the scientific work. In this paper, we present an approach that combines natural language processing and machine learning to predict the impact of papers in a specific journal. Our focus is on the text, which should correlate with impact and the topics covered in the research. We employed a dataset of over 40,000 articles from ACS Applied Materials and Interfaces spanning from 2012 to 2022. The data was processed using various text embedding techniques and classified with supervised machine learning algorithms. Papers were categorized into the top 20% most cited within the journal, using both yearly and cumulative citation counts as metrics. Our analysis reveals that the method employing generative pre-trained transformers (GPT) was the most efficient for embedding, while the random forest algorithm exhibited the best predictive power among the machine learning algorithms. An optimized accuracy of 80% in predicting whether a paper was among the top 20% most cited was achieved for the cumulative citation count when abstracts were processed. This accuracy is noteworthy, considering that author, institution, and early citation pattern information were not taken into account. The accuracy increased only slightly when the full texts of the papers were processed. Also significant is the finding that a simpler embedding technique, term frequency-inverse document frequency (TFIDF), yielded performance close to that of GPT. Since TFIDF captures the topics of the paper we infer that, apart from considering author and institution biases, citation counts for the considered journal may be predicted by identifying topics and "reading" the abstract of a paper.
Published: 2024
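
Illustrative note on record 16: the abstract names TFIDF as the simpler embedding and a "top 20% most cited" label as the prediction target, but does not describe the implementation. The sketch below shows one common TF-IDF variant (relative term frequency times log(N / document frequency)) and a simple top-fraction labelling rule. Whitespace tokenization, the function names `tfidf` and `top_fraction_labels`, and the tie handling are all assumptions, not the authors' pipeline; documents are assumed to be non-empty.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(docs: List[str]) -> List[Dict[str, float]]:
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def top_fraction_labels(citations: List[int], fraction: float = 0.2) -> List[bool]:
    """Mark papers whose citation count reaches the top-`fraction` threshold.

    Ties at the threshold are all marked True, so slightly more than
    `fraction` of the papers can be labelled in that case.
    """
    n_top = max(1, round(len(citations) * fraction))
    threshold = sorted(citations, reverse=True)[n_top - 1]
    return [c >= threshold for c in citations]
```

The resulting sparse vectors and boolean labels could then be passed to any supervised classifier; the abstract reports that a random forest performed best among those the authors tried.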

17. A Process for Reviewing Design Science Research Papers to Enhance Content Knowledge & Research Opportunities

Author: Osei-Bryson, Kweku-Muata
Subjects: Computer Science - Computers and Society, Computer Science - Digital Libraries, H.m, H.1
Abstract: Most published Information Systems research is of the behavioral science research (BSR) category rather than the design science research (DSR) category. This is due in part to the BSR orientation of many IS doctoral programs, which often do not involve many technical courses. This includes IS doctoral programs that train Information and Communication Technologies for Development (ICT4D) researchers. Without such technical knowledge many doctoral and postdoctoral researchers will not feel confident in engaging in DSR research. Given the importance of designing artifacts that are appropriate for a given context, an important question is how ICT4D and other IS researchers can increase their IS technical content knowledge and intimacy with the DSR process. In this paper we present a process for reviewing DSR papers that has as its objectives: enhancing technical content knowledge, increasing knowledge and understanding of approaches to designing and evaluating IS/IT artifacts, and facilitating the identification of new DSR opportunities. This process has been applied for more than a decade at a USA research university.
Comment: 44 pages, 9 Figures
Published: 2024

18. Text-Driven Neural Collaborative Filtering Model for Paper Source Tracing

Author: Xu, Aobo, Chang, Bingyu, Liu, Qingpeng, and Jian, Ling
Subjects: Computer Science - Information Retrieval, Computer Science - Machine Learning
Abstract: Identifying significant references within the complex interrelations of a citation knowledge graph, which encompasses connections through citations, authorship, keywords, and other relational attributes, is challenging. The Paper Source Tracing (PST) task seeks to automate the identification of pivotal references for given scholarly articles utilizing advanced data mining techniques. In the KDD CUP OAG-Challenge PST track, we design a recommendation-based framework tailored for the PST task. This framework employs the Neural Collaborative Filtering (NCF) model to generate final predictions. To process the textual attributes of the papers and extract input features for the model, we utilize SciBERT, a pre-trained language model. According to the experimental results, our method achieved a score of 0.37814 on the Mean Average Precision (MAP) metric, outperforming baseline models and ranking 11th among all participating teams. The source code is publicly available at https://github.com/MyLove-XAB/KDDCupFinal.
Comment: KDD CUP 2024 OAG-Challenges, Paper Source Tracing, Technical Report of Team AoboSama @ KDD CUP 2024. August 25--29, 2024. Barcelona, Spain
Published: 2024

19. SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers

Author: Pramanick, Shraman, Chellappa, Rama, and Venugopalan, Subhashini
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Seeking answers to questions within long scientific research articles is a crucial area of study that aids readers in quickly addressing their inquiries. However, existing question-answering (QA) datasets based on scientific papers are limited in scale and focus solely on textual content. To address this limitation, we introduce SPIQA (Scientific Paper Image Question Answering), the first large-scale QA dataset specifically designed to interpret complex figures and tables within the context of scientific research articles across various domains of computer science. Leveraging the breadth of expertise and ability of multimodal large language models (MLLMs) to understand figures, we employ automatic and manual curation to create the dataset. We craft an information-seeking task involving multiple images that cover a wide variety of plots, charts, tables, schematic diagrams, and result visualizations. SPIQA comprises 270K questions divided into training, validation, and three different evaluation splits. Through extensive experiments with 12 prominent foundational models, we evaluate the ability of current multimodal systems to comprehend the nuanced aspects of research articles. Additionally, we propose a Chain-of-Thought (CoT) evaluation strategy with in-context retrieval that allows fine-grained, step-by-step assessment and improves model performance. We further explore the upper bounds of performance enhancement with additional textual information, highlighting its promising potential for future research and the dataset's impact on revolutionizing how we interact with scientific literature.
Comment: preprint
Published: 2024

20. H-FCBFormer Hierarchical Fully Convolutional Branch Transformer for Occlusal Contact Segmentation with Articulating Paper

Author: Banks, Ryan, Rovira-Lastra, Bernat, Martinez-Gomis, Jordi, Chaurasia, Akhilanand, and Li, Yunpeng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, I.2.1, I.2.10, J.3
Abstract: Occlusal contacts are the locations at which the occluding surfaces of the maxilla and the mandible posterior teeth meet. Occlusal contact detection is a vital tool for restoring the loss of masticatory function and is a mandatory assessment in the field of dentistry, with particular importance in prosthodontics and restorative dentistry. The most common method for occlusal contact detection is articulating paper. However, this method can indicate significant medically false positive and medically false negative contact areas, leaving the identification of true occlusal indications to clinicians. To address this, we propose a multiclass Vision Transformer and Fully Convolutional Network ensemble semantic segmentation model with a combination hierarchical loss function, which we name Hierarchical Fully Convolutional Branch Transformer (H-FCBFormer). We also propose a method of generating medically true positive semantic segmentation masks derived from expert annotated articulating paper masks and gold standard masks. The proposed model outperforms other machine learning methods evaluated at detecting medically true positive contacts and performs better than dentists in terms of accurately identifying object-wise occlusal contact areas while taking significantly less time to identify them. Code is available at https://github.com/Banksylel/H-FCBFormer.
Comment: 15 pages, 5 figures, 2 tables, 5 equations, peer reviewed and accepted to Medical Imaging Understanding and Analysis (MIUA 2024)
Published: 2024

21. Automated Peer Reviewing in Paper SEA: Standardization, Evaluation, and Analysis

Author: Yu, Jianxiang, Ding, Zichen, Tan, Jiaqi, Luo, Kangyang, Weng, Zhenmin, Gong, Chenghua, Zeng, Long, Cui, Renjing, Han, Chengcheng, Sun, Qiushi, Wu, Zhiyong, Lan, Yunshi, and Li, Xiang
Subjects: Computer Science - Computation and Language, Computer Science - Digital Libraries, Computer Science - Information Retrieval
Abstract: In recent years, the rapid increase in scientific papers has overwhelmed traditional review mechanisms, resulting in varying quality of publications. Although existing methods have explored the capabilities of Large Language Models (LLMs) for automated scientific reviewing, their generated contents are often generic or partial. To address the issues above, we introduce an automated paper reviewing framework SEA. It comprises three modules: Standardization, Evaluation, and Analysis, which are represented by models SEA-S, SEA-E, and SEA-A, respectively. Initially, SEA-S distills data standardization capabilities of GPT-4 for integrating multiple reviews for a paper. Then, SEA-E utilizes standardized data for fine-tuning, enabling it to generate constructive reviews. Finally, SEA-A introduces a new evaluation metric called mismatch score to assess the consistency between paper contents and reviews. Moreover, we design a self-correction strategy to enhance the consistency. Extensive experimental results on datasets collected from eight venues show that SEA can generate valuable insights for authors to improve their papers.
Published: 2024

22. Corrections to a paper of Allard and Almgren on the uniqueness of tangent cones

Author: Allard, William K.
Subjects: Mathematics - Differential Geometry, 49Q20: Variational problems in a geometric measure-theoretic setting
Abstract: The paper referred to in the title is Allard/Almgren 1981 [AA81]. Several months ago Francesco Maggi emailed me saying that the inequalities 5.3(4),(5) of [AA81] were wrong. In fact, as he pointed out, their incorrectness is immediately apparent if one takes $Z=0$ there. Maggi and his coauthor wanted to use these inequalities in their paper [MN]. They were able to obtain a version of these inequalities which suffices for carrying out the work in [MN]. I started writing this paper in order to provide a version of these inequalities as needed in [AA81]. In thinking about this material I began to realize there were other problems with the paper. As a result I ended up completely rewriting 5.1-5.4 on pages 243-248 of [AA81]; this rewrite is the contents of this paper. In addition to many annoying misprints, many of the needed definitions and proofs in 5.1-5.4 are incomplete or absent. For example, it is not said where $z$ in 5.1(2) comes from; this omission completely surprised me since I remember doing a lot of work to come up with $z$ when [AA81] was being written. Also, much of the necessary material about the reach of a submanifold as in [FE2] is not provided in [AA81]. The table of contents can serve as an index. In particular one sees there where the constants $\epsilon_1$ through $\epsilon_6$ are introduced. This material is extremely technical. One way to navigate this paper would be to start looking at Proposition 9.5 and work backwards.
Published: 2024

23. Synthetic Test Data Generation Using Recurrent Neural Networks: A Position Paper

Author: Behjati, Razieh, Arisholm, Erik, Tan, Chao, and Bedregal, Margrethe M.
Subjects: Computer Science - Software Engineering, Computer Science - Databases, Computer Science - Machine Learning, Computer Science - Logic in Computer Science
Abstract: Testing in production-like test environments is an essential part of quality assurance processes in many industries. Provisioning of such test environments, for information-intensive services, involves setting up databases that are rich-enough to enable simulating a wide variety of user scenarios. While production data is perhaps the gold-standard here, many organizations, particularly within the public sectors, are not allowed to use production data for testing purposes due to privacy concerns. The alternatives are to use anonymized data, or synthetically generated data. In this paper, we elaborate on these alternatives and compare them in an industrial context. Further we focus on synthetic data generation and investigate the use of recurrent neural networks for this purpose. In our preliminary experiments, we were able to generate representative and highly accurate data using a recurrent neural network. These results open new research questions that we discuss here, and plan to investigate in our future research., Comment: This paper was published in the proceedings of RAISE@ICSE in 2019
Published: 2024

24. What do we study when studying politics and democracy? A semantic analysis of how politics and democracy are treated in SIGCHI conference papers

Author: Nelimarkka, Matti and Vuorenmaa, Ville
Subjects: Computer Science - Human-Computer Interaction
Abstract: Human-computer interaction scholars are increasingly touching on topics related to politics or democracy. As these concepts are ambiguous, an examination of the concepts' invoked meanings aids in the self-reflection of our research efforts. We conduct a thematic analysis of all papers with the word 'politics' in abstract, title or keywords ($n$=378) and likewise 152 papers with the word 'democracy.' We observe that these words are increasingly being used in human-computer interaction, both in absolute and relative terms. At the same time, we show that researchers invoke these words with diverse levels of analysis in mind: the early research focused on mezzo-level (i.e., small groups), but more recently the work has begun to include macro-level analysis (i.e., society and politics as played in the public sphere). After the increasing focus on the macro-level, we see a transition towards more normative and activist research; in some areas it replaces observational and empirical research. These differences indicate semantic differences, which -- in the worst case -- may limit scientific progress. We make these differences visible to help further exchanges of ideas and to help the human-computer interaction community explore how it orients itself to politics and democracy.
Published: 2024

25. NLP-Powered Repository and Search Engine for Academic Papers: A Case Study on Cyber Risk Literature with CyLit

Author: Zhang, Linfeng, Hu, Changyue, and Quan, Zhiyu
Subjects: Computer Science - Information Retrieval, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: As the body of academic literature continues to grow, researchers face increasing difficulties in effectively searching for relevant resources. Existing databases and search engines often fall short of providing a comprehensive and contextually relevant collection of academic literature. To address this issue, we propose a novel framework that leverages Natural Language Processing (NLP) techniques. This framework automates the retrieval, summarization, and clustering of academic literature within a specific research domain. To demonstrate the effectiveness of our approach, we introduce CyLit, an NLP-powered repository specifically designed for the cyber risk literature. CyLit empowers researchers by providing access to context-specific resources and enabling the tracking of trends in the dynamic and rapidly evolving field of cyber risk. Through the automatic processing of large volumes of data, our NLP-powered solution significantly enhances the efficiency and specificity of academic literature searches. We compare the literature categorization results of CyLit to those presented in survey papers or generated by ChatGPT, highlighting the distinctive insights this tool provides into cyber risk research literature. Using NLP techniques, we aim to revolutionize the way researchers discover, analyze, and utilize academic resources, ultimately fostering advancements in various domains of knowledge.
Published: 2024

26. FRAMER/Miu: Tagged Pointer-based Capability and Fundamental Cost of Memory Safety & Coherence (Position Paper)

Author: Nam, Myoung Jin
Subjects: Computer Science - Cryptography and Security
Abstract: Ensuring system correctness, such as memory safety, can eliminate security vulnerabilities that attackers could exploit in the first place. However, high and unpredictable performance degradation remains a primary challenge. Recognizing that it is extremely difficult to achieve complete system correctness for production deployment, researchers make trade-offs between performance, detection coverage, interoperability, precision, and detection timing. This research strikes a balance between comprehensive system protection and the costs required to obtain it, identifies the desirable roles of software and hardware, and presents a tagged pointer-based capability system as a stand-alone software solution and a prototype for future hardware design. This paper presents follow-up plans for the FRAMER/Miu generic framework to achieve these goals., Comment: 8 pages, 4 figures
Published: 2024

27. Survey Paper on Control Barrier Functions

Author: Panja, Promit
Subjects: Electrical Engineering and Systems Science - Systems and Control
Abstract: Control Barrier Functions (CBFs) have emerged as a powerful paradigm in control theory, providing a principled approach to enforcing safety-critical constraints in dynamic systems. This survey paper comprehensively explores the foundational principles of CBFs, delves into the complexities of High Order Control Barrier Functions (HOCBFs), and extends the discussion to the adaptive realm with adaptive Control Barrier Functions (aCBFs). Through a systematic examination of theoretical underpinnings, practical applications, and the evolving landscape of research, this survey highlights the versatility of CBFs in addressing safety and stability challenges., Comment: 6 pages, 1 figure
Published: 2024

28. Working Paper: Conflicts and the New Scramble for African Resources -- A Shift-Share Approach

Author: Boulat, Raphaël
Subjects: Economics - General Economics
Abstract: This paper estimates the causal effect of mineral trade on conflicts in Africa using a Shift-Share IV approach based on an exogenous price-commodity shock. The main result is that an increase in mineral trade significantly increases the number of conflicts while it has no clear effect on fatalities. Exploring heterogeneous effects, I find that a specific group of minerals, oil and fuels, drives the results on the number of conflicts. Moreover, the group of rare minerals such as coltan, precious metals or cobalt has no effect on the number of conflicts but appears to have an important impact on the number of fatalities., Comment: 25 pages (with appendix)
Published: 2024

29. Investing in the Unrivaled Potential of Wide-Separation Sub-Jupiter Exoplanet Detection and Characterisation with JWST -- Strategic Exoplanet Initiatives with HST and JWST White Paper

Author: Carter, Aarynn L., Bowens-Rubin, Rachel, Calissendorff, Per, Kammerer, Jens, Li, Yiting, Meyer, Michael R., Booth, Mark, Factor, Samuel M., Franson, Kyle, Gaidos, Eric, Leisenring, Jarron M., Lew, Ben W. P., Martinez, Raquel A., Rebollido, Isabel, Rickman, Emily, Sutlieff, Ben J., Ward-Duong, Kimberly, and Zhang, Zhoujian
Subjects: Astrophysics - Instrumentation and Methods for Astrophysics, Astrophysics - Earth and Planetary Astrophysics, Astrophysics - Solar and Stellar Astrophysics
Abstract: We advocate for a large scale imaging survey of nearby young moving groups and star-forming regions to directly detect exoplanets over an unexplored range of masses, ages and orbits. Discovered objects will be identified early enough in JWST's lifetime to leverage its unparalleled capabilities for long-term atmospheric characterisation, and will uniquely complement the known population of exoplanets and brown dwarfs. Furthermore, this survey will constrain the occurrence of the novel wide sub-Jovian exoplanet population, informing multiple theories of planetary formation and evolution. Observations with NIRCam F200W+F444W dual-band coronagraphy will readily provide sub-Jupiter mass sensitivities beyond ~0.4" (F444W) and can also be used to rule out some contaminating background sources (F200W). At this large scale, targets can be sequenced by spectral type to enable robust self-referencing for PSF subtraction. This eliminates the need for dedicated reference observations required by GO programs and dramatically increases the overall science observing efficiency. With an exposure of ~30 minutes per target, the sub-Jupiter regime can be explored across 250 targets for ~400 hours of exposure time including overheads. An additional, pre-allocated, ~100 hours of observing time would enable rapid multi-epoch vetting of the lowest mass detections (which are undetectable in F200W). The total time required for a survey such as this is not fixed, and could be scaled in conjunction with the minimum number of detected exoplanet companions., Comment: 5 pages, 2 figures. This white paper was submitted following a call from the "Working Group on Strategic Exoplanet Initiatives with HST and JWST" (https://sites.google.com/view/exoplanet-strategy-wg, final report in 10.48550/arXiv.2404.02932)
Published: 2024

30. The Unrealised Interdisciplinary Advantage of Observing High Mass Transiting Exoplanets and Brown Dwarfs -- Strategic Exoplanet Initiatives with HST and JWST White Paper

Author: Carter, Aarynn L., Alam, Munazza K., Beatty, Thomas, Casewell, Sarah, Chubb, Katy L., Hoch, Kielan, Lewis, Nikole, Lothringer, Joshua D., Manjavacas, Elena, Moran, Sarah E., and Wakeford, Hannah R.
Subjects: Astrophysics - Instrumentation and Methods for Astrophysics, Astrophysics - Earth and Planetary Astrophysics, Astrophysics - Solar and Stellar Astrophysics
Abstract: We advocate for further prioritisation of atmospheric characterisation observations of high mass transiting exoplanets and brown dwarfs. This population acts as a unique comparative sample to the directly imaged exoplanet and brown dwarf populations, of which a range of JWST characterisation observations are planned. In contrast, only two observations of transiting exoplanets in this mass regime were performed in Cycle 1, and none are planned for Cycle 2. Such observations will: improve our understanding of how irradiation influences high gravity atmospheres, provide insights towards planetary formation and evolution across this mass regime, and exploit JWST's unique potential to characterise exoplanets across the known population., Comment: 4 pages, 2 figures. This white paper was submitted following a call from the "Working Group on Strategic Exoplanet Initiatives with HST and JWST" (https://sites.google.com/view/exoplanet-strategy-wg, final report in 10.48550/arXiv.2404.02932)
Published: 2024

31. The 1964 paper of John Bell

Author: Sen, Ujjwal
Subjects: Quantum Physics
Abstract: We present a commentary on the famous 1964 paper of John Bell that rules out the entire class of underlying hidden variable theories for quantum mechanics that are local., Comment: Commentary, 4 pages, to appear in the November issue of Resonance
Published: 2024

32. Towards Zero-Shot Annotation of the Built Environment with Vision-Language Models (Vision Paper)

Author: Han, Bin, Yang, Yiwei, Caspi, Anat, and Howe, Bill
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Abstract: Equitable urban transportation applications require high-fidelity digital representations of the built environment: not just streets and sidewalks, but bike lanes, marked and unmarked crossings, curb ramps and cuts, obstructions, traffic signals, signage, street markings, potholes, and more. Direct inspections and manual annotations are prohibitively expensive at scale. Conventional machine learning methods require substantial annotated training data for adequate performance. In this paper, we consider vision language models as a mechanism for annotating diverse urban features from satellite images, reducing the dependence on human annotation to produce large training sets. While these models have achieved impressive results in describing common objects in images captured from a human perspective, their training sets are less likely to include strong signals for esoteric features in the built environment, and their performance in these settings is therefore unclear. We demonstrate proof-of-concept combining a state-of-the-art vision language model and variants of a prompting strategy that asks the model to consider segmented elements independently of the original image. Experiments on two urban features -- stop lines and raised tables -- show that while direct zero-shot prompting correctly annotates nearly zero images, the pre-segmentation strategies can annotate images with near 40% intersection-over-union accuracy. We describe how these results inform a new research agenda in automatic annotation of the built environment to improve equity, accessibility, and safety at broad scale and in diverse environments.
Published: 2024

33. CEERS Key Paper. IX. Identifying Galaxy Mergers in CEERS NIRCam Images Using Random Forests and Convolutional Neural Networks

Author: Rose, Caitlin, Kartaltepe, Jeyhan S., Snyder, Gregory F., Huertas-Company, Marc, Yung, L. Y. Aaron, Haro, Pablo Arrabal, Bagley, Micaela B., Bisigello, Laura, Calabrò, Antonello, Cleri, Nikko J., Dickinson, Mark, Ferguson, Henry C., Finkelstein, Steven L., Fontana, Adriano, Grazian, Andrea, Grogin, Norman A., Holwerda, Benne W., Iyer, Kartheik G., Kewley, Lisa J., Kirkpatrick, Allison, Kocevski, Dale D., Koekemoer, Anton M., Lotz, Jennifer M., Lucas, Ray A., Napolitan, Lorenzo, Papovich, Casey, Pentericci, Laura, Pérez-González, Pablo G., Pirzkal, Nor, Ravindranath, Swara, Somerville, Rachel S., Straughn, Amber N., Trump, Jonathan R., Wilkins, Stephen M., and Yang, Guang
Subjects: Astrophysics - Astrophysics of Galaxies
Abstract: A crucial yet challenging task in galaxy evolution studies is the identification of distant merging galaxies, a task which suffers from a variety of issues ranging from telescope sensitivities and limitations to the inherently chaotic morphologies of young galaxies. In this paper, we use random forests and convolutional neural networks to identify high-redshift JWST CEERS galaxy mergers. We train these algorithms on simulated $3
Published: 2024

34. Ball characterizations in planes and spaces of constant curvature, II (This pdf-file is not identical with the printed paper.)

Author: Jerónimo-Castro, J. and Makai Jr, E.
Subjects: Mathematics - Metric Geometry, 52A55
Abstract: High proved the following theorem. If the intersections of any two congruent copies of a plane convex body are centrally symmetric, then this body is a circle. In our paper we extend the theorem of High to the sphere and the hyperbolic plane, and partly to spaces of constant curvature. We also investigate the dual question about the convex hull of the unions, rather than the intersections. Let us have in $H^2$ proper closed convex subsets $K,L$ with interior points, such that the numbers of the connected components of the boundaries of $K$ and $L$ are finite. We exactly describe all pairs of such subsets $K,L$, whose any congruent copies have an intersection with axial symmetry; there are nine cases. (The cases of $S^2$ and $\mathbb{R}^2$ were described in Part I, i.e., [5].) Let us have in $S^d$, $\mathbb{R}^d$ or $H^d$ proper closed convex $C^2_+$ subsets $K,L$ with interior points, such that all sufficiently small intersections of their congruent copies are symmetric w.r.t. a particular hyperplane. Then the boundary components of both $K$ and $L$ are congruent, and each of them is a sphere, a parasphere or a hypersphere. Let us have a pair of convex bodies in $S^d$, $\mathbb{R}^d$ or $H^d$, which have at any boundary points supporting spheres (for $S^d$ of radius less than $\pi /2$). If the convex hull of the union of any congruent copies of these bodies is centrally symmetric, then our bodies are congruent balls (for $S^d$ of radius less than $\pi /2$). An analogous statement holds for symmetry w.r.t. a particular hyperplane. For $d=2$, suppose the existence of the above supporting circles (for $S^2$ of radius less than $\pi /2$), and, for $S^2$, smoothness of $K$ and $L$. If we suppose axial symmetry of all the above convex hulls, then our bodies are (incongruent) circles (for $S^2$ of radii less than $\pi /2$)., Comment: 48 pages. arXiv admin note: text overlap with arXiv:1601.04494
Published: 2024

35. Comments concerning the paper 'On the calibration of ultra-high energy EASs at the Yakutsk array and Telescope Array' by A. V. Glushkov et al.

Author: Matthews, John N. and Tsunesada, Yoshiki
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: In a recent review of a paper by the Yakutsk Group, submitted to the journal Physics of Atomic Nuclei and arXiv, the energy scales of the Yakutsk and Telescope Array (TA) experiments were examined. The authors developed a custom detector response simulator incorporating ionization, bremsstrahlung, pair production, and Compton scattering. Applying this simulator to both Yakutsk and TA surface detectors, they concluded that the TA energy scale might be incorrect due to a misdefined "response unit." They referenced the TA's "energy deposit formula" from the literature, scaling it by two factors attributed to the thickness and density of the TA scintillator. Their simulations, using the QGSJET-II-04 hadronic interaction model, agreed with TA's calculations for vertical showers but not for inclined showers, suggesting an incorrect VEM unit of 2.05 MeV. However, this conclusion was found to be incorrect. The TA's energy deposit formula, derived from detailed Monte Carlo simulations using GEANT4, accurately represents the most probable energy deposit by a charged particle in the TA scintillator. The value of 2.05 MeV accounts for the scintillator's thickness and density and is validated by excellent agreement between TA's simulated and observed data. The Yakutsk Group's misinterpretation and incorrect application of the TA formula led to their erroneous conclusion., Comment: 2 pages, comments on arXiv:2404.16948
Published: 2024

36. White Paper on Polarized Target Studies with Real Photons in Hall D

Author: Afzal, F., Dalton, M. M., Deur, A., Hurck, P., Keith, C. D., Mathieu, V., Sirca, S., and Yu, Z.
Subjects: Nuclear Experiment
Abstract: This white paper summarizes the Workshop on Polarized Target Studies with Real Photons in Hall D at Jefferson Lab, which took place on 21 February 2024. The Workshop included about 45 participants both online and in person at Florida State University in Tallahassee. Contributions describe the experimental infrastructure available in Hall D and potential physics applications. The rate and detection capabilities of Hall D are outlined, as well as the properties of a circularly polarized photon beam and a polarized target. Possible physics measurements include light and strange quark baryon spectroscopy, the GDH sum rule, proton structure accessed through measurement of Generalized Parton Distributions, and modification of nucleon structure within the nuclear medium., Comment: 25 pages, 8 figures
Published: 2024

37. Using LLMs to label medical papers according to the CIViC evidence model

Author: Hisch, Markus and Wang, Xing David
Subjects: Computer Science - Computation and Language
Abstract: We introduce the sequence classification problem CIViC Evidence to the field of medical NLP. CIViC Evidence denotes the multi-label classification problem of assigning labels of clinical evidence to abstracts of scientific papers which have examined various combinations of genomic variants, cancer types, and treatment approaches. We approach CIViC Evidence using different language models: We fine-tune pretrained checkpoints of BERT and RoBERTa on the CIViC Evidence dataset and challenge their performance with models of the same architecture which have been pretrained on domain-specific text. In this context, we find that BiomedBERT and BioLinkBERT can outperform BERT on CIViC Evidence (+0.8% and +0.9% absolute improvement in class-support weighted F1 score). All transformer-based models show a clear performance edge when compared to a logistic regression trained on bigram tf-idf scores (+1.5 - 2.7% improved F1 score). We compare the aforementioned BERT-like models to OpenAI's GPT-4 in a few-shot setting (on a small subset of our original test dataset), demonstrating that, without additional prompt-engineering or fine-tuning, GPT-4 performs worse on CIViC Evidence than our six fine-tuned models (66.1% weighted F1 score compared to 71.8% for the best fine-tuned model). However, performance gets reasonably close to the benchmark of a logistic regression model trained on bigram tf-idf scores (67.7% weighted F1 score).
Published: 2024

38. Report on some papers related to the function $\mathop{\mathcal R }(s)$ found by Siegel in Riemann's posthumous papers

Author: de Reyna, J. Arias
Subjects: Mathematics - Number Theory, Primary 11M06, Secondary 30D99
Abstract: In a letter to Weierstrass, Riemann asserted that the number $N_0(T)$ of zeros of $\zeta(s)$ on the critical line to height $T$ is approximately equal to the total number of zeros to this height $N(T)$. Siegel studied some posthumous papers of Riemann trying to find a proof of this. He found a function $\mathop{\mathcal R }(s)$ whose zeros are related to the zeros of the function $\zeta(s)$. Siegel concluded that Riemann's papers contained no ideas for a proof of his assertion, connected the position of the zeros of $\mathop{\mathcal R }(s)$ with the position of the zeros of $\zeta(s)$ and asked about the position of the zeros of $\mathop{\mathcal R }(s)$. This paper is a summary of several papers that we will soon upload to arXiv, in which we try to answer Siegel's question about the position of the zeros of $\mathop{\mathcal R }(s)$. The articles also contain improvements on Siegel's results and other possible ways to prove Riemann's assertion, but without achieving this goal., Comment: 18 pages, 4 figures. Added links to the papers in arXiv
Published: 2024

39. Multi-variable Quantification of BDDs in External Memory using Nested Sweeping (Extended Paper)

Author: Sølvsten, Steffan Christ and van de Pol, Jaco
Subjects: Computer Science - Data Structures and Algorithms, Computer Science - Databases, 68W30 (primary) 68Q60, 68R07 (secondary), E.1, F.2.2, I.1.2
Abstract: Previous research on the Adiar BDD package has been successful at designing algorithms capable of handling large Binary Decision Diagrams (BDDs) stored in external memory. To do so, it uses consecutive sweeps through the BDDs to resolve computations. Yet, this approach has kept algorithms for multi-variable quantification, the relational product, and variable reordering out of its scope. In this work, we address this by introducing the nested sweeping framework. Here, multiple concurrent sweeps pass information between each other to compute the result. We have implemented the framework in Adiar and used it to create a new external memory multi-variable quantification algorithm. Compared to conventional depth-first implementations, Adiar with nested sweeping is able to solve more instances of our benchmarks and/or solve them faster., Comment: 26 pages, 14 figures, 2 tables
Published: 2024

40. Rediscussion of eclipsing binaries. Paper XX. HO Tel

Author: Southworth, John
Subjects: Astrophysics - Solar and Stellar Astrophysics
Abstract: We present a detailed analysis of the detached eclipsing binary system HO Telescopii, which contains two A-type stars in a circular orbit of period 1.613 d. We use light curves from the Transiting Exoplanet Survey Satellite (TESS), which observed HO Tel in three sectors, to determine its photometric properties and a precise orbital ephemeris. We augment these results with radial velocity measurements from Surgit et al. to determine the masses and radii of the component stars: M_A = 1.906 +/- 0.031 Msun, M_B = 1.751 +/- 0.034 Msun, R_A = 2.296 +/- 0.027 Rsun and R_B = 2.074 +/- 0.028 Rsun. Combined with temperature measurements from Surgit et al. and optical-infrared apparent magnitudes from the literature, we find a distance to the system of 280.8 +/- 4.6 pc which agrees well with the distance from the Gaia DR3 parallax measurement. Theoretical predictions do not quite match the properties of the system, and there are small discrepancies in measurements of the spectroscopic orbits of the stars. Future observations from Gaia will allow further investigation of these issues., Comment: Accepted for publication in The Observatory. 12 pages, 5 tables, 3 black/white figures
Published: 2024

41. WHITE PAPER: A Brief Exploration of Data Exfiltration using GCG Suffixes

Author: Valbuena, Victor
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence
Abstract: The cross-prompt injection attack (XPIA) is an effective technique that can be used for data exfiltration, and that has seen increasing use. In this attack, the attacker injects a malicious instruction into third party data which an LLM is likely to consume when assisting a user, who is the victim. XPIA is often used as a means for data exfiltration, and the estimated cost of the average data breach for a business is nearly $4.5 million, which includes breaches such as compromised enterprise credentials. With the rise of gradient-based attacks such as the GCG suffix attack, the odds of an XPIA occurring which uses a GCG suffix are worryingly high. As part of my work in Microsoft's AI Red Team, I demonstrated a viable attack model using a GCG suffix paired with an injection in a simulated XPIA scenario. The results indicate that the presence of a GCG suffix can increase the odds of successful data exfiltration by nearly 20%, with some caveats., Comment: 8 pages, 8 figures. Conducted as part of employment at Microsoft Corporation
Published: 2024

42. Generative AI in Evidence-Based Software Engineering: A White Paper

Author: Esposito, Matteo, Janes, Andrea, Taibi, Davide, and Lenarduzzi, Valentina
Subjects: Computer Science - Software Engineering
Abstract: Context. In less than a year, practitioners and researchers witnessed a rapid and wide implementation of Generative Artificial Intelligence. The daily availability of new models proposed by practitioners and researchers has enabled quick adoption. Textual GAIs' capabilities enable researchers worldwide to explore new generative scenarios, simplifying and hastening all time-consuming text generation and analysis tasks. Motivation. The exponentially growing number of publications in our field, with the increased accessibility to information due to digital libraries, makes conducting systematic literature reviews and mapping studies an effort- and time-intensive task. Stemming from this challenge, we investigated and envisioned the role of GAIs in evidence-based software engineering. Future Directions. Based on our current investigation, we will follow up the vision with the creation and empirical validation of a comprehensive suite of models to effectively support EBSE researchers.
Published: 2024

43. A remark on the paper of Deninger and Murre

Author: Moonen, Ben
Subjects: Mathematics - Algebraic Geometry, 14K05, 14C15
Abstract: We show that the results proven by Deninger and Murre directly imply that the Chern classes of the de Rham bundle of an abelian scheme are torsion elements in the Chow ring, a result that was later proven by van der Geer. We also discuss several results about the orders of these classes.
Published: 2024

44. Can citations tell us about a paper's reproducibility? A case study of machine learning papers

Author: Obadage, Rochana R., Rajtmajer, Sarah M., and Wu, Jian
Subjects: Computer Science - Digital Libraries, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: The iterative character of work in machine learning (ML) and artificial intelligence (AI) and reliance on comparisons against benchmark datasets emphasize the importance of reproducibility in that literature. Yet, resource constraints and inadequate documentation can make running replications particularly challenging. Our work explores the potential of using downstream citation contexts as a signal of reproducibility. We introduce a sentiment analysis framework applied to citation contexts from papers involved in Machine Learning Reproducibility Challenges in order to interpret the positive or negative outcomes of reproduction attempts. Our contributions include training classifiers for reproducibility-related contexts and sentiment analysis, and exploring correlations between citation context sentiment and reproducibility scores. Study data, software, and an artifact appendix are publicly available at https://github.com/lamps-lab/ccair-ai-reproducibility ., Comment: 9 pages, 4 figures
Published: 2024

45. Two results from Mandelbaum's paper: 'The dynamic complementarity problem'

Author: Bass, Richard
Subjects: Mathematics - Probability, Mathematics - Optimization and Control, 60
Abstract: A draft of a paper by Mandelbaum, "The dynamic complementarity problem", was circulated in 1987, but has never been published. We give an exposition of two important results from that paper which are not readily accessible in the literature. The first is an example of a Skorokhod problem in two dimensions in the quadrant for which there is not uniqueness. The second is a proof of uniqueness for the Skorokhod problem in two dimensions in the quadrant in a critical case.
Published: 2024

46. LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

Author: Du, Jiangshu, Wang, Yibo, Zhao, Wenting, Deng, Zhongfen, Liu, Shuaiqi, Lou, Renze, Zou, Henry Peng, Venkit, Pranav Narayanan, Zhang, Nan, Srinath, Mukund, Zhang, Haoran Ranran, Gupta, Vipul, Li, Yinghui, Li, Tao, Wang, Fei, Liu, Qin, Liu, Tianlin, Gao, Pengzhi, Xia, Congying, Xing, Chen, Cheng, Jiayang, Wang, Zhaowei, Su, Ying, Shah, Raj Sanjay, Guo, Ruohao, Gu, Jing, Li, Haoran, Wei, Kangda, Wang, Zihao, Cheng, Lu, Ranathunga, Surangika, Fang, Meng, Fu, Jie, Liu, Fei, Huang, Ruihong, Blanco, Eduardo, Cao, Yixin, Zhang, Rui, Yu, Philip S., and Yin, Wenpeng
Subjects: Computer Science - Computation and Language
Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as they have to spend more time reading, writing, and reviewing papers. This raises the question: how can LLMs potentially assist researchers in alleviating their heavy workload? This study focuses on the topic of LLMs assisting NLP researchers, particularly examining the effectiveness of LLMs in assisting paper (meta-)reviewing and its recognizability. To address this, we constructed the ReviewCritique dataset, which includes two types of information: (i) NLP papers (initial submissions rather than camera-ready) with both human-written and LLM-generated reviews, and (ii) "deficiency" labels and corresponding explanations for individual segments of each review, annotated by experts. Using ReviewCritique, this study explores two threads of research questions: (i) "LLMs as Reviewers": how do reviews generated by LLMs compare with those written by humans in terms of quality and distinguishability? (ii) "LLMs as Metareviewers": how effectively can LLMs identify potential issues, such as deficient or unprofessional review segments, within individual paper reviews? To our knowledge, this is the first work to provide such a comprehensive analysis.
Published: 2024

47. Predicting Award Winning Research Papers at Publication Time

Author: Vella, Riccardo, Vitaletti, Andrea, and Silvestri, Fabrizio
Subjects: Computer Science - Information Retrieval
Abstract: In recent years, many studies have focused on predicting the scientific impact of research papers. Most of these predictions are based on citation counts or rely on features obtainable only from already published papers. In this study, we predict the likelihood of a research paper winning an award, relying only on information available at publication time. For each paper, we build the citation subgraph induced from its bibliography. We initially consider some features of this subgraph, such as the density and the global clustering coefficient, to make our prediction. Then, we mix this information with textual features, extracted from the abstract and the title, to obtain a more accurate final prediction. We ran our experiments on the ArnetMiner citation graph, while the ground truth on award-winning papers was obtained from a collection of best paper awards from 32 computer science conferences. In our experiment, we obtained an encouraging F1 score of 0.694. Remarkably, the high recall and the low false-negative rate show that the model performs very well at identifying papers that will not win an award. This behavior can help researchers get a first evaluation of their work at publication time. Lastly, we ran some first experiments on interpretability. Our results highlight some interesting patterns in both topological and textual features.
Published: 2024

48. Let's Get to the Point: LLM-Supported Planning, Drafting, and Revising of Research-Paper Blog Posts

Author: Radensky, Marissa, Weld, Daniel S., Chang, Joseph Chee, Siangliulue, Pao, and Bragg, Jonathan
Subjects: Computer Science - Human-Computer Interaction
Abstract: Research-paper blog posts help scientists disseminate their work to a larger audience, but translating papers into this format requires substantial additional effort. Blog post creation is not simply transforming a long-form article into a short output, as studied in most prior work on human-AI summarization. In contrast, blog posts are typically full-length articles that require a combination of strategic planning grounded in the source document, well-organized drafting, and thoughtful revisions. Can tools powered by large language models (LLMs) assist scientists in writing research-paper blog posts? To investigate this question, we conducted a formative study (N=6) to understand the main challenges of writing such blog posts with an LLM: high interaction costs for 1) reviewing and utilizing the paper content and 2) recurrent sub-tasks of generating and modifying the long-form output. To address these challenges, we developed Papers-to-Posts, an LLM-powered tool that implements a new Plan-Draft-Revise workflow, which 1) leverages an LLM to generate bullet points from the full paper to help users find and select content to include (Plan) and 2) provides default yet customizable LLM instructions for generating and modifying text (Draft, Revise). Through a within-subjects lab study (N=20) and between-subjects deployment study (N=37 blog posts, 26 participants) in which participants wrote blog posts about their papers, we compared Papers-to-Posts to a strong baseline tool that provides an LLM-generated draft and access to free-form LLM prompting. Results show that Papers-to-Posts helped researchers to 1) write significantly more satisfying blog posts and make significantly more changes to their blog posts in a fixed amount of time without a significant change in cognitive load (lab) and 2) make more changes to their blog posts for a fixed number of writing actions (deployment)., Comment: 28 pages, 9 figures in main text (not appendix)
Published: 2024

49. RelevAI-Reviewer: A Benchmark on AI Reviewers for Survey Paper Relevance

Author: Couto, Paulo Henrique, Ho, Quang Phuoc, Kumari, Nageeta, Rachmat, Benedictus Kent, Khuong, Thanh Gia Hieu, Ullah, Ihsan, and Sun-Hosoya, Lisheng
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Recent advancements in Artificial Intelligence (AI), particularly the widespread adoption of Large Language Models (LLMs), have significantly enhanced text analysis capabilities. This technological evolution offers considerable promise for automating the review of scientific papers, a task traditionally managed through peer review by fellow researchers. Despite its critical role in maintaining research quality, the conventional peer-review process is often slow and subject to biases, potentially impeding the swift propagation of scientific knowledge. In this paper, we propose RelevAI-Reviewer, an automatic system that conceptualizes the task of survey paper review as a classification problem, aimed at assessing the relevance of a paper in relation to a specified prompt, analogous to a "call for papers". To address this, we introduce a novel dataset comprised of 25,164 instances. Each instance contains one prompt and four candidate papers, each varying in relevance to the prompt. The objective is to develop a machine learning (ML) model capable of determining the relevance of each paper and identifying the most pertinent one. We explore various baseline approaches, including traditional ML classifiers like Support Vector Machine (SVM) and advanced language models such as BERT. Preliminary findings indicate that the BERT-based end-to-end classifier surpasses other conventional ML methods in performance. We present this problem as a public challenge to foster engagement and interest in this area of research.
Published: 2024

50. UruBots UAV -- Air Emergency Service Indoor Team Description Paper for FIRA 2024

Author: Sodre, Hiago, Barcelona, Sebastian, Scirgalea, Anthony, Macedo, Brandon, Sampson, Gabriel, Moraes, Pablo, Moraes, William, Saravia, Victoria, Deniz, Juan, Guterres, Bruna, Kelbouscas, Andre, and Grando, Ricardo
Subjects: Computer Science - Robotics
Abstract: This document describes the "UruBots" team for the 2024 FIRA Air League, "Air Emergency Service (Indoor)." We introduce our team and an autonomous Unmanned Aerial Vehicle (UAV) that relies on computer vision for its flight control. This UAV can perform a wide variety of navigation tasks in indoor environments without requiring the intervention of an external operator or any form of external processing, resulting in a significant decrease in workload and manual dependence. Additionally, our software has been designed to be compatible with the vehicle's structure and for its application to the competition circuit. In this paper, we detail additional aspects of the mechanical structure, software, and application to the FIRA competition., Comment: Team Description Paper for the FIRA RoboWorld Cup 2024
Published: 2024
