8,312,602 results for "Automatic"
Search Results
52. Influence of Key Parameters of Medicinal Aluminum Tube on Automatic Casing Process
- Author
Yan, Guoping, Ming, Zhengjun, Zhou, Junhong, Tao, Qi, and Li, Shihuang
- Published
- 2024
53. Iterative Model Predictive Control for Automatic Carrier Landing of Carrier-Based Aircrafts Under Complex Surroundings and Constraints
- Author
Zhang, Xiaotian, He, Defeng, and Liao, Fei
- Published
- 2024
54. Unsupervised Machine Learning for Automatic Image Segmentation of Impact Damage in CFRP Composites
- Author
Zhupanska, Olesya and Krokhmal, Pavlo
- Published
- 2024
55. A FairMOT approach based on video recognition for real-time automatic incident detection on expressways
- Author
Xiao, Daiquan, Wang, Zeyu, Shen, Zhenwu, Xu, Xuecai, and Ma, Changxi
- Published
- 2024
56. Automatic adjustment of the parameters of the temperature measurement algorithm in a wide range
- Author
Bondar, O. G., Brezhneva, E. O., and Botikov, K. A.
- Published
- 2024
57. Improved wafer map defect pattern classification using automatic data augmentation based lightweight encoder network in contrastive learning
- Author
Sheng, Yi, Yan, Jinda, and Piao, Minghao
- Published
- 2024
58. A new thread-level speculative automatic parallelization model and library based on duplicate code execution
- Author
Martínez, Millán A., Fraguela, Basilio B., Cabaleiro, José C., and Rivera, Francisco F.
- Published
- 2024
59. Achieving improved stability for automatic voltage regulation with fractional-order PID plus double-derivative controller and mountain gazelle optimizer
- Author
Izci, Davut, Abualigah, Laith, Can, Özay, Andiç, Cenk, and Ekinci, Serdar
- Published
- 2024
60. Automatic Screening of COVID-19 Using an Optimized Generative Adversarial Network
- Author
Goel, Tripti, Murugan, R., Mirjalili, Seyedali, and Chakrabartty, Deba Kumar
- Published
- 2024
61. ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting
- Author
Jia, Chengyou, Xia, Changliang, Dang, Zhuohang, Wu, Weijia, Qian, Hangwei, and Luo, Minnan
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Despite the significant advancements in text-to-image (T2I) generative models, users often face a trial-and-error challenge in practical scenarios. This challenge arises from the complexity and uncertainty of tedious steps such as crafting suitable prompts, selecting appropriate models, and configuring specific arguments, making users resort to labor-intensive attempts for desired images. This paper proposes Automatic T2I generation, which aims to automate these tedious steps, allowing users to simply describe their needs in a freestyle chatting way. To systematically study this problem, we first introduce ChatGenBench, a novel benchmark designed for Automatic T2I. It features high-quality paired data with diverse freestyle inputs, enabling comprehensive evaluation of automatic T2I models across all steps. Additionally, recognizing Automatic T2I as a complex multi-step reasoning task, we propose ChatGen-Evo, a multi-stage evolution strategy that progressively equips models with essential automation skills. Through extensive evaluation across step-wise accuracy and image quality, ChatGen-Evo significantly enhances performance over various baselines. Our evaluation also uncovers valuable insights for advancing automatic T2I. All our data, code, and models will be available at https://chengyou-jia.github.io/ChatGen-Home.
- Published
- 2024
62. MiceBoneChallenge: Micro-CT public dataset and six solutions for automatic growth plate detection in micro-CT mice bone scans
- Author
Burlutskiy, Nikolay, Kekic, Marija, de la Torre, Jordi, Plewa, Philipp, Boroumand, Mehdi, Jurkowska, Julia, Venovski, Borjan, Biagi, Maria Chiara, Hagos, Yeman Brhane, Malinowska-Traczyk, Roksana, Wang, Yibo, Zalewski, Jacek, Sawczuk, Paula, Pintarić, Karlo, Yousefi, Fariba, and Hultin, Leif
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
- Abstract
Detecting and quantifying bone changes in micro-CT scans of rodents is a common task in preclinical drug development studies. However, this task is manual, time-consuming and subject to inter- and intra-observer variability. In 2024, Anonymous Company organized an internal challenge to develop models for automatic bone quantification. We prepared and annotated a high-quality dataset of 3D µCT bone scans from 83 mice. The challenge attracted over 80 AI scientists from around the globe who formed 23 teams. The participants were tasked with developing a solution to identify the plane where the bone growth happens, which is essential for fully automatic segmentation of trabecular bone. As a result, six computer vision solutions were developed that can accurately identify the location of the growth plate plane. The solutions achieved a mean absolute error of 1.91 ± 0.87 planes from the ground truth on the test set, an accuracy level acceptable for practical use by a radiologist. The annotated 3D scan dataset, along with the six solutions and source code, is being made public, providing researchers with opportunities to develop and benchmark their own approaches. The code, trained models, and the data will be shared.
- Comment
Under Review
- Published
- 2024
63. Do Automatic Factuality Metrics Measure Factuality? A Critical Evaluation
- Author
Ramprasad, Sanjana and Wallace, Byron C.
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Modern LLMs can now produce highly readable abstractive summaries, to the point where traditional automated metrics for evaluating summary quality, such as ROUGE, have become saturated. However, LLMs still sometimes introduce unwanted content into summaries, i.e., information inconsistent with or unsupported by their source. Measuring the occurrence of these often subtle "hallucinations" automatically has proved to be challenging. This in turn has motivated development of a variety of metrics intended to measure the factual consistency of generated summaries against their source. But are these approaches measuring what they purport to do? In this work, we stress-test automatic factuality metrics. Specifically, we investigate whether and to what degree superficial attributes of summary texts suffice to predict "factuality", finding that a (supervised) model using only such shallow features is reasonably competitive with SOTA factuality scoring methods. We then evaluate how factuality metrics respond to factual corrections in inconsistent summaries and find that only a few show meaningful improvements. In contrast, some metrics are more sensitive to benign, non-factual edits. Motivated by these insights, we show that one can "game" (most) automatic factuality metrics, i.e., reliably inflate "factuality" scores by appending innocuous sentences to generated summaries. Taken together, our results raise questions about the degree to which we should rely on existing automated factuality metrics and what exactly we want "factuality metrics" to measure.
- Published
- 2024
64. Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark
- Author
Tu, Rong-Cheng, Ma, Zi-Ao, Lan, Tian, Zhao, Yuehao, Huang, Heyan, and Mao, Xian-Ling
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Driven by the remarkable progress in diffusion models, text-to-image generation has made significant strides, creating a pressing demand for automatic quality evaluation of generated images. Current state-of-the-art automatic evaluation methods heavily rely on Multi-modal Large Language Models (MLLMs), particularly powerful commercial models like GPT-4o. While these models are highly effective, their substantial costs limit scalability in large-scale evaluations. Adopting open-source MLLMs is an alternative; however, their performance falls short due to significant limitations in processing multi-modal data compared to commercial MLLMs. To tackle these problems, we first propose a task decomposition evaluation framework based on GPT-4o to automatically construct a new training dataset, where the complex evaluation task is decoupled into simpler sub-tasks, effectively reducing the learning complexity. Based on this dataset, we design innovative training strategies to effectively distill GPT-4o's evaluation capabilities into a 7B open-source MLLM, MiniCPM-V-2.6. Furthermore, to reliably and comprehensively assess prior works and our proposed model, we manually annotate a meta-evaluation benchmark that includes chain-of-thought explanations alongside quality scores for generated images. Experimental results demonstrate that our distilled open-source MLLM significantly outperforms the current state-of-the-art GPT-4o-based baseline, VIEScore, with over 4.6% improvement in Spearman and Kendall correlations with human judgments.
- Published
- 2024
65. Deep Learning-Based Automatic Delineation of Liver Domes in kV Triggered Images for Online Breath-hold Reproducibility Verification of Liver Stereotactic Body Radiation Therapy
- Author
Weragoda, Sugandima, Xia, Ping, Stephans, Kevin, Woody, Neil, Martens, Michael, Brown, Robert, and Guo, Bingqi
- Subjects
Physics - Medical Physics, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Stereotactic Body Radiation Therapy (SBRT) can be a precise, minimally invasive treatment method for liver cancer and liver metastases. However, the effectiveness of SBRT relies on the accurate delivery of the dose to the tumor while sparing healthy tissue. Challenges persist in ensuring breath-hold reproducibility, with current methods often requiring manual verification of liver dome positions from kV-triggered images. To address this, we propose a proof-of-principle study of a deep learning-based pipeline to automatically delineate the liver dome from kV-planar images. From 24 patients who received SBRT for liver cancer or metastases in the liver, 711 kV-triggered images acquired for online breath-hold verification were included in the current study. We developed a pipeline comprising a trained U-Net for automatic liver dome region segmentation from the triggered images, followed by extraction of the liver dome via thresholding, edge detection, and morphological operations. The performance and generalizability of the pipeline were evaluated using 2-fold cross validation. The training of the U-Net model for liver region segmentation took under 30 minutes, and the automatic delineation of a liver dome for any triggered image took less than one second. The RMSE and rate of detection for Fold1 with 366 images were (6.4 +/- 1.6) mm and 91.7%, respectively. For Fold2 with 345 images, the RMSE and rate of detection were (7.7 +/- 2.3) mm and 76.3%, respectively.
- Published
- 2024
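The pipeline in entry 65 ends with classical post-processing of the U-Net output: thresholding, edge detection, and morphological operations. A minimal sketch of that stage in Python/OpenCV, with illustrative threshold and kernel values rather than the authors' tuned ones:

```python
import cv2
import numpy as np

# Stand-in for a U-Net liver-region probability map (256x256, values in [0, 1]).
prob_map = np.random.rand(256, 256).astype(np.float32)

# 1. Threshold the probability map into a binary liver-region mask.
_, mask = cv2.threshold((prob_map * 255).astype(np.uint8), 127, 255,
                        cv2.THRESH_BINARY)

# 2. Morphological opening and closing to remove speckle and fill small holes.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

# 3. Edge detection; the topmost edge pixel per column traces the dome.
edges = cv2.Canny(mask, 100, 200)
ys, xs = np.nonzero(edges)
dome = {x: ys[xs == x].min() for x in np.unique(xs)}
print(f"dome delineated over {len(dome)} columns")
```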
66. Automatic brain tumor segmentation in 2D intra-operative ultrasound images using MRI tumor annotations
- Author
Faanes, Mathilde, Helland, Ragnhild Holden, Solheim, Ole, and Reinertsen, Ingerid
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, I.4.6, J.3
- Abstract
Automatic segmentation of brain tumors in intra-operative ultrasound (iUS) images could facilitate localization of tumor tissue during resection surgery. The lack of large annotated datasets limits current models' performance. In this paper, we investigate the use of tumor annotations in pre-operative MRI images, which are more easily accessible than annotations in iUS images, for training deep learning models for iUS brain tumor segmentation. We used 180 annotated pre-operative MRI images with corresponding unannotated iUS images, and 29 annotated iUS images. Image registration was performed to transfer the MRI annotations to the corresponding iUS images before training models with the nnU-Net framework. To validate the use of MRI labels, the models were compared to a model trained with only US annotated tumors, and a model with both US and MRI annotated tumors. In addition, the results were compared to annotations validated by an expert neurosurgeon on the same test set to measure inter-observer variability. The results showed similar performance for a model trained with only MRI annotated tumors compared to a model trained with only US annotated tumors. The model trained using both modalities obtained slightly better results, with an average Dice score of 0.62, where external expert annotations achieved a score of 0.67. The results also showed that the deep learning models were comparable to expert annotation for larger tumors (> 200 mm²) but performed clearly worse for smaller tumors (< 200 mm²). This shows that MRI tumor annotations can be used as a substitute for US tumor annotations to train a deep learning model for automatic brain tumor segmentation in intra-operative ultrasound images. Small tumors are a limitation of the current models and will be the focus of future work. The main models are available here: https://github.com/mathildefaanes/us_brain_tumor_segmentation.
- Comment
19 pages, 8 figures, submitted to the International Journal of Computer Assisted Radiology and Surgery
- Published
- 2024
67. ComfyGI: Automatic Improvement of Image Generation Workflows
- Author
Sobania, Dominik, Briesch, Martin, and Rothlauf, Franz
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Neural and Evolutionary Computing
- Abstract
Automatic image generation is no longer just of interest to researchers, but also to practitioners. However, current models are sensitive to the settings used, and automatic optimization methods often require human involvement. To bridge this gap, we introduce ComfyGI, a novel approach driven by techniques from genetic improvement that automatically improves workflows for image generation without the need for human intervention. This enables image generation with significantly higher quality in terms of alignment with the given description and perceived aesthetics. On the performance side, we find that, overall, the images generated with an optimized workflow are about 50% better than those from the initial workflow in terms of the median ImageReward score. These already good results are even surpassed in our human evaluation, as the participants preferred the images improved by ComfyGI in around 90% of the cases.
- Published
- 2024
68. Cyborg Insect Factory: Automatic Assembly System to Build up Insect-computer Hybrid Robot Based on Vision-guided Robotic Arm Manipulation of Custom Bipolar Electrodes
- Author
Lin, Qifeng, Vuong, Nghia, Song, Kewei, Tran-Ngoc, Phuoc Thanh, Nonato, Greg Angelo Gonzales, and Sato, Hirotaka
- Subjects
Computer Science - Robotics
- Abstract
The advancement of insect-computer hybrid robots holds significant promise for navigating complex terrains and enhancing robotics applications. This study introduced an automatic assembly method for insect-computer hybrid robots, accomplished by mounting a backpack with precise implantation of custom-designed bipolar electrodes. We developed a stimulation protocol for the intersegmental membrane between the pronotum and mesothorax of the Madagascar hissing cockroach, allowing the bipolar electrodes to be implanted automatically by a robotic arm. The assembly process was integrated with a deep learning-based vision system to accurately identify the implantation site, and a dedicated structure to hold the insect in place (68 s for the whole assembly process). The automatically assembled hybrid robots demonstrated steering control (over 70 degrees for 0.4 s stimulation) and deceleration control (68.2% speed reduction for 0.4 s stimulation), matching the performance of manually assembled systems. Furthermore, a multi-agent system consisting of 4 hybrid robots successfully covered obstructed outdoor terrain (80.25% for 10 minutes 31 seconds), highlighting the feasibility of mass-producing these systems for practical applications. The proposed automatic assembly strategy reduced preparation time for the insect-computer hybrid robots while maintaining their precise control, laying a foundation for scalable production and deployment in real-world applications.
- Published
- 2024
69. Classification of Stable Surfaces with respect to Automatic Continuity
- Author
Bestvina, Mladen, Domat, George, and Rafi, Kasra
- Subjects
Mathematics - Geometric Topology, Mathematics - Group Theory, 57S05, 57K20, 20F65, 22A05
- Abstract
We provide a complete classification of when the homeomorphism group of a stable surface, $\Sigma$, has the automatic continuity property: any homomorphism from Homeo$(\Sigma)$ to a separable group is necessarily continuous. This result descends to a classification of when the mapping class group of $\Sigma$ has the automatic continuity property. Towards this classification, we provide a general framework for proving automatic continuity for groups of homeomorphisms. Applying this framework, we also show that the homeomorphism group of any stable second countable Stone space has the automatic continuity property. In the presence of stability, this answers two questions of Mann.
- Comment
37 pages, 5 figures
- Published
- 2024
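For reference, the automatic continuity property classified in entry 69 admits a one-line formal statement (this is the standard formulation, with notation following the abstract):

```latex
% Automatic continuity for the homeomorphism group of a stable surface \Sigma:
\operatorname{Homeo}(\Sigma)\ \text{has automatic continuity}
\iff \text{every group homomorphism } \varphi\colon \operatorname{Homeo}(\Sigma)\to H,
\text{ with } H \text{ a separable topological group, is continuous.}
```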
70. Improving the solver for the Balitsky-Kovchegov evolution equation with Automatic Differentiation
- Author
Cougoulic, Florian, Korcyl, Piotr, and Stebel, Tomasz
- Subjects
High Energy Physics - Phenomenology
- Abstract
The Balitsky-Kovchegov (BK) evolution equation is an equation derived from perturbative Quantum Chromodynamics that allows one to calculate the scattering amplitude of a quark-antiquark pair off a hadron target, called the dipole amplitude, as a function of the collision energy. The initial condition, being a non-perturbative object, usually has to be modeled separately. Typically, the model contains several tunable parameters that are determined by fitting to experimental data. In this contribution, we propose an implementation of the BK solver using differentiable programming. Automatic differentiation offers the possibility that the first and second derivatives of the amplitude with respect to the initial condition parameters are automatically calculated at all stages of the simulation. This fact should considerably facilitate and speed up the fitting step. Moreover, in the context of Transverse Momentum Distributions (TMD), we demonstrate that automatic differentiation can be used to obtain the first and second derivatives of the amplitude with respect to the quark-antiquark separation. These derivatives can be used to relate various TMD functions to the dipole amplitude. Our C++ code for the solver, which is available in a public repository [1], includes the Balitsky one-loop running coupling prescription and the kinematic constraint. This version of the BK equation is widely used in the small-x evolution framework.
- Comment
17 pages, 6 figures, source code is published in the repository
- Published
- 2024
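The mechanism behind entry 70, automatic differentiation carrying derivatives of the amplitude with respect to initial-condition parameters through the whole computation, can be sketched in a few lines of forward-mode Python. The GBW-style toy initial condition and the parameter name Qs are illustrative assumptions, not the paper's C++ solver:

```python
import math

class Dual:
    """Forward-mode dual number a + b*eps, with eps**2 = 0."""
    def __init__(self, val, der=0.0):
        self.val, self.der = float(val), float(der)
    def _wrap(self, x):
        return x if isinstance(x, Dual) else Dual(x)
    def __add__(self, o):
        o = self._wrap(o)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__
    def __sub__(self, o):
        o = self._wrap(o)
        return Dual(self.val - o.val, self.der - o.der)
    def __rsub__(self, o):
        return self._wrap(o) - self
    def __mul__(self, o):
        o = self._wrap(o)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)
    __rmul__ = __mul__
    def exp(self):
        e = math.exp(self.val)
        return Dual(e, e * self.der)

def dipole_amplitude(r, qs):
    """Toy GBW-style initial condition N(r) = 1 - exp(-(r*Qs)^2 / 4)."""
    return 1.0 - (-0.25 * (r * qs) * (r * qs)).exp()

# Seed dQs/dQs = 1 so the derivative rides along with the value.
qs = Dual(1.0, 1.0)
n = dipole_amplitude(0.5, qs)
print(n.val, n.der)   # amplitude and its derivative w.r.t. Qs, in one pass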
71. SG-LRA: Self-Generating Automatic Scoliosis Cobb Angle Measurement with Low-Rank Approximation
- Author
Shao, Zhiwen, Yuan, Yichen, Ma, Lizhuang, Yeung, Dit-Yan, and Zhu, Xiaojia
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Automatic Cobb angle measurement from X-ray images is crucial for scoliosis screening and diagnosis. However, most existing regression-based and segmentation-based methods struggle with inaccurate spine representations or mask connectivity/fragmentation issues, while landmark-based methods suffer from insufficient training data and annotations. To address these challenges, we propose a novel framework comprising a Self-Generation pipeline and a Low-Rank Approximation representation (SG-LRA) for automatic Cobb angle measurement. Specifically, we propose a parameterized spine contour representation based on LRA, which enables eigen-spine decomposition and spine contour reconstruction. We can directly obtain the spine contour from only regressed LRA coefficients, which form a more accurate spine representation than rectangular boxes. Also, we combine LRA coefficient regression with anchor box classification to solve inaccurate predictions and mask connectivity issues. Moreover, we develop a data engine with automatic annotation and automatic selection in an iterative manner, trained on a private Spinal2023 dataset. With our data engine, we generate Spinal-AI2024, the largest scoliosis X-ray dataset, largely free of privacy leaks. Extensive experiments on the public AASCE2019, private Spinal2023, and generated Spinal-AI2024 datasets demonstrate that our method achieves state-of-the-art Cobb angle measurement performance. Our code and the Spinal-AI2024 dataset are available at https://github.com/Ernestchenchen/SG-LRA and https://github.com/Ernestchenchen/Spinal-AI2024, respectively.
- Published
- 2024
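The contour representation in entry 71, reconstructing a spine contour from a handful of regressed coefficients over an eigen-spine basis, is essentially a low-rank (PCA-style) decomposition. A minimal sketch on synthetic data; the landmark count and number of modes are assumptions, not the paper's values:

```python
import numpy as np

rng = np.random.default_rng(0)
# 500 synthetic "spine contours", each 68 landmarks flattened to (x, y) pairs.
contours = rng.normal(size=(500, 2 * 68))

mean = contours.mean(axis=0)
# SVD of the centered data; rows of vt act as "eigen-spines".
_, _, vt = np.linalg.svd(contours - mean, full_matrices=False)
basis = vt[:12]                      # keep a low-rank basis of 12 modes

def encode(contour):
    """Project a full contour onto the 12 LRA coefficients."""
    return basis @ (contour - mean)

def decode(coeffs):
    """Reconstruct a full contour from (regressed) LRA coefficients."""
    return mean + coeffs @ basis

coeffs = encode(contours[0])
recon = decode(coeffs)
print(np.linalg.norm(recon - contours[0]))   # residual outside the 12 modes
```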
72. Qurts: Automatic Quantum Uncomputation by Affine Types with Lifetime
- Author
Hirata, Kengo and Heunen, Chris
- Subjects
Computer Science - Programming Languages, Quantum Physics, D.3.1, F.3.1, F.3.2
- Abstract
Uncomputation is a feature in quantum programming that allows the programmer to discard a value without losing quantum information, and that allows the compiler to reuse resources. Whereas quantum information has to be treated linearly by the type system, automatic uncomputation enables the programmer to treat it affinely to some extent. Automatic uncomputation requires a substructural type system between linear and affine, a subtlety that has only been captured by existing languages in an ad hoc way. We extend the Rust type system to the quantum setting to give a uniform framework for automatic uncomputation called Qurts (pronounced quartz). Specifically, we parameterise types by lifetimes, permitting them to be affine during their lifetime, while being restricted to linear use outside their lifetime. We also provide two operational semantics: one based on classical simulation, and one that does not depend on any specific uncomputation strategy.
- Comment
59 pages
- Published
- 2024
73. HIST-AID: Leveraging Historical Patient Reports for Enhanced Multi-Modal Automatic Diagnosis
- Author
Huang, Haoxu, Deniz, Cem M., Cho, Kyunghyun, Chopra, Sumit, and Madaan, Divyam
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
- Abstract
Chest X-ray imaging is a widely accessible and non-invasive diagnostic tool for detecting thoracic abnormalities. While numerous AI models assist radiologists in interpreting these images, most overlook patients' historical data. To bridge this gap, we introduce the Temporal MIMIC dataset, which integrates five years of patient history, including radiographic scans and reports from MIMIC-CXR and MIMIC-IV, encompassing 12,221 patients and thirteen pathologies. Building on this, we present HIST-AID, a framework that enhances automatic diagnostic accuracy using historical reports. HIST-AID emulates the radiologist's comprehensive approach, leveraging historical data to improve diagnostic accuracy. Our experiments demonstrate significant improvements, with AUROC increasing by 6.56% and AUPRC by 9.51% compared to models that rely solely on radiographic scans. These gains were consistently observed across diverse demographic groups, including variations in gender, age, and racial categories. We show that while recent data boost performance, older data may reduce accuracy due to changes in patient conditions. Our work demonstrates the potential of incorporating historical data for more reliable automatic diagnosis, providing critical support for clinical decision-making.
- Comment
In Proceedings of Machine Learning for Health
- Published
- 2024
74. Interactive Cycle Model -- The Linkage Combination among Automatic Speech Recognition, Large Language Models and Smart Glasses
- Author
Wang, Libo
- Subjects
Computer Science - Human-Computer Interaction
- Abstract
This research proposes the interaction loop model "ASR-LLM-Smart Glasses", which combines automatic speech recognition, a large language model, and smart glasses to facilitate seamless human-computer interaction. The methodology decomposes the interaction process into stages and elements: speech is captured and processed by ASR, then analyzed and interpreted by the LLM, and the results are transmitted to the smart glasses for display; the feedback loop is complete when the user interacts with the displayed data. Mathematical formulas are used to quantify the performance of the model around three core evaluation points: accuracy, coherence, and latency during ASR speech-to-text conversion. The results are presented theoretically to test and evaluate the feasibility and performance of the model. Although such human-computer interaction products have not yet appeared in industry, the model's performance indicators for enhancing user experience in fields that rely on human-computer interaction verify its utility as a technology for promoting human-computer interaction. In addition, this research pioneers the integration of cutting-edge technologies such as generative pre-trained Transformer models into a unique interaction model, offering a new perspective for evaluating and enhancing human-computer interaction. Keywords: Automatic speech recognition, Large Language Model, Smart glasses, Interaction mechanism
- Comment
OpenReview submission; 11 pages of text and 1 figure
- Published
- 2024
75. CorrectBench: Automatic Testbench Generation with Functional Self-Correction using LLMs for HDL Design
- Author
Qiu, Ruidi, Zhang, Grace Li, Drechsler, Rolf, Schlichtmann, Ulf, and Li, Bing
- Subjects
Computer Science - Software Engineering
- Abstract
Functional simulation is an essential step in digital hardware design. Recently, there has been growing interest in leveraging Large Language Models (LLMs) for hardware testbench generation tasks. However, the inherent instability associated with LLMs often leads to functional errors in the generated testbenches. Previous methods lack automatic functional correction mechanisms that work without human intervention and still suffer from low success rates, especially on sequential tasks. To address this issue, we propose CorrectBench, an automatic testbench generation framework with functional self-validation and self-correction. Utilizing only the RTL specification in natural language, the proposed approach can validate the correctness of the generated testbenches with a success rate of 88.85%. Furthermore, the proposed LLM-based corrector employs bug information obtained during the self-validation process to perform functional self-correction on the generated testbenches. The comparative analysis demonstrates that our method achieves a pass ratio of 70.13% across all evaluated tasks, compared with the previous LLM-based testbench generation framework's 52.18% and a direct LLM-based generation method's 33.33%. Specifically, on sequential circuits, our work's performance is 62.18% higher than previous work and almost 5 times the pass ratio of the direct method. The code and experimental results are open-sourced at https://github.com/AutoBench/CorrectBench.
- Published
- 2024
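The self-validation and self-correction loop of entry 75 follows a generate-check-repair pattern. A minimal sketch; the llm() stub and the prompts are hypothetical placeholders, not CorrectBench's actual implementation:

```python
def llm(prompt: str) -> str:
    """Hypothetical stub: route this call to an LLM of your choice."""
    raise NotImplementedError

def generate_testbench(rtl_spec: str, max_rounds: int = 3) -> str:
    # Initial generation from the natural-language RTL specification alone.
    tb = llm(f"Write an HDL testbench for this RTL specification:\n{rtl_spec}")
    for _ in range(max_rounds):
        # Self-validation: ask the model to check its own testbench.
        report = llm("List functional bugs in this testbench, or reply PASS.\n"
                     f"SPEC:\n{rtl_spec}\nTESTBENCH:\n{tb}")
        if report.strip() == "PASS":
            return tb                      # validation passed
        # Self-correction: feed the bug report back into a repair prompt.
        tb = llm(f"Fix these bugs:\n{report}\n\nTESTBENCH:\n{tb}")
    return tb
```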
76. CRepair: CVAE-based Automatic Vulnerability Repair Technology
- Author
Liu, Penghui, Bi, Yingzhou, Huang, Jiangtao, Jiang, Xinxin, and Wang, Lianmei
- Subjects
Computer Science - Software Engineering, Computer Science - Artificial Intelligence
- Abstract
Software vulnerabilities are flaws in computer software systems that pose significant threats to the integrity, security, and reliability of modern software and its application data. These vulnerabilities can lead to substantial economic losses across various industries. Manual vulnerability repair is not only time-consuming but also prone to errors. To address the challenges of vulnerability repair, researchers have proposed various solutions, with learning-based automatic vulnerability repair techniques gaining widespread attention. However, existing methods often focus on learning more vulnerability data to improve repair outcomes, while neglecting the diverse characteristics of vulnerable code, and suffer from imprecise vulnerability localization. To address these shortcomings, this paper proposes CRepair, a CVAE-based automatic vulnerability repair technology aimed at fixing security vulnerabilities in system code. We first preprocess the vulnerability data using a prompt-based method to serve as input to the model. Then, we apply causal inference techniques to map the vulnerability feature data to probability distributions. By employing multi-sample feature fusion, we capture diverse vulnerability feature information. Finally, conditional control is used to guide the model in repairing the vulnerabilities. Experimental results demonstrate that the proposed method significantly outperforms other benchmark models, achieving a perfect repair rate of 52%. The effectiveness of the approach is validated from multiple perspectives, advancing AI-driven code vulnerability repair and showing promising applications.
- Published
- 2024
77. Automatic Authoring of Physical and Perceptual/Affective Motion Effects for Virtual Reality
- Author
Lee, Jiwan and Choi, Seungmoon
- Subjects
Computer Science - Human-Computer Interaction
- Abstract
This demo is about automatic authoring of various motion effects that are provided with audiovisual content to improve user experiences. Traditionally, motion effects have been used in simulators, e.g., flight simulators for pilots and astronauts, to present physically accurate vestibular feedback. At present, motion effects see much wider use for entertainment purposes, such as 4D rides in amusement parks and even shopping malls, 4D films in theaters, and relatively new virtual reality games with head-mounted displays and personal motion platforms. However, the production of motion effects is done solely by manual authoring or coding, and this costly process prevents the faster and wider dissemination of 4D content. It is imperative to facilitate motion effect production by providing automatic synthesis algorithms. This demo video presents nine different automatic synthesis algorithms for motion effects and a recorded demonstration of each.
- Comment
Part of the proceedings of the 6th International Conference AsiaHaptics 2024
- Published
- 2024
78. A Reinforcement Learning-Based Automatic Video Editing Method Using Pre-trained Vision-Language Model
- Author
Hu, Panwen, Xiao, Nan, Li, Feifei, Chen, Yongquan, and Huang, Rui
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
In this era of videos, automatic video editing techniques attract more and more attention from industry and academia, since they can reduce workloads and lower the requirements for human editors. Existing automatic editing systems are mainly scene- or event-specific, e.g., soccer game broadcasting, yet automatic systems for general editing, e.g., movie or vlog editing covering various scenes and events, have rarely been studied before, and converting an event-driven editing method to a general scene is nontrivial. In this paper, we propose a two-stage scheme for general editing. Firstly, unlike previous works that extract scene-specific features, we leverage a pre-trained Vision-Language Model (VLM) to extract editing-relevant representations as editing context. Moreover, to close the gap between professional-looking videos and the automatic productions generated with simple guidelines, we propose a Reinforcement Learning (RL)-based editing framework to formulate the editing problem and train the virtual editor to make better sequential editing decisions. Finally, we evaluate the proposed method on a more general editing task with a real movie dataset. Experimental results demonstrate the effectiveness and benefits of the proposed context representation and the learning ability of our RL-based editing framework.
- Published
- 2024
79. A multi-purpose automatic editing system based on lecture semantics for remote education
- Author
Hu, Panwen and Huang, Rui
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Multimedia
- Abstract
Remote teaching has become popular recently due to its convenience and safety, especially under extreme circumstances like a pandemic. However, online students usually have a poor experience, since the information acquired from the views provided by the broadcast platforms is limited. One potential solution is to show more camera views simultaneously, but this is technically challenging and distracting for the viewers. Therefore, an automatic multi-camera directing/editing system, which aims at selecting the most relevant view at each time instant to guide the attention of online students, is in urgent demand. However, existing systems mostly make simple assumptions and focus on tracking the position of the speaker instead of the real lecture semantics, and therefore have limited capacity to deliver an optimal information flow. To this end, this paper proposes an automatic multi-purpose editing system based on lecture semantics, which can both direct multiple video streams for real-time broadcasting and edit the optimal video offline for review purposes. Our system directs the views by semantically analyzing the class events while following professional directing rules, mimicking a human director to capture the regions of interest from the viewpoint of the onsite students. We conduct both qualitative and quantitative analyses to verify the effectiveness of the proposed system and its components.
- Published
- 2024
80. Automatic Structured Pruning for Efficient Architecture in Federated Learning
- Author
Nguyen, Thai Vu, Le, Long Bao, and Avila, Anderson
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
In Federated Learning (FL), training is conducted on client devices, typically with limited computational resources and storage capacity. To address these constraints, we propose an automatic pruning scheme tailored for FL systems. Our solution improves computation efficiency on client devices while minimizing communication costs. One of the challenges of tuning pruning hyper-parameters in FL systems is the restricted access to local data. Thus, we introduce an automatic pruning paradigm that dynamically determines pruning boundaries. Additionally, we utilize a structured pruning algorithm optimized for mobile devices that lack hardware support for sparse computations. Experimental results demonstrate the effectiveness of our approach, achieving accuracy comparable to existing methods. Our method notably reduces the number of parameters by 89% and FLOPS by 90%, with minimal impact on accuracy for the FEMNIST and CelebFaces datasets. Furthermore, our pruning method decreases communication overhead by up to 5x and halves inference time when deployed on Android devices.
- Published
- 2024
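Structured pruning, as used in entry 80, zeroes whole filters rather than scattered weights, so the pruned model runs without sparse-computation support. A minimal PyTorch sketch with a fixed pruning ratio standing in for the paper's automatically determined boundaries:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Small CNN for 28x28 single-channel inputs (illustrative architecture).
model = nn.Sequential(
    nn.Conv2d(1, 32, 3), nn.ReLU(),
    nn.Conv2d(32, 64, 3), nn.ReLU(),
    nn.Flatten(), nn.Linear(64 * 24 * 24, 10),
)

for module in model.modules():
    if isinstance(module, nn.Conv2d):
        # Zero out 50% of output channels (whole filters) by their L2 norm.
        prune.ln_structured(module, name="weight", amount=0.5, n=2, dim=0)
        prune.remove(module, "weight")   # bake the zeros into the tensor

zeroed = (model[2].weight.detach().abs().sum(dim=(1, 2, 3)) == 0).sum()
print(f"{int(zeroed)} of 64 filters zeroed in the second conv layer")
```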
81. Automatic programming via large language models with population self-evolution for dynamic job shop scheduling problem
- Author
Huang, Jin, Li, Xinyu, Gao, Liang, Liu, Qihao, and Teng, Yue
- Subjects
Computer Science - Neural and Evolutionary Computing
- Abstract
Heuristic dispatching rules (HDRs) are widely regarded as effective methods for solving dynamic job shop scheduling problems (DJSSP) in real-world production environments. However, their performance is highly scenario-dependent, often requiring expert customization. To address this, genetic programming (GP) and gene expression programming (GEP) have been extensively used for automatic algorithm design. Nevertheless, these approaches often face challenges due to high randomness in the search process and limited generalization ability, hindering the application of trained dispatching rules to new scenarios or dynamic environments. Recently, the integration of large language models (LLMs) with evolutionary algorithms has opened new avenues for prompt engineering and automatic algorithm design. To enhance the capabilities of LLMs in automatic HDR design, this paper proposes a novel population self-evolutionary (SeEvo) method, a general search framework inspired by the self-reflective design strategies of human experts. The SeEvo method accelerates the search process and enhances exploration capabilities. Experimental results show that the proposed SeEvo method outperforms GP, GEP, end-to-end deep reinforcement learning methods, and more than 10 common HDRs from the literature, particularly in unseen and dynamic scenarios.
- Published
- 2024
82. A Survey on Automatic Credibility Assessment of Textual Credibility Signals in the Era of Large Language Models
- Author
Srba, Ivan, Razuvayevskaya, Olesya, Leite, João A., Moro, Robert, Schlicht, Ipek Baris, Tonelli, Sara, García, Francisco Moreno, Lottmann, Santiago Barrio, Teyssou, Denis, Porcellini, Valentin, Scarton, Carolina, Bontcheva, Kalina, and Bielikova, Maria
- Subjects
Computer Science - Computation and Language
- Abstract
In the current era of social media and generative AI, the ability to automatically assess the credibility of online social media content is of tremendous importance. Credibility assessment is fundamentally based on aggregating credibility signals, which refer to small units of information, such as content factuality, bias, or the presence of persuasion techniques, into an overall credibility score. Credibility signals provide more granular, more easily explainable, and more widely utilizable information, in contrast to currently predominant fake news detection, which utilizes various (mostly latent) features. The growing body of research on automatic credibility assessment and detection of credibility signals can be characterized as highly fragmented and lacking mutual interconnections. This issue is even more prominent due to the lack of an up-to-date overview of research works on automatic credibility assessment. In this survey, we provide such a systematic and comprehensive literature review of 175 research papers, focusing on textual credibility signals and Natural Language Processing (NLP), which is undergoing significant advancement due to Large Language Models (LLMs). While positioning the NLP research in the context of other multidisciplinary research works, we cover approaches for credibility assessment as well as 9 categories of credibility signals (providing a thorough analysis of 3 of them, namely: 1) factuality, subjectivity and bias, 2) persuasion techniques and logical fallacies, and 3) claims and veracity). Following the description of the existing methods, datasets and tools, we identify future challenges and opportunities, while paying specific attention to the recent rapid development of generative AI.
- Published
- 2024
83. Towards Fully Automatic Distributed Lower Bounds
- Author
Balliu, Alkida, Brandt, Sebastian, Kuhn, Fabian, Olivetti, Dennis, and Saarhelo, Joonatan
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
In the past few years, a successful line of research has led to lower bounds for several fundamental local graph problems in the distributed setting. These results were obtained via a technique called round elimination. On a high level, the round elimination technique can be seen as a recursive application of a function that takes as input a problem $\Pi$ and outputs a problem $\Pi'$ that is one round easier than $\Pi$. Applying this function recursively to concrete problems of interest can be highly nontrivial, which is one of the reasons that has made the technique difficult to approach. The contribution of our paper is threefold. Firstly, we develop a new and fully automatic method for finding lower bounds of $\Omega(\log_\Delta n)$ and $\Omega(\log_\Delta \log n)$ rounds for deterministic and randomized algorithms, respectively, via round elimination. Secondly, we show that this automatic method is indeed useful, by obtaining lower bounds for defective coloring problems. We show that the problem of coloring the nodes of a graph with $3$ colors and defect at most $(\Delta - 3)/2$ requires $\Omega(\log_\Delta n)$ rounds for deterministic algorithms and $\Omega(\log_\Delta \log n)$ rounds for randomized ones. We note that lower bounds for coloring problems are notoriously challenging to obtain, both in general, and via the round elimination technique. Both the first and (indirectly) the second contribution build on our third contribution -- a new and conceptually simple way to compute the one-round easier problem $\Pi'$ in the round elimination framework. This new procedure provides a clear and easy recipe for applying round elimination, thereby making a substantial step towards the greater goal of having a fully automatic procedure for obtaining lower bounds in the distributed setting.
- Published
- 2024
84. Automatic Extraction and Compensation of P-Bit Device Variations in Large Array Utilizing Boltzmann Machine Training
- Author
Zhang, Bolin, Liu, Yu, Gao, Tianqi, Yin, Jialiang, Guan, Zhenyu, Zhang, Deming, and Zeng, Lang
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics, Physics - Applied Physics
- Abstract
Probabilistic Bit (P-Bit) devices serve as the core hardware for implementing Ising computation. However, the severe intrinsic variations of stochastic P-Bit devices hinder the large-scale expansion of the P-Bit array, significantly limiting the practical usage of Ising computation. In this work, a behavioral model which attributes P-Bit variations to two parameters, α and ΔV, is proposed. Then the weight compensation method is introduced, which can mitigate the α and ΔV device variations by rederiving the weight matrix, enabling the devices to compute as ideal, identical P-Bits without the need for weight retraining. Accurately extracting α and ΔV simultaneously from a large P-Bit array, which is a prerequisite for the weight compensation method, is a crucial and challenging task. To solve this obstacle, we present a novel automatic variation extraction algorithm which can extract the device variations of each P-Bit in a large array based on Boltzmann machine learning. For accurate extraction of variations from an extendable P-Bit array, an Ising Hamiltonian based on a 3D ferromagnetic model is constructed, achieving precise and scalable array variation extraction. The proposed Automatic Extraction and Compensation algorithm is utilized to solve both a 16-city traveling salesman problem (TSP) and 21-bit integer factorization on a large P-Bit array with variation, demonstrating its accuracy, transferability, and scalability.
- Comment
15 pages, 17 figures
- Published
- 2024
85. ScreenWriter: Automatic Screenplay Generation and Movie Summarisation
- Author
Mahon, Louis and Lapata, Mirella
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
- Abstract
The proliferation of creative video content has driven demand for textual descriptions or summaries that allow users to recall key plot points or get an overview without watching. The volume of movie content and speed of turnover motivate automatic summarisation, which is nevertheless challenging, requiring identifying character intentions and very long-range temporal dependencies. The few existing methods attempting this task rely heavily on textual screenplays as input, greatly limiting their applicability. In this work, we propose the task of automatic screenplay generation, and a method, ScreenWriter, that operates only on video and produces output which includes dialogue, speaker names, scene breaks, and visual descriptions. ScreenWriter introduces a novel algorithm to segment the video into scenes based on the sequence of visual vectors, and a novel method for the challenging problem of determining character names, based on a database of actors' faces. We further demonstrate how these automatic screenplays can be used to generate plot synopses with a hierarchical summarisation method based on scene breaks. We test the quality of the final summaries on the recent MovieSum dataset, which we augment with videos, and show that they are superior to a number of comparison models which assume access to gold-standard screenplays.
- Published
- 2024
86. Automatic Navigation and Voice Cloning Technology Deployment on a Humanoid Robot
- Author
Han, Dongkun and Shao, Boyuan
- Subjects
Computer Science - Robotics
- Abstract
Mobile robots have shown immense potential and are expected to be widely used in the service industry. The importance of automatic navigation and voice cloning cannot be overstated as they enable functional robots to provide high-quality services. The objective of this work is to develop a control algorithm for the automatic navigation of a humanoid mobile robot called Cruzr, which is a service robot manufactured by Ubtech. Initially, a virtual environment is constructed in the simulation software Gazebo using Simultaneous Localization And Mapping (SLAM), and global path planning is carried out by means of local path tracking. The two-wheel differential chassis kinematics model is employed to ensure autonomous dynamic obstacle avoidance for the robot chassis. Furthermore, the mapping and trajectory generation algorithms developed in the simulation environment are successfully implemented on the real robot Cruzr. The performance of automatic navigation is compared between the Dynamic Window Approach (DWA) and Model Predictive Control (MPC) algorithms. Additionally, a mobile application for voice cloning is created based on a Hidden Markov Model, and the proposed Chatbot is also tested and deployed on Cruzr.
- Comment
7 pages, 6 figures
- Published
- 2024
87. FSOS-AMC: Few-Shot Open-Set Learning for Automatic Modulation Classification
- Author
Zhang, Hao, Zhou, Fuhui, Wu, Qihui, and Yuen, Chau
- Subjects
Electrical Engineering and Systems Science - Signal Processing
- Abstract
Automatic modulation classification (AMC) is essential for the advancement and efficiency of future wireless communication networks. Deep learning (DL)-based AMC frameworks have garnered extensive attention for their impressive classification performance. However, existing DL-based AMC frameworks rely on two assumptions, large-scale training data and the same class pool between the training and testing data, which are not suitable for few-shot and open-set scenarios. To address this issue, a novel few-shot open-set automatic modulation classification (FSOS-AMC) framework is proposed by exploiting a multi-scale attention network, meta-prototype training, and a modular open-set classifier. The multi-scale attention network is used to extract the features from the input signal, meta-prototype training is adopted to train the feature extractor, and the modular open-set classifier can be utilized to classify the testing data into one of the known modulations or as a potential unknown modulation. Extensive simulation results demonstrate that the proposed FSOS-AMC framework can achieve better performance than state-of-the-art methods for both known and unknown modulations in terms of accuracy and area under the receiver operating characteristic curve (AUROC). Moreover, the performance of the proposed FSOS-AMC framework under low signal-to-noise ratio (SNR) conditions is much better than the compared schemes.
- Comment
Accepted by the 16th International Conference on Wireless Communications and Signal Processing (WCSP 2024)
- Published
- 2024
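The modular open-set classifier in entry 87 can be illustrated in its simplest form: nearest-prototype classification with distance-based rejection of unknown modulations. A toy sketch; the feature dimension, class offsets, and rejection threshold are assumptions, not the paper's values:

```python
import numpy as np

rng = np.random.default_rng(1)

def prototypes(feats, labels, n_classes):
    """Mean feature vector (prototype) per known modulation class."""
    return np.stack([feats[labels == c].mean(axis=0) for c in range(n_classes)])

def classify(query, protos, reject_dist=6.0):
    """Nearest prototype, or -1 ('unknown modulation') if all are too far."""
    d = np.linalg.norm(protos - query, axis=1)
    return -1 if d.min() > reject_dist else int(d.argmin())

# 4 known classes x 5 shots of 16-dim features, offset so classes separate.
feats = rng.normal(size=(20, 16)) + np.repeat(np.eye(4, 16) * 5, 5, axis=0)
labels = np.repeat(np.arange(4), 5)
protos = prototypes(feats, labels, 4)

print(classify(feats[0], protos))                   # one of the known classes
print(classify(rng.normal(size=16) + 50, protos))   # far from all protos: -1
```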
88. EasyHeC++: Fully Automatic Hand-Eye Calibration with Pretrained Image Models
- Author
Hong, Zhengdong, Zheng, Kangfu, and Chen, Linghao
- Subjects
Computer Science - Robotics
- Abstract
Hand-eye calibration plays a fundamental role in robotics by directly influencing the efficiency of critical operations such as manipulation and grasping. In this work, we present a novel framework, EasyHeC++, designed for fully automatic hand-eye calibration. In contrast to previous methods that necessitate manual calibration, specialized markers, or the training of arm-specific neural networks, our approach is the first system that enables accurate calibration of any robot arm in a marker-free, training-free, and fully automatic manner. Our approach employs a two-step process. First, we initialize the camera pose using a sampling or feature-matching-based method with the aid of pretrained image models. Subsequently, we perform pose optimization through differentiable rendering. Extensive experiments demonstrate the system's superior accuracy in both synthetic and real-world datasets across various robot arms and camera settings. Project page: https://ootts.github.io/easyhec_plus.
- Comment
Accepted by IROS 2024
- Published
- 2024
89. AMPO: Automatic Multi-Branched Prompt Optimization
- Author
Yang, Sheng, Wu, Yurong, Gao, Yan, Zhou, Zineng, Zhu, Bin Benjamin, Sun, Xiaodi, Lou, Jian-Guang, Ding, Zhiming, Hu, Anbang, Fang, Yuan, Li, Yunsong, Chen, Junyan, and Yang, Linjun
- Subjects
Computer Science - Computation and Language
- Abstract
Prompt engineering is very important for enhancing the performance of large language models (LLMs). When dealing with complex issues, prompt engineers tend to distill multiple patterns from examples and inject relevant solutions to optimize the prompts, achieving satisfying results. However, existing automatic prompt optimization techniques are limited to producing single-flow instructions and struggle with handling diverse patterns. In this paper, we present AMPO, an automatic prompt optimization method that can iteratively develop a multi-branched prompt using failure cases as feedback. Our goal is to explore a novel way of structuring prompts with multiple branches to better handle multiple patterns in complex tasks, for which we introduce three modules: Pattern Recognition, Branch Adjustment, and Branch Pruning. In experiments across five tasks, AMPO consistently achieves the best results. Additionally, our approach demonstrates significant optimization efficiency due to our adoption of a minimal search strategy.
- Comment
13 pages, 7 figures, 6 tables
- Published
- 2024
90. Modeling User Preferences with Automatic Metrics: Creating a High-Quality Preference Dataset for Machine Translation
- Author
Agrawal, Sweta, de Souza, José G. C., Rei, Ricardo, Farinhas, António, Faria, Gonçalo, Fernandes, Patrick, Guerreiro, Nuno M, and Martins, Andre
- Subjects
Computer Science - Computation and Language
- Abstract
Alignment with human preferences is an important step in developing accurate and safe large language models. This is no exception in machine translation (MT), where better handling of language nuances and context-specific variations leads to improved quality. However, preference data based on human feedback can be very expensive to obtain and curate at a large scale. Automatic metrics, on the other hand, can induce preferences, but they might not match human expectations perfectly. In this paper, we propose an approach that leverages the best of both worlds. We first collect sentence-level quality assessments from professional linguists on translations generated by multiple high-quality MT systems and evaluate the ability of current automatic metrics to recover these preferences. We then use this analysis to curate a new dataset, MT-Pref (metric-induced translation preference), which comprises 18k instances covering 18 language directions, using texts sourced from multiple domains post-2022. We show that aligning TOWER models on MT-Pref significantly improves translation quality on the WMT23 and FLORES benchmarks.
- Comment
Accepted at EMNLP Main 2024
- Published
- 2024
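The central step of entry 90, inducing preference pairs over candidate translations from an automatic metric, is straightforward to sketch. The toy metric below is a hypothetical stand-in for a learned quality metric such as those the paper evaluates:

```python
from itertools import combinations

# Toy stand-in for a learned MT quality metric such as COMET (hypothetical).
def metric(source: str, translation: str) -> float:
    return -abs(len(source) - len(translation))

def preference_pairs(source, candidates, margin=0.0):
    """Rank candidates by the metric; emit (chosen, rejected) pairs."""
    ranked = sorted(candidates, key=lambda t: metric(source, t), reverse=True)
    return [(a, b) for a, b in combinations(ranked, 2)
            if metric(source, a) - metric(source, b) > margin]

pairs = preference_pairs("Guten Morgen!",
                         ["Good morning!", "Morning, good day!", "Hi."])
print(pairs)   # (preferred, dispreferred) tuples, ready for preference tuning
```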
91. Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
- Author
Zheng, Xiaosen, Pang, Tianyu, Du, Chao, Liu, Qian, Jiang, Jing, and Lin, Min
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security, Computer Science - Machine Learning
- Abstract
Automatic LLM benchmarks, such as AlpacaEval 2.0, Arena-Hard-Auto, and MT-Bench, have become popular for evaluating language models due to their cost-effectiveness and scalability compared to human evaluation. Achieving high win rates on these benchmarks can significantly boost the promotional impact of newly released language models. This promotional benefit may motivate tricks, such as manipulating model output length or style to game win rates, even though several mechanisms have been developed to control length and disentangle style to reduce gameability. Nonetheless, we show that even a "null model" that always outputs a constant response (irrelevant to input instructions) can cheat automatic benchmarks and achieve top-ranked win rates: an 86.5% LC win rate on AlpacaEval 2.0; an 83.0 score on Arena-Hard-Auto; and a 9.55 score on MT-Bench. Moreover, the crafted cheating outputs are transferable because we assume that the instructions of these benchmarks (e.g., 805 samples of AlpacaEval 2.0) are private and cannot be accessed. While our experiments are primarily proof-of-concept, an adversary could use LLMs to generate more imperceptible cheating responses, unethically benefiting from high win rates and promotional impact. Our findings call for the development of anti-cheating mechanisms for reliable automatic benchmarks. The code is available at https://github.com/sail-sg/Cheating-LLM-Benchmarks.
- Published
- 2024
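The "null model" of entry 91 is deliberately trivial, which is what makes the result striking: a system that ignores its input entirely can still top the leaderboards. A minimal sketch (the constant text is a placeholder, not the paper's crafted response):

```python
# A degenerate "null model": ignores the instruction, returns one constant
# response.  The constant below is a placeholder, not the paper's output.
CONSTANT_RESPONSE = "Sure, here is a detailed, helpful, and polished answer..."

class NullModel:
    def generate(self, instruction: str) -> str:
        return CONSTANT_RESPONSE    # irrelevant to the input, by construction

model = NullModel()
for instruction in ["Write a poem.", "Explain TCP.", "Plan a trip to Kyoto."]:
    print(model.generate(instruction))   # identical output every time
```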
92. Automatic Instantiation of Assurance Cases from Patterns Using Large Language Models
- Author
Odu, Oluwafemi, Belle, Alvine B., Wang, Song, Kpodjedo, Segla, Lethbridge, Timothy C., and Hemmati, Hadi
- Subjects
Computer Science - Software Engineering
- Abstract
An assurance case is a structured set of arguments supported by evidence, demonstrating that a system's non-functional requirements (e.g., safety, security, reliability) have been correctly implemented. Assurance case patterns serve as templates derived from previous successful assurance cases, aimed at facilitating the creation of new assurance cases. Despite the use of these patterns to generate assurance cases, their instantiation remains a largely manual and error-prone process that heavily relies on domain expertise. Thus, exploring techniques to support their automatic instantiation becomes crucial. This study aims to investigate the potential of Large Language Models (LLMs) in automating the generation of assurance cases that comply with specific patterns. Specifically, we formalize assurance case patterns using predicate-based rules and then utilize LLMs, i.e., GPT-4o and GPT-4 Turbo, to automatically instantiate assurance cases from these formalized patterns. Our findings suggest that LLMs can generate assurance cases that comply with the given patterns. However, this study also highlights that LLMs may struggle with understanding some nuances related to pattern-specific relationships. While LLMs exhibit potential in the automatic generation of assurance cases, their capabilities still fall short compared to human experts. Therefore, a semi-automatic approach to instantiating assurance cases may be more practical at this time.
- Published
- 2024
93. CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation
- Author
He, Han, Liu, Qianchu, Xu, Lei, Shivade, Chaitanya, Zhang, Yi, Srinivasan, Sundararajan, and Kirchhoff, Katrin
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning - Abstract
Existing automatic prompt engineering methods are typically designed for discriminative tasks, where new task prompts are iteratively refined with limited feedback from a single metric reflecting a single aspect. However, these approaches are suboptimal for generative tasks, which require more nuanced guidance beyond a single numeric metric to improve the prompt and optimize multiple aspects of the generated text. To address these challenges, we propose a novel multi-aspect Critique-Suggestion-guided automatic Prompt Optimization (CriSPO) approach. CriSPO introduces a critique-suggestion module as its core component. This module spontaneously discovers aspects and compares generated and reference texts across these aspects, providing specific suggestions for prompt modification. These clear critiques and actionable suggestions guide a receptive optimizer module to make more substantial changes, exploring a broader and more effective search space. To further improve CriSPO with multi-metric optimization, we introduce an Automatic Suffix Tuning (AST) extension to enhance the performance of task prompts across multiple metrics. We evaluate CriSPO on 4 state-of-the-art LLMs across 4 summarization and 5 QA datasets. Extensive experiments show a 3-4% ROUGE score improvement on summarization and substantial improvements across various metrics on QA.
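The loop below is an illustrative sketch of a critique-suggestion optimization cycle in the spirit of CriSPO; the prompts, the llm() stub, and the score() placeholder (standing in for a metric such as ROUGE) are assumptions, not the paper's code.

```python
# Illustrative critique-suggestion loop: a critic LLM compares generations
# with references and suggests edits; an optimizer LLM rewrites the prompt.
def llm(prompt: str) -> str:
    raise NotImplementedError("Plug in any instruction-following LLM.")

def score(task_prompt: str, dev_set) -> float:
    raise NotImplementedError("Generate with task_prompt and score, e.g. ROUGE.")

def optimize_prompt(initial_prompt: str, dev_set, steps: int = 10) -> str:
    best_prompt = initial_prompt
    best_score = score(best_prompt, dev_set)
    for _ in range(steps):
        critique = llm(
            "Compare model outputs with references across aspects you discover "
            "(coverage, tone, format, ...) and give specific suggestions for "
            f"improving this prompt:\n{best_prompt}"
        )
        candidate = llm(
            f"Rewrite the prompt following the suggestions.\n"
            f"Prompt:\n{best_prompt}\nSuggestions:\n{critique}"
        )
        candidate_score = score(candidate, dev_set)
        if candidate_score > best_score:  # keep only improving rewrites
            best_prompt, best_score = candidate, candidate_score
    return best_prompt
```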
- Published
- 2024
94. Algorithms For Automatic Accentuation And Transcription Of Russian Texts In Speech Recognition Systems
- Author
-
Iakovenko, Olga, Bondarenko, Ivan, Borovikova, Mariya, and Vodolazsky, Daniil
- Subjects
Computer Science - Computation and Language - Abstract
This paper presents an overview of a rule-based system for automatic accentuation and phonemic transcription of Russian texts for speech-related tasks, such as Automatic Speech Recognition (ASR). The two parts of the developed system, accentuation and transcription, use different approaches to achieve correct phonemic representations of input phrases. Accentuation is based on the "Grammatical Dictionary of the Russian Language" by A. A. Zaliznyak and a Wiktionary corpus. To distinguish homographs, the accentuation system also utilises morphological information about the sentences derived from Recurrent Neural Networks (RNNs). The transcription algorithms apply the rules presented in the monograph by B. M. Lobanov and L. I. Tsirulnik, "Computer Synthesis and Voice Cloning". The rules described in the present paper are implemented in an open-source module, which can be of use in any scientific study connected to ASR or Speech-To-Text (STT) tasks. Automatically marked-up text annotations of the Russian Voxforge database were used as training data for an acoustic model in CMU Sphinx. The resulting acoustic model was evaluated with cross-validation, with a mean Word Accuracy of 71.2%. The developed toolkit is written in Python and is accessible on GitHub for any interested researcher., Comment: Speech and Computer 20th International Conference, SPECOM 2018, Leipzig, Germany, Proceedings 20
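A toy sketch of the dictionary-based accentuation step, with a homograph resolved by a morphological tag; the mini-dictionary and sense tags are illustrative stand-ins for Zaliznyak's dictionary and the RNN tagger described above.

```python
# Toy dictionary-based accentuation with homograph handling. The entries and
# sense tags are illustrative, not the system's actual resources.
ACCENT_DICT = {
    "замок": {"NOUN-castle": "за́мок", "NOUN-lock": "замо́к"},  # homograph
    "вода": {"NOUN": "вода́"},
}

def accentuate(word: str, sense_tag: str) -> str:
    senses = ACCENT_DICT.get(word.lower())
    if not senses:
        return word  # out-of-vocabulary: leave unaccented
    # Homographs require morphological/semantic context to pick the stress.
    return senses.get(sense_tag, next(iter(senses.values())))

print(accentuate("замок", "NOUN-lock"))  # замо́к (the lock)
print(accentuate("вода", "NOUN"))        # вода́
```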
- Published
- 2024
- Full Text
- View/download PDF
95. Beyond correlation: The impact of human uncertainty in measuring the effectiveness of automatic evaluation and LLM-as-a-judge
- Author
-
Elangovan, Aparna, Ko, Jongwoo, Xu, Lei, Elyasi, Mahsa, Liu, Ling, Bodapati, Sravan, and Roth, Dan
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence - Abstract
The effectiveness of automatic evaluation of generative models is typically measured by comparing it to human evaluation using correlation metrics. However, metrics like Krippendorff's α and Randolph's κ, originally designed to measure the reliability of human labeling, make assumptions about human behavior and the labeling process. In this paper, we show how relying on a single aggregate correlation score can obscure fundamental differences between human behavior and automatic evaluation methods, including LLM-as-a-judge. Specifically, we demonstrate that when the proportion of samples with variation or uncertainty in human labels (gathered during human evaluation) is relatively high, machine labels (generated by automatic evaluation methods) may superficially appear to have similar or better correlation with the human majority label than human-to-human (HH) correlation. This can create the illusion that automatic evaluation approximates the human majority label. However, as the proportion of samples with consistent human labels increases, the correlation between machine and human labels falls well below HH correlation. Based on these findings, we first propose stratifying results by human label uncertainty to provide a more robust analysis of automatic evaluation performance. Second, recognizing that uncertainty and variation are inherent in perception-based human evaluations, such as those involving attitudes or preferences, we introduce a new metric, the binned Jensen-Shannon divergence for perception, to better measure the effectiveness of automatic evaluations in such scenarios. Third, we present visualization techniques, perception charts, to compare the strengths and limitations of automatic evaluation and to contextualize correlation measures appropriately.
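A minimal sketch of the binned divergence idea, assuming 1-5 rating scales: bin human and machine scores into histograms and compute their Jensen-Shannon divergence. The binning scheme and toy data are assumptions, not the paper's exact metric definition.

```python
import numpy as np

# Binned Jensen-Shannon divergence between human and machine score
# distributions (log base 2, so the value lies in [0, 1]).
def binned_jsd(human_scores, machine_scores, bins=5, value_range=(1, 5)):
    p, _ = np.histogram(human_scores, bins=bins, range=value_range)
    q, _ = np.histogram(machine_scores, bins=bins, range=value_range)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):  # Kullback-Leibler divergence, skipping empty bins
        mask = a > 0
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

human = [4, 5, 4, 3, 5, 4, 2, 4]
machine = [5, 5, 5, 4, 5, 5, 3, 5]
print(binned_jsd(human, machine))  # 0.0 would mean identical distributions
```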
- Published
- 2024
96. HiReview: Hierarchical Taxonomy-Driven Automatic Literature Review Generation
- Author
-
Hu, Yuntong, Li, Zhuofeng, Zhang, Zheng, Ling, Chen, Kanjiani, Raasikh, Zhao, Boxin, and Zhao, Liang
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning - Abstract
In this work, we present HiReview, a novel framework for hierarchical taxonomy-driven automatic literature review generation. With the exponential growth of academic documents, manual literature reviews have become increasingly labor-intensive and time-consuming, while traditional summarization models struggle to generate comprehensive document reviews effectively. Large language models (LLMs), with their powerful text processing capabilities, offer a potential solution; however, research on incorporating LLMs for automatic document generation remains limited. To address key challenges in large-scale automatic literature review generation (LRG), we propose a two-stage taxonomy-then-generation approach that combines graph-based hierarchical clustering with retrieval-augmented LLMs. First, we retrieve the most relevant sub-community within the citation network, then generate a hierarchical taxonomy tree by clustering papers based on both textual content and citation relationships. In the second stage, an LLM generates coherent and contextually accurate summaries for clusters or topics at each hierarchical level, ensuring comprehensive coverage and logical organization of the literature. Extensive experiments demonstrate that HiReview significantly outperforms state-of-the-art methods, achieving superior hierarchical organization, content relevance, and factual accuracy in automatic literature review generation tasks.
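The snippet below sketches only the taxonomy-building step under simplifying assumptions: agglomerative clustering of paper embeddings, with random vectors standing in for the text-plus-citation features and for HiReview's graph-based clustering.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Illustrative hierarchical clustering of paper embeddings. The dendrogram
# plays the role of a taxonomy tree; random vectors replace real features.
rng = np.random.default_rng(0)
paper_embeddings = rng.normal(size=(20, 64))  # 20 papers, 64-dim features

Z = linkage(paper_embeddings, method="ward")         # full hierarchy
top_level = fcluster(Z, t=3, criterion="maxclust")   # 3 top-level topics
print(top_level)  # cluster id per paper; recurse within clusters for subtopics
```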
- Published
- 2024
97. Automatic deductive coding in discourse analysis: an application of large language models in learning analytics
- Author
-
Zhang, Lishan, Wu, Han, Huang, Xiaoshan, Duan, Tengfei, and Du, Hanxiang
- Subjects
Computer Science - Computation and Language, Computer Science - Human-Computer Interaction - Abstract
Deductive coding is a common discourse analysis method widely used by learning science and learning analytics researchers to understand teaching and learning interactions. It often requires researchers to manually label all the discourse to be analyzed according to a theoretically guided coding scheme, which is time-consuming and labor-intensive. The emergence of large language models such as GPT has opened a new avenue for automatic deductive coding that overcomes the limitations of traditional deductive coding. To evaluate the usefulness of large language models in automatic deductive coding, we employed three classification methods driven by different artificial intelligence technologies: a traditional text classification method with text feature engineering, a BERT-like pretrained language model, and a GPT-like pretrained large language model (LLM). We applied these methods to two different datasets and explored the potential of GPT and prompt engineering in automatic deductive coding. By analyzing and comparing the accuracy and Kappa values of the three classification methods, we found that GPT with prompt engineering outperformed the other two methods on both datasets with a limited number of training samples. By providing detailed prompt structures, the reported work demonstrates how large language models can be used in the implementation of automatic deductive coding., Comment: 20 pages
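The evaluation step mentioned above can be reproduced with standard tooling; the sketch below compares predicted codes against human labels using accuracy and Cohen's kappa. The toy labels are illustrative.

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score

# Compare a classifier's predicted discourse codes against human coding.
# Kappa corrects for chance agreement, unlike raw accuracy.
human_codes = ["question", "explain", "explain", "social", "question", "explain"]
model_codes = ["question", "explain", "social", "social", "question", "explain"]

print("accuracy:", accuracy_score(human_codes, model_codes))
print("kappa:   ", cohen_kappa_score(human_codes, model_codes))
```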
- Published
- 2024
98. A C++ implementation of the discrete adjoint sensitivity analysis method for explicit adaptive Runge-Kutta methods enabled by automatic adjoint differentiation and SIMD vectorization
- Author
-
Martins, Rui and Lakshtanov, Evgeny
- Subjects
Mathematics - Numerical Analysis, Computer Science - Mathematical Software, 34-04 (Primary) 65L06, 65K10, 90C31 (Secondary), G.1, G.4 - Abstract
A C++ library for sensitivity analysis of optimisation problems involving ordinary differential equations (ODEs), enabled by automatic differentiation (AD) and SIMD (Single Instruction, Multiple Data) vectorization, is presented. The discrete adjoint sensitivity analysis method is implemented for adaptive explicit Runge-Kutta (ERK) methods. Automatic adjoint differentiation (AAD) is employed for efficient evaluation of products of vectors with the Jacobian matrix of the right-hand side of the ODE system. This approach avoids the low-level drawbacks of the black-box approach of applying AAD to the entire ODE solver and opens the possibility of leveraging parallelization. SIMD vectorization is employed to compute the vector-Jacobian products concurrently. We study the performance of other methods and implementations of sensitivity analysis and find that our algorithm offers a small advantage over equivalent existing software., Comment: 30 pages, 15 figures, preprint
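To make the discrete adjoint idea concrete, here is a minimal sketch using forward Euler in place of an adaptive ERK scheme, with an analytic vector-Jacobian product standing in for AAD; it is a simplified illustration, not the library's C++ implementation.

```python
import numpy as np

# Discrete adjoint for y_{n+1} = y_n + h f(y_n): propagate lambda backwards
# with vector-Jacobian products, lambda_n = lambda_{n+1} + h J(y_n)^T lambda_{n+1}.
def f(y):  # right-hand side of a pendulum-like ODE
    return np.array([y[1], -np.sin(y[0])])

def f_vjp(y, lam):  # analytic J(y)^T @ lam (the product AAD would compute)
    J = np.array([[0.0, 1.0], [-np.cos(y[0]), 0.0]])
    return J.T @ lam

h, N = 0.01, 100
ys = [np.array([1.0, 0.0])]
for _ in range(N):              # forward pass: store the trajectory
    ys.append(ys[-1] + h * f(ys[-1]))

lam = np.array([1.0, 0.0])      # dJ/dy_N for the cost J = y_N[0]
for n in reversed(range(N)):    # reverse (adjoint) pass
    lam = lam + h * f_vjp(ys[n], lam)

print("dJ/dy0 =", lam)          # sensitivity of the cost to the initial state
```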
- Published
- 2024
99. DeepAerialMapper: Deep Learning-based Semi-automatic HD Map Creation for Highly Automated Vehicles
- Author
-
Krajewski, Robert and Kim, Huijo
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
High-definition maps (HD maps) play a crucial role in the development, safety validation, and operation of highly automated vehicles. Efficiently collecting up-to-date sensor data from road segments and obtaining accurate maps from these data are key challenges in HD map creation. Commonly used methods, such as dedicated measurement vehicles and crowd-sourced data from series vehicles, often face limitations in commercial viability. Although high-resolution aerial imagery offers a cost-effective or even free alternative, transforming it into maps requires significant manual effort and time. In this paper, we introduce a semi-automatic method for creating HD maps from high-resolution aerial imagery. Our method involves training neural networks to semantically segment aerial images into classes relevant to HD maps. The resulting segmentation is then hierarchically post-processed to generate a prototypical HD map of visible road elements. Exporting the map to the Lanelet2 format allows easy extension for different use cases using standard tools. To train and evaluate our method, we created a dataset using public aerial imagery of urban road segments in Germany. In our evaluation, we achieved automatic mapping of lane markings and road borders with recall and precision exceeding 96%. The source code for our method is publicly available at https://github.com/RobertKrajewski/DeepAerialMapper., Comment: For source code, see https://github.com/RobertKrajewski/DeepAerialMapper
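As a small illustration of the pixel-wise evaluation reported above, the sketch below computes recall and precision of a predicted lane-marking mask against a ground-truth mask; the tiny binary arrays are placeholders for real segmentation outputs.

```python
import numpy as np

# Pixel-wise recall and precision of a predicted mask vs. ground truth.
truth = np.array([[0, 1, 1, 0],
                  [0, 1, 1, 0],
                  [0, 0, 1, 0]], dtype=bool)
pred = np.array([[0, 1, 1, 1],
                 [0, 1, 0, 0],
                 [0, 0, 1, 0]], dtype=bool)

tp = np.sum(pred & truth)            # correctly predicted marking pixels
recall = tp / np.sum(truth)          # share of true pixels recovered
precision = tp / np.sum(pred)        # share of predictions that are correct
print(f"recall={recall:.2f}, precision={precision:.2f}")
```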
- Published
- 2024
100. BioFace3D: A fully automatic pipeline for facial biomarkers extraction of 3D face reconstructions segmented from MRI
- Author
-
Heredia-Lidón, Álvaro, Echeverry-Quiceno, Luis M., González, Alejandro, Hostalet, Noemí, Pomarol-Clotet, Edith, Fortea, Juan, Fatjó-Vilas, Mar, Martínez-Abadías, Neus, and Sevillano, Xavier
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Quantitative Biology - Quantitative Methods - Abstract
Facial dysmorphologies have emerged as potentially critical indicators in the diagnosis and prognosis of genetic, psychotic, and rare disorders. While in certain conditions these dysmorphologies are severe, in other cases they may be subtle and imperceptible to the human eye, requiring precise quantitative tools for their identification. Manual coding of facial dysmorphologies is a burdensome task and is subject to inter- and intra-observer variability. To bridge this gap, we present BioFace3D, a fully automatic tool for the calculation of facial biomarkers using facial models reconstructed from magnetic resonance images. The tool is divided into three automatic modules: the extraction of 3D facial models from magnetic resonance images, the registration of homologous 3D landmarks encoding facial morphology, and the calculation of facial biomarkers from anatomical landmark coordinates using geometric morphometrics techniques.
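One core geometric morphometrics step, rigid alignment of homologous landmarks before biomarker computation, can be sketched with the Kabsch algorithm; the random landmarks below are placeholders for MRI-derived facial landmarks, and this sketch is an assumption about one step, not the BioFace3D pipeline itself.

```python
import numpy as np

# Kabsch algorithm: rigidly align landmark set A (k x 3) onto B (k x 3).
def align_landmarks(A, B):
    a_mean, b_mean = A.mean(axis=0), B.mean(axis=0)
    A0, B0 = A - a_mean, B - b_mean          # remove translation
    U, _, Vt = np.linalg.svd(A0.T @ B0)
    d = np.sign(np.linalg.det(U @ Vt))       # guard against reflections
    R = U @ np.diag([1.0, 1.0, d]) @ Vt      # optimal rotation
    return A0 @ R + b_mean

rng = np.random.default_rng(1)
B = rng.normal(size=(10, 3))                 # reference landmarks
theta = 0.3
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
A = B @ Rz.T + np.array([2.0, -1.0, 0.5])    # rotated and shifted copy of B
print(np.allclose(align_landmarks(A, B), B, atol=1e-8))  # True
```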
- Published
- 2024