Author: "HUANG, Jing" / Publication Type: Reports - Searchworks@Jio Institute Digital Library Search Results

1. On graphs which have locally complete 2-edge-colourings and their relationship to proper circular-arc graphs

Author: Bang-Jensen, Jørgen and Huang, Jing
Subjects: Mathematics - Combinatorics, 05C20
Abstract: A 2-edge-coloured graph $G$ is called locally complete if, for each vertex $v$, the vertices adjacent to $v$ through edges of the same colour induce a complete subgraph in $G$. Locally complete 2-edge-coloured graphs have nice properties, and there exists a polynomial algorithm to decide whether such a graph has an alternating hamiltonian cycle, where alternating means that any two consecutive edges on the cycle have different colours. In this paper we show that graphs having locally complete 2-edge-colourings can be recognized in polynomial time. We give a forbidden substructure characterization for this class of graphs analogous to Gallai's characterization for cocomparability graphs. Finally, we characterize proper interval graphs and proper circular-arc graphs which have locally complete 2-edge-colourings by forbidden subgraphs.
Published: 2024
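
The definition of a locally complete 2-edge-colouring in entry 1 can be checked directly for a given colouring. The following is a minimal Python sketch under stated assumptions, not the polynomial-time recognition algorithm from the paper: it only verifies a colouring that is already supplied. The function name `is_locally_complete` and the edge-dictionary input format are hypothetical choices made for illustration.

```python
from itertools import combinations


def is_locally_complete(edges):
    """Check whether a given 2-edge-colouring is locally complete.

    `edges` maps frozenset({u, v}) to a colour label (assumed input format).
    For every vertex v and every colour c, the neighbours of v reached
    through edges of colour c must be pairwise adjacent in the graph
    (by an edge of any colour).
    """
    # neighbours[v][colour] = set of vertices joined to v by an edge of that colour
    neighbours = {}
    for edge, colour in edges.items():
        u, v = tuple(edge)
        neighbours.setdefault(u, {}).setdefault(colour, set()).add(v)
        neighbours.setdefault(v, {}).setdefault(colour, set()).add(u)

    for by_colour in neighbours.values():
        for same_colour_nbrs in by_colour.values():
            # Every pair of same-colour neighbours must itself be an edge.
            for a, b in combinations(same_colour_nbrs, 2):
                if frozenset((a, b)) not in edges:
                    return False
    return True
```

For example, a path a-b-c with both edges the same colour fails the check, because the two same-coloured neighbours of b are not adjacent; colouring the two edges differently passes.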

2. An Improved Variational Method for Image Denoising

Author: Huang, Jing-En, Liao, Jia-Wei, Lin, Ku-Te, Tsai, Yu-Ju, and Yueh, Mei-Heng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Mathematics - Numerical Analysis
Abstract: The total variation (TV) method is an image denoising technique that aims to reduce noise by minimizing the total variation of the image, which measures the variation in pixel intensities. The TV method has been widely applied in image processing and computer vision for its ability to preserve edges and enhance image quality. In this paper, we propose an improved TV model for image denoising and the associated numerical algorithm to carry out the procedure, which is particularly effective in removing several types of noise and their combinations. Our improved model admits a unique solution and the associated numerical algorithm is guaranteed to converge. Numerical experiments demonstrate improved effectiveness and denoising quality compared to other TV models. Such encouraging results further enhance the utility of the TV method in image processing.
Published: 2024

3. Quantum Machine Learning for Semiconductor Fabrication: Modeling GaN HEMT Contact Process

Author: Wang, Zeheng, Wang, Fangzhou, Li, Liang, Wang, Zirui, van der Laan, Timothy, Leon, Ross C. C., Huang, Jing-Kai, and Usman, Muhammad
Subjects: Computer Science - Machine Learning, Computer Science - Emerging Technologies, Quantum Physics
Abstract: This paper pioneers the use of quantum machine learning (QML) for modeling the Ohmic contact process in GaN high-electron-mobility transistors (HEMTs). Utilizing data from 159 devices and variational auto-encoder-based augmentation, we developed a quantum kernel-based regressor (QKR) with a 2-level ZZ-feature map. Benchmarking against six classical machine learning (CML) models, our QKR consistently demonstrated the lowest mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE). Repeated statistical analysis confirmed its robustness. Additionally, experiments verified an MAE of 0.314 ohm-mm, underscoring the QKR's superior performance and potential for semiconductor applications, and demonstrating significant advancements over traditional CML methods.
Comment: This is the manuscript in the conference version. An expanded version for the journal will be released later and more information will be added. The author list, content, conclusion, and figures may change due to further research.
Published: 2024

4. Interlayer Engineering of Lattice Dynamics and Elastic Constants of 2D Layered Nanomaterials under Pressure

Author: Du, Guoshuai, Zhao, Lili, Li, Shuchang, Huang, Jing, Fang, Susu, Han, Wuxiao, Li, Jiayin, Du, Yubing, Ming, Jiaxin, Zhang, Tiansong, Zhang, Jun, Kang, Jun, Li, Xiaoyan, Xu, Weigao, and Chen, Yabin
Subjects: Condensed Matter - Materials Science, Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: Interlayer coupling in two-dimensional (2D) layered nanomaterials can provide novel strategies to evoke their superior properties, such as the exotic flat bands and unconventional superconductivity of twisted layers, the formation of moiré excitons and related nontrivial topology. However, how to accurately quantify the interlayer potential and measure the elastic properties of 2D materials remains unclear, despite significant efforts. Herein, the layer-dependent lattice dynamics and elastic constants of 2D nanomaterials have been systematically investigated via a pressure-engineering strategy based on ultralow frequency Raman spectroscopy. The shearing mode and layer-breathing Raman shifts of MoS2 with various thicknesses were analyzed by the linear chain model. Intriguingly, it was found that the layer-dependent $d\omega/dP$ of shearing and breathing Raman modes display opposite trends, quantitatively consistent with our molecular dynamics simulations and density functional theory calculations. These results can be generalized to other van der Waals systems, and may shed light on the potential applications of 2D materials in nanomechanics and nanoelectronics.
Comment: 25 pages, 5 figures
Published: 2024

5. Unlocking Exocentric Video-Language Data for Egocentric Video Representation Learning

Author: Dou, Zi-Yi, Yang, Xitong, Nagarajan, Tushar, Wang, Huiyu, Huang, Jing, Peng, Nanyun, Kitani, Kris, and Chu, Fu-Jen
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Abstract: We present EMBED (Egocentric Models Built with Exocentric Data), a method designed to transform exocentric video-language data for egocentric video representation learning. Large-scale exocentric data covers diverse activities with significant potential for egocentric learning, but inherent disparities between egocentric and exocentric data pose challenges in utilizing one view for the other seamlessly. Egocentric videos predominantly feature close-up hand-object interactions, whereas exocentric videos offer a broader perspective on human activities. Additionally, narratives in egocentric datasets are typically more action-centric and closely linked with the visual content, in contrast to the narrative styles found in exocentric datasets. To address these challenges, we employ a data transformation framework to adapt exocentric data for egocentric training, focusing on identifying specific video clips that emphasize hand-object interactions and transforming narration styles to align with egocentric perspectives. By applying both vision and language style transfer, our framework creates a new egocentric dataset derived from exocentric video-language data. Through extensive evaluations, we demonstrate the effectiveness of EMBED, achieving state-of-the-art results across various egocentric downstream tasks, including an absolute improvement of 4.7% on the Epic-Kitchens-100 multi-instance retrieval and 6.2% on the EGTEA classification benchmarks in zero-shot settings. Furthermore, EMBED enables egocentric video-language models to perform competitively in exocentric tasks. Finally, we showcase EMBED's application across various exocentric datasets, exhibiting strong generalization capabilities when applied to different exocentric datasets.
Published: 2024

6. Demystifying Verbatim Memorization in Large Language Models

Author: Huang, Jing, Yang, Diyi, and Potts, Christopher
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Large Language Models (LLMs) frequently memorize long sequences verbatim, often with serious legal and privacy implications. Much prior work has studied such verbatim memorization using observational data. To complement such work, we develop a framework to study verbatim memorization in a controlled setting by continuing pre-training from Pythia checkpoints with injected sequences. We find that (1) non-trivial amounts of repetition are necessary for verbatim memorization to happen; (2) later (and presumably better) checkpoints are more likely to verbatim memorize sequences, even for out-of-distribution sequences; (3) the generation of memorized sequences is triggered by distributed model states that encode high-level features and makes important use of general language modeling capabilities. Guided by these insights, we develop stress tests to evaluate unlearning methods and find they often fail to remove the verbatim memorized information, while also degrading the LM. Overall, these findings challenge the hypothesis that verbatim memorization stems from specific model weights or mechanisms. Rather, verbatim memorization is intertwined with the LM's general capabilities and thus will be very difficult to isolate and suppress without degrading model quality.
Published: 2024

7. Homotopic Path Set Planning for Robot Manipulation and Navigation

Author: Huang, Jing, Tang, Yunxi, and Au, Kwok Wai Samuel
Subjects: Computer Science - Robotics
Abstract: This paper addresses path set planning that yields important applications in robot manipulation and navigation such as path generation for deformable object keypoints and swarms. A path set refers to the collection of finite agent paths to represent the overall spatial path of a group of keypoints or a swarm, whose collective properties meet spatial and topological constraints. As opposed to planning a single path, simultaneously planning multiple paths with constraints poses nontrivial challenges in complex environments. This paper presents a systematic planning pipeline for homotopic path sets, a widely applicable path set class in robotics. An extended visibility check condition is first proposed to attain a sparse passage distribution amidst dense obstacles. Passage-aware optimal path planning compatible with sampling-based planners is then designed for single path planning with adjustable costs. Large accessible free space for path set accommodation can be achieved by the planned path while having a sufficiently short path length. After specifying the homotopic properties of path sets, path set generation based on deformable path transfer is proposed in an efficient centralized manner. The effectiveness of these methods is validated by extensive simulated and experimental results.
Comment: 16 pages, 19 figures, conference
Published: 2024

8. Enhancing interferometry using weak value amplification with real weak values

Author: Huang, Jing-Hui, Jordan, Kyle M., Dada, Adetunmise C., Hu, Xiang-Yun, and Lundeen, Jeff S.
Subjects: Quantum Physics
Abstract: We introduce an ultra-sensitive interferometric protocol that combines weak value amplification (WVA) with traditional interferometry. This WVA+interferometry protocol uses weak value amplification of the relative delay between two paths to enhance the interferometric sensitivity, approaching the quantum limit for classical light. As an example, we demonstrate a proof-of-principle experiment that achieves few-attosecond timing resolution (few-nanometer path length resolution) with a double-slit interferometer using only common optical components. Since our example uses only the spatial shift of double-slit interference fringes, its precision is not limited by the timing resolution of the detectors, but is instead limited solely by the fundamental shot noise associated with classical light. We experimentally demonstrate that the signal-to-noise ratio can be improved by one to three orders of magnitude and approaches the shot-noise limit in the large amplification regime. Previously, quantum-limited WVA delay measurements were thought to require imaginary weak values, which necessitate light with a broad spectral bandwidth and high-resolution spectrometers. In contrast, our protocol highlights the feasibility of using real weak values and narrowband light. Thus, our protocol is a compelling and cost-effective approach to enhance interferometry.
Comment: 3 figures
Published: 2024

9. Semisupervised score based matching algorithm to evaluate the effect of public health interventions

Author: Zhang, Hongzhe, Shi, Jiasheng, and Huang, Jing
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning, Statistics - Methodology
Abstract: Multivariate matching algorithms "pair" similar study units in an observational study to remove potential bias and confounding effects caused by the absence of randomization. In one-to-one multivariate matching, a large number of "pairs" to be matched brings both the information of a large sample and a large number of matching tasks. A matching algorithm that is efficient and needs only limited auxiliary matching knowledge, provided by domain experts through a "training" set of paired units, is therefore of practical interest. We propose a novel one-to-one matching algorithm based on a quadratic score function $S_{\beta}(x_i,x_j)= \beta^T (x_i-x_j)(x_i-x_j)^T \beta$. The weights $\beta$, which can be interpreted as a variable importance measure, are designed to minimize the score difference between paired training units while maximizing the score difference between unpaired training units. Further, in the typical but intricate case where the training set is much smaller than the unpaired set, we propose a semisupervised companion one-to-one matching algorithm (SCOTOMA) that makes the best use of the unpaired units. The proposed weight estimator is proved to be consistent when the true matching criterion is indeed the quadratic score function. When the model assumptions are violated, we demonstrate that the proposed algorithm still outperforms some popular competing matching algorithms through a series of simulations. We applied the proposed algorithm to a real-world study to investigate the effect of in-person schooling on community Covid-19 transmission rate for policy-making purposes.
Published: 2024
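
The quadratic score in entry 9 reduces to the square of a projection, since $\beta^T (x_i-x_j)(x_i-x_j)^T \beta = (\beta^T (x_i-x_j))^2$. The following is a minimal Python sketch of evaluating that score for one pair of units. The function name `quadratic_score` and the sample weights are hypothetical; the sketch does not estimate $\beta$ and is not the SCOTOMA algorithm itself.

```python
def quadratic_score(beta, x_i, x_j):
    """Evaluate S_beta(x_i, x_j) = beta^T (x_i - x_j)(x_i - x_j)^T beta.

    The outer-product form equals (beta^T (x_i - x_j))^2, so the score is
    computed as the squared projection of the covariate difference onto beta.
    All three arguments are equal-length sequences of numbers.
    """
    diff = [a - b for a, b in zip(x_i, x_j)]
    projection = sum(w * d for w, d in zip(beta, diff))
    return projection ** 2


# Illustrative values only: the weights here are arbitrary, not estimated.
score = quadratic_score([1, 2], [3, 1], [1, 0])
```

A smaller score means the two units are closer along the direction weighted by $\beta$, and a unit always has score zero against itself.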

10. Time-Since-Infection Model for Hospitalization and Incidence Data

Author: Shi, Jiasheng, Zhou, Yizhao, and Huang, Jing
Subjects: Statistics - Methodology
Abstract: The Time Since Infection (TSI) models, which use disease surveillance data to model infectious diseases, have become increasingly popular recently due to their flexibility and capacity to address complex disease control questions. However, a notable limitation of TSI models is their primary reliance on incidence data. Even when hospitalization data are available, existing TSI models have not been crafted to estimate disease transmission or predict disease-related hospitalizations - metrics crucial for understanding a pandemic and planning hospital resources. Moreover, their dependence on reported infection data makes them vulnerable to variations in data quality. In this study, we advance TSI models by integrating hospitalization data, marking a significant step forward in modeling with TSI models. Our improvements enable the estimation of key infectious disease parameters without relying on contact tracing data, reduce bias in incidence data, and provide a foundation to connect TSI models with other infectious disease models. We introduce hospitalization propensity parameters to jointly model incidence and hospitalization data. We use a composite likelihood function to accommodate complex data structure and an MCEM algorithm to estimate model parameters. We apply our method to COVID-19 data to estimate disease transmission, assess risk factor impacts, and calculate hospitalization propensity.
Published: 2024

11. pyvene: A Library for Understanding and Improving PyTorch Models via Interventions

Author: Wu, Zhengxuan, Geiger, Atticus, Arora, Aryaman, Huang, Jing, Wang, Zheng, Goodman, Noah D., Manning, Christopher D., and Potts, Christopher
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: Interventions on model-internal states are fundamental operations in many areas of AI, including model editing, steering, robustness, and interpretability. To facilitate such research, we introduce pyvene, an open-source Python library that supports customizable interventions on a range of different PyTorch modules. pyvene supports complex intervention schemes with an intuitive configuration format, and its interventions can be static or include trainable parameters. We show how pyvene provides a unified and extensible framework for performing interventions on neural models and sharing the intervened-upon models with others. We illustrate the power of the library via interpretability analyses using causal abstraction and knowledge localization. We publish our library through the Python Package Index (PyPI) and provide code, documentation, and tutorials at https://github.com/stanfordnlp/pyvene.
Comment: 8 pages, 3 figures
Published: 2024

12. Real-Time Simulated Avatar from Head-Mounted Sensors

Author: Luo, Zhengyi, Cao, Jinkun, Khirodkar, Rawal, Winkler, Alexander, Huang, Jing, Kitani, Kris, and Xu, Weipeng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics, Computer Science - Robotics
Abstract: We present SimXR, a method for controlling a simulated avatar from information (headset pose and cameras) obtained from AR / VR headsets. Due to the challenging viewpoint of head-mounted cameras, the human body is often clipped out of view, making traditional image-based egocentric pose estimation challenging. On the other hand, headset poses provide valuable information about overall body motion, but lack fine-grained details about the hands and feet. To synergize headset poses with cameras, we control a humanoid to track headset movement while analyzing input images to decide body movement. When body parts are seen, the movements of hands and feet will be guided by the images; when unseen, the laws of physics guide the controller to generate plausible motion. We design an end-to-end method that does not rely on any intermediate representations and learns to directly map from images and headset poses to humanoid control signals. To train our method, we also propose a large-scale synthetic dataset created using camera configurations compatible with a commercially available VR headset (Quest 2) and show promising results on real-world captures. To demonstrate the applicability of our framework, we also test it on an AR headset with a forward-facing camera.
Comment: CVPR 2024 Highlight. Website: https://www.zhengyiluo.com/SimXR/
Published: 2024

13. Metasurface spectrometers beyond resolution-sensitivity constraints

Author: Tang, Feng, Wu, Jingjun, Albrow-Owen, Tom, Cui, Hanxiao, Chen, Fujia, Shi, Yaqi, Zou, Lan, Chen, Jun, Guo, Xuhan, Sun, Yijun, Luo, Jikui, Ju, Bingfeng, Huang, Jing, Liu, Shuangli, Li, Bo, Yang, Liming, Munro, Eric Anthony, Zheng, Wanguo, Joyce, Hannah J., Chen, Hongsheng, Che, Lufeng, Dong, Shurong, Hasan, Tawfique, Ye, Xin, Yang, Yihao, and Yang, Zongyin
Subjects: Physics - Optics, Condensed Matter - Materials Science
Abstract: Optical spectroscopy plays an essential role across scientific research and industry for non-contact materials analysis, increasingly through in-situ or portable platforms. However, when considering low-light-level applications, conventional spectrometer designs necessitate a compromise between their resolution and sensitivity, especially as device and detector dimensions are scaled down. Here, we report on a miniaturizable spectrometer platform where light throughput onto the detector is instead enhanced as the resolution is increased. This planar, CMOS-compatible platform is based around metasurface encoders designed to exhibit photonic bound states in the continuum, where operational range can be altered or extended simply through adjusting geometric parameters. This system can enhance photon collection efficiency by up to two orders of magnitude versus conventional designs; we demonstrate this sensitivity advantage through ultra-low-intensity fluorescent and astrophotonic spectroscopy. This work represents a step forward for the practical utility of spectrometers, affording a route to integrated, chip-based devices that maintain high resolution and SNR without requiring prohibitively long integration times.
Published: 2024

14. RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations

Author: Huang, Jing, Wu, Zhengxuan, Potts, Christopher, Geva, Mor, and Geiger, Atticus
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Individual neurons participate in the representation of multiple high-level concepts. To what extent can different interpretability methods successfully disentangle these roles? To help address this question, we introduce RAVEL (Resolving Attribute-Value Entanglements in Language Models), a dataset that enables tightly controlled, quantitative comparisons between a variety of existing interpretability methods. We use the resulting conceptual framework to define the new method of Multi-task Distributed Alignment Search (MDAS), which allows us to find distributed representations satisfying multiple causal criteria. With Llama2-7B as the target language model, MDAS achieves state-of-the-art results on RAVEL, demonstrating the importance of going beyond neuron-level analyses to identify features distributed across activations. We release our benchmark at https://github.com/explanare/ravel.
Comment: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)
Published: 2024

15. Deformable Object Manipulation With Constraints Using Path Set Planning and Tracking

Author: Huang, Jing, Chu, Xiangyu, Ma, Xin, and Au, Kwok Wai Samuel
Subjects: Computer Science - Robotics
Abstract: In robotic deformable object manipulation (DOM) applications, constraints arise commonly from environments and task-specific requirements. Enabling DOM with constraints is therefore crucial for its deployment in practice. However, dealing with constraints turns out to be challenging due to many inherent factors such as inaccessible deformation models of deformable objects (DOs) and varying environmental setups. This article presents a systematic manipulation framework for DOM subject to constraints by proposing a novel path set planning and tracking scheme. First, constrained DOM tasks are formulated into a versatile optimization formalism which enables dynamic constraint imposition. Because of the lack of the local optimization objective and high state dimensionality, the formulated problem is not analytically solvable. To address this, planning of the path set, which collects paths of DO feedback points, is proposed subsequently to offer feasible path and motion references for DO in constrained setups. Both theoretical analyses and computationally efficient algorithmic implementation of path set planning are discussed. Lastly, a control architecture combining path set tracking and constraint handling is designed for task execution. The effectiveness of our methods is validated in a variety of DOM tasks with constrained experimental settings.
Comment: 20 pages, 25 figures, journal
Published: 2024

16. A Reply to Makelov et al. (2023)'s 'Interpretability Illusion' Arguments

Author: Wu, Zhengxuan, Geiger, Atticus, Huang, Jing, Arora, Aryaman, Icard, Thomas, Potts, Christopher, and Goodman, Noah D.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: We respond to the recent paper by Makelov et al. (2023), which reviews subspace interchange intervention methods like distributed alignment search (DAS; Geiger et al. 2023) and claims that these methods potentially cause "interpretability illusions". We first review Makelov et al. (2023)'s technical notion of what an "interpretability illusion" is, and then we show that even intuitive and desirable explanations can qualify as illusions in this sense. As a result, their method of discovering "illusions" can reject explanations they consider "non-illusory". We then argue that the illusions Makelov et al. (2023) see in practice are artifacts of their training and evaluation paradigms. We close by emphasizing that, though we disagree with their core characterization, Makelov et al. (2023)'s examples and discussion have undoubtedly pushed the field of interpretability forward.
Comment: 20 pages, 14 figures
Published: 2024

17. HyperMix: Out-of-Distribution Detection and Classification in Few-Shot Settings

Author: Mehta, Nikhil, Liang, Kevin J, Huang, Jing, Chu, Fu-Jen, Yin, Li, and Hassner, Tal
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: Out-of-distribution (OOD) detection is an important topic for real-world machine learning systems, but settings with limited in-distribution samples have been underexplored. Such few-shot OOD settings are challenging, as models have scarce opportunities to learn the data distribution before being tasked with identifying OOD samples. Indeed, we demonstrate that recent state-of-the-art OOD methods fail to outperform simple baselines in the few-shot setting. We thus propose a hypernetwork framework called HyperMix, using Mixup on the generated classifier parameters, as well as a natural out-of-episode outlier exposure technique that does not require an additional outlier dataset. We conduct experiments on CIFAR-FS and MiniImageNet, significantly outperforming other OOD methods in the few-shot regime.
Published: 2023

18. Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Author: Grauman, Kristen, Westbury, Andrew, Torresani, Lorenzo, Kitani, Kris, Malik, Jitendra, Afouras, Triantafyllos, Ashutosh, Kumar, Baiyya, Vijay, Bansal, Siddhant, Boote, Bikram, Byrne, Eugene, Chavis, Zach, Chen, Joya, Cheng, Feng, Chu, Fu-Jen, Crane, Sean, Dasgupta, Avijit, Dong, Jing, Escobar, Maria, Forigua, Cristhian, Gebreselasie, Abrham, Haresh, Sanjay, Huang, Jing, Islam, Md Mohaiminul, Jain, Suyog, Khirodkar, Rawal, Kukreja, Devansh, Liang, Kevin J, Liu, Jia-Wei, Majumder, Sagnik, Mao, Yongsen, Martin, Miguel, Mavroudi, Effrosyni, Nagarajan, Tushar, Ragusa, Francesco, Ramakrishnan, Santhosh Kumar, Seminara, Luigi, Somayazulu, Arjun, Song, Yale, Su, Shan, Xue, Zihui, Zhang, Edward, Zhang, Jinxu, Castillo, Angela, Chen, Changan, Fu, Xinzhu, Furuta, Ryosuke, Gonzalez, Cristina, Gupta, Prince, Hu, Jiabo, Huang, Yifei, Huang, Yiming, Khoo, Weslie, Kumar, Anush, Kuo, Robert, Lakhavani, Sach, Liu, Miao, Luo, Mi, Luo, Zhengyi, Meredith, Brighid, Miller, Austin, Oguntola, Oluwatumininu, Pan, Xiaqing, Peng, Penny, Pramanick, Shraman, Ramazanova, Merey, Ryan, Fiona, Shan, Wei, Somasundaram, Kiran, Song, Chenan, Southerland, Audrey, Tateno, Masatoshi, Wang, Huiyu, Wang, Yuchen, Yagi, Takuma, Yan, Mingfei, Yang, Xitong, Yu, Zecheng, Zha, Shengxin Cindy, Zhao, Chen, Zhao, Ziwei, Zhu, Zhifan, Zhuo, Jeff, Arbelaez, Pablo, Bertasius, Gedas, Crandall, David, Damen, Dima, Engel, Jakob, Farinella, Giovanni Maria, Furnari, Antonino, Ghanem, Bernard, Hoffman, Judy, Jawahar, C. V., Newcombe, Richard, Park, Hyun Soo, Rehg, James M., Sato, Yoichi, Savva, Manolis, Shi, Jianbo, Shou, Mike Zheng, and Wray, Michael
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric and exocentric video of skilled human activities (e.g., sports, music, dance, bike repair). 740 participants from 13 cities worldwide performed these activities in 123 different natural scene contexts, yielding long-form captures from 1 to 42 minutes each and 1,286 hours of video combined. The multimodal nature of the dataset is unprecedented: the video is accompanied by multichannel audio, eye gaze, 3D point clouds, camera poses, IMU, and multiple paired language descriptions -- including a novel "expert commentary" done by coaches and teachers and tailored to the skilled-activity domain. To push the frontier of first-person video understanding of skilled human activity, we also present a suite of benchmark tasks and their annotations, including fine-grained activity understanding, proficiency estimation, cross-view translation, and 3D hand/body pose. All resources are open sourced to fuel new research in the community. Project page: http://ego-exo4d-data.org/
Comment: Expanded manuscript (compared to arxiv v1 from Nov 2023 and CVPR 2024 paper from June 2024) for more comprehensive dataset and benchmark presentation, plus new results on v2 data release
Published: 2023

19. The physics of solar spectral imaging observations in dm-cm wavelengths and the application on space weather

Author: Tan, Baolin, Yan, Yihua, Huang, Jing, Zhang, Yin, Tan, Chengming, and Zhu, Xiaoshuai
Subjects: Astrophysics - Solar and Stellar Astrophysics, Astrophysics - Astrophysics of Galaxies, Physics - Space Physics
Abstract: Recently, several new solar radio telescopes have been put into operation and provided spectral-imaging observations with much higher resolutions in decimeter (dm) and centimeter (cm) wavelengths. These telescopes include the Mingantu Spectral Radioheliograph (MUSER, at frequencies of 0.4 - 15 GHz), the Expanded Owens Valley Solar Array (EOVSA, at frequencies of 1 - 18 GHz), and the Siberian Radio Heliograph (SRH, at frequencies of 3 - 24 GHz). These observations offer unprecedented opportunities to study solar physics and space weather, especially to diagnose the coronal magnetic fields, reveal the basic nature of solar eruptions and the related non-thermal energy release, particle accelerations and propagation, and the related emission mechanisms. These results may provide important input to space weather modeling for predicting the occurrence of disastrous, powerful space weather events. In order to provide a meaningful reference for other solar physicists and space weather researchers, this paper mainly focuses on discussing the potential scientific problems of solar radio spectral-imaging observations in dm-cm wavelengths and their possible applications in the field of space weather. These results will provide a helpful reference for colleagues to make full use of the latest and future observation data obtained from the above solar radio telescopes.
Comment: 10 pages, 7 figures, accepted by Advances in Space Research, 2022
Published: 2023

20. An Efficient Approach for Identifying Important Biomarkers for Biomedical Diagnosis

Author: Huang, Jing-Wen, Chen, Yan-Hong, Phoa, Frederick Kin Hing, Lin, Yan-Han, and Lin, Shau-Ping
Subjects: Statistics - Methodology
Abstract: In this paper, we explore the challenges associated with biomarker identification for diagnostic purposes in biomedical experiments, and propose a novel approach to handle this challenging scenario via a generalization of the Dantzig selector. To improve the efficiency of the regularization method, we introduce a transformation from an inherent nonlinear programming problem, which arises from the nonlinear link function, into a linear programming framework. We illustrate the use of our method on an experiment with binary response, showing superior performance on biomarker identification studies compared with conventional analysis. Our proposed method does not merely serve as a variable/biomarker selection tool; its ranking of variable importance provides valuable reference information for practitioners to reach informed decisions regarding the prioritization of factors for further investigations.
Published: 2023

21. Contextual Data Augmentation for Task-Oriented Dialog Systems

Author: Axman, Dustin, Ray, Avik, Garg, Shubham, and Huang, Jing
Subjects: Computer Science - Computation and Language
Abstract: Collection of annotated dialogs for training task-oriented dialog systems has been one of the key bottlenecks in improving current models. While dialog response generation has been widely studied on the agent side, it is not evident if similar generative models can be used to generate a large variety of, and often unexpected, user inputs that real dialog systems encounter in practice. Existing data augmentation techniques such as paraphrase generation do not take the dialog context into consideration. In this paper, we develop a novel dialog augmentation model that generates a user turn, conditioning on full dialog context. Additionally, with a new prompt design for language models, and output re-ranking, the dialogs generated from our model can be directly used to train downstream dialog systems. On common benchmark datasets MultiWoZ and SGD, we show that our dialog augmentation model generates high quality dialogs and improves dialog success rate by as much as $8\%$ over baseline., Comment: ECML-PKDD 2023 Workshop on Challenges and Opportunities of Large Language Models in Real-World Machine Learning Applications (COLLM)
Published: 2023

22. Universal Humanoid Motion Representations for Physics-Based Control

Author: Luo, Zhengyi, Cao, Jinkun, Merel, Josh, Winkler, Alexander, Huang, Jing, Kitani, Kris, and Xu, Weipeng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics, Computer Science - Robotics
Abstract: We present a universal motion representation that encompasses a comprehensive range of motor skills for physics-based humanoid control. Due to the high dimensionality of humanoids and the inherent difficulties in reinforcement learning, prior methods have focused on learning skill embeddings for a narrow range of movement styles (e.g. locomotion, game characters) from specialized motion datasets. This limited scope hampers their applicability in complex tasks. We close this gap by significantly increasing the coverage of our motion representation space. To achieve this, we first learn a motion imitator that can imitate all of human motion from a large, unstructured motion dataset. We then create our motion representation by distilling skills directly from the imitator. This is achieved by using an encoder-decoder structure with a variational information bottleneck. Additionally, we jointly learn a prior conditioned on proprioception (humanoid's own pose and velocities) to improve model expressiveness and sampling efficiency for downstream tasks. By sampling from the prior, we can generate long, stable, and diverse human motions. Using this latent space for hierarchical RL, we show that our policies solve tasks using human-like behavior. We demonstrate the effectiveness of our motion representation by solving generative tasks (e.g. strike, terrain traversal) and motion tracking using VR controllers., Comment: ICLR 2024 Spotlight. Project page: https://zhengyiluo.github.io/PULSE/
Published: 2023

23. OpenMM 8: Molecular Dynamics Simulation with Machine Learning Potentials

Author: Eastman, Peter, Galvelis, Raimondas, Peláez, Raúl P., Abreu, Charlles R. A., Farr, Stephen E., Gallicchio, Emilio, Gorenko, Anton, Henry, Michael M., Hu, Frank, Huang, Jing, Krämer, Andreas, Michel, Julien, Mitchell, Joshua A., Pande, Vijay S., Rodrigues, João PGLM, Rodriguez-Guerra, Jaime, Simmonett, Andrew C., Swails, Jason, Zhang, Ivy, Chodera, John D., De Fabritiis, Gianni, and Markland, Thomas E.
Subjects: Physics - Chemical Physics, Computer Science - Machine Learning, J.2, J.3
Abstract: Machine learning plays an important and growing role in molecular simulation. The newest version of the OpenMM molecular dynamics toolkit introduces new features to support the use of machine learning potentials. Arbitrary PyTorch models can be added to a simulation and used to compute forces and energy. A higher-level interface allows users to easily model their molecules of interest with general purpose, pretrained potential functions. A collection of optimized CUDA kernels and custom PyTorch operations greatly improves the speed of simulations. We demonstrate these features on simulations of cyclin-dependent kinase 8 (CDK8) and the green fluorescent protein (GFP) chromophore in water. Taken together, these features make it practical to use machine learning to improve the accuracy of simulations at only a modest increase in cost., Comment: 15 pages, 4 figures
Published: 2023

24. Rigorously Assessing Natural Language Explanations of Neurons

Author: Huang, Jing, Geiger, Atticus, D'Oosterlinck, Karel, Wu, Zhengxuan, and Potts, Christopher
Subjects: Computer Science - Computation and Language
Abstract: Natural language is an appealing medium for explaining how large language models process and store information, but evaluating the faithfulness of such explanations is challenging. To help address this, we develop two modes of evaluation for natural language explanations that claim individual neurons represent a concept in a text input. In the observational mode, we evaluate claims that a neuron $a$ activates on all and only input strings that refer to a concept picked out by the proposed explanation $E$. In the intervention mode, we construe $E$ as a claim that the neuron $a$ is a causal mediator of the concept denoted by $E$. We apply our framework to the GPT-4-generated explanations of GPT-2 XL neurons of Bills et al. (2023) and show that even the most confident explanations have high error rates and little to no causal efficacy. We close the paper by critically assessing whether natural language is a good choice for explanations and whether neurons are the best level of analysis.
Published: 2023

25. A Survey of Diffusion Based Image Generation Models: Issues and Their Solutions

Author: Zhang, Tianyi, Wang, Zheng, Huang, Jing, Tasnim, Mohiuddin Muhammad, and Shi, Wei
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Recently, there has been significant progress in the development of large models. Following the success of ChatGPT, numerous language models have been introduced, demonstrating remarkable performance. Similar advancements have also been observed in image generation models, such as Google's Imagen model, OpenAI's DALL-E 2, and stable diffusion models, which have exhibited impressive capabilities in generating images. However, similar to large language models, these models still encounter unresolved challenges. Fortunately, the availability of open-source stable diffusion models and their underlying mathematical principles has enabled the academic community to extensively analyze the performance of current image generation models and make improvements based on this stable diffusion framework. This survey aims to examine the existing issues and the current solutions pertaining to image generation models.
Published: 2023

26. Model-Free Large-Scale Cloth Spreading With Mobile Manipulation: Initial Feasibility Study

Author: Chu, Xiangyu, Wang, Shengzhi, Feng, Minjian, Zheng, Jiaxi, Zhao, Yuxuan, Huang, Jing, and Au, K. W. Samuel
Subjects: Computer Science - Robotics, Electrical Engineering and Systems Science - Systems and Control
Abstract: Cloth manipulation is common in domestic and service tasks, and most studies use fixed-base manipulators to manipulate objects whose sizes are relatively small with respect to the manipulators' workspace, such as towels, shirts, and rags. In contrast, manipulation of large-scale cloth, such as bed making and tablecloth spreading, poses additional challenges of reachability and manipulation control. To address them, this paper presents, as an initial feasibility study, a novel framework to spread large-scale cloth with a single-arm mobile manipulator that can solve the reachability issue. On the manipulation control side, without modeling highly deformable cloth, a vision-based manipulation control scheme is applied, based on an online-updated Jacobian matrix mapping from selected feature points to the end-effector motion. To coordinate the control of the manipulator and mobile platform, Behavior Trees (BTs) are used because of their modularity. Finally, experiments are conducted, including validation of the model-free manipulation control for cloth spreading in different conditions and the large-scale cloth spreading framework. The experimental results demonstrate the feasibility of the large-scale cloth spreading task with a single-arm mobile manipulator and the model-free deformation controller., Comment: 6 pages, 6 figures, submitted to CASE2023
Published: 2023

27. A potential third-generation gravitational-wave detector based on autocorrelative weak-value amplification

Author: Huang, Jing-Hui, He, Fei-Fan, Duan, Xue-Ying, Wang, Guang-Jun, and Hu, Xiang-Yun
Subjects: General Relativity and Quantum Cosmology, Astrophysics - Instrumentation and Methods for Astrophysics
Abstract: Reducing noises and enhancing signal-to-noise ratios (SNRs) have become critical for designing third-generation gravitational-wave (GW) detectors with a GW strain of less than $10^{-23}$/$\rm \sqrt{Hz}$. In this paper, we propose a potential third-generation GW detector based on autocorrelative weak-value amplification (AWVA) for GW detection with a strain of $h_g =$ $4 \times 10^{-25}$/$\rm \sqrt{Hz}$. In our scheme, a GW event induces a phase difference $\Delta \phi$ by passing through an 11-bounce delay line, 10-km arm-length, zero-area Sagnac interferometer illuminated with a 1064-nm laser. Subsequently, $\Delta \phi$ is amplified as the parameter of post-selection by choosing the appropriate pre-selected state and coupling strength in AWVA. In particular, we theoretically investigate the AWVA measurements for GW detection within the frequency band of 200 Hz $\leq$ $f_g$ $\leq$ 800 Hz, considering Gaussian noises with negative-decibel SNRs. The peak response of the AWVA sensitivity $\kappa(f_g)$ occurs at frequency $f_{g, max}$ = 500 Hz, which falls within the frequency band of interest of the current third-generation GW detectors. Our simulation results indicate that AWVA can demonstrate a measurable sensitivity of $\Theta(f_g)$ within the frequency band of interest. Moreover, the robustness of WVA shows promising potential in mitigating the effects of Gaussian noises., Comment: 21 pages, 6 figures
Published: 2023

28. Unsupervised Melody-to-Lyric Generation

Author: Tian, Yufei, Narayan-Chen, Anjali, Oraby, Shereen, Cervone, Alessandra, Sigurdsson, Gunnar, Tao, Chenyang, Zhao, Wenbo, Chen, Yiwen, Chung, Tagyoung, Huang, Jing, and Peng, Nanyun
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Automatic melody-to-lyric generation is a task in which song lyrics are generated to go with a given melody. It is of significant practical interest and more challenging than unconstrained lyric generation as the music imposes additional constraints onto the lyrics. The training data is limited as most songs are copyrighted, resulting in models that underfit the complicated cross-modal relationship between melody and lyrics. In this work, we propose a method for generating high-quality lyrics without training on any aligned melody-lyric data. Specifically, we design a hierarchical lyric generation framework that first generates a song outline and second the complete lyrics. The framework enables disentanglement of training (based purely on text) from inference (melody-guided text generation) to circumvent the shortage of parallel data. We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints as guidance during inference. The two-step hierarchical design also enables content control via the lyric outline, a much-desired feature for democratizing collaborative song creation. Experimental results show that our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines, for example SongMASS, a SOTA model trained on a parallel dataset, with a 24% relative overall quality improvement based on human ratings., Comment: ACL 2023. arXiv admin note: substantial text overlap with arXiv:2305.07760
Published: 2023

29. BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks

Author: Zhang, Kai, Zhou, Rong, Adhikarla, Eashan, Yan, Zhiling, Liu, Yixin, Yu, Jun, Liu, Zhengliang, Chen, Xun, Davison, Brian D., Ren, Hui, Huang, Jing, Chen, Chen, Zhou, Yuyin, Fu, Sunyang, Liu, Wei, Liu, Tianming, Li, Xiang, Chen, Yong, He, Lifang, Zou, James, Li, Quanzheng, Liu, Hongfang, and Sun, Lichao
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Traditional biomedical artificial intelligence (AI) models, designed for specific tasks or modalities, often exhibit limited flexibility in real-world deployment and struggle to utilize holistic information. Generalist AI holds the potential to address these limitations due to its versatility in interpreting different data types and generating tailored outputs for diverse needs. However, existing biomedical generalist AI solutions are typically heavyweight and closed source to researchers, practitioners, and patients. Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model, designed as a generalist capable of performing various biomedical tasks. BiomedGPT achieved state-of-the-art results in 16 out of 25 experiments while maintaining a computing-friendly model scale. We also conducted human evaluations to assess the capabilities of BiomedGPT in radiology visual question answering, report generation, and summarization. BiomedGPT exhibits robust prediction ability with a low error rate of 3.8% in question answering, satisfactory performance with an error rate of 8.3% in writing complex radiology reports, and competitive summarization ability with a nearly equivalent preference score to human experts. Our method demonstrates that effective training with diverse data can lead to more practical biomedical AI for improving diagnosis and workflow efficiency., Comment: Fix incorrect citations and add journal reference for the published version. Nat Med (2024)
Published: 2023

30. Code-Switched Text Synthesis in Unseen Language Pairs

Author: Hsu, I-Hung, Ray, Avik, Garg, Shubham, Peng, Nanyun, and Huang, Jing
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Existing efforts on text synthesis for code-switching mostly require training on code-switched texts in the target language pairs, limiting the deployment of the models to cases lacking code-switched data. In this work, we study the problem of synthesizing code-switched texts for language pairs absent from the training data. We introduce GLOSS, a model built on top of a pre-trained multilingual machine translation model (PMMTM) with an additional code-switching module. This module, either an adapter or extra prefixes, learns code-switching patterns from code-switched data during training, while the primary component of GLOSS, i.e., the PMMTM, is frozen. The design of only adjusting the code-switching module prevents our model from overfitting to the constrained training data for code-switching. Hence, GLOSS exhibits the ability to generalize and synthesize code-switched texts across a broader spectrum of language pairs. Additionally, we develop a self-training algorithm on target language pairs to further enhance the reliability of GLOSS. Automatic evaluations on four language pairs show that GLOSS achieves at least 55% relative improvements in BLEU and METEOR scores compared to strong baselines. Human evaluations on two language pairs further validate the success of GLOSS., Comment: Paper accepted by ACL2023 as a Findings paper
Published: 2023

31. Do We Need an Encoder-Decoder to Model Dynamical Systems on Networks?

Author: Liu, Bing, Luo, Wei, Li, Gang, Huang, Jing, and Yang, Bo
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: As deep learning gains popularity in modelling dynamical systems, we expose an underappreciated misunderstanding relevant to modelling dynamics on networks. Strongly influenced by graph neural networks, latent vertex embeddings are naturally adopted in many neural dynamical network models. However, we show that embeddings tend to induce a model that fits observations well but simultaneously has incorrect dynamical behaviours. Recognising that previous studies narrowly focus on short-term predictions during the transient phase of a flow, we propose three tests for correct long-term behaviour, and illustrate how an embedding-based dynamical model fails these tests, and analyse the causes, particularly through the lens of topological conjugacy. In doing so, we show that the difficulties can be avoided by not using embedding. We propose a simple embedding-free alternative based on parametrising two additive vector-field components. Through extensive experiments, we verify that the proposed model can reliably recover a broad class of dynamics on different network topologies from time series data., Comment: Accepted by IJCAI 2023
Published: 2023

32. Unsupervised Melody-Guided Lyrics Generation

Author: Tian, Yufei, Narayan-Chen, Anjali, Oraby, Shereen, Cervone, Alessandra, Sigurdsson, Gunnar, Tao, Chenyang, Zhao, Wenbo, Chung, Tagyoung, Huang, Jing, and Peng, Nanyun
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Multimedia
Abstract: Automatic song writing is a topic of significant practical interest. However, its research is largely hindered by the lack of training data due to copyright concerns and challenged by its creative nature. Most noticeably, prior works often fall short of modeling the cross-modal correlation between melody and lyrics due to limited parallel data, hence generating lyrics that are less singable. Existing works also lack effective mechanisms for content control, a much desired feature for democratizing song creation for people with limited music background. In this work, we propose to generate pleasantly listenable lyrics without training on melody-lyric aligned data. Instead, we design a hierarchical lyric generation framework that disentangles training (based purely on text) from inference (melody-guided text generation). At inference time, we leverage the crucial alignments between melody and lyrics and compile the given melody into constraints to guide the generation process. Evaluation results show that our model can generate high-quality lyrics that are more singable, intelligible, coherent, and in rhyme than strong baselines including those supervised on parallel data., Comment: Presented at AAAI23 CreativeAI workshop (Non-Archival). A later version is accepted to ACL23
Published: 2023

33. Gallai-like characterization of strong cocomparability graphs

Author: Huang, Jing
Subjects: Mathematics - Combinatorics, Computer Science - Computational Complexity
Abstract: Strong cocomparability graphs are the reflexive graphs whose adjacency matrix can be rearranged by a simultaneous row and column permutation to avoid the submatrix with rows $01, 10$. Strong cocomparability graphs form a subclass of cocomparability graphs (i.e., the complements of comparability graphs) and can be recognized in polynomial time. In his seminal paper, Gallai characterized cocomparability graphs in terms of a forbidden structure called asteroids. Gallai proved that cocomparability graphs are precisely those reflexive graphs which do not contain asteroids. In this paper, we give a characterization of strong cocomparability graphs which is analogous to Gallai's characterization for cocomparability graphs. We prove that strong cocomparability graphs are precisely those reflexive graphs which do not contain weak edge-asteroids (a weaker version of asteroids). Our characterization also leads to a polynomial time recognition algorithm for strong cocomparability graphs., Comment: 9 pages
Published: 2023

34. Exciton-assisted electron tunneling in van der Waals heterostructures

Author: Wang, Lujun, Papadopoulos, Sotirios, Iyikanat, Fadil, Zhang, Jian, Huang, Jing, Watanabe, Kenji, Taniguchi, Takashi, Calame, Michel, Perrin, Mickael L., de Abajo, F. Javier García, and Novotny, Lukas
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Materials Science
Abstract: The control of elastic and inelastic electron tunneling relies on materials with well defined interfaces. Van der Waals materials made of two-dimensional constituents form an ideal platform for such studies. Signatures of acoustic phonons and defect states have been observed in current-to-voltage ($I-V$) measurements. These features can be explained by direct electron-phonon or electron-defect interactions. Here, we use a novel tunneling process that involves excitons in transition metal dichalcogenides (TMDs). We study tunnel junctions consisting of graphene and gold electrodes separated by hexagonal boron nitride (hBN) with an adjacent TMD monolayer and observe prominent resonant features in $I-V$ measurements. These resonances appear at bias voltages that correspond to TMD exciton energies. By placing the TMD outside of the tunneling pathway, we demonstrate that this phonon-exciton mediated tunneling process does not require any charge injection into the TMD. This work demonstrates the appearance of optical modes in electrical transport measurements and introduces a new functionality for optoelectronic devices based on van der Waals materials., Comment: 26 pages, 23 figures, 77 references
Published: 2023

35. Overbias photon emission from light-emitting devices based on monolayer transition metal dichalcogenides

Author: Shan, Shengyu, Huang, Jing, Papadopoulos, Sotirios, Khelifa, Ronja, Taniguchi, Takashi, Watanabe, Kenji, Wang, Lujun, and Novotny, Lukas
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Materials Science
Abstract: Tunneling light-emitting devices (LEDs) based on transition metal dichalcogenides (TMDs) and other 2D materials are a new platform for on-chip optoelectronic integration. Some of the physical processes underlying this LED architecture are not fully understood, especially the emission at photon energies higher than the applied electrostatic potential, so-called overbias emission. Here we report overbias emission for potentials that are near half of the optical bandgap energy in TMD-based tunneling LEDs. We show that this emission is not thermal in nature, but consistent with exciton generation via a two-electron coherent tunneling process., Comment: 6 pages, 4 figures
Published: 2023

36. Estimating the Instantaneous Reproduction Number With Imperfect Data: A Method to Account for Case-Reporting Variation and Serial Interval Uncertainty

Author: Hettinger, Gary, Rubin, David, and Huang, Jing
Subjects: Statistics - Methodology
Abstract: During an infectious disease outbreak, public health decision-makers require real-time monitoring of disease transmission to respond quickly and intelligently. In these settings, a key measure of transmission is the instantaneous time-varying reproduction number, $R_t$. Estimation of this number using a Time-Since-Infection model relies on case-notification data and the distribution of the serial interval on the target population. However, in practice, case-notification data may contain measurement error due to variation in case reporting while available serial interval estimates may come from studies on non-representative populations. We propose a new data-driven method that accounts for particular forms of case-reporting measurement error and can incorporate multiple partially representative serial interval estimates into the transmission estimation process. In addition, we provide practical tools for automatically identifying measurement error patterns and determining when measurement error may not be adequately accounted for. We illustrate the potential bias undertaken by methods that ignore these practical concerns through a variety of simulated outbreaks. We then demonstrate the use of our method on data from the COVID-19 pandemic to estimate transmission and explore the relationships between social distancing, temperature, and transmission.
Published: 2023

37. Crowd3D: Towards Hundreds of People Reconstruction from a Single Image

Author: Wen, Hao, Huang, Jing, Cui, Huili, Lin, Haozhe, Lai, YuKun, Fang, Lu, and Li, Kun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Image-based multi-person reconstruction in wide-field large scenes is critical for crowd analysis and security alert. However, existing methods cannot deal with large scenes containing hundreds of people, which pose the challenges of a large number of people, large variations in human scale, and complex spatial distribution. In this paper, we propose Crowd3D, the first framework to reconstruct the 3D poses, shapes and locations of hundreds of people with global consistency from a single large-scene image. The core of our approach is to convert the problem of complex crowd localization into pixel localization with the help of our newly defined concept, Human-scene Virtual Interaction Point (HVIP). To reconstruct the crowd with global consistency, we propose a progressive reconstruction network based on HVIP by pre-estimating a scene-level camera and a ground plane. To deal with a large number of persons and various human sizes, we also design an adaptive human-centric cropping scheme. In addition, we contribute a benchmark dataset, LargeCrowd, for crowd reconstruction in a large scene. Experimental results demonstrate the effectiveness of the proposed method. The code and datasets will be made public., Comment: Accepted by CVPR 2023
Published: 2023

38. Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability

Author: Geiger, Atticus, Ibeling, Duligur, Zur, Amir, Chaudhary, Maheep, Chauhan, Sonakshi, Huang, Jing, Arora, Aryaman, Wu, Zhengxuan, Goodman, Noah, Potts, Christopher, and Icard, Thomas
Subjects: Computer Science - Artificial Intelligence
Abstract: Causal abstraction provides a theoretical foundation for mechanistic interpretability, the field concerned with providing intelligible algorithms that are faithful simplifications of the known, but opaque low-level details of black box AI models. Our contributions are (1) generalizing the theory of causal abstraction from mechanism replacement (i.e., hard and soft interventions) to arbitrary mechanism transformation (i.e., functionals from old mechanisms to new mechanisms), (2) providing a flexible, yet precise formalization for the core concepts of modular features, polysemantic neurons, and graded faithfulness, and (3) unifying a variety of mechanistic interpretability methodologies in the common language of causal abstraction, namely activation and path patching, causal mediation analysis, causal scrubbing, causal tracing, circuit analysis, concept erasure, sparse autoencoders, differential binary masking, distributed alignment search, and activation steering.
Published: 2023

39. Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training

Author: Huang, Jing, Wu, Zhengxuan, Mahowald, Kyle, and Potts, Christopher
Subjects: Computer Science - Computation and Language
Abstract: Language tasks involving character-level manipulations (e.g., spelling corrections, arithmetic operations, word games) are challenging for models operating on subword units. To address this, we develop a causal intervention framework to learn robust and interpretable character representations inside subword-based language models. Our method treats each character as a typed variable in a causal model and learns such causal structures by adapting the interchange intervention training method of Geiger et al. (2021). We additionally introduce a suite of character-level tasks that systematically vary in their dependence on meaning and sequence-level context. While character-level models still perform best on purely form-based tasks like string reversal, our method outperforms character-level models on more complex tasks that blend form, meaning, and context, such as spelling correction in context and word search games. Compared with standard subword-based models, our approach also significantly improves robustness on unseen token sequences and leads to human-interpretable internal representations of characters., Comment: Findings of the Association for Computational Linguistics: ACL 2023
Published: 2022

40. Early-Phase Local-Area Model for Pandemics Using Limited Data: A SARS-CoV-2 Application

Author: Shi, Jiasheng, Morris, Jeffrey S., Rubin, David M., and Huang, Jing
Subjects: Statistics - Methodology, Statistics - Applications
Abstract: The emergence of novel infectious agents presents challenges to statistical models of disease transmission. These challenges arise from limited, poor-quality data and an incomplete understanding of the agent. Moreover, outbreaks manifest differently across regions due to various factors, making it imperative for models to factor in regional specifics. In this work, we offer a model that effectively utilizes constrained data resources to estimate disease transmission rates at the local level, especially during the early outbreak phase when primarily infection counts and aggregated local characteristics are accessible. This model merges a pathogen transmission methodology based on daily infection numbers with regression techniques, drawing correlations between disease transmission and local-area factors, such as demographics, health policies, behavior, and even climate, to estimate and forecast daily infections. We incorporate the quasi-score method and an error term to navigate potential data concerns and mistaken assumptions. Additionally, we introduce an online estimator that facilitates real-time data updates, complemented by an iterative algorithm for parameter estimation. This approach facilitates real-time analysis of disease transmission when data quality is suboptimal and knowledge of the infectious pathogen is limited. It is particularly useful in the early stages of outbreaks, providing support for local decision-making.
Published: 2022

41. Strong cocomparability graphs and Slash-free orderings of matrices

Author: Hell, Pavol, Huang, Jing, and Lin, Jephian C. -H.
Subjects: Mathematics - Combinatorics
Abstract: We introduce the class of strong cocomparability graphs, as the class of reflexive graphs whose adjacency matrix can be rearranged by a simultaneous row and column permutation to avoid the submatrix with rows 01, 10, which we call Slash. We provide an ordering characterization, a forbidden structure characterization, and a polynomial-time recognition algorithm for the class. These results complete the picture in which in addition to, or instead of, the Slash matrix one forbids the Gamma matrix (which has rows 11, 10). It is well known that in these two cases one obtains the class of interval graphs, and the class of strongly chordal graphs, respectively. By complementation, we obtain the class of strong comparability graphs, whose adjacency matrix can be rearranged by a simultaneous row and column permutation to avoid the two-by-two identity submatrix. Thus our results give characterizations and algorithms for this class of irreflexive graphs as well. In other words, our results may be interpreted as solving the following problem: given a symmetric 0,1-matrix with 0-diagonal, can its rows and columns be simultaneously permuted to avoid the two-by-two identity submatrix?, Comment: 19 pages, 3 figures
Published: 2022

42. Context-Situated Pun Generation

Author: Sun, Jiao, Narayan-Chen, Anjali, Oraby, Shereen, Gao, Shuyang, Chung, Tagyoung, Huang, Jing, Liu, Yang, and Peng, Nanyun
Subjects: Computer Science - Computation and Language
Abstract: Previous work on pun generation commonly begins with a given pun word (a pair of homophones for heterographic pun generation and a polyseme for homographic pun generation) and seeks to generate an appropriate pun. While this may enable efficient pun generation, we believe that a pun is most entertaining if it fits appropriately within a given context, e.g., a given situation or dialogue. In this work, we propose a new task, context-situated pun generation, where a specific context represented by a set of keywords is provided, and the task is to first identify suitable pun words that are appropriate for the context, then generate puns based on the context keywords and the identified pun words. We collect CUP (Context-sitUated Pun), containing 4.5k tuples of context words and pun pairs. Based on the new data and setup, we propose a pipeline system for context-situated pun generation, including a pun word retrieval module that identifies suitable pun words for a given context, and a generation module that generates puns from context keywords and pun words. Human evaluation shows that 69% of our top retrieved pun words can be used to generate context-situated puns, and our generation module yields successful puns 31% of the time given a plausible tuple of context words and pun pair, almost tripling the yield of a state-of-the-art pun generation model. With an end-to-end evaluation, our pipeline system with the top-1 retrieved pun pair for a given context can generate successful puns 40% of the time, better than all other modeling variations but 32% lower than the human success rate. This highlights the difficulty of the task, and encourages more research in this direction., Comment: Accepted to EMNLP 2022 main conference
Published: 2022

43. ExPUNations: Augmenting Puns with Keywords and Explanations

Author: Sun, Jiao, Narayan-Chen, Anjali, Oraby, Shereen, Cervone, Alessandra, Chung, Tagyoung, Huang, Jing, Liu, Yang, and Peng, Nanyun
Subjects: Computer Science - Computation and Language
Abstract: The tasks of humor understanding and generation are challenging and subjective even for humans, requiring commonsense and real-world knowledge to master. Puns, in particular, add the challenge of fusing that knowledge with the ability to interpret lexical-semantic ambiguity. In this paper, we present the ExPUNations (ExPUN) dataset, in which we augment an existing dataset of puns with detailed crowdsourced annotations of keywords denoting the most distinctive words that make the text funny, pun explanations describing why the text is funny, and fine-grained funniness ratings. This is the first humor dataset with such extensive and fine-grained annotations specifically for puns. Based on these annotations, we propose two tasks: explanation generation to aid with pun classification and keyword-conditioned pun generation, to challenge the current state-of-the-art natural language understanding and generation models' ability to understand and generate humor. We showcase that the annotated keywords we collect are helpful for generating better novel humorous texts in human evaluation, and that our natural language explanations can be leveraged to improve both the accuracy and robustness of humor classifiers., Comment: Accepted to EMNLP 2022 main conference
Published: 2022

44. Task Grouping for Multilingual Text Recognition

Author: Huang, Jing, Liang, Kevin J, Kovvuri, Rama, and Hassner, Tal
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Most existing OCR methods focus on alphanumeric characters due to the popularity of English and numbers, as well as their corresponding datasets. On extending the characters to more languages, recent methods have shown that training different scripts with different recognition heads can greatly improve the end-to-end recognition accuracy compared to combining characters from all languages in the same recognition head. However, we postulate that similarities between some languages could allow sharing of model parameters and benefit from joint training. Determining language groupings, however, is not immediately obvious. To this end, we propose an automatic method for multilingual text recognition with a task grouping and assignment module using Gumbel-Softmax, introducing a task grouping loss and weighted recognition loss to allow for simultaneous training of the models and grouping modules. Experiments on MLT19 lend evidence to our hypothesis that there is a middle ground between combining every task together and separating every task that achieves a better configuration of task grouping/separation., Comment: ECCV 2022: Text in Everything (TIE) Workshop (Oral)
Published: 2022

45. The Effect of Non-Gaussian Noise on Auto-correlative Weak-value Amplification

Author: Huang, Jing-Hui, Lundeen, J. S., Dada, Adetunmise C., Jordan, Kyle M., Wang, Guang-Jun, Duan, Xue-Ying, and Hu, Xiang-Yun
Subjects: Quantum Physics, Physics - Optics
Abstract: Accurate knowledge of the spectral features of noise and their influence on open quantum systems is fundamental for quantitative understanding and prediction of the dynamics in a realistic environment. For the weak measurements of two-level systems, the weak value obtained from experiments will inevitably be affected by the noise of the environment. Following our earlier work on the technique of the auto-correlative weak-value amplification (AWVA) approach under a Gaussian noise environment, here we study the effect of non-Gaussian noise on the AWVA technique. In particular, two types of noise with a negative-dB signal-to-noise ratio, frequency-stationary noises and frequency-nonstationary noises, are studied. The various frequency-stationary noises, including low-frequency (1/f) noises, medium-frequency noises, and high-frequency noises, are generated in Simulink by passing Gaussian white noise through different band-pass filters. Impulsive noise is studied as an example of frequency-nonstationary noise. Our simulated results demonstrate that 1/f noises and impulsive noises have greater disturbance on the AWVA measurements. In addition, adding one kind of frequency-stationary noise, clamping the detected signals, and dominating the measurement range may have the potential to improve the precision of the AWVA technique with both a smaller deviation of the mean value and a smaller error bar in the presence of many hostile non-Gaussian noises., Comment: 22 pages, 8 figures
Published: 2022

46. Semi-strict chordality of digraphs

Author: Huang, Jing and Ye, Ying Ying
Subjects: Mathematics - Combinatorics, Computer Science - Discrete Mathematics
Abstract: Chordal graphs are important in algorithmic graph theory. Chordal digraphs are a digraph analogue of chordal graphs and have been a subject of active studies recently. Unlike chordal graphs, chordal digraphs lack many structural properties such as forbidden subdigraph or representation characterizations. In this paper we introduce the notion of semi-strict chordal digraphs which form a class strictly between chordal digraphs and chordal graphs. Semi-strict chordal digraphs have rich structural properties. We characterize semi-strict chordal digraphs in terms of knotting graphs, a notion analogous to the one introduced by Gallai for the study of comparability graphs. We also give forbidden subdigraph characterizations of semi-strict chordal digraphs within the cases of locally semicomplete digraphs and weakly quasi-transitive digraphs., Comment: 16 pages, 4 figures. arXiv admin note: text overlap with arXiv:2008.03568
Published: 2022

47. Extremal affine subspaces and Khintchine-Jarník type theorems

Author: Huang, Jing-Jing
Subjects: Mathematics - Number Theory
Abstract: We prove a conjecture of Kleinbock which gives a clear-cut classification of all extremal affine subspaces of $\mathbb{R}^n$. We also give an essentially complete classification of all Khintchine type affine subspaces, except for some boundary cases within two logarithmic scales. More general Jarník type theorems are proved as well, sometimes without the monotonicity of the approximation function. These results follow as consequences of our novel estimates for the number of rational points close to an affine subspace in terms of diophantine properties of its defining matrix. Our main tool is the multidimensional large sieve inequality and its dual form., Comment: 46 pages, minor revision, to appear in GAFA
Published: 2022

48. SpanDrop: Simple and Effective Counterfactual Learning for Long Sequences

Author: Qi, Peng, Wang, Guangtao, and Huang, Jing
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: Distilling supervision signal from a long sequence to make predictions is a challenging task in machine learning, especially when not all elements in the input sequence contribute equally to the desired output. In this paper, we propose SpanDrop, a simple and effective data augmentation technique that helps models identify the true supervision signal in a long sequence with very few examples. By directly manipulating the input sequence, SpanDrop randomly ablates parts of the sequence at a time and asks the model to perform the same task to emulate counterfactual learning and achieve input attribution. Based on theoretical analysis of its properties, we also propose a variant of SpanDrop based on the beta-Bernoulli distribution, which yields diverse augmented sequences while providing a learning objective that is more consistent with the original dataset. We demonstrate the effectiveness of SpanDrop on a set of carefully designed toy tasks, as well as various natural language processing tasks that require reasoning over long sequences to arrive at the correct answer, and show that it helps models improve performance both when data is scarce and abundant., Comment: Peng Qi and Guangtao Wang contributed equally
Published: 2022

49. Type of Non-reciprocity in Fiber Sagnac Interferometer Induced by Geometric Phases

Author: Zhao, Dongzi, Huang, Jing-Zheng, Xiao, Tailong, Li, Hongjing, Wu, Xiaoyan, and Zeng, Guihua
Subjects: Physics - Optics, Quantum Physics
Abstract: The non-reciprocity of the Sagnac interferometer provides ultra-high sensitivity for parameter estimation and offers a wide range of applications, especially for optical fiber sensing. In this work, we study a new type of non-reciprocity that exists in the optical fiber Sagnac interferometer when the polarization-dependent loss is taken into consideration. In particular, this non-reciprocity is unrelated to the physical effects considered in previous studies and originates from the geometric phases induced by continuous weak measurement. As a consequence, it exhibits a unique sudden phase transition, which may open a new way for the future design of high-precision optical fiber sensors., Comment: 10 pages, 11 figures. The wrong author list in v1 has been corrected
Published: 2022

50. Monolithic Integration of Embedded III-V Lasers on SOI

Author: Wei, Wen Qi, He, An, Yang, Bo, Huang, Jing-Zhi, Han, Dong, Ming, Min, Wang, Zi Hao, Guo, Xuhan, Su, Yikai, Zhang, Jian Jun, and Wang, Ting
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: Silicon photonic integration has gained great success in many application fields owing to the excellent optical device properties and complementary metal-oxide semiconductor (CMOS) compatibility. Realizing monolithic integration of III-V lasers and silicon photonic components on a single silicon wafer is recognized as a long-standing obstacle for ultra-dense photonic integration. Such integration could provide economical, energy-efficient and foundry-scalable on-chip light sources, but it has not been reported yet. Here, we demonstrate embedded InAs/GaAs quantum dot (QD) lasers directly grown on a trenched silicon-on-insulator (SOI) substrate, enabling monolithic integration with butt-coupled silicon waveguides. By utilizing the patterned grating structures inside pre-defined SOI trenches and a unique epitaxial method via molecular beam epitaxy (MBE), high-performance embedded InAs QD lasers with out-coupled silicon waveguides are achieved on such a template. By resolving the epitaxy and fabrication challenges in such a monolithic integrated architecture, embedded III-V lasers on SOI with continuous-wave lasing up to 85 °C are obtained. A maximum output power of 6.8 mW is measured from the end tip of the butt-coupled silicon waveguides, with an estimated coupling efficiency of approximately -7.35 dB. The results presented here provide a scalable and low-cost epitaxial method for the realization of on-chip light sources directly coupled to silicon photonic components for future high-density photonic integration.
Published: 2022

191 results on '"HUANG, Jing"'
