Author: "ZHANG, Liang" / Search Limiters: Available in Library Collection - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"ZHANG, Liang"' showing total 13,150 results

1. mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding

Author: Hu, Anwen, Xu, Haiyang, Zhang, Liang, Ye, Jiabo, Yan, Ming, Zhang, Ji, Jin, Qin, Huang, Fei, and Zhou, Jingren
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Multimodal Large Language Models (MLLMs) have achieved promising OCR-free Document Understanding performance by increasing the supported resolution of document images. However, this comes at the cost of generating thousands of visual tokens for a single document image, leading to excessive GPU memory use and slower inference times, particularly in multi-page document comprehension. In this work, to address these challenges, we propose a High-resolution DocCompressor module to compress each high-resolution document image into 324 tokens, guided by low-resolution global visual features. With this compression module, to strengthen multi-page document comprehension ability and balance both token efficiency and question-answering performance, we develop the DocOwl2 under a three-stage training framework: Single-image Pretraining, Multi-image Continue-pretraining, and Multi-task Finetuning. DocOwl2 sets a new state-of-the-art across multi-page document understanding benchmarks and reduces first token latency by more than 50%, demonstrating advanced capabilities in multi-page question answering, explanation with evidence pages, and cross-page structure understanding. Additionally, compared to single-image MLLMs trained on similar data, our DocOwl2 achieves comparable single-page understanding performance with less than 20% of the visual tokens. Our code, models, and data are publicly available at https://github.com/X-PLUG/mPLUG-DocOwl/tree/main/DocOwl2., Comment: 15 pages, 7 figures
Published: 2024

2. Modeling Reference-dependent Choices with Graph Neural Networks

Author: Zhang, Liang, Liu, Guannan, Wu, Junjie, and Tan, Yong
Subjects: Computer Science - Machine Learning, Computer Science - Computers and Society
Abstract: While the classic Prospect Theory has highlighted the reference-dependent and comparative nature of consumers' product evaluation processes, few models have successfully integrated this theoretical hypothesis into data-driven preference quantification, particularly in the realm of recommender systems development. To bridge this gap, we propose a new research problem of modeling reference-dependent preferences from a data-driven perspective, and design a novel deep learning-based framework named Attributed Reference-dependent Choice Model for Recommendation (ArcRec) to tackle the inherent challenges associated with this problem. ArcRec builds a reference network from aggregated historical purchase records to instantiate theoretical reference points; this network is then decomposed into product-attribute-specific sub-networks and represented through Graph Neural Networks. In this way, a consumer's reference points can be encoded individually at the attribute level from her past experiences while also reflecting crowd influences. ArcRec also makes novel contributions to quantifying consumers' reference-dependent preferences using a deep neural network-based utility function that integrates both interest-inspired and price-inspired preferences, with their complex interaction effects captured by an attribute-aware price sensitivity mechanism. Most importantly, ArcRec introduces a novel Attribute-level Willingness-To-Pay measure to the reference-dependent utility function, which captures a consumer's heterogeneous salience of product attributes by observing her attribute-level price tolerance for a product. Empirical evaluations on both synthetic and real-world online shopping datasets demonstrate ArcRec's superior performance over fourteen state-of-the-art baselines.
Published: 2024

3. Microsatellite-based real-time quantum key distribution

Author: Li, Yang, Cai, Wen-Qi, Ren, Ji-Gang, Wang, Chao-Ze, Yang, Meng, Zhang, Liang, Wu, Hui-Ying, Chang, Liang, Wu, Jin-Cai, Jin, Biao, Xue, Hua-Jian, Li, Xue-Jiao, Liu, Hui, Yu, Guang-Wen, Tao, Xue-Ying, Chen, Ting, Liu, Chong-Fei, Luo, Wen-Bin, Zhou, Jie, Yong, Hai-Lin, Li, Yu-Huai, Li, Feng-Zhi, Jiang, Cong, Chen, Hao-Ze, Wu, Chao, Tong, Xin-Hai, Xie, Si-Jiang, Zhou, Fei, Liu, Wei-Yue, Liu, Nai-Le, Li, Li, Xu, Feihu, Cao, Yuan, Yin, Juan, Shu, Rong, Wang, Xiang-Bin, Zhang, Qiang, Wang, Jian-Yu, Liao, Sheng-Kai, Peng, Cheng-Zhi, and Pan, Jian-Wei
Subjects: Quantum Physics
Abstract: A quantum network provides an infrastructure connecting quantum devices with revolutionary computing, sensing, and communication capabilities. As the best-known application of a quantum network, quantum key distribution (QKD) shares secure keys guaranteed by the laws of quantum mechanics. A quantum satellite constellation offers a solution to facilitate the quantum network on a global scale. The Micius satellite has verified the feasibility of satellite quantum communications; however, scaling up quantum satellite constellations is challenging, requiring small lightweight satellites, portable ground stations and real-time secure key exchange. Here we tackle these challenges and report the development of a quantum microsatellite capable of performing space-to-ground QKD using portable ground stations. The quantum microsatellite features a payload weighing approximately 23 kg, while the portable ground station weighs about 100 kg. These weights represent reductions by more than an order and two orders of magnitude, respectively, compared to the Micius satellite. Additionally, we multiplex bidirectional satellite-ground optical communication with quantum communication, enabling key distillation and secure communication in real time. Using the microsatellite and the portable ground stations, we demonstrate satellite-based QKD with multiple ground stations and achieve the sharing of up to 0.59 million bits of secure keys during a single satellite pass. The compact quantum payload can be readily assembled on existing space stations or small satellites, paving the way for a satellite-constellation-based quantum and classical network for widespread real-life applications., Comment: 40 pages, 8 figures
Published: 2024

4. Unraveling the hybrid origins of the X-ray non-thermal emission from IGR J17091-3624

Author: Lin, Zikun, Wang, Yanan, del Palacio, Santiago, Méndez, Mariano, Zhang, Shuang-Nan, Russell, Thomas D., Ji, Long, Zhang, Jin, Zhang, Liang, Altamirano, Diego, and Liu, Jifeng
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: We present a comprehensive study based on multi-wavelength observations from the NuSTAR, NICER, Swift, Fermi, NEOWISE, and ATCA telescopes during the 2022 outburst of the black hole X-ray binary IGR J17091-3624. Our investigation concentrates on the heartbeat-like variability in the X-ray emission, with the aim of using it as a tool to unravel the origin of the non-thermal emission during the heartbeat state. Through X-ray timing and spectral analysis, we observe that the heartbeat-like variability correlates with changes in the disk temperature, supporting the disk radiation pressure instability scenario. Moreover, in addition to a Comptonization component, our time-averaged and phase-resolved spectroscopy reveal the presence of a power-law component that varies independently from the disk component. Combined with the radio to X-ray spectral energy distribution fitting, our results suggest that the power-law component could originate from synchrotron self-Compton radiation in the jet, which requires a strong magnetic field of about $B = (0.3$-$3.5)\times10^6$ G. Additionally, assuming that IGR J17091-3624 and GRS 1915+105 share the same radio-X-ray correlation coefficient during both the hard and the heartbeat states, we obtain a distance of $13.7\pm2.3$ kpc for IGR J17091-3624., Comment: 19 pages, 11 figures, 3 tables; accepted for publication in ApJ
Published: 2024

5. Quantum Long Short-Term Memory for Drug Discovery

Author: Zhang, Liang, Xu, Yin, Wu, Mohan, Wang, Liang, and Xu, Hua
Subjects: Quantum Physics, Computer Science - Machine Learning, Quantitative Biology - Biomolecules
Abstract: Quantum computing combined with machine learning (ML) is an extremely promising research area, with numerous studies demonstrating that quantum machine learning (QML) is expected to solve scientific problems more effectively than classical ML. In this work, we successfully apply QML to drug discovery, showing that QML can significantly improve model performance and achieve faster convergence compared to classical ML. Moreover, we demonstrate that the model accuracy of the QML improves as the number of qubits increases. We also introduce noise to the QML model and find that it has little effect on our experimental conclusions, illustrating the high robustness of the QML model. This work highlights the potential of quantum computing to yield significant benefits for scientific advancement as qubit quantity increases and qubit quality improves in the future.
Published: 2024

6. Generative Adversarial Networks for Imputing Sparse Learning Performance

Author: Zhang, Liang, Yeasin, Mohammed, Lin, Jionghao, Havugimana, Felix, and Hu, Xiangen
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Learning performance data, such as correct or incorrect responses to questions in Intelligent Tutoring Systems (ITSs), is crucial for tracking and assessing the learners' progress and mastery of knowledge. However, the issue of data sparsity, characterized by unexplored questions and missing attempts, hampers accurate assessment and the provision of tailored, personalized instruction within ITSs. This paper proposes using the Generative Adversarial Imputation Networks (GAIN) framework to impute sparse learning performance data, reconstructed into a three-dimensional (3D) tensor representation across the dimensions of learners, questions and attempts. The computational process of our customized GAIN-based method imputes sparse data in a 3D tensor space, significantly enhanced by convolutional neural networks for its input and output layers. This adaptation also includes the use of a least squares loss function for optimization and aligns the shapes of the input and output with the dimensions of the questions-attempts matrices along the learners' dimension. Through extensive experiments on six datasets from various ITSs, including AutoTutor, ASSISTments and MATHia, we demonstrate that the GAIN approach generally outperforms existing methods such as tensor factorization and other generative adversarial network (GAN) based approaches in terms of imputation accuracy. This finding enhances comprehensive learning data modeling and analytics in AI-based education.
Published: 2024

7. Phase-resolved Spectroscopy of Low-frequency Quasi-periodic Oscillations from the Newly Discovered Black Hole X-ray Binary Swift J1727.8-1613

Author: Shui, Qing-Cang, Zhang, Shu, Peng, Jiang-Qiang, Zhang, Shuang-Nan, Chen, Yu-Peng, Ji, Long, Kong, Ling-Da, Feng, Hua, Yu, Zhuo-Li, Wang, Peng-Ju, Chang, Zhi, Yin, Hong-Xing, Qu, Jin-Lu, Tao, Lian, Ge, Ming-Yu, Zhang, Liang, and Li, Jian
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: Low-frequency quasi-periodic oscillations (LFQPOs) are commonly observed in X-ray light curves of black hole X-ray binaries (BHXRBs); however, their origin remains a topic of debate. In order to thoroughly investigate variations in spectral properties on the QPO timescale, we utilized the Hilbert-Huang transform technique to conduct phase-resolved spectroscopy across a broad energy band for LFQPOs in the newly discovered BHXRB Swift J1727.8-1613. This is achieved through quasi-simultaneous observations from Neutron star Interior Composition ExploreR (NICER), Nuclear Spectroscopic Telescope ARray (NuSTAR), and Hard X-ray Modulation Telescope (Insight-HXMT). Our analysis reveals that both the non-thermal and disk-blackbody components exhibit variations on the QPO timescale, with the former dominating the QPO variability. For the spectral parameters, we observe modulation of the disk temperature, spectral indices, and reflection fraction with the QPO phase with high statistical significance (>5\sigma). Notably, the variation in the disk temperature is found to precede the variations in the non-thermal and disk fluxes by ~0.4-0.5 QPO cycles. We suggest that these findings offer further evidence that the type-C QPO variability is a result of geometric effects of the accretion flow., Comment: Accepted for publication in The Astrophysical Journal
Published: 2024

8. HHGT: Hierarchical Heterogeneous Graph Transformer for Heterogeneous Graph Representation Learning

Author: Zhu, Qiuyu, Zhang, Liang, Xu, Qianxiong, Liu, Kaijun, Long, Cheng, and Wang, Xiaoyang
Subjects: Computer Science - Machine Learning, Computer Science - Databases
Abstract: Despite the success of Heterogeneous Graph Neural Networks (HGNNs) in modeling real-world Heterogeneous Information Networks (HINs), challenges such as expressiveness limitations and over-smoothing have prompted researchers to explore Graph Transformers (GTs) for enhanced HIN representation learning. However, research on GT in HINs remains limited, with two key shortcomings in existing work: (1) A node's neighbors at different distances in HINs convey diverse semantics. Unfortunately, existing methods ignore such differences and uniformly treat neighbors within a given distance in a coarse manner, which results in semantic confusion. (2) Nodes in HINs have various types, each with unique semantics. Nevertheless, existing methods mix nodes of different types during neighbor aggregation, hindering the capture of proper correlations between nodes of diverse types. To bridge these gaps, we design an innovative structure named (k,t)-ring neighborhood, where nodes are initially organized by their distance, forming different non-overlapping k-ring neighborhoods for each distance. Within each k-ring structure, nodes are further categorized into different groups according to their types, thus emphasizing the heterogeneity of both distances and types in HINs naturally. Based on this structure, we propose a novel Hierarchical Heterogeneous Graph Transformer (HHGT) model, which seamlessly integrates a Type-level Transformer for aggregating nodes of different types within each k-ring neighborhood, followed by a Ring-level Transformer for aggregating different k-ring neighborhoods in a hierarchical manner. Extensive experiments are conducted on downstream tasks to verify HHGT's superiority over 14 baselines, with a notable improvement of up to 24.75% in NMI and 29.25% in ARI for node clustering task on the ACM dataset compared to the best baseline.
Published: 2024

9. A blazar in the epoch of reionization

Author: Banados, Eduardo, Momjian, Emmanuel, Connor, Thomas, Belladitta, Silvia, Decarli, Roberto, Mazzucchelli, Chiara, Venemans, Bram P., Walter, Fabian, Wang, Feige, Xie, Zhang-Liang, Barth, Aaron J., Eilers, Anna-Christina, Fan, Xiaohui, Khusanova, Yana, Schindler, Jan-Torge, Stern, Daniel, Yang, Jinyi, Andika, Irham Taufik, Carilli, Chris, Farina, Emanuele P., Fabian, Andrew, Hennawi, Joseph F., Pensabene, Antonio, and Rojas-Ruiz, Sofia
Subjects: Astrophysics - Astrophysics of Galaxies, Astrophysics - High Energy Astrophysical Phenomena
Abstract: Relativistic jets are thought to play a crucial role in the formation of massive galaxies and supermassive black holes. Here we report multi-wavelength and multi-epoch observations of the quasar VLASSJ0410-0139 at redshift z=7, powered by a 7e8 solar-mass black hole. Its radio variability, X-ray properties, and compact radio emission on parsec scales reveal that J0410-0139 is a blazar with a relativistic jet aligned with our line of sight. This blazar's existence implies that many more similar (unaligned) jetted sources must exist at z=7. One scenario is that we observe an intrinsically low-power radio jet, but we see it at high luminosity due to relativistic beaming effects. In this case, a large fraction (>80%) of the UV bright quasars must have a similar jet to match the number density expected from the UV quasar luminosity function. These jets can enhance the growth of supermassive black holes and substantially affect their host galaxies. However, the implications would be even more severe if the quasar belongs to the top 10% radio luminous quasars, as measured if the beaming enhancement is less than a factor of 10-15. In this scenario, there should be hundreds to thousands of radio-quiet quasars at z=7 with intrinsic properties similar to J0410-0139 -- in strong tension with the number density of bright quasars derived from their UV luminosity function. To reconcile these results, most black hole growth at z=7 must happen in an obscured phase, as some models predict. The existence of supermassive black holes in the epoch of reionization is facilitated by significant jet-enhanced or obscured super-Eddington accretion., Comment: Submitted
Published: 2024

10. A timing view of the additional high-energy spectral component discovered in the black hole candidate Swift J1727.8-1613

Author: Yang, Zi-Xu, Zhang, Liang, Zhang, Shuang-Nan, Tao, L., Zhang, Shu, Ma, Ruican, Bu, Qingcui, Huang, Yue, Liu, He-Xin, Yu, Wei, Xiao, Guang C., Wang, Peng-Ju, Feng, Hua, Song, Li-Ming, Ma, Xiang, Ge, Mingyu, Zhao, QingChang, and Qu, J. L.
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: We present an energy-dependent analysis for the type-C quasi-periodic oscillations (QPOs) observed in the black hole X-ray binary Swift J1727.8-1613 using Insight-HXMT observations. We find that the QPO fractional rms at energies above 40 keV is significantly higher than that below 20 keV. This is the first report of a high energy (HE)-rms excess in the rms spectrum of a black hole X-ray binary. In the high energy band, an extra hard component is observed in addition to the standard thermal Comptonization component in a similar energy band. The value of the QPO HE-rms excess is not only correlated with the disk parameters and the photon index of the standard Comptonization component, but also exhibits a moderate positive correlation with the flux of the additional hard spectral component. No features in the QPO phase-lag spectra are seen corresponding to the additional hard component. We propose that the additional hard component in the spectrum may originate from jet emission and the associated QPO HE-rms excess can be explained by the precession of the jet base.
Published: 2024

11. DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition

Author: Wang, Qi, Xu, Zhou, Lin, Yuming, Ye, Jingtao, Li, Hongsheng, Zhu, Guangming, Shah, Syed Afaq Ali, Bennamoun, Mohammed, and Zhang, Liang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Neuromorphic sensors, specifically event cameras, revolutionize visual data acquisition by capturing pixel intensity changes with exceptional dynamic range, minimal latency, and energy efficiency, setting them apart from conventional frame-based cameras. The distinctive capabilities of event cameras have ignited significant interest in the domain of event-based action recognition, recognizing their vast potential for advancement. However, the development in this field is currently slowed by the lack of comprehensive, large-scale datasets, which are critical for developing robust recognition frameworks. To bridge this gap, we introduce DailyDVS-200, a meticulously curated benchmark dataset tailored for the event-based action recognition community. DailyDVS-200 is extensive, covering 200 action categories across real-world scenarios, recorded by 47 participants, and comprises more than 22,000 event sequences. This dataset is designed to reflect a broad spectrum of action types, scene complexities, and data acquisition diversity. Each sequence in the dataset is annotated with 14 attributes, ensuring a detailed characterization of the recorded actions. Moreover, DailyDVS-200 is structured to facilitate a wide range of research paths, offering a solid foundation for both validating existing approaches and inspiring novel methodologies. By setting a new benchmark in the field, we challenge the current limitations of neuromorphic data processing and invite a surge of new approaches in event-based action recognition techniques, which paves the way for future explorations in neuromorphic computing and beyond. The dataset and source code are available at https://github.com/QiWang233/DailyDVS-200., Comment: Accepted to ECCV 2024
Published: 2024

12. Scaling Data-Driven Building Energy Modelling using Large Language Models

Author: Khadka, Sunil and Zhang, Liang
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence
Abstract: Building Management System (BMS) through a data-driven method always faces data and model scalability issues. We propose a methodology to tackle the scalability challenges associated with the development of data-driven models for BMS by using Large Language Models (LLMs). LLMs' code generation adaptability can enable broader adoption of BMS by "automating the automation," particularly the data handling and data-driven modeling processes. In this paper, we use LLMs to generate code that processes structured data from BMS and build data-driven models for BMS's specific requirements. This eliminates the need for manual data and model development, reducing the time, effort, and cost associated with this process. Our hypothesis is that LLMs can incorporate domain knowledge about data science and BMS into data processing and modeling, ensuring that the data-driven modeling is automated for specific requirements of different building types and control objectives, which also improves accuracy and scalability. We generate a prompt template following the framework of Machine Learning Operations so that the prompts are designed to systematically generate Python code for data-driven modeling. Our case study indicates that bi-sequential prompting under the prompt template can achieve a high success rate of code generation and code accuracy, and significantly reduce human labor costs.
Published: 2024

13. Towards Better Graph-based Cross-document Relation Extraction via Non-bridge Entity Enhancement and Prediction Debiasing

Author: Yue, Hao, Lai, Shaopeng, Yang, Chengyi, Zhang, Liang, Yao, Junfeng, and Su, Jinsong
Subjects: Computer Science - Computation and Language
Abstract: Cross-document Relation Extraction aims to predict the relation between target entities located in different documents. In this regard, the dominant models commonly retain useful information for relation prediction via bridge entities, which allows the model to elaborately capture the intrinsic interdependence between target entities. However, these studies ignore the non-bridge entities, each of which co-occurs with only one target entity and offers the semantic association between target entities for relation prediction. Besides, the commonly used dataset, CodRED, contains substantial NA instances, leading to prediction bias during inference. To address these issues, in this paper, we propose a novel graph-based cross-document RE model with non-bridge entity enhancement and prediction debiasing. Specifically, we use a unified entity graph to integrate numerous non-bridge entities with target entities and bridge entities, modeling various associations between them, and then use a graph recurrent network to encode this graph. Finally, we introduce a novel debiasing strategy to calibrate the original prediction distribution. Experimental results on the closed and open settings show that our model significantly outperforms all baselines, including GPT-3.5-turbo and InstructUIE, achieving state-of-the-art performance. Particularly, our model obtains 66.23% and 55.87% AUC points in the official leaderboard (https://codalab.lisn.upsaclay.fr/competitions/3770#results) under the two settings, respectively, ranking first among all submissions since December 2023. Our code is available at https://github.com/DeepLearnXMU/CoRE-NEPD., Comment: Accepted to ACL 2024 Findings
Published: 2024

14. Integrating Attentional Factors and Spacing in Logistic Knowledge Tracing Models to Explore the Impact of Training Sequences on Category Learning

Author: Cao, Meng, Pavlik Jr., Philip I., Chu, Wei, and Zhang, Liang
Subjects: Computer Science - Computers and Society, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In category learning, a growing body of literature has increasingly focused on exploring the impacts of interleaving in contrast to blocking. The sequential attention hypothesis posits that interleaving draws attention to the differences between categories while blocking directs attention toward similarities within categories. Although a recent study underscores the joint influence of memory and attentional factors on sequencing effects, there remains a scarcity of effective computational models integrating both attentional and memory considerations to comprehensively understand the effect of training sequences on students' performance. This study introduces a novel integration of attentional factors and spacing into the logistic knowledge tracing (LKT) models to monitor students' performance across different training sequences (interleaving and blocking). Attentional factors were incorporated by recording the counts of comparisons between adjacent trials, considering whether they belong to the same or different category. Several features were employed to account for temporal spacing. We used cross-validations to test the model fit and predictions on the learning session and posttest. Our findings reveal that incorporating both attentional factors and spacing features in the Additive Factors Model (AFM) significantly enhances its capacity to capture the effects of interleaving and blocking and demonstrates superior predictive accuracy for students' learning outcomes. By bridging the gap between attentional factors and memory processes, our computational approach offers a more comprehensive framework for understanding and predicting category learning outcomes in educational settings., Comment: 7 pages, 3 figures, Educational Data Mining 2024
Published: 2024

15. SPL: A Socratic Playground for Learning Powered by Large Language Model

Author: Zhang, Liang, Lin, Jionghao, Kuang, Ziyi, Xu, Sheng, Yeasin, Mohammed, and Hu, Xiangen
Subjects: Computer Science - Artificial Intelligence
Abstract: Dialogue-based Intelligent Tutoring Systems (ITSs) have significantly advanced adaptive and personalized learning by automating sophisticated human tutoring strategies within interactive dialogues. However, replicating the nuanced patterns of expert human communication remains a challenge in Natural Language Processing (NLP). Recent advancements in NLP, particularly Large Language Models (LLMs) such as OpenAI's GPT-4, offer promising solutions by providing human-like and context-aware responses based on extensive pre-trained knowledge. Motivated by the effectiveness of LLMs in various educational tasks (e.g., content creation and summarization, problem-solving, and automated feedback provision), our study introduces the Socratic Playground for Learning (SPL), a dialogue-based ITS powered by the GPT-4 model, which employs the Socratic teaching method to foster critical thinking among learners. Through extensive prompt engineering, SPL can generate specific learning scenarios and facilitate efficient multi-turn tutoring dialogues. The SPL system aims to enhance personalized and adaptive learning experiences tailored to individual needs, specifically focusing on improving critical thinking skills. Our pilot experimental results from essay writing tasks demonstrate that SPL has the potential to improve tutoring interactions and further enhance dialogue-based ITS functionalities. Our study, exemplified by SPL, demonstrates how LLMs enhance dialogue-based ITSs and expand the accessibility and efficacy of educational technologies.
Published: 2024

16. The Broadband X-ray Spectral Properties during the Rising Phases of the Outburst of the New Black Hole X-ray Binary Candidate Swift J1727.8-1613

Author: Liu, He-Xin, Xu, Yan-Jun, Zhang, Shuang-Nan, Yu, Wei, Huang, Yue, Tao, Lian, Zhang, Liang, Yang, Zi-Xu, Zhao, Qing-Chang, Qu, Jin-Lu, and Song, Li-Ming
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: We report data analysis results about the outburst evolution and spectral properties during the hard state of the recently discovered X-ray transient Swift J1727.8-1613 as observed by Insight-HXMT and NuSTAR. We find that the broadband X-ray spectrum of Swift J1727.8-1613 is more complex than the most typical spectral patterns of black hole X-ray binary systems, with not only a comparatively weaker reflection component but also an additional spectral continuum component, manifesting itself as a hard X-ray tail beyond the thermal Comptonization description detectable below 100 keV. This additional component can be phenomenologically well fitted by adding an extra power-law model with high energy exponential cutoff in the 2-120 keV energy band. We made an attempt to explain the broadband X-ray spectral continuum with a thermal/non-thermal hybrid plasma corona scenario, and find an ultra high compactness parameter ($l_{\rm s}\sim2000$) and a steep non-thermal electron distribution ($\Gamma_{\rm inj}>4$), suggesting the source was accreting with high Eddington rates and that the electron acceleration mechanism is not very efficient. We also present a detailed multi-epoch analysis of spectral properties using Insight-HXMT data to investigate the evolution of the key physical properties regarding the disk and corona during the hard states. No significant variation is found with the inner disk radius and the coronal temperature during this time period, and the weak reflection and hard X-ray tail features are persistent. We discuss the physical implications of our spectral analysis results in the context of disk-corona relation, particle acceleration, and jet contribution, during the rise of a black hole X-ray binary in outburst., Comment: 16 pages, 6 figures
Published: 2024

17. X-ray and Radio campaign of the Z-source GX 340+0: discovery of X-ray polarization and its implications

Author: Bhargava, Yash, Ng, Mason, Zhang, Liang, Balasubramanian, Arvind, Russell, Thomas D., Kaushik, Aman, Jadoliya, Vishal, Ravi, Swati, Bhattacharyya, Sudip, Pahari, Mayukh, Homan, Jeroen, Marshall, Herman L., Chakrabarty, Deepto, and Carotenuto, Francesco
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: We present the discovery of X-ray polarization from the neutron star low-mass X-ray binary and Z-source, GX 340+0, using an Imaging X-ray Polarimetry Explorer (IXPE) observation in March 2024. Along with the IXPE observation, we conducted an extensive X-ray and radio monitoring campaign to ascertain the source properties during and around the IXPE observation. The source was within the horizontal branch throughout the multiwavelength campaign. We measured a significant X-ray polarization in 2--8 keV with polarization degree (PD) = $4.02 \pm 0.35$% and polarization angle (PA) = $37.6 \pm 2.5^\circ$. The energy-dependent polarization indicates that in the 2--2.5 keV energy range, the PA is much lower, $\sim9\pm8^\circ$, while other energy bands are consistent with the PA found over 2.5--8 keV. The simultaneous AstroSat-IXPE spectro-polarimetric observations provide some evidence for independent polarization from various spectral components, hinting at a disparity in the PA from the accretion disk and the Comptonized emission, while suggesting an unpolarized emission from the blackbody component. Radio observations in the 0.7--9 GHz frequency range reveal a non-detection of radio emission in 0.7--1.5 GHz and a significant detection in 5.5--9 GHz, suggesting the presence of a spectral break in 1.5--5.5 GHz. Using ATCA observation we place upper limits on the radio polarization at $<$6% on the linear polarization and $<$4% on the circular polarization at 3$\sigma$ level. We discuss the origin of the X-ray polarization and its implications on the geometry of the spectral components., Comment: Submitted in ApJL, 4 figures, 3 tables
Published: 2024
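
Note on the quantities quoted in the abstract above: the polarization degree (PD) and polarization angle (PA) of linearly polarized light are conventionally derived from the Stokes parameters I, Q and U. The following is a minimal sketch of that textbook conversion only; the function name and the sample values are illustrative assumptions and are not taken from the paper or from any IXPE analysis software.

```python
import math


def polarization_from_stokes(i, q, u):
    """Return (PD in percent, PA in degrees) from Stokes I, Q, U.

    Textbook definitions for linear polarization:
        PD = sqrt(Q^2 + U^2) / I
        PA = 0.5 * atan2(U, Q)
    """
    if i <= 0:
        raise ValueError("Stokes I must be positive")
    pd_percent = 100.0 * math.hypot(q, u) / i
    pa_degrees = 0.5 * math.degrees(math.atan2(u, q))
    return pd_percent, pa_degrees


# Illustrative values only (not measurements from the paper).
pd_percent, pa_degrees = polarization_from_stokes(1.0, 0.0, 0.04)
print(f"PD = {pd_percent:.2f}%, PA = {pa_degrees:.1f} deg")
```

This sketch ignores measurement uncertainties and the bias correction that low signal-to-noise polarization estimates generally require.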

18. A Hessian-Aware Stochastic Differential Equation for Modelling SGD

Author: Li, Xiang, Shen, Zebang, Zhang, Liang, and He, Niao
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: Continuous-time approximation of Stochastic Gradient Descent (SGD) is a crucial tool to study its escaping behaviors from stationary points. However, existing stochastic differential equation (SDE) models fail to fully capture these behaviors, even for simple quadratic objectives. Built on a novel stochastic backward error analysis framework, we derive the Hessian-Aware Stochastic Modified Equation (HA-SME), an SDE that incorporates Hessian information of the objective function into both its drift and diffusion terms. Our analysis shows that HA-SME matches the order-best approximation error guarantee among existing SDE models in the literature, while achieving a significantly reduced dependence on the smoothness parameter of the objective. Further, for quadratic objectives, under mild conditions, HA-SME is proved to be the first SDE model that recovers exactly the SGD dynamics in the distributional sense. Consequently, when the local landscape near a stationary point can be approximated by quadratics, HA-SME is expected to accurately predict the local escaping behaviors of SGD.
Published: 2024

19. Modeling for Non-exponential Production Systems Using Parts Flow Data: Model Parameter Estimation and Performance Analysis

Author: Sun, Yuting and Zhang, Liang
Subjects: Electrical Engineering and Systems Science - Systems and Control
Abstract: Mathematical modeling of production systems is the foundation of all model-based approaches for production system analysis, design, improvement, and control. To construct such a model for the stochastic process of the production system more efficiently, a new modeling approach has been proposed that reversely identifies the model parameters using system performance metrics (e.g., production rate, work-in-process, etc.) derived from the parts flow data. This paper extends this performance metrics-based modeling approach to non-exponential serial production lines. Since no analytical expressions of performance metrics are available for non-exponential systems, we use neural network surrogate models to calculate those performance metrics as functions in terms of the system parameters. Then, based on the surrogate models and given performance metrics, the machine parameters are estimated by solving a constrained optimization problem that minimizes the mean square error of the performance metrics resulting from the estimated parameters compared to the true ones. With the designed multi-start particle swarm optimization algorithm, we find that multiple non-unique combinations of machine parameters can lead to practically the same system performance metrics and a linear relationship of the reliability parameters from these obtained estimations is observed. Besides, model sensitivity analysis is implemented to verify the robustness of the different combinations of machine parameters even under the potential improvement scenarios., Comment: 31 pages
Published: 2024

20. Anole: Adapting Diverse Compressed Models For Cross-Scene Prediction On Mobile Devices

Author: Li, Yunzhe, Zhu, Hongzi, Deng, Zhuohong, Cheng, Yunlong, Zhang, Liang, Chang, Shan, and Guo, Minyi
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Emerging Artificial Intelligence of Things (AIoT) applications desire online prediction using deep neural network (DNN) models on mobile devices. However, due to the movement of devices, unfamiliar test samples constantly appear, significantly affecting the prediction accuracy of a pre-trained DNN. In addition, unstable network connection calls for local model inference. In this paper, we propose a light-weight scheme, called Anole, to cope with the local DNN model inference on mobile devices. The core idea of Anole is to first establish an army of compact DNN models, and then adaptively select the model fitting the current test sample best for online inference. The key is to automatically identify model-friendly scenes for training scene-specific DNN models. To this end, we design a weakly-supervised scene representation learning algorithm by combining both human heuristics and feature similarity in separating scenes. Moreover, we further train a model classifier to predict the best-fit scene-specific DNN model for each test sample. We implement Anole on different types of mobile devices and conduct extensive trace-driven and real-world experiments based on unmanned aerial vehicles (UAVs). The results demonstrate that Anole outwits the method of using a versatile large DNN in terms of prediction accuracy (4.5% higher), response time (33.1% faster) and power consumption (45.1% lower).
Published: 2024

21. Polarization Perspectives on Hercules X-1: Further Constraining the Geometry

Author: Zhao, Qingchang, Li, Hancheng, Tao, Lian, Feng, Hua, Zhang, Shuangnan, Walter, Roland, Ge, Mingyu, Tong, Hao, Ji, Long, Zhang, Liang, Qu, Jinlu, Huang, Yue, Ma, Xiang, Zhang, Shu, Yin, Qianqing, Yin, Hongxing, Ma, Ruican, Zhao, Shujie, Li, Panping, Yang, Zixu, Liu, Hexin, Yu, Wei, Huang, Yiming, Li, Zexi, Li, Yajun, Xiao, Jingyu, and Zhao, Kang
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: We conduct a comprehensive analysis of the accreting X-ray pulsar, Hercules X-1, utilizing data from IXPE and NuSTAR. IXPE performed five observations of Her X-1, consisting of three in the Main-on state and two in the Short-on state. Our time-resolved analysis uncovers the linear correlations between the flux and polarization degree as well as the pulse fraction and polarization degree. Geometry parameters are rigorously constrained by fitting the phase-resolved modulations of Cyclotron Resonance Scattering Feature and polarization angle with a simple dipole model and Rotating Vector Model respectively, yielding roughly consistent results. The changes of $\chi_{\rm p}$ (the position angle of the pulsar's spin axis on the plane of the sky) between different Main-on observations suggest the possible forced precession of the neutron star crust. Furthermore, a linear association between the energy of Cyclotron Resonance Scattering Feature and polarization angle implies the prevalence of a dominant dipole magnetic field, and their phase-resolved modulations likely arise from viewing angle effects., Comment: Accepted for MNRAS
Published: 2024

22. TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning

Author: Zhang, Liang, Hu, Anwen, Xu, Haiyang, Yan, Ming, Xu, Yichen, Jin, Qin, Zhang, Ji, and Huang, Fei
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Charts are important for presenting and explaining complex data relationships. Recently, multimodal large language models (MLLMs) have shown remarkable capabilities in various chart understanding tasks. However, the sheer size of these models in terms of parameters and computational requirements limits their use in resource-constrained environments. In this paper, we present TinyChart, an efficient MLLM for chart understanding with only 3B parameters. TinyChart overcomes two key challenges in efficient chart understanding: (1) reducing the burden of learning numerical computations through a Program-of-Thoughts (PoT) learning strategy, which trains the model to generate Python programs for numerical calculations, and (2) reducing lengthy vision feature sequences produced by the vision transformer for high-resolution images through a Vision Token Merging module, which gradually merges the most similar vision tokens. Extensive experiments demonstrate that our 3B TinyChart achieves SOTA performance on a variety of chart understanding benchmarks including ChartQA, Chart-to-Text, Chart-to-Table, OpenCQA, and ChartX. It outperforms several chart understanding MLLMs with up to 13B parameters, such as ChartLlama and ChartAst, and the closed-source general-purpose MLLM GPT-4V on ChartQA. It also demonstrates superior efficiency, with higher throughput during inference due to a smaller model scale and more efficient vision encoding. Our code and model are available at https://github.com/X-PLUG/mPLUG-DocOwl/tree/main/TinyChart., Comment: 13 pages, 11 figures
Published: 2024

23. New Timing Results of MSPs from NICER Observations

Author: Zheng, Shijie, Han, Dawei, Xu, Heng, Lee, Kejia, Yuan, Jianping, Wang, Haoxi, Ge, Mingyu, Zhang, Liang, Li, Yongye, Yin, Yitao, Ma, Xiang, Chen, Yong, and Zhang, Shuangnan
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: Millisecond pulsars (MSPs) are known for their long-term stability. Using six years of observations from the Neutron Star Interior Composition Explorer (NICER), we have conducted an in-depth analysis of the X-ray timing results for six MSPs: PSRs B1937+21, B1821$-$24, J0437$-$4715, J0030+0451, J0218+4232, and J2124$-$3358. The timing stability parameter $\sigma_z$ has been calculated, revealing remarkable timing precision on the order of $10^{-14}$ for PSRs B1937+21 and J0437$-$4715, and $10^{-13}$ for PSRs B1821$-$24, J0218+4232, and J0030+0451 over a timescale of 1000 days. These findings underscore the feasibility of autonomous in-orbit timekeeping using X-ray observations of MSPs. In addition, the consistency of long-term spin-down noise in the X-ray and radio bands has been investigated by comparison with IPTA radio data.
Published: 2024

24. Directional intense terahertz radiation driven by abruptly autofocusing lasers in air

Author: Zheng, Xiao-Ran, Li, Nan, Wang, Wei-Min, Zhang, Rui, Zhang, Cun-Lin, and Zhang, Liang-Liang
Subjects: Physics - Optics, Physics - Plasma Physics
Abstract: Two-color laser-induced plasma filamentation in air could serve as a tabletop source of broadband terahertz (THz) pulses. The ubiquity of air on Earth facilitates its widespread use, particularly in wireless communication and remote sensing, exploiting the unique advantage of the air-based scheme that the THz sources can be delivered over standoff distances via the pump laser propagation in air. However, the THz emission pattern inevitably has a conical angular profile with a dip on the propagation axis; therefore, THz energy concentration, propagation directionality, and accuracy of signal demodulation are significantly impaired, greatly limiting its direct applications. Here, we experimentally eliminate the unfavorable conical profile and meanwhile enhance THz directionality and intensity 17-fold by using abruptly-autofocusing laser beams. Our theory and simulations show that these observations are attributed to efficient suppression of the dephasing effect appearing in previous investigations with ordinary Gaussian laser beams. This scheme is easily accessible since the abruptly-autofocusing beam can be achieved by imposing a spatial light modulator on the input Gaussian beam. This study solves a long-standing problem in the two-color laser scheme and clarifies the underlying physics of the conical angular profile formation., Comment: 22 pages, 4 figures, 1 table
Published: 2024

25. climber++: Pivot-Based Approximate Similarity Search over Big Data Series

Author: Zhang, Liang, Eltabakh, Mohamed Y., Rundensteiner, Elke A., and Alnuaim, Khalid
Subjects: Computer Science - Databases
Abstract: The generation and collection of big data series are becoming an integral part of many emerging applications in sciences, IoT, finance, and web applications among several others. The terabyte-scale of data series has motivated recent efforts to design fully distributed techniques for supporting operations such as approximate kNN similarity search, which is a building block operation in most analytics services on data series. Unfortunately, these techniques are heavily geared towards achieving scalability at the cost of sacrificing the results' accuracy. State-of-the-art systems report accuracy below 10% and 40%, respectively, which is not practical for many real-world applications. In this paper, we investigate the root problems in these existing techniques that limit their ability to achieve a better trade-off between scalability and accuracy. Then, we propose a framework, called CLIMBER, that encompasses a novel feature extraction mechanism, indexing scheme, and query processing algorithms for supporting approximate similarity search in big data series. For CLIMBER, we propose a new loss-resistant dual representation composed of rank-sensitive and ranking-insensitive signatures capturing data series objects. Based on this representation, we devise a distributed two-level index structure supported by an efficient data partitioning scheme. Our similarity metrics tailored for this dual representation enable meaningful comparison and distance evaluation between the rank-sensitive and ranking-insensitive signatures. Finally, we propose two efficient query processing algorithms, CLIMBER-kNN and CLIMBER-kNN-Adaptive, for answering approximate kNN similarity queries. Our experimental study on real-world and benchmark datasets demonstrates that CLIMBER, unlike existing techniques, features results' accuracy above 80% while retaining the desired scalability to terabytes of data., Comment: 16 pages, 14 figures, 1 table
Published: 2024

26. Solve arbitrary one-loop reduction with generating function

Author: Li, Tingfei, Song, Yuekai, and Zhang, Liang
Subjects: High Energy Physics - Phenomenology
Abstract: Recently, the concept of generating function has been employed in one-loop reduction. For one-loop integrals encompassing arbitrary tensor ranks and higher-pole contributions, the generating function can be decomposed into a tensor part and a higher-pole part. While the tensor component has been thoroughly addressed in recent studies, there remains a lack of satisfactory investigations regarding the higher-pole part. In this work, we completely solve the problem. We first establish the partial differential equations governing the higher-pole generating function. Based on these equations, we derive an integration recursion relation and solve it iteratively. This approach enables us to explore the analytical structure of higher-pole reduction and provides a valuable tool for generating reduction coefficients efficiently., Comment: 30 pages
Published: 2024

27. Beam test of a baseline vertex detector prototype for CEPC

Author: Li, Shuqi, Wu, Tianya, Huang, Xinhui, Zhou, Jia, Yan, Ziyue, Wang, Wei, Zeng, Hao, Hu, Yiming, Zhang, Xiaoxu, Liang, Zhijun, Wei, Wei, Zhang, Ying, Wei, Xiaomin, Zhang, Lei, Qi, Ming, Hu, Jun, Fu, Jinyu, Zhang, Hongyu, Li, Gang, Wu, Linghui, Dong, Mingyi, Li, Xiaoting, Casanova, Raimon, Zhang, Liang, Dong, Jianing, Wang, Jia, Zheng, Ran, Lu, Weiguo, Grinstein, Sebastian, and da Costa, João Guimarães
Subjects: Physics - Instrumentation and Detectors, High Energy Physics - Experiment
Abstract: The Circular Electron Positron Collider (CEPC) has been proposed to enable more thorough and precise measurements of the properties of Higgs, W, and Z bosons, as well as to search for new physics. In response to the stringent performance requirements of the vertex detector for the CEPC, a baseline vertex detector prototype was tested and characterized for the first time using a 6 GeV electron beam at DESY II Test Beam Line 21. The baseline vertex detector prototype is designed with a cylindrical barrel structure that contains six double-sided detector modules (ladders). Each side of the ladder includes TaichuPix-3 sensors based on Monolithic Active Pixel Sensor (MAPS) technology, a flexible printed circuit, and a carbon fiber support structure. Additionally, the readout electronics and the Data Acquisition system were also examined during this beam test. The performance of the prototype was evaluated using an electron beam that passed through six ladders in a perpendicular direction. The offline data analysis indicates a spatial resolution of about 5 μm, with detection efficiency exceeding 99% and an impact parameter resolution of about 5.1 μm. These promising results from this baseline vertex detector prototype mark a significant step toward realizing the optimal vertex detector for the CEPC.
Published: 2024

28. Language Model Guided Interpretable Video Action Reasoning

Author: Wang, Ning, Zhu, Guangming, Li, HS, Zhang, Liang, Shah, Syed Afaq Ali, and Bennamoun, Mohammed
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: While neural networks have excelled in video action recognition tasks, their black-box nature often obscures the understanding of their decision-making processes. Recent approaches used inherently interpretable models to analyze video actions in a manner akin to human reasoning. These models, however, usually fall short in performance compared to their black-box counterparts. In this work, we present a new framework named Language-guided Interpretable Action Recognition framework (LaIAR). LaIAR leverages knowledge from language models to enhance both the recognition capabilities and the interpretability of video models. In essence, we redefine the problem of understanding video model decisions as a task of aligning video and language models. Using the logical reasoning captured by the language model, we steer the training of the video model. This integrated approach not only improves the video model's adaptability to different domains but also boosts its overall performance. Extensive experiments on two complex video action datasets, Charades and CAD-120, validate the improved performance and interpretability of our LaIAR framework. The code of LaIAR is available at https://github.com/NingWang2049/LaIAR., Comment: Accepted by CVPR 2024
Published: 2024

29. The paradigm of tax-reward and tax-punishment strategies in the advancement of public resource management dynamics

Author: Wang, Lichen, Liu, Yuyuan, Guo, Ruqiang, Zhang, Liang, Liu, Linjie, and Hua, Shijia
Subjects: Mathematics - Dynamical Systems
Abstract: In contemporary society, the effective utilization of public resources remains a subject of significant concern. A common issue arises from defectors seeking to obtain an excessive share of these resources for personal gain, potentially leading to resource depletion. To mitigate this tragedy and ensure sustainable development of resources, implementing mechanisms to either reward those who adhere to distribution rules or penalize those who do not appears advantageous. We introduce two models, a tax-reward model and a tax-punishment model, to address this issue. Our analysis reveals that in the tax-reward model, the evolutionary trajectory of the system is influenced not only by the tax revenue collected but also by the natural growth rate of the resources. Conversely, the tax-punishment model exhibits distinct characteristics when compared to the tax-reward model, notably the potential for bistability. In such scenarios, the selection of initial conditions is critical, as it can determine the system's path. Furthermore, our study identifies instances where the system lacks stable points, exemplified by a limit cycle phenomenon, underscoring the complexity and dynamism inherent in managing public resources using these models., Comment: Accepted by Proceedings of the Royal Society B-Biological Sciences
Published: 2024

30. Breaking the Length Barrier: LLM-Enhanced CTR Prediction in Long Textual User Behaviors

Author: Geng, Binzong, Huan, Zhaoxin, Zhang, Xiaolu, He, Yong, Zhang, Liang, Yuan, Fajie, Zhou, Jun, and Mo, Linjian
Subjects: Computer Science - Information Retrieval, Computer Science - Artificial Intelligence
Abstract: With the rise of large language models (LLMs), recent works have leveraged LLMs to improve the performance of click-through rate (CTR) prediction. However, we argue that a critical obstacle remains in deploying LLMs for practical use: the efficiency of LLMs when processing long textual user behaviors. As user sequences grow longer, the current efficiency of LLMs is inadequate for training on billions of users and items. To break through the efficiency barrier of LLMs, we propose Behavior Aggregated Hierarchical Encoding (BAHE) to enhance the efficiency of LLM-based CTR modeling. Specifically, BAHE proposes a novel hierarchical architecture that decouples the encoding of user behaviors from inter-behavior interactions. Firstly, to prevent computational redundancy from repeated encoding of identical user behaviors, BAHE employs the LLM's pre-trained shallow layers to extract embeddings of the most granular, atomic user behaviors from extensive user sequences and stores them in the offline database. Subsequently, the deeper, trainable layers of the LLM facilitate intricate inter-behavior interactions, thereby generating comprehensive user embeddings. This separation allows the learning of high-level user representations to be independent of low-level behavior encoding, significantly reducing computational complexity. Finally, these refined user embeddings, in conjunction with correspondingly processed item embeddings, are incorporated into the CTR model to compute the CTR scores. Extensive experimental results show that BAHE reduces training time and memory by five times for CTR models using LLMs, especially with longer user sequences. BAHE has been deployed in a real-world system, allowing for daily updates of 50 million CTR data on 8 A100 GPUs, making LLMs practical for industrial CTR prediction., Comment: Accepted by the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2024
Published: 2024
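
Note on the abstract above: the idea of encoding each atomic user behavior once and reusing the stored embedding can be illustrated with a small cache. The following is a minimal sketch under stated assumptions, not the authors' implementation: the in-memory dictionary stands in for the offline database, `toy_encoder` is a hypothetical placeholder for the LLM's pre-trained shallow layers, and the behavior strings are invented for illustration.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def encode_with_cache(
    user_sequences: Sequence[Sequence[str]],
    encode: Callable[[str], Vector],
    cache: Dict[str, Vector],
) -> List[List[Vector]]:
    """Encode every distinct behavior text once and reuse it elsewhere."""
    encoded = []
    for behaviors in user_sequences:
        row = []
        for text in behaviors:
            if text not in cache:
                # Only behaviors not seen before reach the (expensive) encoder.
                cache[text] = encode(text)
            row.append(cache[text])
        encoded.append(row)
    return encoded


# Hypothetical stand-in for an expensive encoder; records each call it receives.
encoder_calls: List[str] = []


def toy_encoder(text: str) -> Vector:
    encoder_calls.append(text)
    return [float(len(text))]


sequences = [["click:shoes", "buy:shoes"], ["click:shoes"]]
embedding_cache: Dict[str, Vector] = {}
embeddings = encode_with_cache(sequences, toy_encoder, embedding_cache)
print(len(encoder_calls))  # number of encoder calls for three behavior occurrences
```

In this toy run the encoder is called once per distinct behavior rather than once per occurrence, which is the redundancy the abstract describes removing; the trainable deeper layers that model inter-behavior interactions are outside the scope of this sketch.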

31. Recovery of High-energy Low-frequency Quasi-periodic Oscillations from Black Hole X-ray Binary MAXI J1535-571 with a Hilbert-Huang Transform Method

Author: Shui, Qingcang, Zhang, Shu, Zhang, Shuangnan, Chen, Yupeng, Kong, Lingda, Peng, Jingqiang, Ji, Long, Wang, Pengju, Chang, Zhi, Yu, Zhuoli, Yin, Hongxing, Qu, Jinlu, Tao, Lian, Ge, Mingyu, Ma, Xiang, Zhang, Liang, Yu, Wei, and Li, Jian
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: We propose a method based on the Hilbert-Huang transform (HHT) to recover the high-energy waveform of low-frequency quasi-periodic oscillations (LFQPOs). Based on the method, we successfully obtain the modulation of the phase-folded light curve above 170 keV using the QPO phase reconstructed at lower energies in MAXI J1535-571 with Insight-HXMT observations. A comprehensive simulation study is conducted to demonstrate that such modulation indeed originates from the QPO. Thus the highest energies turn out to significantly exceed the upper limit of ~100 keV for QPOs reported previously using the Fourier method, marking the first opportunity to study QPO properties above 100 keV in this source. Detailed analyses of these high-energy QPO profiles reveal different QPO properties between the 30-100 keV and 100-200 keV energy ranges: the phase lag remains relatively stable, and the amplitude slightly increases below ~100 keV, whereas above this threshold, soft phase lags and a decrease in amplitude are observed. Given the reports of a hard tail detection in broad spectroscopy, we propose that the newly discovered QPO properties above 100 keV are dominated by the hard tail component, possibly stemming from a relativistic jet. Our findings also indicate a strong correlation between the QPOs originating from the jet and corona, supporting the scenario of jet-corona coupling precession. We emphasize that our proposed HHT-based method can serve as an efficient way to expand the high-energy band for studying QPOs, thereby enhancing our understanding of their origin., Comment: 21 pages, 15 figures, accepted for publication in ApJL
Published: 2024

32. Timing analysis of the newly discovered black hole candidate Swift J1727.8-1613 with Insight-HXMT

Author: Yu, Wei, Bu, Qing-Cui, Zhang, Shuang-Nan, Liu, He-Xin, Zhang, Liang, Ducci, Lorenzo, Tao, Lian, Santangelo, Andrea, Doroshenko, Victor, Huang, Yue, Yang, Zi-Xu, and Qu, Jin-Lu
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: We present the results obtained from an X-ray timing study of the new black hole candidate (BHC) Swift J1727.8-1613. The work is based on Hard X-ray Modulation Telescope (Insight-HXMT) observations carried out during the 2023 outburst. Prominent type-C low-frequency Quasi-periodic Oscillations (LFQPOs) are detected throughout the observations. With the substantial effective area of the Insight-HXMT at high energies, we examine the energy dependence of various parameters, including the centroid frequency, fractional rms, and phase lags of the type-C QPOs. Our findings align closely with those observed in high-inclination systems. During the initial stage of the outburst, a peaked noise component is also detected, the frequency of which is highly correlated with the LFQPO frequency, aligning with the Psaltis-Belloni-van der Klis (PBK) relation. By assuming that the peaked noise originates from the precession of the accretion disc, the spin of this source can be constrained. Our results suggest that this source may possess a high spin.
Published: 2024

33. mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding

Author: Hu, Anwen, Xu, Haiyang, Ye, Jiabo, Yan, Ming, Zhang, Liang, Zhang, Bo, Li, Chen, Zhang, Ji, Jin, Qin, Huang, Fei, and Zhou, Jingren
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Structure information is critical for understanding the semantics of text-rich images, such as documents, tables, and charts. Existing Multimodal Large Language Models (MLLMs) for Visual Document Understanding are equipped with text recognition ability but lack general structure understanding abilities for text-rich document images. In this work, we emphasize the importance of structure information in Visual Document Understanding and propose the Unified Structure Learning to boost the performance of MLLMs. Our Unified Structure Learning comprises structure-aware parsing tasks and multi-grained text localization tasks across 5 domains: document, webpage, table, chart, and natural image. To better encode structure information, we design a simple and effective vision-to-text module H-Reducer, which can not only maintain the layout information but also reduce the length of visual features by merging horizontal adjacent patches through convolution, enabling the LLM to understand high-resolution images more efficiently. Furthermore, by constructing structure-aware text sequences and multi-grained pairs of texts and bounding boxes for publicly available text-rich images, we build a comprehensive training set DocStruct4M to support structure learning. Finally, we construct a small but high-quality reasoning tuning dataset DocReason25K to trigger the detailed explanation ability in the document domain. Our model DocOwl 1.5 achieves state-of-the-art performance on 10 visual document understanding benchmarks, improving the SOTA performance of MLLMs with a 7B LLM by more than 10 points in 5/10 benchmarks. Our codes, models, and datasets are publicly available at https://github.com/X-PLUG/mPLUG-DocOwl/tree/main/DocOwl1.5., Comment: 21 pages, 15 figures
Published: 2024

34. Primal Methods for Variational Inequality Problems with Functional Constraints

Author: Zhang, Liang, He, Niao, and Muehlebach, Michael
Subjects: Mathematics - Optimization and Control, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Constrained variational inequality problems are recognized for their broad applications across various fields including machine learning and operations research. First-order methods have emerged as the standard approach for solving these problems due to their simplicity and scalability. However, they typically rely on projection or linear minimization oracles to navigate the feasible set, which becomes computationally expensive in practical scenarios featuring multiple functional constraints. Existing efforts to tackle such functional constrained variational inequality problems have centered on primal-dual algorithms grounded in the Lagrangian function. These algorithms along with their theoretical analysis often require the existence and prior knowledge of the optimal Lagrange multipliers. In this work, we propose a simple primal method, termed Constrained Gradient Method (CGM), for addressing functional constrained variational inequality problems, without necessitating any information on the optimal Lagrange multipliers. We establish a non-asymptotic convergence analysis of the algorithm for variational inequality problems with monotone operators under smooth constraints. Remarkably, our algorithms match the complexity of projection-based methods in terms of operator queries for both monotone and strongly monotone settings, while utilizing significantly cheaper oracles based on quadratic programming. Furthermore, we provide several numerical examples to evaluate the efficacy of our algorithms.
Published: 2024

35. HRLAIF: Improvements in Helpfulness and Harmlessness in Open-domain Reinforcement Learning From AI Feedback

Author: Li, Ang, Xiao, Qiugen, Cao, Peng, Tang, Jian, Yuan, Yi, Zhao, Zijie, Chen, Xiaoyuan, Zhang, Liang, Li, Xiangyang, Yang, Kaitong, Guo, Weidong, Gan, Yukang, Yu, Xu, Wang, Daniell, and Shan, Ying
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Reinforcement Learning from AI Feedback (RLAIF) has the advantages of shorter annotation cycles and lower costs over Reinforcement Learning from Human Feedback (RLHF), making it highly efficient during the rapid strategy iteration periods of large language model (LLM) training. Using ChatGPT as a labeler to provide feedback on open-domain prompts in RLAIF training, we observe an increase in human evaluators' preference win ratio for model responses, but a decrease in evaluators' satisfaction rate. Analysis suggests that the decrease in satisfaction rate is mainly due to some responses becoming less helpful, particularly in terms of correctness and truthfulness, highlighting practical limitations of basic RLAIF. In this paper, we propose Hybrid Reinforcement Learning from AI Feedback (HRLAIF). This method enhances the accuracy of AI annotations for responses, making the model's helpfulness more robust in the training process. Additionally, it employs AI for Red Teaming, further improving the model's harmlessness. Human evaluation results show that HRLAIF inherits the ability of RLAIF to enhance human preference for outcomes at a low cost while also improving the satisfaction rate of responses. Compared to the policy model before Reinforcement Learning (RL), it achieves an increase of 2.08% in satisfaction rate, effectively addressing the issue of a decrease of 4.58% in satisfaction rate after basic RLAIF., Comment: 18 pages, 7 figures
Published: 2024

36. Predicting Learning Performance with Large Language Models: A Study in Adult Literacy

Author: Zhang, Liang, Lin, Jionghao, Borchers, Conrad, Sabatini, John, Hollander, John, Cao, Meng, and Hu, Xiangen
Subjects: Computer Science - Computers and Society, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Intelligent Tutoring Systems (ITSs) have significantly enhanced adult literacy training, a key factor for societal participation, employment opportunities, and lifelong learning. Our study investigates the application of advanced AI models, including Large Language Models (LLMs) like GPT-4, for predicting learning performance in adult literacy programs in ITSs. This research is motivated by the potential of LLMs to predict learning performance based on their inherent reasoning and computational capabilities. By using reading comprehension datasets from the ITS, AutoTutor, we evaluate the predictive capabilities of GPT-4 versus traditional machine learning methods in predicting learning performance through five-fold cross-validation techniques. Our findings show that GPT-4 presents predictive abilities competitive with traditional machine learning methods such as Bayesian Knowledge Tracing, Performance Factor Analysis, Sparse Factor Analysis Lite (SPARFA-Lite), tensor factorization and eXtreme Gradient Boosting (XGBoost). While XGBoost (trained on a local machine) outperforms GPT-4 in predictive accuracy, GPT-4-selected XGBoost and its subsequent tuning on the GPT-4 platform demonstrate superior performance compared to local machine execution. Moreover, our investigation into hyper-parameter tuning by GPT-4 versus grid-search suggests comparable performance, albeit with less stability in the automated approach, using XGBoost as the case study. Our study contributes to the field by highlighting the potential of integrating LLMs with traditional machine learning models to enhance predictive accuracy and personalize adult literacy education, setting a foundation for future research in applying LLMs within ITSs., Comment: 26th International Conference on Human-Computer Interaction
Published: 2024

37. Probing new physics with polarization components of the tau lepton in quasielastic $e^- p \to \Lambda_c \tau^-$ scattering process

Author: Yan, Xin-Shuai, Zhang, Liang-Hui, Chang, Qin, and Yang, Ya-Dong
Subjects: High Energy Physics - Phenomenology, High Energy Physics - Experiment
Abstract: Kinematics restrict the ability of rare charm decays to explore the charged Lepton Flavor Violation processes mediated by the quark-level $c\to u \ell \tau$ transition. To fill the gap, we propose exploring new physics (NP) through the quasielastic scattering process $e^-p\to \tau^-\Lambda_c$ and the polarization of the $\tau$ lepton. As analyzing modes for the $\tau$ polarization, we consider the decays $\tau^-\to \pi^-\nu_{\tau}$, $\tau^-\to \rho^-\nu_{\tau}$, and $\tau^- \to \ell^-\bar{\nu}_{\ell}\nu_{\tau}$, and show that the $\tau$ polarization components can be extracted from analyzing the kinematics of the $\tau$ visible decay products. In the framework of a general low-energy effective Lagrangian, we then perform a detailed analysis of the polarization components in various aspects and scrutinize possible NP signals. With one upcoming experimental setup, we finally demonstrate that a promising event rate can be expected for the cascade process and, even in the worst-case scenario -- no signal is observed at all -- it can still provide a competitive potential for constraining the NP, compared with those from the high-$p_T$ dilepton invariant mass tails at high-energy colliders.
Comment: 20 pages, 4 figures, 5 tables
Published: 2024

38. Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective

Author: Yue, Zihao, Zhang, Liang, and Jin, Qin
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: Large Multimodal Models (LMMs) often suffer from multimodal hallucinations, wherein they may create content that is not present in the visual inputs. In this paper, we explore a new angle of this issue: overly detailed training data hinders the model's ability to terminate generation in a timely manner, leading to continued outputs beyond visual perception limits. By investigating how the model decides to terminate generation with EOS, the special end-of-sentence token, we find that the model assesses the completeness of the entire sequence by comparing the generated text with the image. This observation suggests that the model possesses an inherent potential of making proper EOS decisions based on its visual perception to avoid overly lengthy outputs. To take advantage of such potential, we explore two methods to mitigate multimodal hallucinations: a training objective that enables the model to reduce hallucinations by learning from regular instruction data, and a data filtering strategy to prevent harmful training data from exacerbating model hallucinations. Both methods significantly improve the hallucination performance of LMMs, without requiring any additional data or knowledge.
Comment: Accepted to ACL 2024
Published: 2024

39. LLM-CompDroid: Repairing Configuration Compatibility Bugs in Android Apps with Pre-trained Large Language Models

Author: Liu, Zhijie, Tang, Yutian, Li, Meiyun, Jin, Xin, Long, Yunfei, Zhang, Liang Feng, and Luo, Xiapu
Subjects: Computer Science - Software Engineering
Abstract: XML configurations are integral to the Android development framework, particularly in the realm of UI display. However, these configurations can introduce compatibility issues (bugs), resulting in divergent visual outcomes and system crashes across various Android API versions (levels). In this study, we systematically investigate LLM-based approaches for detecting and repairing configuration compatibility bugs. Our findings highlight certain limitations of LLMs in effectively identifying and resolving these bugs, while also revealing their potential in addressing complex, hard-to-repair issues that traditional tools struggle with. Leveraging these insights, we introduce the LLM-CompDroid framework, which combines the strengths of LLMs and traditional tools for bug resolution. Our experimental results demonstrate a significant enhancement in bug resolution performance by LLM-CompDroid, with LLM-CompDroid-GPT-3.5 and LLM-CompDroid-GPT-4 surpassing the state-of-the-art tool, ConfFix, by at least 9.8% and 10.4% in both Correct and Correct@k metrics, respectively. This innovative approach holds promise for advancing the reliability and robustness of Android applications, making a valuable contribution to the field of software development.
Published: 2024

40. Adversarial Curriculum Graph Contrastive Learning with Pair-wise Augmentation

Author: Zhao, Xinjian, Zhang, Liang, Liu, Yang, Guo, Ruocheng, and Zhao, Xiangyu
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Graph contrastive learning (GCL) has emerged as a pivotal technique in the domain of graph representation learning. A crucial aspect of effective GCL is the caliber of generated positive and negative samples, which is intrinsically dictated by their resemblance to the original data. Nevertheless, precise control over similarity during sample generation presents a formidable challenge, often impeding the effective discovery of representative graph patterns. To address this challenge, we propose an innovative framework: Adversarial Curriculum Graph Contrastive Learning (ACGCL), which capitalizes on the merits of pair-wise augmentation to engender graph-level positive and negative samples with controllable similarity, alongside subgraph contrastive learning to discern effective graph patterns therein. Within the ACGCL framework, we have devised a novel adversarial curriculum training methodology that facilitates progressive learning by sequentially increasing the difficulty of distinguishing the generated samples. Notably, this approach transcends the prevalent sparsity issue inherent in conventional curriculum learning strategies by adaptively concentrating on more challenging training data. Finally, a comprehensive assessment of ACGCL is conducted through extensive experiments on six well-known benchmark datasets, wherein ACGCL conspicuously surpasses a set of state-of-the-art baselines.
Published: 2024

41. Large Language Model-Based Interpretable Machine Learning Control in Building Energy Systems

Author: Zhang, Liang and Chen, Zhelun
Subjects: Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
Abstract: The potential of Machine Learning Control (MLC) in HVAC systems is hindered by its opaque nature and inference mechanisms, which is challenging for users and modelers to fully comprehend, ultimately leading to a lack of trust in MLC-based decision-making. To address this challenge, this paper investigates and explores Interpretable Machine Learning (IML), a branch of Machine Learning (ML) that enhances transparency and understanding of models and their inferences, to improve the credibility of MLC and its industrial application in HVAC systems. Specifically, we developed an innovative framework that combines the principles of Shapley values and the in-context learning feature of Large Language Models (LLMs). While the Shapley values are instrumental in dissecting the contributions of various features in ML models, LLM provides an in-depth understanding of rule-based parts in MLC; combining them, LLM further packages these insights into a coherent, human-understandable narrative. The paper presents a case study to demonstrate the feasibility of the developed IML framework for model predictive control-based precooling under demand response events in a virtual testbed. The results indicate that the developed framework generates and explains the control signals in accordance with the rule-based rationale.
Published: 2024

42. Advancing Building Energy Modeling with Large Language Models: Exploration and Case Studies

Author: Zhang, Liang, Chen, Zhelun, and Ford, Vitaly
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence
Abstract: The rapid progression in artificial intelligence has facilitated the emergence of large language models like ChatGPT, offering potential applications extending into specialized engineering modeling, especially physics-based building energy modeling. This paper investigates the innovative integration of large language models with building energy modeling software, focusing specifically on the fusion of ChatGPT with EnergyPlus. A literature review is first conducted to reveal a growing trend of incorporating large language models in engineering modeling, albeit with limited research on their application in building energy modeling. We underscore the potential of large language models in addressing building energy modeling challenges and outline potential applications including 1) simulation input generation, 2) simulation output analysis and visualization, 3) conducting error analysis, 4) co-simulation, 5) simulation knowledge extraction and training, and 6) simulation optimization. Three case studies reveal the transformative potential of large language models in automating and optimizing building energy modeling tasks, underscoring the pivotal role of artificial intelligence in advancing sustainable building practices and energy efficiency. The case studies demonstrate that selecting the right large language model techniques is essential to enhance performance and reduce engineering efforts. Besides direct use of large language models, three specific techniques were utilized: 1) prompt engineering, 2) retrieval-augmented generation, and 3) multi-agent large language models. The findings advocate a multidisciplinary approach in future artificial intelligence research, with implications extending beyond building energy modeling to other specialized engineering modeling.
Published: 2024

43. Enhancing the efficiency of protein language models with minimal wet-lab data through few-shot learning

Author: Zhou, Ziyi, Zhang, Liang, Yu, Yuanxi, Li, Mingchen, Hong, Liang, and Tan, Pan
Subjects: Quantitative Biology - Biomolecules
Abstract: Accurately modeling the protein fitness landscapes holds great importance for protein engineering. Recently, due to their capacity and representation ability, pre-trained protein language models have achieved state-of-the-art performance in predicting protein fitness without experimental data. However, their predictions are limited in accuracy as well as interpretability. Furthermore, such deep learning models require abundant labeled training examples for performance improvements, posing a practical barrier. In this work, we introduce FSFP, a training strategy that can effectively optimize protein language models under extreme data scarcity. By combining the techniques of meta-transfer learning, learning to rank, and parameter-efficient fine-tuning, FSFP can significantly boost the performance of various protein language models using merely tens of labeled single-site mutants from the target protein. The experiments across 87 deep mutational scanning datasets underscore its superiority over both unsupervised and supervised approaches, revealing its potential in facilitating AI-guided protein design.
Published: 2024

44. 3DG: A Framework for Using Generative AI for Handling Sparse Learner Performance Data From Intelligent Tutoring Systems

Author: Zhang, Liang, Lin, Jionghao, Borchers, Conrad, Cao, Meng, and Hu, Xiangen
Subjects: Computer Science - Computers and Society, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Learning performance data (e.g., quiz scores and attempts) is significant for understanding learner engagement and knowledge mastery level. However, the learning performance data collected from Intelligent Tutoring Systems (ITSs) often suffers from sparsity, impacting the accuracy of learner modeling and knowledge assessments. To address this, we introduce the 3DG framework (3-Dimensional tensor for Densification and Generation), a novel approach combining tensor factorization with advanced generative models, including Generative Adversarial Network (GAN) and Generative Pre-trained Transformer (GPT), for enhanced data imputation and augmentation. The framework operates by first representing the data as a three-dimensional tensor, capturing dimensions of learners, questions, and attempts. It then densifies the data through tensor factorization and augments it using Generative AI models, tailored to individual learning patterns identified via clustering. Applied to data from an AutoTutor lesson by the Center for the Study of Adult Literacy (CSAL), the 3DG framework effectively generated scalable, personalized simulations of learning performance. Comparative analysis revealed GAN's superior reliability over GPT-4 in this context, underscoring its potential in addressing data sparsity challenges in ITSs and contributing to the advancement of personalized educational technology.
Published: 2024

45. AstroSat and NICER timing view of the Z-type Neutron Star X-ray binary GX 340+0

Author: Pahari, Mayukh, Suman, Shree, Bhargava, Yash, Weston, Alexander, Zhang, Liang, Bhattacharyya, Sudip, Misra, Ranjeev, and McHardy, Ian
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: The timing properties of the Z-type low-mass X-ray binaries provide insights into the emission components involved in producing the unique Z-shaped track in the hardness-intensity diagrams of these sources. In this work, we investigate the AstroSat and NICER observations of GX 340+0 covering the complete 'Z'-track from the horizontal branch (HB) to the extended flaring branch (EFB). For the first time, we present the Z-track as seen in soft X-rays using the AstroSat/SXT and NICER (the soft colour is defined as a ratio of 3-6 keV to 0.5-3 keV). The shape of the track is distinctly different in soft X-rays, strongly suggesting the presence of additional components active in soft X-rays. The detailed timing analysis revealed significant quasi-periodic oscillations throughout the HB and the normal branch (NB) using LAXPC and the first NICER detection of a 33.1 +/- 1.1 Hz horizontal branch oscillation (HBO) in 3-6 keV. The oscillations at the HB/NB vertex are observed to have higher frequencies (41-52 Hz) than the HB oscillations (16-31 Hz) and NB oscillations (6.2-8 Hz) but significantly lower rms (~1.6%). The HB oscillation is also limited to the energy range of 3-20 keV, indicating an association of HBO origin with the non-thermal component. It is also supported by earlier studies that found the strongest X-ray polarisation during HB.
Comment: 15 pages, 12 figures, 4 tables, accepted for publication in MNRAS
Published: 2024

46. Enhancing Large Language Model Performance To Answer Questions and Extract Information More Accurately

Author: Zhang, Liang, Jijo, Katherine, Setty, Spurthi, Chung, Eden, Javid, Fatima, Vidra, Natan, and Clifford, Tommy
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Large Language Models (LLMs) generate responses to questions; however, their effectiveness is often hindered by sub-optimal quality of answers and occasional failures to provide accurate responses to questions. To address these challenges, a fine-tuning process is employed, involving feedback and examples to refine models. The objective is to enhance AI models through continuous feedback loops, utilizing metrics such as cosine similarity, LLM evaluation and Rouge-L scores to evaluate the models. Leveraging LLMs like GPT-3.5, GPT4ALL, LLaMA2, and Claude, this approach is benchmarked on financial datasets, including the FinanceBench and RAG Instruct Benchmark Tester Dataset, illustrating the necessity of fine-tuning. The results showcase the capability of fine-tuned models to surpass the accuracy of zero-shot LLMs, providing superior question and answering capabilities. Notably, the combination of fine-tuning the LLM with a process known as Retrieval Augmented Generation (RAG) proves to generate responses with improved accuracy.
Published: 2024

47. Research on the knee region of cosmic ray by using a novel type of electron-neutron detector array

Author: Li, Bing-Bing, Ma, Xin-Hua, Cui, Shu-Wang, Chen, Hao-Kun, Chen, Tian-Lu, Danzengluobu, Gao, Wei, Hu, Hai-Bing, Kuleshov, Denis, Kurinov, Kirill, Liu, Hu, Liu, Mao-Yuan, Liu, Ye, Peng, Da-Yu, Qi, Yao-Hui, Shchegolev, Oleg, Stenkin, Yuri, Yin, Li-Qiao, Zhang, Heng-Yu, and Zhang, Liang-Wei
Subjects: Astrophysics - High Energy Astrophysical Phenomena, Astrophysics - Instrumentation and Methods for Astrophysics, Physics - Instrumentation and Detectors
Abstract: By accurately measuring the composition and energy spectrum of cosmic rays, the origin problem of the so-called "knee" region (energy > 1 PeV) can be solved. However, up to the present, the results of the spectrum in the knee region obtained by several previous experiments have shown obvious differences, so they cannot give effective evidence for judging the theoretical models on the origin of the knee. Recently, the Large High Altitude Air Shower Observatory (LHAASO) has reported several major breakthroughs and important results in the astro-particle physics field. Relying on its advantages of wide-sky survey, high altitude location and large area detector arrays, the research content of the LHAASO experiment mainly includes ultra high-energy gamma-ray astronomy, measurement of cosmic ray spectra in the knee region, and searching for dark matter and new phenomena of particle physics at higher energy. The Electron and Thermal Neutron detector (EN-Detector) is a new scintillator detector which applies thermal neutron detection technology to measure cosmic ray extensive air showers (EAS). This technology is an extension of LHAASO. The EN-Detector Array (ENDA) can highly efficiently measure thermal neutrons generated by secondary hadrons, the so-called "skeleton" of the EAS. In this paper, we perform the optimization of the ENDA configuration, and obtain expectations on the ENDA results, including thermal neutron distribution, trigger efficiency and capability of cosmic ray composition separation. The obtained real data results are consistent with those from the Monte Carlo simulation.
Published: 2024

48. Studies on the soft intermediate state X-ray flare of MAXI J1535-571 during its 2017 outburst

Author: Ma, Ruican, Tao, Lian, Méndez, Mariano, Zhang, Shuang-Nan, Xu, Yanjun, Zhang, Liang, Liu, Hexin, Qu, Jinlu, Song, Liming, Ren, Xiaoqin, Zhao, Shujie, Huang, Yue, Ma, Xiang, Zhao, Qingchang, Xu, Yingchen, Li, Panping, Yang, Zixu, and Yu, Wei
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: We analyzed an observation with the Nuclear Spectroscopic Telescope Array of the black-hole X-ray binary MAXI J1535-571 in the soft intermediate state, in which we detected a 2.5-ks long flare. Our spectral fitting results suggest that MAXI J1535-571 possesses a high spin of 0.97 (-0.10/+0.02) and a low inclination of approximately 24 deg. We observed a gradual increase in the inner disc radius, as determined from fits to the continuum spectrum. This trend is inconsistent with an increased flux ratio of the thermal component, as well as the source evolving towards the soft state. This inconsistency may be attributed to a gradual decrease of the color correction factor. Additionally, with a flare velocity of approximately 0.5 c and a higher hardness ratio during the flare period, the quasi-simultaneous detection of a type-B QPO in the Neutron Star Interior Composition Explorer data, and quasi-simultaneous ejecta launch through radio observations collectively provide strong evidence supporting the possibility that the flare originated from a discrete jet ejection.
Comment: 11 pages, 8 figures, 3 tables; accepted to be published in MNRAS
Published: 2024

49. The two-way knowledge interaction interface between humans and neural networks

Author: He, Zhanliang, Xiong, Nuoye, Li, Hongsheng, Shen, Peiyi, Zhu, Guangming, and Zhang, Liang
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Although neural networks (NN) have been widely applied in various fields and generally outperform humans, they still lack interpretability to a certain extent, and humans are unable to intuitively understand the decision logic of NN. This also hinders the knowledge interaction between humans and NN, preventing humans from getting involved to give direct guidance when NN's decisions go wrong. While recent research in explainable AI has achieved interpretability of NN from various perspectives, it has not yet provided effective methods for knowledge exchange between humans and NN. To address this problem, we constructed a two-way interaction interface that uses structured representations of visual concepts and their relationships as the "language" for knowledge exchange between humans and NN. Specifically, NN provide intuitive reasoning explanations to humans based on the class-specific structural concepts graph (C-SCG). On the other hand, humans can modify the biases present in the C-SCG through their prior knowledge and reasoning ability, and thus provide direct knowledge guidance to NN through this interface. Through experimental validation, based on this interaction interface, NN can provide humans with easily understandable explanations of the reasoning process. Furthermore, human involvement and prior knowledge can directly and effectively contribute to enhancing the performance of NN.
Published: 2024

50. Content-Conditioned Generation of Stylized Free hand Sketches

Author: Liu, Jiajun, Wang, Siyuan, Zhu, Guangming, Zhang, Liang, Li, Ning, and Gao, Eryang
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: In recent years, the recognition of free-hand sketches has remained a popular task. However, in some special fields such as the military field, free-hand sketches are difficult to sample on a large scale. Common data augmentation and image generation techniques struggle to produce images with various free-hand sketching styles. Therefore, the recognition and segmentation tasks in related fields are limited. In this paper, we propose a novel adversarial generative network that can accurately generate realistic free-hand sketches with various styles. We explore the performance of the model, including using styles randomly sampled from a prior normal distribution to generate images with various free-hand sketching styles, disentangling the painters' styles from known free-hand sketches to generate images with specific styles, and generating images of unknown classes that are not in the training set. We further demonstrate with qualitative and quantitative evaluations our advantages in visual quality, content accuracy, and style imitation on SketchIME.
Comment: 6 pages, 7 figures, ICSMD
Published: 2024
