48,331 results on '"Li, Hao"'
Search Results
2. Extract-and-Abstract: Unifying Extractive and Abstractive Summarization within Single Encoder-Decoder Framework
- Author
- Wu, Yuping, Li, Hao, Zhu, Hongbo, Nenadic, Goran, and Zeng, Xiao-Jun
- Subjects
- Computer Science - Computation and Language
- Abstract
Extract-then-Abstract is a naturally coherent paradigm to conduct abstractive summarization with the help of salient information identified by the extractive model. Previous works that adopt this paradigm train the extractor and abstractor separately and introduce extra parameters to highlight the extracted salients to the abstractor, which results in error accumulation and additional training costs. In this paper, we first introduce a parameter-free highlight method into the encoder-decoder framework: replacing the encoder attention mask with a saliency mask in the cross-attention module to force the decoder to focus only on salient parts of the input. A preliminary analysis compares different highlight methods, demonstrating the effectiveness of our saliency mask. We further propose the novel extract-and-abstract paradigm, ExtAbs, which jointly and seamlessly performs Extractive and Abstractive summarization tasks within a single encoder-decoder model to reduce error accumulation. In ExtAbs, the vanilla encoder is augmented to extract salients, and the vanilla decoder is modified with the proposed saliency mask to generate summaries. Built upon BART and PEGASUS, experiments on three datasets show that ExtAbs outperforms baselines on the extractive task and performs comparably to, or even better than, the vanilla models on the abstractive task.
- Published
- 2024
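The saliency mask described in this abstract can be illustrated with a minimal sketch in plain Python. It only shows the general idea of restricting cross-attention to salient encoder positions; the function name `saliency_masked_attention`, the toy scores, and the 0/1 mask are hypothetical and are not taken from the paper's BART or PEGASUS implementation.

```python
import math

def saliency_masked_attention(scores, saliency_mask):
    """Softmax over encoder positions, restricted to salient ones.

    scores: raw cross-attention scores for one decoder query (list of floats).
    saliency_mask: 1 for salient encoder tokens, 0 otherwise (hypothetical input).
    """
    if not any(saliency_mask):
        raise ValueError("at least one position must be salient")
    # Non-salient positions get -inf, so they receive zero attention weight.
    masked = [s if m else float("-inf") for s, m in zip(scores, saliency_mask)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: positions 0 and 2 are marked salient.
weights = saliency_masked_attention([2.0, 0.5, 1.0, -1.0], [1, 0, 1, 0])
```

No extra parameters are involved: the mask only changes which positions the softmax can see, which is consistent with the "parameter-free" description in the abstract.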
3. Multi-granularity Score-based Generative Framework Enables Efficient Inverse Design of Complex Organics
- Author
- Chen, Zijun, Wang, Yu, Lv, Liuzhenghao, Li, Hao, Lin, Zongying, Yuan, Li, and Tian, Yonghong
- Subjects
- Computer Science - Computational Engineering, Finance, and Science
- Abstract
Efficiently retrieving an enormous chemical library to design targeted molecules is crucial for accelerating drug discovery, organic chemistry, and optoelectronic materials. Despite the emergence of generative models to produce novel drug-like molecules, in a more realistic scenario, the complexity of functional groups (e.g., pyrene, acenaphthylene, and bridged-ring systems) and extensive molecular scaffolds remain challenging obstacles for the generation of complex organics. Traditionally, the former demands an extra learning process, e.g., molecular pre-training, and the latter requires expensive computational resources. To address these challenges, we propose OrgMol-Design, a multi-granularity framework for efficiently designing complex organics. Our OrgMol-Design is composed of a score-based generative model via fragment prior for diverse coarse-grained scaffold generation and a chemical-rule-aware scoring model for fine-grained molecular structure design, circumventing the difficulty of intricate substructure learning without losing connection details among fragments. Our approach achieves state-of-the-art performance in four real-world and more challenging benchmarks covering broader scientific domains, outperforming advanced molecule generative models. Additionally, it delivers a substantial speedup and graphics memory reduction compared to diffusion-based graph models. Our results also demonstrate the importance of leveraging fragment prior for a generalized molecule inverse design model.
- Published
- 2024
4. Shift system and its applications
- Author
- Li, Hao and Sugimoto, Shoma
- Subjects
- Mathematics - Representation Theory, Mathematical Physics
- Abstract
We introduce a new concept named shift system. This is a purely Lie algebraic setting to develop the geometric representation theory of Feigin-Tipunin construction. After reformulating the discussion in past works of the second author under this new setting, as an application, we extend almost all the main results of these works to the (multiplet) principal W-algebra at positive integer level associated with a simple Lie algebra $\mathfrak{g}$ and Lie superalgebra $\mathfrak{osp}(1|2n)$, respectively. This paper also contains an appendix by Myungbo Shim on the relationship between Feigin-Tipunin construction and recent quantum field theories., Comment: 35 pages
- Published
- 2024
5. FuXi-2.0: Advancing machine learning weather forecasting model for practical applications
- Author
- Zhong, Xiaohui, Chen, Lei, Fan, Xu, Qian, Wenxu, Liu, Jun, and Li, Hao
- Subjects
- Physics - Atmospheric and Oceanic Physics, Computer Science - Machine Learning
- Abstract
Machine learning (ML) models have become increasingly valuable in weather forecasting, providing forecasts that not only lower computational costs but often match or exceed the accuracy of traditional numerical weather prediction (NWP) models. Despite their potential, ML models typically suffer from limitations such as coarse temporal resolution, typically 6 hours, and a limited set of meteorological variables, limiting their practical applicability. To overcome these challenges, we introduce FuXi-2.0, an advanced ML model that delivers 1-hourly global weather forecasts and includes a comprehensive set of essential meteorological variables, thereby expanding its utility across various sectors like wind and solar energy, aviation, and marine shipping. Our study conducts comparative analyses between ML-based 1-hourly forecasts and those from the high-resolution forecast (HRES) of the European Centre for Medium-Range Weather Forecasts (ECMWF) for various practical scenarios. The results demonstrate that FuXi-2.0 consistently outperforms ECMWF HRES in forecasting key meteorological variables relevant to these sectors. In particular, FuXi-2.0 shows superior performance in wind power forecasting compared to ECMWF HRES, further validating its efficacy as a reliable tool for scenarios demanding precise weather forecasts. Additionally, FuXi-2.0 also integrates both atmospheric and oceanic components, representing a significant step forward in the development of coupled atmospheric-ocean models. Further comparative analyses reveal that FuXi-2.0 provides more accurate forecasts of tropical cyclone intensity than its predecessor, FuXi-1.0, suggesting that there are benefits of an atmosphere-ocean coupled model over atmosphere-only models.
- Published
- 2024
6. Attention-Based Beamformer For Multi-Channel Speech Enhancement
- Author
- Bai, Jinglin, Li, Hao, Zhang, Xueliang, and Chen, Fei
- Subjects
- Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Minimum Variance Distortionless Response (MVDR) is a classical adaptive beamformer that theoretically ensures the distortionless transmission of signals in the target direction, which makes it popular in real applications. Its noise reduction performance actually depends on the accuracy of the noise and speech spatial covariance matrices (SCMs) estimation. Time-frequency masks are often used to compute these SCMs. However, most mask-based beamforming methods typically assume that the sources are stationary, ignoring the case of moving sources, which leads to performance degradation. In this paper, we propose an attention-based mechanism to calculate the speech and noise SCMs and then apply MVDR to obtain the enhanced speech. To fully incorporate spatial information, the inplace convolution operator and frequency-independent LSTM are applied to facilitate SCMs estimation. The model is optimized in an end-to-end manner. Experiments demonstrate that the proposed method outperforms baselines with reduced computation and fewer parameters under various conditions.
- Published
- 2024
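For readers unfamiliar with MVDR, the classical closed-form weights are w = R⁻¹d / (dᴴR⁻¹d), where R is the noise spatial covariance matrix and d the steering vector. The sketch below works this out for two channels in plain Python. It illustrates only the textbook beamformer, not the attention-based SCM estimation proposed in the paper; the function name `mvdr_weights_2ch` and the toy matrix and steering vector are assumptions made for illustration.

```python
def mvdr_weights_2ch(noise_scm, steering):
    """Textbook MVDR weights w = R^{-1} d / (d^H R^{-1} d) for two channels.

    noise_scm: 2x2 noise spatial covariance matrix (nested lists of complex).
    steering: length-2 steering vector toward the target (list of complex).
    """
    (a, b), (c, d) = noise_scm
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("noise SCM is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    # Numerator: R^{-1} d
    num = [inv[i][0] * steering[0] + inv[i][1] * steering[1] for i in range(2)]
    # Denominator: d^H R^{-1} d
    denom = sum(s.conjugate() * n for s, n in zip(steering, num))
    return [n / denom for n in num]

# Toy Hermitian noise SCM and unit-modulus steering vector (made-up values).
noise_scm = [[1.0 + 0j, 0.3 + 0.1j], [0.3 - 0.1j, 1.5 + 0j]]
steering = [1.0 + 0j, 0.6 - 0.8j]
weights = mvdr_weights_2ch(noise_scm, steering)
# Distortionless constraint: the response w^H d in the target direction is 1.
response = sum(w.conjugate() * s for w, s in zip(weights, steering))
```

The final line checks the "distortionless" property mentioned in the abstract: whatever the noise statistics, the beamformer passes the target direction with unit gain.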
7. Modified Meta-Thompson Sampling for Linear Bandits and Its Bayes Regret Analysis
- Author
- Li, Hao, Liang, Dong, and Xie, Zheng
- Subjects
- Statistics - Machine Learning, Computer Science - Machine Learning, Mathematics - Optimization and Control
- Abstract
Meta-learning is characterized by its ability to learn how to learn, enabling the adaptation of learning strategies across different tasks. Recent research introduced the Meta-Thompson Sampling (Meta-TS), which meta-learns an unknown prior distribution sampled from a meta-prior by interacting with bandit instances drawn from it. However, its analysis was limited to Gaussian bandits. The contextual multi-armed bandit framework is an extension of the Gaussian bandit, which challenges the agent to utilize context vectors to predict the most valuable arms, optimally balancing exploration and exploitation to minimize regret over time. This paper introduces the Meta-TSLB algorithm, a modified Meta-TS for linear contextual bandits. We theoretically analyze Meta-TSLB and derive an $O((m+\log(m))\sqrt{n\log(n)})$ bound on its Bayes regret, in which $m$ represents the number of bandit instances, and $n$ the number of rounds of Thompson Sampling. Additionally, our work complements the analysis of Meta-TS for linear contextual bandits. The performance of Meta-TSLB is evaluated experimentally under different settings, and we experiment with and analyze the generalization capability of Meta-TSLB, showcasing its potential to adapt to unseen instances.
- Published
- 2024
8. Physical Processes Behind the Co-Evolution of Halos, Galaxies and Supermassive Black Holes in the IllustrisTNG Simulation
- Author
- Li, Hao, Chen, Yangyao, Wang, Huiyuan, and Mo, Houjun
- Subjects
- Astrophysics - Astrophysics of Galaxies
- Abstract
We explore the co-evolution of dark matter halos, their central galaxies, and central supermassive black holes (SMBHs) using the IllustrisTNG (TNG) simulation. We find that the evolutionary histories of individual galaxies in the $M_{\rm BH}$-$M_*$ plane can be decomposed into four distinct phases, separated by three transition points. We identify the driving processes of galaxy evolution within each phase and derive the conditions necessary and sufficient for transitions to subsequent phases. The first phase is dominated by star formation, with its duration primarily determined by the mass of the SMBH seed and the surrounding gas environment. The second phase is characterized by rapid SMBH growth, and the transition to the next phase occurs when the thermal-mode feedback of active galactic nucleus (AGN) can unbind gas from the galaxy. The third phase involves self-regulation of the SMBH, and the transition to the quenched phase occurs when the kinetic-mode feedback of AGN counterbalances gas cooling within the subhalo. The final phase is dominated by mergers. We investigate the use of scaling relations among different mass components and evolutionary phases to understand processes implemented in TNG and other simulations, and discuss how current and forthcoming observations can be used to constrain models., Comment: 28 pages, 12 figures, Submitted to MNRAS. Comments welcome!
- Published
- 2024
9. Full Stokes-vector inversion of the solar Mg II h & k lines
- Author
- Li, Hao, Alemán, Tanausú del Pino, and Bueno, Javier Trujillo
- Subjects
- Astrophysics - Solar and Stellar Astrophysics
- Abstract
The polarization of the Mg II h & k resonance lines is the result of the joint action of scattering processes and the magnetic field induced Hanle, Zeeman, and magneto-optical effects, thus holding significant potential for the diagnostic of the magnetic field in the solar chromosphere. The Chromospheric LAyer Spectro-Polarimeter sounding rocket experiment, carried out in 2019, successfully measured at each position along the 196 arcsec spectrograph slit the wavelength variation of the four Stokes parameters in the spectral region of this doublet around 280 nm, both in an active region plage and in a quiet region close to the limb. We consider some of these CLASP2 Stokes profiles and apply to them the recently-developed HanleRT Tenerife Inversion Code, which assumes a one-dimensional model atmosphere for each spatial pixel under consideration (i.e., it neglects the effects of horizontal radiative transfer). We find that the non-magnetic causes of symmetry breaking, due to the horizontal inhomogeneities and the gradients of the horizontal components of the macroscopic velocity in the solar atmosphere, have a significant impact on the linear polarization profiles. By introducing such non-magnetic causes of symmetry breaking as parameters in our inversion code, we can successfully fit the Stokes profiles and provide an estimation of the magnetic field vector. For example, in the quiet region pixels, where no circular polarization signal is detected, we find that the magnetic field strength in the upper chromosphere varies between 1 and 20 gauss., Comment: Accepted for publication in the Astrophysical Journal
- Published
- 2024
10. Solve paint color effect prediction problem in trajectory optimization of spray painting robot using artificial neural network inspired by the Kubelka Munk model
- Author
- Wang, Hexiang, Bi, Zhiyuan, Cheng, Zhen, Li, Xinru, Zhu, Jiake, Jiang, Liyuan, Li, Hao, and Lu, Shizhou
- Subjects
- Computer Science - Robotics
- Abstract
Currently, the spray-painting robot trajectory planning technology aiming at spray painting quality mainly applies to single-color spraying. Conventional methods of optimizing the spray gun trajectory based on simulated thickness can only qualitatively reflect the color distribution, and cannot simulate the color effect of spray painting at the pixel level. Therefore, it is not possible to accurately control the area covered by the color and the gradation of the edges of the area, and it is also difficult to deal with the situation where multiple colors of paint are sprayed in combination. To solve the above problems, this paper is inspired by the Kubelka-Munk model and combines the 3D machine vision method and artificial neural network to propose a spray painting color effect prediction method. The method can predict the execution effect of the spray gun trajectory with pixel-level accuracy in terms of the surface color of the workpiece after spray painting. On this basis, the method can be used to replace the traditional thickness simulation method to establish the objective function of the spray gun trajectory optimization problem, and thus solve the difficult problem of spray gun trajectory optimization for multi-color paint combination spraying. In this paper, the mathematical model of the spray painting color effect prediction problem is first determined through the analysis of the Kubelka-Munk paint film color rendering model, and at the same time, the spray painting color effect dataset is established with the help of the depth camera and point cloud processing algorithm. After that, the multilayer perceptron model was improved with the help of gating and residual structure and was used for the color prediction task. To verify ...
- Published
- 2024
11. Distributed exact multi-objective quantum search algorithm
- Author
- Li, Hao, Qiu, Daowen, and Luo, Le
- Subjects
- Quantum Physics
- Abstract
Multi-objective search means searching for any one of several objectives in an unstructured database. Grover's algorithm achieves a quadratic speedup over classical algorithms in multi-objective search. The iterated operator in Grover's algorithm is a key element and plays an important role in amplitude amplification. In this paper, we design two distributed iterated operators and therefore two new distributed Grover's algorithms are obtained with the following advantages: (1) Compared to Grover's algorithm and the modified Grover's algorithm by Long, our distributed algorithms require fewer qubits; (2) Compared to the distributed Grover's algorithm proposed by Qiu et al., one of our distributed algorithms is exact. Of course, both of our distributed algorithms require a fair amount of quantum communication and involve a number of more complicated unitary operators as a cost, but they may still have a certain advantage in physical realizability in the Noisy Intermediate-Scale Quantum (NISQ) era., Comment: 38 pages, 20 figures, comments are welcome
- Published
- 2024
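As background for the quadratic speedup mentioned in this abstract, the standard analysis of Grover's algorithm with M marked items among N gives a success probability of sin²((2k+1)θ) after k iterations, where sin θ = √(M/N). The sketch below evaluates that formula in plain Python. It covers only the textbook, non-distributed algorithm, not the exact distributed variants the paper proposes, and the function names are hypothetical.

```python
import math

def grover_success_probability(n_items, n_targets, iterations):
    """Probability of measuring a marked item after `iterations` Grover steps."""
    theta = math.asin(math.sqrt(n_targets / n_items))
    return math.sin((2 * iterations + 1) * theta) ** 2

def near_optimal_iterations(n_items, n_targets):
    """Usual choice floor(pi / (4 * theta)), which grows like sqrt(N / M)."""
    theta = math.asin(math.sqrt(n_targets / n_items))
    return math.floor(math.pi / (4 * theta))

# With 4 items and 1 target, a single iteration already succeeds with certainty.
k_small = near_optimal_iterations(4, 1)
p_small = grover_success_probability(4, 1, k_small)

# With 1024 items and 4 targets, about a dozen iterations suffice.
k_large = near_optimal_iterations(1024, 4)
p_large = grover_success_probability(1024, 4, k_large)
```

In general the textbook algorithm is close to, but not exactly, certain to succeed, which is why exact variants such as Long's modification and the one described in the abstract are of interest.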
12. Primordial Bounce-Inflation Scenario to Alleviate Cosmological Tensions and Lensing Anomaly
- Author
- Li, Hao-Hao, Zhang, Xin-zhe, and Qiu, Taotao
- Subjects
- Astrophysics - Cosmology and Nongalactic Astrophysics, General Relativity and Quantum Cosmology
- Abstract
We put forward a primordial scenario to alleviate cosmological tensions, i.e. Hubble ($H_0$) tension and $ S_8 $ tension. Based on flat $\Lambda$CDM, the Bounce-Inflation (BI) scenario gives the results that $ H_0 = 68.60^{+0.40}_{-0.45} \, \text{km}/\text{s}/\text{Mpc}$, $ S_8 = 0.806 \pm 0.011 $ by using \texttt{Planck 2018} data sets and $ H_0 = 68.96 \pm 0.38 \, \text{km}/\text{s}/\text{Mpc}$, $ S_8 = 0.797\pm 0.010 $ by using \texttt{Planck 2018} + \texttt{SPT3G} data sets. These reduce the cosmological tensions slightly. We also take an extended $\Lambda$CDM model into account, $\Lambda$CDM (BI)+$A_L$, where $ A_L $ is the gravitational lensing amplitude. The results are $ H_0 = 69.38 \pm 0.49 \, \text{km}/\text{s}/\text{Mpc}$, $ S_8 = 0.774 \pm 0.014 $ fitted by \texttt{Planck 2018} data sets and $ H_0 = 69.49 \pm 0.45 \, \text{km}/\text{s}/\text{Mpc}$, $ S_8 = 0.771^{+0.013}_{-0.012} $ fitted by \texttt{Planck 2018} + \texttt{SPT3G} data sets, which reduce the Hubble tension to $\sim 3\sigma $ level and show no $S_8 $ tension. The $A_L \approx 1.1$ is smaller than the result of the inflation scenario with a constraint of \texttt{Planck 2018} data sets. Besides, the spectral index of the bounce-inflation scenario $ n_s $ is about $ 0.98 $, with a trend to the Harrison-Zel'dovich spectrum., Comment: 14 pages, 4 figures, 3 tables
- Published
- 2024
13. NYK-MS: A Well-annotated Multi-modal Metaphor and Sarcasm Understanding Benchmark on Cartoon-Caption Dataset
- Author
- Chang, Ke, Li, Hao, Zhang, Junzhao, and Wu, Yunfang
- Subjects
- Computer Science - Computation and Language
- Abstract
Metaphor and sarcasm are common figurative expressions in people's communication, especially on the Internet and in the memes popular among teenagers. We create a new benchmark named NYK-MS (NewYorKer for Metaphor and Sarcasm), which contains 1,583 samples for metaphor understanding tasks and 1,578 samples for sarcasm understanding tasks. These tasks include identifying whether a sample contains metaphor/sarcasm, which word or object carries the metaphor/sarcasm, what it satirizes, and why it contains metaphor/sarcasm. All 7 tasks are well-annotated by at least 3 annotators. We annotate the dataset for several rounds to improve the consistency and quality, and use GUI and GPT-4V to raise our efficiency. Based on the benchmark, we conduct plenty of experiments. In the zero-shot experiments, we show that Large Language Models (LLM) and Large Multi-modal Models (LMM) perform poorly on the classification tasks, and as the scale increases, the performance on the other 5 tasks improves. In the experiments on traditional pre-trained models, we show the improvement from augmentation and alignment methods, which shows that our benchmark is consistent with previous datasets and requires the model to understand both of the two modalities., Comment: 13 pages, 6 figures
- Published
- 2024
14. GRPose: Learning Graph Relations for Human Image Generation with Pose Priors
- Author
- Yin, Xiangchen, Di, Donglin, Fan, Lei, Li, Hao, Wei, Chen, Gou, Xiaofei, Song, Yang, Sun, Xiao, and Yang, Xun
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Recent methods using diffusion models have made significant progress in human image generation with various additional controls such as pose priors. However, existing approaches still struggle to generate high-quality images with consistent pose alignment, resulting in unsatisfactory outputs. In this paper, we propose a framework delving into the graph relations of pose priors to provide control information for human image generation. The main idea is to establish a graph topological structure between the pose priors and latent representation of diffusion models to capture the intrinsic associations between different pose parts. A Progressive Graph Integrator (PGI) is designed to learn the spatial relationships of the pose priors with the graph structure, adopting a hierarchical strategy within an Adapter to gradually propagate information across different pose parts. A pose perception loss is further introduced based on a pretrained pose estimation network to minimize the pose differences. Extensive qualitative and quantitative experiments conducted on the Human-Art and LAION-Human datasets demonstrate that our model achieves superior performance, with a 9.98% increase in pose average precision compared to the latest benchmark model. The code is released on *******., Comment: The code will be released at https://github.com/XiangchenYin/GRPose
- Published
- 2024
15. Cascaded Temporal Updating Network for Efficient Video Super-Resolution
- Author
- Li, Hao, Dong, Jiangxin, and Pan, Jinshan
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Existing video super-resolution (VSR) methods generally adopt a recurrent propagation network to extract spatio-temporal information from the entire video sequences, exhibiting impressive performance. However, the key components in recurrent-based VSR networks significantly impact model efficiency, e.g., the alignment module occupies a substantial portion of model parameters, while the bidirectional propagation mechanism significantly amplifies the inference time. Consequently, developing a compact and efficient VSR method that can be deployed on resource-constrained devices, e.g., smartphones, remains challenging. To this end, we propose a cascaded temporal updating network (CTUN) for efficient VSR. We first develop an implicit cascaded alignment module to explore spatio-temporal correspondences from adjacent frames. Moreover, we propose a unidirectional propagation updating network to efficiently explore long-range temporal information, which is crucial for high-quality video reconstruction. Specifically, we develop a simple yet effective hidden updater that can leverage future information to update hidden features during forward propagation, significantly reducing inference time while maintaining performance. Finally, we formulate all of these components into an end-to-end trainable VSR network. Extensive experimental results show that our CTUN achieves a favorable trade-off between efficiency and performance compared to existing methods. Notably, compared with BasicVSR, our method obtains better results while employing only about 30% of the parameters and running time. The source code and pre-trained models will be available at https://github.com/House-Leo/CTUN., Comment: Project website: https://github.com/House-Leo/CTUN
- Published
- 2024
16. Experimental practical quantum tokens with transaction time advantage
- Author
- Jiang, Yang-Fan, Kent, Adrian, Pitalúa-García, Damián, Yao, Xiaochen, Chen, Xiaohan, Huang, Jia, Cowperthwaite, George, Zheng, Qibin, Li, Hao, You, Lixing, Liu, Yang, Zhang, Qiang, and Pan, Jian-Wei
- Subjects
- Quantum Physics
- Abstract
Quantum money is the first invention in quantum information science, promising advantages over classical money by simultaneously achieving unforgeability, user privacy, and instant validation. However, standard quantum money relies on quantum memories and long-distance quantum communication, which are technologically extremely challenging. Quantum "S-money" tokens eliminate these technological requirements while preserving unforgeability, user privacy, and instant validation. Here, we report the first full experimental demonstration of quantum S-tokens, proven secure despite errors, losses and experimental imperfections. The heralded single-photon source with a high system efficiency of 88.24% protects against arbitrary multi-photon attacks arising from losses in the quantum token generation. Following short-range quantum communication, the token is stored, transacted, and verified using classical bits. We demonstrate a transaction time advantage over intra-city 2.77 km and inter-city 60.54 km optical fibre networks, compared with optimal classical cross-checking schemes. Our implementation demonstrates the practicality of quantum S-tokens for applications requiring high security, privacy and minimal transaction times, like financial trading and network control. It is also the first demonstration of a quantitative quantum time advantage in relativistic cryptography, showing the enhanced cryptographic power of simultaneously considering quantum and relativistic physics., Comment: 74 pages, 6 figures
- Published
- 2024
17. LLMs are not Zero-Shot Reasoners for Biomedical Information Extraction
- Author
- Nagar, Aishik, Schlegel, Viktor, Nguyen, Thanh-Tung, Li, Hao, Wu, Yuping, Binici, Kuluhan, and Winkler, Stefan
- Subjects
- Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
Large Language Models (LLMs) are increasingly adopted for applications in healthcare, reaching the performance of domain experts on tasks such as question answering and document summarisation. Despite their success on these tasks, it is unclear how well LLMs perform on tasks that are traditionally pursued in the biomedical domain, such as structured information extraction. To bridge this gap, in this paper, we systematically benchmark LLM performance in Medical Classification and Named Entity Recognition (NER) tasks. We aim to disentangle the contribution of different factors to the performance, particularly the impact of LLMs' task knowledge and reasoning capabilities, their (parametric) domain knowledge, and addition of external knowledge. To this end we evaluate various open LLMs -- including BioMistral and Llama-2 models -- on a diverse set of biomedical datasets, using standard prompting, Chain-of-Thought (CoT) and Self-Consistency based reasoning as well as Retrieval-Augmented Generation (RAG) with PubMed and Wikipedia corpora. Counter-intuitively, our results reveal that standard prompting consistently outperforms more complex techniques across both tasks, laying bare the limitations in the current application of CoT, self-consistency and RAG in the biomedical domain. Our findings suggest that advanced prompting methods developed for knowledge- or reasoning-intensive tasks, such as CoT or RAG, are not easily portable to biomedical tasks where precise structured outputs are required. This highlights the need for more effective integration of external knowledge and reasoning mechanisms in LLMs to enhance their performance in real-world biomedical applications., Comment: 11 pages
- Published
- 2024
18. Interplay of Quantum Resources in Nonlocality Tests
- Author
- Dong, Hai-Hao, Zhu, Yuwei, Cheng, Su-Yi, Zhang, Xingjian, Li, Cheng-Long, Li, Ying-Zhao, Li, Hao, You, Lixing, Ma, Xiongfeng, Zhang, Qiang, and Pan, Jian-Wei
- Subjects
- Quantum Physics
- Abstract
Nonlocality, evidenced by the violation of Bell inequalities, not only signifies entanglement but also highlights measurement incompatibility in quantum systems. Utilizing the generalized Clauser-Horne-Shimony-Holt (CHSH) Bell inequality, our high-efficiency optical setup achieves a loophole-free violation of $2.0132$. This result provides a device-independent lower bound on entanglement, quantified as the entanglement of formation at $0.0159$. Moreover, by tuning the parameters of the generalized Bell inequality, we enhance the estimation of measurement incompatibility, which is quantified by an effective overlap of $4.3883 \times 10^{-5}$. To explore the intricate interplay among nonlocality, entanglement, and measurement incompatibility, we generate mixed states, allowing for flexible modulation of entanglement via fast switching among the four Bell states using Pockels cells, achieving a fidelity above $99.10\%$. Intriguingly, our results reveal a counterintuitive relationship where increasing incompatibility initially boosts nonlocality but eventually leads to its reduction. Typically, maximal nonlocality does not coincide with maximal incompatibility. This experimental study sheds light on the optimal management of quantum resources for Bell-inequality-based quantum information processing., Comment: 15 pages, 9 figures
- Published
- 2024
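To put the reported violation of $2.0132$ in context, recall that in the standard CHSH inequality any local hidden-variable model satisfies |S| ≤ 2, while quantum mechanics allows up to 2√2. The sketch below computes S for the ideal two-qubit singlet state, whose correlation is E(a, b) = -cos(a - b), at the usual optimal measurement angles. It uses the standard CHSH expression as an assumption for illustration; the generalized inequality used in the paper has tunable parameters not reproduced here, and the function names are hypothetical.

```python
import math

def singlet_correlation(angle_a, angle_b):
    """Ideal singlet-state correlation for measurement angles a and b."""
    return -math.cos(angle_a - angle_b)

def chsh_value(a, a_prime, b, b_prime, correlation=singlet_correlation):
    """Standard CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return (correlation(a, b) - correlation(a, b_prime)
            + correlation(a_prime, b) + correlation(a_prime, b_prime))

# Angles that maximize |S| for the singlet state.
s = chsh_value(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
classical_bound = 2.0
tsirelson_bound = 2.0 * math.sqrt(2.0)
```

Here |s| reaches the Tsirelson bound of 2√2, whereas a loophole-free experiment with real losses typically lands only slightly above the classical bound of 2.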
19. Dynamic Neural Dowker Network: Approximating Persistent Homology in Dynamic Directed Graphs
- Author
- Li, Hao, Jiang, Hao, Fan, Jiajun, Ye, Dongsheng, and Du, Liang
- Subjects
- Computer Science - Machine Learning, Mathematics - Algebraic Topology
- Abstract
Persistent homology, a fundamental technique within Topological Data Analysis (TDA), captures structural and shape characteristics of graphs, yet encounters computational difficulties when applied to dynamic directed graphs. This paper introduces the Dynamic Neural Dowker Network (DNDN), a novel framework specifically designed to approximate the results of dynamic Dowker filtration, aiming to capture the high-order topological features of dynamic directed graphs. Our approach creatively uses line graph transformations to produce both source and sink line graphs, highlighting the shared neighbor structures that Dowker complexes focus on. The DNDN incorporates a Source-Sink Line Graph Neural Network (SSLGNN) layer to effectively capture the neighborhood relationships among dynamic edges. Additionally, we introduce an innovative duality edge fusion mechanism, ensuring that the results for both the sink and source line graphs adhere to the duality principle intrinsic to Dowker complexes. Our approach is validated through comprehensive experiments on real-world datasets, demonstrating DNDN's capability not only to effectively approximate dynamic Dowker filtration results but also to perform exceptionally in dynamic graph classification tasks., Comment: KDD 2024
- Published
- 2024
20. Cross-View Geolocalization and Disaster Mapping with Street-View and VHR Satellite Imagery: A Case Study of Hurricane IAN
- Author
- Li, Hao, Deuser, Fabian, Yina, Wenping, Luo, Xuanshu, Walther, Paul, Mai, Gengchen, Huang, Wei, and Werner, Martin
- Subjects
- Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Natural disasters play a key role in shaping human-urban infrastructure interactions. Effective and efficient response to natural disasters is essential for building resilience and a sustainable urban environment. Two types of information are usually the most necessary and difficult to gather in disaster response. The first is disaster damage perception, which shows how badly people think that urban infrastructure has been damaged. The second is geolocation awareness, which concerns how people's whereabouts are made available. In this paper, we propose a novel disaster mapping framework, namely CVDisaster, aiming at simultaneously addressing geolocalization and damage perception estimation using cross-view Street-View Imagery (SVI) and Very High-Resolution satellite imagery. CVDisaster consists of two cross-view models, where CVDisaster-Geoloc refers to a cross-view geolocalization model based on a contrastive learning objective with a Siamese ConvNeXt image encoder, and CVDisaster-Est is a cross-view classification model based on a Couple Global Context Vision Transformer (CGCViT). Taking Hurricane IAN as a case study, we evaluate the CVDisaster framework by creating a novel cross-view dataset (CVIAN) and conducting extensive experiments. As a result, we show that CVDisaster can achieve highly competitive performance (over 80% for geolocalization and 75% for damage perception estimation) with even limited fine-tuning efforts, which largely motivates future cross-view models and applications within a broader GeoAI research community. The data and code are publicly available at: https://github.com/tum-bgd/CVDisaster.
- Published
- 2024
21. Mapping the longitudinal magnetic field in the atmosphere of an active region plage from the inversion of the near-ultraviolet CLASP2.1 spectropolarimetric data
- Author
-
Li, Hao, Alemán, Tanausú del Pino, Bueno, Javier Trujillo, Ishikawa, Ryohko, Ballester, Ernest Alsina, McKenzie, David E., Belluzzi, Luca, Song, Donguk, Okamoto, Takenori J., Kobayashi, Ken, Rachmeler, Laurel A., Bethge, Christian, and Auchère, Frédéric
- Subjects
Astrophysics - Solar and Stellar Astrophysics - Abstract
We apply the HanleRT Tenerife Inversion Code to the spectro-polarimetric observations obtained by the Chromospheric LAyer SpectroPolarimeter. This suborbital space experiment measured the variation with wavelength of the four Stokes parameters in the near-ultraviolet spectral region of the Mg II h & k lines over a solar disk area containing part of an active region plage and the edge of a sunspot penumbra. We infer the stratification of the temperature, the electron density, the line-of-sight velocity, the micro-turbulent velocity, and the longitudinal component of the magnetic field from the observed intensity and circular polarization profiles. The inferred model atmosphere shows larger temperature and electron density in the plage and the superpenumbra regions than in the quiet regions. The shape of the plage region in terms of its brightness is similar to the pattern of the inferred longitudinal component of the magnetic field in the chromosphere, as well as to that of the overlying moss observed by AIA in the 171 Å band, which suggests a similar magnetic origin for the heating in both the plage and the moss region. Moreover, this heating is particularly significant in the regions with larger inferred magnetic flux. In contrast, in the superpenumbra, the regions with larger electron density and temperature are usually found in between these regions with larger magnetic flux, suggesting that the details of the heating mechanism in the chromosphere of the superpenumbra may be different to those in the plage, but with the magnetic field still playing a key role., Comment: Accepted for publication in the Astrophysical Journal
- Published
- 2024
22. Drone based superconducting single photon detection system with detection efficiency more than 90%
- Author
-
Ma, Ruoyan, Guo, Zhimin, Chen, Dai, Dai, Xiaojun, Xiao, You, Zhang, ChengJun, Xiong, Jiamin, Huang, Jia, Zhang, Xingyu, Liu, Xiaoyu, Rong, Liangliang, Li, Hao, Zhang, Xiaofu, and You, Lixing
- Subjects
Condensed Matter - Superconductivity ,Physics - Optics - Abstract
Bounded by the size, weight, and power consumption (SWaP) of conventional superconducting single photon detectors (SSPDs), applications of SSPDs have commonly been confined to the laboratory. However, booming demand for high-efficiency single photon detectors integrated with avionic platforms has arisen with the development of remote imaging and sensing and of long-haul quantum communication without topographical constraints. We herein designed and manufactured the first drone based SSPD system, with a system detection efficiency (SDE) as high as 91.8%. This drone based SSPD system is built from high performance NbTiN SSPDs, a self-developed miniature liquid helium dewar, and homemade integrated electric setups, and can be launched in complex topographical conditions. Such a drone based SSPD system may open the use of SSPDs for applications that demand high SDE in complex environments.
- Published
- 2024
23. ReToMe-VA: Recursive Token Merging for Video Diffusion-based Unrestricted Adversarial Attack
- Author
-
Gao, Ziyi, Chen, Kai, Wei, Zhipeng, Mou, Tingshu, Chen, Jingjing, Tan, Zhiyu, Li, Hao, and Jiang, Yu-Gang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Recent diffusion-based unrestricted attacks generate imperceptible adversarial examples with high transferability compared to previous unrestricted attacks and restricted attacks. However, existing works on diffusion-based unrestricted attacks are mostly focused on images yet are seldom explored in videos. In this paper, we propose the Recursive Token Merging for Video Diffusion-based Unrestricted Adversarial Attack (ReToMe-VA), which is the first framework to generate imperceptible adversarial video clips with higher transferability. Specifically, to achieve spatial imperceptibility, ReToMe-VA adopts a Timestep-wise Adversarial Latent Optimization (TALO) strategy that optimizes perturbations in diffusion models' latent space at each denoising step. TALO offers iterative and accurate updates to generate more powerful adversarial frames. TALO can further reduce memory consumption in gradient computation. Moreover, to achieve temporal imperceptibility, ReToMe-VA introduces a Recursive Token Merging (ReToMe) mechanism by matching and merging tokens across video frames in the self-attention module, resulting in temporally consistent adversarial videos. ReToMe concurrently facilitates inter-frame interactions into the attack process, inducing more diverse and robust gradients, thus leading to better adversarial transferability. Extensive experiments demonstrate the efficacy of ReToMe-VA, particularly in surpassing state-of-the-art attacks in adversarial transferability by more than 14.16% on average.
- Published
- 2024
24. FuXi Weather: An end-to-end machine learning weather data assimilation and forecasting system
- Author
-
Sun, Xiuyu, Zhong, Xiaohui, Xu, Xiaoze, Huang, Yuanqing, Li, Hao, Feng, Jie, Han, Wei, Wu, Libo, and Qi, Yuan
- Subjects
Computer Science - Machine Learning ,Physics - Atmospheric and Oceanic Physics - Abstract
Operational numerical weather prediction (NWP) systems consist of three fundamental components: the global observing system for data collection, data assimilation (DA) for generating initial conditions, and the forecasting model to predict future weather conditions. While NWP systems have undergone a quiet revolution, with forecast skills progressively improving over the past few decades, their advancement has slowed due to challenges such as high computational costs and the complexities associated with assimilating an increasing volume of observational data and managing finer spatial grids. Advances in machine learning offer an alternative path towards more efficient and accurate weather forecasts. The rise of machine learning based weather forecasting models has also spurred the development of machine learning based DA models or even purely machine learning based weather forecasting systems. This paper introduces FuXi Weather, an end-to-end machine learning based weather forecasting system. FuXi Weather employs specialized data preprocessing and multi-modal data fusion techniques to integrate information from diverse sources under all-sky conditions, including microwave sounders from 3 polar-orbiting satellites and radio occultation data from the Global Navigation Satellite System. Operating on a 6-hourly DA and forecasting cycle, FuXi Weather independently generates robust and accurate 10-day global weather forecasts at a spatial resolution of 0.25°. It surpasses the European Centre for Medium-range Weather Forecasts high-resolution forecasts in terms of predictability, extending the skillful forecast lead times for several key weather variables such as the geopotential height at 500 hPa from 9.25 days to 9.5 days. The system's high computational efficiency and robust performance, even with limited observations, demonstrate its potential as a promising alternative to traditional NWP systems., Comment: 34 pages, 4 figures
- Published
- 2024
25. PRISM Lite: A lightweight model for interactive 3D placenta segmentation in ultrasound
- Author
-
Li, Hao, Oguz, Baris, Arenas, Gabriel, Yao, Xing, Wang, Jiacheng, Pouch, Alison, Byram, Brett, Schwartz, Nadav, and Oguz, Ipek
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Placenta volume measured from 3D ultrasound (3DUS) images is an important tool for tracking the growth trajectory and is associated with pregnancy outcomes. Manual segmentation is the gold standard, but it is time-consuming and subjective. Although fully automated deep learning algorithms perform well, they do not always yield high-quality results for each case. Interactive segmentation models could address this issue. However, there is limited work on interactive segmentation models for the placenta. Despite their segmentation accuracy, these methods may not be feasible for clinical use as they require relatively large computational power which may be especially prohibitive in low-resource environments, or on mobile devices. In this paper, we propose a lightweight interactive segmentation model aiming for clinical use to interactively segment the placenta from 3DUS images in real-time. The proposed model adopts the segmentation from our fully automated model for initialization and is designed in a human-in-the-loop manner to achieve iterative improvements. The Dice score and normalized surface Dice are used as evaluation metrics. The results show that our model can achieve superior performance in segmentation compared to state-of-the-art models while using significantly fewer parameters. Additionally, the proposed model is much faster for inference and robust to poor initial masks. The code is available at https://github.com/MedICL-VU/PRISM-placenta.
- Published
- 2024
26. Deep Learning-based Unsupervised Domain Adaptation via a Unified Model for Prostate Lesion Detection Using Multisite Bi-parametric MRI Datasets
- Author
-
Li, Hao, Liu, Han, von Busch, Heinrich, Grimm, Robert, Huisman, Henkjan, Tong, Angela, Winkel, David, Penzkofer, Tobias, Shabunin, Ivan, Choi, Moon Hyung, Yang, Qingsong, Szolar, Dieter, Shea, Steven, Coakley, Fergus, Harisinghani, Mukesh, Oguz, Ipek, Comaniciu, Dorin, Kamen, Ali, and Lou, Bin
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Our hypothesis is that unsupervised domain adaptation (UDA) using diffusion-weighted images, generated with a unified model, offers a promising and reliable strategy for enhancing the performance of supervised learning (SL) models in multi-site prostate lesion detection, especially when various b-values are present. This retrospective study included data from 5,150 patients (14,191 samples) collected across nine different imaging centers. A novel UDA method using a unified generative model was developed for multi-site prostate cancer (PCa) detection. This method translates diffusion-weighted imaging (DWI) acquisitions, including apparent diffusion coefficient (ADC) and individual DW images acquired using various b-values, to align with the style of images acquired using b-values recommended by Prostate Imaging Reporting and Data System (PI-RADS) guidelines. The generated ADC and DW images replace the original images for PCa detection. An independent set of 1,692 test cases (2,393 samples) was used for evaluation. The area under the receiver operating characteristic curve (AUC) was used as the primary metric, and statistical analysis was performed via bootstrapping. For all test cases, the AUC values for baseline SL and UDA methods were 0.73 and 0.79 (p<.001), respectively, for PI-RADS>=3, and 0.77 and 0.80 (p<.001) for PI-RADS>=4 PCa lesions. In the 361 test cases under the most unfavorable image acquisition setting, the AUC values for baseline SL and UDA were 0.49 and 0.76 (p<.001) for PI-RADS>=3, and 0.50 and 0.77 (p<.001) for PI-RADS>=4 PCa lesions. The results indicate that the proposed UDA with generated images improved the performance of SL methods in multi-site PCa lesion detection across datasets with various b-values, especially for images acquired with significant deviations from the PI-RADS recommended DWI protocol (e.g. with an extremely high b-value)., Comment: Accepted at Radiology: Artificial Intelligence. Journal reference and external DOI will be added once published
- Published
- 2024
27. Microwave-optics entanglement via coupled opto- and magnomechanical microspheres
- Author
-
Li, Hao-Tian, Fan, Zhi-Yuan, Zhu, Huai-Bing, Gröblacher, Simon, and Li, Jie
- Subjects
Quantum Physics ,Condensed Matter - Mesoscale and Nanoscale Physics ,Physics - Applied Physics ,Physics - Optics - Abstract
Microwave-optics entanglement plays a crucial role in building hybrid quantum networks with quantum nodes working in the microwave and optical frequency bands. However, there are limited efficient ways to produce such entanglement due to the large frequency mismatch between the two regimes. Here, we present a new mechanism to prepare microwave-optics entanglement based on a hybrid system of two coupled opto- and magnomechanical microspheres, i.e., a YIG sphere and a silica sphere. The YIG sphere holds a magnon mode and a vibration mode induced by magnetostriction, while the silica sphere supports an optical whispering-gallery mode and a mechanical mode coupled via an optomechanical interaction. The two mechanical modes are close in frequency and directly coupled via physical contact of the two microspheres. We show that by simultaneously activating the magnomechanical (optomechanical) Stokes (anti-Stokes) scattering, stationary entanglement can be established between the magnon and optical modes via mechanics-mechanics coupling. This leads to stationary microwave-optics entanglement by further coupling the YIG sphere to a microwave cavity and utilizing the magnon-microwave state swapping. Our protocol is within reach of current technology and may become a promising new approach for preparing microwave-optics entanglement, which finds unique applications in hybrid quantum networks and quantum information processing with hybrid quantum systems.
- Published
- 2024
28. VidGen-1M: A Large-Scale Dataset for Text-to-video Generation
- Author
-
Tan, Zhiyu, Yang, Xiaomeng, Qin, Luozheng, and Li, Hao
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
The quality of video-text pairs fundamentally determines the upper bound of text-to-video models. Currently, the datasets used for training these models suffer from significant shortcomings, including low temporal consistency, poor-quality captions, substandard video quality, and imbalanced data distribution. The prevailing video curation process, which depends on image models for tagging and manual rule-based curation, leads to a high computational load and leaves behind unclean data. As a result, there is a lack of appropriate training datasets for text-to-video models. To address this problem, we present VidGen-1M, a superior training dataset for text-to-video models. Produced through a coarse-to-fine curation strategy, this dataset guarantees high-quality videos and detailed captions with excellent temporal consistency. When used to train the video generation model, this dataset has led to experimental results that surpass those obtained with other models., Comment: project page: https://sais-fuxi.github.io/projects/vidgen-1m
- Published
- 2024
29. Model Hijacking Attack in Federated Learning
- Author
-
Li, Zheng, Wu, Siyuan, Chen, Ruichuan, Aditya, Paarijaat, Akkus, Istemi Ekin, Vanga, Manohar, Zhang, Min, Li, Hao, and Zhang, Yang
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
Machine learning (ML), driven by prominent paradigms such as centralized and federated learning, has made significant progress in various critical applications ranging from autonomous driving to face recognition. However, its remarkable success has been accompanied by various attacks. Recently, the model hijacking attack has shown that ML models can be hijacked to execute tasks different from their original tasks, which increases both accountability and parasitic computational risks. Nevertheless, thus far, this attack has only focused on centralized learning. In this work, we broaden the scope of this attack to the federated learning domain, where multiple clients collaboratively train a global model without sharing their data. Specifically, we present HijackFL, the first-of-its-kind hijacking attack against the global model in federated learning. The adversary aims to force the global model to perform a different task (called hijacking task) from its original task without the server or benign client noticing. To accomplish this, unlike existing methods that use data poisoning to modify the target model's parameters, HijackFL searches for pixel-level perturbations based on their local model (without modifications) to align hijacking samples with the original ones in the feature space. When performing the hijacking task, the adversary applies these cloaks to the hijacking samples, compelling the global model to identify them as original samples and predict them accordingly. We conduct extensive experiments on four benchmark datasets and three popular models. Empirical results demonstrate that its attack performance outperforms baselines. We further investigate the factors that affect its performance and discuss possible defenses to mitigate its impact.
- Published
- 2024
30. Construction of various time-dependent Hamiltonians on a single photonic chip
- Author
-
Ye, Rui, Li, Guangzhen, Wan, Shuai, Xue, Xiaotian, Wang, Piyu, Qiao, Xin, Li, Hao, Liu, Shijie, Wang, Jiayu, Ma, Rui, Bo, Fang, Zheng, Yuanlin, Dong, Chunhua, Yuan, Luqi, and Chen, Xianfeng
- Subjects
Physics - Optics - Abstract
Integrated photonics provides an important platform for simulating physical models with high-performance chip-scale devices, where the lattice size and the time-dependence of a model are key ingredients for further enriching the functionality of a photonic chip. Here, we propose and demonstrate the construction of various time-dependent Hamiltonian models using a single microresonator on a thin-film lithium niobate chip. Such an integrated microresonator has a quality factor of up to 10^6 and supports the construction of a synthetic frequency lattice with up to 152 effective lattice sites under electro-optic modulation. By further applying a bichromatic modulation composed of two radio-frequency signals oppositely detuned from the resonant frequency in the microresonator, we build different time-dependent Hamiltonians with time-varying nearest-neighbor coupling strength in the synthetic frequency lattice. We measure the temporal features by capturing the dynamic band structures of the lattice and demonstrate a variety of time-dependent synthetic lattice models by engineering the driving pattern of the modulation, highlighting the great flexibility of the microresonator. Our work shows a photonic chip for simulating versatile time-dependent Hamiltonians, which pushes forward quantum simulations in integrated photonics with great experimental tunability and reconfigurability., Comment: 14 pages, 5 figures
- Published
- 2024
31. Retinal IPA: Iterative KeyPoints Alignment for Multimodal Retinal Imaging
- Author
-
Wang, Jiacheng, Li, Hao, Hu, Dewei, Xu, Rui, Yao, Xing, Tao, Yuankai K., and Oguz, Ipek
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning - Abstract
We propose a novel framework for retinal feature point alignment, designed for learning cross-modality features to enhance matching and registration across multi-modality retinal images. Our model draws on the success of previous learning-based feature detection and description methods. To better leverage unlabeled data and constrain the model to reproduce relevant keypoints, we integrate a keypoint-based segmentation task. It is trained in a self-supervised manner by enforcing segmentation consistency between different augmentations of the same image. By incorporating a keypoint augmented self-supervised layer, we achieve robust feature extraction across modalities. Extensive evaluation on two public datasets and one in-house dataset demonstrates significant improvements in performance for modality-agnostic retinal feature alignment. Our code and model weights are publicly available at https://github.com/MedICL-VU/RetinaIPA.
- Published
- 2024
32. An Error Discovery and Correction for the Family of V-Shaped BPSO Algorithms
- Author
-
Zhao, Qing, Zhang, Chengkui, Li, Hao, and Ke, Ting
- Subjects
Computer Science - Neural and Evolutionary Computing - Abstract
The binary particle swarm optimization (BPSO) algorithm is a swarm intelligence optimization algorithm that is effective, efficient, and easy to implement. In recent years, it has been used to optimize a variety of machine learning and deep learning models, such as CNN, LSTM, and SVM. However, it easily falls into local optima for lack of exploitation ability. Unlike previous studies, this article finds that the reason for the poor performance is an error in the velocity update function of V-shaped BPSO algorithms, which leads to abnormal and chaotic behavior of the particles. This error not only makes the algorithms difficult to converge but also causes them to search the same space repeatedly. Traditionally, these algorithms have therefore had to rely on a low w value in the later stage to force convergence, which also makes them quickly lose their search ability and become prone to getting trapped in local optima. This article proposes a velocity legacy term correction method for all V-shaped BPSOs. Experiments on 0/1 knapsack problems show that it has a significant effect on accuracy and efficiency for all 4 commonly used V-shaped BPSOs. Therefore it is a significant breakthrough in the field of swarm intelligence., Comment: 25 pages, 11 figures
- Published
- 2024
33. A quantum analog of Huygen's clock: noise-induced synchronization
- Author
-
Tyagi, Bhavay, Li, Hao, Bittner, Eric R., Piryatinski, Andrei, and Silva-Acuna, Carlos
- Subjects
Quantum Physics - Abstract
We propose a quantum analogue of the Huygens clock, in which the phases of two spins achieve synchronization through their interaction with a shared environment. The environment functions analogously to the escapement mechanism in a mechanical clock, regulating the gear train and permitting the advancement of timing in discrete intervals. In our proposed model, the relative phase of the two spins becomes synchronized through interaction with a mutual, correlated environment. We show that for a system of qubits, several arguments can be made that significantly reduce the cardinality of the set of allowed measurements and, hence, the complexity of the problem. We present a numerically efficient method to calculate the degree of quantumness that exists in the correlations of our final density matrix. This method also provides a tight upper bound for when the system is described by rank-3 and rank-4 density matrices.
- Published
- 2024
34. SeqMIA: Sequential-Metric Based Membership Inference Attack
- Author
-
Li, Hao, Li, Zheng, Wu, Siyuan, Hu, Chengrui, Ye, Yutong, Zhang, Min, Feng, Dengguo, and Zhang, Yang
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
Most existing membership inference attacks (MIAs) utilize metrics (e.g., loss) calculated on the model's final state, while recent advanced attacks leverage metrics computed at various stages, including both intermediate and final stages, throughout the model training. Nevertheless, these attacks often process multiple intermediate states of the metric independently, ignoring their time-dependent patterns. Consequently, they struggle to effectively distinguish between members and non-members who exhibit similar metric values, resulting in particular in a high false-positive rate. In this study, we delve deeper into the new membership signals in the black-box scenario. We identify a new, more integrated membership signal: the Pattern of Metric Sequence, derived from the various stages of model training. We contend that current signals provide only partial perspectives of this new signal: the new one encompasses both the model's multiple intermediate and final states, with a greater emphasis on temporal patterns among them. Building upon this signal, we introduce a novel attack method called Sequential-metric based Membership Inference Attack (SeqMIA). Specifically, we utilize knowledge distillation to obtain a set of distilled models representing various stages of the target model's training. We then assess multiple metrics on these distilled models in chronological order, creating a distilled metric sequence. We finally integrate distilled multi-metric sequences as a sequential multiformat and employ an attention-based RNN attack model for inference. Empirical results show that SeqMIA outperforms all baselines and, in particular, can achieve an order of magnitude improvement in terms of TPR @ 0.1% FPR. Furthermore, we delve into the reasons why this signal contributes to SeqMIA's high attack performance, and assess various defense mechanisms against SeqMIA., Comment: Accepted by ACM CCS 2024
- Published
- 2024
35. Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
- Author
-
Jin, Peng, Li, Hao, Cheng, Zesen, Li, Kehan, Yu, Runyi, Liu, Chang, Ji, Xiangyang, Yuan, Li, and Chen, Jie
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Text-to-motion generation requires not only grounding local actions in language but also seamlessly blending these individual actions to synthesize diverse and realistic global motions. However, existing motion generation methods primarily focus on the direct synthesis of global motions while neglecting the importance of generating and controlling local actions. In this paper, we propose the local action-guided motion diffusion model, which facilitates global motion generation by utilizing local actions as fine-grained control signals. Specifically, we provide an automated method for reference local action sampling and leverage graph attention networks to assess the guiding weight of each local action in the overall motion synthesis. During the diffusion process for synthesizing global motion, we calculate the local-action gradient to provide conditional guidance. This local-to-global paradigm reduces the complexity associated with direct global motion generation and promotes motion diversity via sampling diverse actions as conditions. Extensive experiments on two human motion datasets, i.e., HumanML3D and KIT, demonstrate the effectiveness of our method. Furthermore, our method provides flexibility in seamlessly combining various local actions and continuous guiding weight adjustment, accommodating diverse user preferences, which may hold potential significance for the community. The project page is available at https://jpthu17.github.io/GuidedMotion-project/., Comment: Accepted by ECCV 2024
- Published
- 2024
36. Variational Quantum Imaginary Time Evolution for Matrix Product State Ansatz with Tests on Transcorrelated Hamiltonians
- Author
-
Li, Hao-En, Li, Xiang, Huang, Jia-Cheng, Zhang, Guang-Ze, Shen, Zhu-Ping, Zhao, Chen, Li, Jun, and Hu, Han-Shi
- Subjects
Quantum Physics ,Physics - Chemical Physics - Abstract
The matrix product state (MPS) ansatz offers a promising approach for finding the ground state of molecular Hamiltonians and solving quantum chemistry problems. Building on this concept, the proposed technique of quantum circuit MPS (QCMPS) enables the simulation of chemical systems using a relatively small number of qubits. In this study, we enhance the optimization performance of the QCMPS ansatz by employing the variational quantum imaginary time evolution (VarQITE) approach. Guided by McLachlan's variational principle, the VarQITE method provides analytical metrics and gradients, resulting in improved convergence efficiency and robustness of the QCMPS. We validate these improvements numerically through simulations of $\rm H_2$, $\rm H_4$, and $\rm LiH$ molecules. Additionally, given that VarQITE is applicable to non-Hermitian Hamiltonians, we evaluate its effectiveness in preparing the ground state of transcorrelated (TC) Hamiltonians. This approach yields energy estimates comparable to the complete basis set (CBS) limit while using even fewer qubits. Specifically, we perform simulations of the beryllium atom and $\rm LiH$ molecule using only three qubits, maintaining high fidelity with the CBS ground state energy of these systems. This qubit reduction is achieved through the combined advantages of both the QCMPS ansatz and transcorrelation. Our findings demonstrate the potential practicality of this quantum chemistry algorithm on near-term quantum devices., Comment: 15 pages, 8 figures
- Published
- 2024
37. Establishing Rigorous and Cost-effective Clinical Trials for Artificial Intelligence Models
- Author
-
Gao, Wanling, Huang, Yunyou, Cui, Dandan, Yu, Zhuoming, Liu, Wenjing, Liang, Xiaoshuang, Zhao, Jiahui, Xie, Jiyue, Li, Hao, Ma, Li, Ye, Ning, Kang, Yumiao, Luo, Dingfeng, Pan, Peng, Huang, Wei, Liu, Zhongmou, Hu, Jizhong, Zhao, Gangyuan, Jiang, Chongrong, Huang, Fan, Wei, Tianyi, Tang, Suqin, Xia, Bingjie, Zhang, Zhifei, and Zhan, Jianfeng
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Human-Computer Interaction - Abstract
A profound gap persists between artificial intelligence (AI) and clinical practice in medicine, primarily due to the lack of rigorous and cost-effective evaluation methodologies. State-of-the-art and state-of-the-practice AI model evaluations are limited to laboratory studies on medical datasets or direct clinical trials with no or solely patient-centered controls. Moreover, the crucial role of clinicians in collaborating with AI, pivotal for determining its impact on clinical practice, is often overlooked. For the first time, we emphasize the critical necessity for rigorous and cost-effective evaluation methodologies for AI models in clinical practice, featuring patient/clinician-centered (dual-centered) AI randomized controlled trials (DC-AI RCTs) and virtual clinician-based in-silico trials (VC-MedAI) as an effective proxy for DC-AI RCTs. Leveraging 7500 diagnosis records from two-step inaugural DC-AI RCTs across 14 medical centers with 125 clinicians, our results demonstrate the necessity of DC-AI RCTs and the effectiveness of VC-MedAI. Notably, VC-MedAI performs comparably to human clinicians, replicating insights and conclusions from prospective DC-AI RCTs. We envision DC-AI RCTs and VC-MedAI as pivotal advancements, presenting innovative and transformative evaluation methodologies for AI models in clinical practice, offering a preclinical-like setting mirroring conventional medicine, and reshaping development paradigms in a cost-effective and fast-iterative manner. Chinese Clinical Trial Registration: ChiCTR2400086816., Comment: 24 pages
- Published
- 2024
38. Magnon squeezing via reservoir-engineered optomagnomechanics
- Author
-
Fan, Zhi-Yuan, Zhu, Huai-Bing, Li, Hao-Tian, and Li, Jie
- Subjects
Quantum Physics ,Condensed Matter - Mesoscale and Nanoscale Physics ,Physics - Optics - Abstract
We show how to prepare magnonic squeezed states in an optomagnomechanical system, in which magnetostriction induced mechanical displacement couples to an optical cavity via radiation pressure. We discuss two scenarios depending on whether the magnomechanical coupling is linear or dispersive. We show that in both cases the strong mechanical squeezing obtained via two-tone driving of the optical cavity can be efficiently transferred to the magnon mode. In the linear coupling case, stationary magnon squeezing is achieved; while in the dispersive coupling case, a transient magnonic squeezed state is prepared in a two-step protocol. The proposed magnonic squeezed states find promising applications in quantum information processing and quantum sensing using magnons., Comment: Invited contribution to the Special Topic on "Brillouin Scattering and Optomechanics" in APL Photonics
- Published
- 2024
39. Interactive Segmentation Model for Placenta Segmentation from 3D Ultrasound images
- Author
-
Li, Hao, Oguz, Baris, Arenas, Gabriel, Yao, Xing, Wang, Jiacheng, Pouch, Alison, Byram, Brett, Schwartz, Nadav, and Oguz, Ipek
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Placenta volume measurement from 3D ultrasound images is critical for predicting pregnancy outcomes, and manual annotation is the gold standard. However, such manual annotation is expensive and time-consuming. Automated segmentation algorithms can often successfully segment the placenta, but these methods may not consistently produce robust segmentations suitable for practical use. Recently, inspired by the Segment Anything Model (SAM), deep learning-based interactive segmentation models have been widely applied in the medical imaging domain. These models produce a segmentation from visual prompts provided to indicate the target region, which may offer a feasible solution for practical use. However, none of these models are specifically designed for interactively segmenting 3D ultrasound images, which remain challenging due to the inherent noise of this modality. In this paper, we evaluate publicly available state-of-the-art 3D interactive segmentation models in contrast to a human-in-the-loop approach for the placenta segmentation task. The Dice score, normalized surface Dice, averaged symmetric surface distance, and 95-percent Hausdorff distance are used as evaluation metrics. We consider a Dice score of 0.95 a successful segmentation. Our results indicate that the human-in-the-loop segmentation model reaches this standard. Moreover, we assess the efficiency of the human-in-the-loop model as a function of the number of prompts. Our results demonstrate that the human-in-the-loop model is both effective and efficient for interactive placenta segmentation. The code is available at https://github.com/MedICL-VU/PRISM-placenta.
- Published
- 2024
40. Exploring the Causality of End-to-End Autonomous Driving
- Author
-
Li, Jiankun, Li, Hao, Liu, Jiangjiang, Zou, Zhikang, Ye, Xiaoqing, Wang, Fan, Huang, Jizhou, Wu, Hua, and Wang, Haifeng
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
- Abstract
Deep learning-based models are widely deployed in autonomous driving, especially the increasingly prominent end-to-end solutions. However, the black-box property of these models raises concerns about their trustworthiness and safety for autonomous driving, and how to debug the causality has become a pressing concern. Despite some existing research on the explainability of autonomous driving, there is currently no systematic solution to help researchers debug and identify the key factors that lead to the final predicted action of end-to-end autonomous driving. In this work, we propose a comprehensive approach to explore and analyze the causality of end-to-end autonomous driving. First, we validate the essential information that the final planning depends on by using controlled variables and counterfactual interventions for qualitative analysis. Then, we quantitatively assess the factors influencing model decisions by visualizing and statistically analyzing the response of key model inputs. Finally, based on the comprehensive study of the multi-factorial end-to-end autonomous driving system, we have developed a strong baseline and a tool for exploring causality in the closed-loop simulator CARLA. It leverages the essential input sources to obtain a well-designed model, resulting in highly competitive capabilities. As far as we know, our work is the first to unveil the mystery of end-to-end autonomous driving and turn the black box into a white one. Thorough closed-loop experiments demonstrate that our method can be applied to end-to-end autonomous driving solutions for causality debugging. Code will be available at https://github.com/bdvisl/DriveInsight.
- Published
- 2024
41. Adaptively Robust and Sparse K-means Clustering
- Author
-
Li, Hao, Sugasawa, Shonosuke, and Katayama, Shota
- Subjects
Statistics - Computation, Statistics - Machine Learning
- Abstract
While K-means is known to be a standard clustering algorithm, it may be compromised due to the presence of outliers and high-dimensional noisy variables. This paper proposes adaptively robust and sparse K-means clustering (ARSK) to address these practical limitations of the standard K-means algorithm. We introduce a redundant error component for each observation for robustness, and this additional parameter is penalized using a group sparse penalty. To accommodate the impact of high-dimensional noisy variables, the objective function is modified by incorporating weights and implementing a penalty to control the sparsity of the weight vector. The tuning parameters to control the robustness and sparsity are selected by Gap statistics. Through simulation experiments and real data analysis, we demonstrate the superiority of the proposed method to existing algorithms in identifying clusters without outliers and informative variables simultaneously.
- Published
- 2024
42. Studying the Impact of TensorFlow and PyTorch Bindings on Machine Learning Software Quality
- Author
-
Li, Hao, Rajbahadur, Gopi Krishnan, and Bezemer, Cor-Paul
- Subjects
Computer Science - Software Engineering, Computer Science - Artificial Intelligence
- Abstract
Bindings for machine learning frameworks (such as TensorFlow and PyTorch) allow developers to integrate a framework's functionality using a programming language different from the framework's default language (usually Python). In this paper, we study the impact of using TensorFlow and PyTorch bindings in C#, Rust, Python and JavaScript on the software quality in terms of correctness (training and test accuracy) and time cost (training and inference time) when training and performing inference on five widely used deep learning models. Our experiments show that a model can be trained in one binding and used for inference in another binding for the same framework without losing accuracy. Our study is the first to show that using a non-default binding can help improve machine learning software quality from the time cost perspective compared to the default Python binding while still achieving the same level of correctness.
- Published
- 2024
43. High-order Uncertain Differential Equation and Its Application to Nuclear Reactors
- Author
-
Li, Hao and Wang, Yuqian
- Subjects
Mathematics - Analysis of PDEs, 34F05
- Abstract
The high-order uncertain differential equation (HUDE) was introduced in the literature, but the present method for solving a HUDE is incorrect. In this paper, we rigorously prove some comparison theorems for high-order differential equations and present a method to solve a family of HUDEs, including parameter estimation and hypothesis testing. An application to nuclear reactor kinetics is then given to illustrate the method. Comment: 24 pages, 6 figures
- Published
- 2024
44. Enhanced Second-Harmonic Generation in Thin-Film Lithium Niobate Circular Bragg Nanocavity
- Author
-
Li, Zengya, Hu, Zhuoran, Ye, Xiaona, Mao, Zhengyang, Feng, Juan, Li, Hao, Liu, Shijie, Wang, Bo, Zheng, Yuanlin, and Chen, Xianfeng
- Subjects
Physics - Optics
- Abstract
Second-order nonlinearity gives rise to many distinctive physical phenomena, e.g., second-harmonic generation (SHG), which plays an important role in fundamental science and various applications. Lithium niobate, one of the most widely used nonlinear crystals, exhibits strong second-order nonlinear effects and electro-optic properties. However, its moderate refractive index and etching sidewall angle limit its capability in confining light at the nanoscale, restricting its application in nanophotonics. Here, we exploit nanocavities formed by second-order circular Bragg gratings (CBGs), which support resonant anapole modes to achieve highly enhanced SHG in thin-film lithium niobate (TFLN). The CBG nanocavity exhibits a record-high normalized conversion efficiency of $1.21\times10^{-2}\mathrm{cm^2/GW}$ under the pump intensity of $1.9$ $\mathrm{MW/cm^2}$. An SHG enhancement of $42,000$ is realized compared to TFLN. Besides, we also show s- and p-polarization independent SHG in elliptical Bragg nanocavities. This work could inspire studying nonlinear optics at the nanoscale on TFLN as well as other novel photonic platforms. Comment: 19 pages, 5 figures
- Published
- 2024
45. A machine learning model that outperforms conventional global subseasonal forecast models.
- Author
-
Chen, Lei, Zhong, Xiaohui, Li, Hao, Wu, Jie, Lu, Bo, Chen, Deliang, Xie, Shang-Ping, Wu, Libo, Chao, Qingchen, Lin, Chensen, Hu, Zixin, and Qi, Yuan
- Abstract
Skillful subseasonal forecasts are crucial for various sectors of society but pose a grand scientific challenge. Recently, machine learning-based weather forecasting models have outperformed the most successful numerical weather predictions generated by the European Centre for Medium-Range Weather Forecasts (ECMWF), but have not yet surpassed conventional models at subseasonal timescales. This paper introduces FuXi Subseasonal-to-Seasonal (FuXi-S2S), a machine learning model that provides global daily mean forecasts up to 42 days, encompassing five upper-air atmospheric variables at 13 pressure levels and 11 surface variables. FuXi-S2S, trained on 72 years of daily statistics from ECMWF ERA5 reanalysis data, outperforms the ECMWF's state-of-the-art Subseasonal-to-Seasonal model in ensemble mean and ensemble forecasts for total precipitation and outgoing longwave radiation, notably enhancing global precipitation forecast. The improved performance of FuXi-S2S can be primarily attributed to its superior capability to capture forecast uncertainty and accurately predict the Madden-Julian Oscillation (MJO), extending the skillful MJO prediction from 30 days to 36 days. Moreover, FuXi-S2S not only captures realistic teleconnections associated with the MJO but also emerges as a valuable tool for discovering precursor signals, offering researchers insights and potentially establishing a new paradigm in Earth system science research.
- Published
- 2024
46. Permittivity tensor imaging: modular label-free imaging of 3D dry mass and 3D orientation at high resolution.
- Author
-
Yeh, Li-Hao, Ivanov, Ivan, Chandler, Talon, Byrum, Janie, Chhun, Bryant, Guo, Syuan-Ming, Foltz, Cameron, Hashemi, Ezzat, Perez-Bermejo, Juan, Wang, Huijun, Yu, Yanhao, Kazansky, Peter, Conklin, Bruce, Han, May, and Mehta, Shalin
- Subjects
Imaging, Three-Dimensional, Animals, Mice, Algorithms, Brain, Microscopy, Software, Humans, Image Processing, Computer-Assisted
- Abstract
The dry mass and the orientation of biomolecules can be imaged without a label by measuring their permittivity tensor (PT), which describes how biomolecules affect the phase and polarization of light. Three-dimensional (3D) imaging of PT has been challenging. We present a label-free computational microscopy technique, PT imaging (PTI), for the 3D measurement of PT. PTI encodes the invisible PT into images using oblique illumination, polarization-sensitive detection and volumetric sampling. PT is decoded from the data with a vectorial imaging model and a multi-channel inverse algorithm, assuming uniaxial symmetry in each voxel. We demonstrate high-resolution imaging of PT of isotropic beads, anisotropic glass targets, mouse brain tissue, infected cells and histology slides. PTI outperforms previous label-free imaging techniques such as vector tomography, ptychography and light-field imaging in resolving the 3D orientation and symmetry of organelles, cells and tissue. We provide open-source software and modular hardware to enable the adoption of the method.
- Published
- 2024
47. HATs: Hierarchical Adaptive Taxonomy Segmentation for Panoramic Pathology Image Analysis
- Author
-
Deng, Ruining, Liu, Quan, Cui, Can, Yao, Tianyuan, Xiong, Juming, Bao, Shunxing, Li, Hao, Yin, Mengmeng, Wang, Yu, Zhao, Shilin, Tang, Yucheng, Yang, Haichun, and Huo, Yuankai
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Panoramic image segmentation in computational pathology presents a remarkable challenge due to the morphologically complex and variably scaled anatomy. For instance, the intricate organization in kidney pathology spans multiple layers, from regions like the cortex and medulla to functional units such as glomeruli, tubules, and vessels, down to various cell types. In this paper, we propose a novel Hierarchical Adaptive Taxonomy Segmentation (HATs) method, which is designed to thoroughly segment panoramic views of kidney structures by leveraging detailed anatomical insights. Our approach entails (1) the innovative HATs technique, which translates spatial relationships among 15 distinct object classes into a versatile "plug-and-play" loss function that spans regions, functional units, and cells, (2) the incorporation of anatomical hierarchies and scale considerations into a unified simple matrix representation for all panoramic entities, and (3) the adoption of the latest AI foundation model (EfficientSAM) as a feature extraction tool to boost the model's adaptability while eliminating the need for the manual prompt generation required by the conventional Segment Anything Model (SAM). Experimental findings demonstrate that the HATs method offers an efficient and effective strategy for integrating clinical insights and imaging precedents into a unified segmentation model across more than 15 categories. The official implementation is publicly available at https://github.com/hrlblab/HATs. Comment: arXiv admin note: text overlap with arXiv:2402.19286
- Published
- 2024
48. SPIRONet: Spatial-Frequency Learning and Topological Channel Interaction Network for Vessel Segmentation
- Author
-
Huang, De-Xing, Zhou, Xiao-Hu, Xie, Xiao-Liang, Liu, Shi-Qi, Wang, Shuang-Yi, Feng, Zhen-Qiu, Gui, Mei-Jiang, Li, Hao, Xiang, Tian-Yu, Yao, Bo-Xian, and Hou, Zeng-Guang
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Automatic vessel segmentation is paramount for developing next-generation interventional navigation systems. However, current approaches suffer from suboptimal segmentation performance due to significant challenges in intraoperative images (i.e., low signal-to-noise ratio, small or slender vessels, and strong interference). In this paper, a novel spatial-frequency learning and topological channel interaction network (SPIRONet) is proposed to address the above issues. Specifically, dual encoders are utilized to comprehensively capture local spatial and global frequency vessel features. Then, a cross-attention fusion module is introduced to effectively fuse spatial and frequency features, thereby enhancing feature discriminability. Furthermore, a topological channel interaction module is designed to filter out task-irrelevant responses based on graph neural networks. Extensive experimental results on several challenging datasets (CADSA, CAXF, DCA1, and XCAD) demonstrate the state-of-the-art performance of our method. Moreover, the inference speed of SPIRONet is 21 FPS with a 512x512 input size, surpassing clinical real-time requirements (6~12 FPS). These promising outcomes indicate SPIRONet's potential for integration into vascular interventional navigation systems. Code is available at https://github.com/Dxhuang-CASIA/SPIRONet.
- Published
- 2024
49. CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI
- Author
-
Wang, Zi, Wang, Fanwen, Qin, Chen, Lyu, Jun, Cheng, Ouyang, Wang, Shuo, Li, Yan, Yu, Mengyao, Zhang, Haoyu, Guo, Kunyuan, Shi, Zhang, Li, Qirong, Xu, Ziqiang, Zhang, Yajing, Li, Hao, Hua, Sha, Chen, Binghua, Sun, Longyu, Sun, Mengting, Li, Qin, Chu, Ying-Hua, Bai, Wenjia, Qin, Jing, Zhuang, Xiahai, Prieto, Claudia, Young, Alistair, Markl, Michael, Wang, He, Wu, Lianming, Yang, Guang, Qu, Xiaobo, and Wang, Chengyan
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Databases
- Abstract
Cardiac magnetic resonance imaging (MRI) has emerged as a clinical gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is expected to achieve time-efficient and patient-friendly imaging, and advanced image reconstruction approaches are therefore required to recover high-quality, clinically interpretable images from undersampled measurements. However, the lack of publicly available cardiac MRI k-space datasets in terms of both quantity and diversity has severely hindered substantial technological progress, particularly for data-driven artificial intelligence. Here, we provide a standardized, diverse, and high-quality CMRxRecon2024 dataset to facilitate the technical development, fair evaluation, and clinical transfer of cardiac MRI reconstruction approaches, towards promoting universal frameworks that enable fast and robust reconstructions across different cardiac MRI protocols in clinical practice. To the best of our knowledge, the CMRxRecon2024 dataset is the largest and most diverse publicly available cardiac k-space dataset. It is acquired from 330 healthy volunteers, covering commonly used modalities, anatomical views, and acquisition trajectories in clinical cardiac MRI workflows. In addition, an open platform with tutorials, benchmarks, and data processing tools is provided to facilitate data usage, advanced method development, and fair performance evaluation. Comment: 19 pages, 3 figures, 2 tables
- Published
- 2024
50. Radiative decay and axial-vector decay behaviors of octet pentaquark states
- Author
-
Lei, Ya-Ding and Li, Hao-Song
- Subjects
High Energy Physics - Phenomenology
- Abstract
In this work, we systematically calculate transition magnetic moments, radiative decay widths, and axial-vector coupling constants of octet hidden-charm molecular pentaquark states with different flavor representations in the constituent quark model. We discuss the relations between transition magnetic moments and decay widths for pentaquark states. For octet pentaquark states with the $8_{1f}$ and $8_{2f}$ flavor representations, decay widths of the processes $P_{\psi}|\frac{3}{2}^-\rangle_{(\frac{1}{2}^+\otimes1^-)}\to P_{\psi}|\frac{1}{2}^-\rangle_{(\frac{1}{2}^+\otimes0^-)}\gamma$ and $P_{\psi}|\frac{1}{2}^-\rangle_{(\frac{1}{2}^+\otimes1^-)}\to P_{\psi}|\frac{1}{2}^-\rangle_{(\frac{1}{2}^+\otimes0^-)}\gamma$ are quite close, decay widths of the $P_{\psi}|\frac{3}{2}^-\rangle_{(\frac{1}{2}^+\otimes1^-)}\to P_{\psi}|\frac{1}{2}^-\rangle_{(\frac{1}{2}^+\otimes1^-)}\gamma$ process are close to zero, and we notice that the axial-vector coupling constants of the pentaquark states are generally smaller than that of the nucleon. Comment: 18 pages, 0 figures
- Published
- 2024