Author: "Yu, Han" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Yu, Han"' showing total 24,527 results

1. Double Machine Learning for Adaptive Causal Representation in High-Dimensional Data

Author: Aouar, Lynda and Yu, Han
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning, Statistics - Applications, Statistics - Computation
Abstract: Adaptive causal representation learning from observational data is presented, integrated with an efficient sample splitting technique within the semiparametric estimating equation framework. The support points sample splitting (SPSS), a subsampling method based on energy distance, is employed for efficient double machine learning (DML) in causal inference. In contrast to traditional random splitting, the support points are selected and split as optimal representative points of the full raw data in a random sample, providing an optimal sub-representation of the underlying data generating distribution. They offer the best representation of a full big dataset, whereas the unit structural information of the underlying distribution is most likely not preserved by traditional random data splitting. Three machine learning estimators are adopted for causal inference using SPSS: support vector machine (SVM), deep learning (DL), and a hybrid super learner (SL) with deep learning (SDL). A comparative study is conducted between the proposed SVM, DL, and SDL representations using SPSS and the benchmark results from Chernozhukov et al. (2018), which employed random forest, neural network, and regression trees with a random k-fold cross-fitting technique on the 401(k)-pension plan real data. The simulations show that DL with SPSS and the hybrid methods of DL and SL with SPSS outperform SVM with SPSS in terms of computational efficiency and the estimation quality, respectively.
Published: 2024
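
Note on energy distance: the abstract above describes SPSS as a subsampling method based on energy distance. As a rough illustration only (not the authors' SPSS procedure), the sketch below computes the sample energy statistic 2E|X-Y| - E|X-X'| - E|Y-Y'| for one-dimensional data in its V-statistic form. The function names `mean_abs_diff` and `energy_distance` are hypothetical and chosen here for illustration.

```python
from itertools import product
from typing import Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Average of |x - y| over all pairs (x, y) with x in a and y in b."""
    return sum(abs(x - y) for x, y in product(a, b)) / (len(a) * len(b))


def energy_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Sample energy statistic between two 1-D samples (V-statistic form).

    Assumption: plain absolute difference is used as the distance, so this
    only covers scalar observations. It is zero when both samples coincide.
    """
    return 2.0 * mean_abs_diff(x, y) - mean_abs_diff(x, x) - mean_abs_diff(y, y)


# Illustration: a subset spread across the data is closer (in energy
# distance) to the full sample than a subset clumped at one end.
full = [0, 1, 2, 3, 4, 5, 6, 7, 8]
spread = [1, 4, 7]
clumped = [0, 1, 2]
spread_is_closer = energy_distance(spread, full) < energy_distance(clumped, full)
```

A subset that minimizes this quantity against the full sample is, loosely, what "representative points" means in this setting. How SPSS selects and splits such points is not specified in the abstract and is not reproduced here.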

2. Learning from 'Silly' Questions Improves Large Language Models, But Only Slightly

Author: Zhu, Tingyuan, Liu, Shudong, Wang, Yidong, Wong, Derek F., Yu, Han, Shinozaki, Takahiro, and Wang, Jindong
Subjects: Computer Science - Computation and Language
Abstract: Constructing high-quality Supervised Fine-Tuning (SFT) datasets is critical for the training of large language models (LLMs). Recent studies have shown that using data from a specific source, Ruozhiba, a Chinese website where users ask "silly" questions to better understand certain topics, can lead to better fine-tuning performance. This paper aims to explore some hidden factors: the potential interpretations of its success and a large-scale evaluation of the performance. First, we leverage GPT-4 to analyze the successful cases of Ruozhiba questions from the perspective of education, psychology, and cognitive science, deriving a set of explanatory rules. Then, we construct fine-tuning datasets by applying these rules to the MMLU training set. Surprisingly, our results indicate that rules can significantly improve model performance in certain tasks, while potentially diminishing performance on others. For example, SFT data generated following the "Counterintuitive Thinking" rule can achieve approximately a 5% improvement on the "Global Facts" task, whereas the "Blurring the Conceptual Boundaries" rule leads to a performance drop of 6.14% on the "Econometrics" task. In addition, for specific tasks, different rules tend to have a consistent impact on model performance. This suggests that the differences between the extracted rules are not as significant, and the effectiveness of the rules is relatively consistent across tasks. Our research highlights the importance of considering task diversity and rule applicability when constructing SFT datasets to achieve more comprehensive performance improvements.
Comment: 27 pages, 14 figures
Published: 2024

3. Giant spin Hall effect with multi-directional spin components in Ni4W

Author: Yang, Yifei, Lee, Seungjun, Chen, Yu-Chia, Jia, Qi, Sousa, Duarte, Odlyzko, Michael, Garcia-Barriocanal, Javier, Yu, Guichuan, Haugstad, Greg, Fan, Yihong, Huang, Yu-Han, Lyu, Deyuan, Cresswell, Zach, Low, Tony, and Wang, Jian-Ping
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Materials Science
Abstract: Spin-orbit torque (SOT) can be used to efficiently manipulate the magnetic state of magnetic materials, which is an essential element for memory and logic applications. Due to symmetry constraints, only in-plane spins can be injected into the ferromagnet from the underlying SOT layer for conventional SOT materials such as heavy metals and topological materials. Through the use of materials with low symmetries, or other symmetry breaking approaches, unconventional spin currents with out-of-plane polarization have been demonstrated and have enabled field-free deterministic switching of perpendicular magnetization. Despite this progress, the SOT efficiency of these materials has typically remained low. Here, we report a giant SOT efficiency of 0.85 in a sputtered Ni4W/CoFeB heterostructure at room temperature, as evaluated by second harmonic Hall measurements. In addition, due to the low crystal symmetry of Ni4W, unconventional out-of-plane and Dresselhaus-like spin components were observed. Macro-spin simulation suggests that our spin Hall tensor provides about an order of magnitude improvement in the magnetization switching efficiency, thus broadening the path towards energy efficient spintronic devices using low-symmetry materials.
Published: 2024

4. Finite-time thermodynamics: A journey beginning with optimizing heat engines

Author: Ma, Yu-Han and Zhao, Xiu-Hua
Subjects: Condensed Matter - Statistical Mechanics, Physics - Applied Physics, Physics - Classical Physics, Quantum Physics
Abstract: In this paper, we summarize the historical development of finite-time thermodynamics and review the current state of research over the past two decades in this field, focusing on fundamental constraints of finite-time thermodynamic cycles, optimal control and optimization of thermodynamic processes, the operation of unconventional heat engines, and experimental progress.
Comment: 4 pages, 131 references, comments are welcome
Published: 2024

5. Unified Approach to Power-Efficiency Trade-Off of Generic Thermal Machines

Author: Ma, Yu-Han and Fu, Cong
Subjects: Condensed Matter - Statistical Mechanics, Physics - Applied Physics, Physics - Classical Physics
Abstract: Due to the diverse functionalities of different thermal machines, their optimization relies on a case-by-case basis, lacking unified results. In this work, we propose a general approach to determine the power-efficiency trade-off relation (PETOR) for any thermal machine. For cases where cycle (of duration $\tau$) irreversibility satisfies the typical $1/\tau$-scaling, we provide a unified PETOR which is applicable to heat engines, refrigerators, heat exchangers and heat pumps. It is shown that some typical PETORs, such as those for low-dissipation Carnot cycles (including heat engine and refrigerator cycles) and the steady-state heat engines operating between finite-sized reservoirs, are naturally recovered.
Comment: 4+2 pages, 2 figures, comments are welcome
Published: 2024

6. Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs

Author: Zhang, Yifei, Zhu, Hao, Liu, Aiwei, Yu, Han, Koniusz, Piotr, and King, Irwin
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Fine-tuning Large Language Models (LLMs) has become a crucial technique for adapting pre-trained models to downstream tasks. However, the enormous size of LLMs poses significant challenges in terms of computational complexity and resource requirements. Low-Rank Adaptation (LoRA) has emerged as a promising solution. However, there exists a gap between the practical performance of low-rank adaptations and their theoretical optimum. In this work, we propose eXtreme Gradient Boosting LoRA (XGBLoRA), a novel framework that bridges this gap by leveraging the power of ensemble learning. Inspired by gradient boosting, XGBLoRA iteratively learns and merges a sequence of LoRA adaptations to refine model predictions. It achieves better performance than the standard LoRA, while enjoying the computational efficiency of rank-1 adaptations. We provide theoretical analysis to show the convergence and optimality of our approach, and conduct extensive experiments on a range of natural language processing tasks. The results demonstrate that XGBLoRA consistently outperforms standard LoRA and achieves performance comparable to full fine-tuning with significantly fewer trainable parameters. This work advances parameter-efficient fine-tuning for LLMs, and offers a promising solution for adapting LLMs to downstream tasks while optimizing performance and efficiency.
Comment: 19 pages
Published: 2024

7. Free-Rider and Conflict Aware Collaboration Formation for Cross-Silo Federated Learning

Author: Chen, Mengmeng, Wu, Xiaohu, Tang, Xiaoli, He, Tiantian, Ong, Yew-Soon, Liu, Qiqi, Lao, Qicheng, and Yu, Han
Subjects: Computer Science - Computer Science and Game Theory, Computer Science - Machine Learning
Abstract: Federated learning (FL) is a machine learning paradigm that allows multiple FL participants (FL-PTs) to collaborate on training models without sharing private data. Due to data heterogeneity, negative transfer may occur in the FL training process. This necessitates FL-PT selection based on their data complementarity. In cross-silo FL, organizations that engage in business activities are key sources of FL-PTs. The resulting FL ecosystem has two features: (i) self-interest, and (ii) competition among FL-PTs. This requires the desirable FL-PT selection strategy to simultaneously mitigate the problems of free riders and conflicts of interest among competitors. To this end, we propose an optimal FL collaboration formation strategy -- FedEgoists -- which ensures that: (1) an FL-PT can benefit from FL if and only if it benefits the FL ecosystem, and (2) an FL-PT will not contribute to its competitors or their supporters. It provides an efficient clustering solution to group FL-PTs into coalitions, ensuring that within each coalition, FL-PTs share the same interest. We theoretically prove that the FL-PT coalitions formed are optimal since no coalitions can collaborate together to improve the utility of any of their members. Extensive experiments on widely adopted benchmark datasets demonstrate the effectiveness of FedEgoists compared to nine state-of-the-art baseline methods, and its ability to establish efficient collaborative networks in cross-silo FL with FL-PTs that engage in business activities.
Published: 2024

8. Is Prior-Free Black-Box Non-Stationary Reinforcement Learning Feasible?

Author: Gerogiannis, Argyrios, Huang, Yu-Han, and Veeravalli, Venugopal V.
Subjects: Computer Science - Machine Learning
Abstract: We study the problem of Non-Stationary Reinforcement Learning (NS-RL) without prior knowledge about the system's non-stationarity. A state-of-the-art, black-box algorithm, known as MASTER, is considered, with a focus on identifying the conditions under which it can achieve its stated goals. Specifically, we prove that MASTER's non-stationarity detection mechanism is not triggered for practical choices of horizon, leading to performance akin to a random restarting algorithm. Moreover, we show that the regret bound for MASTER, while being order optimal, stays above the worst-case linear regret until unreasonably large values of the horizon. To validate these observations, MASTER is tested for the special case of piecewise stationary multi-armed bandits, along with methods that employ random restarting, and others that use quickest change detection to restart. A simple, order optimal random restarting algorithm that has prior knowledge of the non-stationarity is proposed as a baseline. The behavior of the MASTER algorithm is validated in simulations, and it is shown that methods employing quickest change detection are more robust and consistently outperform MASTER and other random restarting approaches.
Comment: Corrected minor typos in the proof of Theorem 2 on pages 25 and 26
Published: 2024

9. Benchmarking Data Heterogeneity Evaluation Approaches for Personalized Federated Learning

Author: Li, Zhilong, Wu, Xiaohu, Tang, Xiaoli, He, Tiantian, Ong, Yew-Soon, Chen, Mengmeng, Liu, Qiqi, Lao, Qicheng, and Yu, Han
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: There is growing research interest in measuring the statistical heterogeneity of clients' local datasets. Such measurements are used to estimate the suitability for collaborative training of personalized federated learning (PFL) models. Currently, these research endeavors are taking place in silos and there is a lack of a unified benchmark to provide a fair and convenient comparison among various approaches in common settings. We aim to bridge this important gap in this paper. The proposed benchmarking framework currently includes six representative approaches. Extensive experiments have been conducted to compare these approaches under five standard non-IID FL settings, providing much needed insights into which approaches are advantageous under which settings. The proposed framework offers useful guidance on the suitability of various data divergence measures in FL systems. It is beneficial for keeping related research activities on the right track in terms of: (1) designing PFL schemes, (2) selecting appropriate data heterogeneity evaluation approaches for specific FL application scenarios, and (3) addressing fairness issues in collaborative model training. The code is available at https://github.com/Xiaoni-61/DH-Benchmark.
Comment: Accepted to FL@FM-NeurIPS'24
Published: 2024

10. Theory of Pressure Dependence of Superconductivity in Bilayer Nickelate La$_3$Ni$_2$O$_{7}$

Author: Jiang, Kai-Yue, Cao, Yu-Han, Yang, Qing-Geng, Lu, Hong-Yan, and Wang, Qiang-Hua
Subjects: Condensed Matter - Superconductivity
Abstract: A recent experiment shows that the superconducting transition temperature in the Ruddlesden-Popper bilayer La$_3$Ni$_2$O$_{7}$ decreases monotonically with increasing pressure above 14 GPa. In order to unravel the underlying mechanism for this unusual dependence, we performed theoretical investigations by combining the density functional theory (DFT) and the unbiased functional renormalization group (FRG). Our DFT calculations show that the Fermi pockets are essentially unchanged with increasing pressure (above 14 GPa), but the bandwidth is enlarged, and particularly the interlayer hopping integral between the nickel $3d_{3z^2-r^2}$ orbitals is enhanced. From the DFT band structure, we construct the bilayer tight-binding model in terms of the nickel $3d_{3z^2-r^2}$ and $3d_{x^2-y^2}$ orbitals. On this basis, we investigate the superconductivity induced by correlation effects by FRG calculations. We find consistently $s_\pm$-wave pairing triggered by spin fluctuations, but the latter are weakened by pressure and lead to a decreasing transition temperature versus pressure, in qualitative agreement with the experiment. We emphasize that the itinerancy of the $d$-orbitals is important and captured naturally in our FRG calculations, and we argue that the unusual pressure dependence would be unnatural, if not impossible, in the otherwise local-moment picture of the nickel $d$-orbitals. This sheds light on the pertinent microscopic description of, and more importantly the mechanism of, superconductivity in La$_3$Ni$_2$O$_{7}$.
Comment: 8 pages, 4 figures
Published: 2024

11. Federated Graph Learning with Adaptive Importance-based Sampling

Author: Li, Anran, Chen, Yuanyuan, Ren, Chao, Wang, Wenhan, Hu, Ming, Li, Tianlin, Yu, Han, and Chen, Qingyu
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: For privacy-preserving graph learning tasks involving distributed graph datasets, federated learning (FL)-based GCN (FedGCN) training is required. A key challenge for FedGCN is scaling to large-scale graphs, which typically incurs high computation and communication costs when dealing with the explosively increasing number of neighbors. Existing graph sampling-enhanced FedGCN training approaches ignore graph structural information or dynamics of optimization, resulting in high variance and inaccurate node embeddings. To address this limitation, we propose the Federated Adaptive Importance-based Sampling (FedAIS) approach. It achieves substantial computational cost saving by focusing the limited resources on training important nodes, while reducing communication overhead via adaptive historical embedding synchronization. The proposed adaptive importance-based sampling method jointly considers the graph structural heterogeneity and the optimization dynamics to achieve optimal trade-off between efficiency and accuracy. Extensive evaluations against five state-of-the-art baselines on five real-world graph datasets show that FedAIS achieves comparable or up to 3.23% higher test accuracy, while saving communication and computation costs by 91.77% and 85.59%.
Published: 2024

12. Thermodynamic Geometric Control of Active Matter

Author: Wang, Yating, Lei, Enmai, Ma, Yu-Han, Tu, Z. C., and Li, Geng
Subjects: Condensed Matter - Statistical Mechanics, Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Soft Condensed Matter
Abstract: Active matter represents a class of non-equilibrium systems that constantly dissipate energy to produce directed motion. The thermodynamic control of active matter holds great potential for advancements in synthetic molecular motors, targeted drug delivery, and adaptive smart materials. However, the inherently non-equilibrium nature of active matter poses a significant challenge in achieving optimal control with minimal energy cost. In this work, we extend the concept of thermodynamic geometry, traditionally applied to passive systems, to active matter, proposing a systematic geometric framework for minimizing energy cost in non-equilibrium driving processes. We derive a cost metric that defines a Riemannian manifold for control parameters, enabling the use of powerful geometric tools to determine optimal control protocols. The geometric perspective reveals that, unlike in passive systems, minimizing energy cost in active systems involves a trade-off between intrinsic and external dissipation, leading to an optimal transportation speed that coincides with the self-propulsion speed of active matter. This insight enriches the broader concept of thermodynamic geometry. We demonstrate the application of this approach by optimizing the performance of an active monothermal engine within this geometric framework.
Published: 2024

13. Inter-Layer Correlation of Loop Current Charge Density Wave on the Bilayer Kagomé Lattice

Author: Dong, Jin-Wei, Lin, Yu-Han, Fu, Ruiqing, Su, Gang, Wang, Ziqiang, and Zhou, Sen
Subjects: Condensed Matter - Strongly Correlated Electrons
Abstract: Loop current order has been suggested as a promising candidate for the spontaneous time-reversal symmetry breaking $2a_0 \times 2a_0$ charge density wave (CDW) revealed in vanadium-based kagomé metals $A$V$_3$Sb$_5$ ($A$ = K, Rb, Cs) near van Hove filling $n_\text{vH} = 5/12$. Weak-coupling analyses and mean field calculations have demonstrated that nearest-neighbor Coulomb repulsion $V_1$ and next-nearest-neighbor Coulomb repulsion $V_2$ drive, respectively, real and imaginary bond-ordered CDW, with the latter corresponding to time-reversal symmetry breaking loop current CDW. It is important to understand the inter-layer correlation of these bond-ordered CDWs and its consequences in the bulk kagomé materials. To provide physical insights, we investigate in this paper the $c$-axis stacking of them, loop current CDW in particular, on the minimal bilayer kagomé lattice. The bare susceptibilities for stacking of real and imaginary bond orders are calculated for the free electrons on the bilayer kagomé lattice with inter-layer coupling $t_\perp=0.2t$, which splits the van Hove filling to $n_{+\text{vH}}=4.64/12$ and $n_{-\text{vH}}=5.44/12$. While real and imaginary bond-ordered CDWs are still favored, respectively, by $V_1$ and $V_2$, their inter-layer coupling is sensitive to band filling $n$. They tend to stack symmetrically near $n_{\pm\text{vH}}$ with identical bond orders in the two layers and give rise to a $2a_0 \times 2a_0 \times 1c_0$ CDW. On the other hand, they prefer to stack antisymmetrically around $n_\text{vH}$ with opposite bond orders in the two layers and lead to a $2a_0 \times 2a_0 \times 2c_0$ CDW. The concrete bilayer $t$-$t_\perp$-$V_1$-$V_2$ model is then studied. We obtain the mean-field ground states and determine the inter-layer coupling as a function of band filling at various interactions. The nontrivial topological properties of loop current CDWs are studied ...
Comment: 12 pages, 8 figures, 2 tables
Published: 2024

14. Interplay of Charge Density Wave and Magnetism on the Kagomé Lattice

Author: Lin, Yu-Han, Dong, Jin-Wei, Fu, Ruiqing, Wu, Xian-Xin, Wang, Ziqiang, and Zhou, Sen
Subjects: Condensed Matter - Strongly Correlated Electrons
Abstract: Motivated by the recent discovery of charge density wave (CDW) order in the magnetic kagomé metal FeGe, we study the single-orbital $t$-$U$-$V_1$-$V_2$ model on the kagomé lattice, where $U$, $V_1$, and $V_2$ are the onsite, nearest-neighbor, and next-nearest-neighbor Coulomb repulsions, respectively. When the Fermi level lies in the flat band, the instability toward ferromagnetic (FM) order gives rise to an FM half-metal at sufficiently large onsite $U$. Intriguingly, at band filling $n=17/24$, the Fermi level crosses the van Hove singularity of the spin-minority bands of the half-metal. We show that, due to the unique geometry and sublattice interference on the kagomé lattice at van Hove singularity, the intersite Coulomb interactions $V_1$ and $V_2$ drive a real and an imaginary bond-ordered $2a_0 \times 2a_0$ CDW instability, respectively. The FM loop current CDW with complex bond orders is a spin-polarized Chern insulator exhibiting the quantum anomalous Hall effect. The bond fluctuations are found to be substantially enhanced compared to the corresponding nonmagnetic kagomé metals at van Hove filling, providing a concrete model realization of the bond-ordered CDWs, including the FM loop current CDW, over the onsite charge density ordered states. When the spins are partially polarized, we find that the formation of bond-ordered CDWs enhances substantially the ordered magnetic moments. These findings provide physical insights for the emergence of loop-current and bond-ordered CDW and their interplay with magnetism on the kagomé lattice, with possible connections to the magnetic kagomé metal FeGe.
Published: 2024

15. Correlations among genetic, epigenetic, and phenotypic variation of Phragmites australis along latitudes

Author: Chen, Yu-Han, Meng, Jian-Qiao, Wang, Chun-Lin, Fang, Tao, Jia, Zi-Xuan, and Luo, Fang-Li
Published: 2024

16. Low-cost demonstration of the Zeeman effect: From qualitative observation to quantitative experiments

Author: Qin, Shao-Han and Ma, Yu-Han
Subjects: Physics - Physics Education, Physics - Atomic Physics, Physics - Instrumentation and Detectors, Physics - Optics, Quantum Physics
Abstract: The Zeeman effect, a fundamental quantum phenomenon, demonstrates the interaction between magnetic fields and atomic systems. While precise spectroscopic measurements of this effect have advanced significantly, there remains a lack of simple, visually accessible demonstrations for educational purposes. Here, we present a low-cost experiment that allows for direct visual observation of the Zeeman effect. Our setup involves a flame containing sodium (from table salt) placed in front of a sodium vapor lamp. When a magnetic field is applied to the flame, the shadow cast by the flame noticeably lightens, providing a clear, naked-eye demonstration of the Zeeman effect. Furthermore, we conduct two quantitative experiments using this setup, examining the effects of varying magnetic field strength and sodium concentration. This innovative approach not only enriches the experimental demonstration for teaching atomic physics at undergraduate and high school levels but also provides an open platform for students to explore the Zeeman effect through hands-on experience.
Comment: 5 pages, 6 figures, Comments are welcome. This manuscript is intended for submission to American Journal of Physics
Published: 2024

17. On the distinction between the per-protocol effect and the effect of the treatment strategy

Author: Dahabreh, Issa J., Ung, Lawson, Hernán, Miguel A., and Chiu, Yu-Han
Subjects: Statistics - Methodology
Abstract: In randomized trials, the per-protocol effect, that is, the effect of being assigned a treatment strategy and receiving treatment according to the assigned strategy, is sometimes thought to reflect the effect of the treatment strategy itself, without intervention on assignment. Here, we argue by example that this is not necessarily the case. We examine a causal structure for a randomized trial where these two causal estimands -- the per-protocol effect and the effect of the treatment strategy -- are not equal, and where their corresponding identifying observed data functionals are not the same, but both require information on assignment for identification. Our example highlights the conceptual difference between the per-protocol effect and the effect of the treatment strategy itself, the conditions under which the observed data functionals for these estimands are equal, and suggests that in some cases their identification requires information on assignment, even when assignment is randomized. An implication of these findings is that in observational analyses that aim to emulate a target randomized trial in which an analog of assignment is well-defined, the effect of the treatment strategy is not necessarily an observational analog of the per-protocol effect. Furthermore, either of these effects may be unidentifiable without information on treatment assignment, unless one makes additional assumptions; informally, that assignment does not affect the outcome except through treatment (i.e., an exclusion-restriction assumption), and that assignment is not a confounder of the treatment outcome association conditional on other variables in the analysis.
Published: 2024

18. Moment transference principles and multiplicative diophantine approximation on hypersurfaces

Author: Chow, Sam and Yu, Han
Subjects: Mathematics - Number Theory, 11J83 (primary), 42A16 (secondary)
Abstract: We determine the generic multiplicative approximation rate on a hypersurface. There are four regimes, according to convergence or divergence and curved or flat, and we address all of them. Using geometry and arithmetic in Fourier space, we develop a general framework of moment transference principles, which convert Lebesgue data into data for some other measure.
Published: 2024

19. High Probability Latency Sequential Change Detection over an Unknown Finite Horizon

Author: Huang, Yu-Han and Veeravalli, Venugopal V.
Subjects: Computer Science - Data Structures and Algorithms, Electrical Engineering and Systems Science - Systems and Control, Mathematics - Statistics Theory
Abstract: A finite horizon variant of the quickest change detection problem is studied, in which the goal is to minimize a delay threshold (latency), under constraints on the probability of false alarm and the probability that the latency is exceeded. In addition, the horizon is not known to the change detector. A variant of the cumulative sum (CuSum) test with a threshold that increases logarithmically with time is proposed as a candidate solution to the problem. An information-theoretic lower bound on the minimum value of the latency under the constraints is then developed. This lower bound is used to establish certain asymptotic optimality properties of the proposed test in terms of the horizon and the false alarm probability. Some experimental results are given to illustrate the performance of the test.
Comment: 7 pages, 2 figures, International Symposium of Information Theory
Published: 2024
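
Note on the CuSum variant: the abstract above mentions a CuSum test whose threshold grows logarithmically with time. The sketch below is a minimal illustration under stated assumptions, not the authors' test: it assumes a known Gaussian mean shift from `mu0` to `mu1` with common standard deviation `sigma`, and a hypothetical threshold of the form `base + growth * log(n)`. The function name, parameter names, and default values are all invented for this example.

```python
import math
from typing import Optional, Sequence


def cusum_log_threshold(
    xs: Sequence[float],
    mu0: float = 0.0,     # assumed pre-change mean
    mu1: float = 1.0,     # assumed post-change mean
    sigma: float = 1.0,   # assumed common standard deviation
    base: float = 3.0,    # hypothetical constant part of the threshold
    growth: float = 1.0,  # hypothetical coefficient of log(n)
) -> Optional[int]:
    """Return the 1-based time of the first alarm, or None if no alarm is raised."""
    w = 0.0
    for n, x in enumerate(xs, start=1):
        # Log-likelihood ratio of N(mu1, sigma^2) against N(mu0, sigma^2) at x.
        llr = (mu1 - mu0) * (x - (mu0 + mu1) / 2.0) / sigma**2
        # Standard CuSum recursion: accumulate evidence, reset at zero.
        w = max(0.0, w + llr)
        # Time-varying threshold that grows logarithmically with n.
        if w >= base + growth * math.log(n):
            return n
    return None


# Deterministic toy input: 20 pre-change samples, then a clear shift.
alarm_time = cusum_log_threshold([0.0] * 20 + [2.0] * 20)
no_alarm = cusum_log_threshold([0.0] * 40)
```

Because the threshold rises with `n`, later samples need more accumulated evidence to raise an alarm, which is one way to control false alarms when the horizon is not known in advance. The actual threshold constants and guarantees are given in the paper, not here.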

20. Triggering the Untriggered: The First Einstein Probe-Detected Gamma-Ray Burst 240219A and Its Implications

Author: Yin, Yi-Han Iris, Zhang, Bin-Bin, Yang, Jun, Sun, Hui, Zhang, Chen, Shao, Yi-Xuan, Hu, You-Dong, Zhu, Zi-Pei, Xu, Dong, An, Li, Gao, He, Wu, Xue-Feng, Zhang, Bing, Castro-Tirado, Alberto Javier, Pandey, Shashi B., Rau, Arne, Lei, Weihua, Xie, Wei, Ghirlanda, Giancarlo, Piro, Luigi, O'Brien, Paul, Troja, Eleonora, Jonker, Peter, Yu, Yun-Wei, An, Jie, Chen, Run-Chao, Chen, Yi-Jing, Dong, Xiao-Fei, Eyles-Ferris, Rob, Fan, Zhou, Fu, Shao-Yu, Fynbo, Johan P. U., Gao, Xing, Huang, Yong-Feng, Jiang, Shuai-Qing, Jiang, Ya-Hui, Julakanti, Yashaswi, Kuulkers, Erik, Lao, Qing-Hui, Li, Dongyue, Ling, Zhi-Xing, Liu, Xing, Liu, Yuan, Mou, Jia-Yu, Pan, Xin, Varun, Wei, Daming, Wu, Qinyu, Yadav, Muskan, Yang, Yu-Han, Yuan, Weimin, and Zhang, Shuang-Nan
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: The Einstein Probe (EP) achieved its first detection and localization of a bright X-ray flare, EP240219a, on 2024 February 19, during its commissioning phase. Subsequent targeted searches triggered by the EP240219a alert identified a faint, untriggered gamma-ray burst (GRB) in the archived data of Fermi Gamma-ray Burst Monitor (GBM), Swift Burst Alert Telescope (BAT), and Insight-HXMT/HE. The EP Wide-field X-ray Telescope (WXT) light curve reveals a long duration of approximately 160 s with a slow decay, whereas the Fermi/GBM light curve shows a total duration of approximately 70 s. The peak in the Fermi/GBM light curve occurs slightly later with respect to the peak seen in the EP/WXT light curve. Our spectral analysis shows that a single cutoff power-law (PL) model effectively describes the joint EP/WXT--Fermi/GBM spectra in general, indicating coherent broad emission typical of GRBs. The model yielded a photon index of $\sim -1.70 \pm 0.05$ and a peak energy of $\sim 257 \pm 134$ keV. After detection of GRB 240219A, long-term observations identified several candidates in optical and radio wavelengths, none of which was confirmed as the afterglow counterpart during subsequent optical and near-infrared follow-ups. The analysis of GRB 240219A classifies it as an X-ray rich GRB (XRR) with a high peak energy, presenting both challenges and opportunities for studying the physical origins of X-ray flashes, XRRs, and classical GRBs. Furthermore, linking the cutoff PL component to nonthermal synchrotron radiation suggests that the burst is driven by a Poynting flux-dominated outflow.
Comment: 15 pages, 8 figures, 3 tables
Published: 2024

21. A Joint Inversion of Sources and Seismic Waveforms for Velocity Distribution: 1-D and 2-D Examples

Author: Yu, Han
Subjects: Physics - Geophysics
Abstract: Waveform inversion is theoretically a powerful tool to reconstruct subsurface structures, but a commonly encountered problem is that accurate sources are very rare, making the computation unstable and divergent. This challenging problem, although sometimes ignored and even imperceptible, can easily create discrepancies in calculated shot gathers, which will then lead to wrong residuals that must be migrated back to the gradients, hence jeopardizing the inverted tomograms. In practice, any shot gather may correspond to its own source even if some of them can be transformed alike after data processing. To resolve this problem, we propose a collocated inversion of sources and early arrival waveforms with the two submodules executing alternately. Not only can this method produce a decent wavelet that approaches the true source or an equivalent source, but more importantly, it can also invert for credible background velocity models with the optimized sources. Part of the cycle skipping problems can also be mitigated because it avoids the trial and error experiments on various sources. Numerical tests under a series of different conditions validate the effectiveness of this method. Restrictions on initial sources or starting velocity models will be relaxed with this method, and it can be extended to any other applications for engineering or exploration purposes.
Comment: Part of the content in this manuscript has already been published in the journal Near Surface Geophysics (2024), which includes a real data example
Published: 2024

22. Cost-Efficient Computation Offloading in SAGIN: A Deep Reinforcement Learning and Perception-Aided Approach

Author: Gao, Yulan, Ye, Ziqiang, and Yu, Han
Subjects: Computer Science - Networking and Internet Architecture, Electrical Engineering and Systems Science - Signal Processing
Abstract: The Space-Air-Ground Integrated Network (SAGIN), crucial to the advancement of sixth-generation (6G) technology, plays a key role in ensuring universal connectivity, particularly by addressing the communication needs of remote areas lacking cellular network infrastructure. This paper delves into the role of unmanned aerial vehicles (UAVs) within SAGIN, where they act as a control layer owing to their adaptable deployment capabilities and their intermediary role. Equipped with millimeter-wave (mmWave) radar and vision sensors, these UAVs are capable of acquiring multi-source data, which helps to diminish uncertainty and enhance the accuracy of decision-making. Concurrently, UAVs collect tasks requiring computing resources from their coverage areas, originating from a variety of mobile devices moving at different speeds. These tasks are then allocated to ground base stations (BSs), low-earth-orbit (LEO) satellite, and local processing units to improve processing efficiency. Amidst this framework, our study concentrates on devising dynamic strategies for facilitating task hosting between mobile devices and UAVs, offloading computations, managing associations between UAVs and BSs, and allocating computing resources. The objective is to minimize the time-averaged network cost, considering the uncertainty of device locations, speeds, and even types. To tackle these complexities, we propose a deep reinforcement learning and perception-aided online approach (DRL-and-Perception-aided Approach) for this joint optimization in SAGIN, tailored for an environment filled with uncertainties. The effectiveness of our proposed approach is validated through extensive numerical simulations, which quantify its performance relative to various network parameters.
Published: 2024

23. Compressed Sensing Inspired User Acquisition for Downlink Integrated Sensing and Communication Transmissions

Author: Song, Yi, Pedraza, Fernando, Li, Shuangyang, Li, Siyao, Yu, Han, and Caire, Giuseppe
Subjects: Computer Science - Information Theory, Electrical Engineering and Systems Science - Signal Processing
Abstract: This paper investigates radar-assisted user acquisition for downlink multi-user multiple-input multiple-output (MIMO) transmission using Orthogonal Frequency Division Multiplexing (OFDM) signals. Specifically, we formulate a concise mathematical model for the user acquisition problem, where each user is characterized by its delay and beamspace response. Therefore, we propose a two-stage method for user acquisition, where the Multiple Signal Classification (MUSIC) algorithm is adopted for delay estimation, and then a least absolute shrinkage and selection operator (LASSO) is applied for estimating the user response in the beamspace. Furthermore, we also provide a comprehensive performance analysis of the considered problem based on the pair-wise error probability (PEP). Particularly, we show that the rank and the geometric mean of non-zero eigenvalues of the squared beamspace difference matrix determine the user acquisition performance. More importantly, we reveal that simultaneously probing multiple beams outperforms concentrating power on a specific beam direction in each time slot under the power constraint, when only limited OFDM symbols are transmitted. Our numerical results confirm our conclusions and also demonstrate a promising acquisition performance of the proposed two-stage method.
Published: 2024

24. Enhancing Distractor Generation for Multiple-Choice Questions with Retrieval Augmented Pretraining and Knowledge Graph Integration

Author: Yu, Han-Cheng, Shih, Yu-An, Law, Kin-Man, Hsieh, Kai-Yu, Cheng, Yu-Chen, Ho, Hsin-Chih, Lin, Zih-An, Hsu, Wen-Chuan, and Fan, Yao-Chung
Subjects: Computer Science - Computation and Language
Abstract: In this paper, we tackle the task of distractor generation (DG) for multiple-choice questions. Our study introduces two key designs. First, we propose retrieval augmented pretraining, which involves refining the language model pretraining to align it more closely with the downstream task of DG. Second, we explore the integration of knowledge graphs to enhance the performance of DG. Through experiments with benchmarking datasets, we show that our models significantly outperform the state-of-the-art results. Our best-performing model advances the F1@3 score from 14.80 to 16.47 in MCQ dataset and from 15.92 to 16.50 in Sciq dataset., Comment: Findings at ACL 2024
Published: 2024

25. Federated Model Heterogeneous Matryoshka Representation Learning

Author: Yi, Liping, Yu, Han, Ren, Chao, Wang, Gang, Liu, Xiaoguang, and Li, Xiaoxiao
Subjects: Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Model heterogeneous federated learning (MHeteroFL) enables FL clients to collaboratively train models with heterogeneous structures in a distributed fashion. However, existing MHeteroFL methods rely on training loss to transfer knowledge between the client model and the server model, resulting in limited knowledge exchange. To address this limitation, we propose the Federated model heterogeneous Matryoshka Representation Learning (FedMRL) approach for supervised learning tasks. It adds an auxiliary small homogeneous model shared by clients with heterogeneous local models. (1) The generalized and personalized representations extracted by the two models' feature extractors are fused by a personalized lightweight representation projector. This step enables representation fusion to adapt to local data distribution. (2) The fused representation is then used to construct Matryoshka representations with multi-dimensional and multi-granular embedded representations learned by the global homogeneous model header and the local heterogeneous model header. This step facilitates multi-perspective representation learning and improves model learning capability. Theoretical analysis shows that FedMRL achieves a $O(1/T)$ non-convex convergence rate. Extensive experiments on benchmark datasets demonstrate its superior model accuracy with low communication and computational costs compared to seven state-of-the-art baselines. It achieves up to 8.48% and 24.94% accuracy improvement compared with the state-of-the-art and the best same-category baseline, respectively.
Published: 2024

26. A Unified Temporal Knowledge Graph Reasoning Model Towards Interpolation and Extrapolation

Author: Chen, Kai, Wang, Ye, Li, Yitong, Li, Aiping, Yu, Han, and Song, Xin
Subjects: Computer Science - Artificial Intelligence
Abstract: Temporal knowledge graph (TKG) reasoning has two settings: interpolation reasoning and extrapolation reasoning. Both of them draw plenty of research interest and have great significance. Methods of the former de-emphasize the temporal correlations among fact sequences, while methods of the latter require strict chronological order of knowledge and ignore inferring clues provided by missing facts of the past. These limit the practicability of TKG applications as almost all of the existing TKG reasoning methods are designed specifically to address only one setting. To this end, this paper proposes an original Temporal PAth-based Reasoning (TPAR) model for both the interpolation and extrapolation reasoning. TPAR performs reasoning in a neural-driven symbolic fashion that is robust to ambiguous and noisy temporal data and offers fine interpretability as well. Comprehensive experiments show that TPAR outperforms SOTA methods on the link prediction task for both the interpolation and the extrapolation settings. A novel pipeline experimental setting is designed to evaluate the performances of SOTA combinations and the proposed TPAR towards interpolation and extrapolation reasoning. More diverse experiments are conducted to show the robustness and interpretability of TPAR., Comment: To appear in ACL 2024 main conference
Published: 2024

27. ECG Semantic Integrator (ESI): A Foundation ECG Model Pretrained with LLM-Enhanced Cardiological Text

Author: Yu, Han, Guo, Peikun, and Sano, Akane
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Artificial Intelligence
Abstract: The utilization of deep learning on electrocardiogram (ECG) analysis has advanced the accuracy and efficiency of cardiac healthcare diagnostics. By leveraging the capabilities of deep learning in semantic understanding, especially in feature extraction and representation learning, this study introduces a new multimodal contrastive pretraining framework that aims to improve the quality and robustness of learned representations of 12-lead ECG signals. Our framework comprises two key components, including Cardio Query Assistant (CQA) and ECG Semantics Integrator (ESI). CQA integrates a retrieval-augmented generation (RAG) pipeline to leverage large language models (LLMs) and external medical knowledge to generate detailed textual descriptions of ECGs. The generated text is enriched with information about demographics and waveform patterns. ESI integrates both contrastive and captioning loss to pretrain ECG encoders for enhanced representations. We validate our approach through various downstream tasks, including arrhythmia detection and ECG-based subject identification. Our experimental results demonstrate substantial improvements over strong baselines in these tasks. These baselines encompass supervised and self-supervised learning methods, as well as prior multimodal pretraining approaches.
Published: 2024

28. FedCal: Achieving Local and Global Calibration in Federated Learning via Aggregated Parameterized Scaler

Author: Peng, Hongyi, Yu, Han, Tang, Xiaoli, and Li, Xiaoxiao
Subjects: Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Federated learning (FL) enables collaborative machine learning across distributed data owners, but data heterogeneity poses a challenge for model calibration. While prior work focused on improving accuracy for non-iid data, calibration remains under-explored. This study reveals existing FL aggregation approaches lead to sub-optimal calibration, and theoretical analysis shows despite constraining variance in clients' label distributions, global calibration error is still asymptotically lower bounded. To address this, we propose a novel Federated Calibration (FedCal) approach, emphasizing both local and global calibration. It leverages client-specific scalers for local calibration to effectively correct output misalignment without sacrificing prediction accuracy. These scalers are then aggregated via weight averaging to generate a global scaler, minimizing the global calibration error. Extensive experiments demonstrate FedCal significantly outperforms the best-performing baseline, reducing global calibration error by 47.66% on average., Comment: This paper has been accepted by ICML'24
Published: 2024

29. AdaWaveNet: Adaptive Wavelet Network for Time Series Analysis

Author: Yu, Han, Guo, Peikun, and Sano, Akane
Subjects: Computer Science - Machine Learning
Abstract: Time series data analysis is a critical component in various domains such as finance, healthcare, and meteorology. Despite the progress in deep learning for time series analysis, there remains a challenge in addressing the non-stationary nature of time series data. Traditional models, which are built on the assumption of constant statistical properties over time, often struggle to capture the temporal dynamics in realistic time series, resulting in bias and error in time series analysis. This paper introduces the Adaptive Wavelet Network (AdaWaveNet), a novel approach that employs Adaptive Wavelet Transformation for multi-scale analysis of non-stationary time series data. AdaWaveNet designed a lifting scheme-based wavelet decomposition and construction mechanism for adaptive and learnable wavelet transforms, which offers enhanced flexibility and robustness in analysis. We conduct extensive experiments on 10 datasets across 3 different tasks, including forecasting, imputation, and a newly established super-resolution task. The evaluations demonstrate the effectiveness of AdaWaveNet over existing methods in all three tasks, which illustrates its potential in various real-world applications.
Published: 2024

30. Agent-oriented Joint Decision Support for Data Owners in Auction-based Federated Learning

Author: Tang, Xiaoli, Yu, Han, and Li, Xiaoxiao
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Science and Game Theory
Abstract: Auction-based Federated Learning (AFL) has attracted extensive research interest due to its ability to motivate data owners (DOs) to join FL through economic means. While many existing AFL methods focus on providing decision support to model users (MUs) and the AFL auctioneer, decision support for data owners remains open. To bridge this gap, we propose a first-of-its-kind agent-oriented joint Pricing, Acceptance and Sub-delegation decision support approach for data owners in AFL (PAS-AFL). By considering a DO's current reputation, pending FL tasks, willingness to train FL models, and its trust relationships with other DOs, it provides a systematic approach for a DO to make joint decisions on AFL bid acceptance, task sub-delegation and pricing based on Lyapunov optimization to maximize its utility. It is the first to enable each DO to take on multiple FL tasks simultaneously to earn higher income for DOs and enhance the throughput of FL tasks in the AFL ecosystem. Extensive experiments based on six benchmarking datasets demonstrate significant advantages of PAS-AFL compared to six alternative strategies, beating the best baseline by 28.77% and 2.64% on average in terms of utility and test accuracy of the resulting FL models, respectively.
Published: 2024

31. Multi-modal Learnable Queries for Image Aesthetics Assessment

Author: Xiong, Zhiwei, Zhang, Yunfan, Shen, Zhiqi, Ren, Peiran, and Yu, Han
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Image aesthetics assessment (IAA) is attracting wide interest with the prevalence of social media. The problem is challenging due to its subjective and ambiguous nature. Instead of directly extracting aesthetic features solely from the image, user comments associated with an image could potentially provide complementary knowledge that is useful for IAA. With existing large-scale pre-trained models demonstrating strong capabilities in extracting high-quality transferable visual and textual features, learnable queries are shown to be effective in extracting useful features from the pre-trained visual features. Therefore, in this paper, we propose MMLQ, which utilizes multi-modal learnable queries to extract aesthetics-related features from multi-modal pre-trained features. Extensive experimental results demonstrate that MMLQ achieves new state-of-the-art performance on multi-modal IAA, beating previous methods by 7.7% and 8.3% in terms of SRCC and PLCC, respectively., Comment: Accepted by ICME2024
Published: 2024

32. pFedAFM: Adaptive Feature Mixture for Batch-Level Personalization in Heterogeneous Federated Learning

Author: Yi, Liping, Yu, Han, Ren, Chao, Zhang, Heng, Wang, Gang, Liu, Xiaoguang, and Li, Xiaoxiao
Subjects: Computer Science - Machine Learning
Abstract: Model-heterogeneous personalized federated learning (MHPFL) enables FL clients to train structurally different personalized models on non-independent and identically distributed (non-IID) local data. Existing MHPFL methods focus on achieving client-level personalization, but cannot address batch-level data heterogeneity. To bridge this important gap, we propose a model-heterogeneous personalized Federated learning approach with Adaptive Feature Mixture (pFedAFM) for supervised learning tasks. It consists of three novel designs: 1) A sharing global homogeneous small feature extractor is assigned alongside each client's local heterogeneous model (consisting of a heterogeneous feature extractor and a prediction header) to facilitate cross-client knowledge fusion. The two feature extractors share the local heterogeneous model's prediction header containing rich personalized prediction knowledge to retain personalized prediction capabilities. 2) An iterative training strategy is designed to alternately train the global homogeneous small feature extractor and the local heterogeneous large model for effective global-local knowledge exchange. 3) A trainable weight vector is designed to dynamically mix the features extracted by both feature extractors to adapt to batch-level data heterogeneity. Theoretical analysis proves that pFedAFM can converge over time. Extensive experiments on 2 benchmark datasets demonstrate that it significantly outperforms 7 state-of-the-art MHPFL methods, achieving up to 7.93% accuracy improvement while incurring low communication and computation costs.
Published: 2024

33. Advances and Open Challenges in Federated Foundation Models

Author: Ren, Chao, Yu, Han, Peng, Hongyi, Tang, Xiaoli, Zhao, Bo, Yi, Liping, Tan, Alysa Ziying, Gao, Yulan, Li, Anran, Li, Xiaoxiao, Li, Zengxiang, and Yang, Qiang
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The integration of Foundation Models (FMs) with Federated Learning (FL) presents a transformative paradigm in Artificial Intelligence (AI). This integration offers enhanced capabilities, while addressing concerns of privacy, data decentralization and computational efficiency. This paper provides a comprehensive survey of the emerging field of Federated Foundation Models (FedFM), elucidating their synergistic relationship and exploring novel methodologies, challenges, and future directions that the FL research field needs to focus on in order to thrive in the age of FMs. A systematic multi-tiered taxonomy is proposed, categorizing existing FedFM approaches for model training, aggregation, trustworthiness, and incentivization. Key challenges, including how to enable FL to deal with high complexity of computational demands, privacy considerations, contribution evaluation, and communication efficiency, are thoroughly discussed. Moreover, this paper explores the intricate challenges of communication, scalability and security inherent in training/fine-tuning FMs via FL. It highlights the potential of quantum computing to revolutionize the processes of training, inference, optimization and security. This survey also introduces the implementation requirement of FedFM and some practical FedFM applications. It highlights lessons learned with a clear understanding of our findings for FedFM. Finally, this survey not only provides insights into the current state and challenges of FedFM, but also offers a blueprint for future research directions, emphasizing the need for developing trustworthy solutions. It serves as a foundational guide for researchers and practitioners interested in contributing to this interdisciplinary and rapidly advancing field., Comment: Survey of Federated Foundation Models (FedFM)
Published: 2024

34. Intelligent Agents for Auction-based Federated Learning: A Survey

Author: Tang, Xiaoli, Yu, Han, Li, Xiaoxiao, and Kraus, Sarit
Subjects: Computer Science - Machine Learning, Computer Science - Computer Science and Game Theory
Abstract: Auction-based federated learning (AFL) is an important emerging category of FL incentive mechanism design, due to its ability to fairly and efficiently motivate high-quality data owners to join data consumers' (i.e., servers') FL training tasks. To enhance the efficiency in AFL decision support for stakeholders (i.e., data consumers, data owners, and the auctioneer), intelligent agent-based techniques have emerged. However, due to the highly interdisciplinary nature of this field and the lack of a comprehensive survey providing an accessible perspective, it is a challenge for researchers to enter and contribute to this field. This paper bridges this important gap by providing a first-of-its-kind survey on the Intelligent Agents for AFL (IA-AFL) literature. We propose a unique multi-tiered taxonomy that organises existing IA-AFL works according to 1) the stakeholders served, 2) the auction mechanism adopted, and 3) the goals of the agents, to provide readers with a multi-perspective view into this field. In addition, we analyse the limitations of existing approaches, summarise the commonly adopted performance evaluation metrics, and discuss promising future directions leading towards effective and efficient stakeholder-oriented decision support in IA-AFL ecosystems.
Published: 2024

35. Generation of Ultrarelativistic Vortex Leptons with Large Orbital Angular Momenta

Author: Ababekri, Mamutjan, Zhou, Jun-Lin, Guo, Ren-Tong, Ren, Yong-Zheng, Kou, Yu-Han, Zhao, Qian, Li, Zhong-Peng, and Li, Jian-Xing
Subjects: High Energy Physics - Phenomenology
Abstract: Ultrarelativistic vortex leptons with intrinsic orbital angular momenta (OAM) have important applications in high energy particle physics, nuclear physics, astrophysics, etc. However, unfortunately, their generation still poses a great challenge. Here, we put forward a novel method for generating ultrarelativistic vortex positrons and electrons through nonlinear Breit-Wheeler (NBW) scattering of vortex $\gamma$ photons. For the first time, a complete angular momentum-resolved scattering theory has been formulated, introducing the angular momentum of laser photons and vortex particles into the conventional NBW scattering framework. We find that vortex positron (electron) can be produced when the outgoing electron (positron) is generated along the collision axis. By unveiling the angular momentum transfer mechanism, we clarify that OAM of the $\gamma$ photon and angular momenta of multiple laser photons are entirely transferred to the generated pairs, leading to the production of ultrarelativistic vortex positrons or electrons with large OAM. Furthermore, we find that the cone opening angle and superposition state of the vortex $\gamma$ photon, distinct characteristics aside from its intrinsic OAM, can be determined via the angular distribution of created pairs in NBW processes. Our method paves the way for investigating strong-field quantum electrodynamics processes concerning the generation and detection of vortex particle beams in intense lasers., Comment: 8
Published: 2024

36. Balanced Mixed-Type Tabular Data Synthesis with Diffusion Models

Author: Yang, Zeyu, Yu, Han, Guo, Peikun, Zanna, Khadija, Yang, Xiaoxue, and Sano, Akane
Subjects: Computer Science - Machine Learning
Abstract: Diffusion models have emerged as a robust framework for various generative tasks, including tabular data synthesis. However, current tabular diffusion models tend to inherit bias in the training dataset and generate biased synthetic data, which may influence discriminatory actions. In this research, we introduce a novel tabular diffusion model that incorporates sensitive guidance to generate fair synthetic data with balanced joint distributions of the target label and sensitive attributes, such as sex and race. The empirical results demonstrate that our method effectively mitigates bias in training data while maintaining the quality of the generated samples. Furthermore, we provide evidence that our approach outperforms existing methods for synthesizing tabular data on fairness metrics such as demographic parity ratio and equalized odds ratio, achieving improvements of over $10\%$. Our implementation is available at https://github.com/comp-well-org/fair-tab-diffusion.
Published: 2024

37. A Note on LoRA

Author: Fomenko, Vlad, Yu, Han, Lee, Jongho, Hsieh, Stanley, and Chen, Weizhu
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: LoRA (Low-Rank Adaptation) has emerged as a preferred method for efficiently adapting Large Language Models (LLMs) with remarkable simplicity and efficacy. This note extends the original LoRA paper by offering new perspectives that were not initially discussed and presents a series of insights for deploying LoRA at scale. Without introducing new experiments, we aim to improve the understanding and application of LoRA.
Published: 2024

38. Integrated Communication, Localization, and Sensing in 6G D-MIMO Networks

Author: Guo, Hao, Wymeersch, Henk, Makki, Behrooz, Chen, Hui, Wu, Yibo, Durisi, Giuseppe, Keskin, Musa Furkan, Moghaddam, Mohammad H., Madapatha, Charitha, Yu, Han, Hammarberg, Peter, Kim, Hyowon, and Svensson, Tommy
Subjects: Computer Science - Information Theory, Electrical Engineering and Systems Science - Signal Processing
Abstract: Future generations of mobile networks call for concurrent sensing and communication functionalities in the same hardware and/or spectrum. Compared to communication, sensing services often suffer from limited coverage, due to the high path loss of the reflected signal and the increased infrastructure requirements. To provide a more uniform quality of service, distributed multiple input multiple output (D-MIMO) systems deploy a large number of distributed nodes and efficiently control them, making distributed integrated sensing and communications (ISAC) possible. In this paper, we investigate ISAC in D-MIMO through the lens of different design architectures and deployments, revealing both conflicts and synergies. In addition, simulation and demonstration results reveal both opportunities and challenges towards the implementation of ISAC in D-MIMO.
Published: 2024

39. Optimizing Vaccine Site Locations While Considering Travel Inconvenience and Public Health Outcomes

Author: Zhang, Suyanpeng, Suen, Sze-chuan, Yu, Han, Dessouky, Maged, and Ordonez, Fernando
Subjects: Mathematics - Optimization and Control
Abstract: During the COVID-19 pandemic, there were over three million infections in Los Angeles County (LAC). To facilitate distribution when vaccines first became available, LAC set up six mega-sites for dispensing a large number of vaccines to the public. To understand if another choice of mega-site location would have improved accessibility and health outcomes, and to provide insight into future vaccine allocation problems, we propose a multi-objective mixed integer linear programming model that balances travel convenience, infection reduction, and equitable distribution. We provide a tractable objective formulation that effectively proxies real-world public health goals of reducing infections while considering travel inconvenience and equitable distribution of resources. Compared with the solution empirically used in LAC in 2020, we recommend more dispersed mega-site locations that result in a 28% reduction in travel inconvenience and avert an additional 1,000 infections.
Published: 2024

40. Fairness-Aware Multi-Server Federated Learning Task Delegation over Wireless Networks

Author: Gao, Yulan, Ren, Chao, and Yu, Han
Subjects: Electrical Engineering and Systems Science - Systems and Control
Abstract: In the rapidly advancing field of federated learning (FL), ensuring efficient FL task delegation while incentivising FL client participation poses significant challenges, especially in wireless networks where FL participants' coverage is limited. Existing Contract Theory-based methods are designed under the assumption that there is only one FL server in the system (i.e., the monopoly market assumption), which is unrealistic in practice. To address this limitation, we propose Fairness-Aware Multi-Server FL task delegation approach (FAMuS), a novel framework based on Contract Theory and Lyapunov optimization to jointly address these intricate issues facing wireless multi-server FL networks (WMSFLN). Within a given WMSFLN, a task requester produces multiple FL tasks and delegates them to FL servers which coordinate the training processes. To ensure fair treatment of FL servers, FAMuS establishes virtual queues to track their previous access to FL tasks, updating them in relation to the resulting FL model performance. The objective is to minimize the time-averaged cost in a WMSFLN, while ensuring all queues remain stable. This is particularly challenging given the incomplete information regarding FL clients' participation cost and the unpredictable nature of the WMSFLN state, which depends on the locations of the mobile clients. Extensive experiments comparing FAMuS against five state-of-the-art approaches based on two real-world datasets demonstrate that it achieves 6.91% higher test accuracy, 27.34% lower cost, and 0.63% higher fairness on average than the best-performing baseline.
Published: 2024

41. Machine Learning Assisted Adjustment Boosts Efficiency of Exact Inference in Randomized Controlled Trials

Author: Yu, Han, Hutson, Alan D., and Ma, Xiaoyi
Subjects: Statistics - Methodology, Statistics - Machine Learning
Abstract: In this work, we proposed a novel inferential procedure assisted by machine learning based adjustment for randomized controlled trials. The method was developed under Rosenbaum's framework of exact tests in randomized experiments with covariate adjustments. Through extensive simulation experiments, we showed the proposed method can robustly control the type I error and can boost the statistical efficiency for a randomized controlled trial (RCT). This advantage was further demonstrated in a real-world example. The simplicity, flexibility, and robustness of the proposed method make it a competitive candidate as a routine inference procedure for RCTs, especially when nonlinear association or interaction among covariates is expected. Its application may remarkably reduce the required sample size and cost of RCTs, such as phase III clinical trials.
Published: 2024

42. A Survey on Evaluation of Out-of-Distribution Generalization

Author: Yu, Han, Liu, Jiashuo, Zhang, Xingxuan, Wu, Jiayun, and Cui, Peng
Subjects: Computer Science - Machine Learning
Abstract: Machine learning models, while progressively advanced, rely heavily on the IID assumption, which is often unfulfilled in practice due to inevitable distribution shifts. This renders them susceptible and untrustworthy for deployment in risk-sensitive applications. Such a significant problem has consequently spawned various branches of works dedicated to developing algorithms capable of Out-of-Distribution (OOD) generalization. Despite these efforts, much less attention has been paid to the evaluation of OOD generalization, which is also a complex and fundamental problem. Its goal is not only to assess whether a model's OOD generalization capability is strong or not, but also to evaluate where a model generalizes well or poorly. This entails characterizing the types of distribution shifts that a model can effectively address, and identifying the safe and risky input regions given a model. This paper serves as the first effort to conduct a comprehensive review of OOD evaluation. We categorize existing research into three paradigms: OOD performance testing, OOD performance prediction, and OOD intrinsic property characterization, according to the availability of test data. Additionally, we briefly discuss OOD evaluation in the context of pretrained models. In closing, we propose several promising directions for future research in OOD evaluation.
Published: 2024

43. Internet-Based Individualized Cognitive Behavioral Therapy for Shift Work Sleep Disorder Empowered by Well-Being Prediction: Protocol for a Pilot Study

Author: Ito-Masui, Asami, Kawamoto, Eiji, Sakamoto, Ryota, Yu, Han, Sano, Akane, Motomura, Eishi, Tanii, Hisashi, Sakano, Shoko, Esumi, Ryo, Imai, Hiroshi, and Shimaoka, Motomu
Subjects: Medicine, Computer applications to medicine. Medical informatics, R858-859.7
Abstract: Background: Shift work sleep disorders (SWSDs) are associated with the high turnover rates of nurses, and are considered a major medical safety issue. However, initial management can be hampered by insufficient awareness. In recent years, it has become possible to visualize, collect, and analyze the work-life balance of health care workers with irregular sleeping and working habits using wearable sensors that can continuously monitor biometric data under real-life settings. In addition, internet-based cognitive behavioral therapy for psychiatric disorders has been shown to be effective. Application of wearable sensors and machine learning may potentially enhance the beneficial effects of internet-based cognitive behavioral therapy. Objective: In this study, we aim to develop and evaluate the effect of a new internet-based cognitive behavioral therapy for SWSD (iCBTS). This system includes current methods such as medical sleep advice, as well as machine learning well-being prediction to improve the sleep durations of shift workers and prevent declines in their well-being. Methods: This study consists of two phases: (1) preliminary data collection and machine learning for well-being prediction; (2) intervention and evaluation of iCBTS for SWSD. Shift workers in the intensive care unit at Mie University Hospital will wear a wearable sensor that collects biometric data and answer daily questionnaires regarding their well-being. They will subsequently be provided with an iCBTS app for 4 weeks. Sleep and well-being measurements between baseline and the intervention period will be compared. Results: Recruitment for phase 1 ended in October 2019. Recruitment for phase 2 started in October 2020. Preliminary results are expected to be available by summer 2021. Conclusions: iCBTS empowered with well-being prediction is expected to improve the sleep durations of shift workers, thereby enhancing their overall well-being. Findings of this study will reveal the potential of this system for improving sleep disorders among shift workers. Trial Registration: UMIN Clinical Trials Registry UMIN000036122 (phase 1), UMIN000040547 (phase 2); https://tinyurl.com/dkfmmmje, https://upload.umin.ac.jp/cgi-open-bin/ctr_e/ctr_view.cgi?recptno=R000046284 International Registered Report Identifier (IRRID): DERR1-10.2196/24799
Published: 2021
Full Text: View/download PDF

44. Pneumatic tube transport-induced pseudohyperkalemia in patients with extreme leukocytosis: a retrospective study from a single medical center

Author: Tseng, Yu-Chuan, Lin, Peter Bor-Chian, Hsieh, Stephanie, Huang, Kuan-Lin, Hsiao, Chiung-Tzu, Hsiao, Yu-Chi, Liu, Yi-Ju, Huang, Yu-Han, and Wu, Cho-Han
Published: 2024
Full Text: View/download PDF

45. Synergy of metal–support interaction and positive Pd species promoting efficient C–Cl bond activation on Pd-based Ce-MOF-derived catalysts

Author: Hu, Xiao-Jie, Sun, Yu-Han, Liu, Ling-Yue, Mao, Dan-Jun, and Zheng, Shou-Rong
Published: 2024
Full Text: View/download PDF

46. Long-Run Effects of Childhood Exposure to Medical Marijuana Laws on Education and Labor Market Outcomes

Author: Yang, Maorui and Yu, Han
Published: 2024
Full Text: View/download PDF

47. Overlapping Domain Decomposition Methods Based on Tensor Format for Solving High-Dimensional Partial Differential Equations

Author: Chen, Yu-Han and Li, Chen-Liang
Published: 2024
Full Text: View/download PDF

48. Medium bandgap A-DA’D-A type small molecule acceptors prepared by synergetic modification strategy for efficient indoor organic photovoltaic devices

Author: Zou, Tianwei, Gong, Yufei, Li, Xiaojun, Sun, Guangpei, Ng, Ho Ming, Zhang, Ming, Yu, Han, Zhang, Zhi-Guo, Meng, Lei, and Li, Yongfang
Published: 2024
Full Text: View/download PDF

49. Association of pyrethroids exposure with asthma in US children and adolescents: a nationally representative cross-sectional study

Author: Wang, Yi-Fan, Gao, Fei, Jiang, Yu-Han, Xia, Rui-Wen, Wang, Xu, Li, Li, Wang, Xue-Lin, Yun, Ya-Nan, and Zou, Ying-Xue
Published: 2024
Full Text: View/download PDF

50. How to design a sustainable street network for neighbourhoods: an empirical study of China’s inner cities from the perspective of spatial configuration

Author: Song, Yacheng, Li, Jingjin, Wang, Ruoyu, Yu, Han, Li, Fanyi, and Pang, Yueting
Published: 2024
Full Text: View/download PDF
