Author: "Gao, Huan" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Gao, Huan"' showing total 2,074 results

Start Over Author "Gao, Huan"

2,074 results on '"Gao, Huan"'

1. Diffusion-based Visual Anagram as Multi-task Learning

Author: Xu, Zhiyuan, Chen, Yinhe, Gao, Huan-ang, Zhao, Weiyan, Zhang, Guiyu, and Zhao, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Visual anagrams are images that change appearance upon transformation, like flipping or rotation. With the advent of diffusion models, generating such optical illusions can be achieved by averaging noise across multiple views during the reverse denoising process. However, we observe two critical failure modes in this approach: (i) concept segregation, where concepts in different views are independently generated, which can not be considered a true anagram, and (ii) concept domination, where certain concepts overpower others. In this work, we cast the visual anagram generation problem in a multi-task learning setting, where different viewpoint prompts are analogous to different tasks,and derive denoising trajectories that align well across tasks simultaneously. At the core of our designed framework are two newly introduced techniques, where (i) an anti-segregation optimization strategy that promotes overlap in cross-attention maps between different concepts, and (ii) a noise vector balancing method that adaptively adjusts the influence of different tasks. Additionally, we observe that directly averaging noise predictions yields suboptimal performance because statistical properties may not be preserved, prompting us to derive a noise variance rectification method. Extensive qualitative and quantitative experiments demonstrate our method's superior ability to generate visual anagrams spanning diverse concepts., Comment: WACV 2025. Code is publicly available at https://github.com/Pixtella/Anagram-MTL
Published: 2024

2. Dual-frame Fluid Motion Estimation with Test-time Optimization and Zero-divergence Loss

Author: Zhang, Yifei, Gao, Huan-ang, Jiang, Zhou, and Zhao, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: 3D particle tracking velocimetry (PTV) is a key technique for analyzing turbulent flow, one of the most challenging computational problems of our century. At the core of 3D PTV is the dual-frame fluid motion estimation algorithm, which tracks particles across two consecutive frames. Recently, deep learning-based methods have achieved impressive accuracy in dual-frame fluid motion estimation; however, they heavily depend on large volumes of labeled data. In this paper, we introduce a new method that is completely self-supervised and notably outperforms its fully-supervised counterparts while requiring only 1% of the training samples (without labels) used by previous methods. Our method features a novel zero-divergence loss that is specific to the domain of turbulent flow. Inspired by the success of splat operation in high-dimensional filtering and random fields, we propose a splat-based implementation for this loss which is both efficient and effective. The self-supervised nature of our method naturally supports test-time optimization, leading to the development of a tailored Dynamic Velocimetry Enhancer (DVE) module. We demonstrate that strong cross-domain robustness is achieved through test-time optimization on unseen leave-one-out synthetic domains and real physical/biological domains. Code, data and models are available at https://github.com/Forrest-110/FluidMotionNet., Comment: Accepted by NeurIPS 2024
Published: 2024

3. Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling

Author: Zhang, Guiyu, Gao, Huan-ang, Jiang, Zijian, Zhao, Hao, and Zheng, Zhedong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper, we focus on the task of conditional image generation, where an image is synthesized according to user instructions. The critical challenge underpinning this task is ensuring both the fidelity of the generated images and their semantic alignment with the provided conditions. To tackle this issue, previous studies have employed supervised perceptual losses derived from pre-trained models, i.e., reward models, to enforce alignment between the condition and the generated result. However, we observe one inherent shortcoming: considering the diversity of synthesized images, the reward model usually provides inaccurate feedback when encountering newly generated data, which can undermine the training process. To address this limitation, we propose an uncertainty-aware reward modeling, called Ctrl-U, including uncertainty estimation and uncertainty-aware regularization, designed to reduce the adverse effects of imprecise feedback from the reward model. Given the inherent cognitive uncertainty within reward models, even images generated under identical conditions often result in a relatively large discrepancy in reward loss. Inspired by the observation, we explicitly leverage such prediction variance as an uncertainty indicator. Based on the uncertainty estimation, we regularize the model training by adaptively rectifying the reward. In particular, rewards with lower uncertainty receive higher loss weights, while those with higher uncertainty are given reduced weights to allow for larger variability. The proposed uncertainty regularization facilitates reward fine-tuning through consistency construction. Extensive experiments validate the effectiveness of our methodology in improving the controllability and generation quality, as well as its scalability across diverse conditional scenarios. Code will soon be available at https://grenoble-zhang.github.io/Ctrl-U-Page/., Comment: Preprint. Work in progress
Published: 2024

4. RGM: Reconstructing High-fidelity 3D Car Assets with Relightable 3D-GS Generative Model from a Single Image

Author: Chen, Xiaoxue, Zheng, Jv, Huang, Hao, Xu, Haoran, Gu, Weihao, Chen, Kangliang, xiang, He, Gao, Huan-ang, Zhao, Hao, Zhou, Guyue, and Zhang, Yaqin
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The generation of high-quality 3D car assets is essential for various applications, including video games, autonomous driving, and virtual reality. Current 3D generation methods utilizing NeRF or 3D-GS as representations for 3D objects, generate a Lambertian object under fixed lighting and lack separated modelings for material and global illumination. As a result, the generated assets are unsuitable for relighting under varying lighting conditions, limiting their applicability in downstream tasks. To address this challenge, we propose a novel relightable 3D object generative framework that automates the creation of 3D car assets, enabling the swift and accurate reconstruction of a vehicle's geometry, texture, and material properties from a single input image. Our approach begins with introducing a large-scale synthetic car dataset comprising over 1,000 high-precision 3D vehicle models. We represent 3D objects using global illumination and relightable 3D Gaussian primitives integrating with BRDF parameters. Building on this representation, we introduce a feed-forward model that takes images as input and outputs both relightable 3D Gaussians and global illumination parameters. Experimental results demonstrate that our method produces photorealistic 3D car assets that can be seamlessly integrated into road scenes with different illuminations, which offers substantial practical benefits for industrial applications.
Published: 2024

5. Effects of pristine and photoaged tire wear particles and their leachable additives on key nitrogen removal processes and nitrous oxide accumulation in estuarine sediments

Author: Ye, Jinyu, Gao, Yuan, Gao, Huan, Zhao, Qingqing, Zhou, Minjie, Xue, Xiangdong, and Shi, Meng
Subjects: Quantitative Biology - Populations and Evolution, Physics - Atmospheric and Oceanic Physics
Abstract: Global estuaries and coastal regions, acting as critical interfaces for mitigating nitrogen flux to marine, concurrently contend with contamination from tire wear particles (TWPs). However, the effects of pristine and photoaged TWP (P-TWP and A-TWP) and their leachates (P-TWPL and A-TWPL) on key nitrogen removal processes in estuarine sediments remain unclear. This study explored the responses of denitrification rate, anammox rate, and nitrous oxide (N2O) accumulation to P-TWP, A-TWP, P-TWPL, and A-TWPL exposures in estuarine sediments, and assessed the potential biotoxic substances in TWPL. Results indicate that P-TWP inhibited the denitrification rate and increased N2O accumulation without significantly impacting the anammox rate. A-TWP intensified the denitrification rate inhibition by further reducing narG gene abundance and NAR activity, and also decreased the hzo gene abundance, HZO activity, and Candidatus Kuenenia abundance, thereby slowing the anammox rate. N2O accumulation was lower after A-TWP exposure than P-TWP, with the NIR/NOS and NOR/NOS activity ratios closely associated with N2O accumulation. Batch experiments indicated that photoaging promoted Zn release from TWPL, significantly contributing to the inhibited denitrification rate and increased N2O accumulation by TWP. In addition, TWP drives changes in microbial community structure through released additives, with the abundance of DNB and AnAOB closely linked to the Zn, Mn, and As concentrations in TWPL. This study offers insights into assessing the environmental risks of TWPs in estuarine ecosystems., Comment: 42 pages, 1 table, 7 figures
Published: 2024

6. Hint-AD: Holistically Aligned Interpretability in End-to-End Autonomous Driving

Author: Ding, Kairui, Chen, Boyuan, Su, Yuchen, Gao, Huan-ang, Jin, Bu, Sima, Chonghao, Zhang, Wuqiang, Li, Xiaohui, Barsch, Paul, Li, Hongyang, and Zhao, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: End-to-end architectures in autonomous driving (AD) face a significant challenge in interpretability, impeding human-AI trust. Human-friendly natural language has been explored for tasks such as driving explanation and 3D captioning. However, previous works primarily focused on the paradigm of declarative interpretability, where the natural language interpretations are not grounded in the intermediate outputs of AD systems, making the interpretations only declarative. In contrast, aligned interpretability establishes a connection between language and the intermediate outputs of AD systems. Here we introduce Hint-AD, an integrated AD-language system that generates language aligned with the holistic perception-prediction-planning outputs of the AD model. By incorporating the intermediate outputs and a holistic token mixer sub-network for effective feature adaptation, Hint-AD achieves desirable accuracy, achieving state-of-the-art results in driving language tasks including driving explanation, 3D dense captioning, and command prediction. To facilitate further study on driving explanation task on nuScenes, we also introduce a human-labeled dataset, Nu-X. Codes, dataset, and models will be publicly available., Comment: CoRL 2024, Project Page: https://air-discover.github.io/Hint-AD/
Published: 2024

7. Training-Free Model Merging for Multi-target Domain Adaptation

Author: Li, Wenyi, Gao, Huan-ang, Gao, Mingju, Tian, Beiwen, Zhi, Rong, and Zhao, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper, we study multi-target domain adaptation of scene understanding models. While previous methods achieved commendable results through inter-domain consistency losses, they often assumed unrealistic simultaneous access to images from all target domains, overlooking constraints such as data transfer bandwidth limitations and data privacy concerns. Given these challenges, we pose the question: How to merge models adapted independently on distinct domains while bypassing the need for direct access to training data? Our solution to this problem involves two components, merging model parameters and merging model buffers (i.e., normalization layer statistics). For merging model parameters, empirical analyses of mode connectivity surprisingly reveal that linear merging suffices when employing the same pretrained backbone weights for adapting separate models. For merging model buffers, we model the real-world distribution with a Gaussian prior and estimate new statistics from the buffers of separately trained models. Our method is simple yet effective, achieving comparable performance with data combination training baselines, while eliminating the need for accessing training data. Project page: https://air-discover.github.io/ModelMerging, Comment: Accepted to ECCV 2024
Published: 2024

8. FairDiff: Fair Segmentation with Point-Image Diffusion

Author: Li, Wenyi, Xu, Haoran, Zhang, Guiyu, Gao, Huan-ang, Gao, Mingju, Wang, Mengyu, and Zhao, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Fairness is an important topic for medical image analysis, driven by the challenge of unbalanced training data among diverse target groups and the societal demand for equitable medical quality. In response to this issue, our research adopts a data-driven strategy-enhancing data balance by integrating synthetic images. However, in terms of generating synthetic images, previous works either lack paired labels or fail to precisely control the boundaries of synthetic images to be aligned with those labels. To address this, we formulate the problem in a joint optimization manner, in which three networks are optimized towards the goal of empirical risk minimization and fairness maximization. On the implementation side, our solution features an innovative Point-Image Diffusion architecture, which leverages 3D point clouds for improved control over mask boundaries through a point-mask-image synthesis pipeline. This method outperforms significantly existing techniques in synthesizing scanning laser ophthalmoscopy (SLO) fundus images. By combining synthetic data with real data during the training phase using a proposed Equal Scale approach, our model achieves superior fairness segmentation performance compared to the state-of-the-art fairness learning models. Code is available at https://github.com/wenyi-li/FairDiff., Comment: Accepted to MICCAI 2024
Published: 2024

9. Zero-Shot Generalization during Instruction Tuning: Insights from Similarity and Granularity

Author: He, Bingxiang, Ding, Ning, Qian, Cheng, Deng, Jia, Cui, Ganqu, Yuan, Lifan, Gao, Huan-ang, Chen, Huimin, Liu, Zhiyuan, and Sun, Maosong
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Understanding alignment techniques begins with comprehending zero-shot generalization brought by instruction tuning, but little of the mechanism has been understood. Existing work has largely been confined to the task level, without considering that tasks are artificially defined and, to LLMs, merely consist of tokens and representations. This line of research has been limited to examining transfer between tasks from a task-pair perspective, with few studies focusing on understanding zero-shot generalization from the perspective of the data itself. To bridge this gap, we first demonstrate through multiple metrics that zero-shot generalization during instruction tuning happens very early. Next, we investigate the facilitation of zero-shot generalization from both data similarity and granularity perspectives, confirming that encountering highly similar and fine-grained training data earlier during instruction tuning, without the constraints of defined "tasks", enables better generalization. Finally, we propose a more grounded training data arrangement method, Test-centric Multi-turn Arrangement, and show its effectiveness in promoting continual learning and further loss reduction. For the first time, we show that zero-shot generalization during instruction tuning is a form of similarity-based generalization between training and test data at the instance level. We hope our analysis will advance the understanding of zero-shot generalization during instruction tuning and contribute to the development of more aligned LLMs. Our code is released at https://github.com/HBX-hbx/dynamics_of_zero-shot_generalization., Comment: 33 pages, 14 figures
Published: 2024

10. Rip-NeRF: Anti-aliasing Radiance Fields with Ripmap-Encoded Platonic Solids

Author: Liu, Junchen, Hu, Wenbo, Yang, Zhuo, Chen, Jianteng, Wang, Guoliang, Chen, Xiaoxue, Cai, Yantong, Gao, Huan-ang, and Zhao, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics
Abstract: Despite significant advancements in Neural Radiance Fields (NeRFs), the renderings may still suffer from aliasing and blurring artifacts, since it remains a fundamental challenge to effectively and efficiently characterize anisotropic areas induced by the cone-casting procedure. This paper introduces a Ripmap-Encoded Platonic Solid representation to precisely and efficiently featurize 3D anisotropic areas, achieving high-fidelity anti-aliasing renderings. Central to our approach are two key components: Platonic Solid Projection and Ripmap encoding. The Platonic Solid Projection factorizes the 3D space onto the unparalleled faces of a certain Platonic solid, such that the anisotropic 3D areas can be projected onto planes with distinguishable characterization. Meanwhile, each face of the Platonic solid is encoded by the Ripmap encoding, which is constructed by anisotropically pre-filtering a learnable feature grid, to enable featurzing the projected anisotropic areas both precisely and efficiently by the anisotropic area-sampling. Extensive experiments on both well-established synthetic datasets and a newly captured real-world dataset demonstrate that our Rip-NeRF attains state-of-the-art rendering quality, particularly excelling in the fine details of repetitive structures and textures, while maintaining relatively swift training times., Comment: SIGGRAPH 2024, Project page: https://junchenliu77.github.io/Rip-NeRF , Code: https://github.com/JunchenLiu77/Rip-NeRF
Published: 2024
Full Text: View/download PDF

11. PreAfford: Universal Affordance-Based Pre-Grasping for Diverse Objects and Environments

Author: Ding, Kairui, Chen, Boyuan, Wu, Ruihai, Li, Yuyang, Zhang, Zongzheng, Gao, Huan-ang, Li, Siqi, Zhou, Guyue, Zhu, Yixin, Dong, Hao, and Zhao, Hao
Subjects: Computer Science - Robotics, Computer Science - Computer Vision and Pattern Recognition
Abstract: Robotic manipulation with two-finger grippers is challenged by objects lacking distinct graspable features. Traditional pre-grasping methods, which typically involve repositioning objects or utilizing external aids like table edges, are limited in their adaptability across different object categories and environments. To overcome these limitations, we introduce PreAfford, a novel pre-grasping planning framework incorporating a point-level affordance representation and a relay training approach. Our method significantly improves adaptability, allowing effective manipulation across a wide range of environments and object types. When evaluated on the ShapeNet-v2 dataset, PreAfford not only enhances grasping success rates by 69% but also demonstrates its practicality through successful real-world experiments. These improvements highlight PreAfford's potential to redefine standards for robotic handling of complex manipulation tasks in diverse settings., Comment: Project Page: https://air-discover.github.io/PreAfford/
Published: 2024

12. Denitrification characteristics and potential application of a novel aerobic denitrifying bacterium, Pseudomonas plecoglossicida, isolated from the Exopalaemon carinicauda pond

Author: Shen, Shanrui, Zhou, Yuan, Qian, Han, Wu, Chen, Gao, Huan, and Lai, Xiaofang
Published: 2024
Full Text: View/download PDF

13. Online Information Literacies of Chinese International Students in the United States during the COVID-19 Pandemic

Author: Gao, Huan and Kohnen, Angela
Abstract: Using a transnational lens, this narrative study examines the online information literacies of six Chinese international graduate students in the United States during the COVID-19 pandemic. The data of the study were collected from phenomenological interviewing, weekly information-seeking dairies, and focus group discussions. This study illuminates Chinese international students' transnational information literacies in navigating the pandemic online information environment. These students stayed attuned with the pandemic conditions and relevant regulations in order to inform their important decision-making concerning health, safety, visa issues, and international travel. The study also highlights participants' cultural ways of information seeking and pragmatic approaches to information credibility assessment. Results from the study show the importance of understanding and empowering the information literacy of international students, especially during a global health emergency.
Published: 2023

14. SA-GS: Scale-Adaptive Gaussian Splatting for Training-Free Anti-Aliasing

Author: Song, Xiaowei, Zheng, Jv, Yuan, Shiran, Gao, Huan-ang, Zhao, Jingwei, He, Xiang, Gu, Weihao, and Zhao, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper, we present a Scale-adaptive method for Anti-aliasing Gaussian Splatting (SA-GS). While the state-of-the-art method Mip-Splatting needs modifying the training procedure of Gaussian splatting, our method functions at test-time and is training-free. Specifically, SA-GS can be applied to any pretrained Gaussian splatting field as a plugin to significantly improve the field's anti-alising performance. The core technique is to apply 2D scale-adaptive filters to each Gaussian during test time. As pointed out by Mip-Splatting, observing Gaussians at different frequencies leads to mismatches between the Gaussian scales during training and testing. Mip-Splatting resolves this issue using 3D smoothing and 2D Mip filters, which are unfortunately not aware of testing frequency. In this work, we show that a 2D scale-adaptive filter that is informed of testing frequency can effectively match the Gaussian scale, thus making the Gaussian primitive distribution remain consistent across different testing frequencies. When scale inconsistency is eliminated, sampling rates smaller than the scene frequency result in conventional jaggedness, and we propose to integrate the projected 2D Gaussian within each pixel during testing. This integration is actually a limiting case of super-sampling, which significantly improves anti-aliasing performance over vanilla Gaussian Splatting. Through extensive experiments using various settings and both bounded and unbounded scenes, we show SA-GS performs comparably with or better than Mip-Splatting. Note that super-sampling and integration are only effective when our scale-adaptive filtering is activated. Our codes, data and models are available at https://github.com/zsy1987/SA-GS., Comment: Project page: https://kevinsong729.github.io/project-pages/SA-GS/ Code: https://github.com/zsy1987/SA-GS
Published: 2024

15. Ultraman: Single Image 3D Human Reconstruction with Ultra Speed and Detail

Author: Chen, Mingjin, Chen, Junhao, Ye, Xiaojun, Gao, Huan-ang, Chen, Xiaoxue, Fan, Zhaoxin, and Zhao, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: 3D human body reconstruction has been a challenge in the field of computer vision. Previous methods are often time-consuming and difficult to capture the detailed appearance of the human body. In this paper, we propose a new method called \emph{Ultraman} for fast reconstruction of textured 3D human models from a single image. Compared to existing techniques, \emph{Ultraman} greatly improves the reconstruction speed and accuracy while preserving high-quality texture details. We present a set of new frameworks for human reconstruction consisting of three parts, geometric reconstruction, texture generation and texture mapping. Firstly, a mesh reconstruction framework is used, which accurately extracts 3D human shapes from a single image. At the same time, we propose a method to generate a multi-view consistent image of the human body based on a single image. This is finally combined with a novel texture mapping method to optimize texture details and ensure color consistency during reconstruction. Through extensive experiments and evaluations, we demonstrate the superior performance of \emph{Ultraman} on various standard datasets. In addition, \emph{Ultraman} outperforms state-of-the-art methods in terms of human rendering quality and speed. Upon acceptance of the article, we will make the code and data publicly available., Comment: Project Page: https://air-discover.github.io/Ultraman/
Published: 2024

16. P-MapNet: Far-seeing Map Generator Enhanced by both SDMap and HDMap Priors

Author: Jiang, Zhou, Zhu, Zhenxin, Li, Pengfei, Gao, Huan-ang, Yuan, Tianyuan, Shi, Yongliang, Zhao, Hang, and Zhao, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Autonomous vehicles are gradually entering city roads today, with the help of high-definition maps (HDMaps). However, the reliance on HDMaps prevents autonomous vehicles from stepping into regions without this expensive digital infrastructure. This fact drives many researchers to study online HDMap generation algorithms, but the performance of these algorithms at far regions is still unsatisfying. We present P-MapNet, in which the letter P highlights the fact that we focus on incorporating map priors to improve model performance. Specifically, we exploit priors in both SDMap and HDMap. On one hand, we extract weakly aligned SDMap from OpenStreetMap, and encode it as an additional conditioning branch. Despite the misalignment challenge, our attention-based architecture adaptively attends to relevant SDMap skeletons and significantly improves performance. On the other hand, we exploit a masked autoencoder to capture the prior distribution of HDMap, which can serve as a refinement module to mitigate occlusions and artifacts. We benchmark on the nuScenes and Argoverse2 datasets. Through comprehensive experiments, we show that: (1) our SDMap prior can improve online map generation performance, using both rasterized (by up to $+18.73$ $\rm mIoU$) and vectorized (by up to $+8.50$ $\rm mAP$) output representations. (2) our HDMap prior can improve map perceptual metrics by up to $6.34\%$. (3) P-MapNet can be switched into different inference modes that covers different regions of the accuracy-efficiency trade-off landscape. (4) P-MapNet is a far-seeing solution that brings larger improvements on longer ranges. Codes and models are publicly available at https://jike5.github.io/P-MapNet., Comment: Code: https://jike5.github.io/P-MapNet
Published: 2024

17. SCP-Diff: Spatial-Categorical Joint Prior for Diffusion Based Semantic Image Synthesis

Author: Gao, Huan-ang, Gao, Mingju, Li, Jiaju, Li, Wenyi, Zhi, Rong, Tang, Hao, and Zhao, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Semantic image synthesis (SIS) shows good promises for sensor simulation. However, current best practices in this field, based on GANs, have not yet reached the desired level of quality. As latent diffusion models make significant strides in image generation, we are prompted to evaluate ControlNet, a notable method for its dense control capabilities. Our investigation uncovered two primary issues with its results: the presence of weird sub-structures within large semantic areas and the misalignment of content with the semantic mask. Through empirical study, we pinpointed the cause of these problems as a mismatch between the noised training data distribution and the standard normal prior applied at the inference stage. To address this challenge, we developed specific noise priors for SIS, encompassing spatial, categorical, and a novel spatial-categorical joint prior for inference. This approach, which we have named SCP-Diff, has set new state-of-the-art results in SIS on Cityscapes, ADE20K and COCO-Stuff, yielding a FID as low as 10.53 on Cityscapes. The code and models can be accessed via the project page., Comment: Project Page: https://air-discover.github.io/SCP-Diff/
Published: 2024

18. Latency-aware Road Anomaly Segmentation in Videos: A Photorealistic Dataset and New Metrics

Author: Tian, Beiwen, Gao, Huan-ang, Cui, Leiyao, Zheng, Yupeng, Luo, Lan, Wang, Baofeng, Zhi, Rong, Zhou, Guyue, and Zhao, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In the past several years, road anomaly segmentation is actively explored in the academia and drawing growing attention in the industry. The rationale behind is straightforward: if the autonomous car can brake before hitting an anomalous object, safety is promoted. However, this rationale naturally calls for a temporally informed setting while existing methods and benchmarks are designed in an unrealistic frame-wise manner. To bridge this gap, we contribute the first video anomaly segmentation dataset for autonomous driving. Since placing various anomalous objects on busy roads and annotating them in every frame are dangerous and expensive, we resort to synthetic data. To improve the relevance of this synthetic dataset to real-world applications, we train a generative adversarial network conditioned on rendering G-buffers for photorealism enhancement. Our dataset consists of 120,000 high-resolution frames at a 60 FPS framerate, as recorded in 7 different towns. As an initial benchmarking, we provide baselines using latest supervised and unsupervised road anomaly segmentation methods. Apart from conventional ones, we focus on two new metrics: temporal consistency and latencyaware streaming accuracy. We believe the latter is valuable as it measures whether an anomaly segmentation algorithm can truly prevent a car from crashing in a temporally informed setting.
Published: 2024

19. Training-Free Model Merging for Multi-target Domain Adaptation

Author: Li, Wenyi, Gao, Huan-ang, Gao, Mingju, Tian, Beiwen, Zhi, Rong, Zhao, Hao, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
Published: 2025
Full Text: View/download PDF

20. SCP-Diff: Spatial-Categorical Joint Prior for Diffusion Based Semantic Image Synthesis

Author: Gao, Huan-ang, Gao, Mingju, Li, Jiaju, Li, Wenyi, Zhi, Rong, Tang, Hao, Zhao, Hao, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
Published: 2025
Full Text: View/download PDF

21. Adapting Short-Term Transformers for Action Detection in Untrimmed Videos

Author: Yang, Min, Gao, Huan, Guo, Ping, and Wang, Limin
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Vision Transformer (ViT) has shown high potential in video recognition, owing to its flexible design, adaptable self-attention mechanisms, and the efficacy of masked pre-training. Yet, it remains unclear how to adapt these pre-trained short-term ViTs for temporal action detection (TAD) in untrimmed videos. The existing works treat them as off-the-shelf feature extractors for each short-trimmed snippet without capturing the fine-grained relation among different snippets in a broader temporal context. To mitigate this issue, this paper focuses on designing a new mechanism for adapting these pre-trained ViT models as a unified long-form video transformer to fully unleash its modeling power in capturing inter-snippet relation, while still keeping low computation overhead and memory consumption for efficient TAD. To this end, we design effective cross-snippet propagation modules to gradually exchange short-term video information among different snippets from two levels. For inner-backbone information propagation, we introduce a cross-snippet propagation strategy to enable multi-snippet temporal feature interaction inside the backbone.For post-backbone information propagation, we propose temporal transformer layers for further clip-level modeling. With the plain ViT-B pre-trained with VideoMAE, our end-to-end temporal action detector (ViT-TAD) yields a very competitive performance to previous temporal action detectors, riching up to 69.5 average mAP on THUMOS14, 37.40 average mAP on ActivityNet-1.3 and 17.20 average mAP on FineAction., Comment: Accepted by CVPR2024
Published: 2023

22. A deep LSTM-based constitutive model for describing the impact characteristics of concrete–granite composites with different roughness interfaces

Author: Gao, Huan, Zhai, Yue, and Wang, Tienan
Published: 2024
Full Text: View/download PDF

23. National survey on the current status of airway management in China

Author: He, Yuewen, Zhang, Zhengze, Li, Ruogen, Hu, Die, Gao, Huan, Liu, Yurui, Liu, Hao, Feng, Siqi, Liu, Huihui, Zhong, Ming, Li, Yuhui, Wang, Yong, and Ma, Wuhua
Published: 2024
Full Text: View/download PDF

24. DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection

Author: Gao, Huan-ang, Tian, Beiwen, Li, Pengfei, Zhao, Hao, and Zhou, Guyue
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper, we study the problem of semi-supervised 3D object detection, which is of great importance considering the high annotation cost for cluttered 3D indoor scenes. We resort to the robust and principled framework of selfteaching, which has triggered notable progress for semisupervised learning recently. While this paradigm is natural for image-level or pixel-level prediction, adapting it to the detection problem is challenged by the issue of proposal matching. Prior methods are based upon two-stage pipelines, matching heuristically selected proposals generated in the first stage and resulting in spatially sparse training signals. In contrast, we propose the first semisupervised 3D detection algorithm that works in the singlestage manner and allows spatially dense training signals. A fundamental issue of this new design is the quantization error caused by point-to-voxel discretization, which inevitably leads to misalignment between two transformed views in the voxel domain. To this end, we derive and implement closed-form rules that compensate this misalignment onthe-fly. Our results are significant, e.g., promoting ScanNet mAP@0.5 from 35.2% to 48.5% using 20% annotation. Codes and data will be publicly available., Comment: Accepted to ICCV 2023. Code: https://github.com/AIR-DISCOVER/DQS3D
Published: 2023

25. STEPS: Joint Self-supervised Nighttime Image Enhancement and Depth Estimation

Author: Zheng, Yupeng, Zhong, Chengliang, Li, Pengfei, Gao, Huan-ang, Zheng, Yuhang, Jin, Bu, Wang, Ling, Zhao, Hao, Zhou, Guyue, Zhang, Qichao, and Zhao, Dongbin
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Self-supervised depth estimation draws a lot of attention recently as it can promote the 3D sensing capabilities of self-driving vehicles. However, it intrinsically relies upon the photometric consistency assumption, which hardly holds during nighttime. Although various supervised nighttime image enhancement methods have been proposed, their generalization performance in challenging driving scenarios is not satisfactory. To this end, we propose the first method that jointly learns a nighttime image enhancer and a depth estimator, without using ground truth for either task. Our method tightly entangles two self-supervised tasks using a newly proposed uncertain pixel masking strategy. This strategy originates from the observation that nighttime images not only suffer from underexposed regions but also from overexposed regions. By fitting a bridge-shaped curve to the illumination map distribution, both regions are suppressed and two tasks are bridged naturally. We benchmark the method on two established datasets: nuScenes and RobotCar and demonstrate state-of-the-art performance on both of them. Detailed ablations also reveal the mechanism of our proposal. Last but not least, to mitigate the problem of sparse ground truth of existing datasets, we provide a new photo-realistically enhanced nighttime dataset based upon CARLA. It brings meaningful new challenges to the community. Codes, data, and models are available at https://github.com/ucaszyp/STEPS., Comment: Accepted by ICRA 2023, Code: https://github.com/ucaszyp/STEPS
Published: 2023

26. From Semi-supervised to Omni-supervised Room Layout Estimation Using Point Clouds

Author: Gao, Huan-ang, Tian, Beiwen, Li, Pengfei, Chen, Xiaoxue, Zhao, Hao, Zhou, Guyue, Chen, Yurong, and Zha, Hongbin
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Room layout estimation is a long-existing robotic vision task that benefits both environment sensing and motion planning. However, layout estimation using point clouds (PCs) still suffers from data scarcity due to annotation difficulty. As such, we address the semi-supervised setting of this task based upon the idea of model exponential moving averaging. But adapting this scheme to the state-of-the-art (SOTA) solution for PC-based layout estimation is not straightforward. To this end, we define a quad set matching strategy and several consistency losses based upon metrics tailored for layout quads. Besides, we propose a new online pseudo-label harvesting algorithm that decomposes the distribution of a hybrid distance measure between quads and PC into two components. This technique does not need manual threshold selection and intuitively encourages quads to align with reliable layout points. Surprisingly, this framework also works for the fully-supervised setting, achieving a new SOTA on the ScanNet benchmark. Last but not least, we also push the semi-supervised setting to the realistic omni-supervised setting, demonstrating significantly promoted performance on a newly annotated ARKitScenes testing set. Our codes, data and models are released in this repository., Comment: Accepted to ICRA2023. Code: https://github.com/AIR-DISCOVER/Omni-PQ
Published: 2023

27. PS-DeiT: A Part-Selection Based DeiT for Fine-Grained Classification

Author: Gao, Huan, Guo, Yu, Zhao, Tingting, Hu, Zhiqiang, Chen, Yarui, Xie, Ning, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Pan, Yijie, editor, and Zhang, Qinhu, editor
Published: 2024
Full Text: View/download PDF

28. Privacy Preserving Distributed Optimization via Paillier Encryption and Randomness Injection

Author: Cheng, Xinyan, Gao, Huan, Zhi, Yongfeng, Zhang, Shu, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Hua, Yongzhao, editor, Liu, Yishi, editor, and Han, Liang, editor
Published: 2024
Full Text: View/download PDF

29. Modulation of DOM-Induced Head-Twitch Response by mGluR2 Agonist/Inverse Agonist is Associated with 5-HT2AR-Mediated Gs Signaling Pathway

Author: Gao, Huan, Liu, Xiaoqian, Xie, Lulu, Tan, Bo, and Su, Ruibin
Published: 2024
Full Text: View/download PDF

30. Baseline prevalence and longitudinal assessment of autonomic dysfunction in early Parkinson’s disease

Author: Yang, Lanqing, Gao, Huan, and Ye, Min
Published: 2024
Full Text: View/download PDF

31. Dynamics based Privacy Preservation in Decentralized Optimization

Author: Gao, Huan, Wang, Yongqiang, and Nedić, Angelia
Subjects: Mathematics - Optimization and Control
Abstract: With decentralized optimization having increased applications in various domains ranging from machine learning, control, sensor networks, to robotics, its privacy is also receiving increased attention. Existing privacy-preserving approaches for decentralized optimization achieve privacy preservation by patching decentralized optimization with information-technology privacy mechanisms such as differential privacy or homomorphic encryption, which either sacrifices optimization accuracy or incurs heavy computation/communication overhead. We propose an inherently privacy-preserving decentralized optimization algorithm by exploiting the robustness of decentralized optimization to uncertainties in optimization dynamics. More specifically, we present a general decentralized optimization framework, based on which we show that privacy can be enabled in decentralized optimization by adding randomness in optimization parameters. We further show that the added randomness has no influence on the accuracy of optimization, and prove that our inherently privacy-preserving algorithm has $R$-linear convergence when the global objective function is smooth and strongly convex. We also rigorously prove that the proposed algorithm can avoid the gradient of a node from being inferable by other nodes. Numerical simulation results confirm the theoretical predictions.
Published: 2022

32. Correction to: Effects of Polyethylene Microplastics and Natural Sands on the Dispersion of Spilled Oil in the Marine Environment

Author: Yu, Xin Ping, primary, Gao, Huan, additional, An, Ya Ya, additional, Qi, Zhi Xin, additional, and Xiong, De Qi, additional
Published: 2024
Full Text: View/download PDF

33. Cross-Domain Correlation Distillation for Unsupervised Domain Adaptation in Nighttime Semantic Segmentation

Author: Gao, Huan, Guo, Jichang, Wang, Guoli, and Zhang, Qian
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The performance of nighttime semantic segmentation is restricted by the poor illumination and a lack of pixel-wise annotation, which severely limit its application in autonomous driving. Existing works, e.g., using the twilight as the intermediate target domain to perform the adaptation from daytime to nighttime, may fail to cope with the inherent difference between datasets caused by the camera equipment and the urban style. Faced with these two types of domain shifts, i.e., the illumination and the inherent difference of the datasets, we propose a novel domain adaptation framework via cross-domain correlation distillation, called CCDistill. The invariance of illumination or inherent difference between two images is fully explored so as to make up for the lack of labels for nighttime images. Specifically, we extract the content and style knowledge contained in features, calculate the degree of inherent or illumination difference between two images. The domain adaptation is achieved using the invariance of the same kind of difference. Extensive experiments on Dark Zurich and ACDC demonstrate that CCDistill achieves the state-of-the-art performance for nighttime semantic segmentation. Notably, our method is a one-stage domain adaptation network which can avoid affecting the inference time. Our implementation is available at https://github.com/ghuan99/CCDistill.
Published: 2022

34. Effect of glass fiber on the mechanical and thermal insulation performances of kaolinite-based thermal insulator

Author: Feng, Runlong, Yang, Qingwen, Dai, Huixing, Deng, Miao, Wang, Huizhao, Hou, Chunming, Cheng, Zhouyueyang, and Gao, Huan
Published: 2024
Full Text: View/download PDF

35. The alpha2-adrenoceptor agonist clonidine protects against hypoxic-ischemic brain damage in neonatal mice through the Nrf2/NF-κB signaling pathway

Author: Su, Daojing, Gao, Huan, He, Min, Hao, Hu, Liao, Heng, and Zheng, Su
Published: 2024
Full Text: View/download PDF

36. Effect of calcination temperature on the activity of basalt tailings for the lightweight geopolymer from microwave curing

Author: Lei, Shengjun, Gao, Huan, Feng, Runlong, Dai, Huixing, Bernardo, Enrico, Zhang, Haomin, Cheng, Zhouyueyang, Zhang, Xianghui, Deng, Miao, Li, Pingfeng, and Wang, Ling
Published: 2024
Full Text: View/download PDF

37. Ratiometric visualization fluorescence temperature sensing based on dual-lanthanide metal-organic frameworks

Author: He, Tong, Gao, Huan, Diao, Xi-Hui, Muhammad, Yaseen, Chen, Chao, Wang, Hao, Qi, Chuan-Song, Li, Wei, Liu, Na, and Li, Yun-Wu
Published: 2024
Full Text: View/download PDF

38. Ultra-low cost and high-performance paper-based flexible pressure sensor for artificial intelligent E-skin

Author: Chen, Yugang, Wang, Shasha, Liu, Yiren, Deng, Huichan, Gao, Huan, Cao, Mengyu, Zhang, Chong, Cheng, Xiaogang, and Xie, Linghai
Published: 2024
Full Text: View/download PDF

39. Deep learning-assisted design for battery liquid cooling plate with bionic leaf structure considering non-uniform heat generation

Author: Zheng, Aodi, Gao, Huan, Jia, Xiongjie, Cai, Yuhao, Yang, Xiaohu, Zhu, Qiang, and Jiang, Haoran
Published: 2024
Full Text: View/download PDF

40. Enhancement effect of basalt fiber on the foamy kaolinite-based composite thermal insulator

Author: Dai, Huixing, Gao, Huan, Jiang, Biaoxiu, Yang, Qingwen, Li, Xinjuan, Guo, Xiaoping, Cheng, Zhouyueyang, Xiong, Yi, Li, Xiang, Chen, Xiaowen, Wu, Jifeng, and Wang, Ling
Published: 2024
Full Text: View/download PDF

41. Identification of sanguinarine as a novel antagonist for perfluorooctanoate/perfluorooctane sulfonate-induced senescence of hepatocytes: An integrated computational and experimental analysis

Author: Zhang, Xue, Gao, Huan, Chen, Xiaoyu, Liu, Ziqi, Wang, Han, Cui, Mengxing, Li, Yajie, Yu, Yongjiang, Chen, Shen, Xing, Xiumei, Chen, Liping, Li, Daochuan, Zeng, Xiaowen, and Wang, Qing
Published: 2024
Full Text: View/download PDF

42. Effective treatment of traumatic brain injury by injection of a selenium-containing ointment

Author: Hu, Haijun, Gao, Huan, Wang, Kai, Jin, Zeyuan, Zheng, Weiwei, Wang, Qiaoxuan, Yang, Yufang, Yu, Chaonan, Xu, Kedi, and Gao, Changyou
Published: 2024
Full Text: View/download PDF

43. Upcycling of polyethylene/polypropylene mixtures promoted by well-controlled multiblock olefin copolymers

Author: Kang, Yuze, Wang, Hanlin, Meng, Fanmao, Chen, Shangtao, Du, Bin, Jiang, Zihao, Wu, Jiajia, Zhang, Kebin, Gao, Huan, Pan, Li, and Li, Yuesheng
Published: 2024
Full Text: View/download PDF

44. Heteryunine A, an amidated tryptophan-catechin-spiroketal hybrid with antifibrotic activity from Heterosmilax yunnanensis

Author: Du, Rong-Rong, Wang, Ruo-Yu, Zhou, Ji-Chao, Gao, Huan-Huan, Qin, Wen-Jie, Duan, Xiu-Mei, Yang, Ya-Nan, Zhang, Xiao-Wei, and Zhang, Pei-Cheng
Published: 2024
Full Text: View/download PDF

45. The mannose-binding lectin (MBL) gene cloned from Exopalaemon carinicauda plays a key role in resisting infection by Vibrio parahaemolyticus

Author: Shi, Tingting, Gao, Jiayi, Xu, Wanyuan, Liu, Xue, Yan, Binlun, Azra, Mohamad Nor, Baloch, Wazir Ali, Wang, Panpan, and Gao, Huan
Published: 2024
Full Text: View/download PDF

46. Enhancing marine protection: Antibacterial, anti-corrosion, and wettability properties of smart coatings via magnet assisted jet electrodeposition

Author: Yu, Ziyang, Chen, Ya, Song, Yachao, Huang, Dazhi, Gao, Huan, Yang, Feifei, Shen, Lida, and Wang, Dongsheng
Published: 2025
Full Text: View/download PDF

47. A downscaling framework with WRF-UCM and LES/RANS models for urban microclimate simulation strategy: Validation through both measurement and mechanism model

Author: Liu, Jiawen, Gao, Huan, Jia, Ruoyu, Wang, Ran, Han, Dongrui, Liu, Luo, Xu, Xinliang, and Qiao, Zhi
Published: 2025
Full Text: View/download PDF

48. AMPK activation by dietary acadesine improves fillet texture in large yellow croaker (Larimichthys crocea) fed a linseed oil-based diet

Author: Chen, Shengdi, Sun, Zihan, Liu, Ningning, Yang, Chenbin, Li, Na, Li, Lu, Wei, Chaoqing, Yan, Binlun, Gao, Huan, Tan, Peng, and Mu, Hua
Published: 2025
Full Text: View/download PDF

49. Algorithm-Level Confidentiality for Average Consensus on Time-Varying Directed Graphs

Author: Gao, Huan and Wang, Yongqiang
Subjects: Computer Science - Multiagent Systems
Abstract: Average consensus plays a key role in distributed networks, with applications ranging from time synchronization, information fusion, load balancing, to decentralized control. Existing average consensus algorithms require individual agents to exchange explicit state values with their neighbors, which leads to the undesirable disclosure of sensitive information in the state. In this paper, we propose a novel average consensus algorithm for time-varying directed graphs that can protect the confidentiality of a participating agent against other participating agents. The algorithm injects randomness in interaction to obfuscate information on the algorithm-level and can ensure information-theoretic privacy without the assistance of any trusted third party or data aggregator. By leveraging the inherent robustness of consensus dynamics against random variations in interaction, our proposed algorithm can also guarantee the accuracy of average consensus. The algorithm is distinctly different from differential-privacy based average consensus approaches which enable confidentiality through compromising accuracy in obtained consensus value. Numerical simulations confirm the effectiveness and efficiency of our proposed approach., Comment: This paper has been accepted to IEEE Transactions on Network Science and Engineering as a regular paper. Please cite this paper as: Huan Gao and Yongqiang Wang, Algorithm-Level Confidentiality for Average Consensus on Time-Varying Directed Graphs. IEEE Transactions on Network Science and Engineering, doi: 10.1109/TNSE.2022.3140274
Published: 2022

50. Delta of Exopalaemon carinicauda: molecular characterization, expression in different tissues and developmental stages, and its SNPs association analysis with development

Author: Lai, Xiaofang, Ji, Fanyue, Yu, Feifan, Chen, Hao, Shen, Shanrui, and Gao, Huan
Published: 2023
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Region

Database

Publisher

2,074 results on '"Gao, Huan"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources