Author: "Li Kun" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Li Kun"' showing total 19,498 results

101. CLDR: Contrastive Learning Drug Response Models from Natural Language Supervision

Author: Li, Kun and Hu, Wenbin
Subjects: Quantitative Biology - Biomolecules, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Quantitative Biology - Molecular Networks
Abstract: Deep learning-based drug response prediction (DRP) methods can accelerate the drug discovery process and reduce R&D costs. Although the mainstream methods achieve high accuracy in predicting response regression values, the regression-aware representations of these methods are fragmented and fail to capture the continuity of the sample order. This phenomenon leads to models optimized to sub-optimal solution spaces, reducing generalization ability and may result in significant wasted costs in the drug discovery phase. In this paper, we propose CLDR, a contrastive learning framework with natural language supervision for the DRP. CLDR converts regression labels into text, which is merged with the captions text of the drug response as a second modality of the samples compared to the traditional modalities (graph, sequence). In each batch, two modalities of one sample are considered positive pairs and the other pairs are considered negative pairs. At the same time, in order to enhance the continuous representation capability of the numerical text, a common-sense numerical knowledge graph is introduced. We validated several hundred thousand samples from the Genomics of Drug Sensitivity in Cancer dataset, observing the average improvement of the DRP method ranges from 7.8% to 31.4% with the application of our framework. The experiments prove that CLDR effectively constrains the samples to a continuous distribution in the representation space, and achieves impressive prediction performance with only a few epochs of fine-tuning after pre-training. The code is available at: https://gitee.com/xiaoyibang/clipdrug.git.
Comment: 9 pages, 4 figures, 3 tables
Published: 2023

102. Joint2Human: High-quality 3D Human Generation via Compact Spherical Embedding of 3D Joints

Author: Zhang, Muxin, Feng, Qiao, Su, Zhuo, Wen, Chao, Xue, Zhou, and Li, Kun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: 3D human generation is increasingly significant in various applications. However, the direct use of 2D generative methods in 3D generation often results in losing local details, while methods that reconstruct geometry from generated images struggle with global view consistency. In this work, we introduce Joint2Human, a novel method that leverages 2D diffusion models to generate detailed 3D human geometry directly, ensuring both global structure and local details. To achieve this, we employ the Fourier occupancy field (FOF) representation, enabling the direct generation of 3D shapes as preliminary results with 2D generative models. With the proposed high-frequency enhancer and the multi-view recarving strategy, our method can seamlessly integrate the details from different views into a uniform global shape. To better utilize the 3D human prior and enhance control over the generated geometry, we introduce a compact spherical embedding of 3D joints. This allows for an effective guidance of pose during the generation process. Additionally, our method can generate 3D humans guided by textual inputs. Our experimental results demonstrate the capability of our method to ensure global structure, local details, high resolution, and low computational cost simultaneously. More results and the code can be found on our project page at http://cic.tju.edu.cn/faculty/likun/projects/Joint2Human.
Published: 2023

103. R2Human: Real-Time 3D Human Appearance Rendering from a Single Image

Author: Yang, Yuanwang, Feng, Qiao, Lai, Yu-Kun, and Li, Kun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Rendering 3D human appearance from a single image in real-time is crucial for achieving holographic communication and immersive VR/AR. Existing methods either rely on multi-camera setups or are constrained to offline operations. In this paper, we propose R2Human, the first approach for real-time inference and rendering of photorealistic 3D human appearance from a single image. The core of our approach is to combine the strengths of implicit texture fields and explicit neural rendering with our novel representation, namely Z-map. Based on this, we present an end-to-end network that performs high-fidelity color reconstruction of visible areas and provides reliable color inference for occluded regions. To further enhance the 3D perception ability of our network, we leverage the Fourier occupancy field as a prior for generating the texture field and providing a sampling surface in the rendering stage. We also propose a consistency loss and a spatial fusion strategy to ensure the multi-view coherence. Experimental results show that our method outperforms the state-of-the-art methods on both synthetic data and challenging real-world images, in real-time. The project page can be found at http://cic.tju.edu.cn/faculty/likun/projects/R2Human.
Published: 2023

104. Layered 3D Human Generation via Semantic-Aware Diffusion Model

Author: Wang, Yi, Ma, Jian, Shao, Ruizhi, Feng, Qiao, Lai, Yu-Kun, Liu, Yebin, and Li, Kun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The generation of 3D clothed humans has attracted increasing attention in recent years. However, existing work cannot generate layered high-quality 3D humans with consistent body structures. As a result, these methods are unable to arbitrarily and separately change and edit the body and clothing of the human. In this paper, we propose a text-driven layered 3D human generation framework based on a novel physically-decoupled semantic-aware diffusion model. To keep the generated clothing consistent with the target text, we propose a semantic-confidence strategy for clothing that can eliminate the non-clothing content generated by the model. To match the clothing with different body shapes, we propose a SMPL-driven implicit field deformation network that enables the free transfer and reuse of clothing. Besides, we introduce uniform shape priors based on the SMPL model for body and clothing, respectively, which generates more diverse 3D content without being constrained by specific templates. The experimental results demonstrate that the proposed method not only generates 3D humans with consistent body structures but also allows free editing in a layered manner. The source code will be made public.
Comment: Error in the derivation of equation 11 in section 4.3.1
Published: 2023

105. EulerMormer: Robust Eulerian Motion Magnification via Dynamic Filtering within Transformer

Author: Wang, Fei, Guo, Dan, Li, Kun, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Video Motion Magnification (VMM) aims to break the resolution limit of human visual perception capability and reveal the imperceptible minor motion that contains valuable information in the macroscopic domain. However, challenges arise in this task due to photon noise inevitably introduced by photographic devices and spatial inconsistency in amplification, leading to flickering artifacts in static fields and motion blur and distortion in dynamic fields in the video. Existing methods focus on explicit motion modeling without emphasizing prioritized denoising during the motion magnification process. This paper proposes a novel dynamic filtering strategy to achieve static-dynamic field adaptive denoising. Specifically, based on Eulerian theory, we separate texture and shape to extract motion representation through inter-frame shape differences, expecting to leverage these subdivided features to solve this task finely. Then, we introduce a novel dynamic filter that eliminates noise cues and preserves critical features in the motion magnification and amplification generation phases. Overall, our unified framework, EulerMormer, is a pioneering effort to equip learning-based VMM with a Transformer. The core of the dynamic filter lies in a global dynamic sparse cross-covariance attention mechanism that explicitly removes noise while preserving vital information, coupled with a multi-scale dual-path gating mechanism that selectively regulates the dependence on different frequency features to reduce spatial attenuation and complement motion boundaries. Extensive experiments demonstrate that EulerMormer achieves more robust video motion magnification from the Eulerian perspective, significantly outperforming state-of-the-art methods. The source code is available at https://github.com/VUT-HFUT/EulerMormer.
Published: 2023

106. Solving Motion Planning Tasks with a Scalable Generative Model

Author: Hu, Yihan, Chai, Siqi, Yang, Zhening, Qian, Jingyu, Li, Kun, Shao, Wenxin, Zhang, Haichao, Xu, Wei, and Liu, Qiang
Editor: Leonardis, Aleš, Ricci, Elisa, Roth, Stefan, Russakovsky, Olga, Sattler, Torsten, and Varol, Gül
Series: Goos, Gerhard (Series Editor), Hartmanis, Juris (Founding Editor), Bertino, Elisa, Gao, Wen, Steffen, Bernhard, and Yung, Moti (Editorial Board Members)
Published: 2025
Full Text: View/download PDF

107. Physical Education Course Based on Self-Regulated Learning to Improve Students' Physical Literacy

Author: Li, Kun, Onyon, Nitikorn, Choichareon, Thapana, and Charoontham, Orasa
Abstract: Background and Aim: This study adopts the teaching mode of self-regulated learning to change traditional physical education teaching methods, so as to improve the physical literacy of college students. The components of college students' physical literacy include physical cognition, physical experience, physical ability and sports behavior. The purpose of this study was to compare students' physical literacy scores before and after taking a physical education course based on self-regulated learning. Materials and Methods: In this experimental study, 40 students (one class) from Zhoukou Normal University were investigated. This study used tests and self-evaluation forms given to students before and after the physical education course based on self-regulated learning. Data were collected and analyzed using means, standard deviations and a dependent-samples t-test. Results: The mean pre-test score of students' physical literacy was 188.88 (SD 25.45), and the mean post-test score was 230.53 (SD 16.79). The mean post-test score was higher than the mean pre-test score, and the difference was statistically significant (t = 13.64, p < 0.05). Conclusion: The developed physical education course based on self-regulated learning has a significant effect on improving the physical literacy of college students.
Published: 2023

108. Comparative efficacy of different noninvasive brain stimulation protocols on upper-extremity motor function and activities of daily living after stroke: a systematic review and network meta-analysis

Author: Li, Ling-Ling, Wu, Jia-Jia, Li, Kun-Peng, Jin, Jing, Xiang, Yun-Ting, Hua, Xu-Yun, Zheng, Mou-Xiong, and Xu, Jian-Guang
Published: 2024
Full Text: View/download PDF

109. Simulation of novel CFTS solar cells with SCAPS-1D software

Author: Zhou, Chenliang, Chen, Wei, Chen, Zhili, Cheng, Xiangyu, Zhang, Yunxiang, Sun, Gongyi, Li, Kun, Liu, Zhaohui, Shi, Lin, Wang, Zhongjie, Liu, Wei, and Zhang, Qinfang
Published: 2024
Full Text: View/download PDF

110. Numerical Simulation of a Three-Dimensional Strongly Magnetic Material Tensor in Wave Vector Domain

Author: Li, Kun, Shi, Hui, Wu, Yu-cheng, Kang, Chen, and Zhao-Dongdong
Published: 2024
Full Text: View/download PDF

111. Microbiome analysis reveals alteration in water microbial communities due to livestock activities

Author: Xu, Chang, Lu, Sijia, Cidan, Yangji, Wang, Hongzhuang, Sun, Guangming, Saleem, Muhammad Usman, Ataya, Farid Shokry, Zhu, Yanbin, Wangdui-Basang, and Li, Kun
Published: 2024
Full Text: View/download PDF

112. Enhanced phytoremediation of 2,4-DNP-contaminated wastewater by Salix matsudana Koidz with MeJA pretreatment and associated mechanism

Author: Li, Kun, Ji, Chao, Fu, Guilong, Chen, Yu, Tian, Huimei, Yao, Qi, Li, Chuanrong, and Xie, Huicheng
Published: 2024
Full Text: View/download PDF

113. Oscillation properties of eigenfunctions for Sturm-Liouville problems with interface conditions via Prüfer transformation

Author: Li, Zhi-yu, Li, Kun, Cai, Jin-ming, Qin, Jian-fang, and Zheng, Zhao-wen
Published: 2024
Full Text: View/download PDF

114. Chinese Medicine Prolongs Overall Survival of Chinese Patients with Advanced Gastric Cancer: Treatment Pattern and Survival Analysis of a 20-Year Real-World Study

Author: Cao, Ni-da, Zhu, Xiao-hong, Ma, Fang-qi, Xu, Yan, Dong, Jia-huan, Qin, Meng-meng, Liu, Tian-shu, Zhu, Chun-chao, Guo, Wei-jian, Ding, Hong-hua, Guo, Yuan-biao, Liu, Li-kun, Song, Jin-jie, Wu, Ji-ping, Cheng, Yue-lei, Zeng, Lin, and Zhao, Ai-guang
Published: 2024
Full Text: View/download PDF

115. Persistence of peripheral CD8+CD28− T cells indicates a favourable outcome and tumour immunity in first-line HER2-positive metastatic breast cancer

Author: Liu, Xiaoran, Cheng, Xiangming, Xie, Feng, Li, Kun, Shi, Yongcan, Shao, Bin, Liang, Xu, Wan, Fengling, Jia, Shidong, Zhang, Yue, Liu, Yiqiang, and Li, Huiping
Published: 2024
Full Text: View/download PDF

116. The protective effects of Ninjin’yoeito against liver steatosis/fibrosis in a non-alcoholic steatohepatitis model mouse

Author: Takano, Kyohei, Kaneda, Marisa, Aoki, Yayoi, Fujita, Nina, Chiba, Shigeki, Michihara, Seiwa, Han, Li-Kun, and Takahashi, Ryuji
Published: 2024
Full Text: View/download PDF

117. SpeechAct: Towards Generating Whole-body Motion from Speech

Author: Zhang, Jinsong, Zhu, Minjie, Zhang, Yuxiang, Liu, Yebin, and Li, Kun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper addresses the problem of generating whole-body motion from speech. Despite great successes, prior methods still struggle to produce reasonable and diverse whole-body motions from speech. This is due to their reliance on suboptimal representations and a lack of strategies for generating diverse results. To address these challenges, we present a novel hybrid point representation to achieve accurate and continuous motion generation, e.g., avoiding foot skating, and this representation can be transformed into an easy-to-use representation, i.e., SMPL-X body mesh, for many applications. To generate whole-body motion from speech, for facial motion, closely tied to the audio signal, we introduce an encoder-decoder architecture to achieve deterministic outcomes. However, for the body and hands, which have weaker connections to the audio signal, we aim to generate diverse yet reasonable motions. To boost diversity in motion generation, we propose a contrastive motion learning method to encourage the model to produce more distinctive representations. Specifically, we design a robust VQ-VAE to learn a quantized motion codebook using our hybrid representation. Then, we regress the motion representation from the audio signal by a translation model employing our contrastive motion learning method. Experimental results validate the superior performance and the correctness of our model. The project page is available for research purposes at http://cic.tju.edu.cn/faculty/likun/projects/SpeechAct.
Comment: The paper has been archived without permission from the newly added author
Published: 2023

118. High-Quality Animatable Dynamic Garment Reconstruction from Monocular Videos

Author: Li, Xiongzheng, Zhang, Jinsong, Lai, Yu-Kun, Yang, Jingyu, and Li, Kun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Much progress has been made in reconstructing garments from an image or a video. However, none of the existing works meets the expectations of digitizing high-quality animatable dynamic garments that can be adjusted to various unseen poses. In this paper, we propose the first method to recover high-quality animatable dynamic garments from monocular videos without depending on scanned data. To generate reasonable deformations for various unseen poses, we propose a learnable garment deformation network that formulates the garment reconstruction task as a pose-driven deformation problem. To alleviate the ambiguity in estimating 3D garments from monocular videos, we design a multi-hypothesis deformation module that learns spatial representations of multiple plausible deformations. Experimental results on several public datasets demonstrate that our method can reconstruct high-quality dynamic garments with coherent surface details, which can be easily animated under unseen poses. The code will be provided for research purposes.
Published: 2023

119. Towards Grouping in Large Scenes with Occlusion-aware Spatio-temporal Transformers

Author: Zhang, Jinsong, Gu, Lingfeng, Lai, Yu-Kun, Wang, Xueyang, and Li, Kun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Group detection, especially for large-scale scenes, has many potential applications for public safety and smart cities. Existing methods fail to cope with frequent occlusions in large-scale scenes with multiple people, and have difficulty effectively utilizing spatio-temporal information. In this paper, we propose an end-to-end framework, GroupTransformer, for group detection in large-scale scenes. To deal with the frequent occlusions caused by multiple people, we design an occlusion encoder to detect and suppress severely occluded person crops. To explore the potential spatio-temporal relationship, we propose spatio-temporal transformers to simultaneously extract trajectory information and fuse inter-person features in a hierarchical manner. Experimental results on both large-scale and small-scale scenes demonstrate that our method achieves better performance compared with state-of-the-art methods. On large-scale scenes, our method significantly boosts the performance in terms of precision and F1 score by more than 10%. On small-scale scenes, our method still improves the performance of F1 score by more than 5%. The project page with code can be found at http://cic.tju.edu.cn/faculty/likun/projects/GroupTrans.
Comment: 11 pages, 5 figures
Published: 2023

120. CT-GAT: Cross-Task Generative Adversarial Attack based on Transferability

Author: Lv, Minxuan, Dai, Chengwei, Li, Kun, Zhou, Wei, and Hu, Songlin
Subjects: Computer Science - Computation and Language
Abstract: Neural network models are vulnerable to adversarial examples, and adversarial transferability further increases the risk of adversarial attacks. Current methods based on transferability often rely on substitute models, which can be impractical and costly in real-world scenarios due to the unavailability of training data and the victim model's structural details. In this paper, we propose a novel approach that directly constructs adversarial examples by extracting transferable features across various tasks. Our key insight is that adversarial transferability can extend across different tasks. Specifically, we train a sequence-to-sequence generative model named CT-GAT using adversarial sample data collected from multiple tasks to acquire universal adversarial features and generate adversarial examples for different tasks. We conduct experiments on ten distinct datasets, and the results demonstrate that our method achieves superior attack performance with small cost.
Comment: Accepted to EMNLP 2023 main conference. Corrected the header error in Figure 3.
Published: 2023

121. MeaeQ: Mount Model Extraction Attacks with Efficient Queries

Author: Dai, Chengwei, Lv, Minxuan, Li, Kun, and Zhou, Wei
Subjects: Computer Science - Computation and Language
Abstract: We study model extraction attacks in natural language processing (NLP) where attackers aim to steal victim models by repeatedly querying the open Application Programming Interfaces (APIs). Recent works focus on limited-query budget settings and adopt random sampling or active learning-based sampling strategies on publicly available, unannotated data sources. However, these methods often result in selected queries that lack task relevance and data diversity, leading to limited success in achieving satisfactory results with low query costs. In this paper, we propose MeaeQ (Model extraction attack with efficient Queries), a straightforward yet effective method to address these issues. Specifically, we initially utilize a zero-shot sequence inference classifier, combined with API service information, to filter task-relevant data from a public text corpus instead of a problem domain-specific dataset. Furthermore, we employ a clustering-based data reduction technique to obtain representative data as queries for the attack. Extensive experiments conducted on four benchmark datasets demonstrate that MeaeQ achieves higher functional similarity to the victim model than baselines while requiring fewer queries. Our code is available at https://github.com/C-W-D/MeaeQ.
Comment: Accepted by EMNLP 2023 main conference
Published: 2023

122. Spatial Crowdsourcing Task Allocation Scheme for Massive Data with Spatial Heterogeneity

Author: Li, Kun, Wang, Shengling, Shi, Hongwei, Cheng, Xiuzhen, and Xu, Minghui
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Spatial crowdsourcing (SC) engages large worker pools for location-based tasks, attracting growing research interest. However, prior SC task allocation approaches exhibit limitations in computational efficiency, balanced matching, and participation incentives. To address these challenges, we propose a graph-based allocation framework optimized for massive heterogeneous spatial data. The framework first clusters similar tasks and workers separately to reduce allocation scale. Next, it constructs novel non-crossing graph structures to model balanced adjacencies between unevenly distributed tasks and workers. Based on the graphs, a bidirectional worker-task matching scheme is designed to produce allocations optimized for mutual interests. Extensive experiments on real-world datasets analyze the performance under various parameter settings.
Published: 2023

123. Transformer-based Multimodal Change Detection with Multitask Consistency Constraints

Author: Liu, Biyuan, Chen, Huaixin, Li, Kun, and Yang, Michael Ying
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Change detection plays a fundamental role in Earth observation for analyzing temporal iterations over time. However, recent studies have largely neglected the utilization of multimodal data that presents significant practical and technical advantages compared to single-modal approaches. This research focuses on leveraging pre-event digital surface model (DSM) data and post-event digital aerial images captured at different times for detecting change beyond 2D. We observe that the current change detection methods struggle with the multitask conflicts between semantic and height change detection tasks. To address this challenge, we propose an efficient Transformer-based network that learns shared representation between cross-dimensional inputs through cross-attention. It adopts a consistency constraint to establish the multimodal relationship. Initially, pseudo-changes are derived by employing height change thresholding. Subsequently, the L2 distance between semantic and pseudo-changes within their overlapping regions is minimized. This explicitly endows the height change detection (regression task) and semantic change detection (classification task) with representation consistency. A DSM-to-image multimodal dataset encompassing three cities in the Netherlands was constructed. It lays a new foundation for beyond-2D change detection from cross-dimensional inputs. Compared to five state-of-the-art change detection methods, our model demonstrates consistent multitask superiority in terms of semantic and height change detection. Furthermore, the consistency strategy can be seamlessly adapted to the other methods, yielding promising improvements.
Published: 2023
Full Text: View/download PDF

124. A New Transformation Approach for Uplift Modeling with Binary Outcome

Author: Li, Kun, Tian, Jiang, and Xiang, Xiaojia
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Uplift modeling has been used effectively in fields such as marketing and customer retention, to target those customers who are more likely to respond due to the campaign or treatment. Essentially, it is a machine learning technique that predicts the gain from performing some action with respect to not taking it. A popular class of uplift models is the transformation approach that redefines the target variable with the original treatment indicator. These transformation approaches only need to train and predict the difference in outcomes directly. The main drawback of these approaches is that in general they do not use the information in the treatment indicator beyond the construction of the transformed outcome and usually are not efficient. In this paper, we design a novel transformed outcome for the case of the binary target variable and unlock the full value of the samples with zero outcome. From a practical perspective, our new approach is flexible and easy to use. Experimental results on synthetic and real-world datasets clearly show that our new approach outperforms the traditional one. At present, our new approach has already been applied to precision marketing in a nationwide financial holdings group in China.
Published: 2023

125. Zero-shot Learning of Drug Response Prediction for Preclinical Drug Screening

Author: Li, Kun, Luo, Yong, Cai, Xiantao, Hu, Wenbin, and Du, Bo
Subjects: Quantitative Biology - Biomolecules, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Quantitative Biology - Cell Behavior, Quantitative Biology - Genomics
Abstract: Conventional deep learning methods typically employ supervised learning for drug response prediction (DRP). This entails dependence on labeled response data from drugs for model training. However, practical applications in the preclinical drug screening phase demand that DRP models predict responses for novel compounds, often with unknown drug responses. This presents a challenge, rendering supervised deep learning methods unsuitable for such scenarios. In this paper, we propose a zero-shot learning solution for the DRP task in preclinical drug screening. Specifically, we propose a Multi-branch Multi-Source Domain Adaptation Test Enhancement Plug-in, called MSDA. MSDA can be seamlessly integrated with conventional DRP methods, learning invariant features from the prior response data of similar drugs to enhance real-time predictions of unlabeled compounds. We conducted experiments using the GDSCv2 and CellMiner datasets. The results demonstrate that MSDA efficiently predicts drug responses for novel compounds, leading to a general performance improvement of 5-10% in the preclinical drug screening phase. The significance of this solution resides in its potential to accelerate the drug discovery process, improve drug candidate assessment, and facilitate the success of drug discovery.
Comment: 16 pages, 3 figures, 3 tables
Published: 2023

126. Dual-Path Temporal Map Optimization for Make-up Temporal Video Grounding

Author: Li, Jiaxiu, Li, Kun, Li, Jia, Chen, Guoliang, Guo, Dan, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: Make-up temporal video grounding (MTVG) aims to localize the target video segment which is semantically related to a sentence describing a make-up activity, given a long video. Compared with the general video grounding task, MTVG focuses on meticulous actions and changes on the face. The make-up instruction step, usually involving detailed differences in products and facial areas, is more fine-grained than general activities (e.g., cooking activity and furniture assembly). Thus, existing general approaches cannot locate the target activity effectively. More specifically, existing proposal generation modules are not yet fully developed in providing semantic cues for the more fine-grained make-up semantic comprehension. To tackle this issue, we propose an effective proposal-based framework named Dual-Path Temporal Map Optimization Network (DPTMO) to capture fine-grained multimodal semantic details of make-up activities. DPTMO extracts both query-agnostic and query-guided features to construct two proposal sets and uses specific evaluation methods for the two sets. Different from the commonly used single structure in previous methods, our dual-path structure can mine more semantic information in make-up videos and distinguish fine-grained actions well. These two candidate sets represent the cross-modal makeup video-text similarity and multi-modal fusion relationship, complementing each other. Each set corresponds to its respective optimization perspective, and their joint prediction enhances the accuracy of video timestamp prediction. Comprehensive experiments on the YouMakeup dataset demonstrate our proposed dual structure excels in fine-grained semantic comprehension.
Published: 2023

127. FusionFormer: A Multi-sensory Fusion in Bird's-Eye-View and Temporal Consistent Transformer for 3D Object Detection

Author: Hu, Chunyong, Zheng, Hang, Li, Kun, Xu, Jianyun, Mao, Weibo, Luo, Maochun, Wang, Lingxuan, Chen, Mingxia, Peng, Qihao, Liu, Kaixuan, Zhao, Yiru, Hao, Peihan, Liu, Minzhe, and Yu, Kaicheng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Multi-sensor modal fusion has demonstrated strong advantages in 3D object detection tasks. However, existing methods that fuse multi-modal features require transforming features into the bird's eye view space and may lose certain information on Z-axis, thus leading to inferior performance. To this end, we propose a novel end-to-end multi-modal fusion transformer-based framework, dubbed FusionFormer, that incorporates deformable attention and residual structures within the fusion encoding module. Specifically, by developing a uniform sampling strategy, our method can easily sample from 2D image and 3D voxel features spontaneously, thus exploiting flexible adaptability and avoiding explicit transformation to the bird's eye view space during the feature concatenation process. We further implement a residual structure in our feature encoder to ensure the model's robustness in case of missing an input modality. Through extensive experiments on a popular autonomous driving benchmark dataset, nuScenes, our method achieves state-of-the-art single model performance of 72.6% mAP and 75.1% NDS in the 3D object detection task without test time augmentation.
Published: 2023

128. The Floquet Fermi Liquid

Author: Shi, Li-kun, Matsyshyn, Oles, Song, Justin C. W., and Villadiego, Inti Sodemann
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: We demonstrate the existence of a non-equilibrium "Floquet Fermi Liquid" state arising in partially filled Floquet Bloch bands weakly coupled to ideal fermionic baths, which possess a collection of "Floquet Fermi surfaces" enclosed inside each other, resembling matryoshka dolls. We elucidate several properties of these states, including their quantum oscillations under magnetic fields which feature slow beating patterns of their amplitude reflecting the different areas of the Floquet Fermi surfaces, consistent with those observed in microwave induced resistance oscillation experiments. We also investigate their specific heat and thermodynamic density of states and demonstrate how by controlling properties of the drive, such as its frequency, one can tune some of the Floquet Fermi surfaces towards non-equilibrium van-Hove singularities without changing the electron density.
Comment: 17 pages, 3 figures
Published: 2023
Full Text: View/download PDF

129. Rapid droplet leads the Liquid-Infused Slippery Surfaces more slippery

Author: Li, Kun, Lv, Cunjing, and Feng, Xi-Qiao
Subjects: Physics - Fluid Dynamics, Condensed Matter - Soft Condensed Matter
Abstract: The introduction of lubricant between fluid and substrate endows the Liquid-Infused Slippery Surfaces with excellent wetting properties: low contact angle, various liquids repellency, ice-phobic and self-healing. Droplets moving on such surfaces have been widely demonstrated to obey a Landau-Levich-Derjaguin (LLD) friction. Here, we show that this power law is surprisingly decreased with the droplet accelerates: in the rapid droplet regime, the slippery surfaces seem more slippery than LLD friction. Combining experimental and numerical techniques, we find that the meniscus surrounding the droplet exhibits an incompletely developed state. The Incompletely Developed Meniscus possesses shorter shear length and thicker shear thickness than the prediction of Bretherton model and therefore is responsible for the more slippery regime. With an extended Bretherton model, we not only provide an analytical description to the IDM behavior but also the friction when the Capillary Number of the moving droplet is larger than the Critical Capillary Number.
Published: 2023

130. Research on tor-based anonymous DDoS attack

Author: Wang Rui, Yang Zhiye, Li Kun, Chen Chen, and Chen Yanru
Subjects: Engineering (General). Civil engineering (General), TA1-2040
Abstract: A Tor-based anonymous DDoS attack is proposed; the attacker is very difficult to trace because of Tor's strong anonymity. Experiments in different network environments, together with a security analysis, demonstrate the effectiveness of the attack, and a strategy for defending against it is also discussed.
Published: 2021
Full Text: View/download PDF

131. Analysis on carbon emission reduction path and expected effect of inland river ships in China

Author: Li Kun, Liu Wendi, Jiao Fangfang, and Ji Yongbo
Subjects: Environmental sciences, GE1-350
Abstract: In order to systematically study the carbon emission reduction path suitable for China's inland river ships, this paper analyzes the actual situation of China's inland river ships ' operation scale, fleet structure, fuel consumption, etc., and combines the development of domestic and foreign carbon emission reduction technologies and the ship's own operating characteristics. Carbon emission reduction technology paths for China's inland river ships have been proposed in this paper, which includes different tonnage classes, different ship ages, and different types of river ships. Also, the expected application effects of each technology path are analyzed in depth in this paper. The research results can give a certain support for the implementation of carbon emission reduction in China's shipping industry.
Published: 2021
Full Text: View/download PDF

132. Study on the application of natural gas to inland and coastal ships in China

Author: Li Kun, Li Xiang, and Zhang Jianghe
Subjects: ship, liquified natural gas, newly build, reconstruction, lng, Environmental sciences, GE1-350
Abstract: To promote the green, low-carbon development of ships and energy conservation and emission reduction in shipping, this paper systematically reviews the development of liquefied natural gas (LNG) powered ships among China's inland ships and analyzes in depth the development characteristics of China's newly built and reconstructed LNG powered ships, so as to provide reference experience and lay a foundation for China to continue promoting the application of LNG in the water transportation industry.
Published: 2021
Full Text: View/download PDF

133. In-beam gamma rays of CSNS Back-n characterized by black resonance filter

Author: Wang, Jin-Cheng, Ren, Jie, Jiang, Wei, Ruan, Xi-Chao, Liu, Ying-Yi, Yang, Hao-Lan, Xu, Kuo-Zhi, Pan, Xin-Yi, Sun, Qi, Bao, Jie, Huang, Han-Xiong, Bai, Hao-Fan, Bai, Jiang-Bo, Cao, Ping, Chen, Qi-Ping, Chen, Yong-Hao, Duan, Wen-Hao, Fan, An-Chuan, Fan, Rui-Rui, Feng, Chang-Qing, Gu, Min-Hao, Han, Chang-Cai, Han, Zi-Jie, He, Guo-Zhu, He, Yong-Cheng, Hong, Yang, Hu, Yi-Wei, Jiang, Zhi-Jie, Kang, Ling, Lan, Chang-Lin, Li, Bo, Li, Feng, Li, Qiang, Li, Xiao, Li, Yang, Liu, Jie, Liu, Rong, Liu, Shu-Bin, Liu, Yi-Na, Luan, Guang-Yuan, Ning, Chang-Jun, Qiu, Yi-Jia, Ren, Wen-Kai, Ren, Zhi-Zhou, Song, Zhao-Hui, Sun, Kang, Tan, Zhi-Xin, Tang, Jing-Yu, Tang, Sheng-Da, Wang, Li-Jiao, Wang, Peng-Cheng, Wang, Zhao-Hui, Wen, Zhong-Wei, Wu, Xiao-Guang, Wu, Xuan, Wu, Ze-Peng, Xia, Cong, Xie, Li-Kun, Yi, Han, Yu, Tao, Yu, Yong-Ji, Zhang, Guo-Hui, Zhang, Hang-Chang, Zhang, Qi-Wei, Zhang, Xian-Peng, Zhang, Yu-Liang, Zhang, Zhi-Yong, Zhao, Mao-Yuan, Zhou, Zhi-Hao, Zhu, Ke-Jun, and Zou, Chong
Published: 2024
Full Text: View/download PDF

134. Multi-analytical study of Ming Dynasty Xianying Temple building mortar in Shaanxi, China

Author: Li, Xiangyong, Wei, Guofeng, Zhang, Yaxu, Yu, Chunlei, and Li, Kun
Published: 2024
Full Text: View/download PDF

135. Multiple layered PVDF-CNTs foams with gradient structure and high electromagnetic shielding performance

Author: Si, Yongbo, Li, Kun, Ding, Zihao, Zhang, Shixun, Zhang, Xiaoli, Liao, Xia, Yang, Yang, Guo, Xiaoqin, and Chen, Jingbo
Published: 2024
Full Text: View/download PDF

136. MICROSTRUCTURE AND MECHANICAL PROPERTIES OF ULTRAFINE WC/Co CEMENTED CARBIDES WITH CUBIC BORON NITRIDE AND Cr₃C₂ ADDITIONS

Author: Genrong Zhang, Haiyan Chen, Dong Lihua, Yin, and Li Kun
Subjects: Crystal inhibitor, Mechanical properties, Ultrafine microstructure, Cemented carbide, Tungsten carbide, Chromium carbide, Cubic boron nitride, Clay industries. Ceramics. Glass, TP785-869
Abstract: This study investigates the microstructure and mechanical properties of ultrafine tungsten carbide and cobalt (WC/Co) cemented carbides with cubic boron nitride (CBN) and chromium carbide (Cr₃C₂) fabricated by a hot pressing sintering process. This study uses samples with 8 wt% Co content and 7.5 vol% CBN content, and with different Cr₃C₂ content ranging from 0 to 0.30 wt%. Based on the experimental results, Cr₃C₂ content has a significant influence on inhibiting abnormal grain growth and decreasing grain size in cemented carbides. Near-full densification is possible when CBN-WC/Co with 0.25 wt% Cr₃C₂ is sintered at 1350°C and 20 MPa; the resulting material possesses optimal mechanical properties and density, with an acceptable Vickers hardness of 19.20 GPa, fracture toughness of 8.47 MPa·m^1/2 and flexural strength of 564 MPa.
Published: 2016
Full Text: View/download PDF

137. Membrane–based separation and concentration of total flavone glycosides from Desmodium styracifolium

Author: Yang Kun, Guo Ze-Bin, and Li Kun-Ping
Subjects: Environmental sciences, GE1-350
Abstract: Desmodium styracifolium is one of the traditional Chinese herbs. In the present study, membrane-based technologies were used to separate and concentrate the total flavone glycosides fraction from D. styracifolium. The extracts flowed through an ultrafiltration membrane with an MWCO (molecular weight cut-off) of 30 kDa, and the permeate was concentrated by a nanofiltration membrane with an MWCO of 1 kDa. The solid content of the membrane-concentrated extracts of D. styracifolium (MEDs) was 26.5 mg/ml. Moreover, the contents of vicenin-2, schaftoside and isovitexin in MEDs were 4.88 %, 9.76 % and 1.89 %, respectively. The in vitro assay showed that MEDs has a better anti-inflammatory effect, which partly proved that our membrane-based processes for separation and concentration of flavone glycosides from D. styracifolium are reliable and practicable.
Published: 2020
Full Text: View/download PDF

138. Exploiting Diverse Feature for Multimodal Sentiment Analysis

Author: Li, Jia, Qian, Wei, Li, Kun, Li, Qi, Guo, Dan, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: In this paper, we present our solution to the MuSe-Personalisation sub-challenge in the MuSe 2023 Multimodal Sentiment Analysis Challenge. The task of MuSe-Personalisation aims to predict the continuous arousal and valence values of a participant based on their audio-visual, language, and physiological signal modalities data. Considering different people have personal characteristics, the main challenge of this task is how to build robustness feature presentation for sentiment prediction. To address this issue, we propose exploiting diverse features. Specifically, we proposed a series of feature extraction methods to build a robust representation and model ensemble. We empirically evaluate the performance of the utilized method on the officially provided dataset. As a result, we achieved 3rd place in the MuSe-Personalisation sub-challenge. Specifically, we achieve the results of 0.8492 and 0.8439 for MuSe-Personalisation in terms of arousal and valence CCC.
Published: 2023

139. Dual-path TokenLearner for Remote Photoplethysmography-based Physiological Measurement with Facial Videos

Author: Qian, Wei, Guo, Dan, Li, Kun, Tian, Xilan, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Remote photoplethysmography (rPPG) based physiological measurement is an emerging yet crucial vision task, whose challenge lies in exploring accurate rPPG prediction from facial videos accompanied by noises of illumination variations, facial occlusions, head movements, etc., in a non-contact manner. Existing mainstream CNN-based models make efforts to detect physiological signals by capturing subtle color changes in facial regions of interest (ROI) caused by heartbeats. However, such models are constrained by the limited local spatial or temporal receptive fields in the neural units. Unlike them, a native Transformer-based framework called Dual-path TokenLearner (Dual-TL) is proposed in this paper, which utilizes the concept of learnable tokens to integrate both spatial and temporal informative contexts from the global perspective of the video. Specifically, the proposed Dual-TL uses a Spatial TokenLearner (S-TL) to explore associations in different facial ROIs, which promises the rPPG prediction far away from noisy ROI disturbances. Complementarily, a Temporal TokenLearner (T-TL) is designed to infer the quasi-periodic pattern of heartbeats, which eliminates temporal disturbances such as head movements. The two TokenLearners, S-TL and T-TL, are executed in a dual-path mode. This enables the model to reduce noise disturbances for final rPPG signal prediction. Extensive experiments on four physiological measurement benchmark datasets are conducted. The Dual-TL achieves state-of-the-art performances in both intra- and cross-dataset testings, demonstrating its immense potential as a basic backbone for rPPG measurement. The source code is available at https://github.com/VUT-HFUT/Dual-TL.
Published: 2023

140. ViGT: Proposal-free Video Grounding with Learnable Token in Transformer

Author: Li, Kun, Guo, Dan, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: The video grounding (VG) task aims to locate the queried action or event in an untrimmed video based on rich linguistic descriptions. Existing proposal-free methods are trapped in complex interaction between video and query, overemphasizing cross-modal feature fusion and feature correlation for VG. In this paper, we propose a novel boundary regression paradigm that performs regression token learning in a transformer. Particularly, we present a simple but effective proposal-free framework, namely Video Grounding Transformer (ViGT), which predicts the temporal boundary using a learnable regression token rather than multi-modal or cross-modal features. In ViGT, the benefits of a learnable token are manifested as follows. (1) The token is unrelated to the video or the query and avoids data bias toward the original video and query. (2) The token simultaneously performs global context aggregation from video and query features. First, we employed a sharing feature encoder to project both video and query into a joint feature space before performing cross-modal co-attention (i.e., video-to-query attention and query-to-video attention) to highlight discriminative features in each modality. Furthermore, we concatenated a learnable regression token [REG] with the video and query features as the input of a vision-language transformer. Finally, we utilized the token [REG] to predict the target moment and visual features to constrain the foreground and background probabilities at each timestamp. The proposed ViGT performed well on three public datasets: ANet Captions, TACoS and YouCookII. Extensive ablation studies and qualitative analysis further validated the interpretability of ViGT., Comment: This paper has been accepted by SCIENCE CHINA Information Sciences
Published: 2023
Full Text: View/download PDF

141. Data Augmentation for Human Behavior Analysis in Multi-Person Conversations

Author: Li, Kun, Guo, Dan, Chen, Guoliang, Liu, Feiyang, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper, we present the solution of our team HFUT-VUT for the MultiMediate Grand Challenge 2023 at ACM Multimedia 2023. The solution covers three sub-challenges: bodily behavior recognition, eye contact detection, and next speaker prediction. We select Swin Transformer as the baseline and exploit data augmentation strategies to address the above three tasks. Specifically, we crop the raw video to remove the noise from other parts. At the same time, we utilize data augmentation to improve the generalization of the model. As a result, our solution achieves the best results of 0.6262 for bodily behavior recognition in terms of mean average precision and the accuracy of 0.7771 for eye contact detection on the corresponding test set. In addition, our approach also achieves comparable results of 0.5281 for the next speaker prediction in terms of unweighted average recall., Comment: Solutions of HFUT-VUT Team at the ACM MM 2023 Grand Challenge (MultiMediate: Multi-modal Behaviour Analysis for Artificial Mediation). Accepted at ACM MM 2023
Published: 2023
Full Text: View/download PDF

142. FusionAD: Multi-modality Fusion for Prediction and Planning Tasks of Autonomous Driving

Author: Ye, Tengju, Jing, Wei, Hu, Chunyong, Huang, Shikun, Gao, Lingping, Li, Fangzhen, Wang, Jingke, Guo, Ke, Xiao, Wencong, Mao, Weibo, Zheng, Hang, Li, Kun, Chen, Junbo, and Yu, Kaicheng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Robotics
Abstract: Building a multi-modality multi-task neural network toward accurate and robust performance is a de-facto standard in the perception task of autonomous driving. However, leveraging such data from multiple sensors to jointly optimize the prediction and planning tasks remains largely unexplored. In this paper, we present FusionAD, to the best of our knowledge, the first unified framework that fuses the information from the two most critical sensors, camera and LiDAR, and goes beyond the perception task. Concretely, we first build a transformer-based multi-modality fusion network to effectively produce fusion-based features. In contrast to the camera-based end-to-end method UniAD, we then establish fusion-aided modality-aware prediction and status-aware planning modules, dubbed FMSPnP, that take advantage of multi-modality features. We conduct extensive experiments on the commonly used benchmark nuScenes dataset; our FusionAD achieves state-of-the-art performance, surpassing baselines on average by 15% on perception tasks like detection and tracking and by 10% on occupancy prediction accuracy, reducing prediction error from 0.708 to 0.389 in ADE score, and reducing the collision rate from 0.31% to only 0.12%.
Published: 2023

143. Joint Skeletal and Semantic Embedding Loss for Micro-gesture Classification

Author: Li, Kun, Guo, Dan, Chen, Guoliang, Peng, Xinge, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper, we briefly introduce the solution of our team HFUT-VUT for the Micro-gesture Classification in the MiGA challenge at IJCAI 2023. The micro-gesture classification task aims at recognizing the action category of a given video based on the skeleton data. For this task, we propose a 3D-CNNs-based micro-gesture recognition network, which incorporates a skeletal and semantic embedding loss to improve action classification performance. Finally, we rank 1st in the Micro-gesture Classification Challenge, surpassing the second-place team in terms of Top-1 accuracy by 1.10%., Comment: 1st Place in Micro-gesture Classification sub-challenge in MiGA at IJCAI-2023
Published: 2023

144. Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer

Author: Yu, Wing-Yin, Po, Lai-Man, Cheung, Ray C. C., Zhao, Yuzhi, Xue, Yu, and Li, Kun
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Video-based human pose transfer is a video-to-video generation task that animates a plain source human image based on a series of target human poses. Considering the difficulties in transferring highly structural patterns on the garments and discontinuous poses, existing methods often generate unsatisfactory results such as distorted textures and flickering artifacts. To address these issues, we propose a novel Deformable Motion Modulation (DMM) that utilizes geometric kernel offset with adaptive weight modulation to simultaneously perform feature alignment and style transfer. Different from normal style modulation used in style transfer, the proposed modulation mechanism adaptively reconstructs smoothed frames from style codes according to the object shape through an irregular receptive field of view. To enhance the spatio-temporal consistency, we leverage bidirectional propagation to extract the hidden motion information from a warped image sequence generated by noisy poses. The proposed feature propagation significantly enhances the motion prediction ability by forward and backward propagation. Both quantitative and qualitative experimental results demonstrate superiority over the state-of-the-arts in terms of image fidelity and visual continuity. The source code is publicly available at github.com/rocketappslab/bdmm., Comment: ICCV 2023
Published: 2023

145. ATWM: Defense against adversarial malware based on adversarial training

Author: Li, Kun, Zhang, Fan, and Guo, Wei
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence
Abstract: Deep learning technology has made great achievements in the field of image. In order to defend against malware attacks, researchers have proposed many Windows malware detection models based on deep learning. However, deep learning models are vulnerable to adversarial example attacks. Malware can generate adversarial malware with the same malicious function to attack the malware detection model and evade detection of the model. Currently, many adversarial defense studies have been proposed, but existing adversarial defense studies are based on image sample and cannot be directly applied to malware sample. Therefore, this paper proposes an adversarial malware defense method based on adversarial training. This method uses preprocessing to defend simple adversarial examples to reduce the difficulty of adversarial training. Moreover, this method improves the adversarial defense capability of the model through adversarial training. We experimented with three attack methods in two sets of datasets, and the results show that the method in this paper can improve the adversarial defense capability of the model without reducing the accuracy of the model.
Published: 2023

146. CMDFusion: Bidirectional Fusion Network with Cross-modality Knowledge Distillation for LIDAR Semantic Segmentation

Author: Cen, Jun, Zhang, Shiwei, Pei, Yixuan, Li, Kun, Zheng, Hang, Luo, Maochun, Zhang, Yingya, and Chen, Qifeng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: 2D RGB images and 3D LIDAR point clouds provide complementary knowledge for the perception system of autonomous vehicles. Several 2D and 3D fusion methods have been explored for the LIDAR semantic segmentation task, but they suffer from different problems. 2D-to-3D fusion methods require strictly paired data during inference, which may not be available in real-world scenarios, while 3D-to-2D fusion methods cannot explicitly make full use of the 2D information. Therefore, we propose a Bidirectional Fusion Network with Cross-Modality Knowledge Distillation (CMDFusion) in this work. Our method has two contributions. First, our bidirectional fusion scheme explicitly and implicitly enhances the 3D feature via 2D-to-3D fusion and 3D-to-2D fusion, respectively, which surpasses either one of the single fusion schemes. Second, we distillate the 2D knowledge from a 2D network (Camera branch) to a 3D network (2D knowledge branch) so that the 3D network can generate 2D information even for those points not in the FOV (field of view) of the camera. In this way, RGB images are not required during inference anymore since the 2D knowledge branch provides 2D information according to the 3D LIDAR input. We show that our CMDFusion achieves the best performance among all fusion-based methods on SemanticKITTI and nuScenes datasets. The code will be released at https://github.com/Jun-CEN/CMDFusion.
Published: 2023

147. Interactive Image Segmentation with Cross-Modality Vision Transformers

Author: Li, Kun, Vosselman, George, and Yang, Michael Ying
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Interactive image segmentation aims to segment the target from the background with the manual guidance, which takes as input multimodal data such as images, clicks, scribbles, and bounding boxes. Recently, vision transformers have achieved a great success in several downstream visual tasks, and a few efforts have been made to bring this powerful architecture to interactive segmentation task. However, the previous works neglect the relations between two modalities and directly mock the way of processing purely visual information with self-attentions. In this paper, we propose a simple yet effective network for click-based interactive segmentation with cross-modality vision transformers. Cross-modality transformers exploits mutual information to better guide the learning process. The experiments on several benchmarks show that the proposed method achieves superior performance in comparison to the previous state-of-the-art models. The stability of our method in term of avoiding failure cases shows its potential to be a practical annotation tool. The code and pretrained models will be released under https://github.com/lik1996/iCMFormer., Comment: 16 pages
Published: 2023

148. Imitation with Spatial-Temporal Heatmap: 2nd Place Solution for NuPlan Challenge

Author: Hu, Yihan, Li, Kun, Liang, Pingyuan, Qian, Jingyu, Yang, Zhening, Zhang, Haichao, Shao, Wenxin, Ding, Zhuangzhuang, Xu, Wei, and Liu, Qiang
Subjects: Computer Science - Robotics, Computer Science - Machine Learning
Abstract: This paper presents our 2nd place solution for the NuPlan Challenge 2023. Autonomous driving in real-world scenarios is highly complex and uncertain. Achieving safe planning in the complex multimodal scenarios is a highly challenging task. Our approach, Imitation with Spatial-Temporal Heatmap, adopts the learning form of behavior cloning, innovatively predicts the future multimodal states with a heatmap representation, and uses trajectory refinement techniques to ensure final safety. The experiment shows that our method effectively balances the vehicle's progress and safety, generating safe and comfortable trajectories. In the NuPlan competition, we achieved the second highest overall score, while obtained the best scores in the ego progress and comfort metrics.
Published: 2023

149. Efficient HDR Reconstruction from Real-World Raw Images

Author: Yang, Qirui, Liu, Yihao, Chen, Qihua, Yue, Huanjing, Li, Kun, and Yang, Jingyu
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: The widespread usage of high-definition screens on edge devices stimulates a strong demand for efficient high dynamic range (HDR) algorithms. However, many existing HDR methods either deliver unsatisfactory results or consume too much computational and memory resources, hindering their application to high-resolution images (usually with more than 12 megapixels) in practice. In addition, existing HDR dataset collection methods often are labor-intensive. In this work, in a new aspect, we discover an excellent opportunity for HDR reconstructing directly from raw images and investigating novel neural network structures that benefit the deployment of mobile devices. Our key insights are threefold: (1) we develop a lightweight-efficient HDR model, RepUNet, using the structural re-parameterization technique to achieve fast and robust HDR; (2) we design a new computational raw HDR data formation pipeline and construct a real-world raw HDR dataset, RealRaw-HDR; (3) we propose a plug-and-play motion alignment loss to mitigate motion ghosting under limited bandwidth conditions. Our model contains less than 830K parameters and takes less than 3 ms to process an image of 4K resolution using one RTX 3090 GPU. While being highly efficient, our model also outperforms the state-of-the-art HDR methods in terms of PSNR, SSIM, and a color difference metric.
Published: 2023

150. Traditional Rural Heritage Conservation in China: Policies and Theories

Author: Li, Kun and Li, Kun
Published: 2024
Full Text: View/download PDF
