Author: "Zhou, Jinxing" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Zhou, Jinxing"' showing total 396 results


1. Towards Open-Vocabulary Audio-Visual Event Localization

Author: Zhou, Jinxing, Guo, Dan, Guo, Ruohao, Mao, Yuxin, Hu, Jingjing, Zhong, Yiran, Chang, Xiaojun, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: The Audio-Visual Event Localization (AVEL) task aims to temporally locate and classify video events that are both audible and visible. Most research in this field assumes a closed-set setting, which restricts these models' ability to handle test data containing event categories absent (unseen) during training. Recently, a few studies have explored AVEL in an open-set setting, enabling the recognition of unseen events as "unknown", but without providing category-specific semantics. In this paper, we advance the field by introducing the Open-Vocabulary Audio-Visual Event Localization (OV-AVEL) problem, which requires localizing audio-visual events and predicting explicit categories for both seen and unseen data at inference. To address this new task, we propose the OV-AVEBench dataset, comprising 24,800 videos across 67 real-life audio-visual scenes (seen:unseen = 46:21), each with manual segment-level annotation. We also establish three evaluation metrics for this task. Moreover, we investigate two baseline approaches, one training-free and one using a further fine-tuning paradigm. Specifically, we utilize the unified multimodal space from the pretrained ImageBind model to extract audio, visual, and textual (event classes) features. The training-free baseline then determines predictions by comparing the consistency of audio-text and visual-text feature similarities. The fine-tuning baseline incorporates lightweight temporal layers to encode temporal relations within the audio and visual modalities, using OV-AVEBench training data for model fine-tuning. We evaluate these baselines on the proposed OV-AVEBench dataset and discuss potential directions for future work in this new field. Comment: Project page: https://github.com/jasongief/OV-AVEL
Published: 2024

2. Label-anticipated Event Disentanglement for Audio-Visual Video Parsing

Author: Zhou, Jinxing, Guo, Dan, Mao, Yuxin, Zhong, Yiran, Chang, Xiaojun, and Wang, Meng
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: The Audio-Visual Video Parsing (AVVP) task aims to detect and temporally locate events within audio and visual modalities. Multiple events can overlap in the timeline, making identification challenging. While traditional methods usually focus on improving the early audio-visual encoders to embed more effective features, the decoding phase, crucial for final event classification, often receives less attention. We aim to advance the decoding phase and improve its interpretability. Specifically, we introduce a new decoding paradigm, label semantic-based projection (LEAP), that employs label texts of event categories, each bearing distinct and explicit semantics, for parsing potentially overlapping events. LEAP works by iteratively projecting encoded latent features of audio/visual segments onto semantically independent label embeddings. This process, enriched by modeling cross-modal (audio/visual-label) interactions, gradually disentangles event semantics within video segments to refine relevant label embeddings, guaranteeing a more discriminative and interpretable decoding process. To facilitate the LEAP paradigm, we propose a semantic-aware optimization strategy, which includes a novel audio-visual semantic similarity loss function. This function leverages the Intersection over Union of audio and visual events (EIoU) as a novel metric to calibrate audio-visual similarities at the feature level, accommodating the varied event densities across modalities. Extensive experiments demonstrate the superiority of our method, achieving new state-of-the-art performance for AVVP and also enhancing the relevant audio-visual event localization task. Comment: Accepted by ECCV2024
Published: 2024

3. Advancing Weakly-Supervised Audio-Visual Video Parsing via Segment-wise Pseudo Labeling

Author: Zhou, Jinxing, Guo, Dan, Zhong, Yiran, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: The Audio-Visual Video Parsing task aims to identify and temporally localize the events that occur in either or both the audio and visual streams of audible videos. It is often performed in a weakly-supervised manner, where only video event labels are provided, i.e., the modalities and the timestamps of the labels are unknown. Due to the lack of densely annotated labels, recent work attempts to leverage pseudo labels to enrich the supervision. A commonly used strategy is to generate pseudo labels by categorizing the known video event labels for each modality. However, the labels are still confined to the video level, and the temporal boundaries of events remain unlabeled. In this paper, we propose a new pseudo label generation strategy that can explicitly assign labels to each video segment by utilizing prior knowledge learned from the open world. Specifically, we exploit the large-scale pretrained models, namely CLIP and CLAP, to estimate the events in each video segment and generate segment-level visual and audio pseudo labels, respectively. We then propose a new loss function to exploit these pseudo labels by taking into account their category-richness and segment-richness. A label denoising strategy is also adopted to further improve the visual pseudo labels by flipping them whenever abnormally large forward losses occur. We perform extensive experiments on the LLP dataset and demonstrate the effectiveness of each proposed design, and we achieve state-of-the-art video parsing performance on all types of event parsing, i.e., audio event, visual event, and audio-visual event. We also examine the proposed pseudo label generation strategy on a relevant weakly-supervised audio-visual event localization task, and the experimental results again verify the benefits and generalization of our method. Comment: IJCV 2024 Accepted. arXiv admin note: substantial text overlap with arXiv:2303.02344
Published: 2024

4. TAVGBench: Benchmarking Text to Audible-Video Generation

Author: Mao, Yuxin, Shen, Xuyang, Zhang, Jing, Qin, Zhen, Zhou, Jinxing, Xiang, Mochu, Zhong, Yiran, and Dai, Yuchao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: The Text to Audible-Video Generation (TAVG) task involves generating videos with accompanying audio based on text descriptions. Achieving this requires skillful alignment of both audio and video elements. To support research in this field, we have developed a comprehensive Text to Audible-Video Generation Benchmark (TAVGBench), which contains over 1.7 million clips with a total duration of 11.8 thousand hours. We propose an automatic annotation pipeline to ensure each audible video has detailed descriptions for both its audio and video contents. We also introduce the Audio-Visual Harmoni score (AVHScore) to provide a quantitative measure of the alignment between the generated audio and video modalities. Additionally, we present a baseline model for TAVG called TAVDiffusion, which uses a two-stream latent diffusion model to provide a fundamental starting point for further research in this area. We achieve the alignment of audio and video by employing cross-attention and contrastive learning. Through extensive experiments and evaluations on TAVGBench, we demonstrate the effectiveness of our proposed model under both conventional metrics and our proposed metrics. Comment: Technical Report. Project page: https://github.com/OpenNLPLab/TAVGBench
Published: 2024

5. Dynamic Mechanical Behavior and Modified Johnson-Cook Model of Ti60 Alloy under High-temperature and High-strain-rate Conditions

Author: Yang, Dong and Zhou, Jinxing
Published: 2024
Full Text: View/download PDF

6. Audio-Visual Segmentation with Semantics

Author: Zhou, Jinxing, Shen, Xuyang, Wang, Jianyuan, Zhang, Jiayi, Sun, Weixuan, Zhang, Jing, Birchfield, Stan, Guo, Dan, Kong, Lingpeng, Wang, Meng, and Zhong, Yiran
Published: 2024
Full Text: View/download PDF

7. Object-aware Adaptive-Positivity Learning for Audio-Visual Question Answering

Author: Li, Zhangbin, Guo, Dan, Zhou, Jinxing, Zhang, Jing, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper focuses on the Audio-Visual Question Answering (AVQA) task that aims to answer questions derived from untrimmed audible videos. To generate accurate answers, an AVQA model is expected to find the most informative audio-visual clues relevant to the given questions. In this paper, we propose to explicitly consider fine-grained visual objects in video frames (object-level clues) and explore the multi-modal relations (i.e., the object, audio, and question) in terms of feature interaction and model optimization. For the former, we present an end-to-end object-oriented network that adopts a question-conditioned clue discovery module to concentrate audio/visual modalities on respective keywords of the question and designs a modality-conditioned clue collection module to highlight closely associated audio segments or visual objects. For model optimization, we propose an object-aware adaptive-positivity learning strategy that selects the highly semantic-matched multi-modal pair as positivity. Specifically, we design two object-aware contrastive loss functions to identify the highly relevant question-object pairs and audio-object pairs, respectively. These selected pairs are constrained to have larger similarity values than the mismatched pairs. The positivity-selecting process is adaptive as the positivity pairs selected in each video frame may be different. These two object-aware objectives help the model understand which objects are exactly relevant to the question and which are making sounds. Extensive experiments on the MUSIC-AVQA dataset demonstrate the proposed method is effective in finding favorable audio-visual clues and also achieves new state-of-the-art question-answering performance. Comment: Accepted by AAAI-2024
Published: 2023

8. Audio-Visual Instance Segmentation

Author: Guo, Ruohao, Ying, Xianghua, Chen, Yaru, Niu, Dantong, Li, Guangyao, Qu, Liao, Qi, Yanyu, Zhou, Jinxing, Xing, Bowei, Yue, Wenzhen, Shi, Ji, Wang, Qixun, Zhang, Peiliang, and Liang, Buwen
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Computer Science - Multimedia, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: In this paper, we propose a new multi-modal task, termed audio-visual instance segmentation (AVIS), which aims to simultaneously identify, segment and track individual sounding object instances in audible videos. To facilitate this research, we introduce a high-quality benchmark named AVISeg, containing over 90K instance masks from 26 semantic categories in 926 long videos. Additionally, we propose a strong baseline model for this task. Our model first localizes the sound source within each frame, and condenses object-specific contexts into concise tokens. Then it builds long-range audio-visual dependencies between these tokens using window-based attention, and tracks sounding objects among the entire video sequences. Extensive experiments reveal that our method performs best on AVISeg, surpassing the existing methods from related tasks. We further conduct the evaluation on several multi-modal large models; however, they exhibit subpar performance on instance-level sound source localization and temporal perception. We expect that AVIS will inspire the community towards a more comprehensive multi-modal understanding. The dataset and code will soon be released on https://github.com/ruohaoguo/avis. Comment: Project page: https://github.com/ruohaoguo/avis
Published: 2023

9. Stimulation of organic N mineralization by N‒acquiring enzyme activity alleviates soil microbial N limitation following afforestation in subtropical karst areas

Author: Liu, Lijun, Zhu, Qilin, Wen, Dongni, Yang, Lin, Ni, Kang, Xu, Xingliang, Cao, Jianhua, Meng, Lei, Yang, Jinling, Zhou, Jinxing, Zhu, Tongbin, and Müller, Christoph
Published: 2024
Full Text: View/download PDF

10. Fine-grained Audible Video Description

Author: Shen, Xuyang, Li, Dong, Zhou, Jinxing, Qin, Zhen, He, Bowen, Han, Xiaodong, Li, Aixuan, Dai, Yuchao, Kong, Lingpeng, Wang, Meng, Qiao, Yu, and Zhong, Yiran
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We explore a new task for audio-visual-language modeling called fine-grained audible video description (FAVD). It aims to provide detailed textual descriptions for the given audible videos, including the appearance and spatial locations of each object, the actions of moving objects, and the sounds in videos. Existing visual-language modeling tasks often concentrate on visual cues in videos while undervaluing the language and audio modalities. On the other hand, FAVD requires not only audio-visual-language modeling skills but also paragraph-level language generation abilities. We construct the first fine-grained audible video description benchmark (FAVDBench) to facilitate this research. For each video clip, we first provide a one-sentence summary of the video, i.e., the caption, followed by 4-6 sentences describing the visual details and 1-2 audio-related descriptions at the end. The descriptions are provided in both English and Chinese. We create two new metrics for this task: an EntityScore to gauge the completeness of entities in the visual descriptions, and an AudioScore to assess the audio descriptions. As a preliminary approach to this task, we propose an audio-visual-language transformer that extends an existing video captioning model with an additional audio branch. We combine the masked language modeling and auto-regressive language modeling losses to optimize our model so that it can produce paragraph-level descriptions. We illustrate the efficiency of our model in audio-visual-language modeling by evaluating it against the proposed benchmark using both conventional captioning metrics and our proposed metrics. We further put our benchmark to the test in video generation models, demonstrating that employing fine-grained video descriptions can create more intricate videos than using captions. Comment: Accepted to CVPR 2023. Xuyang Shen, Dong Li, and Jinxing Zhou contribute equally. Code link: github.com/OpenNLPLab/FAVDBench; dataset link: www.avlbench.opennlplab.cn
Published: 2023

11. Improving Audio-Visual Video Parsing with Pseudo Visual Labels

Author: Zhou, Jinxing, Guo, Dan, Zhong, Yiran, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: Audio-Visual Video Parsing is a task to predict the events that occur in video segments for each modality. It is often performed in a weakly supervised manner, where only video event labels are provided, i.e., the modalities and the timestamps of the labels are unknown. Due to the lack of densely annotated labels, recent work attempts to leverage pseudo labels to enrich the supervision. A commonly used strategy is to generate pseudo labels by categorizing the known event labels for each modality. However, the labels are still limited to the video level, and the temporal boundaries of event timestamps remain unlabeled. In this paper, we propose a new pseudo label generation strategy that can explicitly assign labels to each video segment by utilizing prior knowledge learned from the open world. Specifically, we exploit the CLIP model to estimate the events in each video segment based on visual modality to generate segment-level pseudo labels. A new loss function is proposed to regularize these labels by taking into account their category-richness and segment-richness. A label denoising strategy is adopted to improve the pseudo labels by flipping them whenever high forward binary cross entropy loss occurs. We perform extensive experiments on the LLP dataset and demonstrate that our method can generate high-quality segment-level pseudo labels with the help of our newly proposed loss and the label denoising strategy. Our method achieves state-of-the-art audio-visual video parsing performance.
Published: 2023

12. Audio-Visual Segmentation with Semantics

Author: Zhou, Jinxing, Shen, Xuyang, Wang, Jianyuan, Zhang, Jiayi, Sun, Weixuan, Zhang, Jing, Birchfield, Stan, Guo, Dan, Kong, Lingpeng, Wang, Meng, and Zhong, Yiran
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We propose a new problem called audio-visual segmentation (AVS), in which the goal is to output a pixel-level map of the object(s) that produce sound at the time of the image frame. To facilitate this research, we construct the first audio-visual segmentation benchmark, i.e., AVSBench, providing pixel-wise annotations for sounding objects in audible videos. It contains three subsets: AVSBench-object (Single-source subset, Multi-sources subset) and AVSBench-semantic (Semantic-labels subset). Accordingly, three settings are studied: 1) semi-supervised audio-visual segmentation with a single sound source; 2) fully-supervised audio-visual segmentation with multiple sound sources, and 3) fully-supervised audio-visual semantic segmentation. The first two settings need to generate binary masks of sounding objects indicating pixels corresponding to the audio, while the third setting further requires generating semantic maps indicating the object category. To deal with these problems, we propose a new baseline method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process. We also design a regularization loss to encourage audio-visual mapping during training. Quantitative and qualitative experiments on AVSBench compare our approach to several existing methods for related tasks, demonstrating that the proposed method is promising for building a bridge between the audio and pixel-wise visual semantics. Code is available at https://github.com/OpenNLPLab/AVSBench. Online benchmark is available at http://www.avlbench.opennlplab.cn. Comment: Submitted to TPAMI as a journal extension of ECCV 2022. Jinxing Zhou, Xuyang Shen, and Jianyuan Wang contribute equally to this work. Meng Wang and Yiran Zhong are the corresponding authors. arXiv admin note: substantial text overlap with arXiv:2207.05042
Published: 2023

13. Contrastive Positive Sample Propagation along the Audio-Visual Event Line

Author: Zhou, Jinxing, Guo, Dan, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Visual and audio signals often coexist in natural environments, forming audio-visual events (AVEs). Given a video, we aim to localize video segments containing an AVE and identify its category. It is pivotal to learn the discriminative features for each video segment. Unlike existing work focusing on audio-visual feature fusion, in this paper, we propose a new contrastive positive sample propagation (CPSP) method for better deep feature representation learning. The contribution of CPSP is to introduce the available full or weak label as a prior that constructs the exact positive-negative samples for contrastive learning. Specifically, the CPSP involves comprehensive contrastive constraints: pair-level positive sample propagation (PSP), segment-level and video-level positive sample activation (PSA$_S$ and PSA$_V$). Three new contrastive objectives are proposed (i.e., $\mathcal{L}_{\text{avpsp}}$, $\mathcal{L}_{\text{spsa}}$, and $\mathcal{L}_{\text{vpsa}}$) and introduced into both the fully and weakly supervised AVE localization. To draw a complete picture of the contrastive learning in AVE localization, we also study the self-supervised positive sample propagation (SSPSP). As a result, CPSP is more helpful to obtain the refined audio-visual features that are distinguishable from the negatives, thus benefiting the classifier prediction. Extensive experiments on the AVE and the newly collected VGGSound-AVEL100k datasets verify the effectiveness and generalization ability of our method. Comment: Accepted to TPAMI; Dataset and Code are available at https://github.com/jasongief/CPSP. arXiv admin note: substantial text overlap with arXiv:2104.00239
Published: 2022

14. Audio-Visual Segmentation

Author: Zhou, Jinxing, Wang, Jianyuan, Zhang, Jiayi, Sun, Weixuan, Zhang, Jing, Birchfield, Stan, Guo, Dan, Kong, Lingpeng, Wang, Meng, and Zhong, Yiran
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: We propose to explore a new problem called audio-visual segmentation (AVS), in which the goal is to output a pixel-level map of the object(s) that produce sound at the time of the image frame. To facilitate this research, we construct the first audio-visual segmentation benchmark (AVSBench), providing pixel-wise annotations for the sounding objects in audible videos. Two settings are studied with this benchmark: 1) semi-supervised audio-visual segmentation with a single sound source and 2) fully-supervised audio-visual segmentation with multiple sound sources. To deal with the AVS problem, we propose a novel method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process. We also design a regularization loss to encourage the audio-visual mapping during training. Quantitative and qualitative experiments on the AVSBench compare our approach to several existing methods from related tasks, demonstrating that the proposed method is promising for building a bridge between the audio and pixel-wise visual semantics. Code is available at https://github.com/OpenNLPLab/AVSBench. Comment: ECCV 2022; Code is available at https://github.com/OpenNLPLab/AVSBench
Published: 2022

15. Exploring nickel adsorption and desorption dynamics in sandy clay loam and clay loam soil

Author: Rebi, Ansa, Ghazanfar, Sammia, Sabir, Muhammad, Wang, Guan, Hussain, Azfar, Flynn, Trevan, Zhou, Jinxing, and Li, Guijing
Published: 2024
Full Text: View/download PDF

16. Stoichiometric and bacterial eco-physiological insights into microbial resource availability in karst regions affected by clipping-and-burning

Author: Rebi, Ansa, Wang, Guan, Yang, Tao, Kanomanyanga, Jasper, Ejaz, Irsa, Mustafa, Adnan, Rizwan, Muhammad, and Zhou, Jinxing
Published: 2024
Full Text: View/download PDF

17. Characterizing the local and global climatic factors associated with vegetation dynamics in the karst region of southwest China

Author: Hussain, Azfar, Cao, Jianhua, Abbas, Haider, Hussain, Ishtiaq, Zhou, Jinxing, Yang, Hui, Rezaei, Abolfazl, Luo, Qukan, Ullah, Waheed, and Liang, Zhong
Published: 2024
Full Text: View/download PDF

18. Greater variation of soil organic carbon in limestone- than shale-based soil along soil depth in a subtropical coniferous forest within a karst faulted basin of China

Author: Yang, Tao, Wang, Genzhu, Long, Jie, Mi, Jinyan, Yu, Aijia, Liu, Xingyu, Zhang, Haoran, Dong, Liang, Li, Zihao, Zheng, Chenghao, Herath, Saman, Zhou, Jinxing, and Peng, Xiawei
Published: 2024
Full Text: View/download PDF

19. Increasing monsoon precipitation extremes in relation to large-scale climatic patterns in Pakistan

Author: Hussain, Azfar, Hussain, Ishtiaq, Rezaei, Abolfazl, Ullah, Waheed, Lu, Mengqian, Zhou, Jinxing, and Guan, Yinghui
Published: 2024
Full Text: View/download PDF

20. Effect on biomass production and phosphorus use efficiency of maize by using citric acid amended di-ammonium phosphate fertilizer

Author: Ikram, Wasiq, Rebi, Ansa, Fareed, Muhammad Irfan, Sattar, Mehwish, Wang, Guan, Wahla, Abdul Qadeer, and Zhou, Jinxing
Published: 2024
Full Text: View/download PDF

21. Divergent responses of aggregate breakdown by slaking to nitrogen forms in solution for contrasting soil types

Author: Wu, Xinliang, Wang, Chenyu, Cai, Chongfa, Yao, Sixu, and Zhou, Jinxing
Published: 2024
Full Text: View/download PDF

22. Drivers of plant diversification along an altitudinal gradient in the alpine desert grassland, Northern Tibetan Plateau

Author: Wang, Lina, Gesang, Quzhen, Luo, Jiufu, Wu, Xinliang, Rebi, Ansa, You, Yonggang, and Zhou, Jinxing
Published: 2024
Full Text: View/download PDF

23. Geographic variations of aggregate mechanical stability controlled by inorganic cementing agents across the East Asian monsoon region

Author: Fan, Xuesong, Wu, Xinliang, Zhou, Jinxing, and Wan, Long
Published: 2024
Full Text: View/download PDF

24. The dominant influence of terrain and geology on vegetation mortality in response to drought: Exploring resilience and resistance

Author: Xiao, Linying, Zhou, Jinxing, Wu, Xiuqin, Anas Khan, Muhammad, Zhao, Sen, and Wu, Xinliang
Published: 2024
Full Text: View/download PDF

25. Spatiotemporal temperature trends over homogenous climatic regions of Pakistan during 1961–2017

Author: Hussain, Azfar, Hussain, Ishtiaq, Ali, Shaukat, Ullah, Waheed, Khan, Firdos, Ullah, Safi, Abbas, Haider, Manzoom, Asima, Cao, Jianhua, and Zhou, Jinxing
Published: 2023
Full Text: View/download PDF

26. Dynamic monitoring and restorability evaluation of alpine wetland in the eastern edge of Qinghai–Tibet Plateau

Author: Zhang, Xuexia, Hu, Yunzhe, Zhao, Liuhui, Fu, Shujing, Cui, Yi, Fulati, Gulimire, Wang, Xiangyu, and Zhou, Jinxing
Published: 2024
Full Text: View/download PDF

27. Unraveling the impact of wildfires on permafrost ecosystems: Vulnerability, implications, and management strategies

Author: Rebi, Ansa, Wang, Guan, Irfan, Muhammad, Hussain, Azfar, Mustafa, Adnan, Flynn, Trevan, Ejaz, Irsa, Raza, Taqi, Mushtaq, Parsa, Rizwan, Muhammad, and Zhou, Jinxing
Published: 2024
Full Text: View/download PDF

28. Positive Sample Propagation along the Audio-Visual Event Line

Author: Zhou, Jinxing, Zheng, Liang, Zhong, Yiran, Hao, Shijie, and Wang, Meng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Visual and audio signals often coexist in natural environments, forming audio-visual events (AVEs). Given a video, we aim to localize video segments containing an AVE and identify its category. In order to learn discriminative features for a classifier, it is pivotal to identify the helpful (or positive) audio-visual segment pairs while filtering out the irrelevant ones, regardless of whether they are synchronized or not. To this end, we propose a new positive sample propagation (PSP) module to discover and exploit the closely related audio-visual pairs by evaluating the relationship within every possible pair. It can be done by constructing an all-pair similarity map between each audio and visual segment, and only aggregating the features from the pairs with high similarity scores. To encourage the network to extract highly correlated features for positive samples, a new audio-visual pair similarity loss is proposed. We also propose a new weighting branch to better exploit the temporal correlations in the weakly supervised setting. We perform extensive experiments on the public AVE dataset and achieve new state-of-the-art accuracy in both fully and weakly supervised settings, thus verifying the effectiveness of our method. Comment: Accepted to CVPR 2021. Code is available at https://github.com/jasongief/PSP_CVPR_2021
Published: 2021

29. Soil type regulates the divergent loss characteristics of sediment associated carbon and nitrogen in different size classes during rainfall erosion on cultivated lands

Author: Wu, Xinliang, Zhang, Zhiyong, Cai, Chongfa, Zhou, Jinxing, and Zhang, Wenbo
Published: 2024
Full Text: View/download PDF

30. Memory effects of vegetation after extreme weather events under various geological conditions in a typical karst watershed in southwestern China

Author: Xiao, Linying, Wu, Xiuqin, Zhao, Sen, and Zhou, Jinxing
Published: 2024
Full Text: View/download PDF

31. Effect of irrigation levels on the physiological responses of petunia cultivars for selection

Author: Rebi, Ansa, Ejaz, Irsa, Khatana, Muhammad Ahsan, Alvi, Ahmad Bilal Abbas, Irfan, Muhammad, Wang, Guan, Gang, You Yong, Wang, Lina, Meng, Yu, Ghazanfar, Sammia, and Zhou, Jinxing
Published: 2024
Full Text: View/download PDF

32. A new method for disentangling the coupling effect of slaking and mechanical breakdown on aggregate stability: Validation on splash erosion

Author: Wu, Xinliang, Yao, Sixu, and Zhou, Jinxing
Published: 2024
Full Text: View/download PDF

33. Topography-driven differences in soil N transformation constrain N availability in karst ecosystems

Author: Wen, Dongni, Yang, Lin, Ni, Kang, Xu, Xingliang, Yu, Longfei, Elrys, Ahmed S., Meng, Lei, Zhou, Jinxing, Zhu, Tongbin, and Müller, Christoph
Published: 2024
Full Text: View/download PDF

34. Effects of rock lithology and soil nutrients on nitrogen and phosphorus mobility in trees in non-karst and karst forests of southwest China

Author: Zheng, Chenghao, Wan, Long, Wang, Ruoshui, Wang, Guan, Dong, Liang, Yang, Tao, Yang, Qilin, and Zhou, Jinxing
Published: 2023
Full Text: View/download PDF

35. Contributions of climate and soil properties to geographic variations of soil organic matter across the East Asian monsoon region

Author: Wu, Xinliang, Cai, Chongfa, Yuan, Zaijian, Li, Dingqiang, Zhou, Jinxing, and Huang, Chao
Published: 2023
Full Text: View/download PDF

36. Bacteria life-history strategies and the linkage of soil C-N-P stoichiometry to microbial resource limitation differed in karst and non-karst plantation forests in southwest China

Author: Yang, Tao, Zhang, Haoran, Zheng, Chenghao, Wu, Xuejing, Zhao, Yutong, Li, Xinyang, Liu, Haizhu, Dong, Liang, Lu, Zichun, Zhou, Jinxing, and Peng, Xiawei
Published: 2023
Full Text: View/download PDF

37. Spatiotemporal pattern of vegetation water use efficiency between 2003 and 2017 and its coupling relationship with artificial carbon sequestration in the karst region of Southwestern China

Author: Wang, Lei, Wu, Xiuqin, Guo, Jianbin, Zhou, Jinxing, Wu, Xiebao, and Huang, Junwei
Published: 2023
Full Text: View/download PDF

38. Phosphorus addition increases microbial necromass by increasing N availability in China: A meta-analysis

Author: Zhang, Haoran, Yang, Tao, Wu, Xuejing, Zhang, Jianwei, Yu, Xiuying, Zhou, Jinxing, Herath, Saman, and Peng, Xiawei
Published: 2023
Full Text: View/download PDF

39. Assessment of precipitation extremes and their association with NDVI, monsoon and oceanic indices over Pakistan

Author: Hussain, Azfar, Hussain, Ishtiaq, Ali, Shaukat, Ullah, Waheed, Khan, Firdos, Rezaei, Abolfazl, Ullah, Safi, Abbas, Haider, Manzoom, Asima, Cao, Jianhua, and Zhou, Jinxing
Published: 2023
Full Text: View/download PDF

40. Analysis and simulation of nitrogen loss during water erosion process on windward slope under wind-driven rain conditions

Author: An, Miaoying, Xing, Weiming, Han, Yuguo, Zhou, Jinxing, Qu, Zhixu, Zhao, Chenyang, and Xu, Pan
Published: 2023
Full Text: View/download PDF

41. Non-linear response of sediment size characteristics and associated transport patterns to soil structural stability in sheet erosion under field rainfall simulation

Author: Wu, Xinliang, Cai, Chongfa, Li, Dingqiang, Zhou, Jinxing, and Zhang, Wenbo
Published: 2023
Full Text: View/download PDF

42. Properties and pelletization of Camellia oleifera shell after anoxic storage

Author: Huang, Zhongliang, Chen, Hongli, Tan, Mengjiao, Zhang, Liqiang, Qin, Xiaoli, Zhang, Xuan, Zhou, Jinxing, Zhong, Renhua, and Li, Hui
Published: 2023
Full Text: View/download PDF

43. Geographic variations of pore structure of clayey soils along a climatic gradient

Author: Wu, Xinliang, Yuan, Zaijian, Li, Dingqiang, Zhou, Jinxing, and Liu, Tong
Published: 2023
Full Text: View/download PDF

44. Slash-and-burn in karst regions lowers soil gross nitrogen (N) transformation rates and N-turnover

Author: Wang, Guan, Zhu, Tongbin, Zhou, Jinxing, Yu, Yongjie, Petropoulos, Evangelos, and Müller, Christoph
Published: 2022
Full Text: View/download PDF

45. Improving the performance of an unmixing model in sediment source apportionment using synthetic sediment mixtures and an adaptive boosting algorithm

Author: Zhao, Yang, Gao, Guanglei, Ding, Guodong, Zhou, Qizhi, Zhang, Ying, Wang, Jiayuan, and Zhou, Jinxing
Published: 2022
Full Text: View/download PDF

46. Effects of secondary succession on soil fungal and bacterial compositions and diversities in a karst area

Author: Wang, Genzhu, Liu, Yuguo, Cui, Ming, Zhou, Ziyuan, Zhang, Qian, Li, Yajin, Ha, Wenxiu, Pang, Danbo, Luo, Jiufu, and Zhou, Jinxing
Published: 2022
Full Text: View/download PDF

47. Advancing Weakly-Supervised Audio-Visual Video Parsing via Segment-Wise Pseudo Labeling

Author: Zhou, Jinxing, Guo, Dan, Zhong, Yiran, and Wang, Meng
Published: 2024
Full Text: View/download PDF

48. Soil Quality Variation under Different Land Use Types and Its Driving Factors in Beijing

Author: Qiang, Fangfang, Sheng, Changchang, Zhang, Jiaqi, Jiang, Liwei, and Zhou, Jinxing
Published: 2024
Full Text: View/download PDF

49. Desertification and Its Control along the Qinghai-Tibet Railway

Author: Liu, Yuguo, Luo, Jiufu, Zhou, Jinxing, and Cui, Ming
Published: 2022
Full Text: View/download PDF

50. Audio–Visual Segmentation

Author: Zhou, Jinxing, Wang, Jianyuan, Zhang, Jiayi, Sun, Weixuan, Zhang, Jing, Birchfield, Stan, Guo, Dan, Kong, Lingpeng, Wang, Meng, and Zhong, Yiran
Published: 2022
Full Text: View/download PDF
