259 results for '"Hou, Junhui"'
Search Results
2. A Comprehensive Study of the Robustness for LiDAR-Based 3D Object Detectors Against Adversarial Attacks
- Author
- Zhang, Yifan, Hou, Junhui, and Yuan, Yixuan
- Published
- 2024
3. Unsupervised video-based action recognition using two-stream generative adversarial network
- Author
- Lin, Wei, Zeng, Huanqiang, Zhu, Jianqing, Hsia, Chih-Hsien, Hou, Junhui, and Ma, Kai-Kuang
- Published
- 2024
4. GLENet: Boosting 3D Object Detectors with Generative Label Uncertainty Estimation
- Author
- Zhang, Yifan, Zhang, Qijian, Zhu, Zhiyu, Hou, Junhui, and Yuan, Yixuan
- Published
- 2023
5. Functions and mechanisms of lncRNA MALAT1 in cancer chemotherapy resistance
- Author
- Hou, Junhui, Zhang, Gong, Wang, Xia, Wang, Yuan, and Wang, Kefeng
- Published
- 2023
6. Knowledge-map analysis of percutaneous nephrolithotomy (PNL) for urolithiasis
- Author
- Hou, Junhui, Lv, Zongwei, Wang, Yuan, Wang, Xia, Wang, Yibing, and Wang, Kefeng
- Published
- 2023
7. Differentiable Deformation Graph-Based Neural Non-rigid Registration
- Author
- Feng, Wanquan, Cai, Hongrui, Hou, Junhui, Deng, Bailin, and Zhang, Juyong
- Published
- 2023
8. Screen content video quality assessment based on spatiotemporal sparse feature
- Author
- Ding, Rui, Zeng, Huanqiang, Wen, Hao, Huang, Hailiang, Cheng, Shan, and Hou, Junhui
- Published
- 2023
9. RegGeoNet: Learning Regular Representations for Large-Scale 3D Point Clouds
- Author
- Zhang, Qijian, Hou, Junhui, Qian, Yue, Chan, Antoni B., Zhang, Juyong, and He, Ying
- Published
- 2022
10. Guest Editorial: machine learning for visual information processing & understanding
- Author
- Zeng, Huanqiang, Cao, Jiuwen, Yang, Yimin, and Hou, Junhui
- Published
- 2023
11. Occlusion-Resistant instance segmentation of piglets in farrowing pens using center clustering network
- Author
- Huang, Endai, Mao, Axiu, Hou, Junhui, Wu, Yongjian, Xu, Weitao, Camila Ceballos, Maria, Parsons, Thomas D., and Liu, Kai
- Published
- 2023
12. Effect of circular RNAs and N6-methyladenosine (m6A) modification on cancer biology
- Author
- Zhang, Gong, Hou, Junhui, Mei, Chenxue, Wang, Xia, Wang, Yuan, and Wang, Kefeng
- Published
- 2023
13. Semi-supervised adaptive kernel concept factorization
- Author
- Wu, Wenhui, Hou, Junhui, Wang, Shiqi, Kwong, Sam, and Zhou, Yu
- Published
- 2023
14. Learning hyperspectral images from RGB images via a coarse-to-fine CNN
- Author
- Mei, Shaohui, Geng, Yunhao, Hou, Junhui, and Du, Qian
- Published
- 2022
15. Single image-based head pose estimation with spherical parametrization and 3D morphing
- Author
- Yuan, Hui, Li, Mengyu, Hou, Junhui, and Xiao, Jimin
- Published
- 2020
16. Video summarization via block sparse dictionary selection
- Author
- Ma, Mingyang, Mei, Shaohui, Wan, Shuai, Hou, Junhui, Wang, Zhiyong, and Feng, David Dagan
- Published
- 2020
17. Correlation filter tracker with siamese: A robust and real-time object tracking framework
- Author
- Pan, Gengzheng, Chen, Guochun, Kang, Wenxiong, and Hou, Junhui
- Published
- 2019
18. 3D human pose estimation via human structure-aware fully connected network
- Author
- Zhang, Xiaoyan, Tang, Zhenhua, Hou, Junhui, and Hao, Yanbin
- Published
- 2019
19. ImmerTai: Immersive Motion Learning in VR Environments
- Author
- Chen, Xiaoming, Chen, Zhibo, Li, Ye, He, Tianyu, Hou, Junhui, Liu, Sen, and He, Ying
- Published
- 2019
20. A multi-scale contrast-based image quality assessment model for multi-exposure image fusion
- Author
- Xing, Lu, Cai, Lei, Zeng, Huanqiang, Chen, Jing, Zhu, Jianqing, and Hou, Junhui
- Published
- 2018
21. Downstream-agnostic Adversarial Examples
- Author
- Zhou, Ziqi, Hu, Shengshan, Zhao, Ruizhi, Wang, Qian, Zhang, Leo Yu, Hou, Junhui, and Jin, Hai
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Self-supervised learning usually uses a large amount of unlabeled data to pre-train an encoder which can be used as a general-purpose feature extractor, such that downstream users only need to perform fine-tuning operations to enjoy the benefit of "large model". Despite this promising prospect, the security of pre-trained encoder has not been thoroughly investigated yet, especially when the pre-trained encoder is publicly available for commercial use. In this paper, we propose AdvEncoder, the first framework for generating downstream-agnostic universal adversarial examples based on the pre-trained encoder. AdvEncoder aims to construct a universal adversarial perturbation or patch for a set of natural images that can fool all the downstream tasks inheriting the victim pre-trained encoder. Unlike traditional adversarial example works, the pre-trained encoder only outputs feature vectors rather than classification labels. Therefore, we first exploit the high frequency component information of the image to guide the generation of adversarial examples. Then we design a generative attack framework to construct adversarial perturbations/patches by learning the distribution of the attack surrogate dataset to improve their attack success rates and transferability. Our results show that an attacker can successfully attack downstream tasks without knowing either the pre-training dataset or the downstream dataset. We also tailor four defenses for pre-trained encoders, the results of which further prove the attack ability of AdvEncoder., Comment: This paper has been accepted by the International Conference on Computer Vision (ICCV '23, October 2--6, 2023, Paris, France)
- Published
- 2023
22. Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers
- Author
- Zhu, Zhiyu, Hou, Junhui, and Wu, Dapeng Oliver
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
This paper addresses the problem of cross-modal object tracking from RGB videos and event data. Rather than constructing a complex cross-modal fusion network, we explore the great potential of a pre-trained vision Transformer (ViT). Particularly, we delicately investigate plug-and-play training augmentations that encourage the ViT to bridge the vast distribution gap between the two modalities, enabling comprehensive cross-modal information interaction and thus enhancing its ability. Specifically, we propose a mask modeling strategy that randomly masks a specific modality of some tokens to enforce tokens from different modalities to interact proactively. To mitigate network oscillations resulting from the masking strategy and further amplify its positive effect, we then theoretically propose an orthogonal high-rank loss to regularize the attention matrix. Extensive experiments demonstrate that our plug-and-play training augmentation techniques can significantly boost state-of-the-art one-stream and two-stream trackers to a large extent in terms of both tracking precision and success rate. Our new perspective and findings will potentially bring insights to the field of leveraging powerful pre-trained ViTs to model cross-modal data. The code will be publicly available.
- Published
- 2023
23. Unleash the Potential of 3D Point Cloud Modeling with A Calibrated Local Geometry-driven Distance Metric
- Author
- Ren, Siyu and Hou, Junhui
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
Quantifying the dissimilarity between two unstructured 3D point clouds is a challenging task, with existing metrics often relying on measuring the distance between corresponding points that can be either inefficient or ineffective. In this paper, we propose a novel distance metric called Calibrated Local Geometry Distance (CLGD), which computes the difference between the underlying 3D surfaces calibrated and induced by a set of reference points. By associating each reference point with two given point clouds through computing its directional distances to them, the difference in directional distances of an identical reference point characterizes the geometric difference between a typical local region of the two point clouds. Finally, CLGD is obtained by averaging the directional distance differences of all reference points. We evaluate CLGD on various optimization and unsupervised learning-based tasks, including shape reconstruction, rigid registration, scene flow estimation, and feature representation. Extensive experiments show that CLGD achieves significantly higher accuracy under all tasks in a memory and computationally efficient manner, compared with existing metrics. As a generic metric, CLGD has the potential to advance 3D point cloud modeling. The source code is publicly available at https://github.com/rsy6318/CLGD.
- Published
- 2023
24. Low-latency compression of mocap data using learned spatial decorrelation transform
- Author
- Hou, Junhui, Chau, Lap-Pui, Magnenat-Thalmann, Nadia, and He, Ying
- Published
- 2016
25. Decoupling Dynamic Monocular Videos for Dynamic View Synthesis
- Author
- You, Meng and Hou, Junhui
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
The challenge of dynamic view synthesis from dynamic monocular videos, i.e., synthesizing novel views for free viewpoints given a monocular video of a dynamic scene captured by a moving camera, mainly lies in accurately modeling the dynamic objects of a scene using limited 2D frames, each with a varying timestamp and viewpoint. Existing methods usually require pre-processed 2D optical flow and depth maps by additional methods to supervise the network, making them suffer from the inaccuracy of the pre-processed supervision and the ambiguity when lifting the 2D information to 3D. In this paper, we tackle this challenge in an unsupervised fashion. Specifically, we decouple the motion of the dynamic objects into object motion and camera motion, respectively regularized by proposed unsupervised surface consistency and patch-based multi-view constraints. The former enforces the 3D geometric surfaces of moving objects to be consistent over time, while the latter regularizes their appearances to be consistent across different viewpoints. Such a fine-grained motion formulation can alleviate the learning difficulty for the network, thus enabling it to produce not only novel views with higher quality but also more accurate scene flows and depth than existing methods requiring extra supervision. We will make the code publicly available.
- Published
- 2023
26. GQE-Net: A Graph-based Quality Enhancement Network for Point Cloud Color Attribute
- Author
- Xing, Jinrui, Yuan, Hui, Hamzaoui, Raouf, Liu, Hao, and Hou, Junhui
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Image and Video Processing (eess.IV), FOS: Electrical engineering, electronic engineering, information engineering, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
- Abstract
In recent years, point clouds have become increasingly popular for representing three-dimensional (3D) visual objects and scenes. To efficiently store and transmit point clouds, compression methods have been developed, but they often result in a degradation of quality. To reduce color distortion in point clouds, we propose a graph-based quality enhancement network (GQE-Net) that uses geometry information as an auxiliary input and graph convolution blocks to extract local features efficiently. Specifically, we use a parallel-serial graph attention module with a multi-head graph attention mechanism to focus on important points or features and help them fuse together. Additionally, we design a feature refinement module that takes into account the normals and geometry distance between points. To work within the limitations of GPU memory capacity, the distorted point cloud is divided into overlap-allowed 3D patches, which are sent to GQE-Net for quality enhancement. To account for differences in data distribution among different color components, three models are trained for the three color components. Experimental results show that our method achieves state-of-the-art performance. For example, when implementing GQE-Net on the recent G-PCC coding standard test model, 0.43 dB, 0.25 dB, and 0.36 dB Bjontegaard delta (BD)-peak-signal-to-noise ratio (PSNR), corresponding to 14.0%, 9.3%, and 14.5% BD-rate savings can be achieved on dense point clouds for the Y, Cb, and Cr components, respectively., 13 pages, 11 figures, submitted to IEEE TIP
- Published
- 2023
27. CAS-NET: Cascade attention-based sampling neural network for point cloud simplification
- Author
- Chen, Chen, Yuan, Hui, Liu, Hao, Hou, Junhui, and Hamzaoui, Raouf
- Subjects
- Point clouds, Attention-based sampling
- Abstract
Point cloud sampling can reduce storage requirements and computation costs for various vision tasks. Traditional sampling methods, such as farthest point sampling, are not geared towards downstream tasks and may fail on such tasks. In this paper, we propose a cascade attention-based sampling network (CAS-Net), which is end-to-end trainable. Specifically, we propose an attention-based sampling module (ASM) to capture the semantic features and preserve the geometry of the original point cloud. Experimental results on the ModelNet40 dataset show that CAS-Net outperforms state-of-the-art methods in a sampling-based point cloud classification task, while preserving the geometric structure of the sampled point cloud.
- Published
- 2023
28. Bidirectional Propagation for Cross-Modal 3D Object Detection
- Author
- Zhang, Yifan, Zhang, Qijian, Hou, Junhui, Yuan, Yixuan, and Xing, Guoliang
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
Recent works have revealed the superiority of feature-level fusion for cross-modal 3D object detection, where fine-grained feature propagation from 2D image pixels to 3D LiDAR points has been widely adopted for performance improvement. Still, the potential of heterogeneous feature propagation between 2D and 3D domains has not been fully explored. In this paper, in contrast to existing pixel-to-point feature propagation, we investigate an opposite point-to-pixel direction, allowing point-wise features to flow inversely into the 2D image branch. Thus, when jointly optimizing the 2D and 3D streams, the gradients back-propagated from the 2D image branch can boost the representation ability of the 3D backbone network working on LiDAR point clouds. Then, combining pixel-to-point and point-to-pixel information flow mechanisms, we construct a bidirectional feature propagation framework, dubbed BiProDet. In addition to the architectural design, we also propose normalized local coordinates map estimation, a new 2D auxiliary task for the training of the 2D image branch, which facilitates learning local spatial-aware features from the image modality and implicitly enhances the overall 3D detection performance. Extensive experiments and ablation studies validate the effectiveness of our method. Notably, we rank $\mathbf{1^{\mathrm{st}}}$ on the highly competitive KITTI benchmark on the cyclist class by the time of submission. The source code is available at https://github.com/Eaphan/BiProDet.
- Published
- 2023
29. Deep Diversity-Enhanced Feature Representation of Hyperspectral Images
- Author
- Hou, Jinhui, Zhu, Zhiyu, Hou, Junhui, Liu, Hui, Zeng, Huanqiang, and Meng, Deyu
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Image and Video Processing (eess.IV), FOS: Electrical engineering, electronic engineering, information engineering, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
- Abstract
In this paper, we study the problem of embedding the high-dimensional spatio-spectral information of hyperspectral (HS) images efficiently and effectively, oriented by feature diversity. To be specific, based on the theoretical formulation that feature diversity is correlated with the rank of the unfolded kernel matrix, we rectify 3D convolution by modifying its topology to boost the rank upper-bound, yielding a rank-enhanced spatial-spectral symmetrical convolution set (ReS$^3$-ConvSet), which is able to not only learn diverse and powerful feature representations but also save network parameters. In addition, we also propose a novel diversity-aware regularization (DA-Reg) term, which acts directly on the feature maps to maximize the independence among elements. To demonstrate the superiority of the proposed ReS$^3$-ConvSet and DA-Reg, we apply them to various HS image processing and analysis tasks, including denoising, spatial super-resolution, and classification. Extensive experiments demonstrate that the proposed approaches outperform state-of-the-art methods to a significant extent both quantitatively and qualitatively. The code is publicly available at \url{https://github.com/jinnh/ReSSS-ConvSet}., 15 pages, 12 figures. arXiv admin note: substantial text overlap with arXiv:2207.04266
- Published
- 2023
30. Self-Supervised Pre-training for 3D Point Clouds via View-Specific Point-to-Image Translation
- Author
- Zhang, Qijian and Hou, Junhui
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
The past few years have witnessed the prevalence of self-supervised representation learning within the language and 2D vision communities. However, such advancements have not been fully migrated to the 3D point cloud learning community. Different from existing pre-training paradigms for 3D point clouds falling into the scope of generative modeling or contrastive learning, this paper proposes a translative pre-training framework, namely PointVST, driven by a novel self-supervised pretext task of cross-modal translation from 3D point clouds to their corresponding diverse forms of 2D rendered images. More specifically, we start by deducing view-conditioned point-wise embeddings via the insertion of a viewpoint indicator and then adaptively aggregate a view-specific global codeword fed into the subsequent 2D convolutional translation heads for image generation. Extensive experiments on various downstream tasks of 3D shape analysis and scene understanding demonstrate that PointVST shows consistent and prominent performance superiority over current state-of-the-art methods. Our code will be made publicly available.
- Published
- 2022
31. A Comprehensive Study and Comparison of the Robustness of 3D Object Detectors Against Adversarial Attacks
- Author
- Zhang, Yifan, Hou, Junhui, and Yuan, Yixuan
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
Deep learning-based 3D object detectors have made significant progress in recent years and have been deployed in a wide range of applications. It is crucial to understand the robustness of detectors against adversarial attacks when employing detectors in security-critical applications. In this paper, we make the first attempt to conduct a thorough evaluation and analysis of the robustness of 3D detectors under adversarial attacks. Specifically, we first extend three kinds of adversarial attacks to the 3D object detection task to benchmark the robustness of state-of-the-art 3D object detectors against attacks on KITTI and Waymo datasets, subsequently followed by the analysis of the relationship between robustness and properties of detectors. Then, we explore the transferability of cross-model, cross-task, and cross-data attacks. We finally conduct comprehensive experiments of defense for 3D detectors, demonstrating that simple transformations like flipping are of little help in improving robustness when the strategy of transformation imposed on input point cloud data is exposed to attackers. Our findings will facilitate investigations in understanding and defending the adversarial attacks against 3D object detectors to advance this field.
- Published
- 2022
32. Graph Augmentation Clustering Network
- Author
- Peng, Zhihao, Liu, Hui, Jia, Yuheng, and Hou, Junhui
- Subjects
- FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Science - Multimedia, Machine Learning (cs.LG), Multimedia (cs.MM)
- Abstract
Existing graph clustering networks heavily rely on a predefined graph and may fail if the initial graph is of low quality. To tackle this issue, we propose a novel graph augmentation clustering network capable of adaptively enhancing the initial graph to achieve better clustering performance. Specifically, we first integrate the node attribute and topology structure information to learn the latent feature representation. Then, we explore the local geometric structure information on the embedding space to construct an adjacency graph and subsequently develop an adaptive graph augmentation architecture to fuse that graph with the initial one dynamically. Finally, we minimize the Jeffreys divergence between multiple derived distributions to conduct network training in an unsupervised fashion. Extensive experiments on six commonly used benchmark datasets demonstrate that the proposed method consistently outperforms several state-of-the-art approaches. In particular, our method improves the ARI by more than 9.39\% over the best baseline on DBLP. The source codes and data have been submitted to the appendix.
- Published
- 2022
33. PointMCD: Boosting Deep Point Cloud Encoders via Multi-view Cross-modal Distillation for 3D Shape Recognition
- Author
- Zhang, Qijian, Hou, Junhui, and Qian, Yue
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
As two fundamental representation modalities of 3D objects, 3D point clouds and multi-view 2D images record shape information from different domains of geometric structures and visual appearances. In the current deep learning era, remarkable progress in processing such two data modalities has been achieved through respectively customizing compatible 3D and 2D network architectures. However, unlike multi-view image-based 2D visual modeling paradigms, which have shown leading performance in several common 3D shape recognition benchmarks, point cloud-based 3D geometric modeling paradigms are still highly limited by insufficient learning capacity, due to the difficulty of extracting discriminative features from irregular geometric signals. In this paper, we explore the possibility of boosting deep 3D point cloud encoders by transferring visual knowledge extracted from deep 2D image encoders under a standard teacher-student distillation workflow. Generally, we propose PointMCD, a unified multi-view cross-modal distillation architecture, including a pretrained deep image encoder as the teacher and a deep point encoder as the student. To perform heterogeneous feature alignment between 2D visual and 3D geometric domains, we further investigate visibility-aware feature projection (VAFP), by which point-wise embeddings are reasonably aggregated into view-specific geometric descriptors. By pair-wisely aligning multi-view visual and geometric descriptors, we can obtain more powerful deep point encoders without exhausting and complicated network modification. Experiments on 3D shape classification, part segmentation, and unsupervised learning strongly validate the effectiveness of our method. The code and data will be publicly available at https://github.com/keeganhk/PointMCD., Comment: Accepted to TMM
- Published
- 2022
34. GLENet: Boosting 3D Object Detectors with Generative Label Uncertainty Estimation
- Author
- Zhang, Yifan, Zhang, Qijian, Zhu, Zhiyu, Hou, Junhui, and Yuan, Yixuan
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
The inherent ambiguity in ground-truth annotations of 3D bounding boxes, caused by occlusions, signal missing, or manual annotation errors, can confuse deep 3D object detectors during training, thus deteriorating detection accuracy. However, existing methods overlook such issues to some extent and treat the labels as deterministic. In this paper, we formulate the label uncertainty problem as the diversity of potentially plausible bounding boxes of objects. Then, we propose GLENet, a generative framework adapted from conditional variational autoencoders, to model the one-to-many relationship between a typical 3D object and its potential ground-truth bounding boxes with latent variables. The label uncertainty generated by GLENet is a plug-and-play module and can be conveniently integrated into existing deep 3D detectors to build probabilistic detectors and supervise the learning of the localization uncertainty. Besides, we propose an uncertainty-aware quality estimator architecture in probabilistic detectors to guide the training of the IoU-branch with predicted localization uncertainty. We incorporate the proposed methods into various popular base 3D detectors and demonstrate significant and consistent performance gains on both KITTI and Waymo benchmark datasets. Especially, the proposed GLENet-VR outperforms all published LiDAR-based approaches by a large margin and achieves the top rank among single-modal methods on the challenging KITTI test set. The source code and pre-trained models are publicly available at \url{https://github.com/Eaphan/GLENet}.
- Published
- 2022
35. Light Field Depth Estimation Based on Stitched-EPI
- Author
- Zhou, Ping, Liu, Xiaoyang, Jin, Jing, Zhang, Yuting, and Hou, Junhui
- Subjects
- I.4.5, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_COMPUTERGRAPHICS
- Abstract
Depth estimation is one of the most essential problems for light field applications. In EPI-based methods, the slope computation usually suffers low accuracy due to the discretization error and low angular resolution. In addition, recent methods work well in most regions but often struggle with blurry edges over occluded regions and ambiguity over texture-less regions. To address these challenging issues, we first propose the stitched-EPI and half-stitched-EPI algorithms for non-occluded and occluded regions, respectively. The algorithms improve slope computation by shifting and concatenating lines in different EPIs but related to the same point in 3D scene, while the half-stitched-EPI only uses non-occluded part of lines. Combined with the joint photo-consistency cost proposed by us, the more accurate and robust depth map can be obtained in both occluded and non-occluded regions. Furthermore, to improve the depth estimation in texture-less regions, we propose a depth propagation strategy that determines their depth from the edge to interior, from accurate regions to coarse regions. Experimental and ablation results demonstrate that the proposed method achieves accurate and robust depth maps in all regions effectively., Comment: 15 pages
- Published
- 2022
36. Knowledge-map analysis of bladder cancer immunotherapy.
- Author
- Lv, Zongwei, Hou, Junhui, Wang, Yuan, Wang, Xia, Wang, Yibing, and Wang, Kefeng
- Published
- 2023
37. Motion capture data recovery using skeleton constrained singular value thresholding
- Author
- Tan, Cheen-Hau, Hou, JunHui, and Chau, Lap-Pui
- Published
- 2015
38. Guest editorial: Deep learning‐based point cloud processing, compression and analysis.
- Author
- Zhang, Yun, Hamzaoui, Raouf, Wang, Xu, Hou, Junhui, and Valenzise, Giuseppe
- Subjects
- DEEP learning, POINT cloud, POINT processes, OBJECT recognition (Computer vision), ARTIFICIAL intelligence, SIGNAL processing
- Abstract
Point cloud data is a large collection of high dimensional 3D points with 3D coordinates and attributes, which has been one of the mainstream representations for emerging 3D applications, such as virtual reality, autonomous vehicles, and robotics. Due to the large‐scale unstructured high‐dimensional nature of point clouds, point cloud processing, transmitting and analysing has been challenging issues in multimedia signal processing and communication. Deep learning is a powerful tool to learn statistical knowledge from massive data. Advances in artificial intelligence, especially deep learning models are offering new opportunities for point cloud processing, compression and analysis. This special issue aims at promoting cutting‐edge research on deep learning‐based point cloud processing, including object detection, segmentation, registration, compression, and visual quality assessment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
39. Deep Spatial-Angular Regularization for Light Field Imaging, Denoising, and Super-Resolution.
- Author
- Guo, Mantang, Hou, Junhui, Jin, Jing, Chen, Jie, and Chau, Lap-Pui
- Subjects
- DECODING algorithms, DEEP learning, INVERSE problems, MATHEMATICAL models, GLOBAL method of teaching
- Abstract
Coded aperture is a promising approach for capturing the 4-D light field (LF), in which the 4-D data are compressively modulated into 2-D coded measurements that are further decoded by reconstruction algorithms. The bottleneck lies in the reconstruction algorithms, resulting in rather limited reconstruction quality. To tackle this challenge, we propose a novel learning-based framework for the reconstruction of high-quality LFs from acquisitions via learned coded apertures. The proposed method incorporates the measurement observation into the deep learning framework elegantly to avoid relying entirely on data-driven priors for LF reconstruction. Specifically, we first formulate the compressive LF reconstruction as an inverse problem with an implicit regularization term. Then, we construct the regularization term with a deep efficient spatial-angular separable convolutional sub-network in the form of local and global residual learning to comprehensively explore the signal distribution free from the limited representation ability and inefficiency of deterministic mathematical modeling. Furthermore, we extend this pipeline to LF denoising and spatial super-resolution, which could be considered as variants of coded aperture imaging equipped with different degradation matrices. Extensive experimental results demonstrate that the proposed methods outperform state-of-the-art approaches to a significant extent both quantitatively and qualitatively, i.e., the reconstructed LFs not only achieve much higher PSNR/SSIM but also preserve the LF parallax structure better on both real and synthetic LF benchmarks. The code will be publicly available at https://github.com/MantangGuo/DRLF. [ABSTRACT FROM AUTHOR]
- Published
- 2022
40. Finding Stars From Fireworks: Improving Non-Cooperative Iris Tracking.
- Author
- Lin, Chengdong, Li, Xinlin, Li, Zhenjiang, and Hou, Junhui
- Subjects
- IRIS (Eye), FIREWORKS, VIDEO excerpts, SCLERA, IRIS recognition
- Abstract
We revisit the problem of iris tracking with RGB cameras, aiming to obtain iris contours from captured images of eyes. We find the reason that limits the performance of the state-of-the-art method in more general non-cooperative environments, which prohibits a wider adoption of this useful technique in practice. We believe that because the iris boundary could be inherently unclear and blocked, as its pixels occupy only an extremely limited percentage of those on the entire image of the eye, similar to the stars hidden in fireworks, we should not treat the boundary pixels as one class to conduct end-to-end recognition directly. Thus, we propose to learn features from iris and sclera regions first, and then leverage entropy to sketch the thin and sharp iris boundary pixels, where we can trace more precise parameterized iris contours. In this work, we also collect a new dataset by smartphone with 22 K images of eyes from video clips. We annotate a subset of 2 K images, so that label propagation can be applied to further enhance the system performance. Extensive experiments over both public and our own datasets show that our method outperforms the state-of-the-art method. The results also indicate that our method can improve the coarsely labeled data to enhance the iris contour’s accuracy and support the downstream application better than the prior method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
41. Light Field Reconstruction via Deep Adaptive Fusion of Hybrid Lenses
- Author
- Jin, Jing, Guo, Mantang, Hou, Junhui, Liu, Hui, and Xiong, Hongkai
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Image and Video Processing (eess.IV), FOS: Electrical engineering, electronic engineering, information engineering, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
- Abstract
This paper explores the problem of reconstructing high-resolution light field (LF) images from hybrid lenses, including a high-resolution camera surrounded by multiple low-resolution cameras. The performance of existing methods is still limited, as they produce either blurry results on plain textured areas or distortions around depth discontinuous boundaries. To tackle this challenge, we propose a novel end-to-end learning-based approach, which can comprehensively utilize the specific characteristics of the input from two complementary and parallel perspectives. Specifically, one module regresses a spatially consistent intermediate estimation by learning a deep multidimensional and cross-domain feature representation, while the other module warps another intermediate estimation, which maintains the high-frequency textures, by propagating the information of the high-resolution view. We finally leverage the advantages of the two intermediate estimations adaptively via the learned attention maps, leading to the final high-resolution LF image with satisfactory results on both plain textured areas and depth discontinuous boundaries. Besides, to promote the effectiveness of our method trained with simulated hybrid data on real hybrid data captured by a hybrid LF imaging system, we carefully design the network architecture and the training strategy. Extensive experiments on both real and simulated hybrid data demonstrate the significant superiority of our approach over state-of-the-art ones. To the best of our knowledge, this is the first end-to-end deep learning method for LF reconstruction from a real hybrid input. We believe our framework could potentially decrease the cost of high-resolution LF data acquisition and benefit LF data storage and transmission., Accepted by IEEE TPAMI. arXiv admin note: text overlap with arXiv:1907.09640
- Published
- 2021
42. Point Cloud Quality Assessment via 3D Edge Similarity Measurement.
- Author
- Lu, Zian, Huang, Hailiang, Zeng, Huanqiang, Hou, Junhui, and Ma, Kai-Kuang
- Subjects
- POINT cloud, FEATURE extraction
- Abstract
In this letter, a new full-reference metric is presented to assess the perceptual quality of the point clouds (PCs). The human visual system (HVS) always shows a high sensitivity to the three-dimensional (3D) edge features inherent in the PCs. With this motivation, the three-dimensional edge similarity-based model (TDESM) is proposed, which makes the first attempt to apply 3D Difference of Gaussian (3D-DOG) on point cloud quality assessment (PCQA). Specifically, the 3D edge features are captured by convolving the dual-scale 3D-DOG filters with both reference and distorted PCs. The quality scores of distorted PCs are generated by combining the 3D edge similarity measured from different scales. The experiments are conducted on four publicly available PCQA datasets, i.e., Torlig2018, M-PCCD, ICIP2020, and SJTU-PCQA. Compared with multiple state-of-the-art PCQA metrics, our proposed approach is more consistent with the subjective perception of the PCs. [ABSTRACT FROM AUTHOR]
- Published
- 2022
43. Semisupervised Affinity Matrix Learning via Dual-Channel Information Recovery.
- Author
- Jia, Yuheng, Liu, Hui, Hou, Junhui, Kwong, Sam, and Zhang, Qingfu
- Abstract
This article explores the problem of semisupervised affinity matrix learning, that is, learning an affinity matrix of data samples under the supervision of a small number of pairwise constraints (PCs). By observing that both the matrix encoding PCs, called pairwise constraint matrix (PCM) and the empirically constructed affinity matrix (EAM), express the similarity between samples, we assume that both of them are generated from a latent affinity matrix (LAM) that can depict the ideal pairwise relation between samples. Specifically, the PCM can be thought of as a partial observation of the LAM, while the EAM is a fully observed one but corrupted with noise/outliers. To this end, we innovatively cast the semisupervised affinity matrix learning as the recovery of the LAM guided by the PCM and EAM, which is technically formulated as a convex optimization problem. We also provide an efficient algorithm for solving the resulting model numerically. Extensive experiments on benchmark datasets demonstrate the significant superiority of our method over state-of-the-art ones when used for constrained clustering and dimensionality reduction. The code is publicly available at https://github.com/jyh-learning/LAM. [ABSTRACT FROM AUTHOR]
- Published
- 2022
44. Occlusion-Aware Unsupervised Learning of Depth From 4-D Light Fields.
- Author
- Jin, Jing and Hou, Junhui
- Subjects
- KERNEL (Mathematics), GRAPHICS processing units, COHERENCE (Physics), OCCLUSION (Chemistry)
- Abstract
Depth estimation is a fundamental issue in 4-D light field processing and analysis. Although recent supervised learning-based light field depth estimation methods have significantly improved the accuracy and efficiency of traditional optimization-based ones, these methods rely on the training over light field data with ground-truth depth maps which are challenging to obtain or even unavailable for real-world light field data. Besides, due to the inevitable gap (or domain difference) between real-world and synthetic data, they may suffer from serious performance degradation when generalizing the models trained with synthetic data to real-world data. By contrast, we propose an unsupervised learning-based method, which does not require ground-truth depth as supervision during training. Specifically, based on the basic knowledge of the unique geometry structure of light field data, we present an occlusion-aware strategy to improve the accuracy on occlusion areas, in which we explore the angular coherence among subsets of the light field views to estimate initial depth maps, and utilize a constrained unsupervised loss to learn their corresponding reliability for final depth prediction. Additionally, we adopt a multi-scale network with a weighted smoothness loss to handle the textureless areas. Experimental results on synthetic data show that our method can significantly shrink the performance gap between the previous unsupervised method and supervised ones, and produce depth maps with comparable accuracy to traditional methods with obviously reduced computational cost. Moreover, experiments on real-world datasets show that our method can avoid the domain shift problem presented in supervised methods, demonstrating the great potential of our method. The code will be publicly available at https://github.com/jingjin25/LFDE-OccUnNet. [ABSTRACT FROM AUTHOR]
- Published
- 2022
45. Self-Supervised Symmetric Nonnegative Matrix Factorization.
- Author
- Jia, Yuheng, Liu, Hui, Hou, Junhui, Kwong, Sam, and Zhang, Qingfu
- Subjects
- MATRIX decomposition, NONNEGATIVE matrices, SYMMETRIC matrices, RANDOM matrices, CONSTRAINED optimization, SELF-efficacy
- Abstract
Symmetric nonnegative matrix factorization (SNMF) has demonstrated to be a powerful method for data clustering. However, SNMF is mathematically formulated as a non-convex optimization problem, making it sensitive to the initialization of variables. Inspired by ensemble clustering that aims to seek a better clustering result from a set of clustering results, we propose self-supervised SNMF (S3NMF), which is capable of boosting clustering performance progressively by taking advantage of the sensitivity to initialization characteristic of SNMF, without relying on any additional information. Specifically, we first perform SNMF repeatedly with a random positive matrix for initialization each time, leading to multiple decomposed matrices. Then, we rank the quality of the resulting matrices with adaptively learned weights, from which a new similarity matrix that is expected to be more discriminative is reconstructed for SNMF again. These two steps are iterated until the stopping criterion/maximum number of iterations is achieved. We mathematically formulate S3NMF as a constrained optimization problem, and provide an alternative optimization algorithm to solve it with the theoretical convergence guaranteed. Extensive experimental results on 10 commonly used benchmark datasets demonstrate the significant advantage of our S3NMF over 14 state-of-the-art methods in terms of 5 quantitative metrics. The source code is publicly available at https://github.com/jyh-learning/SSSNMF. [ABSTRACT FROM AUTHOR]
- Published
- 2022
46. PUFA-GAN: A Frequency-Aware Generative Adversarial Network for 3D Point Cloud Upsampling.
- Author
- Liu, Hao, Yuan, Hui, Hou, Junhui, Hamzaoui, Raouf, and Gao, Wei
- Subjects
- GENERATIVE adversarial networks, POINT cloud, FEATURE extraction, DEPERSONALIZATION, DISTRIBUTED algorithms
- Abstract
We propose a generative adversarial network for point cloud upsampling, which can not only make the upsampled points evenly distributed on the underlying surface but also efficiently generate clean high frequency regions. The generator of our network includes a dynamic graph hierarchical residual aggregation unit and a hierarchical residual aggregation unit for point feature extraction and upsampling, respectively. The former extracts multiscale point-wise descriptive features, while the latter captures rich feature details with hierarchical residuals. To generate neat edges, our discriminator uses a graph filter to extract and retain high frequency points. The generated high resolution point cloud and corresponding high frequency points help the discriminator learn the global and high frequency properties of the point cloud. We also propose an identity distribution loss function to make sure that the upsampled points remain on the underlying surface of the input low resolution point cloud. To assess the regularity of the upsampled points in high frequency regions, we introduce two evaluation metrics. Objective and subjective results demonstrate that the visual quality of the upsampled point clouds generated by our method is better than that of the state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
47. Screen Content Video Quality Assessment Model Using Hybrid Spatiotemporal Features.
- Author
- Zeng, Huanqiang, Huang, Hailiang, Hou, Junhui, Cao, Jiuwen, Wang, Yongtao, and Ma, Kai-Kuang
- Subjects
- PARAMETRIC modeling, VIDEOS, DATABASES
- Abstract
In this paper, a full-reference video quality assessment (VQA) model is designed for the perceptual quality assessment of the screen content videos (SCVs), called the hybrid spatiotemporal feature-based model (HSFM). The SCVs are of hybrid structure including screen and natural scenes, which are perceived by the human visual system (HVS) with different visual effects. With this consideration, the three dimensional Laplacian of Gaussian (3D-LOG) filter and three dimensional Natural Scene Statistics (3D-NSS) are exploited to extract the screen and natural spatiotemporal features, based on the reference and distorted SCV sequences separately. The similarities of these extracted features are then computed independently, followed by generating the distorted screen and natural quality scores for screen and natural scenes. After that, an adaptive screen and natural quality fusion scheme through the local video activity is developed to combine them for arriving at the final VQA score of the distorted SCV under evaluation. The experimental results on the Screen Content Video Database (SCVD) and Compressed Screen Content Video Quality (CSCVQ) databases have shown that the proposed HSFM is more in line with the perceptual quality assessment of the SCVs perceived by the HVS, compared with a variety of classic and latest IQA/VQA models. [ABSTRACT FROM AUTHOR]
- Published
- 2022
48. A Spatial and Geometry Feature-Based Quality Assessment Model for the Light Field Images.
- Author
- Huang, Hailiang, Zeng, Huanqiang, Hou, Junhui, Chen, Jing, Zhu, Jianqing, and Ma, Kai-Kuang
- Subjects
- FEATURE extraction, GEOMETRY, PARAMETRIC modeling, GEOMETRIC modeling, VIDEO compression
- Abstract
This paper proposes a new full-reference image quality assessment (IQA) model for performing perceptual quality evaluation on light field (LF) images, called the spatial and geometry feature-based model (SGFM). Considering that the LF image describe both spatial and geometry information of the scene, the spatial features are extracted over the sub-aperture images (SAIs) by using contourlet transform and then exploited to reflect the spatial quality degradation of the LF images, while the geometry features are extracted across the adjacent SAIs based on 3D-Gabor filter and then explored to describe the viewing consistency loss of the LF images. These schemes are motivated and designed based on the fact that the human eyes are more interested in the scale, direction, contour from the spatial perspective and viewing angle variations from the geometry perspective. These operations are applied to the reference and distorted LF images independently. The degree of similarity can be computed based on the above-measured quantities for jointly arriving at the final IQA score of the distorted LF image. Experimental results on three commonly-used LF IQA datasets show that the proposed SGFM is more in line with the quality assessment of the LF images perceived by the human visual system (HVS), compared with multiple classical and state-of-the-art IQA models. [ABSTRACT FROM AUTHOR]
- Published
- 2022
49. Adaptive Attribute and Structure Subspace Clustering Network.
- Author
- Peng, Zhihao, Liu, Hui, Jia, Yuheng, and Hou, Junhui
- Subjects
- SMART structures, SYMMETRIC matrices, SPARSE matrices, FEATURE extraction, DEEP learning
- Abstract
Deep self-expressiveness-based subspace clustering methods have demonstrated effectiveness. However, existing works only consider the attribute information to conduct the self-expressiveness, limiting the clustering performance. In this paper, we propose a novel adaptive attribute and structure subspace clustering network (AASSC-Net) to simultaneously consider the attribute and structure information in an adaptive graph fusion manner. Specifically, we first exploit an auto-encoder to represent input data samples with latent features for the construction of an attribute matrix. We also construct a mixed signed and symmetric structure matrix to capture the local geometric structure underlying data samples. Then, we perform self-expressiveness on the constructed attribute and structure matrices to learn their affinity graphs separately. Finally, we design a novel attention-based fusion module to adaptively leverage these two affinity graphs to construct a more discriminative affinity graph. Extensive experimental results on commonly used benchmark datasets demonstrate that our AASSC-Net significantly outperforms state-of-the-art methods. In addition, we conduct comprehensive ablation studies to discuss the effectiveness of the designed modules. The code is publicly available at https://github.com/ZhihaoPENG-CityU/AASSC-Net. [ABSTRACT FROM AUTHOR]
- Published
- 2022
50. Attention-Guided Progressive Neural Texture Fusion for High Dynamic Range Image Restoration.
- Author
- Chen, Jie, Yang, Zaifeng, Chan, Tsz Nam, Li, Hui, Hou, Junhui, and Chau, Lap-Pui
- Subjects
- HIGH dynamic range imaging, IMAGE reconstruction
- Abstract
High Dynamic Range (HDR) imaging via multi-exposure fusion is an important task for most modern imaging platforms. In spite of recent developments in both hardware and algorithm innovations, challenges remain over content association ambiguities caused by saturation, motion, and various artifacts introduced during multi-exposure fusion such as ghosting, noise, and blur. In this work, we propose an Attention-guided Progressive Neural Texture Fusion (APNT-Fusion) HDR restoration model which aims to address these issues within one framework. An efficient two-stream structure is proposed which separately focuses on texture feature transfer over saturated regions and multi-exposure tonal and texture feature fusion. A neural feature transfer mechanism is proposed which establishes spatial correspondence between different exposures based on multi-scale VGG features in the masked saturated HDR domain for discriminative contextual clues over the ambiguous image areas. A progressive texture blending module is designed to blend the encoded two-stream features in a multi-scale and progressive manner. In addition, we introduce several novel attention mechanisms, i.e., the motion attention module detects and suppresses the content discrepancies among the reference images; the saturation attention module facilitates differentiating the misalignment caused by saturation from those caused by motion; and the scale attention module ensures texture blending consistency between different coder/decoder scales. We carry out comprehensive qualitative and quantitative evaluations and ablation studies, which validate that these novel modules work coherently under the same framework and outperform state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022