178 results for "Chen, Xilin"
Search Results
2. Towards Robust Semantic Segmentation against Patch-Based Attack via Attention Refinement.
- Author
-
Yuan, Zheng, Zhang, Jie, Wang, Yude, Shan, Shiguang, and Chen, Xilin
- Subjects
CONVOLUTIONAL neural networks, TRANSFORMER models, SPINE
- Abstract
The attention mechanism has been proven effective on various visual tasks in recent years. In the semantic segmentation task, the attention mechanism is applied in various methods, with both convolutional neural networks and vision transformers as backbones. However, we observe that the attention mechanism is vulnerable to patch-based adversarial attacks. Through analysis of the effective receptive field, we attribute this to the fact that the wide receptive field brought by global attention may lead to the spread of the adversarial patch's influence. To address this issue, in this paper we propose a robust attention mechanism (RAM) to improve the robustness of semantic segmentation models, which can notably relieve their vulnerability to patch-based attacks. Compared to the vanilla attention mechanism, RAM introduces two novel modules, called max attention suppression and random attention dropout, both of which aim to refine the attention matrix and limit the influence of a single adversarial patch on the semantic segmentation results at other positions. Extensive experiments demonstrate the effectiveness of RAM in improving the robustness of semantic segmentation models against various patch-based attack methods under different attack settings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
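The abstract above does not give the exact formulation of RAM's two modules, but the idea can be sketched in a toy, illustrative way. In the sketch below, the cap value, the dropout rate, and the final renormalization step are our assumptions, not the paper's actual design:

```python
import math
import random

def softmax(scores):
    # Numerically stable softmax over one row of attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def refine_attention(scores, cap=0.3, drop_p=0.2, seed=0):
    # Hypothetical sketch of RAM's two refinements on one attention row:
    # 1) max attention suppression: clip any weight above `cap`;
    # 2) random attention dropout: zero a random subset of weights;
    # finally renormalize so the row sums to 1 again.
    rng = random.Random(seed)
    row = softmax(scores)
    row = [min(w, cap) for w in row]                           # suppression
    row = [0.0 if rng.random() < drop_p else w for w in row]   # dropout
    total = sum(row) or 1.0
    return [w / total for w in row]
```

Both steps bound how much any single position (e.g. one covered by an adversarial patch) can dominate the attention of other positions, which is the vulnerability the abstract identifies.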
3. Architectural design of an open cultural space for a cultural heritage art museum.
- Author
-
Chen, Yuanchiu, Zhang, Shan, Dong, Zuorong, Shen, Fei, Chen, Xilin, and Lin, Yuyan
- Published
- 2024
- Full Text
- View/download PDF
4. Acupotomy alleviates knee osteoarthritis in rabbit by regulating chondrocyte mitophagy via Pink1-Parkin pathway.
- Author
-
ZHU Wenting, GUO Changqing, DU Mei, MA Yunxuan, CUI Yongqi, and CHEN Xilin
- Published
- 2024
- Full Text
- View/download PDF
5. Automated Monitoring and Emergency Response System for Sensitive Areas Along High-Speed Railway Lines.
- Author
-
Chiu, Chen-Yuan, Lin, Yi-Chia, Shen, Fei, Liu, Yujie, Chen, Xilin, and Lin, Yuyan
- Published
- 2024
- Full Text
- View/download PDF
6. Mortality and Severe Complications Among Newly Graduated Surgeons in the United States.
- Author
-
Howard, Ryan A., Thelen, Angela E., Chen, Xilin, Gates, Rebecca, Krumm, Andrew E., Millis, Michael Andrew, Gupta, Tanvi, Brown, Craig S., Bandeh-Ahmadi, Hoda, Wnuk, Greg M., Yee, Chia Chye, Ryan, Andrew M., Mukherjee, Bhramar, Dimick, Justin B., and George, Brian C.
- Published
- 2024
- Full Text
- View/download PDF
7. Audio-guided self-supervised learning for disentangled visual speech representations.
- Author
-
Feng, Dalu, Yang, Shuang, Shan, Shiguang, and Chen, Xilin
- Abstract
4 Conclusion: In this paper, we propose a novel two-branch framework to learn disentangled visual speech representations, based on two particular observations. Its main idea is to introduce the audio signal to guide the learning of speech-relevant cues and to introduce a bottleneck that restricts the speech-irrelevant branch from learning high-frequency and fine-grained speech cues. Experiments on both word-level and sentence-level audio-visual speech datasets, LRW and LRS2-BBC, show the effectiveness of the approach. Our future work is to explore more explicit auxiliary tasks and constraints, beyond the reconstruction task of the speech-relevant and speech-irrelevant branches, to further improve its ability to capture speech cues in the video. Meanwhile, it is also promising to combine multiple types of knowledge representations [10] to further boost the obtained speech representations, which is also left for future work. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Creating an age-inclusive workplace: The impact of HR practices on employee work engagement.
- Author
-
Fan, Peng, Song, Yixiao, Fang, Miaoying, and Chen, Xilin
- Subjects
JOB involvement, SOCIAL exchange, INSTITUTIONAL environment
- Abstract
Drawing on social exchange theory, our study aims to examine how age-inclusive human resource (HR) practices affect work engagement by shaping the age-diversity climate and perceived organizational support (POS). We hypothesize that diversity beliefs play a moderating role in the relationship between age-inclusive HR practices and POS. Our analysis of a sample of 983 employees from 48 organizations in China highlights the direct impact of age-inclusive HR practices on work engagement. Moreover, age-diversity climate and POS mediate the association between age-inclusive HR practices and work engagement. We further demonstrate that diversity beliefs play a moderating role in the association between age-inclusive HR practices and POS. Our findings not only contribute to the literature but also provide practical implications for managing an aging workforce. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
9. Importance First: Generating Scene Graph of Human Interest.
- Author
-
Wang, Wenbin, Wang, Ruiping, Shan, Shiguang, and Chen, Xilin
- Subjects
HUMAN beings, TASK performance
- Abstract
Scene graph aims to faithfully reveal humans' perception of image content. When humans look at a scene, they usually focus on the parts they are interested in, following an innate priority. This habit indicates a hierarchical preference in human perception. Therefore, we argue for generating the Scene Graph of Interest, which should be hierarchically constructed, so that the important primary content is presented first while the secondary content is presented on demand. To achieve this goal, we propose the Tree-Guided Importance Ranking (TGIR) model. We represent the scene with a hierarchical structure by first detecting objects in the scene and organizing them into a Hierarchical Entity Tree (HET) according to their spatial scale, considering that larger objects are more likely to be noticed instantly. After that, the scene graph is generated under the guidance of the structural information of HET, which is modeled by the elaborately designed Hierarchical Contextual Propagation (HCP) module. To further highlight the key relationships in the scene graph, all relationships are re-ranked by additionally estimating their importance with the Relationship Ranking Module (RRM). To train RRM, the most direct way is to collect key relationship annotations, the so-called Direct Supervision scheme. As collecting annotations may be cumbersome, we further utilize two intuitive and effective cues, visual saliency and spatial scale, and treat them as Approximate Supervision, following the finding that these cues are positively correlated with relationship importance. With these readily available cues, RRM is still able to estimate importance even without key relationship annotations. Experiments indicate that our method not only achieves state-of-the-art performance on scene graph generation, but also excels at mining image-specific relationships, which play a great role in serving subsequent tasks such as image captioning and cross-modal retrieval. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
10. Co3O4-ZnO/rGO catalyst preparation and rhodamine B degradation by sulfate radical photocatalysis.
- Author
-
Zhang, Zhanmei, Zhang, Yi, Chen, Xilin, Huang, Ziran, Zou, Zuqin, and Zheng, Huaili
- Abstract
Copyright of Journal of Zhejiang University: Science A is the property of Springer Nature and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
- Full Text
- View/download PDF
11. Acupotomy ameliorates subchondral bone absorption and mechanical properties in rabbits with knee osteoarthritis by regulating bone morphogenetic protein 2-Smad1 pathway.
- Author
-
CHEN Xilin, GUO Yan, LU Juan, QIN Luxue, HU Tingyao, ZENG Xin, WANG Xinyue, ZHANG Anran, ZHUANG Yuxin, ZHONG Honggang, and GUO Changqing
- Published
- 2023
- Full Text
- View/download PDF
12. Electroacupuncture Exerts Chondroprotective Effect in Knee Osteoarthritis of Rabbits Through the Mitophagy Pathway.
- Author
-
Xing, Longfei, Chen, Xilin, Guo, Changqing, Zhu, Wenting, Hu, Tingyao, Ma, Weiwei, Du, Mei, and Xu, Yue
- Subjects
KNEE osteoarthritis, ELECTROACUPUNCTURE, RABBITS, TRANSMISSION electron microscopes, MITOCHONDRIAL membranes
- Abstract
Purpose: Mitochondrial dysfunction of chondrocytes has become an area of focus in knee osteoarthritis (KOA) in recent years. Activation of mitophagy could promote the survival of chondrocytes and alleviate cartilage degeneration. The aim of this study was to explore whether mitophagy was involved in the cartilage protection of KOA rabbits after electroacupuncture (EA) intervention. Methods: The rabbits were divided into 3 groups (Control, KOA, and EA), with 6 rabbits in each group. The KOA model was established by the modified Videman extended immobilization method over 6 weeks, and model rabbits were randomly divided into the KOA and EA groups. The rabbits in the EA group were treated every other day for 3 weeks. The degree of cartilage degeneration was detected by Safranine O-Fast Green staining and immunofluorescence. Morphological changes of chondrocyte mitochondria were detected by transmission electron microscopy. ATP concentration in cartilage was measured by an ATP assay kit. Changes in the Pink1-Parkin signaling pathway were detected by immunofluorescence, Western blot, and real-time PCR. Results: The morphology showed that EA could reduce the degeneration of KOA cartilage and increase the distribution of collagen II. We also found that EA could activate mitophagy in KOA rabbit chondrocytes to remove damaged mitochondria and restore mitochondrial homeostasis, manifested as increased expression of LC3 II/I, promoted colocalization of TOM20 and LC3B, reduced accumulation of the mitochondrial markers outer mitochondrial membrane 20 (TOM20) and inner mitochondrial membrane 23 (TIM23), and increased ATP production in chondrocytes. This regulation might be achieved by upregulating the Pink1-Parkin signaling pathway. Conclusion: EA may play a role in protecting KOA cartilage by activating mitophagy mediated through the Pink1-Parkin pathway. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
13. Association of Surgical Resident Competency Ratings With Patient Outcomes.
- Author
-
Kendrick, Daniel E., Thelen, Angela E., Chen, Xilin, Gupta, Tanvi, Yamazaki, Kenji, Krumm, Andrew E., Bandeh-Ahmadi, Hoda, Clark, Michael, Luckoscki, John, Fan, Zhaohui, Wnuk, Greg M., Ryan, Andrew M., Mukherjee, Bhramar, Hamstra, Stanley J., Dimick, Justin B., Holmboe, Eric S., and George, Brian C.
- Published
- 2023
- Full Text
- View/download PDF
14. Acupotomy Improves Synovial Hypoxia, Synovitis and Angiogenesis in KOA Rabbits.
- Author
-
Guo, Yan, Xu, Yue, He, Meng, Chen, Xilin, Xing, Longfei, Hu, Tingyao, Zhang, Yi, Du, Mei, Zhang, Dian, Zhang, Qian, and Li, Bin
- Subjects
VASCULAR endothelial growth factors, RABBITS, ENZYME-linked immunosorbent assay, VASTUS medialis, RECTUS femoris muscles
- Abstract
Purpose: Knee osteoarthritis (KOA) is a chronic inflammatory disease highly associated with intra-articular hypertension, hypoxia and angiogenesis of synovial tissue. Our previous studies showed that acupotomy could treat KOA in a variety of ways, including reducing cartilage deterioration and enhancing biomechanical qualities. However, the mechanism by which acupotomy affects hypoxia and angiogenesis in KOA synovium remains unclear. This study investigated the beneficial effects of acupotomy on synovial pathology. Methods: The rabbits were divided into 3 groups, Normal group, KOA group, and KOA + Acupotomy (Apo) group, with 11 rabbits in each group. The KOA rabbit model was established by the modified Videman method over six weeks. The KOA + Apo group received the acupotomy intervention. The tendon insertions of the vastus medialis, vastus lateralis, rectus femoris, and biceps femoris, together with the anserine bursa, were selected as treatment points. Rabbits were treated once every 7 days for 3 weeks. We observed the intra-articular pressure and oxygen partial pressure (BOLD MRI). Synovial morphology was monitored by Hematoxylin-Eosin (HE) staining. The expression of hypoxia-inducible transcription factor-1α (HIF-1α), vascular endothelial growth factor (VEGF), interleukin-1β (IL-1β) and tumour necrosis factor-α (TNF-α) was detected using immunohistochemistry (IHC), Western blot and enzyme-linked immunosorbent assay (ELISA). Results: Acupotomy reduced intra-articular hypertension, improved synovial oxygenation, and alleviated synovial inflammation and angiogenesis. HIF-1α, VEGF, IL-1β and TNF-α expression were downregulated by acupotomy. Conclusion: Acupotomy may reduce inflammation and angiogenesis in KOA rabbits by reducing abnormally elevated intra-articular pressure and improving the synovial oxygen environment. These findings may provide a new theoretical foundation for acupotomy treatment of KOA. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
15. CMOS-GAN: Semi-Supervised Generative Adversarial Model for Cross-Modality Face Image Synthesis.
- Author
-
Yu, Shikang, Han, Hu, Shan, Shiguang, and Chen, Xilin
- Subjects
FACE, GENERATIVE adversarial networks, COMMUNITIES
- Abstract
Cross-modality face image synthesis such as sketch-to-photo, NIR-to-RGB, and RGB-to-depth has wide applications in face recognition, face animation, and digital entertainment. Conventional cross-modality synthesis methods usually require paired training data, i.e., each subject has images of both modalities. However, paired data can be difficult to acquire, while unpaired data commonly exist. In this paper, we propose a novel semi-supervised cross-modality synthesis method (namely CMOS-GAN), which can leverage both paired and unpaired face images to learn a robust cross-modality synthesis model. Specifically, CMOS-GAN uses a generator of encoder-decoder architecture for new modality synthesis. We leverage pixel-wise loss, adversarial loss, classification loss, and face feature loss to exploit the information from both paired multi-modality face images and unpaired face images for model learning. In addition, since we expect the synthetic new modality can also be helpful for improving face recognition accuracy, we further use a modified triplet loss to retain the discriminative features of the subject in the synthetic modality. Experiments on three cross-modality face synthesis tasks (NIR-to-VIS, RGB-to-depth, and sketch-to-photo) show the effectiveness of the proposed approach compared with the state-of-the-art. In addition, we also collect a large-scale RGB-D dataset (VIPL-MumoFace-3K) for the RGB-to-depth synthesis task. We plan to open-source our code and VIPL-MumoFace-3K dataset to the community (https://github.com/skgyu/CMOS-GAN). [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
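Of the losses listed in the abstract above, the triplet loss is the easiest to show in isolation. CMOS-GAN uses a modified triplet loss whose modification the abstract does not specify; the sketch below is therefore the standard form, with the margin value and squared-distance choice as our assumptions:

```python
def triplet_loss(anchor, positive, negative, margin=0.2):
    # Standard triplet loss on feature vectors (lists of floats):
    # pull the anchor toward the positive (same subject), push it
    # away from the negative (different subject) by at least `margin`.
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)
```

The loss is zero once the negative is at least `margin` farther from the anchor than the positive, which is how such a term retains discriminative subject features in the synthetic modality.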
16. Surgical Trainee Performance and Alignment With Surgical Program Director Expectations.
- Author
-
Abbott, Kenneth L., Krumm, Andrew E., Kelley, Jesse K., Kendrick, Daniel E., Clark, Michael J., Chen, Xilin, Gupta, Tanvi, Jones, Andrew T., Moreno, Beatriz Ibáñez, Kwakye, Gifty, Zaidi, Nikki L. Bibler, Swanson, David B., Bell, Richard H., and George, Brian C.
- Published
- 2022
- Full Text
- View/download PDF
17. Feature Completion for Occluded Person Re-Identification.
- Author
-
Hou, Ruibing, Ma, Bingpeng, Chang, Hong, Gu, Xinqian, Shan, Shiguang, and Chen, Xilin
- Subjects
CONVOLUTIONAL neural networks, COMPUTER vision, SOURCE code, FEATURE extraction, TASK analysis
- Abstract
Person re-identification (reID) plays an important role in computer vision. However, existing methods suffer from performance degradation in occluded scenes. In this work, we propose an occlusion-robust block, Region Feature Completion (RFC), for occluded reID. Different from most previous works that discard the occluded regions, the RFC block can recover the semantics of occluded regions in feature space. First, a Spatial RFC (SRFC) module is developed. SRFC exploits the long-range spatial contexts from non-occluded regions to predict the features of occluded regions. The unit-wise prediction task leads to an encoder/decoder architecture, where the region-encoder models the correlation between non-occluded and occluded regions, and the region-decoder utilizes the spatial correlation to recover occluded-region features. Second, we introduce a Temporal RFC (TRFC) module which captures long-term temporal contexts to refine the prediction of SRFC. The RFC block is lightweight, end-to-end trainable and can be easily plugged into existing CNNs to form RFCnet. Extensive experiments are conducted on occluded and commonly used holistic reID benchmarks. Our method significantly outperforms existing methods on the occlusion datasets, while maintaining top, even superior, performance on holistic datasets. The source code is available at https://github.com/blue-blue272/OccludedReID-RFCnet. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
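The SRFC module described above learns to predict occluded-region features from the spatial context of visible regions. As a rough, non-learned stand-in for that idea (the simple averaging rule below is our assumption, not the paper's encoder/decoder), feature completion can be illustrated as:

```python
def complete_features(region_feats, occluded):
    # Toy stand-in for feature completion: replace each occluded
    # region's feature with the mean of the visible regions' features.
    # (SRFC learns this prediction from long-range spatial context;
    # this baseline only illustrates recovering occluded semantics
    # instead of discarding the occluded regions.)
    visible = [f for i, f in enumerate(region_feats) if i not in occluded]
    mean = sum(visible) / len(visible)
    return [mean if i in occluded else f for i, f in enumerate(region_feats)]
```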
18. Measurement of filtering efficiency of artificial radioactive aerosol filter membrane.
- Author
-
Guoxiu, Qin, Chen, Xilin, Xu, Youning, Li, Fan, Zhou, Wenping, and Li, Weizhe
- Subjects
FILTERS & filtration, MEMBRANE filters, RADIOACTIVE aerosols, UNITS of measurement
- Abstract
When monitoring radioactive aerosols in the atmosphere, choosing a filter membrane with better surface collection characteristics and filtering efficiency facilitates improved monitoring efficiency. In this paper, a previously developed calibration system for radioactive aerosol monitors was improved: a device for measuring the filtering efficiency of filter membranes was added to the sampling and measurement unit. The device was tested using several commonly used filter membranes, and the measurements were compared with those of a general method. The results showed that the method proposed in this paper can quickly and accurately measure the filtering efficiency of various filter membranes, with a deviation from the general method of less than 2%. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
19. SANet: Statistic Attention Network for Video-Based Person Re-Identification.
- Author
-
Bai, Shutao, Ma, Bingpeng, Chang, Hong, Huang, Rui, Shan, Shiguang, and Chen, Xilin
- Subjects
FEATURE extraction, SOURCE code, PEDESTRIANS, MACHINE learning, IDENTIFICATION, POSE estimation (Computer vision)
- Abstract
Capturing long-range dependencies during feature extraction is crucial for video-based person re-identification (re-id) since it would help to tackle many challenging problems such as occlusion and dramatic pose variation. Moreover, capturing subtle differences, such as bags and glasses, is indispensable to distinguish similar pedestrians. In this paper, we propose a novel and efficacious Statistic Attention (SA) block which can capture both the long-range dependencies and subtle differences. SA block leverages high-order statistics of feature maps, which contain both long-range and high-order information. By modeling relations with these statistics, SA block can explicitly capture long-range dependencies with less time complexity. In addition, high-order statistics usually concentrate on details of feature maps and can perceive the subtle differences between pedestrians. In this way, SA block is capable of discriminating pedestrians with subtle differences. Furthermore, this lightweight block can be conveniently inserted into existing deep neural networks at any depth to form Statistic Attention Network (SANet). To evaluate its performance, we conduct extensive experiments on two challenging video re-id datasets, showing that our SANet outperforms the state-of-the-art methods. Furthermore, to show the generalizability of SANet, we evaluate it on three image re-id datasets and two more general image classification datasets, including ImageNet. The source code is available at http://vipl.ict.ac.cn/resources/codes/code/SANet_code.zip. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
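The SA block above builds on high-order statistics of feature maps. A minimal sketch of such statistics (the choice of central moments, and cutting off at a fixed highest order, are our assumptions for illustration) is:

```python
def central_moments(xs, highest=4):
    # Mean plus central moments of a feature vector up to `highest`
    # order. Higher-order statistics concentrate on fine detail,
    # which is what lets them pick up subtle differences such as
    # bags or glasses between otherwise similar pedestrians.
    n = len(xs)
    mean = sum(xs) / n
    stats = [mean]
    for k in range(2, highest + 1):
        stats.append(sum((x - mean) ** k for x in xs) / n)
    return stats
```

The second entry is the variance; the third and fourth relate to skewness and kurtosis, i.e. progressively finer shape information about the feature distribution.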
20. Acupotomy inhibits aberrant formation of subchondral bone through regulating osteoprotegerin/receptor activator of nuclear factor-κB ligand pathway in rabbits with knee osteoarthritis induced by modified Videman method.
- Author
-
QIN Luxue, GUO Changqing, ZHAO Ruili, WANG Tong, WANG Junmei, GUO Yan, ZHANG Wei, HU Tingyao, CHEN Xilin, ZHANG Qian, ZHANG Dian, and XU Yue
- Published
- 2022
- Full Text
- View/download PDF
21. A Spatio-Temporal Approach for Apathy Classification.
- Author
-
Das, Abhijit, Niu, Xuesong, Dantcheva, Antitza, Happy, S. L., Han, Hu, Zeghari, Radia, Robert, Philippe, Shan, Shiguang, Bremond, Francois, and Chen, Xilin
- Subjects
APATHY, VIDEO excerpts, SOCIAL interaction, EMOTIONS, CLASSIFICATION
- Abstract
Apathy is characterized by symptoms such as reduced emotional response, lack of motivation, and limited social interaction. Current methods for apathy diagnosis require the patient's presence in a clinic and time-consuming clinical interviews, which are costly and inconvenient for both patients and clinical staff, hindering, among other things, large-scale diagnostics. In this work, we propose a novel spatio-temporal framework for apathy classification, which is streamlined to analyze facial dynamics and emotion in videos. Specifically, we divide the videos into smaller clips and proceed to extract associated facial dynamics and emotion-based features. Statistical representations/descriptors based on each feature and clip serve as input to the proposed Gated Recurrent Unit (GRU) architecture. Temporal representations of individual features at the lower level of the proposed architecture are combined at deeper layers of the GRU architecture in order to obtain the final feature set for apathy classification. Based on extensive experiments, we show that fusion of characteristics such as emotion and facial dynamics in the proposed deep bidirectional GRU achieves an accuracy of 95.34% in apathy classification. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
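The clip-splitting and per-clip statistical descriptors described above can be sketched minimally; using the mean as the only descriptor is a simplifying assumption (the paper computes richer statistical representations per clip before the GRU):

```python
def clip_descriptors(feature_track, clip_len):
    # Split a frame-level feature track into fixed-length clips and
    # summarize each clip with its mean, one simple statistical
    # descriptor per clip. Such per-clip summaries are what the
    # framework feeds into the GRU architecture.
    clips = [feature_track[i:i + clip_len]
             for i in range(0, len(feature_track), clip_len)]
    return [sum(c) / len(c) for c in clips]
```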
22. The Quality of Operative Performance Narrative Feedback: A Retrospective Data Comparison Between End of Rotation Evaluations and Workplace-based Assessments.
- Author
-
Ahle, Samantha L. MHS Med-Ed, Eskender, Mickyas, Schuller, Mary MSEd, Carnes, Emily, Chen, Xilin, Koehler, Jeanne, Willey, Gabrielle, Latif, Ahmed, Doyle, Jennifer, Wnuk, Gregory MHSA, Fryer, Jonathan P. MHPE, Mellinger, John D., and George, Brian C. Ed
- Published
- 2022
- Full Text
- View/download PDF
23. Learning on 3D Meshes With Laplacian Encoding and Pooling.
- Author
-
Qiao, Yi-Ling, Gao, Lin, Yang, Jie, Rosin, Paul L., Lai, Yu-Kun, and Chen, Xilin
- Subjects
MULTILAYER perceptrons, COMPUTER vision, COMPUTER graphics, MATRIX multiplications, ENCODING, TRIANGULATION, DEEP learning
- Abstract
3D models are commonly used in computer vision and graphics. With the wider availability of mesh data, an efficient and intrinsic deep learning approach to processing 3D meshes is greatly needed. Unlike images, 3D meshes have irregular connectivity, requiring careful design to capture relations in the data. To utilize the topology information while staying robust under different triangulations, we propose to encode mesh connectivity using Laplacian spectral analysis, along with mesh feature aggregation blocks (MFABs) that can split the surface domain into local pooling patches and aggregate global information amongst them. We build a mesh hierarchy from fine to coarse using Laplacian spectral clustering, which is flexible under isometric transformations. Inside the MFABs there are pooling layers to collect local information and multi-layer perceptrons to compute vertex features of increasing complexity. To obtain the relationships among different clusters, we introduce a Correlation Net to compute a correlation matrix, which can aggregate the features globally by matrix multiplication with cluster features. Our network architecture is flexible enough to be used on meshes with different numbers of vertices. We conduct several experiments including shape segmentation and classification, and our method outperforms state-of-the-art algorithms for these tasks on the ShapeNet and COSEG datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
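The Laplacian spectral analysis mentioned above starts from the combinatorial graph Laplacian of the mesh's vertex connectivity, L = D - A (degree matrix minus adjacency matrix). A minimal construction from an edge list, assuming a simple undirected vertex graph:

```python
def graph_laplacian(num_vertices, edges):
    # Dense combinatorial Laplacian L = D - A of a mesh's vertex graph.
    # Its eigenvectors form the basis used in Laplacian spectral
    # analysis, and spectral clustering on it yields the kind of
    # fine-to-coarse hierarchy the abstract describes.
    L = [[0] * num_vertices for _ in range(num_vertices)]
    for i, j in edges:
        L[i][j] -= 1   # off-diagonal: -A
        L[j][i] -= 1
        L[i][i] += 1   # diagonal: vertex degrees D
        L[j][j] += 1
    return L
```

Every row of L sums to zero, a basic sanity check on the construction.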
24. Personalized Convolution for Face Recognition.
- Author
-
Han, Chunrui, Shan, Shiguang, Kan, Meina, Wu, Shuzhe, and Chen, Xilin
- Subjects
NOSE, CONVOLUTIONAL neural networks, DEEP learning, FEATURE extraction, FACE perception
- Abstract
Face recognition has been significantly advanced by deep learning based methods. In all face recognition methods based on convolutional neural networks (CNNs), the convolutional kernels for feature extraction are fixed regardless of the input face once the training stage is finished. By contrast, we humans are usually impressed by unique characteristics of different persons, such as one person's blue eyes, another's crooked nose, or even a naevus at a specific location. Inspired by this observation, we propose a personalized convolution method which aims to extract the special distinguishing characteristics of each person for more accurate face recognition. Specifically, given a face, we adaptively generate a set of kernels for him/her, which we call the ordinary kernel; it is further analytically decomposed into two orthogonal components, i.e., the commonality component and the specialty component. The former characterizes the commonality among subjects, which is optimized on a reference set. The latter is the residual part obtained by filtering the commonality component out of the ordinary kernel, so as to capture those special characteristics; we call it the personalized kernel. CNNs with personalized kernels for convolution can highlight a person's special distinguishing characteristics while suppressing his/her commonality with others, leading to better discrimination between different faces. Additionally, as a by-product, the reference set also facilitates the adaptation of our method to different scenarios by simply selecting faces of a particular population. Extensive experiments on the challenging LFW, IJB-A and IJB-C datasets validate that our proposed personalized convolution achieves significant improvement over the conventional CNN, as well as over other existing face recognition methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
25. A Novel Forced Resonant Mechanical DC Circuit Breaker by Using Auxiliary Oscillation Switch for Zero-Crossing.
- Author
-
Qi, Lei, Chen, Xilin, Qu, Xinyuan, Zhan, Liangtao, Zhang, Xiangyu, and Cui, Xiang
- Subjects
CURRENT fluctuations, OSCILLATIONS, POINT processes, FAULT currents, EPISTOLARY fiction
- Abstract
A low-cost mechanical circuit breaker (MCB) is an important solution in the field of dc breaking. However, the zero-crossing points in the breaking process of existing MCBs are either insufficient in number or slow to generate, which limits the breaking performance of MCBs. This letter proposes a novel forced resonant MCB (FR-MCB). A large number of zero-crossing points can be generated in a short time by utilizing a resonant circuit and a power electronic-based auxiliary oscillation switch. Thus, the FR-MCB can exhibit more reliable and faster breaking performance while maintaining a low cost. After a detailed analysis of the current oscillation and breaking process in the FR-MCB, a prototype experiment of up to 10 kV/8 kA is conducted to verify its feasibility. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
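The letter's key quantity is how many current zero-crossing points appear in a short time, since a mechanical breaker can only interrupt at a current zero. A small sketch of counting zero crossings in a sampled waveform; the decaying sine below is our stand-in for a resonant current, not the letter's actual circuit model:

```python
import math

def zero_crossings(samples):
    # Count sign changes between consecutive samples of a current
    # waveform; each sign change marks one zero-crossing point.
    return sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)

# An illustrative decaying resonant current: it oscillates through
# zero many times in a short window, the effect the FR-MCB forces
# via its resonant circuit and auxiliary oscillation switch.
current = [math.exp(-t / 50) * math.sin(t / 3.0) for t in range(100)]
```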
26. OCNet: Object Context for Semantic Segmentation.
- Author
-
Yuan, Yuhui, Huang, Lang, Guo, Jianyuan, Zhang, Chao, Chen, Xilin, and Wang, Jingdong
- Subjects
PIXELS, SPARSE matrices, PYRAMIDS
- Abstract
In this paper, we address the semantic segmentation task with a new context aggregation scheme named object context, which focuses on enhancing the role of object information. Motivated by the fact that the category of each pixel is inherited from the object it belongs to, we define the object context for each pixel as the set of pixels that belong to the same category as the given pixel in the image. We use a binary relation matrix to represent the relationship between all pixels, where the value one indicates that the two selected pixels belong to the same category and zero otherwise. We propose to use a dense relation matrix to serve as a surrogate for the binary relation matrix. The dense relation matrix is capable of emphasizing the contribution of object information, as the relation scores tend to be larger on object pixels than on other pixels. Considering that estimating the dense relation matrix requires quadratic computation overhead and memory consumption w.r.t. the input size, we propose an efficient interlaced sparse self-attention scheme to model the dense relations between any two pixels via the combination of two sparse relation matrices. To capture richer context information, we further combine our interlaced sparse self-attention scheme with conventional multi-scale context schemes including pyramid pooling (Zhao et al. 2017) and atrous spatial pyramid pooling (Chen et al. 2018). We empirically show the advantages of our approach with competitive performance on five challenging benchmarks: Cityscapes, ADE20K, LIP, PASCAL-Context and COCO-Stuff. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
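The binary relation matrix defined in the abstract above (one where two pixels share a category, zero otherwise) can be written down directly from a list of per-pixel category labels:

```python
def binary_relation_matrix(labels):
    # relation[i][j] = 1 when pixels i and j carry the same category
    # label, 0 otherwise. This is the ideal object-context matrix for
    # which OCNet's learned dense relation matrix acts as a surrogate.
    n = len(labels)
    return [[1 if labels[i] == labels[j] else 0 for j in range(n)]
            for i in range(n)]
```

For a real image with N pixels this matrix is N x N, which is exactly the quadratic cost the interlaced sparse self-attention scheme is designed to avoid.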
27. IAUnet: Global Context-Aware Feature Learning for Person Reidentification.
- Author
-
Hou, Ruibing, Ma, Bingpeng, Chang, Hong, Gu, Xinqian, Shan, Shiguang, and Chen, Xilin
- Subjects
CONVOLUTIONAL neural networks, SOURCE code, CONTEXTUAL learning, ARTIFICIAL neural networks
- Abstract
Person reidentification (reID) by convolutional neural network (CNN)-based networks has achieved favorable performance in recent years. However, most existing CNN-based methods do not take full advantage of spatial-temporal context modeling. In fact, the global spatial-temporal context can greatly clarify local distractions to enhance the target feature representation. To comprehensively leverage the spatial-temporal context information, in this work, we present a novel block, interaction-aggregation-update (IAU), for high-performance person reID. First, the spatial-temporal IAU (STIAU) module is introduced. STIAU jointly incorporates two types of contextual interactions into a CNN framework for target feature learning. Here, the spatial interactions learn to compute the contextual dependencies between different body parts of a single frame, while the temporal interactions are used to capture the contextual dependencies between the same body parts across all frames. Furthermore, a channel IAU (CIAU) module is designed to model the semantic contextual interactions between channel features to enhance the feature representation, especially for small-scale visual cues and body parts. Therefore, the IAU block enables the feature to incorporate global spatial, temporal, and channel context. It is lightweight, end-to-end trainable, and can be easily plugged into existing CNNs to form IAUnet. The experiments show that IAUnet performs favorably against the state of the art on both image and video reID tasks and achieves compelling results on a general object categorization task. The source code is available at https://github.com/blue-blue272/ImgReID-IAnet. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
28. Learning efficient text-to-image synthesis via interstage cross-sample similarity distillation.
- Author
-
Mao, Fengling, Ma, Bingpeng, Chang, Hong, Shan, Shiguang, and Chen, Xilin
- Abstract
For a given text, previous text-to-image synthesis methods commonly utilize a multistage generation model to produce images with high resolution in a coarse-to-fine manner. However, these methods ignore the interaction among stages, and they do not constrain the consistent cross-sample relations of images generated in different stages. These deficiencies result in inefficient generation and discrimination. In this study, we propose an interstage cross-sample similarity distillation model based on a generative adversarial network (GAN) for learning efficient text-to-image synthesis. To strengthen the interaction among stages, we achieve interstage knowledge distillation from the refined stage to the coarse stages with novel interstage cross-sample similarity distillation blocks. To enhance the constraint on the cross-sample relations of the images generated at different stages, we conduct cross-sample similarity distillation among the stages. Extensive experiments on the Oxford-102 and Caltech-UCSD Birds-200–2011 (CUB) datasets show that our model generates visually pleasing images and achieves quantitatively comparable performance with state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
29. Unsupervised Adversarial Domain Adaptation for Cross-Domain Face Presentation Attack Detection.
- Author
-
Wang, Guoqing, Han, Hu, Shan, Shiguang, and Chen, Xilin
- Abstract
Face presentation attack detection (PAD) is essential for securing the widely used face recognition systems. Most of the existing PAD methods do not generalize well to unseen scenarios because labeled training data of the new domain is usually not available. In light of this, we propose an unsupervised domain adaptation with disentangled representation (DR-UDA) approach to improve the generalization capability of PAD into new scenarios. DR-UDA consists of three modules, i.e., ML-Net, UDA-Net and DR-Net. ML-Net aims to learn a discriminative feature representation using the labeled source domain face images via metric learning. UDA-Net performs unsupervised adversarial domain adaptation in order to optimize the source domain and target domain encoders jointly, and obtain a common feature space shared by both domains. As a result, the source domain PAD model can be effectively transferred to the unlabeled target domain for PAD. DR-Net further disentangles the features irrelevant to specific domains by reconstructing the source and target domain face images from the common feature space. Therefore, DR-UDA can learn a disentangled representation space which is generative for face images in both domains and discriminative for live vs. spoof classification. The proposed approach shows promising generalization capability in several public-domain face PAD databases. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
30. Learning Multifunctional Binary Codes for Personalized Image Retrieval.
- Author
-
Liu, Haomiao, Wang, Ruiping, Shan, Shiguang, and Chen, Xilin
- Subjects
BINARY codes ,IMAGE retrieval ,CONTENT-based image retrieval - Abstract
Due to the highly complex semantic information of images, even with the same query image, the expected content-based image retrieval results could be very different and personalized in different scenarios. However, most existing hashing methods only preserve one single type of semantic similarity, making them incapable of addressing such realistic retrieval tasks. To deal with this problem, we propose a unified hashing framework to encode multiple types of information into the binary codes by exploiting convolutional networks (CNNs). Specifically, we assume that typical retrieval tasks are generally defined in two aspects, i.e. high-level semantics (e.g. object categories) and visual attributes (e.g. object shape and color). To this end, our Dual Purpose Hashing model is trained to jointly preserve two kinds of similarities characterizing the two aspects respectively. Moreover, since images with both category and attribute labels are scarce, our model is carefully designed to leverage the abundant partially labelled data as training inputs to alleviate the risk of overfitting. With such a framework, the binary codes of newly arriving images can be readily obtained by quantizing the outputs of a specific CNN layer, and different retrieval tasks can be achieved by using the binary codes in different ways. Experiments on two large-scale datasets show that our method achieves comparable or even better performance than the state-of-the-art methods specifically designed for each individual retrieval task, while being more compact than the compared methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
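The abstract above notes that binary codes are obtained by quantizing the outputs of a CNN layer, after which retrieval reduces to Hamming-distance ranking. A minimal sketch of that quantize-and-rank step (the features below are random stand-ins for a network layer's outputs, not the authors' model):

```python
import numpy as np

def quantize(features):
    """Binarize real-valued network outputs by sign: >= 0 -> 1, < 0 -> 0."""
    return (features >= 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Return gallery indices sorted by ascending Hamming distance."""
    dists = (query_code ^ db_codes).sum(axis=1)
    return np.argsort(dists, kind="stable"), dists

rng = np.random.default_rng(0)
db_feats = rng.normal(size=(5, 16))   # stand-in for a CNN layer's outputs
db_codes = quantize(db_feats)
query = quantize(db_feats[2])         # query built from item 2's own features

order, dists = hamming_rank(query, db_codes)
```

Because the codes are short bit vectors, the XOR-and-count distance scales to very large galleries, which is the point of hashing-based retrieval.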
31. RhythmNet: End-to-End Heart Rate Estimation From Face via Spatial-Temporal Representation.
- Author
-
Niu, Xuesong, Shan, Shiguang, Han, Hu, and Chen, Xilin
- Subjects
HEART beat ,IMAGE color analysis ,VISIBLE spectra ,DATABASES - Abstract
Heart rate (HR) is an important physiological signal that reflects the physical and emotional status of a person. Traditional HR measurements usually rely on contact monitors, which may cause inconvenience and discomfort. Recently, some methods have been proposed for remote HR estimation from face videos; however, most of them focus on well-controlled scenarios, and their generalization ability to less-constrained scenarios (e.g., with head movement and bad illumination) is not known. At the same time, the lack of large-scale HR databases has limited the use of deep models for remote HR estimation. In this paper, we propose an end-to-end RhythmNet for remote HR estimation from the face. In RhythmNet, we use a spatial-temporal representation encoding the HR signals from multiple ROI volumes as its input. Then the spatial-temporal representations are fed into a convolutional network for HR estimation. We also take into account the relationship of adjacent HR measurements from a video sequence via a Gated Recurrent Unit (GRU) and achieve efficient HR measurement. In addition, we build a large-scale multi-modal HR database (named VIPL-HR), which contains 2,378 visible light (VIS) videos and 752 near-infrared (NIR) videos of 107 subjects. Our VIPL-HR database contains various variations such as head movements, illumination variations, and acquisition device changes, replicating a less-constrained scenario for HR estimation. The proposed approach outperforms the state-of-the-art methods on both the public-domain and our VIPL-HR databases. VIPL-HR is available at: http://vipl.ict.ac.cn/view_database.php?id=15 [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
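The spatial-temporal representation described above (per-ROI signals stacked over time) can be illustrated with a toy "video"; the grid layout, single channel, and FFT-based HR readout below are simplifying assumptions for illustration, not RhythmNet's actual pipeline:

```python
import numpy as np

def spatial_temporal_map(frames, grid=(5, 5)):
    """Average each frame over a grid of ROIs, giving a T x (rows*cols) map
    whose columns are per-ROI temporal signals (one channel for simplicity)."""
    t, h, w = frames.shape
    rows, cols = grid
    rh, rw = h // rows, w // cols
    out = np.empty((t, rows * cols))
    for r in range(rows):
        for c in range(cols):
            roi = frames[:, r * rh:(r + 1) * rh, c * rw:(c + 1) * rw]
            out[:, r * cols + c] = roi.mean(axis=(1, 2))
    return out

# toy "video": a 1 Hz brightness oscillation (60 bpm) shared by all pixels
fps, n = 30, 150
t = np.arange(n) / fps
frames = 0.5 + 0.05 * np.sin(2 * np.pi * 1.0 * t)[:, None, None] * np.ones((1, 50, 50))

stmap = spatial_temporal_map(frames)

# naive HR readout: dominant frequency of one ROI's temporal signal
sig = stmap[:, 0] - stmap[:, 0].mean()
freqs = np.fft.rfftfreq(n, d=1.0 / fps)
hr_bpm = 60.0 * freqs[np.argmax(np.abs(np.fft.rfft(sig)))]
```

In the real method this T x ROI map is fed to a CNN rather than read out with an FFT, but the sketch shows why the representation carries the pulse signal.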
32. Deep Heterogeneous Hashing for Face Video Retrieval.
- Author
-
Qiao, Shishi, Wang, Ruiping, Shan, Shiguang, and Chen, Xilin
- Subjects
COVARIANCE matrices ,BINARY codes ,RIEMANNIAN manifolds ,VECTOR spaces ,HUMAN facial recognition software ,TASK performance ,PATTERN matching - Abstract
Retrieving videos of a particular person with a face image as query via hashing techniques has many important applications. While face images are typically represented as vectors in Euclidean space, characterizing face videos with some robust set modeling techniques (e.g. covariance matrices as exploited in this study, which reside on a Riemannian manifold) has recently shown appealing advantages. This hence results in a thorny heterogeneous-space matching problem. Moreover, hashing with handcrafted features, as done in many existing works, is clearly inadequate to achieve desirable performance for this task. To address such problems, we present an end-to-end Deep Heterogeneous Hashing (DHH) method that integrates three stages, including image feature learning, video modeling, and heterogeneous hashing, in a single framework to learn unified binary codes for both face images and videos. To tackle the key challenge of hashing on a manifold, a well-studied Riemannian kernel mapping is employed to project data (i.e. covariance matrices) into Euclidean space, thus making it possible to embed the two heterogeneous representations into a common Hamming space, where both intra-space discriminability and inter-space compatibility are considered. To perform network optimization, the gradient of the kernel mapping is innovatively derived via structured matrix backpropagation in a theoretically principled way. Experiments on three challenging datasets show that our method achieves quite competitive performance compared with existing hashing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
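The "Riemannian kernel mapping" mentioned above projects covariance matrices into a flat space. One standard concrete choice, shown purely as an illustration (the paper's exact kernel may differ), is the log-Euclidean map, i.e. the matrix logarithm of an SPD matrix:

```python
import numpy as np

def spd_log(cov, eps=1e-6):
    """Matrix logarithm of a symmetric positive-definite matrix via
    eigendecomposition; maps the SPD manifold into a flat Euclidean space.
    A small ridge (eps) guards against near-singular covariances."""
    d = cov.shape[0]
    w, v = np.linalg.eigh(cov + eps * np.eye(d))
    return (v * np.log(w)) @ v.T

def log_euclidean_dist(c1, c2):
    """Frobenius distance between log-mapped covariance matrices."""
    return np.linalg.norm(spd_log(c1) - spd_log(c2))

rng = np.random.default_rng(1)
x = rng.normal(size=(100, 4))                              # toy "video" features
c1 = np.cov(x, rowvar=False)
c2 = np.cov(x + 0.1 * rng.normal(size=(100, 4)), rowvar=False)
d12 = log_euclidean_dist(c1, c2)
```

After such a mapping, ordinary Euclidean machinery (and, in the paper, a shared Hamming embedding) can be applied to set-based video descriptors.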
33. Serum soluble VSIG4 as a surrogate marker for the diagnosis of lymphoma‐associated hemophagocytic lymphohistiocytosis.
- Author
-
Yuan, Shunzong, Wang, Yanqing, Luo, Hui, Jiang, Zheng, Qiao, Bing, Jiang, Yan, Hu, Yaning, Cheng, Yang, Chen, Xilin, Gong, Weihua, Huang, Yong, Zhao, Weipeng, Luo, Deyan, Liu, Bing, Su, Hang, Zhou, Jianfeng, and Song, Shiping
- Subjects
BIOMARKERS ,MACROPHAGE activation syndrome ,CYTOTOXIC T cells ,RECEIVER operating characteristic curves ,ENZYME-linked immunosorbent assay ,BLOOD proteins - Abstract
Summary: Lymphoma‐associated haemophagocytic lymphohistiocytosis (L‐HLH) is characterized by excessively activated macrophages and cytotoxic T lymphocytes, but few reliable markers for activated macrophages are available clinically. This study, designed to discover novel biomarkers for the diagnosis of lymphoma patients with L‐HLH, was initiated between 2016 and 2018. Fifty‐seven adult lymphoma patients were enrolled: 39 without HLH and 18 with HLH. The differential serum protein expression profile was first screened between lymphoma patients with and without L‐HLH by a quantitative mass spectrometric approach. Soluble V‐set and immunoglobulin domain‐containing 4 (sVSIG4), specifically expressed by macrophages, was significantly upregulated in the L‐HLH group. Subsequently, the sVSIG4 concentration was confirmed by enzyme‐linked immunosorbent assay to be significantly increased in lymphoma patients with L‐HLH. When it was exploited for the diagnosis of lymphoma patients with L‐HLH, the area under the receiver operating characteristic curve was 0.98 with an optimal cut‐off point of 2195 pg/ml, and the corresponding sensitivity and specificity were 94.44% and 94.87%, respectively. In addition, the one‐year overall survival was significantly worse in patients with a sVSIG4 concentration above 2195 pg/ml compared with those below 2195 pg/ml (5.3% vs. 72.2%, P < 0.0001). sVSIG4 may be a surrogate marker of activated macrophages for the diagnosis of lymphoma patients with L‐HLH. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
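Diagnosing by a single cutoff, as with the 2195 pg/ml threshold above, amounts to thresholding a continuous marker and counting the resulting true and false positives. A generic sketch with made-up toy values (not the study's data):

```python
def sensitivity_specificity(values, labels, cutoff):
    """Classify value > cutoff as positive; labels are 1 = disease, 0 = healthy.
    Returns (sensitivity, specificity)."""
    tp = sum(1 for v, y in zip(values, labels) if v > cutoff and y == 1)
    fn = sum(1 for v, y in zip(values, labels) if v <= cutoff and y == 1)
    tn = sum(1 for v, y in zip(values, labels) if v <= cutoff and y == 0)
    fp = sum(1 for v, y in zip(values, labels) if v > cutoff and y == 0)
    return tp / (tp + fn), tn / (tn + fp)

# hypothetical marker concentrations (pg/ml) and diagnoses, for illustration only
vals = [500, 900, 1500, 2400, 3100, 4000]
labs = [0,   0,   0,    1,    1,    0]
sens, spec = sensitivity_specificity(vals, labs, cutoff=2195)
```

Sweeping the cutoff and plotting sensitivity against (1 - specificity) is exactly how the ROC curve and its optimal operating point in the study are obtained.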
34. Learning to Recognize Visual Concepts for Visual Question Answering With Structural Label Space.
- Author
-
Gao, Difei, Wang, Ruiping, Shan, Shiguang, and Chen, Xilin
- Abstract
Solving the visual question answering (VQA) task requires recognizing many diverse visual concepts as the answer. These visual concepts contain rich structural semantic meanings, e.g., some concepts in VQA are highly related (e.g., red & blue), while others are less relevant (e.g., red & standing). It is very natural for humans to efficiently learn concepts by utilizing their semantic meanings to concentrate on distinguishing relevant concepts and eliminating the disturbance of irrelevant concepts. However, previous works usually use a simple MLP to output the visual concept as the answer in a flat label space that treats all labels equally, causing limitations in representing and using the semantic meanings of labels. To address this issue, we propose a novel visual recognition module named Dynamic Concept Recognizer (DCR), which is easy to plug into an attention-based VQA model, to utilize the semantics of the labels in answer prediction. Concretely, we introduce two key features in DCR: 1) a novel structural label space to depict the difference of semantics between concepts, where the labels in the new label space are assigned to different groups according to their meanings. This type of semantic information helps decompose the visual recognizer in VQA into multiple specialized sub-recognizers to improve the capacity and efficiency of the recognizer. 2) A feature attention mechanism to capture the similarity between relevant groups of concepts, e.g., the human-related group "chef, waiter" is more related to "swimming, running, etc." than the scene-related group "sunny, rainy, etc.". This type of semantic information helps sub-recognizers for relevant groups to adaptively share part of their modules, and the knowledge shared between relevant sub-recognizers facilitates the learning procedure. Extensive experiments on several datasets have shown that the proposed structural label space and DCR module can efficiently learn visual concept recognition and benefit the performance of the VQA model. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
35. Design of a calibration system for radioactive aerosol monitor.
- Author
-
Qin, Guoxiu, Xu, Youning, Chen, Xilin, Chen, Yongyong, Li, Fan, and Li, Weizhe
- Subjects
RADIOACTIVE aerosols ,MONODISPERSE colloids ,AUTOMATIC control systems ,TEST systems - Abstract
A calibration system was designed to meet the calibration requirements of radioactive aerosol monitors. The system consists of a radioactive aerosol generation unit, a dilution and mixing unit, a sampling and measuring unit, a purification unit and an automatic control unit. The commonly used α and β radioactive solutions Am(NO₃)₃ and CsCl were selected to prepare different concentrations of monodisperse radioactive aerosol. After testing the calibration system, it was determined that the measurement uncertainties of the α and β efficiency calibration factors obtained by the system for the radioactive aerosol monitor were 2.8% and 2.6%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
36. AttGAN: Facial Attribute Editing by Only Changing What You Want.
- Author
-
He, Zhenliang, Zuo, Wangmeng, Kan, Meina, Shan, Shiguang, and Chen, Xilin
- Subjects
FACE ,GALLIUM nitride ,HUMAN facial recognition software - Abstract
Facial attribute editing aims to manipulate single or multiple attributes on a given face image, i.e., to generate a new face image with desired attributes while preserving other details. Recently, the generative adversarial net (GAN) and encoder–decoder architecture are usually incorporated to handle this task with promising results. Based on the encoder–decoder architecture, facial attribute editing is achieved by decoding the latent representation of a given face conditioned on the desired attributes. Some existing methods attempt to establish an attribute-independent latent representation for further attribute editing. However, such an attribute-independent constraint on the latent representation is excessive because it restricts the capacity of the latent representation and may result in information loss, leading to over-smooth or distorted generation. Instead of imposing constraints on the latent representation, in this work, we propose to apply an attribute classification constraint to the generated image to just guarantee the correct change of desired attributes, i.e., to change what you want. Meanwhile, reconstruction learning is introduced to preserve attribute-excluding details, in other words, to only change what you want. Besides, adversarial learning is employed for visually realistic editing. These three components cooperate with each other, forming an effective framework for high-quality facial attribute editing, referred to as AttGAN. Furthermore, the proposed method is extended to attribute style manipulation in an unsupervised manner. Experiments on two wild datasets, CelebA and LFW, show that the proposed method outperforms the state of the art in realistic attribute editing with other facial details well preserved. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
37. A Novel Sign Language Recognition Framework Using Hierarchical Grassmann Covariance Matrix.
- Author
-
Wang, Hanjie, Chai, Xiujuan, and Chen, Xilin
- Abstract
Visual sign language recognition is an interesting and challenging problem. To create a discriminative representation, a hierarchical Grassmann covariance matrix (HGCM) model is proposed for sign description. Furthermore, a multi-temporal belief propagation (MTBP) based segmentation approach is presented for continuous sequence spotting. Concretely speaking, a sign is represented by multiple covariance matrices, followed by evaluating and selecting their most significant singular vectors. These covariance matrices are transformed into a more compact and discriminative HGCM, which is formulated on the Grassmann manifold. Continuous sign sequences can be recognized frame by frame using the HGCM model and then optimized by MTBP, a carefully designed graphical model. The proposed method is thoroughly evaluated on isolated, synthetic, and real continuous sign datasets, as well as on HDM05. Extensive experimental results convincingly show the effectiveness of our proposed framework. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
38. Tattoo Image Search at Scale: Joint Detection and Compact Representation Learning.
- Author
-
Han, Hu, Li, Jie, Jain, Anil K., Shan, Shiguang, and Chen, Xilin
- Subjects
TATTOOING ,DIGITAL video ,VIDEO surveillance ,DIGITAL images ,ARTIFICIAL neural networks ,OBJECT recognition (Computer vision) ,IMAGE fusion - Abstract
The explosive growth of digital images in video surveillance and social media has led to a significant need for efficient search of persons of interest in law enforcement and forensic applications. Despite tremendous progress in person identification based on primary biometric traits (e.g., face and fingerprint), a single biometric trait alone cannot meet the desired recognition accuracy in forensic scenarios. Tattoos, as one of the important soft biometric traits, have been found to be valuable for assisting in person identification. However, tattoo search in a large collection of unconstrained images remains a difficult problem, and existing tattoo search methods mainly focus on matching cropped tattoos, which is different from real application scenarios. To close the gap, we propose an efficient tattoo search approach that is able to learn tattoo detection and compact representation jointly in a single convolutional neural network (CNN) via multi-task learning. While the features in the backbone network are shared by both tattoo detection and compact representation learning, individual latent layers of each sub-network optimize the shared features toward the detection and feature learning tasks, respectively. We resolve the small batch size issue inside the joint tattoo detection and compact representation learning network via random image stitch and preceding feature buffering. We evaluate the proposed tattoo search system using multiple public-domain tattoo benchmarks, and a gallery set with about 300K distracter tattoo images compiled from these datasets and images from the Internet. In addition, we also introduce a tattoo sketch dataset containing 300 tattoos for sketch-based tattoo search. Experimental results show that the proposed approach has superior performance in tattoo detection and tattoo search at scale compared to several state-of-the-art tattoo retrieval algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
39. Adaptive Metric Learning For Zero-Shot Recognition.
- Author
-
Jiang, Huajie, Wang, Ruiping, Shan, Shiguang, and Chen, Xilin
- Subjects
KNOWLEDGE gap theory ,SEMANTICS ,IMAGE recognition (Computer vision) ,SIMILARITY (Geometry) ,TASK analysis - Abstract
Zero-shot learning (ZSL) has enjoyed great popularity in recent years due to its ability to recognize novel objects, where semantic information is exploited to build up relations among different categories. Traditional ZSL approaches usually focus on learning more robust visual-semantic embeddings among seen classes and directly apply them to the unseen classes without considering whether they are suitable. It is well known that a domain gap exists between seen and unseen classes. In order to tackle this problem, we propose a novel adaptive metric learning approach to measure the compatibility between visual samples and class semantics, where class similarities are utilized to adapt the visual-semantic embedding to the unseen classes. Extensive experiments on four benchmark ZSL datasets show the effectiveness of the proposed approach. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
40. Deep Supervised Hashing for Fast Image Retrieval.
- Author
-
Liu, Haomiao, Wang, Ruiping, Shan, Shiguang, and Chen, Xilin
- Subjects
IMAGE retrieval ,IMAGE representation ,BINARY codes ,HASHING ,BODY image ,COST functions - Abstract
In this paper, we present a new hashing method to learn compact binary codes for highly efficient image retrieval on large-scale datasets. While complex image appearance variations still pose a great challenge to reliable retrieval, in light of the recent progress of Convolutional Neural Networks (CNNs) in learning robust image representations on various vision tasks, this paper proposes a novel Deep Supervised Hashing method to learn compact similarity-preserving binary codes for the huge body of image data. Specifically, we devise a CNN architecture that takes pairs/triplets of images as training inputs and encourages the output of each image to approximate discrete values (e.g., +1/−1). To this end, the loss functions are elaborately designed to maximize the discriminability of the output space by encoding the supervised information from the input image pairs/triplets, while simultaneously imposing regularization on the real-valued outputs to approximate the desired discrete values. For image retrieval, new query images can be easily encoded by forward propagation through the network and then quantizing the network outputs to a binary code representation. Extensive experiments on three large-scale datasets, CIFAR-10, NUS-WIDE, and SVHN, show the promising performance of our method compared with the state of the art. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
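The loss sketched in the abstract (pairwise supervision plus a regularizer pulling the real-valued outputs toward discrete ±1) can be written down in a few lines; the margin form and constants below are illustrative conventions, not necessarily the paper's exact formulation:

```python
import numpy as np

def dsh_style_loss(b1, b2, similar, margin=4.0, alpha=0.01):
    """Pairwise hashing loss of the flavor described above: pull similar pairs
    together, push dissimilar pairs beyond a margin on squared distance, and
    regularize real-valued outputs toward +/-1. Conventions are illustrative."""
    d2 = np.sum((b1 - b2) ** 2)
    if similar:
        pair = 0.5 * d2
    else:
        pair = 0.5 * max(margin - d2, 0.0)
    # quantization regularizer: penalize |output| deviating from 1
    reg = alpha * (np.abs(np.abs(b1) - 1).sum() + np.abs(np.abs(b2) - 1).sum())
    return pair + reg

b = np.array([1.0, -1.0, 1.0])
loss_same = dsh_style_loss(b, b, similar=True)    # identical, already binary
loss_diff = dsh_style_loss(b, -b, similar=False)  # well-separated dissimilar pair
```

Both cases above incur zero loss: the similar pair coincides and the dissimilar pair already exceeds the margin, while the regularizer vanishes because the outputs sit exactly at ±1.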
41. Locality-constrained framework for face alignment.
- Author
-
Zhang, Jie, Zhao, Xiaowei, Kan, Meina, Shan, Shiguang, Chai, Xiujuan, and Chen, Xilin
- Abstract
Although the conventional active appearance model (AAM) has achieved some success in face alignment, it still suffers from a generalization problem when applied to unseen subjects and images. To deal with the generalization problem of AAM, we first reformulate the original AAM as a sparsity-regularized AAM, which can achieve more compact/better shape and appearance priors by selecting nearest neighbors as the bases of the shape and appearance model. To speed up the fitting procedure, the sparsity in the sparsity-regularized AAM is approximated by using locality (i.e., K-nearest neighbors), thus inducing the locality-constrained active appearance model (LC-AAM). The LC-AAM solves a constrained AAM-like fitting problem with the K-nearest neighbors as the bases of the shape and appearance model. To alleviate the adverse influence of inaccurate K-nearest neighbor results, the locality constraint is further embedded in the discriminative fitting method, denoted LC-DFM, which can find better K-nearest neighbor results by employing shape-indexed features, and can also tolerate some inaccurate neighbors, benefiting from the regression model rather than the generative model in AAM. Extensive experiments on several datasets demonstrate that our methods outperform the state of the art in both detection accuracy and generalization ability. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
42. Unifying Visual Attribute Learning with Object Recognition in a Multiplicative Framework.
- Author
-
Liang, Kongming, Chang, Hong, Ma, Bingpeng, Shan, Shiguang, and Chen, Xilin
- Subjects
VISUAL learning ,OBJECT recognition (Computer vision) ,IMAGE representation ,COMPUTER vision ,DEEP learning ,LEARNING problems - Abstract
Attributes are mid-level semantic properties of objects. Recent research has shown that visual attributes can benefit many typical learning problems in computer vision community. However, attribute learning is still a challenging problem as the attributes may not always be predictable directly from input images and the variation of visual attributes is sometimes large across categories. In this paper, we propose a unified multiplicative framework for attribute learning, which tackles the key problems. Specifically, images and category information are jointly projected into a shared feature space, where the latent factors are disentangled and multiplied to fulfil attribute prediction. The resulting attribute classifier is category-specific instead of being shared by all categories. Moreover, our model can leverage auxiliary data to enhance the predictive ability of attribute classifiers, which can reduce the effort of instance-level attribute annotation to some extent. By integrated into an existing deep learning framework, our model can both accurately predict attributes and learn efficient image representations. Experimental results show that our method achieves superior performance on both instance-level and category-level attribute prediction. For zero-shot learning based on visual attributes and human-object interaction recognition, our method can improve the state-of-the-art performance on several widely used datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
43. Hierarchical Attention for Part-Aware Face Detection.
- Author
-
Wu, Shuzhe, Kan, Meina, Shan, Shiguang, and Chen, Xilin
- Subjects
HUMAN facial recognition software ,IMAGE analysis ,ACCURACY ,GAUSSIAN processes ,HIERARCHICAL clustering (Cluster analysis) - Abstract
Expressive representations for characterizing face appearances are essential for accurate face detection. Due to different poses, scales, illumination, occlusion, etc., face appearances generally exhibit substantial variations, and the contents of each local region (facial part) vary from one face to another. Current detectors, however, particularly those based on convolutional neural networks, apply identical operations (e.g. convolution or pooling) to all local regions on each face for feature aggregation (in a generic sliding-window configuration), and take all local features as equally effective for the detection task. In such methods, not only is each local feature suboptimal due to ignoring region-wise distinctions, but also the overall face representations are semantically inconsistent. To address the issue, we design a hierarchical attention mechanism to allow adaptive exploration of local features. Given a face proposal, part-specific attention modeled as learnable Gaussian kernels is proposed to search for proper positions and scales of local regions to extract consistent and informative features of facial parts. Then face-specific attention predicted with an LSTM is introduced to model relations between the local parts and adjust their contributions to the detection task. Such hierarchical attention leads to a part-aware face detector, which forms more expressive and semantically consistent face representations. Extensive experiments are performed on three challenging face detection datasets to demonstrate the effectiveness of our hierarchical attention and make comparisons with state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
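The part-specific attention above models each facial part as a learnable Gaussian kernel over the feature map; a toy version with a fixed (rather than learned) center and scale looks like this:

```python
import numpy as np

def gaussian_attention(h, w, mu, sigma):
    """2-D Gaussian attention mask over an h x w feature map, peaked at mu
    (row, col) with scale sigma; weights are normalized to sum to 1."""
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((ys - mu[0]) ** 2 + (xs - mu[1]) ** 2) / (2 * sigma ** 2))
    return g / g.sum()

# attend to a hypothetical "part" centered at row 2, column 4 of a 7x7 map
mask = gaussian_attention(7, 7, mu=(2.0, 4.0), sigma=1.0)
feat = np.ones((7, 7))                 # stand-in feature channel
pooled = (mask * feat).sum()           # attention-weighted pooling
```

In the detector the center and scale would be predicted per part and trained end to end; here they are fixed constants chosen for the illustration.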
44. Identifying patients with time-sensitive injuries: Association of mortality with increasing prehospital time.
- Author
-
Chen, Xilin, Guyette, Francis X., Peitzman, Andrew B., Billiar, Timothy R., Sperry, Jason L., and Brown, Joshua B.
- Published
- 2019
- Full Text
- View/download PDF
45. Hyperspectral Light Field Stereo Matching.
- Author
-
Zhu, Kang, Xue, Yujia, Fu, Qiang, Kang, Sing Bing, Chen, Xilin, and Yu, Jingyi
- Subjects
IMAGE processing ,MARKOV random fields ,DIGITAL image processing ,COMPUTER vision ,REMOTE sensing - Abstract
In this paper, we describe how scene depth can be extracted using a hyperspectral light field capture (H-LF) system. Our H-LF system consists of a 5 × 6 array of cameras, with each camera sampling a different narrow band in the visible spectrum. There are two parts to extracting scene depth. The first part is our novel cross-spectral pairwise matching technique, which involves a new spectral-invariant feature descriptor and its companion matching metric we call bidirectional weighted normalized cross correlation (BWNCC). The second part, namely H-LF stereo matching, uses a combination of spectral-dependent correspondence and defocus cues. These two new cost terms are integrated into a Markov Random Field (MRF) for disparity estimation. Experiments on synthetic and real H-LF data show that our approach can produce high-quality disparity maps. We also show that these results can be used to produce the complete plenoptic cube in addition to synthesizing all-focus and defocused color images under different sensor spectral responses. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
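BWNCC above builds on normalized cross correlation, which scores two patches as identical up to affine intensity changes; the bidirectional weighting is the paper's addition and is omitted in this plain-NCC sketch:

```python
import numpy as np

def ncc(p, q, eps=1e-12):
    """Normalized cross correlation between two equally sized patches:
    1 for identical patterns up to an affine intensity change, -1 for an
    inverted pattern, near 0 for unrelated patches."""
    p = p - p.mean()
    q = q - q.mean()
    return float((p * q).sum() / (np.linalg.norm(p) * np.linalg.norm(q) + eps))

patch = np.arange(25.0).reshape(5, 5)   # toy gradient patch
score_affine = ncc(patch, 2 * patch + 3)  # brightness/contrast change
score_invert = ncc(patch, -patch)         # inverted pattern
```

Invariance to affine intensity change is what makes an NCC-style metric attractive for matching across spectral bands, where the same surface appears at different brightness levels.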
46. Cross Euclidean-to-Riemannian Metric Learning with Application to Face Recognition from Video.
- Author
-
Huang, Zhiwu, Wang, Ruiping, Shan, Shiguang, Van Gool, Luc, and Chen, Xilin
- Subjects
RIEMANNIAN manifolds ,EUCLIDEAN metric ,HUMAN facial recognition software ,VIDEOS ,MATRICES (Mathematics) - Abstract
Riemannian manifolds have been widely employed for video representations in visual classification tasks including video-based face recognition. The success mainly derives from learning a discriminant Riemannian metric which encodes the non-linear geometry of the underlying Riemannian manifolds. In this paper, we propose a novel metric learning framework to learn a distance metric across a Euclidean space and a Riemannian manifold to fuse average appearance and pattern variation of faces within one video. The proposed metric learning framework can handle three typical tasks of video-based face recognition: Video-to-Still, Still-to-Video and Video-to-Video settings. To accomplish this new framework, by exploiting typical Riemannian geometries for kernel embedding, we map the source Euclidean space and Riemannian manifold into a common Euclidean subspace, each through a corresponding high-dimensional Reproducing Kernel Hilbert Space (RKHS). With this mapping, the problem of learning a cross-view metric between the two source heterogeneous spaces can be converted to learning a single-view Euclidean distance metric in the target common Euclidean space. By learning information on heterogeneous data with the shared label, the discriminant metric in the common space improves face recognition from videos. Extensive experiments on four challenging video face databases demonstrate that the proposed framework has a clear advantage over the state-of-the-art methods in the three classical video-based face recognition scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
47. Fusing magnitude and phase features with multiple face models for robust face recognition.
- Author
-
Li, Yan, Shan, Shiguang, Wang, Ruiping, Cui, Zhen, and Chen, Xilin
- Abstract
High accuracy face recognition is of great importance for a wide variety of real-world applications. Although significant progress has been made in the last decades, fully automatic face recognition systems have not yet approached the goal of surpassing the human vision system, even in controlled conditions. In this paper, we propose an approach for robust face recognition by fusing two complementary features: one is Gabor magnitude of multiple scales and orientations and the other is Fourier phase encoded by spatial pyramid based local phase quantization (SPLPQ). To reduce the high dimensionality of both features, block-wise fisher discriminant analysis (BFDA) is applied and further combined by score-level fusion. Moreover, inspired by the biological cognitive mechanism, multiple face models are exploited to further boost the robustness of the proposed approach. We evaluate the proposed approach on three challenging databases, i.e., FRGC ver2.0, LFW, and CFW-p, that address two face classification scenarios, i.e., verification and identification. Experimental results consistently exhibit the complementarity of the two features and the performance boost gained by the multiple face models. The proposed approach achieved approximately 96% verification rate when FAR was 0.1% on FRGC ver2.0 Exp.4, impressively surpassing all the best known results. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
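The score-level fusion step described in the abstract above can be sketched in a few lines: each channel's similarity scores against the gallery are z-score normalized so that matchers with different score ranges contribute comparably, then combined by a weighted sum. The normalization and the equal weighting are illustrative assumptions, not the paper's exact fusion rule, and the Gabor-magnitude and phase matchers themselves are left abstract.

```python
import numpy as np

def zscore_normalize(scores):
    """Normalize one matcher's scores before fusion so that channels
    with different ranges contribute comparably."""
    s = np.asarray(scores, dtype=float)
    return (s - s.mean()) / (s.std() + 1e-12)

def score_level_fusion(magnitude_scores, phase_scores, w=0.5):
    """Fuse gallery similarity scores from the two complementary
    channels (e.g. a Gabor-magnitude matcher and a phase matcher)
    by a weighted sum of z-normalized scores."""
    return (w * zscore_normalize(magnitude_scores)
            + (1.0 - w) * zscore_normalize(phase_scores))
```

In an identification setting, the identity reported for a probe is then the `argmax` of the fused score vector over the gallery.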
48. Heterogeneous Face Attribute Estimation: A Deep Multi-Task Learning Approach.
- Author
-
Han, Hu, Jain, Anil K., Wang, Fang, Shan, Shiguang, and Chen, Xilin
- Subjects
COMPUTER multitasking ,HUMAN facial recognition software ,VIDEO surveillance ,ELECTRONIC surveillance ,ARTIFICIAL neural networks ,NEURAL computers - Abstract
Face attribute estimation has many potential applications in video surveillance, face retrieval, and social media. While a number of methods have been proposed for face attribute estimation, most of them do not explicitly consider attribute correlation and heterogeneity (e.g., ordinal versus nominal, and holistic versus local) during feature representation learning. In this paper, we present a Deep Multi-Task Learning (DMTL) approach to jointly estimate multiple heterogeneous attributes from a single face image. In DMTL, we tackle attribute correlation and heterogeneity with convolutional neural networks (CNNs) consisting of shared feature learning for all the attributes and category-specific feature learning for heterogeneous attributes. We also introduce an unconstrained face database (LFW+), an extension of the public-domain LFW, with heterogeneous demographic attributes (age, gender, and race) obtained via crowdsourcing. Experimental results on benchmarks with multiple face attributes (MORPH II, LFW+, CelebA, LFWA, and FotW) show that the proposed approach has superior performance compared to the state of the art. Finally, evaluations on a public-domain face database (LAP) with a single attribute show that the proposed approach has excellent generalization ability. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
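The heterogeneous multi-task objective sketched in the abstract above amounts to a shared feature feeding attribute-specific heads, where nominal attributes (e.g. gender, race) take a cross-entropy loss and ordinal attributes (e.g. age) are treated with a regression loss. Below is a toy numpy sketch of that combined loss; the head shapes, loss weights, and linear heads are hypothetical simplifications of the paper's CNN, which learns the shared features end to end.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, label):
    """Loss for a nominal attribute (e.g. gender, race)."""
    p = softmax(logits)
    return -np.log(p[label] + 1e-12)

def squared_error(pred, target):
    """Loss for an ordinal attribute treated as regression (e.g. age)."""
    return 0.5 * (pred - target) ** 2

def multitask_loss(shared_feat, heads, labels, weights):
    """Combine heterogeneous per-attribute losses on top of one shared
    feature vector; each head is a (W, b, kind) triple with kind in
    {'nominal', 'ordinal'}."""
    total = 0.0
    for name, (W, b, kind) in heads.items():
        out = shared_feat @ W + b
        if kind == "nominal":
            total += weights[name] * cross_entropy(out, labels[name])
        else:  # ordinal / regression
            total += weights[name] * squared_error(out.item(), labels[name])
    return total
```

The per-attribute weights play the role of balancing heterogeneous tasks during joint training.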
49. Geometry-Aware Similarity Learning on SPD Manifolds for Visual Recognition.
- Author
-
Huang, Zhiwu, Wang, Ruiping, Li, Xianqiu, Liu, Wenxian, Shan, Shiguang, Van Gool, Luc, and Chen, Xilin
- Subjects
SIMILARITY transformations ,SYMMETRIC matrices ,RIEMANNIAN geometry ,MANIFOLDS (Mathematics) ,VISUAL texture recognition - Abstract
Symmetric positive definite (SPD) matrices have been employed for data representation in many visual recognition tasks. The success is mainly attributed to learning discriminative SPD matrices encoding the Riemannian geometry of the underlying SPD manifolds. In this paper, we propose a geometry-aware SPD similarity learning (SPDSL) framework to learn discriminative SPD features by directly pursuing a manifold-manifold transformation matrix of full column rank. Specifically, by exploiting the Riemannian geometry of the manifolds of fixed-rank positive semidefinite (PSD) matrices, we present a new solution to reduce optimization over the space of full-column-rank transformation matrices to optimization on the PSD manifold, which has a well-established Riemannian structure. Under this solution, we exploit a new supervised SPDSL technique to learn the manifold-manifold transformation by regressing the similarities of selected SPD data pairs to their ground-truth similarities on the target SPD manifold. To optimize the proposed objective function, we further derive an optimization algorithm on the PSD manifold. Evaluations on three visual classification tasks show the advantages of the proposed approach over the existing SPD-based discriminant learning methods. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
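The core objects in the abstract above can be illustrated concretely: a full-column-rank matrix W maps an SPD matrix X to the lower-dimensional SPD matrix W^T X W, and a similarity of transformed pairs is regressed toward ground-truth targets. In this sketch the log-Euclidean inner product stands in for the target-manifold similarity and no optimizer is shown; the names and choices are illustrative assumptions, not the authors' exact formulation or their PSD-manifold algorithm.

```python
import numpy as np

def spd_transform(X, W):
    """Map an SPD matrix X to the lower-dimensional matrix W^T X W;
    W must have full column rank for the result to remain SPD."""
    return W.T @ X @ W

def log_euclidean_similarity(A, B):
    """Similarity of two SPD matrices via the log-Euclidean inner
    product <log A, log B>."""
    def spd_log(M):
        w, V = np.linalg.eigh(M)
        return (V * np.log(w)) @ V.T
    return np.trace(spd_log(A) @ spd_log(B))

def pairwise_regression_loss(spd_pairs, targets, W):
    """Objective in the spirit of the abstract: regress similarities of
    transformed SPD pairs toward their ground-truth similarities."""
    loss = 0.0
    for (X1, X2), t in zip(spd_pairs, targets):
        s = log_euclidean_similarity(spd_transform(X1, W),
                                     spd_transform(X2, W))
        loss += (s - t) ** 2
    return loss / len(spd_pairs)
```

Minimizing this loss over W is what the paper recasts as optimization on the fixed-rank PSD manifold.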
50. Logistics of air medical transport: When and where does helicopter transport reduce prehospital time for trauma?
- Author
-
Chen, Xilin, Gestring, Mark L., Rosengart, Matthew R., Peitzman, Andrew B., Billiar, Timothy R., Sperry, Jason L., and Brown, Joshua B.
- Published
- 2018
- Full Text
- View/download PDF