Author: "Wenhan Yang" / Topic: business - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Wenhan Yang"' showing total 117 results

1. Towards Analysis-Friendly Face Representation With Scalable Feature and Texture Compression

Author: Siwei Ma, Shiqi Wang, Xinfeng Zhang, Wenhan Yang, Shanshe Wang, Wen Gao, and Shurun Wang
Subjects: FOS: Computer and information sciences, Texture compression, Artificial neural network, business.industry, Computer science, Computer Vision and Pattern Recognition (cs.CV), Deep learning, Image and Video Processing (eess.IV), Feature extraction, Computer Science - Computer Vision and Pattern Recognition, Multi-task learning, Pattern recognition, Electrical Engineering and Systems Science - Image and Video Processing, Computer Science Applications, Feature (computer vision), Signal Processing, FOS: Electrical engineering, electronic engineering, information engineering, Media Technology, Artificial intelligence, Electrical and Electronic Engineering, business, Transform coding, Image compression
Abstract: Compactly representing visual information plays a fundamental role in optimizing the ultimate utility of myriad visual data-centered applications. Numerous approaches have been proposed to efficiently compress the texture and visual features for human visual perception and machine intelligence, respectively; however, much less work has been dedicated to studying the interactions between them. Here, we investigate the integration of feature and texture compression and show that a universal and collaborative visual information representation can be achieved in a hierarchical way. In particular, we study feature and texture compression in a scalable coding framework, where the base layer serves as the deep learning feature and the enhancement layer targets to perfectly reconstruct the texture. Based on the strong generative capability of deep neural networks, the gap between the base feature layer and enhancement layer is further filled with feature-level texture reconstruction, with the goal of further constructing texture representations from features. As such, the residuals between the original and reconstructed texture could be further conveyed in the enhancement layer. To improve the efficiency of the proposed framework, the base layer neural network is trained in a multitask manner such that the learned features enjoy both high-quality reconstruction and high-accuracy analysis. The framework and optimization strategies are further applied in face image compression, and promising coding performance has been achieved in terms of both rate-fidelity and rate-accuracy evaluations.
Published: 2022

2. Single Image Deraining: From Model-Based to Data-Driven and Beyond

Author: Jiaying Liu, Yuming Fang, Wenhan Yang, Shiqi Wang, and Robby T. Tan
Subjects: FOS: Computer and information sciences, Computer science, Computer Vision and Pattern Recognition (cs.CV), media_common.quotation_subject, Computer Science - Computer Vision and Pattern Recognition, Context (language use), 02 engineering and technology, Machine learning, computer.software_genre, Convolutional neural network, Data-driven, Artificial Intelligence, FOS: Electrical engineering, electronic engineering, information engineering, 0202 electrical engineering, electronic engineering, information engineering, Selection (linguistics), Function (engineering), media_common, business.industry, Applied Mathematics, Image and Video Processing (eess.IV), Electrical Engineering and Systems Science - Image and Video Processing, Visualization, Recurrent neural network, Computational Theory and Mathematics, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, business, computer, Software
Abstract: The goal of single-image deraining is to restore the rain-free background scenes of an image degraded by rain streaks and rain accumulation. The early single-image deraining methods employ a cost function, where various priors are developed to represent the properties of rain and background layers. Since 2017, single-image deraining methods step into a deep-learning era, and exploit various types of networks, i.e. convolutional neural networks, recurrent neural networks, generative adversarial networks, etc., demonstrating impressive performance. Given the current rapid development, in this paper, we provide a comprehensive survey of deraining methods over the last decade. We summarize the rain appearance models, and discuss two categories of deraining approaches: model-based and data-driven approaches. For the former, we organize the literature based on their basic models and priors. For the latter, we discuss developed ideas related to architectures, constraints, loss functions, and training datasets. We present milestones of single-image deraining methods, review a broad selection of previous works in different categories, and provide insights on the historical development route from the model-based to data-driven methods. We also summarize performance comparisons quantitatively and qualitatively. Beyond discussing the technicality of deraining methods, we also discuss the future directions. Comment: https://flyywh.github.io/Single_rain_removal_survey/
Published: 2021

3. Impulsivity in heroin‐dependent individuals: structural and functional abnormalities within frontostriatal circuits

Author: Yan Xu, Wenhan Yang, Kai Yuan, Jun Li, Jun Liu, Shuang Liu, Longmao Chen, Min Zhang, Ziqiang Shao, and Shicong Wang
Subjects: Oncology, medicine.medical_specialty, Cognitive Neuroscience, media_common.quotation_subject, Nucleus accumbens, Impulsivity, 050105 experimental psychology, Nicotine, 03 medical and health sciences, Behavioral Neuroscience, Cellular and Molecular Neuroscience, 0302 clinical medicine, Neuroimaging, Internal medicine, medicine, 0501 psychology and cognitive sciences, Radiology, Nuclear Medicine and imaging, Risk factor, media_common, business.industry, Addiction, 05 social sciences, Neuropsychology, Psychiatry and Mental health, Neurology, Superior frontal gyrus, Neurology (clinical), medicine.symptom, business, 030217 neurology & neurosurgery, medicine.drug
Abstract: High levels of impulsivity are a risk factor for the initiation of heroin use and a core behavioral characteristic of heroin dependence. Impulsivity also contributes to the maintenance of drug use and hinders effective therapy. Here we sought to identify neuroimaging markers of impulsivity in heroin-dependent individuals (HDI), with a focus on the nucleus accumbens (NAc), a key region implicated in impulsivity and drug addiction generally. Volume and resting-state functional connectivity (RSFC) differences of the bilateral NAc were investigated between 21 HDI and 21 age-, gender-, nicotine-, and alcohol-matched healthy controls (HC). The neuroimaging results were then correlated with the Barratt Impulsivity Scales (BIS-11). Higher motor impulsivity (t = 2.347, p = 0.0253) and larger right NAc volume (F(1,38) = 4.719, p = 0.036) were observed in HDI. The right NAc volume was positively correlated with BIS total (r = 0.6196, p = 0.0239) and motor (r = 0.5921, p = 0.0330) scores in HC and with the BIS motor (r = 0.5145, p = 0.0170) score in HDI. A negative correlation was found between RSFC of the right NAc-bilateral superior frontal gyrus (SFG) and motor impulsivity in HDI (left: r = -0.6537, p = 0.0013; right: r = -0.6167, p = 0.0029) and HC (left: r = -0.6490, p = 0.0164; right: r = -0.6993, p = 0.0078). We aimed to reveal novel multimodality neuroimaging biomarkers of the higher impulsivity in HDI by focusing on the NAc and corresponding functional circuits. Higher motor impulsivity was observed in HDI. Furthermore, the volume of the right NAc and the RSFC strength of right NAc-SFG could be neuroimaging biomarkers for the severity of impulsivity in HDI. These potential biomarkers could be a target for novel treatments in HDI.
Published: 2021

4. Benchmarking Low-Light Image Enhancement and Beyond

Author: Dejia Xu, Wenhan Yang, Jiaying Liu, Haofeng Huang, and Minhao Fan
Subjects: Point (typography), business.industry, Computer science, Machine vision, Perspective (graphical), 02 engineering and technology, Benchmarking, Semantic similarity, Artificial Intelligence, Face (geometry), Pattern recognition (psychology), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer vision, Computer Vision and Pattern Recognition, Artificial intelligence, business, Face detection, Software
Abstract: In this paper, we present a systematic review and evaluation of existing single-image low-light enhancement algorithms. Besides the commonly used low-level vision oriented evaluations, we additionally consider measuring machine vision performance in the low-light condition via face detection task to explore the potential of joint optimization of high-level and low-level vision enhancement. To this end, we first propose a large-scale low-light image dataset serving both low/high-level vision with diversified scenes and contents as well as complex degradation in real scenarios, called Vision Enhancement in the LOw-Light condition (VE-LOL). Beyond paired low/normal-light images without annotations, we additionally include the analysis resource related to human, i.e. face images in the low-light condition with annotated face bounding boxes. Then, efforts are made on benchmarking from the perspective of both human and machine visions. A rich variety of criteria is used for the low-level vision evaluation, including full-reference, no-reference, and semantic similarity metrics. We also measure the effects of the low-light enhancement on face detection in the low-light condition. State-of-the-art face detection methods are used in the evaluation. Furthermore, with the rich material of VE-LOL, we explore the novel problem of joint low-light enhancement and face detection. We develop an enhanced face detector to apply low-light enhancement and face detection jointly. The features extracted by the enhancement module are fed to the successive layer with the same resolution of the detection module. Thus, these features are intertwined together to unitedly learn useful information across two phases, i.e. enhancement and detection. Experiments on VE-LOL provide a comparison of state-of-the-art low-light enhancement algorithms, point out their limitations, and suggest promising future directions. 
Our dataset has supported the Track “Face Detection in Low Light Conditions” of CVPR UG2+ Challenge (2019–2020) ( http://cvpr2020.ug2challenge.org/ ).
Published: 2021

5. Band Representation-Based Semi-Supervised Low-Light Image Enhancement: Bridging the Gap Between Signal Fidelity and Perceptual Quality

Author: Yuming Fang, Yue Wang, Wenhan Yang, Shiqi Wang, and Jiaying Liu
Subjects: Bridging (networking), Artificial neural network, Computer science, business.industry, Image quality, Visibility (geometry), Pattern recognition, 02 engineering and technology, Computer Graphics and Computer-Aided Design, Visualization, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, Noise (video), Representation (mathematics), business, Software
Abstract: It has been widely acknowledged that under-exposure causes a variety of visual quality degradation because of intensive noise, decreased visibility, biased color, etc. To alleviate these issues, a novel semi-supervised learning approach is proposed in this paper for low-light image enhancement. More specifically, we propose a deep recursive band network (DRBN) to recover a linear band representation of an enhanced normal-light image based on the guidance of the paired low/normal-light images. Such design philosophy enables the principled network to generate a quality improved one by reconstructing the given bands based upon another learnable linear transformation which is perceptually driven by an image quality assessment neural network. On one hand, the proposed network is delicately developed to obtain a variety of coarse-to-fine band representations, of which the estimations benefit each other in a recursive process mutually. On the other hand, the extracted band representation of the enhanced image in the recursive band learning stage of DRBN is capable of bridging the gap between the restoration knowledge of paired data and the perceptual quality preference to high-quality images. Subsequently, the band recomposition learns to recompose the band representation towards fitting perceptual regularization of high-quality images with the perceptual guidance. The proposed architecture can be flexibly trained with both paired and unpaired data. Extensive experiments demonstrate that our method produces better enhanced results with visually pleasing contrast and color distributions, as well as well-restored structural details.
Published: 2021

6. Sparse Gradient Regularized Deep Retinex Network for Robust Low-Light Image Enhancement

Author: Wenjing Wang, Wenhan Yang, Jiaying Liu, Haofeng Huang, and Shiqi Wang
Subjects: Compression artifact, Color constancy, Computer science, Noise (signal processing), business.industry, Noise reduction, Pattern recognition, 02 engineering and technology, Real image, Computer Graphics and Computer-Aided Design, Signal, Regularization (mathematics), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, business, Software, Image restoration
Abstract: Due to the absence of a desirable objective for low-light image enhancement, previous data-driven methods may provide undesirable enhanced results including amplified noise, degraded contrast and biased colors. In this work, inspired by Retinex theory, we design an end-to-end signal prior-guided layer separation and data-driven mapping network with layer-specified constraints for single-image low-light enhancement. A Sparse Gradient Minimization sub-Network (SGM-Net) is constructed to remove the low-amplitude structures and preserve major edge information, which facilitates extracting paired illumination maps of low/normal-light images. After the learned decomposition, two sub-networks (Enhance-Net and Restore-Net) are utilized to predict the enhanced illumination and reflectance maps, respectively, which helps stretch the contrast of the illumination map and remove intensive noise in the reflectance map. The effects of all these configured constraints, including the signal structure regularization and losses, combine together reciprocally, which leads to good reconstruction results in overall visual quality. The evaluation on both synthetic and real images, particularly on those containing intensive noise, compression artifacts and their interleaved artifacts, shows the effectiveness of our novel models, which significantly outperforms the state-of-the-art methods.
Published: 2021

7. Towards Coding for Human and Machine Vision: Scalable Face Image Coding

Author: Ling-Yu Duan, Yueyu Hu, Shuai Yang, Jiaying Liu, and Wenhan Yang
Subjects: Pixel, Machine vision, business.industry, Computer science, Feature extraction, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, Pipeline (software), Computer Science Applications, Visualization, Generative model, Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, Media Technology, 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, Electrical and Electronic Engineering, business, Decoding methods, Coding (social sciences)
Abstract: The past decades have witnessed the rapid development of image and video coding techniques in the era of big data. However, the signal fidelity-driven coding pipeline design limits the capability of the existing image/video coding frameworks to fulfill the needs of both machine and human vision. In this paper, we come up with a novel face image coding framework by leveraging both the compressive and the generative models, to support machine vision and human perception tasks jointly. Given an input image, the feature analysis is first applied, and then the generative model is employed to reconstruct image with compact structure and color features, where sparse edges are extracted to connect both kinds of vision and a key reference pixel selection method is proposed to determine the priorities of the reference color pixels for scalable coding. The compact edge map serves as the basic layer for machine vision tasks, and the reference pixels act as an enhanced layer to guarantee signal fidelity for human vision. By introducing advanced generative models, we train a decoding network to reconstruct images from compact structure and color representations, which is flexible to accept inputs in a scalable way and to control the imagery effect of the outputs between signal fidelity and visual realism. Experimental results and comprehensive performance analysis over the face image dataset demonstrate the superiority of our framework in both human vision tasks and machine vision tasks, which provide useful evidence on the emerging standardization efforts on MPEG VCM (Video Coding for Machine).
Published: 2021

8. Generalized Face Antispoofing by Learning to Fuse Features From High- and Low-Frequency Domains

Author: Wenhan Yang, Shiqi Wang, and Baoliang Chen
Subjects: Spoofing attack, business.industry, Generalization, Computer science, Feature extraction, 020207 software engineering, Pattern recognition, 02 engineering and technology, Facial recognition system, Computer Science Applications, Hardware and Architecture, Feature (computer vision), Face (geometry), Frequency domain, Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, Media Technology, Fuse (electrical), Artificial intelligence, business, Software
Abstract: In this article, we propose a face spoofing detection method by learning to fuse high-frequency (HF) and low-frequency (LF) features, in an effort to improve the generalization capability and fill up the domain gap between training and testing when the antispoofing is practically conducted in unseen scenarios. In particular, the proposed face antispoofing model consists of two streams that extract HF and LF components of a facial image with three high-pass and three low-pass filters. Moreover, considering the fact that spoofing features exist in different feature levels, we train our network with a novel multiscale triplet loss. The cross-frequency spatial attention module further enables the two streams to communicate and exchange information with each other. Finally, the outputs of the two streams are fused with a weighting strategy for final classification. Extensive experiments conducted on intra- and cross-database settings show the superiority of the proposed scheme.
Published: 2021

9. Association between maternal prepregnancy body mass index and risk of preterm birth in more than 1 million Asian American mothers

Author: Buyun Liu, Yuxiao Wu, Linda Snetselaar, Mark K. Santillan, Wei Bao, Rui Gao, and Wenhan Yang
Subjects: Class III obesity, business.industry, Endocrinology, Diabetes and Metabolism, nutritional and metabolic diseases, 030209 endocrinology & metabolism, 030204 cardiovascular system & hematology, Overweight, Birth certificate, Article, 03 medical and health sciences, 0302 clinical medicine, Class II obesity, Class I obesity, medicine, medicine.symptom, Underweight, business, Live birth, Body mass index, Demography
Abstract: BACKGROUND: Asian Americans are among the fastest growing subpopulations in the United States. However, evidence about maternal prepregnancy body mass index (BMI) and preterm birth among Asian Americans is lacking. METHODS: This population-based study used nationwide birth certificate data from the US National Vital Statistics System 2014 to 2018. All Asian American mothers who had a singleton live birth were included. According to Asian-specific cutoffs, maternal prepregnancy BMI was classified into underweight (BMI < 18.5 kg/m(2)), normal weight (BMI 18.5–22.9 kg/m(2)), overweight (BMI 23.0–27.4 kg/m(2)), class I obesity (BMI 27.5–32.4 kg/m(2)), class II obesity (BMI 32.5–37.4 kg/m(2)), and class III obesity (BMI ≥37.5 kg/m(2)). Preterm birth was defined as gestational age less than 37 weeks. Multivariable logistic regression models were used to estimate the odds ratio (OR) of preterm birth. RESULTS: We included 1 081 341 Asian American mother-infant pairs. The rate of preterm birth was 6.51% (n = 70 434). The rate of maternal prepregnancy overweight and obesity was 46.80% (n = 506 042). Compared with mothers with normal weight, the adjusted OR of preterm delivery was 1.04 (95% CI, 1.01–1.07) for underweight mothers, 1.18 (95% CI, 1.16–1.20) for overweight mothers, 1.41 (95% CI, 1.37–1.44) for mothers with class I obesity, 1.69 (95% CI, 1.63–1.76) for mothers with class II obesity, and 1.78 (95% CI, 1.66–1.90) for mothers with class III obesity. Similar patterns of associations were observed in Asian American mothers across different country origins. CONCLUSIONS: Among Asian American mothers, maternal prepregnancy overweight or obesity, defined by Asian-specific, lower BMI cutoffs, was significantly associated with an increased risk of preterm birth. The risk of preterm birth increased with increasing obesity severity. 
These findings highlight the importance of using Asian-specific BMI cutoffs in assessing risk of preterm birth among Asian American mothers.
Published: 2020
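The Asian-specific BMI categories listed in the abstract above can be written out as a small lookup. This is a minimal sketch, not the authors' analysis code: the function name `classify_bmi_asian` is hypothetical, and treating the printed ranges (for example 18.5–22.9 and 23.0–27.4) as half-open intervals, so that a value such as 22.95 counts as normal weight, is an assumption made here.

```python
def classify_bmi_asian(bmi: float) -> str:
    """Map a prepregnancy BMI (kg/m^2) to the categories quoted in the abstract.

    Assumption: each printed range is read as a half-open interval,
    e.g. normal weight is 18.5 <= BMI < 23.0.
    """
    if bmi <= 0:
        raise ValueError("BMI must be positive")
    if bmi < 18.5:
        return "underweight"
    if bmi < 23.0:
        return "normal weight"
    if bmi < 27.5:
        return "overweight"
    if bmi < 32.5:
        return "class I obesity"
    if bmi < 37.5:
        return "class II obesity"
    return "class III obesity"


if __name__ == "__main__":
    for value in (17.0, 21.0, 25.0, 30.0, 35.0, 40.0):
        print(value, classify_bmi_asian(value))
```

Note that the overweight threshold here is 23.0 rather than the 25.0 used in the general WHO classification, which is the point the abstract's conclusion makes about Asian-specific cutoffs.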

10. Association of attention‐deficit/hyperactivity disorder with diabetes mellitus in US adults

Author: Jin Jing, Linda Snetselaar, Guifeng Xu, Wenhan Yang, and Buyun Liu
Subjects: Adult, Male, Time Factors, Endocrinology, Diabetes and Metabolism, 030209 endocrinology & metabolism, 030204 cardiovascular system & hematology, Young Adult, 03 medical and health sciences, 0302 clinical medicine, Neurodevelopmental disorder, Risk Factors, Diabetes mellitus, mental disorders, Diabetes Mellitus, Prevalence, medicine, Humans, National Health Interview Survey, Attention deficit hyperactivity disorder, Aged, business.industry, Odds ratio, Middle Aged, medicine.disease, Health Surveys, Obesity, United States, Confidence interval, Attention Deficit Disorder with Hyperactivity, Female, business, Body mass index, Demography
Abstract: Attention-deficit/hyperactivity disorder (ADHD) is a childhood-onset neurodevelopmental disorder that usually persists into adulthood. However, limited evidence is available regarding its influence on adult health outcomes beyond neuropsychiatric comorbidities. This study aimed to examine the association of ADHD with diabetes in US adults. We analyzed data from the National Health Interview Survey (NHIS), a leading health survey of a nationally representative sample in the United States. We included adults aged 20-79 years who participated in the NHIS 2007 and 2012. Physician-diagnosed ADHD and diabetes were reported during an in-person household interview. Logistic regression with survey sampling weights was used to estimate the odds ratio (OR) and 95% confidence interval (CI) of diabetes. This analysis included 52 821 adults (weighted mean age 45.5 years; 48.6% males). Among them, 1642 participants reported a diagnosis of ADHD and 4631 reported a diagnosis of diabetes. In the multivariable analysis adjusting for age, sex, race/ethnicity, education level, family income level, smoking, alcohol drinking, physical activity, and body mass index, the OR of diabetes among adults with ADHD vs those without ADHD was 1.54 (95% CI, 1.16-2.04).
In the stratified analyses, the significant association of ADHD with diabetes remained in most strata, and the associations were not significantly modified by age, sex, race/ethnicity, or obesity status. In a nationally representative sample of US adults, we found a significant association between a history of ADHD diagnosis and diabetes.
Published: 2020

11. Coarse-to-Fine Hyper-Prior Modeling for Learned Image Compression

Author: Yueyu Hu, Wenhan Yang, and Jiaying Liu
Subjects: Computer science, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020206 networking & telecommunications, Data compression ratio, Pattern recognition, 02 engineering and technology, General Medicine, Coarse to fine, Redundancy (information theory), 0202 electrical engineering, electronic engineering, information engineering, Redundancy (engineering), Codec, 020201 artificial intelligence & image processing, Artificial intelligence, business, Image compression
Abstract: Approaches to image compression with machine learning now achieve superior performance on the compression rate compared to existing hybrid codecs. The conventional learning-based methods for image compression exploit hyper-prior and spatial context models to facilitate probability estimations. Such models have limitations in modeling long-term dependency and do not fully squeeze out the spatial redundancy in images. In this paper, we propose a coarse-to-fine framework with hierarchical layers of hyper-priors to conduct comprehensive analysis of the image and more effectively reduce spatial redundancy, which improves the rate-distortion performance of image compression significantly. Signal Preserving Hyper Transforms are designed to achieve an in-depth analysis of the latent representation and the Information Aggregation Reconstruction sub-network is proposed to maximally utilize side-information for reconstruction. Experimental results show the effectiveness of the proposed network to efficiently reduce the redundancies in images and improve the rate-distortion performance, especially for high-resolution images. Our project is publicly available at https://huzi96.github.io/coarse-to-fine-compression.html.
Published: 2020

12. Towards Scale-Free Rain Streak Removal via Self-Supervised Fractal Band Learning

Author: Jiaying Liu, Shiqi Wang, Wenhan Yang, Dejia Xu, and Xiaodong Wang
Subjects: Generalization, business.industry, Computer science, Streak, Training (meteorology), Pattern recognition, 02 engineering and technology, General Medicine, Fractal, Feature (computer vision), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, Scale (map), business
Abstract: Data-driven rain streak removal methods, most of which rely on synthesized paired data, usually come across the generalization problem when being applied in real cases. In this paper, we propose a novel deep-learning based rain streak removal method injected with self-supervision to improve the ability to remove rain streaks in various scales. To realize this goal, we made efforts in two aspects. First, considering that rain streak removal is highly correlated with texture characteristics, we create a fractal band learning (FBL) network based on frequency band recovery. It integrates commonly seen band feature operations with neural modules and effectively improves the capacity to capture discriminative features for deraining. Second, to further improve the generalization ability of FBL for rain streaks in various scales, we add cross-scale self-supervision to regularize the network training. The constraint forces the extracted features of inputs in different scales to be equivalent after rescaling. Therefore, FBL can offer similar responses based solely on image content without the interference of scale and is capable of removing rain streaks in various scales. Extensive experiments in quantitative and qualitative evaluations demonstrate the superiority of our FBL for rain streak removal, especially for the real cases where very large rain streaks exist, and prove the effectiveness of each of its components. Our code will be publicly available at: https://github.com/flyywh/AAAI-2020-FBL-SS.
Published: 2020
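The cross-scale constraint described in the abstract above (features of an input at different scales should be equivalent after rescaling) can be illustrated with a toy consistency penalty. This is only a sketch under stated assumptions, not the FBL network or its training loss: the 1-D signals, the pair-averaging `downscale2`, the mean squared error, and the two toy feature functions are all hypothetical stand-ins chosen to keep the example dependency-free.

```python
from typing import Callable, List

Signal = List[float]


def downscale2(x: Signal) -> Signal:
    """Halve the resolution by averaging non-overlapping pairs (assumed rescaling)."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]


def cross_scale_loss(features: Callable[[Signal], Signal], x: Signal) -> float:
    """Mean squared gap between rescale(features(x)) and features(rescale(x))."""
    a = downscale2(features(x))
    b = features(downscale2(x))
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("signal too short to downscale")
    return sum((a[i] - b[i]) ** 2 for i in range(n)) / n


def doubled(x: Signal) -> Signal:
    """Toy linear feature: commutes with averaging, so the penalty is zero."""
    return [2.0 * v for v in x]


def squared(x: Signal) -> Signal:
    """Toy nonlinear feature: does not commute with averaging."""
    return [v * v for v in x]


if __name__ == "__main__":
    signal = [0.0, 2.0, 4.0, 6.0]
    print(cross_scale_loss(doubled, signal))
    print(cross_scale_loss(squared, signal))
```

A feature extractor whose output changes under rescaling receives a positive penalty, which is the behaviour a regularizer of this kind would push against during training.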

13. LR3M: Robust Low-Light Enhancement via Low-Rank Regularized Retinex Model

Author: Xutong Ren, Wen-Huang Cheng, Wenhan Yang, and Jiaying Liu
Subjects: Color constancy, Computer science, business.industry, Noise reduction, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, Computer Graphics and Computer-Aided Design, Visualization, Noise, Robustness (computer science), Histogram, 0202 electrical engineering, electronic engineering, information engineering, Coherence (signal processing), 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, business, Software
Abstract: Noise causes unpleasant visual effects in low-light image/video enhancement. In this paper, we aim to make the enhancement model and method aware of noise in the whole process. To deal with heavy noise which is not handled in previous methods, we introduce a robust low-light enhancement approach, aiming at well enhancing low-light images/videos and suppressing intensive noise jointly. Our method is based on the proposed Low-Rank Regularized Retinex Model (LR3M), which is the first to inject low-rank prior into a Retinex decomposition process to suppress noise in the reflectance map. Our method estimates a piece-wise smoothed illumination and a noise-suppressed reflectance sequentially, avoiding remaining noise in the illumination and reflectance maps which are usually presented in alternative decomposition methods. After getting the estimated illumination and reflectance, we adjust the illumination layer and generate our enhancement result. Furthermore, we apply our LR3M to video low-light enhancement. We consider inter-frame coherence of illumination maps and find similar patches through reflectance maps of successive frames to form the low-rank prior to make use of temporal correspondence. Our method performs well for a wide variety of images and videos, and achieves better quality both in enhancing and denoising, compared with the state-of-the-art methods.
Published: 2020
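The last step mentioned in the abstract above, adjusting the illumination layer and recombining it with the reflectance, can be sketched for a single pixel. This assumes the standard Retinex relation in which an observed intensity is the product of reflectance and illumination; the use of gamma correction for the adjustment, the default `gamma=2.2`, and the function name `enhance_pixel` are assumptions for illustration, since the abstract does not state how the illumination is adjusted. The low-rank regularized decomposition itself is not reproduced here.

```python
def enhance_pixel(observed: float, illumination: float, gamma: float = 2.2) -> float:
    """Brighten one pixel given an already-estimated illumination value.

    Assumes intensities in (0, 1] and the Retinex model
    observed = reflectance * illumination.
    """
    if not 0.0 < illumination <= 1.0:
        raise ValueError("illumination must lie in (0, 1]")
    reflectance = observed / illumination          # undo the illumination
    adjusted = illumination ** (1.0 / gamma)       # assumed gamma adjustment
    return reflectance * adjusted                  # recombine the two layers


if __name__ == "__main__":
    # A dark pixel: low observed intensity under weak illumination.
    print(enhance_pixel(0.04, 0.16))
```

Because the illumination lies in (0, 1], raising it to the power 1/gamma with gamma greater than 1 increases it, so the recombined pixel is brighter while the reflectance is left untouched.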

14. Screening and surveillance of multiple solid tumours using plasma placental-like chondroitin sulfate A (pl-CSA)

Author: Qian Youhui, Shiling Chen, Aiwen Le, Baozhen Zhang, Xiao Zhonglin, Zhang Juzuo, Kang Zhang, Guodong Wu, Beini Sun, Xiujun Fan, Zhai Rihong, Shaowu Ye, Wenhan Yang, Chen Zhilong, and Li Tian
Subjects: Male, Pathology, medicine.medical_specialty, Placenta, Enzyme-Linked Immunosorbent Assay, Disease, Mice, 03 medical and health sciences, 0302 clinical medicine, Pregnancy, Neoplasms, Cancer screening, medicine, Animals, Humans, Lung cancer, biology, business.industry, Chondroitin Sulfates, Cancer, circulating pl-CSA, General Medicine, Gold standard (test), Middle Aged, medicine.disease, Antibodies, Anti-Idiotypic, cancer screening, biology.protein, biomarker, Biomarker (medicine), Immunohistochemistry, Female, 030211 gastroenterology & hepatology, Antibody, business, Protein Binding, Research Paper
Abstract: Rationale: Placental-like chondroitin sulfate A (pl-CSA) is known to be exclusively synthesized in multiple cancer tissues and associated with disease severity. Here, we aimed to assess whether pl-CSA is released into bio-fluids and can serve as a cancer biomarker. Methods: A novel ELISA was developed to analyse pl-CSA content in bio-fluids using pl-CSA binding protein and an anti-pl-CSA antibody. Immunohistochemical staining of tissue chips was used as the gold standard control. Results: The developed ELISA method was specific and sensitive (1.22 μg/ml). The pl-CSA content was significantly higher in lysates and supernatants of cancer cell lines than in those of normal cell lines, in plasma from mouse cancer models than in that from control mice, and in plasma from patients with oesophageal, cervical, ovarian, or lung cancer than in that from healthy controls. Similar to the tissue chip analysis, which showed a significant difference in pl-CSA positivity between cancer tissues and normal adjacent tissues, the plasma pl-CSA analysis had 100% sensitivity and specificity for differentiating oesophageal and lung cancer patients from healthy controls. Importantly, in oesophageal and lung cancer patients, the pl-CSA content was significantly higher in late-stage disease than in early-stage disease, and it dramatically decreased after surgical resection of the tumour. Conclusion: These data indicate a direct link between plasma pl-CSA content and tumour presence, indicating that plasma pl-CSA may be a non-invasive biomarker with clinical applicability for the screening and surveillance of patients with multiple types of solid tumours.
Published: 2020

15. Removing Arbitrary-Scale Rain Streaks via Fractal Band Learning With Self-Supervision

Author: Jiaying Liu, Shiqi Wang, and Wenhan Yang
Subjects: business.industry, Computer science, Feature extraction, Training (meteorology), Streak, Pattern recognition, 02 engineering and technology, Computer Graphics and Computer-Aided Design, Fractal, Feature (computer vision), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, business, Scale (map), Software
Abstract: Data-driven rain streak removal methods, most of which rely on synthesized paired data, usually encounter generalization problems when applied in real scenarios. In this paper, we propose a novel deep-learning based rain streak removal method injected with self-supervision to obtain the capacity of removing more varied-scale rain streaks in practical applications. To this end, efforts are made from two perspectives. First, considering that rain streak removal is highly correlated with texture characteristics, we create a fractal band learning (FBL) network based on frequency band recovery. It integrates commonly seen band feature operations as neural forms and effectively improves the capacity to capture discriminative features for deraining. Second, to further improve the generalization ability of FBL to remove rain streaks of varied scales, we incorporate scale-robust self-supervision to regularize the network training. The constraint forces the extracted features of an input rain image at different scales to be equivalent after rescaling operations. Therefore, our method can offer similar responses based solely on image content without the interference of scale change and is capable of removing varied-scale rain streaks. Extensive experiments in quantitative and qualitative evaluations demonstrate the superiority of our method for rain streak removal, especially for the real cases where very large rain streaks exist, and prove the effectiveness of each component.
Published: 2020

16. Association of Maternal Sexually Transmitted Infections With Risk of Preterm Birth in the United States

Author: Bo Wang, Yuxiao Wu, Rui Gao, Mark K. Santillan, Kelli K. Ryckman, Wei Bao, Donna A. Santillan, Wenhan Yang, and Buyun Liu
Subjects: Adult, medicine.medical_specialty, Population, Gonorrhea, Sexually Transmitted Diseases, Birth certificate, urologic and male genital diseases, Cohort Studies, Pregnancy, Risk Factors, medicine, Humans, Syphilis, Pregnancy Complications, Infectious, education, Original Investigation, Retrospective Studies, education.field_of_study, Chlamydia, Obstetrics, business.industry, Research, Infant, Newborn, Obstetrics and Gynecology, General Medicine, Chlamydia Infections, medicine.disease, female genital diseases and pregnancy complications, United States, Causality, Online Only, Premature birth, Premature Birth, Female, Live birth, business
Abstract: Key Points. Question: Are maternal sexually transmitted infections (gonorrhea, syphilis, or chlamydia) associated with preterm birth? Findings: In this population-based cohort study using US nationwide birth certificate data and including more than 14 million mother-infant pairs with singleton live births, maternal sexually transmitted infections were associated with an increased risk of preterm birth, especially moderately and very preterm birth. Meaning: These results suggest that maternal sexually transmitted infections may potentially affect neonatal outcomes. This cohort study examines US nationwide birth certificate data for associations between 4 common sexually transmitted infections and preterm birth. Importance: Maternal infection has been implicated in the pathogenesis of preterm birth through intrauterine inflammatory response. Chlamydia, gonorrhea, and syphilis are among the most common sexually transmitted infections worldwide, but studies on their association with preterm birth are sparse. Objective: To examine the association between maternal chlamydia, gonorrhea, and syphilis infections in pregnancy and the risk of preterm birth in a large population-based study in the US. Design, Setting, and Participants: This population-based retrospective cohort study examined nationwide birth certificate data from the US National Vital Statistics System between 2016 and 2019. All mothers who had a singleton live birth and available data on chlamydia, gonorrhea, or syphilis infection before or during pregnancy and gestational age at birth were included in analysis. Exposures: Sexually transmitted infection (chlamydia, gonorrhea, or syphilis) occurring before or during pregnancy. Main Outcomes and Measures: Preterm birth, defined as gestational age less than 37 weeks.
Results: This study included 14 373 023 mothers (mean [SD] age 29 [5.8] years; Hispanic, 3 435 333 [23.9%]; non-Hispanic Asian, 912 425 [6.3%]; non-Hispanic Black, 2 058 006 [14.3%]; and non-Hispanic White, 7 386 568 [51.4%]). Among the mothers, 267 260 (1.9%) had chlamydia, 43 147 (0.3%) had gonorrhea, and 16 321 (0.1%) had syphilis. Among the newborns, 1 146 800 (8.0%) were preterm births. The rate of preterm birth was 9.9%, 12.2%, and 13.3% among women with chlamydia, gonorrhea, and syphilis infection, respectively. After adjustment for sociodemographic and medical and/or health factors, the adjusted odds ratio of preterm birth was 1.03 (95% CI, 1.02-1.04) for chlamydia, 1.11 (95% CI, 1.08-1.15) for gonorrhea, 1.17 (95% CI, 1.11-1.22) for syphilis, and 1.06 (95% CI, 1.05-1.07) for any of these sexually transmitted infections comparing mothers with these conditions and those without. Conclusions and Relevance: Maternal sexually transmitted infections (gonorrhea, syphilis, or chlamydia) were associated with an increased risk of preterm birth. Pregnant women with sexually transmitted infections before or during pregnancy might benefit from targeted prevention for preterm birth.
Published: 2021

17. Association of Iron-Deficiency Anemia and Non-Iron-Deficiency Anemia with Neurobehavioral Development in Children Aged 6–24 Months

Author: Wenhan Yang, Juan Zheng, and Jie Liu
Subjects: Male, Pediatrics, medicine.medical_specialty, Anemia, anemia, children, iron-deficiency anemia, neurobehavioral development, Neurogenesis, Gross motor skill, Child Behavior, Article, Child Development, Risk Factors, Prevalence, medicine, Humans, Public Health Surveillance, Development assessment, TX341-641, Adverse effect, Serum ferritin, Developmental quotient, Nutrition and Dietetics, Anemia, Iron-Deficiency, business.industry, Nutrition. Foods and food supply, Age Factors, Infant, medicine.disease, Iron-deficiency anemia, Child, Preschool, Female, Health Impact Assessment, Hemoglobin, business, Biomarkers, Food Science
Abstract: (1) Background: Anemia has comprehensive adverse effects on the growth and development of children. In this study, we analyzed the potential effects of different types of anemia on early-life neurobehavioral development. (2) Methods: A total of 2601 children aged 6–24 months, whose parents agreed to participate in this study, underwent routine blood tests and neurobehavioral development assessment. The children’s parents or other primary caregivers were interviewed with a face-to-face questionnaire at the time of enrollment in the study. Anemia was determined by hemoglobin < 110 g/L and classified into iron-deficiency and non-iron-deficiency anemia according to the levels of serum ferritin, C-reactive protein, and alpha-1-acid glycoprotein. Neurobehavioral development was assessed by the China Developmental Scale for Children and divided into five domains: gross motor, fine movement, adaptability, language, and social behavior. The development quotient (DQ) was used to measure the level of total neurobehavioral development and each domain of neurobehavioral development. (3) Results: The prevalence of anemia in children aged 6–24 months was 26.45%, of which iron-deficiency anemia only accounted for 27.33%. Compared with children without anemia, those with iron-deficiency anemia had a significantly lower developmental quotient (DQ) for total neurobehavioral development and gross motor and adaptability development. The partial regression coefficients were −1.33 (95% CI −2.36, −0.29; p = 0.012), −1.88 (95% CI −3.74, −0.03; p = 0.047), and −1.48 (95% CI −2.92, −0.05; p = 0.042), respectively. Children with non-iron-deficiency anemia had significantly lower DQ for total neurobehavioral development and gross motor and fine movement development than those without anemia. The partial regression coefficients were −0.94 (95% CI −1.64, −0.25; p = 0.008), −1.25 (95% CI −2.48, −0.03; p = 0.044), and −1.18 (95% CI −2.15, −0.21; p = 0.017), respectively. 
There were no statistically significant differences in total neurobehavioral development and the five domains of neurobehavioral development between children with non-iron-deficiency and iron-deficiency anemia. The partial β values were 0.40 (95% CI −1.53, 2.33; p = 0.684), 0.21 (95% CI −1.39, 1.81; p = 0.795), 0.63 (95% CI −1.03, 2.28; p = 0.457), 0.16 (95% CI −1.78, 2.10; p = 0.871), 0.35 (95% CI −1.32, 2.01; p = 0.684), and 0.34 (95% CI −0.77, 1.46; p = 0.545), respectively. (4) Conclusions: Both iron-deficiency anemia and non-iron-deficiency anemia were negatively correlated with the neurobehavioral development of children. Negative correlations were found between iron-deficiency anemia and gross motor and adaptability development and between non-iron-deficiency anemia and gross motor and fine movement development.
Published: 2021
Full Text: View/download PDF

18. Changes in ALFF and ReHo values in methamphetamine abstinent individuals based on the Harvard‐Oxford atlas: A longitudinal resting‐state fMRI study

Author: Yanyao Du, Wenhan Yang, Jun Liu, and Jun Zhang
Subjects: Adult, Male, Right frontal pole, medicine.medical_specialty, Adolescent, media_common.quotation_subject, Amphetamine-Related Disorders, Medicine (miscellaneous), Audiology, Impulsivity, Methamphetamine, Young Adult, medicine, Humans, Middle frontal gyrus, Longitudinal Studies, media_common, Pharmacology, Brain Mapping, Resting state fMRI, medicine.diagnostic_test, business.industry, Brain, Middle Aged, Abstinence, Magnetic Resonance Imaging, Frontal Lobe, Psychiatry and Mental health, Potential biomarkers, Female, medicine.symptom, business, Functional magnetic resonance imaging, Biomarkers, medicine.drug
Abstract: Methamphetamine (MA) abuse has become a global public health problem due to damage to various systems throughout the body, especially the central nervous system. However, the differences in resting-state brain function between short-term and long-term abstinence, the pros and cons of treatments, and the relationship between resting-state brain function and behavioral tests are unknown. Sixty-three MA abstinent individuals were followed up for nearly 1 year and treated with three different methods. The amplitude of low-frequency fluctuation (ALFF) and regional homogeneity (ReHo) based on the Harvard-Oxford atlas (HOA) were measured by resting-state functional magnetic resonance imaging (fMRI). Impulsivity was evaluated by the Barratt Impulsivity Scale-11 (BIS-11). Brain regions with significant increases in ALFF and ReHo values in the long-term abstinent group compared to the short-term abstinent group were around the right frontal pole (McKetin et al., 2012, https://doi.org/10.1111/j.1360-0443.2012.03933.x) and right middle frontal gyrus (Wang et al., 2015, https://doi.org/10.1371/journal.pone.0133431). There were no significant differences among the three groups that experienced long-term abstinence. The changes in ALFF and ReHo in the right middle frontal gyrus were significantly associated with BIS total scores, BIS attention scores, and BIS nonplanning scores. The right middle frontal gyrus is a critical region in MA long-term abstinent individuals exposed to therapeutic intervention, and this region may be useful, when combined with BIS-11, as a potential biomarker to identify the effect of abstinence with therapeutic intervention in MA individuals.
Published: 2021

19. Compressed Domain Deep Video Super-Resolution

Author: Shiqi Wang, Kangkang Hu, Peilin Chen, Long Sun, Meng Wang, and Wenhan Yang
Subjects: business.industry, Computer science, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Video processing, Computer Graphics and Computer-Aided Design, Convolutional neural network, Rendering (computer graphics), Robustness (computer science), Encoding (memory), Prior probability, Computer vision, Artificial intelligence, Bitstream, business, Software, Coding (social sciences)
Abstract: Real-world video processing algorithms often face the challenge of processing compressed videos instead of pristine videos. Despite the tremendous successes achieved in deep-learning based video super-resolution (SR), much less work has been dedicated to the SR of compressed videos. Herein, we propose a novel approach for compressed domain deep video SR by jointly leveraging the coding priors and deep priors. By exploiting the diverse and ready-made spatial and temporal coding priors (e.g., partition maps and motion vectors) extracted directly from the video bitstream in an effortless way, the video SR in the compressed domain allows us to accurately reconstruct the high resolution video with high flexibility and substantially economized computational complexity. More specifically, to incorporate the spatial coding prior, the Guided Spatial Feature Transform (GSFT) layer is proposed to modulate features of the prior with the guidance of the video information, making the prior features more fine-grained and content-adaptive. To incorporate the temporal coding prior, a guided soft alignment scheme is designed to generate local attention offsets to compensate for decoded motion vectors. Our soft alignment scheme combines the merits of explicit and implicit motion modeling methods, rendering the alignment of features more effective for SR in terms of the computational complexity and robustness to inaccurate motion fields. Furthermore, to fully make use of the deep priors, the multi-scale fused features are generated from a scale-wise convolution reconstruction network for final SR video reconstruction. To promote the compressed domain video SR research, we build an extensive Compressed Videos with Coding Prior (CVCP) dataset, including compressed videos of diverse content and various coding priors extracted from the bitstream. Extensive experimental results show the effectiveness of coding priors in compressed domain video SR.
Published: 2021

20. Brain responses to drug cues predict craving changes in abstinent heroin users: A preliminary study

Author: Min Zhang, Shicong Wang, Jun Liu, Shuang Liu, Ziqiang Shao, Longmao Chen, Yan Xu, Wenhan Yang, and Kai Yuan
Subjects: Male, Craving, Audiology, Heroin, 0302 clinical medicine, Thalamus, media_common, Heroin Dependence, Putamen, 05 social sciences, fMRI, Opioid use disorder, Middle Aged, Magnetic Resonance Imaging, Sexual cue, Neurology, Drug cue, Female, Cues, medicine.symptom, psychological phenomena and processes, medicine.drug, RC321-571, Adult, medicine.medical_specialty, Sexual Behavior, Cognitive Neuroscience, media_common.quotation_subject, Prefrontal Cortex, Neurosciences. Biological psychiatry. Neuropsychiatry, behavioral disciplines and activities, 050105 experimental psychology, 03 medical and health sciences, Neuroimaging, mental disorders, Connectome, medicine, Humans, 0501 psychology and cognitive sciences, business.industry, Addiction, Abstinence, medicine.disease, Behavior, Addictive, nervous system, business, Insula, 030217 neurology & neurosurgery, Follow-Up Studies
Abstract: Background: Loss of control over drug intake occurring in drug addiction is believed to result from disruption of reward circuits, including reduced responsiveness to natural rewards (e.g., monetary, sex) and heightened responsiveness to drug reward. Yet few studies have assessed reward deficiency and related brain responses in abstinent heroin users with opioid use disorder, and less is known about whether the brain responses can predict cue-induced craving changes following prolonged abstinence. Method: 31 heroin users (age: 44.13±7.68 years, male: 18 (58%), duration of abstinence: 85.2±52.5 days) were enrolled at a mandatory detoxification center. By employing a cue-reactivity paradigm including three types of cues (drug, sexual, neutral), brain regional activations and circuit-level functional coupling were extracted. Among the 31 heroin users, 15 were followed up longitudinally to assess cue-induced craving changes in the ensuing 6 months. Results: One-way analysis of variance results showed that heroin users have differential brain activations to the three cues (neutral, drug and sexual) in the left dorsolateral prefrontal cortex (DLPFC), insula, orbitofrontal cortex (OFC) and the bilateral thalamus. Drug cue induced greater activations in left DLPFC, insula and OFC compared to sexual cue. The psychophysiological interactions (PPI) analysis revealed negative couplings of the left DLPFC and the left OFC, bilateral thalamus, putamen in heroin users during drug cue exposure. In the 6-month follow-up study, both drug cue induced activation of the left DLPFC and the functional coupling of the left DLPFC-bilateral thalamus at baseline were correlated with craving reductions, which were not found for sexual cues. Conclusion: Our preliminary study provided novel evidence for the reward deficiency theory of opioid use disorder. 
Our findings also have clinical implications, as drug cue induced activation of the left DLPFC and functional coupling of left DLPFC-bilateral thalamus may be potential neuroimaging markers for craving changes during prolonged abstinence. The findings of the current preliminary study should be confirmed with larger sample sizes in the future.
Published: 2021

21. Reduced midbrain functional connectivity and recovery in abstinent heroin users

Author: Yangding Li, Yan Xu, Shicong Wang, Longmao Chen, Xinwen Wen, Jun Liu, Cui Yan, Li Fan, Jing Luo, Wenhan Yang, Min Zhang, Kai Yuan, Shuang Liu, Ziqiang Shao, and Fei Tang
Subjects: Brain Mapping, business.industry, Heroin Dependence, Ventral Tegmental Area, Striatum, Nucleus accumbens, Magnetic Resonance Imaging, Ventral tegmental area, Midbrain, Psychiatry and Mental health, medicine.anatomical_structure, Cross-Sectional Studies, nervous system, Dopaminergic pathways, Cortex (anatomy), mental disorders, Neural Pathways, medicine, Humans, Orbitofrontal cortex, business, Neuroscience, Biological Psychiatry, Anterior cingulate cortex
Abstract: Dopaminergic pathways from the midbrain to striatum as well as cortex are involved in addiction. However, the alterations of these pathways and whether the recovery of aberrant circuits can be detected after prolonged abstinence in heroin users remain largely unknown. The resting-state functional connectivity (RSFC) patterns of midbrain (i.e., the ventral tegmental area (VTA) and substantia nigra (SN)) were compared between 40 abstinent heroin users with opioid use disorder (HUs) and 35 healthy controls (HCs). Then, we tested the functional recovery hypothesis by both cross-sectional and longitudinal designs. For the cross-sectional design, HUs were separated into short-term abstainers (STs) (3-15 days) and long-term abstainers (LTs) (>15 days). With regard to the longitudinal design, 22 subjects among HUs were followed up for 10 months. A sandwich estimator method was used to analyze the differences between baseline HUs and follow-up HUs. HUs showed lower RSFC between midbrain and several cortical areas (medial orbitofrontal cortex (mOFC) and anterior cingulate cortex) compared with HCs. Besides, lower RSFC of the VTA-right nucleus accumbens circuit as well as the right SN-caudate circuit was also found in HUs. An enhanced RSFC value of the VTA-left mOFC circuit was observed in LTs, compared with STs. Additionally, the longitudinal design also revealed increased RSFC values of the midbrain with frontal cortex after 10 months of prolonged abstinence. We revealed abnormal functional organizations of midbrain-striato and midbrain-cortical circuits in HUs. More importantly, partial recovery of these dysfunctions can be found after long-term abstinence.
Published: 2021

22. Teacher-Student Learning With Multi-Granularity Constraint Towards Compact Facial Feature Representation

Author: Shurun Wang, Wenhan Yang, Xinfeng Zhang, Siwei Ma, Wang Shanshe, and Shiqi Wang
Subjects: Constraint (information theory), Artificial neural network, Computer science, Feature (computer vision), business.industry, Deep learning, Feature extraction, Pattern recognition, Artificial intelligence, Granularity, business, Representation (mathematics), Decoding methods
Abstract: In this paper, we propose a novel end-to-end feature compression scheme by leveraging the representation and learning capability of deep neural networks, towards intelligent front-end equipped analysis with promising accuracy and efficiency. In particular, the extracted features are compactly coded in an end-to-end manner by optimizing the rate-distortion cost to achieve feature-in-feature representation. The multi-granularity constraint is further imposed, serving as the optimization objective to make the feature compression "healthier" from the perspective of ultimate utility. More specifically, the analysis accuracy is considered in the coarse granularity level constraint, ensuring the capability of facial analysis with the reconstructed feature. Furthermore, at the fine granularity level the feature fidelity is involved to preserve the original feature quality. Moreover, a latent code level teacher-student enhancement model is proposed to efficiently transfer the low bit-rate representation into a high bit-rate one. Such a strategy further allows us to adaptively shift the representation cost to decoding computations, leading to more flexible feature compression with enhanced decoding capability. We verify the effectiveness of the proposed model with the facial feature, and experimental results reveal better compression performance in terms of rate-accuracy compared with existing models.
Published: 2021

23. HLA-Face: Joint High-Low Adaptation for Low Light Face Detection

Author: Jiaying Liu, Wenhan Yang, and Wenjing Wang
Subjects: FOS: Computer and information sciences, Scheme (programming language), business.industry, Computer science, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine learning, computer.software_genre, Facial recognition system, Task (project management), Face (geometry), Artificial intelligence, business, Joint (audio engineering), Face detection, Adaptation (computer science), computer, computer.programming_language
Abstract: Face detection in low light scenarios is challenging but vital to many practical applications, e.g., surveillance video, autonomous driving at night. Most existing face detectors heavily rely on extensive annotations, while collecting data is time-consuming and laborious. To reduce the burden of building new datasets for low light conditions, we make full use of existing normal light data and explore how to adapt face detectors from normal light to low light. The challenge of this task is that the gap between normal and low light is too large and complex at both the pixel level and the object level. Therefore, most existing low-light enhancement and adaptation methods do not achieve desirable performance. To address the issue, we propose a joint High-Low Adaptation (HLA) framework. Through a bidirectional low-level adaptation and multi-task high-level adaptation scheme, our HLA-Face outperforms state-of-the-art methods even without using dark face labels for training. Our project is publicly available at https://daooshee.github.io/HLA-Face-Website/. Comment: Accepted to CVPR 2021
Published: 2021

24. Self-Aligned Video Deraining with Transmission-Depth Consistency

Author: Dengxin Dai, Wenhan Yang, Wending Yan, and Robby T. Tan
Subjects: Transmission (telecommunications), Consistency (statistics), business.industry, Frame (networking), Streak, Optical flow, Computer vision, Artificial intelligence, business, Visibility, Focus (optics), Encoder
Abstract: In this paper, we address the problem of rain streaks and rain accumulation removal in video, by developing a self-alignment network with transmission-depth consistency. Existing video based deraining methods focus only on rain streak removal, and commonly use optical flow to align the rain video frames. However, besides rain streaks, rain accumulation can considerably degrade visibility, and optical flow estimation in a rain video is still erroneous, which tends to make the deraining results inaccurate. Our method employs deformable convolution layers in our encoder to achieve feature-level frame alignment, and hence avoids using optical flow. For rain streaks, our method predicts the current frame from its adjacent frames, such that rain streaks that appear randomly in the temporal domain can be removed. For rain accumulation, our method employs a transmission-depth consistency loss to resolve the ambiguity between the depth and water-droplet density. Our network estimates the depth from consecutive rain-accumulation-removal outputs, and calculates the transmission map using a commonly used physics model. To ensure photometric-temporal and depth-temporal consistencies, our method estimates the camera poses, so that it can warp one frame to its adjacent frames. Experimental results show that our method is effective in removing both rain streaks and rain accumulation, outperforming state-of-the-art methods quantitatively and qualitatively.
Published: 2021

25. Scale-Free Single Image Deraining Via Visibility-Enhanced Recurrent Wavelet Learning

Author: Jiaying Liu, Zongming Guo, Shuai Yang, and Wenhan Yang
Subjects: business.industry, Computer science, Streak, Mist, Training (meteorology), Wavelet transform, Pattern recognition, Context (language use), 02 engineering and technology, Real image, Computer Graphics and Computer-Aided Design, Wavelet, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, Visibility, business, Software
Abstract: In this paper, we address a rain removal problem from a single image, even in the presence of large rain streaks and rain streak accumulation (where individual streaks cannot be seen and thus are visually similar to mist or fog). For rain streak removal, the mismatch problem between different streak sizes in training and testing phases leads to poor performance, especially when there are large streaks. To mitigate this problem, we embed a hierarchical representation of wavelet transform into a recurrent rain removal process: 1) rain removal on the low-frequency component and 2) recurrent detail recovery on high-frequency components under the guidance of the recovered low-frequency component. Benefiting from the recurrent multi-scale modeling of wavelet transform-like design, the proposed network trained on streaks with one size can adapt to those with larger sizes, which significantly favors real rain streak removal. The dilated residual dense network is used as the basic model of the recurrent recovery process. The network includes multiple paths with different receptive fields, thus it can make full use of multi-scale redundancy and utilize context information in large regions. Furthermore, to handle heavy rain cases where rain streak accumulation is present, we construct a detail appearing rain accumulation removal to not only improve the visibility but also enhance the details in dark regions. The evaluation on both synthetic and real images, particularly on those containing large rain streaks and heavy accumulation, shows the effectiveness of our novel models, which significantly outperform the state-of-the-art methods.
Published: 2019

26. Context-Aware Text-Based Binary Image Stylization and Synthesis

Author: Shuai Yang, Zongming Guo, Jiaying Liu, and Wenhan Yang
Subjects: FOS: Computer and information sciences, Computer science, business.industry, Computer Vision and Pattern Recognition (cs.CV), Binary image, Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Inpainting, 02 engineering and technology, Geometric shape, Computer Graphics and Computer-Aided Design, Visualization, Rendering (computer graphics), Texture transfer, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, business, Software, ComputingMethodologies_COMPUTERGRAPHICS, Texture synthesis
Abstract: In this work, we present a new framework for the stylization of text-based binary images. First, our method stylizes the stroke-based geometric shape like text, symbols and icons in the target binary image based on an input style image. Second, the composition of the stylized geometric shape and a background image is explored. To accomplish the task, we propose legibility-preserving structure and texture transfer algorithms, which progressively narrow the visual differences between the binary image and the style image. The stylization is then followed by a context-aware layout design algorithm, where cues for both seamlessness and aesthetics are employed to determine the optimal layout of the shape in the background. Given the layout, the binary image is seamlessly embedded into the background by texture synthesis under a context-aware boundary constraint. According to the contents of binary images, our method can be applied to many fields. We show that the proposed method is capable of addressing the unsupervised text stylization problem and is superior to state-of-the-art style transfer methods in automatic artistic typography creation. Besides, extensive experiments on various tasks, such as visual-textual presentation synthesis, icon/symbol rendering and structure-guided image inpainting, demonstrate the effectiveness of the proposed method. Comment: Accepted by IEEE Trans. on Image Processing. Project page: http://www.icst.pku.edu.cn/struct/Projects/UTS.html
Published: 2019

27. The differences of lipid profiles between only children and children with siblings: A national survey in China

Author: Bingjie Ma, Wenhan Yang, Li Cai, Lizi Lin, Jin Jing, Yajun Chen, and Jun Ma
Subjects: Male, Rural Population, 0301 basic medicine, China, Food intake, Adolescent, Physical activity, lcsh:Medicine, Subgroup analysis, Triglycerides blood, Article, 03 medical and health sciences, 0302 clinical medicine, Total cholesterol, Humans, Medicine, Child, lcsh:Science, Triglycerides, Lipoprotein cholesterol, Multidisciplinary, business.industry, Siblings, lcsh:R, Only Child, medicine.disease, Health Surveys, Diet, Lipoproteins, LDL, 030104 developmental biology, Increased risk, Female, lipids (amino acids, peptides, and proteins), lcsh:Q, Lipoproteins, HDL, business, 030217 neurology & neurosurgery, Dyslipidemia, Demography
Abstract: With the increasing number of the one-child family, it is important to investigate whether the only-child status is associated with dyslipidemia. Among a national sample of 65,347 Chinese children aged 6–17 years, 16,100 lipid profiles were available. Children’s height, weight, total cholesterol (TC), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), and low-density lipoprotein cholesterol (LDL-C) were measured. In comparison to children with siblings, only children (OC) were more likely to be boys and live in urban areas. OC had less physical activity, less fried food intake, but more meat and dairy intakes. OC had significantly higher levels of TC (3.97 ± 0.78 vs. 3.89 ± 0.77) and LDL-C (2.12 ± 0.65 vs. 2.06 ± 0.64) in the overall group, and also in the subgroups of rural boys and girls. The prevalence of hyper-TC (5.48% vs. 4.43%) and hyper-LDL-C (3.97% vs. 2.96%) were significantly higher in OC than their counterparts. Furthermore, we found higher odds of hyper-LDL-C [1.43 (1.12, 1.83)] in OC after adjustments. In the subgroup analysis, only-child status was associated with increased risk of hyper-TC [1.86 (1.06, 3.26)] and hyper-LDL-C [2.65 (1.14, 6.16)] among rural boys, and hyper-LDL-C among rural girls [2.20 (1.14, 4.22)]. In conclusion, higher levels of TC and LDL-C were found in OC especially for rural children. Being an only-child was associated with increased risk of hyper-LDL-C.
Published: 2019

28. D3R-Net: Dynamic Routing Residue Recurrent Network for Video Rain Removal

Author: Shuai Yang, Wenhan Yang, Jiaying Liu, and Zongming Guo
Subjects: Context model, Pixel, Computer science, business.industry, Rain removal, Feature extraction, Pattern recognition, 02 engineering and technology, Adaptive routing, Residual, Computer Graphics and Computer-Aided Design, Network planning and design, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Segmentation, Artificial intelligence, business, Software
Abstract: In this paper, we address the problem of video rain removal by considering rain occlusion regions, i.e., very low light transmittance for rain streaks. Different from additive rain streaks, in such occlusion regions, the details of backgrounds are completely lost. Therefore, we propose a hybrid rain model to depict both rain streaks and occlusions. Integrating the hybrid model and useful motion segmentation context information, we present a Dynamic Routing Residue Recurrent Network (D3R-Net). D3R-Net first extracts the spatial features by a residual network. Then, the spatial features are aggregated by recurrent units along the temporal axis. In the temporal fusion, the context information is embedded into the network in a “dynamic routing” way. A heap of recurrent units takes responsibility for handling the temporal fusion in given contexts, e.g., rain or non-rain regions. In the certain forward and backward processes, one of these recurrent units is mainly activated. Then, a context selection gate is employed to detect the context and select one of these temporally fused features generated by these recurrent units as the final fused feature. Finally, this last feature plays the role of a “residual feature.” It is combined with the spatial feature and then used to reconstruct the negative rain streaks. In such a D3R-Net, we incorporate motion segmentation, which denotes whether a pixel belongs to fast moving edges or not, and a rain type indicator, indicating whether a pixel belongs to rain streaks, rain occlusions, or non-rain regions, as the context variables. Extensive experiments on a series of synthetic and real videos with rain streaks verify not only the superiority of the proposed method over the state of the art but also the effectiveness of our network design and each of its components.
Published: 2019

29. Recurrent Multi-Frame Deraining: Combining Physics Guidance and Adversarial Learning

Author: Jiashi Feng, Bin Cheng, Jiaying Liu, Robby T. Tan, Shiqi Wang, and Wenhan Yang
Subjects: business.industry, Applied Mathematics, Rain removal, Frame (networking), ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Process (computing), Construct (python library), Power (physics), Multi frame, Adversarial system, Computational Theory and Mathematics, Artificial Intelligence, Computer vision, Computer Vision and Pattern Recognition, Stage (hydrology), Artificial intelligence, business, Software, ComputingMethodologies_COMPUTERGRAPHICS
Abstract: Existing video rain removal methods mainly focus on rain streak removal and are solely trained based on the synthetic data, which neglect more complex degradation factors, e.g., rain accumulation, and the prior knowledge in real rain data. Thus, in this paper, we build a more comprehensive rain model with several degradation factors and construct a novel two-stage video rain removal method that combines the power of synthetic videos and real data. Specifically, a novel two-stage progressive network is proposed: recovery guided by a physics model, and further restoration by adversarial learning. The first stage performs an inverse recovery process guided by our proposed rain model. An initially estimated background frame is obtained based on the input rain frame. The second stage employs adversarial learning to refine the result, i.e., recovering the overall color and illumination distributions of the frame and the background details that the first stage fails to recover, and removing the artifacts generated in the first stage. Furthermore, we also introduce a more comprehensive rain model that includes degradation factors, e.g., occlusion and rain accumulation, which appear in real scenes yet are ignored by existing methods. This model, which generates more realistic rain images, will train and evaluate our models better. Extensive evaluations on synthetic and real videos show the effectiveness of our method in comparison to the state-of-the-art methods. Our datasets, results and code are available at: https://github.com/flyywh/Recurrent-Multi-Frame-Deraining.
Published: 2021

30. Treatment Response Prediction and Individualized Identification of Short-Term Abstinence Methamphetamine Dependence Using Brain Graph Metrics

Author: Ru Yang, Jun Liu, Jing Luo, Fei Tang, Cui Yan, Sihong Huang, Wenhan Yang, and Xuefei Yang
Subjects: medicine.medical_specialty, lcsh:RC435-571, Middle temporal gyrus, Inferior frontal gyrus, Audiology, graph metrics, Cuneus, 03 medical and health sciences, 0302 clinical medicine, methamphetamine dependence, lcsh:Psychiatry, medicine, support vector machine, 030212 general & internal medicine, Original Research, Psychiatry, medicine.diagnostic_test, business.industry, treatment response, Confidence interval, Support vector machine, Psychiatry and Mental health, medicine.anatomical_structure, classification, Superior frontal gyrus, Feature (computer vision), business, Functional magnetic resonance imaging, 030217 neurology & neurosurgery
Abstract: Background: The abuse of methamphetamine (MA) worldwide has gained international attention as the most rapidly growing illicit drug problem. The classification and treatment response prediction of MA addicts are thereby paramount, in order for effective treatments to be more targeted to individuals. However, there has been limited progress. Methods: In the present study, 43 MA-dependent participants and 38 age- and gender-matched healthy controls were enrolled, and their resting-state functional magnetic resonance imaging data were collected. MA-dependent participants who showed 50% reduction in craving were defined as responders to treatment. The present study used the machine learning method, which is a support vector machine (SVM), to detect the most relevant features for discriminating and predicting the treatment response for MA-dependent participants based on the features extracted from the functional graph metrics. Results: A classifier was able to differentiate MA-dependent subjects from normal controls, with a cross-validated prediction accuracy, sensitivity, and specificity of 73.2% [95% confidence interval (CI) = 71.23–74.17%], 66.05% (95% CI = 63.06–69.04%), and 80.35% (95% CI = 77.77–82.93%), respectively, at the individual level. The most accurate combination of classifier features included the nodal efficiency in the right middle temporal gyrus and the community index in the left precentral gyrus and cuneus. Between these two, the community index in the left precentral gyrus had the highest importance. In addition, the classification performance of the other classifier used to predict the treatment response of MA-dependent subjects had an accuracy, sensitivity, and specificity of 71.2% (95% CI = 69.28–73.12%), 86.75% (95% CI = 84.48–88.92%), and 55.65% (95% CI = 52.61–58.79%), respectively, at the individual level. Furthermore, the most accurate combination of classifier features included the nodal clustering coefficient in the right orbital part of the superior frontal gyrus, the nodal local efficiency in the right orbital part of the superior frontal gyrus, and the right triangular part of the inferior frontal gyrus and right temporal pole of middle temporal gyrus. Among these, the nodal local efficiency in the right temporal pole of the middle temporal gyrus had the highest feature importance. Conclusion: The present study identified the most relevant features of MA addiction and treatment based on SVMs and the features extracted from the graph metrics and provided possible biomarkers to differentiate and predict the treatment response for MA-dependent patients. The brain regions involved in the best combinations should be given close attention during the treatment of MA.
Published: 2021
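
The abstract of record 30 reports accuracy, sensitivity, and specificity for a binary classifier. As a reading aid, the sketch below shows how these three figures are conventionally derived from true and predicted labels. It does not reproduce the study's SVM or its cross-validation; the function name `classification_metrics` and the toy labels are hypothetical, and the code assumes both classes are present in the true labels.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels.

    Labels: 1 = positive class (e.g. patient), 0 = negative class (e.g. control).
    Assumes y_true contains at least one example of each class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # share of positives that were detected
    specificity = tn / (tn + fp)  # share of negatives that were rejected
    return accuracy, sensitivity, specificity


# Hypothetical toy labels, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
acc, sens, spec = classification_metrics(y_true, y_pred)
print(acc, sens, spec)
```

In a cross-validated setting these quantities would typically be computed on each held-out fold and then summarized, which is one way confidence intervals such as those in the abstract can arise.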

31. Association of Patient Sex with Efficacy of Programmed Death-1/Ligand-1 Inhibitors in Advanced Non–small-cell Lung Cancer: A Systematic Review and Meta-analysis

Author: Shun Xu, Wenhan Yang, Shu Liu, and Xin Zhang
Subjects: Oncology, medicine.medical_specialty, business.industry, Meta-analysis, Internal medicine, medicine, Non small cell, Programmed death 1, Lung cancer, medicine.disease, business, Ligand (biochemistry)
Published: 2021

32. Camera Invariant Feature Learning for Generalized Face Anti-spoofing

Author: Wenhan Yang, Haoliang Li, Sam Kwong, Shiqi Wang, and Baoliang Chen
Subjects: FOS: Computer and information sciences, 021110 strategic, defence & security studies, Spoofing attack, Computer Networks and Communications, business.industry, Computer science, Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Feature extraction, 0211 other engineering and technologies, Computer Science - Computer Vision and Pattern Recognition, Pattern recognition, 02 engineering and technology, Facial recognition system, Domain (software engineering), Artificial Intelligence (cs.AI), Feature (computer vision), Face (geometry), Artificial intelligence, Safety, Risk, Reliability and Quality, business, Divergence (statistics)
Abstract: There has been an increasing consensus in learning based face anti-spoofing that the divergence in terms of camera models is causing a large domain gap in real application scenarios. We describe a framework that eliminates the influence of inherent variance from acquisition cameras at the feature level, leading to the generalized face spoofing detection model that could be highly adaptive to different acquisition devices. In particular, the framework is composed of two branches. The first branch aims to learn the camera invariant spoofing features via feature level decomposition in the high frequency domain. Motivated by the fact that the spoofing features exist not only in the high frequency domain, in the second branch the discrimination capability of extracted spoofing features is further boosted from the enhanced image based on the recomposition of the high-frequency and low-frequency information. Finally, the classification results of the two branches are fused together by a weighting strategy. Experiments show that the proposed method can achieve better performance in both intra-dataset and cross-dataset settings, demonstrating the high generalization capability in various application scenarios.
Published: 2021

33. Raw-Guided Enhancing Reprocess of Low-Light Image via Deep Exposure Adjustment

Author: Jiaying Liu, Yueyu Hu, Haofeng Huang, and Wenhan Yang
Subjects: Ground truth, Computer science, business.industry, media_common.quotation_subject, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020206 networking & telecommunications, Image processing, 02 engineering and technology, Construct (python library), Ambiguity, Domain (software engineering), Image (mathematics), 0202 electrical engineering, electronic engineering, information engineering, RGB color model, 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, business, Joint (audio engineering), media_common
Abstract: Enhancement of images captured in low-light conditions remains a challenging problem even with the advanced machine learning techniques. The challenges include the ambiguity of the ground truth for a low-light image and the loss of information during the RAW image processing. To tackle the problems, in this paper, we take a novel view to regard low-light image enhancement as an exposure time adjustment problem and propose a corresponding explicit and mathematical definition. Based on that, we construct a RAW-Guiding exposure time adjustment Network (RGNET), which overcomes RGB images’ nonlinearity and RAW images’ inaccessibility. That is, RGNET is only trained with RGB images and corresponding RAW images, which helps project nonlinear RGB images into a linear domain, simultaneously without using RAW images in the testing phase. Furthermore, our network consists of three individual sub-modules for unprocessing, reconstruction and processing, respectively. To the best of our knowledge, the proposed sub-net for unprocessing is the first learning-based unprocessing method. After the joint training of three parts, each pre-trained separately with the RAW image guidance, experimental results demonstrate that RGNET outperforms state-of-the-art low-light image enhancement methods.
Published: 2021

34. Sensitivity-Aware Bit Allocation for Intermediate Deep Feature Compression

Author: Jiaying Liu, Wenhan Yang, Sifeng Xia, and Yuzhang Hu
Subjects: Computer science, business.industry, Deep learning, Process (computing), Data_CODINGANDINFORMATIONTHEORY, 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, Feature (computer vision), Compression (functional analysis), Bit rate, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, Sensitivity (control systems), Focus (optics), business, Algorithm, 0105 earth and related environmental sciences, Communication channel, Degradation (telecommunications)
Abstract: In this paper, we focus on compressing and transmitting deep intermediate features to support the prosperous applications at the cloud side efficiently, and propose a sensitivity-aware bit allocation algorithm for the deep intermediate feature compression. Considering that different channels’ contributions to the final inference result of the deep learning model might differ a lot, we design a channel-wise bit allocation mechanism to maintain the accuracy while trying to reduce the bit-rate cost. The algorithm consists of two passes. In the first pass, only one channel is exposed to compression degradation while other channels are kept as the original ones in order to test this channel’s sensitivity to the compression degradation. This process will be repeated until all channels’ sensitivity is obtained. Then, in the second pass, bits allocated to each channel will be automatically decided according to the sensitivity obtained in the first pass to make sure that the channel with higher sensitivity can be allocated with more bits to maintain accuracy as much as possible. With the well-designed algorithm, our method surpasses state-of-the-art compression tools with on average 6.4% BD-rate saving.
Published: 2020
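
The two-pass procedure described in the abstract of record 34 can be illustrated with a small sketch. This is not the authors' implementation: the uniform quantizer, the stand-in `model_output` function (a weighted sum of channel means), and the proportional allocation rule are all assumptions chosen only to make the two passes concrete.

```python
def quantize(values, bits):
    """Uniformly quantize values in [0, 1] to 2**bits levels."""
    levels = (1 << bits) - 1
    return [round(v * levels) / levels for v in values]


def model_output(channels, weights):
    """Hypothetical stand-in for the downstream network: weighted sum of channel means."""
    return sum(w * sum(ch) / len(ch) for ch, w in zip(channels, weights))


def measure_sensitivity(channels, weights, probe_bits=2):
    """Pass 1: degrade one channel at a time, keep the others intact, record the output change."""
    reference = model_output(channels, weights)
    sensitivities = []
    for i in range(len(channels)):
        degraded = list(channels)
        degraded[i] = quantize(channels[i], probe_bits)
        sensitivities.append(abs(model_output(degraded, weights) - reference))
    return sensitivities


def allocate_bits(sensitivities, total_bits, min_bits=1):
    """Pass 2: give each channel min_bits, then share the remainder in proportion to sensitivity."""
    spare = total_bits - min_bits * len(sensitivities)
    total = sum(sensitivities) or 1.0  # avoid division by zero if nothing is sensitive
    return [min_bits + int(spare * s / total) for s in sensitivities]


# Hypothetical toy features: three channels with values in [0, 1].
channels = [[0.1, 0.4, 0.7], [0.2, 0.5, 0.9], [0.3, 0.6, 0.8]]
weights = [5.0, 0.1, 1.0]  # the first channel matters most to the stand-in model
sens = measure_sensitivity(channels, weights)
bits = allocate_bits(sens, total_bits=12)
print(sens, bits)
```

Rounding down in the second pass means the budget may not be spent in full; a real allocator would presumably distribute the leftover bits as well.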

35. Just Noticeable Distortion Profile Inference: A Patch-Level Structural Visibility Learning Approach

Author: Zhangkai Ni, Shiqi Wang, Wenhan Yang, Sam Kwong, Xinfeng Zhang, and Xuelin Shen
Subjects: Visual perception, Computer science, business.industry, Inference, Pattern recognition, 02 engineering and technology, Computer Graphics and Computer-Aided Design, Visualization, Frequency domain, Distortion, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, business, Software, Data compression
Abstract: In this paper, we propose an effective approach to infer the just noticeable distortion (JND) profile based on patch-level structural visibility learning. Instead of pixel-level JND profile estimation, the image patch, which is regarded as the basic processing unit to better correlate with the human perception, can be further decomposed into three conceptually independent components for visibility estimation. In particular, to incorporate the structural degradation into the patch-level JND model, a deep learning-based structural degradation estimation model is trained to approximate the masking of structural visibility. In order to facilitate the learning process, a JND dataset is further established, including 202 pristine images and 7878 distorted images generated by advanced compression algorithms based on the upcoming Versatile Video Coding (VVC) standard. Extensive experimental results further show the superiority of the proposed approach over the state-of-the-art. Our dataset is available at: https://github.com/ShenXuelin-CityU/PWJNDInfer.
Published: 2020

36. Integrating Semantic Segmentation and Retinex Model for Low-Light Image Enhancement

Author: Jiaying Liu, Wenhan Yang, Minhao Fan, and Wenjing Wang
Subjects: Color constancy, Computer science, business.industry, media_common.quotation_subject, Perspective (graphical), ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020206 networking & telecommunications, 02 engineering and technology, Pipeline (software), Synthetic data, 0202 electrical engineering, electronic engineering, information engineering, Contrast (vision), 020201 artificial intelligence & image processing, Computer vision, Segmentation, Noise (video), Artificial intelligence, business, Image restoration, media_common
Abstract: Retinex model is widely adopted in various low-light image enhancement tasks. The basic idea of the Retinex theory is to decompose images into reflectance and illumination. The ill-posed decomposition is usually handled by hand-crafted constraints and priors. With the recently emerging deep-learning based approaches as tools, in this paper, we integrate the idea of Retinex decomposition and semantic information awareness. Based on the observation that various objects and backgrounds have different material, reflection and perspective attributes, regions of a single low-light image may require different adjustment and enhancement regarding contrast, illumination and noise. We propose an enhancement pipeline with three parts that effectively utilize the semantic layer information. Specifically, we extract the segmentation, reflectance as well as illumination layers, and concurrently enhance every separate region, i.e. sky, ground and objects for outdoor scenes. Extensive experiments on both synthetic data and real world images demonstrate the superiority of our method over current state-of-the-art low-light enhancement algorithms.
Published: 2020
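
The Retinex idea mentioned in the abstract of record 36, decomposing an image into reflectance and illumination, can be sketched in a few lines. The illumination estimate used here (the per-pixel maximum over the RGB channels) and the gamma adjustment are common simple heuristics chosen for illustration; they are assumptions, not the pipeline of the paper, which also uses semantic segmentation.

```python
def retinex_decompose(image, eps=1e-6):
    """Split an RGB image into reflectance and illumination so that image ~= R * L.

    `image` is a list of rows, each row a list of (r, g, b) tuples in [0, 1].
    Illumination is estimated as the per-pixel channel maximum (an assumption).
    """
    illumination = [[max(pixel) for pixel in row] for row in image]
    reflectance = [
        [tuple(c / (lum + eps) for c in pixel) for pixel, lum in zip(row, lum_row)]
        for row, lum_row in zip(image, illumination)
    ]
    return reflectance, illumination


def enhance(image, gamma=0.5):
    """Brighten the illumination with a gamma curve, then recombine with the reflectance."""
    reflectance, illumination = retinex_decompose(image)
    return [
        [tuple(min(1.0, c * (lum ** gamma)) for c in pixel) for pixel, lum in zip(r_row, lum_row)]
        for r_row, lum_row in zip(reflectance, illumination)
    ]


# Hypothetical 1x2 low-light image.
dark = [[(0.04, 0.02, 0.01), (0.0, 0.0, 0.0)]]
bright = enhance(dark, gamma=0.5)
print(bright)
```

Because only the illumination layer is adjusted, the ratios between the color channels of each pixel are preserved, which is the main appeal of working in the decomposed representation.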

37. When Bitstream Prior Meets Deep Prior

Author: Long Sun, Wenhan Yang, Shiqi Wang, and Peilin Chen
Subjects: Sequence, Dependency (UML), Computational complexity theory, Computer science, business.industry, Deep learning, Process (computing), 020206 networking & telecommunications, 02 engineering and technology, Motion vector, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, Bitstream, business, Decoding methods
Abstract: The standard paradigm of video super-resolution (SR) is to generate the spatial-temporal coherent high-resolution (HR) sequence from the corresponding low-resolution (LR) version which has already been decoded from the bitstream. However, a highly practical while relatively under-studied way is enabling the built-in SR functionality in the decoder, in the sense that almost all videos are compactly represented. In this paper, we systematically investigate the SR of compressed LR videos by leveraging the interactivity between decoding prior and deep prior. By fully exploiting the compact video stream information, the proposed bitstream prior embedded SR framework achieves compressed video SR and quality enhancement simultaneously in a single feed-forward process. More specifically, we propose a motion vector guided multi-scale local attention module that explicitly exploits the temporal dependency and suppresses coding artifacts with substantially economized computational complexity. Moreover, a scale-wise deep residual-in-residual network is learned to reconstruct the SR frames from the multi-scale fused features. To facilitate the research of compressed video SR, we also build a large-scale dataset with compressed videos of diverse content, including ready-made diversified kinds of side information extracted from the bitstream. Both quantitative and qualitative evaluations show that our model achieves superior performance for compressed video SR, and offers competitive performance compared to the sequential combinations of the state-of-the-art methods for compressed video artifacts removal and SR.
Published: 2020

38. MS$^2$L: Multi-Task Self-Supervised Learning for Skeleton Based Action Recognition

Author: Jiaying Liu, Wenhan Yang, Sijie Song, and Lilang Lin
Subjects: FOS: Computer and information sciences, Computer science, business.industry, Computer Science - Artificial Intelligence, Feature vector, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, 020207 software engineering, 02 engineering and technology, Overfitting, Machine learning, computer.software_genre, Jigsaw, Task (project management), Artificial Intelligence (cs.AI), ComputingMethodologies_PATTERNRECOGNITION, Discriminative model, Action (philosophy), 0202 electrical engineering, electronic engineering, information engineering, Feature (machine learning), 020201 artificial intelligence & image processing, Artificial intelligence, business, Feature learning, computer
Abstract: In this paper, we address self-supervised representation learning from human skeletons for action recognition. Previous methods, which usually learn feature presentations from a single reconstruction task, may come across the overfitting problem, and the features are not generalizable for action recognition. Instead, we propose to integrate multiple tasks to learn more general representations in a self-supervised manner. To realize this goal, we integrate motion prediction, jigsaw puzzle recognition, and contrastive learning to learn skeleton features from different aspects. Skeleton dynamics can be modeled through motion prediction by predicting the future sequence. And temporal patterns, which are critical for action recognition, are learned through solving jigsaw puzzles. We further regularize the feature space by contrastive learning. Besides, we explore different training strategies to utilize the knowledge from self-supervised tasks for action recognition. We evaluate our multi-task self-supervised learning approach with action classifiers trained under different configurations, including unsupervised, semi-supervised and fully-supervised settings. Our experiments on the NW-UCLA, NTU RGB+D, and PKUMMD datasets show remarkable performance for action recognition, demonstrating the superiority of our method in learning more discriminative and general features. Our project website is available at https://langlandslin.github.io/projects/MSL/. Accepted by ACMMM 2020.
Published: 2020
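
The temporal jigsaw task mentioned in the abstract of record 38 can be pictured as follows: a sequence is cut into segments, the segments are shuffled, and the index of the permutation becomes the classification label. The sketch below is a generic illustration of that pretext task; the function name `make_jigsaw_sample`, the choice of three segments, and the handling of leftover frames are assumptions, not details taken from the paper.

```python
import itertools
import random


def make_jigsaw_sample(sequence, num_segments=3, rng=random):
    """Cut a frame sequence into equal segments, shuffle them, and return (shuffled, label).

    The label is the index of the applied permutation in the lexicographic list of
    all permutations. Trailing frames that do not fill a segment are dropped.
    """
    seg_len = len(sequence) // num_segments
    segments = [sequence[i * seg_len:(i + 1) * seg_len] for i in range(num_segments)]
    permutations = list(itertools.permutations(range(num_segments)))
    label = rng.randrange(len(permutations))
    order = permutations[label]
    shuffled = [frame for idx in order for frame in segments[idx]]
    return shuffled, label


# Hypothetical sequence: integers stand in for per-frame skeleton poses.
frames = list(range(6))
shuffled, label = make_jigsaw_sample(frames, num_segments=3, rng=random.Random(0))
print(shuffled, label)
```

A network trained to predict `label` from `shuffled` has to reason about temporal order, which is the property the abstract describes as critical for action recognition.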

39. Memory-Augmented Auto-Regressive Network for Frame Recurrent Inter Prediction

Author: Sifeng Xia, Yuzhang Hu, Jiaying Liu, and Wenhan Yang
Subjects: Autoregressive model, Computer science, business.industry, Encoding (memory), Motion estimation, Frame (networking), Feature extraction, Redundancy (engineering), Pattern recognition, Artificial intelligence, business, Reference frame, Coding (social sciences)
Abstract: Inter prediction is quite important for the modern codecs to remove temporal redundancy. In this paper, we make endeavors in generating artificial reference frames with previous reconstructed frames for inter prediction, to offer a better choice when the traditional block-wise motion estimation fails to find a good reference block. Long-term temporal dynamics are tracked during the whole coding process to generate more accurate and realistic artificial reference frames. Specifically, we propose a Memory-Augmented Auto-Regressive Network (MAAR-Net) for frame prediction in video coding. MAAR-Net regresses the current frame with two nearest frames via an auto-regressive (AR) model to better capture the main spatial and temporal structures. The AR regression coefficients are generated based on adjacent frame information as well as the long-term motion dynamics accumulated and propagated by a convolutional Long Short-Term Memory (LSTM). To generate the target frame with higher quality, a quality attention mechanism is introduced for the temporal regularization between different reconstructed frames. With the well-designed network, our method surpasses HEVC on average 4.0% BD-rate saving and up to 10.6% BD-rate saving for the luma component under the low-delay configuration.
Published: 2020

40. Towards Unsupervised Deep Image Enhancement with Generative Adversarial Network

Author: Wenhan Yang, Zhangkai Ni, Shiqi Wang, Lin Ma, and Sam Kwong
Subjects: FOS: Computer and information sciences, Computer science, business.industry, media_common.quotation_subject, Computer Vision and Pattern Recognition (cs.CV), Supervised learning, Computer Science - Computer Vision and Pattern Recognition, Pattern recognition, 02 engineering and technology, Computer Graphics and Computer-Aided Design, Regularization (mathematics), Image (mathematics), Domain (software engineering), Set (abstract data type), Feature (computer vision), 0202 electrical engineering, electronic engineering, information engineering, Code (cryptography), 020201 artificial intelligence & image processing, Quality (business), Artificial intelligence, business, Software, media_common
Abstract: Improving the aesthetic quality of images is challenging and much desired by the public. To address this problem, most existing algorithms are based on supervised learning methods to learn an automatic photo enhancer for paired data, which consists of low-quality photos and corresponding expert-retouched versions. However, the style and characteristics of photos retouched by experts may not meet the needs or preferences of general users. In this paper, we present an unsupervised image enhancement generative adversarial network (UEGAN), which learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner, rather than learning on a large number of paired images. The proposed model is based on a single deep GAN which embeds the modulation and attention mechanisms to capture richer global and local features. Based on the proposed model, we introduce two losses to deal with the unsupervised image enhancement: (1) fidelity loss, which is defined as an L2 regularization in the feature domain of a pre-trained VGG network to ensure the content between the enhanced image and the input image is the same, and (2) quality loss that is formulated as a relativistic hinge adversarial loss to endow the input image the desired characteristics. Both quantitative and qualitative results show that the proposed model effectively improves the aesthetic quality of images. Our code is available at: https://github.com/eezkni/UEGAN.
Published: 2020
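
The two losses named in the abstract of record 40 can be written down compactly. The sketch below is only an approximation under stated assumptions: the fidelity term is shown as a mean squared difference between two feature vectors (the VGG feature extractor is not included), and the adversarial term uses the relativistic average hinge formulation, which may differ in detail from the variant in the paper. All function names are hypothetical; the linked repository is the authoritative source.

```python
def mean(values):
    return sum(values) / len(values)


def fidelity_loss(features_in, features_out):
    """Mean squared difference between input-image and enhanced-image features."""
    return mean([(a - b) ** 2 for a, b in zip(features_in, features_out)])


def discriminator_loss(real_scores, fake_scores):
    """Relativistic average hinge loss for the discriminator (assumed variant)."""
    mean_real, mean_fake = mean(real_scores), mean(fake_scores)
    loss_real = mean([max(0.0, 1.0 - (r - mean_fake)) for r in real_scores])
    loss_fake = mean([max(0.0, 1.0 + (f - mean_real)) for f in fake_scores])
    return loss_real + loss_fake


def generator_loss(real_scores, fake_scores):
    """Relativistic average hinge loss for the generator: the roles of real and fake swap."""
    mean_real, mean_fake = mean(real_scores), mean(fake_scores)
    loss_fake = mean([max(0.0, 1.0 - (f - mean_real)) for f in fake_scores])
    loss_real = mean([max(0.0, 1.0 + (r - mean_fake)) for r in real_scores])
    return loss_real + loss_fake


# Hypothetical discriminator scores for a batch of two real and two generated images.
real = [2.0, 2.0]
fake = [-2.0, -2.0]
print(discriminator_loss(real, fake), generator_loss(real, fake))
print(fidelity_loss([1.0, 2.0], [1.0, 4.0]))
```

With these toy scores the discriminator separates the two sets by a wide margin, so its loss is zero while the generator's loss is large.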

41. Face Anti-Spoofing by Fusing High and Low Frequency Features for Advanced Generalization Capability

Author: Shiqi Wang, Wenhan Yang, and Baoliang Chen
Subjects: Authentication, Spoofing attack, Computer science, Generalization, business.industry, Face (geometry), Component (UML), Fuse (electrical), Pattern recognition, Artificial intelligence, business, Image (mathematics), Domain (software engineering)
Abstract: In face authentication systems, face anti-spoofing is an indispensable part. Recently, CNN-based approaches have achieved promising results when training and testing in similar scenes. However, performance usually drops drastically when the model is tested on unseen datasets due to the domain generalization problem. In this paper, we propose a new face anti-spoofing model consisting of two streams to fuse high frequency (HF) and low frequency (LF) information of a facial image for high generalization capability. More concretely, three high-pass and low-pass filters are utilized to extract high and low frequency components of a facial image, respectively. The two components are processed by two sub-networks with a cross-frequency spatial attention (CFSA) module, which makes two streams communicate and exchange information with each other. Considering the two sub-networks are responsible for different kinds of information, self-channel attention is incorporated after CFSA, then the outputs of the two sub-networks are fused for final classification. Experiments on cross-database results show that the proposed method can largely improve the generalization capacity in face spoofing detection.
Published: 2020
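
The abstract of record 41 relies on splitting a facial image into low-frequency and high-frequency components. A minimal way to picture such a split is shown below, using a 3x3 box blur as the low-pass filter and the residual as the high-pass output. The specific filters used in the paper are not stated in the abstract, so this choice is an assumption made purely for illustration.

```python
def low_pass(image):
    """3x3 box blur with edge replication, acting as a simple low-pass filter.

    `image` is a grayscale image given as a list of rows of floats.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)  # clamp to the image border
                    xx = min(max(x + dx, 0), width - 1)
                    total += image[yy][xx]
            out[y][x] = total / 9.0
    return out


def split_frequencies(image):
    """Return (low, high) with high = image - low, so that low + high restores the image."""
    low = low_pass(image)
    high = [[p - q for p, q in zip(row, low_row)] for row, low_row in zip(image, low)]
    return low, high


# Hypothetical 3x3 image with a single bright pixel in the center.
spot = [[0.0, 0.0, 0.0], [0.0, 0.9, 0.0], [0.0, 0.0, 0.0]]
low, high = split_frequencies(spot)
print(low[1][1], high[1][1])
```

Fine detail such as the isolated bright pixel ends up mostly in the high-frequency component, while smooth regions pass through the low-frequency component almost unchanged.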

42. Decreased Relative Cerebral Blood Flow in Unmedicated Heroin-Dependent Individuals

Author: Wenhan Yang, Ru Yang, Fei Tang, Jing Luo, Jun Zhang, Changlong Chen, Chunmei Duan, Yuan Deng, Lidan Fan, and Jun Liu
Subjects: medicine.medical_specialty, lcsh:RC435-571, Trail Making Test, Thalamus, Precuneus, 03 medical and health sciences, 0302 clinical medicine, Inferior temporal gyrus, lcsh:Psychiatry, Internal medicine, mental disorders, magnetic resonance perfusion imaging, Medicine, Middle frontal gyrus, Original Research, Psychiatry, heroin addiction, business.industry, Neuropsychology, arterial spin labeling, 030227 psychiatry, Psychiatry and Mental health, medicine.anatomical_structure, nervous system, Cerebral blood flow, neurocognitive, Cerebellar vermis, Cardiology, reward circuits, business, 030217 neurology & neurosurgery
Abstract: Understanding the brain mechanisms of heroin dependence is invaluable for developing effective treatment. Measurement of regional cerebral blood flow (CBF) provides a method to visualize brain circuits that are functionally impaired by heroin dependence. This study examined regional CBF alterations and their clinical associations in unmedicated heroin-dependent individuals (HDIs) using a relatively large sample. Sixty-eight (42 males, 26 females; age: 40.9 ± 7.3 years) HDIs and forty-seven (34 males, 13 females; age: 39.3 ± 9.2 years) matched healthy controls (HCs) underwent high-resolution T1 and whole-brain arterial spin labeling (ASL) perfusion magnetic resonance imaging (MRI) scans. Additionally, clinical characteristics were collected for neurocognitive assessments. HDIs showed worse neuropsychological performance than HCs and had decreased relative CBF (rCBF) in the bilateral middle frontal gyrus (MFG), inferior temporal gyrus, precuneus, posterior cerebellar lobe, cerebellar vermis, and the midbrain adjacent to the ventral tegmental area; right posterior cingulate gyrus, thalamus, and calcarine. rCBF in the bilateral MFG was negatively correlated with Trail Making Test time in HDIs. HDIs had limbic, frontal, and parietal hypoperfusion areas. Low CBF in the MFG indicated cognitive impairment in HDIs. Together, these findings suggest the MFG as a critical region in HDIs and suggest ASL-derived CBF as a potential marker for use in heroin addiction studies.
Published: 2020

43. Increased Amplitude of Low-Frequency Fluctuation in Right Angular Gyrus and Left Superior Occipital Gyrus Negatively Correlated With Heroin Use

Author: Jun Liu, Wenhan Yang, Ru Yang, Jun Zhang, Yuan Deng, Chunmei Duan, Jing Luo, and Jiyuan Chen
Subjects: Left superior occipital gyrus, medicine.medical_specialty, lcsh:RC435-571, resting state fMRI, Right angular gyrus, Audiology, Heroin, 03 medical and health sciences, 0302 clinical medicine, lcsh:Psychiatry, Medicine, Semantic memory, Original Research, Psychiatry, heroin addicts, Resting state fMRI, medicine.diagnostic_test, business.industry, Confounding, 030227 psychiatry, Psychiatry and Mental health, Amplitude, amplitude of low-frequency fluctuation (ALFF), heroin use, addiction, business, Functional magnetic resonance imaging, human activities, 030217 neurology & neurosurgery, medicine.drug
Abstract: Abnormal amplitude of low-frequency fluctuation has been implicated in heroin addiction. However, previous studies lacked consistency and did not consider the impact of confounding factors such as methadone and alcohol. Fifty-one heroin-dependent (HD) individuals and 40 healthy controls underwent resting-state functional magnetic resonance imaging. The amplitude of low-frequency fluctuation (ALFF) value was calculated and support vector machine (SVM) classification analysis was applied to analyze the data. Compared with healthy controls, heroin addicts exhibited increased ALFF in the right angular gyrus (AG) and left superior occipital gyrus (SOG). A negative correlation was observed between increased ALFF in the right angular gyrus and left superior occipital gyrus and the duration of heroin use (p1 = 0.004, r1 = -0.426; p2 = 0.009, r2 = -0.361). Moreover, the ALFF in the right AG and left SOG could discriminate the HD subjects from the controls with acceptable accuracy (Acc1 = 64.85%, p1 = 0.004; Acc2 = 63.80%, p2 = 0.005). HD patients showed abnormal ALFF in the brain areas involved in semantic memory and visual networks. The longer HD individuals abused heroin, the less the ALFF of associated brain regions increased. These observed patterns suggested that the accumulative effect of heroin's neurotoxicity overpowered self-recovery of the brain and may be applied as a potential biomarker to identify HD individuals from the controls.
Published: 2020
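
Note: ALFF, as used in the abstract above, is conventionally taken as the mean spectral amplitude of a voxel's time series within a low-frequency band. The sketch below illustrates that idea in plain Python. It is not the authors' pipeline: the function name `alff`, the 0.01 to 0.08 Hz band, and the sampling interval `tr` are assumptions (the abstract does not state them), and a real analysis would use a neuroimaging toolbox with preprocessing.

```python
import cmath
import math


def alff(signal, tr, low=0.01, high=0.08):
    """Mean DFT amplitude of `signal` within [low, high] Hz.

    signal: list of samples from one time series (hypothetical input)
    tr:     sampling interval in seconds (assumed, not from the abstract)
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the constant offset
    amplitudes = []
    for k in range(1, n // 2 + 1):
        freq = k / (n * tr)  # frequency of the k-th DFT bin in Hz
        if low <= freq <= high:
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(centered)
            )
            amplitudes.append(abs(coeff) / n)
    if not amplitudes:
        raise ValueError("no DFT bins fall inside the requested band")
    return sum(amplitudes) / len(amplitudes)


# Synthetic check: a 0.05 Hz oscillation lies inside the band, 0.2 Hz does not.
TR = 2.0
N = 100
slow = [math.sin(2 * math.pi * 0.05 * TR * t) for t in range(N)]
fast = [math.sin(2 * math.pi * 0.20 * TR * t) for t in range(N)]
```

With these synthetic inputs, `alff(slow, TR)` is clearly larger than `alff(fast, TR)`, because only the slow oscillation contributes amplitude inside the band.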

44. A JND Dataset Based on VVC Compressed Images

Author: Wenhan Yang, Xinfeng Zhang, Xuelin Shen, Sam Kwong, Shiqi Wang, and Zhangkai Ni
Subjects: business.industry, Computer science, Quantization (signal processing), Distortion, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 0202 electrical engineering, electronic engineering, information engineering, 020206 networking & telecommunications, 020201 artificial intelligence & image processing, Pattern recognition, 02 engineering and technology, Artificial intelligence, business, Visualization
Abstract: In this paper, we establish a just noticeable distortion (JND) dataset based on the next generation video coding standard Versatile Video Coding (VVC). The dataset consists of 202 images which cover a wide range of content with resolution 1920×1080. Each image is encoded by VTM 5.0 intra coding with the quantization parameter (QP) ranging from 13 to 51. The details regarding dataset construction, subjective testing and data post-processing are described in this paper. Finally, the significance of the dataset towards future video coding research is envisioned. All source images as well as the testing data have been made available to the public.
Published: 2020

45. Association of Anemia with Neurodevelopmental Disorders in a Nationally Representative Sample of US Children

Author: Rui Gao, Linda Snetselaar, Wei Bao, Lane Strathearn, Buyun Liu, and Wenhan Yang
Subjects: Male, Pediatrics, medicine.medical_specialty, Adolescent, Anemia, Total population, Logistic regression, 03 medical and health sciences, 0302 clinical medicine, Age Distribution, 030225 pediatrics, mental disorders, medicine, Prevalence, Attention deficit hyperactivity disorder, National Health Interview Survey, Humans, 030212 general & internal medicine, Sex Distribution, Association (psychology), Child, business.industry, medicine.disease, Prognosis, United States, Autism spectrum disorder, Neurodevelopmental Disorders, Child, Preschool, Pediatrics, Perinatology and Child Health, Learning disability, Female, medicine.symptom, business
Abstract: Objective: To examine the associations of anemia with autism spectrum disorder (ASD), attention deficit hyperactivity disorder (ADHD), and learning disability in US children. Study design: We included children and adolescents aged 3-17 years from the National Health Interview Survey (NHIS), 1997-2018. Information about physician-diagnosed history of anemia, ASD, ADHD, and learning disability was reported by a parent or guardian. Multiple logistic regression with sample weights was used to estimate the ORs and 95% CIs of neurodevelopmental disorders according to the presence of anemia. Results: Of the total population of 213 893 children aged 3-17 years (mean age [SE], 10.01 [0.01] years), 2379 were reported to have a diagnosis of anemia, for a weighted prevalence of 1.06% (95% CI, 1.01-1.12). The prevalence of ASD was 1.94% (95% CI, 1.20-2.68) among children with anemia and 1.07% (95% CI, 1.01-1.14) among those without anemia. The corresponding prevalences were 12.24% (95% CI, 10.47-14.00) and 7.73% (95% CI, 7.58-7.88) for ADHD and 15.03% (95% CI, 13.08-16.99) and 7.75% (95% CI, 7.39-7.70) for learning disability, respectively. Compared with those without anemia, children with anemia were more likely to have neurodevelopmental disorders, with an aOR of 2.07 (95% CI, 1.39-3.08) for ASD, 1.84 (95% CI, 1.55-2.19) for ADHD, and 2.22 (95% CI, 1.90-2.60) for learning disability. Conclusions: In a nationally representative sample of US children, we found significant associations between anemia and neurodevelopmental disorders including ASD, ADHD, and learning disability. Further investigation is warranted to assess the causality and elucidate the underlying mechanisms.
Published: 2020
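
Note: the abstract above reports adjusted odds ratios (aORs) from weighted logistic regression. For orientation only, the sketch below computes crude (unadjusted) odds ratios directly from the prevalences quoted in the abstract. The helper names `odds` and `crude_odds_ratio` are hypothetical, and the crude values are not expected to match the published aORs, which account for covariates the abstract does not list.

```python
def odds(p):
    """Convert a proportion (0 < p < 1) to odds."""
    return p / (1.0 - p)


def crude_odds_ratio(p_exposed, p_unexposed):
    """Unadjusted odds ratio from two prevalences given as proportions."""
    return odds(p_exposed) / odds(p_unexposed)


# Prevalences quoted in the abstract: children with anemia vs. without anemia.
asd = crude_odds_ratio(0.0194, 0.0107)   # autism spectrum disorder
adhd = crude_odds_ratio(0.1224, 0.0773)  # attention deficit hyperactivity disorder
ld = crude_odds_ratio(0.1503, 0.0775)    # learning disability

for name, value in (("ASD", asd), ("ADHD", adhd), ("Learning disability", ld)):
    print(f"{name}: crude OR = {value:.2f}")
```

The crude estimates come out at roughly 1.83, 1.66, and 2.11, in the same direction as the reported aORs of 2.07, 1.84, and 2.22.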

46. Self-Learning Video Rain Streak Removal: When Cyclic Consistency Meets Temporal Correspondence

Author: Shiqi Wang, Wenhan Yang, Robby T. Tan, and Jiaying Liu
Subjects: Computer science, business.industry, Frame (networking), Feature extraction, Streak, Training (meteorology), Process (computing), 020206 networking & telecommunications, 02 engineering and technology, Consistency (database systems), Motion estimation, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, business
Abstract: In this paper, we address the problem of rain streak removal in video by developing a self-learned rain streak removal method, which does not require any clean ground-truth images in the training process. The method is inspired by the fact that adjacent frames are highly correlated and can be regarded as different versions of the same scene, while rain streaks are randomly distributed along the temporal dimension. With this in mind, we construct a two-stage Self-Learned Deraining Network (SLDNet) to remove rain streaks based on both temporal correlation and consistency. In the first stage, SLDNet utilizes the temporal correlations and learns to predict the clean version of the current frame based on its adjacent rain video frames. In the second stage, SLDNet enforces the temporal consistency among different frames. It takes both the current rain frame and adjacent rain video frames to recover structural details. The first stage is responsible for reconstructing main structures, and the second stage is responsible for extracting structural details. We build our network architecture with two sub-tasks, i.e., motion estimation and rain region detection, and optimize them jointly. Our extensive experiments demonstrate the effectiveness of our method, offering better results both quantitatively and qualitatively.
Published: 2020

47. From Fidelity to Perceptual Quality: A Semi-Supervised Approach for Low-Light Image Enhancement

Author: Wenhan Yang, Jiaying Liu, Yue Wang, Shiqi Wang, and Yuming Fang
Subjects: Computer science, business.industry, Contrast (statistics), 020207 software engineering, Pattern recognition, 02 engineering and technology, Visualization, Colors of noise, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, Noise (video), Representation (mathematics), business, Image restoration
Abstract: Under-exposure introduces a series of visual degradations, e.g., decreased visibility, intensive noise, and biased color. To address these problems, we propose a novel semi-supervised learning approach for low-light image enhancement. A deep recursive band network (DRBN) is proposed to recover a linear band representation of an enhanced normal-light image with paired low/normal-light images, and then obtain an improved one by recomposing the given bands via another learnable linear transformation based on perceptual quality-driven adversarial learning with unpaired data. The architecture is powerful and flexible, and can be trained with both paired and unpaired data. On one hand, the proposed network is designed to extract a series of coarse-to-fine band representations, whose estimations are mutually beneficial in a recursive process. On the other hand, the extracted band representation of the enhanced image in the first stage of DRBN (recursive band learning) bridges the gap between the restoration knowledge of paired data and the perceptual quality preference for real high-quality images. Its second stage (band recomposition) learns to recompose the band representation towards fitting perceptual properties of high-quality images via adversarial learning. With the help of this two-stage design, our approach generates enhanced results with well-reconstructed details and visually promising contrast and color distributions. Qualitative and quantitative evaluations demonstrate the superiority of our DRBN.
Published: 2020

48. White Matter Abnormalities Based on TBSS and Its Correlation With Impulsivity Behavior of Methamphetamine Addicts

Author: Wenhan Yang, Jing Luo, Cui Yan, Sihong Huang, and Jun Liu
Subjects: medicine.medical_specialty, Internal capsule, External capsule, lcsh:RC435-571, Poison control, impulsivity, Audiology, Impulsivity, Corpus callosum, White matter, 03 medical and health sciences, 0302 clinical medicine, Corona radiata, lcsh:Psychiatry, Medicine, methamphetamine, Original Research, Psychiatry, Barratt Impulsivity Scale, business.industry, diffusion tensor imaging, 030227 psychiatry, Psychiatry and Mental health, medicine.anatomical_structure, Superior cerebellar peduncle, tract-based spatial statistics, medicine.symptom, business, 030217 neurology & neurosurgery
Abstract: Background: Methamphetamine (MA) abuse is one of the most rapidly growing illicit drug problems worldwide. Impulsivity has been considered a core impairment underpinning addictive behavior. Studies have demonstrated that MA addicts have white matter abnormalities based on ROIs. There are few whole-brain studies, and the association between whole-brain tracts and impulsivity in MA dependence remains unclear. Tract-based spatial statistics (TBSS) was used to detect four DTI measures, and these were correlated with the Barratt Impulsivity Scale (BIS) to verify and expand the previous results. Methods: A total of 28 MA addicts and 22 healthy controls were recruited. MRI was performed to evaluate brain structural changes, the BIS was used to evaluate impulsivity, white matter differences were compared between MA addicts and healthy controls, and correlations between diffusion parameters and BIS scores were then determined. Results: Compared with controls, MA addicts had significantly lower FA and higher AD, RD, and MD in a wide range of white matter, mainly including the corona radiata, internal capsule, superior longitudinal fasciculus, external capsule, inferior fronto-occipital fasciculus, posterior thalamic radiation, sagittal stratum, fornix and stria terminalis, cerebral peduncle, superior cerebellar peduncle, corpus callosum, and corticospinal tract. The MA group had significantly higher total, attention, and motor scores than healthy controls. Higher MD in the right corticospinal tract was significantly associated with higher total scores. Conclusion: MA addicts exhibit globally diminished white matter integrity. Furthermore, they present with high levels of impulsivity, and this dysfunction is associated with MD in the corticospinal tracts. Future studies with larger sample sizes, analyses of gender effects, and longitudinal designs are needed.
Published: 2020

49. Risk of Gastrointestinal Adverse Events in Cancer Patients Treated With Immune Checkpoint Inhibitor Plus Chemotherapy: A Systematic Review and Meta-Analysis

Author: Wenhan Yang, Peng Men, Huimin Xue, Mingyan Jiang, and Qiuhua Luo
Subjects: 0301 basic medicine, Cancer Research, medicine.medical_specialty, cytotoxic T-lymphocyte-associated protein 4, Nausea, medicine.medical_treatment, immune checkpoint inhibitor, Cochrane Library, chemotherapy, lcsh:RC254-282, Gastroenterology, law.invention, 03 medical and health sciences, 0302 clinical medicine, Randomized controlled trial, gastrointestinal adverse events, law, Internal medicine, Medicine, Adverse effect, programmed death 1, Chemotherapy, business.industry, Cancer, lcsh:Neoplasms. Tumors. Oncology. Including cancer and carcinogens, medicine.disease, 030104 developmental biology, Oncology, programmed death ligand 1, 030220 oncology & carcinogenesis, Relative risk, Meta-analysis, Systematic Review, medicine.symptom, business
Abstract: Background: The combination of immune checkpoint inhibitors (ICIs) and chemotherapy can improve clinical outcomes in the treatment of various tumors, but may also be associated with more adverse events (AEs). We performed a systematic review and meta-analysis to characterize the risk of gastrointestinal AEs in cancer patients treated with ICI plus chemotherapy. Methods: This review was based on a comprehensive search through PubMed, EMBASE, and the Cochrane Library for randomized controlled trials (RCTs) that reported gastrointestinal AEs following the use of ICI plus chemotherapy. Literature screening, data extraction, and quality evaluation were performed by two individual reviewers. Revman (version 5.3) was used for meta-analysis. Risk ratios (RR) with 95% confidence interval (CI) were calculated. Meta-analysis was conducted according to different types of ICIs [programmed death 1 (PD-1), programmed death ligand 1 (PD-L1), and cytotoxic T-lymphocyte-associated protein 4 (CTLA-4) inhibitors]. Results: After a full-text review, 10 trials involving 5,142 patients were included in the study. Compared with chemotherapy alone, PD-1 inhibitor plus chemotherapy significantly increased the risk of diarrhea (RR = 1.38, 95% CI, 1.13-1.68, P = 0.001; I² = 0%) and colitis (RR = 2.90, 95% CI, 1.02-8.21, P = 0.050; I² = 0%), PD-L1 inhibitor plus chemotherapy significantly increased the risk of nausea (RR = 1.17, 95% CI, 1.02-1.35, P = 0.020; I² = 0%), while CTLA-4 inhibitor plus chemotherapy significantly increased the risk of decreased appetite (RR = 1.49, 95% CI, 1.17-1.90, P = 0.001; I² = 0%), diarrhea (RR = 2.23, 95% CI, 1.90-2.63, P < 0.00001; I² = 0%), and colitis (RR = 28.39, 95% CI, 5.59-144.24, P < 0.001; I² = 0%). Conclusions: This meta-analysis demonstrated that ICI plus chemotherapy is associated with a higher risk of gastrointestinal AEs. However, combining different ICIs may lead to diverse gastrointestinal toxicities. Clinicians should be aware of these AEs in the application of ICI plus chemotherapy.
Published: 2020
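
Note: the abstract above summarizes its findings as risk ratios (RR) with 95% confidence intervals. As a rough illustration of how a single-trial RR and its interval can be obtained from a 2x2 table, the sketch below uses the standard log-scale normal approximation. The function name `risk_ratio_ci` and the event counts are hypothetical and are not taken from the review, and pooling across trials (as done in the review) involves additional weighting not shown here.

```python
import math


def risk_ratio_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Risk ratio and approximate 95% CI from 2x2 counts.

    Uses the normal approximation on the log scale:
    SE(log RR) = sqrt(1/a - 1/n1 + 1/c - 1/n2).
    """
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log_rr = math.sqrt(
        1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts for illustration only (not data from the review):
# 30 of 100 patients with the event on combination therapy,
# 20 of 100 on chemotherapy alone.
rr, lower, upper = risk_ratio_ci(30, 100, 20, 100)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

In this made-up example the interval spans 1, so the increase would not count as statistically significant at the 5% level, in contrast to the pooled results quoted in the abstract, whose intervals lie above 1.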

50. Towards Coding for Human and Machine Vision: A Scalable Image Coding Approach

Author: Wenhan Yang, Shuai Yang, Jiaying Liu, Yueyu Hu, and Ling-Yu Duan
Subjects: FOS: Computer and information sciences, Pixel, Computer science, Machine vision, business.industry, Computer Vision and Pattern Recognition (cs.CV), Feature extraction, Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Iterative reconstruction, Generative model, Scalability, Computer vision, Artificial intelligence, business, Coding (social sciences)
Abstract: The past decades have witnessed the rapid development of image and video coding techniques in the era of big data. However, the signal fidelity-driven coding pipeline design limits the capability of the existing image/video coding frameworks to fulfill the needs of both machine and human vision. In this paper, we come up with a novel image coding framework by leveraging both the compressive and the generative models, to support machine vision and human perception tasks jointly. Given an input image, the feature analysis is first applied, and then the generative model is employed to perform image reconstruction with features and additional reference pixels, in which compact edge maps are extracted in this work to connect both kinds of vision in a scalable way. The compact edge map serves as the basic layer for machine vision tasks, and the reference pixels act as a sort of enhanced layer to guarantee signal fidelity for human vision. By introducing advanced generative models, we train a flexible network to reconstruct images from compact feature representations and the reference pixels. Experimental results demonstrate the superiority of our framework in both human visual quality and facial landmark detection, which provide useful evidence on the emerging standardization efforts on MPEG VCM (Video Coding for Machine). Project page: https://williamyang1991.github.io/projects/VCM-Face/
Published: 2020
Full Text: View/download PDF
