946 results for "Huang, Kaizhu"
Search Results
902. Estimating Nonlinear Spatiotemporal Membrane Dynamics in Active Dendrites
- Author
-
Omori, Toshiaki, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Kobsa, Alfred, Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Loo, Chu Kiong, editor, Yap, Keem Siah, editor, Wong, Kok Wai, editor, Teoh, Andrew, editor, and Huang, Kaizhu, editor
- Published
- 2014
- Full Text
- View/download PDF
903. Inter Subject Correlation of Brain Activity during Visuo-Motor Sequence Learning
- Author
-
Miyapuram, Krishna Prasad, Pamnani, Ujjval, Doya, Kenji, Bapi, Raju S., Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Kobsa, Alfred, Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Loo, Chu Kiong, editor, Yap, Keem Siah, editor, Wong, Kok Wai, editor, Teoh, Andrew, editor, and Huang, Kaizhu, editor
- Published
- 2014
- Full Text
- View/download PDF
904. Human Implicit Intent Discrimination Using EEG and Eye Movement
- Author
-
Park, Ukeob, Mallipeddi, Rammohan, Lee, Minho, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Kobsa, Alfred, Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Loo, Chu Kiong, editor, Yap, Keem Siah, editor, Wong, Kok Wai, editor, Teoh, Andrew, editor, and Huang, Kaizhu, editor
- Published
- 2014
- Full Text
- View/download PDF
905. Transfer Entropy and Information Flow Patterns in Functional Brain Networks during Cognitive Activity
- Author
-
Shovon, Md. Hedayetul Islam, Nandagopal, D (Nanda), Vijayalakshmi, Ramasamy, Du, Jia Tina, Cocks, Bernadine, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Kobsa, Alfred, Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Loo, Chu Kiong, editor, Yap, Keem Siah, editor, Wong, Kok Wai, editor, Teoh, Andrew, editor, and Huang, Kaizhu, editor
- Published
- 2014
- Full Text
- View/download PDF
906. Cross-modality interactive attention network for multispectral pedestrian detection.
- Author
-
Zhang, Lu, Liu, Zhiyong, Zhang, Shifeng, Yang, Xu, Qiao, Hong, Huang, Kaizhu, and Hussain, Amir
- Subjects
- *
PEDESTRIANS , *MACHINE learning , *ARCHITECTURE - Abstract
Highlights • A two-stream framework for multispectral pedestrian detection is presented. • A novel cross-modality interactive attention mechanism for fusion is proposed. • The approach exploits the correlations between modalities and fuses features adaptively. • The approach outperforms related works in effectiveness. Abstract Multispectral pedestrian detection is an emerging solution with great promise in many around-the-clock applications, such as automotive driving and security surveillance. To exploit the complementary nature of the modalities and remedy their contradictory appearance, in this paper we propose a novel cross-modality interactive attention network that takes full advantage of the interactive properties of multispectral input sources. Specifically, we first utilize the color (RGB) and thermal streams to build up a detached feature hierarchy for each modality; then, taking the global features, correlations between the two modalities are encoded in the attention module. Next, the channel responses of the halfway feature maps are recalibrated adaptively for the subsequent fusion operation. Our architecture is constructed in a multi-scale format to better deal with different scales of pedestrians, and the whole network is trained in an end-to-end way. The proposed method is extensively evaluated on the challenging KAIST multispectral pedestrian dataset and achieves state-of-the-art performance with high efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
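The channel recalibration step described in the abstract of entry 906 above (globally pooling both streams, encoding their correlation, then rescaling each stream's channels before fusion) can be sketched as a toy SE-style module. All names here are illustrative assumptions; the paper's actual attention module is more elaborate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_modality_recalibrate(rgb_feat, thermal_feat, W):
    """Toy cross-modality channel recalibration.
    rgb_feat, thermal_feat: (C, H, W) halfway feature maps.
    W: (2C, 2C) weight matrix of a hypothetical attention layer.
    Returns both feature maps with channels rescaled by attention
    weights computed from the joint (cross-modality) descriptor."""
    # global average pooling -> one (C,) descriptor per modality
    g_rgb = rgb_feat.mean(axis=(1, 2))
    g_th = thermal_feat.mean(axis=(1, 2))
    # the concatenated descriptor encodes correlation between modalities
    joint = np.concatenate([g_rgb, g_th])           # (2C,)
    attn = sigmoid(W @ joint)                       # (2C,) weights in (0, 1)
    a_rgb, a_th = attn[: len(g_rgb)], attn[len(g_rgb):]
    # channel-wise rescaling before the subsequent fusion operation
    return (rgb_feat * a_rgb[:, None, None],
            thermal_feat * a_th[:, None, None])
```

Because the attention weights lie in (0, 1), the recalibration only attenuates channels; in a trained network the weight matrix learns which channels of each modality to suppress.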
907. IAN: The Individual Aggregation Network for Person Search.
- Author
-
Xiao, Jimin, Xie, Yanchun, Tillo, Tammam, Huang, Kaizhu, Wei, Yunchao, and Feng, Jiashi
- Subjects
- *
AGGREGATION (Statistics) , *PEDESTRIANS , *VISUAL fields , *IDENTIFICATION , *LEARNING - Abstract
Abstract Person search in real-world scenarios is a new and challenging computer vision task with many meaningful applications. The challenge of this task mainly comes from: (1) unavailable bounding boxes for pedestrians, so the model needs to search for the person over whole gallery images; (2) huge variance in the visual appearance of a particular person owing to varying poses, lighting conditions, and occlusions. To address these two critical issues in modern person search applications, we propose a novel Individual Aggregation Network (IAN) that can accurately localize persons by learning to minimize intra-person feature variations. IAN is built upon a state-of-the-art object detection framework, i.e., faster R-CNN, so that high-quality region proposals for pedestrians can be produced in an online manner. In addition, to relieve the negative effect caused by varying visual appearances of the same individual, IAN introduces a novel center loss that can increase the intra-class compactness of feature representations. The engaged center loss encourages persons with the same identity to have similar feature characteristics. Extensive experimental results on two benchmarks, i.e., CUHK-SYSU and PRW, demonstrate the superiority of the proposed model. In particular, IAN achieves 77.23% mAP and 80.45% top-1 accuracy on CUHK-SYSU, outperforming the state-of-the-art by 1.7% and 1.85%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
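The center loss engaged by IAN in entry 907 above (pulling each person's features toward a per-identity center to increase intra-class compactness) can be sketched in a few lines. The moving-average center update is one common implementation choice, not necessarily the paper's exact scheme.

```python
import numpy as np

def center_loss(features, labels, centers):
    """Center loss: half the mean squared distance between each
    feature vector and the center of its identity class.
    features: (N, D) array, labels: (N,) integer identity ids,
    centers: (K, D), one center per identity (assumed to be
    maintained elsewhere, e.g. by a moving average during training)."""
    diffs = features - centers[labels]            # (N, D)
    return 0.5 * np.mean(np.sum(diffs ** 2, axis=1))

def update_centers(features, labels, centers, alpha=0.5):
    """One moving-average update of the class centers toward the
    batch mean of each identity's features."""
    centers = centers.copy()
    for k in np.unique(labels):
        centers[k] += alpha * (features[labels == k].mean(axis=0) - centers[k])
    return centers
```

In training, this loss is added to the detection/identification objective, so gradients push features of the same identity toward a shared center.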
908. Multi-semantic hypergraph neural network for effective few-shot learning.
- Author
-
Chen, Hao, Li, Linyan, Hu, Fuyuan, Lyu, Fan, Zhao, Liuqing, Huang, Kaizhu, Feng, Wei, and Xia, Zhenping
- Subjects
- *
ARTIFICIAL neural networks , *HYPERGRAPHS - Abstract
• Multi-semantic hypergraphs explore higher-order relationships among few samples. • An orthogonalized mapping function helps to obtain rich multi-semantic information. • Multi-semantic distribution information improves the rationality of hypergraphs. • The hypergraph and multi-semantic distribution information are combined with node-edge message passing. Recently, graph-based Few-Shot Learning (FSL) methods exhibit good generalization by mining relations among few samples with Graph Neural Networks. However, most graph-based FSL methods consider only binary relations and ignore the multi-semantic information of the global context knowledge. We propose a framework of Multi-Semantic Hypergraph for FSL (MSH-FSL) to explore complex latent high-order multi-semantic relations among the few samples. By mining the complex relationship structure of multiple nodes and multiple semantics, a more refined feature representation can be learned, which yields better classification robustness. Specifically, we first construct a novel multi-semantic hypergraph by obtaining associated instances with different semantic features via orthogonal mapping. With the constructed hypergraph, we then develop a Hypergraph Neural Network along with a novel multi-generation hypergraph message passing so as to better leverage the complex latent semantic relations among samples. Finally, after a number of generations, the hyper-node representations embedded in the learned hypergraph become more accurate for few-shot prediction. In the 5-way 1-shot task of ResNet-12 on the mini-ImageNet dataset, the multi-semantic hypergraph outperforms the single-semantic graph by 3.1%, and with the proposed semantic-distribution message passing, the improvement reaches 6.1%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
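The node-edge message passing that entry 908 above runs over its hypergraph for several "generations" follows a standard two-phase pattern: hyperedges aggregate their member nodes, then nodes aggregate their incident hyperedges. The sketch below uses plain mean aggregation; the paper's learned, multi-generation variant is more sophisticated.

```python
import numpy as np

def hypergraph_message_passing(X, H, steps=2):
    """Toy node -> hyperedge -> node message passing.
    X: (n, d) node features; H: (n, m) binary incidence matrix with
    H[i, e] = 1 if node i belongs to hyperedge e.
    Each step: every hyperedge averages its member nodes, then every
    node averages the features of the hyperedges it belongs to."""
    deg_e = H.sum(axis=0)                 # members per hyperedge (m,)
    deg_v = H.sum(axis=1)                 # hyperedges per node (n,)
    for _ in range(steps):
        E = (H.T @ X) / deg_e[:, None]    # hyperedge features (m, d)
        X = (H @ E) / deg_v[:, None]      # updated node features (n, d)
    return X
```

Unlike an ordinary graph edge, a hyperedge can connect any number of nodes, which is how higher-order (not just binary) relations among the few samples are captured.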
909. Editorial: Collaborative Computing for Data-Driven Systems.
- Author
-
Wang, Xinheng, Iqbal, Muddesar, Gao, Honghao, Huang, Kaizhu, and Tchernykh, Andrei
- Subjects
- *
COMPUTER systems - Abstract
Editorial: Over the last few years, owing to the development, deployment, and use of Internet of Things (IoT) systems and smart devices, a large volume of data has been generated from various operation systems. With the commercialization of 5G networks and their guarantee of transmitting large volumes of data with short delay, more applications could be developed over the next few years to change the way we live and work. This issue features six selected high-quality papers, ranging from new methodologies to handle the data and make better decisions, better methodologies to process data inside and across systems, and methods to process the low-quality but complex data generated by end users, which is normally the case for crowdsensing systems, to cost-effective ways to provide services that benefit both service providers and end users. [Extracted from the article]
- Published
- 2020
- Full Text
- View/download PDF
910. Extension I: BMPM for Imbalanced Learning
- Author
-
Huang, Kaizhu, Yang, Haiqin, King, Irwin, and Lyu, Michael
- Published
- 2008
- Full Text
- View/download PDF
911. Extension II: A Regression Model from M4
- Author
-
Huang, Kaizhu, Yang, Haiqin, King, Irwin, and Lyu, Michael
- Published
- 2008
- Full Text
- View/download PDF
912. Conclusion and Future Work
- Author
-
Huang, Kaizhu, Yang, Haiqin, King, Irwin, and Lyu, Michael
- Published
- 2008
- Full Text
- View/download PDF
913. Introduction
- Author
-
Huang, Kaizhu, Yang, Haiqin, King, Irwin, and Lyu, Michael
- Published
- 2008
- Full Text
- View/download PDF
914. Extension III: Variational Margin Settings within Local Data in Support Vector Regression
- Author
-
Huang, Kaizhu, Yang, Haiqin, King, Irwin, and Lyu, Michael
- Published
- 2008
- Full Text
- View/download PDF
915. A General Global Learning Model: MEMPM
- Author
-
Huang, Kaizhu, Yang, Haiqin, King, Irwin, and Lyu, Michael
- Published
- 2008
- Full Text
- View/download PDF
916. Global Learning vs. Local Learning
- Author
-
Huang, Kaizhu, Yang, Haiqin, King, Irwin, and Lyu, Michael
- Published
- 2008
- Full Text
- View/download PDF
917. Learning Locally and Globally: Maxi-Min Margin Machine
- Author
-
Huang, Kaizhu, Yang, Haiqin, King, Irwin, and Lyu, Michael
- Published
- 2008
- Full Text
- View/download PDF
918. Deep learning-assisted ultra-accurate smartphone testing of paper-based colorimetric ELISA assays.
- Author
-
Duan, Sixuan, Cai, Tianyu, Zhu, Jia, Yang, Xi, Lim, Eng Gee, Huang, Kaizhu, Hoettges, Kai, Zhang, Quan, Fu, Hao, Guo, Qiang, Liu, Xinyu, Yang, Zuming, and Song, Pengfei
- Subjects
- *
DEEP learning , *MACHINE learning , *SMARTPHONES , *ENZYME-linked immunosorbent assay , *MEDICAL screening , *MOBILE apps - Abstract
The smartphone has long been considered an excellent platform for disease screening and diagnosis, especially when combined with microfluidic paper-based analytical devices (μPADs) that feature low cost, ease of use, and pump-free operation. In this paper, we report a deep learning-assisted smartphone platform for ultra-accurate testing of paper-based microfluidic colorimetric enzyme-linked immunosorbent assays (c-ELISA). Different from existing smartphone-based μPAD platforms, whose sensing reliability suffers from uncontrolled ambient lighting conditions, our platform is able to eliminate those random lighting influences for enhanced sensing accuracy. We first constructed a dataset that contains c-ELISA results (n = 2048) of rabbit IgG as the model target on μPADs under eight controlled lighting conditions. Those images are then used to train four different mainstream deep learning algorithms. By training with these images, the deep learning algorithms can effectively eliminate the influence of lighting conditions. Among them, the GoogLeNet algorithm gives the highest accuracy (>97%) in quantitative rabbit IgG concentration classification/prediction, which also provides a 4% higher area under the curve (AUC) value than the traditional curve-fitting analysis method. In addition, we fully automate the whole sensing process and achieve "image in, answer out" operation to maximize the convenience of the smartphone. A simple and user-friendly smartphone application has been developed that controls the whole process. This newly developed platform further enhances the sensing performance of μPADs for use by laypersons in low-resource areas and can be readily adapted to the detection of real disease protein biomarkers by c-ELISA on μPADs. [Display omitted] • This deep learning-assisted smartphone platform is unaffected by ambient lighting. • A fully automated "image in, answer out" operation fashion.
• A 2048-image custom dataset is used to test 4 mainstream deep learning algorithms. • GoogLeNet provides >97% accuracy in quantitative rabbit IgG testing. • The area under the curve (AUC) is 4% higher than that of conventional methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
919. Approximately optimizing NDCG using pair-wise loss.
- Author
-
Jin, Xiao-Bo, Geng, Guang-Gang, Xie, Guo-Sen, and Huang, Kaizhu
- Subjects
- *
RANKING , *LEARNING , *ALGORITHMS , *PATTERN perception , *BIG data - Abstract
The Normalized Discounted Cumulative Gain (NDCG) is used to measure the performance of ranking algorithms. Much of the work on learning to rank by optimizing NDCG directly or indirectly is based on list-wise approaches. In our work, we approximately optimize a variant of NDCG called NDCG_β using pair-wise approaches. NDCG_β utilizes a linear discounting function. We first prove that the DCG error of NDCG_β is equal to the weighted pair-wise loss; then, on that basis, RankBoost_ndcg and RankSVM_ndcg are proposed to optimize the upper bound of the pair-wise 0–1 loss function. The experimental results from applying our approaches and ten other state-of-the-art methods to five public datasets show the superiority of the proposed methods, especially RankSVM_ndcg. In addition, RankBoost_ndcg is less influenced by the initial weight distribution. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
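The NDCG metric that entry 919 above optimizes, including the linear discounting idea behind its NDCG_β variant, can be sketched as follows. The exact form of the paper's linear discount is not given in the abstract, so the `"linear"` branch below is an illustrative assumption.

```python
import numpy as np

def dcg(relevances, discount="log"):
    """DCG of a ranked list of relevance grades.
    discount="log" is the standard 1/log2(rank+1) discount;
    "linear" is a linearly decaying discount illustrating the
    NDCG_beta idea (the paper's exact form may differ)."""
    rel = np.asarray(relevances, dtype=float)
    ranks = np.arange(1, len(rel) + 1)
    if discount == "log":
        d = 1.0 / np.log2(ranks + 1)
    else:  # linear decay from 1 at rank 1 down to 1/n at rank n
        d = (len(rel) - ranks + 1) / len(rel)
    return float(np.sum((2 ** rel - 1) * d))

def ndcg(relevances, discount="log"):
    """Normalize by the DCG of the ideal (descending) ordering,
    so a perfectly ordered list scores 1.0."""
    best = dcg(sorted(relevances, reverse=True), discount)
    return dcg(relevances, discount) / best if best > 0 else 0.0
```

A pair-wise approach never computes this list-wise score directly during training; instead it penalizes mis-ordered document pairs, which the paper shows can bound the DCG error of the linear-discount variant.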
920. Unsupervised domain adaptation in homogeneous distance space for person re-identification.
- Author
-
Zheng, Dingyuan, Xiao, Jimin, Wei, Yunchao, Wang, Qiufeng, Huang, Kaizhu, and Zhao, Yao
- Subjects
- *
HOMOGENEOUS spaces , *DATA distribution , *INFORMATION resources - Abstract
Data distribution alignment and clustering-based self-training are two feasible solutions to tackle unsupervised domain adaptation (UDA) on person re-identification (re-ID). Most existing alignment-based methods solely learn the source domain decision boundaries and align the data distribution of the target domain to the source domain, thus the re-ID performance on the target domain completely depends on the shared decision boundaries and how well the alignment is performed. However, two domains can hardly be precisely aligned because of the label space discrepancy of two domains, resulting in poor target domain re-ID performance. Although clustering-based self-training approaches could learn independent decision boundaries on the pseudo-labelled target domain data, they ignore both the accurate ID-related information of the labelled source domain data and the underlying relations between two domains. To fully exploit the source domain data to learn discriminative target domain ID-related features, in this paper, we propose a novel cross-domain alignment method in the homogeneous distance space, which is constructed by the newly designed stair-stepping alignment (SSA) matcher. Such alignment method can be integrated into both alignment-based framework and clustering-based framework. Extensive experiments validate the effectiveness of our proposed alignment method in these two frameworks. We achieve superior performance when the proposed alignment module is integrated into the clustering-based framework. Codes will be available at: http://github.com/Dingyuan-Zheng/HDS. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
921. Towards Model Robustness and Generalization Against Adversarial Examples for Deep Neural Networks
- Author
-
Zhang, Shufei, Huang, Kaizhu, and Goulermas, John
- Abstract
Recent years have witnessed the remarkable success of deep neural network (DNN) models spanning a wide range of applications, including image classification, image generation, object detection, and natural language processing. Despite the impressive performance boost over various learning tasks, DNNs are demonstrated to be strikingly vulnerable to certain well-crafted adversarial perturbations. While such perturbations are imperceptible to humans, they can easily mislead the prediction of DNNs with high confidence. Along with the increasing deployment of DNN models in safety-critical scenarios, it becomes extremely crucial to ensure model robustness against potential adversarial attacks. One of the most popular methods to defend against adversarial attacks is adversarial training. In this thesis, we aim to provide new understanding of adversarial examples and analyze current adversarial training methods from the perspectives of latent representation/distribution, smoothness, optimization, and robustness generalization. Moreover, we also analyze the relationship between robustness and generalization. For latent representations, we focus on how to learn robust representations and a latent distribution that retains more of the structural information of the clean data distribution. For smoothness, we describe two methods to promote the latent and output smoothness of deep neural networks and analyze the relationship between smoothness and robust generalization. For optimization of adversarial training, we analyze the drawbacks of adversarial training and introduce a better optimization method for it. For robustness generalization, we analyze why robustness generalization is hard to achieve and introduce a simple but effective method to improve the robustness generalization of adversarial training. Finally, we analyze the relationship between robustness and generalization theoretically and empirically.
- Published
- 2021
- Full Text
- View/download PDF
922. End-to-end weakly supervised semantic segmentation with reliable region mining.
- Author
-
Zhang, Bingfeng, Xiao, Jimin, Wei, Yunchao, Huang, Kaizhu, Luo, Shan, and Zhao, Yao
- Subjects
- *
MINIATURE objects , *ENERGY dissipation , *MINES & mineral resources - Abstract
• We make an extension of our previous work and design a more powerful end-to-end network for weakly supervised semantic segmentation. • We propose two new loss functions for utilizing the reliable labels, including a new dense energy loss and a batch-based class distance loss. The former relies on shallow features, whilst the latter focuses on distinguishing high-level semantic features for different classes. • We design a new attention module to extract comprehensive global information. By using a re-weighting technique, it can suppress dominant or noisy attention values and aggregate sufficient global information. • Our approach achieves a new state-of-the-art performance for weakly supervised semantic segmentation. Weakly supervised semantic segmentation is a challenging task that only takes image-level labels as supervision but produces pixel-level predictions for testing. To address such a challenging task, most current approaches first generate pseudo pixel masks that are then fed into a separate semantic segmentation network. However, these two-step approaches suffer from high complexity and are hard to train as a whole. In this work, we harness the image-level labels to produce reliable pixel-level annotations and design a fully end-to-end network that learns to predict segmentation maps. Concretely, we first leverage an image classification branch to generate class activation maps for the annotated categories, which are further pruned into tiny reliable object/background regions. Such reliable regions are then directly served as ground-truth labels for the segmentation branch, where both global-information and local-information sub-branches are used to generate accurate pixel-level predictions. Furthermore, a new joint loss is proposed that considers both shallow and high-level features.
Despite its apparent simplicity, our end-to-end solution achieves competitive mIoU scores (val: 65.4%, test: 65.3%) on Pascal VOC compared with the two-step counterparts. By extending our one-step method to two steps, we achieve a new state-of-the-art performance on the Pascal VOC 2012 dataset (val: 69.3%, test: 69.2%). Code is available at: https://github.com/zbf1991/RRM. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
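The "reliable region mining" step in entry 922 above (pruning class activation maps into small, confident object/background regions that serve as ground truth, with everything else ignored) can be sketched with simple thresholding. The threshold values here are illustrative, not the paper's.

```python
import numpy as np

def reliable_regions(cam, fg_thresh=0.7, bg_thresh=0.05):
    """Mine reliable regions from a class activation map.
    cam: (H, W) raw activation map for one class.
    Pixels with high normalized activation become reliable foreground,
    very low ones reliable background, and the uncertain middle band
    is ignored when supervising the segmentation branch.
    Returns a (H, W) int label map: 1 = object, 0 = background, -1 = ignore."""
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
    labels = np.full(cam.shape, -1, dtype=int)
    labels[cam >= fg_thresh] = 1
    labels[cam <= bg_thresh] = 0
    return labels
```

Training only on the confident extremes is what makes the pseudo-labels "reliable": the segmentation loss simply skips every pixel labeled -1.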
923. Sparse matrix factorization with [formula omitted] norm for matrix completion.
- Author
-
Jin, Xiaobo, Miao, Jianyu, Wang, Qiufeng, Geng, Guanggang, and Huang, Kaizhu
- Subjects
- *
MATRIX decomposition , *SPARSE matrices , *MATRIX norms , *FACTORIZATION , *DEEP learning , *MATRICES (Mathematics) - Abstract
• We propose two matrix factorization methods, DSMF and ISMF, with the l2,1 norm, where the former directly minimizes the squared Frobenius-norm loss function while the latter indirectly optimizes an upper bound of the Frobenius-norm loss function. • We theoretically prove the convergence property of DSMF and discuss the convergence condition of ISMF. • The experiments on the simulation and benchmark datasets show that our methods achieve comparable performance with deep learning-based matrix completion methods. Matrix factorization is a popular matrix completion method; however, it is difficult to determine the ranks of the factor matrices. We propose two new sparse matrix factorization methods with the l2,1 norm to explicitly force row sparseness of the factor matrices, where the rank of the factor matrices is adaptively controlled by the regularization coefficient. We further theoretically prove the convergence property of our algorithms. The experimental results on the simulation and benchmark datasets show that our methods achieve superior performance over their counterparts. Moreover, our proposed methods can attain comparable performance with the deep learning-based matrix completion methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
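The l2,1 norm used in entry 923 above, and why penalizing it controls rank through row sparsity, is easy to show concretely. The proximal operator below is the standard row-wise soft-thresholding associated with this norm; whether the paper's DSMF/ISMF algorithms use exactly this operator is an assumption.

```python
import numpy as np

def l21_norm(M):
    """l_{2,1} norm: the sum of the Euclidean norms of the rows.
    Penalizing it drives entire rows of a factor matrix to zero,
    which adaptively reduces the effective rank of the factorization."""
    return float(np.sum(np.linalg.norm(M, axis=1)))

def l21_prox(M, t):
    """Proximal operator of t * l_{2,1}: row-wise soft thresholding.
    Rows whose norm is <= t are zeroed out entirely; the rest are
    shrunk toward zero by t, preserving their direction."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return M * scale
```

In a proximal-gradient loop for matrix completion, applying `l21_prox` after each gradient step on the reconstruction loss is what lets the regularization coefficient, rather than a preset rank, decide how many rows of each factor survive.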
924. Evaluation of deep learning and conventional approaches for image steganalysis
- Author
-
Sophia Zhao, Huimin Zhao, Guoliang Xie, Stephen Marshall, Jinchang Ren, Ren, Jinchang, Hussain, Amir, Zhao, Huimin, Huang, Kaizhu, Zheng, Jiangbin, Cai, Jun, Chen, Rongjun, and Xiao, Yinyin
- Subjects
Steganalysis , Steganography , Computer science , Deep learning , Digital content , Pattern recognition , JPEG , Digital media , Binary classification , Artificial intelligence , Classifier (UML)
Steganography is the technique used to embed secret messages into digital media without changing their appearance. As a countermeasure to steganography, steganalysis detects the presence of hidden data in digital content. For the last decade, the majority of image steganalysis approaches have consisted of two stages. The first stage is to extract effective features from the image content, and the second is to train a classifier in machine learning using the features from stage one. Ultimately, image steganalysis becomes a binary classification problem. Since deep learning architectures unify these two stages and save researchers the effort of designing hand-crafted features, the design of CNN-based steganalyzers has received increasing attention over the past few years. In this paper, we examine developments in image steganalysis, both in the spatial domain and in the JPEG domain, and discuss future directions.
- Published
- 2020
925. Person Re-identification and Tracking in Video Surveillance
- Author
-
Xie, Yanchun, Xiao, Jimin, Huang, Kaizhu, and Luo, Shan
- Abstract
The video surveillance system is one of the most essential topics in the computer vision field. With the rapid and continuous increase in the use of video surveillance cameras to obtain portrait information in scenes, it has become a very important system for security and criminal investigations. A video surveillance system involves many key technologies, including object recognition, object localization, object re-identification, and object tracking, by which the system can identify and track the movements of objects and persons. In recent years, person re-identification and visual object tracking have become hot research directions in the computer vision field. The re-identification system aims to recognize and identify a target with the required attributes, and the tracking system aims at following and predicting the movement of the target after the identification process. Researchers have used deep learning and computer vision technologies to significantly improve the performance of person re-identification. However, the study of person re-identification remains challenging due to complex application environments such as lighting variations, complex background transformations, low-resolution images, occlusions, and similar dressing of different pedestrians. The challenge of this task also comes from unavailable bounding boxes for pedestrians and the need to search for the person over whole gallery images. To address these critical issues in modern person identification applications, we propose an algorithm that can accurately localize persons by learning to minimize intra-person feature variations. We build our model upon the state-of-the-art object detection framework, i.e., faster R-CNN, so that high-quality region proposals for pedestrians can be produced in an online manner.
In addition, to relieve the negative effects caused by varying visual appearances of the same individual, we introduce a novel center loss that can increase the intra-class compactness of feature representations. The engaged center loss encourages persons with the same identity to have similar feature characteristics. Besides the localization of a single person, we explore the more general visual object tracking problem. The main task of visual object tracking is to predict the location and size of the tracking target accurately and reliably in subsequent image sequences when the target is given at the beginning of the sequence. A visual object tracking algorithm with high accuracy, good stability, and fast inference speed is necessary. In this thesis, we study the updating problem for two kinds of tracking algorithms among the mainstream tracking approaches and improve their robustness and accuracy. Firstly, we extend the siamese tracker with a model updating mechanism to improve its tracking robustness. A siamese tracker uses a deep convolutional neural network to obtain features and compares the new frame's features with the target features in the first frame. The candidate region with the highest similarity score is considered the tracking result. However, this kind of tracker is not robust against large target variation due to the no-update matching strategy during the whole tracking process. To combat this defect, we propose an ensemble siamese tracker, where the final similarity score is also affected by the similarity with tracking results in recent frames instead of solely considering the first frame. Tracking results in recent frames are used to adjust the model to continuous target change. Meanwhile, we combine an adaptive candidate sampling strategy with a large-displacement optical flow method to further improve its performance.
Secondly, we investigate the classic correlation filter based tracking algorithm and propose a better model selection strategy using reinforcement learning. The correlation filter has proven to be a useful tool for a number of approaches in visual tracking, particularly for seeking a good balance between tracking accuracy and speed. However, correlation filter based models are susceptible to wrong updates stemming from inaccurate tracking results. To date, little effort has been devoted to handling the correlation filter update problem. In our approach, we update and maintain multiple correlation filter models in parallel, and we use deep reinforcement learning to select an optimal correlation filter model among them. To facilitate the decision process efficiently, we propose a decision-net to deal with target appearance modeling, which is trained on hundreds of challenging videos using proximal policy optimization and a lightweight learning network. An exhaustive evaluation of the proposed approach on the OTB100 and OTB2013 benchmarks shows the effectiveness of our approach.
- Published
- 2020
- Full Text
- View/download PDF
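The ensemble matching idea in the thesis abstract of entry 925 above (letting the final similarity score depend on recent-frame tracking results as well as the first-frame template) can be sketched as a weighted blend of similarities. The cosine similarity and the blending weight here are illustrative choices, not the thesis's actual formulation.

```python
import numpy as np

def ensemble_similarity(candidate, first_template, recent_templates, w_first=0.6):
    """Toy ensemble score for a siamese tracker.
    candidate: (D,) feature of a candidate region.
    first_template: (D,) feature of the target in the first frame.
    recent_templates: list of (D,) features from recent tracking results.
    The score blends similarity to the fixed first-frame template with
    the average similarity to recent results, so the matcher adapts to
    gradual target change without discarding the original appearance."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    s_first = cos(candidate, first_template)
    s_recent = np.mean([cos(candidate, t) for t in recent_templates])
    return w_first * s_first + (1.0 - w_first) * s_recent
```

The tracker would evaluate this score for every candidate region in the new frame and pick the maximum; a candidate matching only the stale first-frame template, or only the possibly drifted recent ones, scores lower than one matching both.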
926. Sequence Similarity Alignment Algorithm in Bioinformatics: Techniques and Challenges
- Author
-
Yuren Liu, Yijun Yan, Stephen Marshall, Jinchang Ren, Ren, Jinchang, Hussain, Amir, Zhao, Huimin, Huang, Kaizhu, Zheng, Jiangbin, Cai, Jun, Chen, Rongjun, and Xiao, Yinyin
- Subjects
Structure (mathematical logic) , Computer science , Information processing , Sequence alignment , Function (mathematics) , Residual , Bioinformatics , Similarity (network science) , Simple (abstract algebra) , Algorithm , Sequence (medicine)
Sequence similarity alignment is a basic information processing method in bioinformatics. It is very important for discovering information about function, structure, and evolution in biological sequences. The main idea is to use a specific mathematical model or algorithm to find the maximum number of matching bases or residues between two or more sequences. The alignment results reflect the extent to which the algorithm captures the similarity relationship between sequences and their biological characteristics. Therefore, simple and effective sequence similarity alignment algorithms in bioinformatics have always been a concern of biologists. This paper reviews some widely used sequence alignment algorithms, including pairwise alignment and multiple-sequence alignment, and also introduces a method to call genetic variants from next-generation sequencing data.
- Published
- 2020
927. Learning Density Models via Structured Latent Variables
- Author
-
Yang, X and Huang, Kaizhu
- Abstract
As one principal approach to machine learning and cognitive science, the probabilistic framework has been continuously developed both theoretically and practically. Learning a probabilistic model can be thought of as inferring plausible models to explain observed data. The learning process exploits random variables as building blocks which are held together with probabilistic relationships. The key idea behind latent variable models is to introduce latent variables as powerful attributes (setting/instrument) to reveal data structures and explore underlying features which can sensitively describe the real-world data. The classical research approaches engage shallow architectures, including latent feature models and finite mixtures of latent variable models. Within the classical frameworks, we should make certain assumptions about the form, structure, and distribution of the data. Since the shallow form may not describe the data structures sufficiently, new types of latent structures are promptly developed with the probabilistic frameworks. In this line, three main research interests are sparked, including infinite latent feature models, mixtures of the mixture models, and deep models. This dissertation summarises our work which is advancing the state-of-the-art in both classical and emerging areas. In the first block, a finite latent variable model with the parametric priors is presented for clustering and is further extended into a two-layer mixture model for discrimination. These models embed the dimensionality reduction in their learning tasks by designing a latent structure called common loading. Referred to as the joint learning models, these models attain more appropriate low-dimensional space that better matches the learning task. Meanwhile, the parameters are optimised simultaneously for both the low-dimensional space and model learning. 
However, these joint learning models must fix the number of features and mixtures in advance, values that are normally tuned by trial and error. In general, fixing more parameters simplifies inference, but it also limits model flexibility, and false assumptions can even lead to incorrect inferences from the data. A richer model is therefore desirable to reduce the number of assumptions, so an infinite tri-factorisation structure with non-parametric priors is proposed in the second block. This model can automatically determine an optimal number of features and leverage the interrelation between data and features. In the final block, we show how to extend shallow latent structures to deep structures that handle richer structured data. This part includes two tasks: a layer-wise model and a deep autoencoder-based model. In a deep density model, the knowledge of cognitive agents can be modelled with more complex probability distributions, while inference and parameter estimation remain straightforward through a greedy layer-wise algorithm. The deep autoencoder-based joint learning model is trained end-to-end, which does not require pre-training of the autoencoder network; it can also be optimised by standard backpropagation without maximum a posteriori inference. Deep generative models are much more efficient than their shallow counterparts for unsupervised and supervised density learning tasks, and they can also be developed and used in various practical applications.
- Published
- 2019
- Full Text
- View/download PDF
928. Deep Learning from Smart City Data
- Author
-
Chen, Qi, Wang, Wei, Coenen, Frans, and Huang, Kaizhu
- Abstract
Rapid urbanisation brings severe challenges to sustainable development and the living quality of urban residents. Smart cities develop holistic solutions for urban ecosystems using data collected from different types of Internet of Things (IoT) sources. Today, smart city research and applications have surged significantly as a consequence of advances in IoT and machine learning technologies. As advanced machine learning methods, deep learning techniques provide an effective framework that facilitates data mining and knowledge discovery, especially in computer vision and natural language processing. In recent years, researchers from various fields have attempted to apply deep learning technologies to smart city applications in order to establish a new smart city era. Much research effort has been devoted to smart city domains such as intelligent transportation, smart healthcare, and public safety. Meanwhile, many challenges remain, as deep learning techniques are still immature for smart city applications. In this thesis, we first provide a review of the latest research on the convergence of deep learning and smart cities for data processing. The review is conducted from two perspectives: the technique-oriented view presents the popular and extended deep learning models, while the application-oriented view focuses on the representative application domains in smart cities. We then focus on two areas, intelligent transportation and social media analysis, to demonstrate how deep learning can be used in real-world applications by addressing some prominent issues, e.g., external knowledge integration, multi-modal knowledge fusion, and semi-supervised or unsupervised learning. In the intelligent transportation area, an attention-based recurrent neural network is proposed to learn from traffic flow readings and external factors for multi-step prediction. 
More specifically, the attention mechanism is used to model the dynamic temporal dependencies of traffic flow data, and a general fusion component is designed to incorporate the external factors. For the traffic event detection task, a multi-modal Generative Adversarial Network (mmGAN) is designed. The proposed model contains a sensor encoder and a social encoder to learn from both traffic flow sensor data and social media data. Meanwhile, the mmGAN model is extended to a semi-supervised architecture by leveraging generative adversarial training to further learn from unlabelled data. In the social media analysis area, three deep neural models are proposed for crisis-related data classification and COVID-19 tweet analysis. We designed an adversarial training method that generates adversarial examples for image and textual social data to improve the robustness of multi-modal learning. As most social media data related to crises or COVID-19 are not labelled, we then proposed two unsupervised text classification models on the basis of the state-of-the-art BERT model, using the adversarial domain adaptation technique and the zero-shot learning framework to extract knowledge from a large amount of unlabelled social media data. To demonstrate the effectiveness of our proposed solutions for smart city applications, we collected a large amount of real-time, publicly available traffic sensor data from the California Department of Transportation and social media data (i.e., traffic, crisis, and COVID-19) from Twitter, and built several datasets for examining prediction and classification performance. The proposed methods successfully address the limitations of existing approaches and outperform popular baseline methods on these real-world datasets. We hope this work moves the relevant research one step further towards creating true intelligence for smart cities.
929. Person Re-identification with and without Supervision
- Author
-
Zheng, Dingyuan, Xiao, Jimin, Huang, Kaizhu, and Huang, Xiaowei
930. Learning and Leveraging Structured Knowledge from User-Generated Social Media Data
- Author
-
Dong, Hang, Wang, Wei, Coenen, Frans, and Huang, Kaizhu
- Abstract
Knowledge has long been a crucial element in Artificial Intelligence (AI), which can be traced back to knowledge-based systems, or expert systems, in the 1960s. Knowledge provides contexts to facilitate machine understanding and improves the explainability and performance of many semantic-based applications. The acquisition of knowledge is, however, a complex step, normally requiring much effort and time from domain experts. In machine learning, as one key domain of AI, the learning and leveraging of structured knowledge, such as ontologies and knowledge graphs, have become popular in recent years with the advent of massive user-generated social media data. The main hypothesis in this thesis is therefore that a substantial amount of useful knowledge can be derived from user-generated social media data. A popular, common type of social media data is social tagging data, accumulated from users' tagging in social media platforms. Social tagging data exhibit unstructured characteristics, including noisiness, flatness, sparsity, and incompleteness, which hinder efficient knowledge discovery and usage. The aim of this thesis is thus to learn useful structured knowledge from social media data in spite of these unstructured characteristics. Several research questions have been formulated relating to the hypothesis and the research challenges. A knowledge-centred view is taken throughout this thesis: knowledge bridges the gap between massive user-generated data and semantic-based applications. The study first reviews concepts related to structured knowledge, and then focuses on two main parts: learning structured knowledge and leveraging structured knowledge from social tagging data. To learn structured knowledge, a machine learning system is proposed to predict subsumption relations from social tags. 
The main idea is to learn to predict accurate relations with features generated by probabilistic topic modelling and founded on a formal set of assumptions for deriving subsumption relations. Tag concept hierarchies can then be organised to enrich existing Knowledge Bases (KBs), such as DBpedia and the ACM Computing Classification System. The study presents relation-level evaluation, ontology-level evaluation, and a novel Knowledge Base Enrichment-based evaluation, and shows that the proposed approach can generate high-quality, meaningful hierarchies to enrich existing KBs. To leverage the structured knowledge of tags, the research focuses on the task of automated social annotation and proposes a knowledge-enhanced deep learning model. Semantic-based loss regularisation is proposed to enhance the deep learning model with the similarity and subsumption relations between tags. In addition, a novel guided attention mechanism is proposed to mimic users' behaviour of reading the title before digesting the content for annotation. The integrated model, Joint Multi-label Attention Network (JMAN), significantly outperformed state-of-the-art and popular baseline methods on four real-world datasets, with consistent performance gains from the semantic-based loss regularisers across several deep learning models. With careful treatment of the unstructured characteristics and with the novel probabilistic and neural network based approaches, useful knowledge can be learned from user-generated social media data and leveraged to support semantic-based applications. This validates the hypothesis of the research and addresses the research questions. Future studies will explore methods to efficiently learn and leverage other types of structured knowledge and to extend the current approaches to other user-generated data.
931. Open-Pose 3D zero-shot learning: Benchmark and challenges.
- Author
-
Zhao W, Yang G, Zhang R, Jiang C, Yang C, Yan Y, Hussain A, and Huang K
- Abstract
With the explosive growth of 3D data, the urgency of utilizing zero-shot learning to facilitate data labeling becomes evident. Recently, methods transferring language or language-image pre-training models such as Contrastive Language-Image Pre-training (CLIP) to 3D vision have made significant progress in the 3D zero-shot classification task. These methods primarily focus on 3D object classification with an aligned pose; such a setting is, however, rather restrictive, as it overlooks the recognition of 3D objects with open poses typically encountered in real-world scenarios, such as an overturned chair or a lying teddy bear. To this end, we propose a more realistic and challenging scenario named open-pose 3D zero-shot classification, focusing on the recognition of 3D objects regardless of their orientation. First, we revisit the current research on 3D zero-shot classification and propose two benchmark datasets specifically designed for the open-pose setting. We empirically validate many of the most popular methods on the proposed open-pose benchmark. Our investigations reveal that most current 3D zero-shot classification models suffer from poor performance, indicating substantial room for exploration in this new direction. Furthermore, we study a concise pipeline with an iterative angle refinement mechanism that automatically optimizes one ideal angle to classify these open-pose 3D objects. In particular, to make the validation more compelling and not limited to existing CLIP-based methods, we also pioneer the exploration of knowledge transfer based on diffusion models. While the proposed solutions can serve as a new benchmark for open-pose 3D zero-shot classification, we discuss the complexities and challenges of this scenario that remain for future research. 
The code is available publicly at https://github.com/weiguangzhao/Diff-OP3D., Competing Interests: Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper., (Copyright © 2024. Published by Elsevier Ltd.)
- Published
- 2024
- Full Text
- View/download PDF
932. EgPDE-Net: Building Continuous Neural Networks for Time Series Prediction With Exogenous Variables.
- Author
-
Gao P, Yang X, Zhang R, Guo P, Goulermas JY, and Huang K
- Abstract
While exogenous variables have a major impact on performance improvement in time series analysis, interseries correlation and time dependence among them are rarely considered in existing continuous methods. The dynamical systems underlying multivariate time series can be modeled with complex unknown partial differential equations (PDEs), which play a prominent role in many disciplines of science and engineering. In this article, we propose a continuous-time model for arbitrary-step prediction that learns an unknown PDE system in multivariate time series whose governing equations are parameterized by self-attention and gated recurrent neural networks. The proposed model, the exogenous-guided PDE network (EgPDE-Net), takes into account the relationships among the exogenous variables and their effects on the target series. Importantly, the model can be reduced to a regularized ordinary differential equation (ODE) problem with specially designed regularization guidance, which makes the PDE problem tractable for numerical solution and makes it feasible to predict multiple future values of the target series at arbitrary time points. Extensive experiments demonstrate that our proposed model achieves competitive accuracy over strong baselines: on average, it outperforms the best baseline for arbitrary-step prediction by reducing RMSE by 9.85% and MAE by 13.98%.
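Once a continuous-time model has been reduced to an ODE, "arbitrary-step prediction" amounts to numerically integrating the learned dynamics up to any requested time point. The following is a minimal generic sketch of that idea, not the EgPDE-Net implementation: the `dynamics` function, step size, and query times are all hypothetical stand-ins for the learned right-hand side.

```python
# Illustrative sketch: once a continuous-time model reduces the learned system to
# an ODE dy/dt = f(t, y), arbitrary-step prediction is numerical integration.
# `dynamics` below is a stand-in for a learned (parameterized) right-hand side.

def rk4_step(f, t, y, h):
    """One classical Runge-Kutta (RK4) step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def predict_at(f, y0, t0, query_times, h=0.01):
    """Integrate from (t0, y0) and record the state at arbitrary time points."""
    t, y, out = t0, y0, []
    for tq in sorted(query_times):
        while t < tq - 1e-12:
            step = min(h, tq - t)  # shorten the last step to land exactly on tq
            y = rk4_step(f, t, y, step)
            t += step
        out.append(y)
    return out

# Toy dynamics dy/dt = -y (exact solution y0 * exp(-t)); a learned network
# would replace this closed form.
preds = predict_at(lambda t, y: -y, y0=1.0, t0=0.0, query_times=[0.5, 1.0, 2.0])
```

In the actual model the right-hand side would be a neural network conditioned on the exogenous series, but the integration loop above is what makes prediction at arbitrary time points possible.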
- Published
- 2024
- Full Text
- View/download PDF
933. ES-GNN: Generalizing Graph Neural Networks Beyond Homophily With Edge Splitting.
- Author
-
Guo J, Huang K, Zhang R, and Yi X
- Abstract
While Graph Neural Networks (GNNs) have achieved enormous success in multiple graph analytical tasks, modern variants mostly rely on the strong inductive bias of homophily. However, real-world networks typically exhibit both homophilic and heterophilic linking patterns, wherein adjacent nodes may share dissimilar attributes and distinct labels. Therefore, GNNs that smooth node proximity holistically may aggregate both task-relevant and irrelevant (even harmful) information, limiting their ability to generalize to heterophilic graphs and potentially causing non-robustness. In this work, we propose a novel Edge Splitting GNN (ES-GNN) framework to adaptively distinguish between graph edges either relevant or irrelevant to learning tasks. This essentially transforms the original graph into two subgraphs with the same node set but complementary edge sets, dynamically. Given that, information propagation on these subgraphs and edge splitting are conducted alternately, thus disentangling the task-relevant and irrelevant features. Theoretically, we show that ES-GNN can be regarded as a solution to a disentangled graph denoising problem, which further illustrates our motivation and interprets the improved generalization beyond homophily. Extensive experiments over 11 benchmark and 1 synthetic datasets not only demonstrate the effective performance of ES-GNN but also highlight its robustness to adversarial graphs and its mitigation of the over-smoothing problem.
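The edge-splitting idea — one graph becoming two subgraphs with the same node set but complementary edge sets — can be sketched in a few lines. In ES-GNN the per-edge relevance scores are learned end-to-end and the split is updated dynamically; the fixed score dictionary and threshold below are assumptions made only for this illustration.

```python
# Illustrative sketch of edge splitting: the original graph is partitioned into
# two subgraphs that share the node set but have complementary edge sets, one
# for task-relevant edges and one for task-irrelevant ones. In ES-GNN the
# relevance scores are learned; here a precomputed dict stands in for them.

def split_edges(nodes, edges, relevance, threshold=0.5):
    """Partition `edges` by relevance score into two complementary subgraphs."""
    relevant = [e for e in edges if relevance[e] >= threshold]
    irrelevant = [e for e in edges if relevance[e] < threshold]
    # Both subgraphs keep the full node set, as in ES-GNN.
    return (nodes, relevant), (nodes, irrelevant)

nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
scores = {(0, 1): 0.9, (1, 2): 0.2, (2, 3): 0.8, (0, 3): 0.1}
(task_nodes, task_edges), (_, noise_edges) = split_edges(nodes, edges, scores)
```

Message passing would then run separately on the two edge sets, so that heterophilic ("irrelevant") links no longer pollute the task-relevant representation.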
- Published
- 2024
- Full Text
- View/download PDF
934. Instance-Specific Model Perturbation Improves Generalized Zero-Shot Learning.
- Author
-
Yang G, Huang K, Zhang R, and Yang X
- Abstract
Zero-shot learning (ZSL) refers to the design of predictive functions on new classes (unseen classes) of data that have never been seen during training. In a more practical scenario, generalized zero-shot learning (GZSL) requires predicting both seen and unseen classes accurately. In the absence of target samples, many GZSL models may overfit the training data and are inclined to predict individuals as categories that have been seen in training. To alleviate this problem, we develop a parameter-wise adversarial training process that promotes robust recognition of seen classes, while designing a novel model perturbation mechanism at test time to ensure sufficient sensitivity to unseen classes. Concretely, adversarial perturbation is conducted on the model to obtain instance-specific parameters so that predictions can be biased towards unseen classes at test time. Meanwhile, the robust training encourages model robustness, leaving predictions for seen classes nearly unaffected. Moreover, perturbations in the parameter space, computed from multiple individuals simultaneously, can be used to avoid perturbations that are too extreme and would ruin the predictions. Comparison results on four benchmark ZSL data sets show the effective improvement that the proposed framework makes on zero-shot methods with learned metrics., (© 2024 Massachusetts Institute of Technology.)
- Published
- 2024
- Full Text
- View/download PDF
935. Learning Disentangled Graph Convolutional Networks Locally and Globally.
- Author
-
Guo J, Huang K, Yi X, and Zhang R
- Abstract
Graph convolutional networks (GCNs) have emerged as the most successful learning models for graph-structured data. Despite their success, existing GCNs usually ignore the entangled latent factors typically arising in real-world graphs, which results in nonexplainable node representations. Even worse, while the emphasis has been placed on local graph information, the global knowledge of the entire graph is lost to a certain extent. In this work, to address these issues, we propose a novel framework for GCNs, termed LGD-GCN, which takes advantage of both local and global information for disentangling node representations in the latent space. Specifically, we propose to represent a disentangled latent continuous space with a statistical mixture model, by leveraging a neighborhood routing mechanism locally. From the latent space, various new graphs can then be disentangled and learned, to reflect the hidden structures with respect to different factors. On the one hand, a novel regularizer is designed to encourage interfactor diversity for model expressivity in the latent space. On the other hand, the factor-specific information is encoded globally via message passing along these new graphs, in order to strengthen intrafactor consistency. Extensive evaluations on both synthetic and five benchmark datasets show that LGD-GCN brings significant performance gains over recent competitive models in both disentangling and node classification. In particular, LGD-GCN outperforms the disentangled state-of-the-art models by an average of 7.4% on social network datasets.
- Published
- 2024
- Full Text
- View/download PDF
936. FastAdaBelief: Improving Convergence Rate for Belief-Based Adaptive Optimizers by Exploiting Strong Convexity.
- Author
-
Zhou Y, Huang K, Cheng C, Wang X, Hussain A, and Liu X
- Abstract
AdaBelief, one of the current best optimizers, demonstrates superior generalization ability over the popular Adam algorithm by adapting the step size according to the "belief" in observed gradients, i.e., the deviation of the gradient from its exponential moving average. AdaBelief is theoretically appealing in that it has a data-dependent O(√T) regret bound when objective functions are convex, where T is the time horizon. It remains, however, an open problem whether the convergence rate can be further improved without sacrificing generalization ability. To this end, we make a first attempt in this work and design a novel optimization algorithm called FastAdaBelief that exploits strong convexity in order to achieve an even faster convergence rate. In particular, by adjusting the step size to better account for strong convexity and prevent fluctuation, our proposed FastAdaBelief demonstrates excellent generalization ability and superior convergence. As an important theoretical contribution, we prove that FastAdaBelief attains a data-dependent O(logT) regret bound, which is substantially lower than that of AdaBelief in strongly convex cases. On the empirical side, we validate our theoretical analysis with extensive experiments in both strongly convex and nonconvex scenarios using three popular baseline models. The experimental results are very encouraging: FastAdaBelief converges fastest among all mainstream algorithms while maintaining excellent generalization ability, in both strongly convex and nonconvex cases. FastAdaBelief is, thus, posited as a new benchmark model for the research community.
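To make the "belief"-based update and the strong-convexity idea concrete, here is a minimal sketch of an AdaBelief-style optimizer whose step size decays roughly like O(1/t), the schedule that theory suggests for strongly convex objectives. This is an illustration of the general mechanism, not the published FastAdaBelief algorithm; all hyperparameter values are assumptions.

```python
# Illustrative sketch (not the published algorithm): an AdaBelief-style update
# whose step size decays roughly like O(1/t), as suggested by theory for
# strongly convex objectives. All hyperparameters here are assumed values.

def belief_style_minimize(grad, x0, steps=1000, alpha0=0.3, decay=0.01,
                          beta1=0.9, beta2=0.999, eps=1e-8):
    x, m, s = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g              # EMA of gradients
        s = beta2 * s + (1 - beta2) * (g - m) ** 2   # "belief": deviation from EMA
        m_hat = m / (1 - beta1 ** t)                 # bias corrections
        s_hat = s / (1 - beta2 ** t)
        alpha_t = alpha0 / (1 + decay * t)           # decaying step for strong convexity
        x -= alpha_t * m_hat / (s_hat ** 0.5 + eps)
    return x

# Strongly convex toy objective f(x) = (x - 3)^2, with gradient 2(x - 3).
x_star = belief_style_minimize(lambda x: 2 * (x - 3), x0=0.0)
```

On this quadratic the iterate settles close to the minimizer x = 3; with a constant step size the oscillation around the optimum would not shrink as quickly.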
- Published
- 2023
- Full Text
- View/download PDF
937. Mind the Gap: Alleviating Local Imbalance for Unsupervised Cross-Modality Medical Image Segmentation.
- Author
-
Su Z, Yao K, Yang X, Wang Q, Yan Y, Sun J, and Huang K
- Subjects
- Humans, Image Processing, Computer-Assisted, Heart, Semantics
- Abstract
Unsupervised cross-modality medical image adaptation aims to alleviate the severe domain gap between different imaging modalities without using target domain labels. A key step in this campaign is aligning the distributions of the source and target domains. One common attempt is to enforce global alignment between the two domains, which, however, ignores the fatal local-imbalance domain gap problem, i.e., some local features with a larger domain gap are harder to transfer. Recently, some methods have conducted alignment focusing on local regions to improve the efficiency of model learning, but this operation may discard critical contextual information. To tackle this limitation, we propose a novel strategy to alleviate the domain gap imbalance considering the characteristics of medical images, namely Global-Local Union Alignment. Specifically, a feature-disentanglement style-transfer module first synthesizes target-like source images to reduce the global domain gap. Then, a local feature mask is integrated to reduce the 'inter-gap' for local features by prioritizing those discriminative features with a larger domain gap. This combination of global and local alignment can precisely localize the crucial regions in the segmentation target while preserving the overall semantic consistency. We conduct a series of experiments on two cross-modality adaptation tasks, i.e., cardiac substructure and abdominal multi-organ segmentation. Experimental results indicate that our method achieves state-of-the-art performance in both tasks.
- Published
- 2023
- Full Text
- View/download PDF
938. Machine learning and 3D bioprinting.
- Author
-
Sun J, Yao K, An J, Jing L, Huang K, and Huang D
- Abstract
With the growing number of biomaterials and printing technologies, bioprinting has brought about tremendous potential to fabricate biomimetic architectures or living tissue constructs. To make bioprinting and bioprinted constructs more powerful, machine learning (ML) is introduced to optimize the relevant processes, applied materials, and mechanical/biological performances. The objectives of this work were to collate, analyze, categorize, and summarize published articles and papers pertaining to ML applications in bioprinting and their impact on bioprinted constructs, as well as the directions of potential development. From the available references, both traditional ML and deep learning (DL) have been applied to optimize the printing process, structural parameters, material properties, and biological/mechanical performance of bioprinted constructs. The former uses features extracted from image or numerical data as inputs in prediction model building, and the latter uses the image directly for segmentation or classification model building. All of these studies present advanced bioprinting with a stable and reliable printing process, desirable fiber/droplet diameter, and precise layer stacking, and also enhance the bioprinted constructs with better design and cell performance. The current challenges and outlooks in developing process-material-performance models are highlighted, which may pave the way for revolutionizing bioprinting technologies and bioprinted construct design., Competing Interests: The authors declare no conflicts of interest., (Copyright: © 2023, Sun J, Yao K, An J, et al.)
- Published
- 2023
- Full Text
- View/download PDF
939. Rebalanced Zero-Shot Learning.
- Author
-
Ye Z, Yang G, Jin X, Liu Y, and Huang K
- Abstract
Zero-shot learning (ZSL) aims to identify unseen classes with zero samples during training. Broadly speaking, present ZSL methods usually adopt class-level semantic labels and compare them with instance-level semantic predictions to infer unseen classes. However, we find that such existing models mostly produce imbalanced semantic predictions, i.e., these models may perform precisely for some semantics but not for others. To address this drawback, we aim to introduce an imbalanced learning framework into ZSL. However, we find that imbalanced ZSL has two unique challenges: (1) its imbalanced predictions are highly correlated with the values of semantic labels rather than the number of samples, as typically considered in traditional imbalanced learning; (2) different semantics follow quite different error distributions between classes. To mitigate these issues, we first formalize ZSL as an imbalanced regression problem, which offers empirical evidence to interpret how semantic labels lead to imbalanced semantic predictions. We then propose a re-weighted loss termed Re-balanced Mean-Squared Error (ReMSE), which tracks the mean and variance of error distributions, thus ensuring rebalanced learning across classes. As a major contribution, we conduct a series of analyses showing that ReMSE is theoretically well established. Extensive experiments demonstrate that the proposed method effectively alleviates the imbalance in semantic prediction and outperforms many state-of-the-art ZSL methods.
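The re-weighting idea can be illustrated with a toy sketch: track the error per semantic dimension across a batch and up-weight the dimensions that are predicted poorly, so learning is rebalanced across semantics. This is a simplified illustration in the spirit of ReMSE, not the exact published formula; the weighting scheme and data below are assumptions.

```python
# Illustrative re-weighting in the spirit of ReMSE (not the exact published
# formula): per-semantic squared errors are tracked across the batch, and
# semantics with larger mean error receive proportionally larger weights,
# rebalancing learning across attribute dimensions.

def rebalanced_mse(preds, targets):
    """preds/targets: lists of rows, one semantic attribute per column."""
    n, d = len(preds), len(preds[0])
    # Mean squared error per semantic dimension.
    mse = [sum((preds[i][j] - targets[i][j]) ** 2 for i in range(n)) / n
           for j in range(d)]
    avg = sum(mse) / d
    # Up-weight poorly predicted semantics; the weights average to 1.
    weights = [e / avg if avg > 0 else 1.0 for e in mse]
    return sum(w * e for w, e in zip(weights, mse)) / d, weights

# Toy batch: attribute 0 is predicted well, attribute 1 poorly.
preds = [[0.9, 0.1], [0.8, 0.9]]
targets = [[1.0, 0.0], [1.0, 0.0]]
loss, weights = rebalanced_mse(preds, targets)
```

The badly predicted attribute receives a larger weight, so its error dominates the loss instead of being averaged away; the published loss additionally accounts for the variance of the error distribution.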
- Published
- 2023
- Full Text
- View/download PDF
940. A Novel 3D Unsupervised Domain Adaptation Framework for Cross-Modality Medical Image Segmentation.
- Author
-
Yao K, Su Z, Huang K, Yang X, Sun J, Hussain A, and Coenen F
- Subjects
- Brain diagnostic imaging, Humans, Image Processing, Computer-Assisted methods, Abdomen, Magnetic Resonance Imaging methods
- Abstract
We consider the problem of volumetric (3D) unsupervised domain adaptation (UDA) in cross-modality medical image segmentation, aiming to perform segmentation on an unannotated target domain (e.g., MRI) with the help of a labeled source domain (e.g., CT). Previous UDA methods in medical image analysis usually suffer from two challenges: 1) they focus on processing and analyzing data at the 2D level only, thus missing semantic information at the depth level; 2) one-to-one mapping is adopted during the style-transfer process, leading to insufficient alignment in the target domain. Different from existing methods, in our work we conduct a first-of-its-kind investigation of multi-style image translation for complete image alignment to alleviate the domain shift problem, and also introduce 3D segmentation in domain adaptation tasks to maintain semantic consistency at the depth level. In particular, we develop an unsupervised domain adaptation framework incorporating a novel quartet self-attention module to efficiently enhance relationships between widely separated features in spatial regions in a higher dimension, leading to a substantial improvement in segmentation accuracy in the unlabeled target domain. On two challenging cross-modality tasks, specifically brain structure and multi-organ abdominal segmentation, our model is shown to outperform current state-of-the-art methods by a significant margin, demonstrating its potential as a benchmark resource for the biomedical and health informatics research community.
- Published
- 2022
- Full Text
- View/download PDF
941. Guest Editorial: Advances in Deep Learning for Clinical and Healthcare Applications.
- Author
-
Ieracitano C, Morabito FC, Squartini S, Huang K, Li X, and Mahmud M
- Published
- 2022
- Full Text
- View/download PDF
942. Analyzing Cell-Scaffold Interaction through Unsupervised 3D Nuclei Segmentation.
- Author
-
Yao K, Sun J, Huang K, Jing L, Liu H, Huang D, and Jude C
- Abstract
Fibrous scaffolds have been extensively used in three-dimensional (3D) cell culture systems to establish in vitro models in cell biology, tissue engineering, and drug screening. It is common practice to characterize cell behaviors on such scaffolds using confocal laser scanning microscopy (CLSM). As a noninvasive technology, CLSM can be utilized to describe cell-scaffold interaction under varied morphological features, biomaterial composition, and internal structure. Unfortunately, such information has not been fully translated and delivered to researchers due to the lack of effective cell segmentation methods. We developed herein an end-to-end model called Aligned Disentangled Generative Adversarial Network (AD-GAN) for 3D unsupervised nuclei segmentation of CLSM images. AD-GAN utilizes representation disentanglement to separate content representation (the underlying nuclei spatial structure) from style representation (the rendering of the structure) and align the disentangled content in the latent space. CLSM images of A549, 3T3, and HeLa cells cultured on fibrous scaffolds were utilized for the nuclei segmentation study. Compared with existing commercial methods such as Squassh and CellProfiler, AD-GAN can effectively and efficiently distinguish nuclei with preserved shape and location information. Building on such information, we can rapidly screen cell-scaffold interaction in terms of adhesion, migration, and proliferation, so as to improve scaffold design., Competing Interests: The authors declare that there is no conflict of interest., (Copyright: © 2022 Yao, et al.)
- Published
- 2021
- Full Text
- View/download PDF
943. Manifold adversarial training for supervised and semi-supervised learning.
- Author
-
Zhang S, Huang K, Zhu J, and Liu Y
- Subjects
- Benchmarking, Supervised Machine Learning standards
- Abstract
We propose a new regularization method for deep learning based on manifold adversarial training (MAT). Unlike previous regularization and adversarial training methods, MAT further considers the local manifold of latent representations. Specifically, MAT builds an adversarial framework based on how the worst perturbation would affect the statistical manifold in the latent space rather than the output space. In particular, a latent feature space with a Gaussian Mixture Model (GMM) is first derived in a deep neural network. We then define smoothness as the largest variation of the Gaussian mixtures when a local perturbation is applied around an input data point. On the one hand, the perturbations are added so as to roughen the statistical manifold of the latent space the most. On the other hand, the model is trained to promote manifold smoothness the most in the latent space. Importantly, since the latent space is more informative than the output space, the proposed MAT can learn a more robust and compact data representation, leading to further performance improvement. MAT is also notable in that it can be considered a superset of a recently proposed discriminative feature learning approach called center loss. We conduct a series of experiments in both supervised and semi-supervised learning on four benchmark data sets, showing that the proposed MAT achieves remarkable performance, much better than that of the state-of-the-art approaches. In addition, we present a series of visualizations that provide further understanding of adversarial examples., Competing Interests: Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper., (Copyright © 2021. Published by Elsevier Ltd.)
- Published
- 2021
- Full Text
- View/download PDF
944. Novel deep neural network based pattern field classification architectures.
- Author
-
Huang K, Zhang S, Zhang R, and Hussain A
- Subjects
- Algorithms, Bayes Theorem, Biometric Identification trends, Handwriting, Humans, Machine Learning trends, Pattern Recognition, Automated trends, Biometric Identification methods, Deep Learning trends, Neural Networks, Computer, Pattern Recognition, Automated methods
- Abstract
Field classification is a new extension of traditional classification frameworks that attempts to utilize consistent information from a group of samples (termed fields). By forgoing the independent identically distributed (i.i.d.) assumption, field classification can achieve remarkably improved accuracy compared to traditional classification methods. Most studies of field classification have been conducted with traditional machine learning methods. In this paper, we propose integration with a Bayesian framework, for the first time, in order to extend field classification to deep learning, and we propose two novel deep neural network architectures: the Field Deep Perceptron (FDP) and the Field Deep Convolutional Neural Network (FDCNN). Specifically, we exploit a deep perceptron structure, typically with six layers, where the first three layers remove (learn) a 'style' from a group of samples to map them into a more discriminative space and the last three layers are trained to perform classification. For the FDCNN, we modify the AlexNet framework by adding style transformation layers within the hidden layers. We derive a novel learning scheme from a Bayesian framework and design a novel and efficient learning algorithm with guaranteed convergence for training the deep networks. The whole framework is interpreted with visualization features, showing that the field deep neural network can better learn the style of a group of samples. Our developed models are also able to achieve transfer learning and learn transformations for newly introduced fields. We conduct extensive comparative experiments on benchmark data (including face, speech, and handwriting data) to validate our learning approach. Experimental results demonstrate that our proposed deep frameworks achieve significant improvements over other state-of-the-art algorithms, attaining new benchmark performance., Competing Interests: Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper., (Copyright © 2020. Published by Elsevier Ltd.)
- Published
- 2020
- Full Text
- View/download PDF
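The "style removal" idea from the abstract above — mapping each field (e.g. one writer's samples) into a shared space before classifying — can be sketched in its simplest possible form. Assumptions: each field's style is reduced to a mean offset and `remove_field_style` is a hypothetical stand-in; the paper learns the transform inside the network layers, not as a mean shift.

```python
import numpy as np

def remove_field_style(fields):
    """Toy style removal: treat each field's mean offset from the
    global mean as its 'style' and subtract it, so all fields share a
    common centre before classification."""
    all_x = np.concatenate(fields)
    global_mean = all_x.mean(axis=0)
    return [f - f.mean(axis=0) + global_mean for f in fields]

rng = np.random.default_rng(1)
base = rng.standard_normal((20, 3))
field_a = base + np.array([5.0, 0.0, 0.0])   # writer A's "style": a shift
field_b = base - np.array([0.0, 3.0, 0.0])   # writer B's "style"
aligned = remove_field_style([field_a, field_b])
# After style removal, both fields share the same mean.
print(np.allclose(aligned[0].mean(axis=0), aligned[1].mean(axis=0)))  # True
```

The point of exploiting a whole field at once is exactly this: the group statistics reveal the style, which no single sample could.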
945. Imbalanced learning with a biased minimax probability machine.
- Author
-
Huang K, Yang H, King I, and Lyu MR
- Subjects
- Computer Simulation, Algorithms, Artificial Intelligence, Models, Statistical, Pattern Recognition, Automated methods
- Abstract
Imbalanced learning is a challenging task in machine learning. In this context, the data associated with one class are far fewer than those associated with the other class. Traditional machine learning methods that seek classification accuracy over a full range of instances are not suitable for this problem, since they tend to classify all the data into the majority class, which is usually the less important class. In this correspondence, the authors describe a new approach, named the biased minimax probability machine (BMPM), to deal with the problem of imbalanced learning. This BMPM model is demonstrated to provide an elegant and systematic way to handle imbalanced learning. More specifically, by controlling the accuracy of the majority class under all possible choices of class-conditional densities with a given mean and covariance matrix, this model can quantitatively and systematically incorporate a bias for the minority class. By establishing an explicit connection between the classification accuracy and the bias, this approach distinguishes itself from many current imbalanced-learning methods, which often impose a certain bias on the minority data by adapting intermediate factors via a trial-and-error procedure. The authors detail the theoretical foundation, prove its solvability, propose an efficient optimization algorithm, and perform a series of experiments to evaluate the novel model. The comparison with other competitive methods demonstrates the effectiveness of this new model.
- Published
- 2006
- Full Text
- View/download PDF
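The worst-case accuracy that BMPM controls has a closed form in the minimax-probability literature: over all distributions sharing a given mean and covariance, the worst-case probability that a linear rule classifies the class correctly is d²/(1+d²), where d is the signed margin distance normalized by the projected covariance. A small sketch of that bound follows (the 1-D example and variable names are illustrative, not from the paper; BMPM itself optimizes w and b to maximize the minority bound subject to a floor on the majority bound):

```python
import numpy as np

def worst_case_accuracy(w, b, mean, cov):
    """Minimax-probability bound: over all distributions with the given
    mean and covariance, the worst-case probability that w^T x >= b is
    d^2 / (1 + d^2), where d = (w^T mean - b) / sqrt(w^T cov w)."""
    d = (w @ mean - b) / np.sqrt(w @ cov @ w)
    d = max(d, 0.0)                      # no guarantee on the wrong side
    return d * d / (1.0 + d * d)

# Toy 1-D example: minority ("important") class at mean +2,
# majority class at mean -2, unit variance, threshold at 0.
w, b = np.array([1.0]), 0.0
cov = np.array([[1.0]])
acc_minority = worst_case_accuracy(w, b, np.array([2.0]), cov)
acc_majority = worst_case_accuracy(-w, -b, np.array([-2.0]), cov)
print(round(acc_minority, 2), round(acc_majority, 2))  # 0.8 0.8
```

Shifting the threshold b toward the majority class raises the minority bound at the expense of the majority bound — the explicit accuracy/bias trade-off the abstract describes.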
946. Maximizing sensitivity in medical diagnosis using biased minimax probability machine.
- Author
-
Huang K, Yang H, King I, and Lyu MR
- Subjects
- Computer Simulation, Humans, Models, Statistical, Reproducibility of Results, Sensitivity and Specificity, Algorithms, Breast Neoplasms diagnosis, Decision Support Systems, Clinical, Decision Support Techniques, Diagnosis, Computer-Assisted methods, Heart Diseases diagnosis
- Abstract
The challenging task of medical diagnosis based on machine learning techniques requires an inherent bias, i.e., the diagnosis should favor the "ill" class over the "healthy" class, since misdiagnosing a patient as a healthy person may delay the therapy and aggravate the illness. Therefore, the objective in this task is not to improve the overall accuracy of the classification, but to focus on improving the sensitivity (the accuracy of the "ill" class) while maintaining an acceptable specificity (the accuracy of the "healthy" class). Some current methods adopt roundabout ways to impose a certain bias toward the important class, i.e., they try to utilize some intermediate factors to influence the classification. However, it remains uncertain whether these methods can improve the classification performance systematically. In this paper, by engaging a novel learning tool, the biased minimax probability machine (BMPM), we deal with the issue in a more elegant way and directly achieve the objective of appropriate medical diagnosis. More specifically, the BMPM directly controls the worst case accuracies to incorporate a bias toward the "ill" class. Moreover, in a distribution-free way, the BMPM derives the decision rule in such a way as to maximize the worst case sensitivity while maintaining an acceptable worst case specificity. By directly controlling the accuracies, the BMPM provides a more rigorous way to handle medical diagnosis; by deriving a distribution-free decision rule, the BMPM distinguishes itself from a large family of classifiers, namely, the generative classifiers, where an assumption on the data distribution is necessary. We evaluate the performance of the model and compare it with three traditional classifiers: the k-nearest neighbor, the naive Bayesian, and the C4.5. The test results on two medical datasets, the breast-cancer dataset and the heart disease dataset, show that the BMPM outperforms the other three models.
- Published
- 2006
- Full Text
- View/download PDF
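Sensitivity and specificity, the two quantities the abstract above trades off, are just the per-class accuracies from a confusion matrix. A self-contained helper (the labels and example predictions are made up for illustration):

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = accuracy on the 'ill' (positive) class;
    specificity = accuracy on the 'healthy' (negative) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)

y_true = [1, 1, 1, 0, 0, 0, 0, 0]   # 3 ill patients, 5 healthy
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]   # one missed illness, one false alarm
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 2/3 sensitivity, 4/5 specificity
```

BMPM's contribution is to maximize the worst-case value of the first number while keeping a guaranteed floor on the second, rather than tuning the trade-off by trial and error.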