"Kian Ming" / Database: Academic Search Index - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Kian Ming"' showing total 41 results

1. EnViTSA: Ensemble of Vision Transformer with SpecAugment for Acoustic Event Classification.

Author: Lim, Kian Ming, Lee, Chin Poo, Lee, Zhi Yang, and Alqahtani, Ali
Subjects: *TRANSFORMER models, *ARTIFICIAL neural networks, *DEEP learning, *DATA augmentation, *AUDITORY masking, *FOURIER transforms, *CLASSIFICATION
Abstract: Recent successes in deep learning have inspired researchers to apply deep neural networks to Acoustic Event Classification (AEC). While deep learning methods can train effective AEC models, they are susceptible to overfitting due to the models' high complexity. In this paper, we introduce EnViTSA, an innovative approach that tackles key challenges in AEC. EnViTSA combines an ensemble of Vision Transformers with SpecAugment, a novel data augmentation technique, to significantly enhance AEC performance. Raw acoustic signals are transformed into Log Mel-spectrograms using Short-Time Fourier Transform, resulting in a fixed-size spectrogram representation. To address data scarcity and overfitting issues, we employ SpecAugment to generate additional training samples through time masking and frequency masking. The core of EnViTSA resides in its ensemble of pre-trained Vision Transformers, harnessing the unique strengths of the Vision Transformer architecture. This ensemble approach not only reduces inductive biases but also effectively mitigates overfitting. In this study, we evaluate the EnViTSA method on three benchmark datasets: ESC-10, ESC-50, and UrbanSound8K. The experimental results underscore the efficacy of our approach, achieving impressive accuracy scores of 93.50%, 85.85%, and 83.20% on ESC-10, ESC-50, and UrbanSound8K, respectively. EnViTSA represents a substantial advancement in AEC, demonstrating the potential of Vision Transformers and SpecAugment in the acoustic domain. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF
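
The time masking and frequency masking mentioned in the abstract of result 1 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `mask_spectrogram`, the nested-list spectrogram layout (rows are frequency bins, columns are time frames), the single mask per axis, and the mask-width parameters are all hypothetical choices.

```python
import random


def mask_spectrogram(spec, max_freq_width, max_time_width, rng=None, fill=0.0):
    """Return a masked copy of spec (rows = frequency bins, columns = time frames).

    One random band of frequency bins and one random band of time frames
    are overwritten with `fill`. Widths and layout are illustrative assumptions.
    """
    rng = rng or random.Random()
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input is left untouched

    # Frequency masking: blank f consecutive frequency bins starting at f0.
    f = rng.randint(0, min(max_freq_width, n_freq))
    f0 = rng.randint(0, n_freq - f)
    for i in range(f0, f0 + f):
        out[i] = [fill] * n_time

    # Time masking: blank t consecutive time frames starting at t0.
    t = rng.randint(0, min(max_time_width, n_time))
    t0 = rng.randint(0, n_time - t)
    for row in out:
        for j in range(t0, t0 + t):
            row[j] = fill
    return out
```

Calling the function several times on the same spectrogram with different random states yields different masked variants, which is how such masking can supply additional training samples.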

2. SCL: Self-supervised contrastive learning for few-shot image classification.

Author: Lim, Jit Yan, Lim, Kian Ming, Lee, Chin Poo, and Tan, Yong Xuan
Subjects: *IMAGE recognition (Computer vision), *SUPERVISED learning, *DEEP learning
Abstract: Few-shot learning aims to train a model with a limited number of base class samples to classify the novel class samples. However, attaining generalization with a limited number of samples is not a trivial task. This paper proposes a novel few-shot learning approach named Self-supervised Contrastive Learning (SCL) that enriches the model representation with multiple self-supervision objectives. Given the base class samples, the model is trained with the base class loss. Subsequently, contrastive-based self-supervision is introduced to minimize the distance between each training sample and its augmented variants to improve the sample discrimination. To recognize the distant sample, rotation-based self-supervision is proposed to enable the model to learn to recognize the rotation degree of the samples for better sample diversity. A multitask environment is introduced in which each training sample is assigned two class labels: a base class label and a rotation class label. Complex augmentation is put forth to help the model learn a deeper understanding of the object. The image structure of the training samples is augmented independently of the base class information. The proposed SCL is trained to minimize the base class loss, contrastive distance loss, and rotation class loss simultaneously to learn the generic features and improve the novel class performance. With the multiple self-supervision objectives, the proposed SCL outperforms state-of-the-art few-shot approaches on few-shot image classification benchmark datasets. • A novel self-supervised contrastive learning for few-shot image classification. • Contrastive learning is introduced to obtain a better sample discrimination. • Rotation prediction is proposed to enhance the sample diversity. • Heavy transformation is proposed for deeper object understanding. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

3. HGR-ViT: Hand Gesture Recognition with Vision Transformer.

Author: Tan, Chun Keat, Lim, Kian Ming, Chang, Roy Kwang Yang, Lee, Chin Poo, and Alqahtani, Ali
Subjects: *ARTIFICIAL neural networks, *GESTURE, *AMERICAN Sign Language, *HUMAN-computer interaction, *COMMUNICATION barriers
Abstract: Hand gesture recognition (HGR) is a crucial area of research that enhances communication by overcoming language barriers and facilitating human-computer interaction. Although previous works in HGR have employed deep neural networks, they fail to encode the orientation and position of the hand in the image. To address this issue, this paper proposes HGR-ViT, a Vision Transformer (ViT) model with an attention mechanism for hand gesture recognition. Given a hand gesture image, it is first split into fixed-size patches. Positional embedding is added to these embeddings to form learnable vectors that capture the positional information of the hand patches. The resulting sequence of vectors then serves as the input to a standard Transformer encoder to obtain the hand gesture representation. A multilayer perceptron head is added to the output of the encoder to classify the hand gesture to the correct class. The proposed HGR-ViT obtains an accuracy of 99.98%, 99.36% and 99.85% for the American Sign Language (ASL) dataset, ASL with Digits dataset, and National University of Singapore (NUS) hand gesture dataset, respectively. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

4. Fine-Tuned Temporal Dense Sampling with 1D Convolutional Neural Network for Human Action Recognition.

Author: Lim, Kian Ming, Lee, Chin Poo, Tan, Kok Seang, Alqahtani, Ali, and Ali, Mohammed
Subjects: *HUMAN activity recognition, *CONVOLUTIONAL neural networks, *HUMAN behavior
Abstract: Human action recognition is a constantly evolving field that is driven by numerous applications. In recent years, significant progress has been made in this area due to the development of advanced representation learning techniques. Despite this progress, human action recognition still poses significant challenges, particularly due to the unpredictable variations in the visual appearance of an image sequence. To address these challenges, we propose the fine-tuned temporal dense sampling with 1D convolutional neural network (FTDS-1DConvNet). Our method involves the use of temporal segmentation and temporal dense sampling, which help to capture the most important features of a human action video. First, the human action video is partitioned into segments through temporal segmentation. Each segment is then processed through a fine-tuned Inception-ResNet-V2 model, where max pooling is performed along the temporal axis to encode the most significant features as a fixed-length representation. This representation is then fed into a 1DConvNet for further representation learning and classification. The experiments on UCF101 and HMDB51 demonstrate that the proposed FTDS-1DConvNet outperforms the state-of-the-art methods, with a classification accuracy of 88.43% on the UCF101 dataset and 56.23% on the HMDB51 dataset. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF
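
The temporal segmentation and temporal-axis max pooling described in the abstract of result 4 can be illustrated with a minimal sketch. It assumes that per-frame feature vectors have already been extracted (the abstract uses a fine-tuned Inception-ResNet-V2 for that step, which is not reproduced here); the function name `temporal_dense_sampling` and the near-equal contiguous segment boundaries are hypothetical choices.

```python
def temporal_dense_sampling(frame_features, num_segments):
    """Max-pool per-frame feature vectors within each temporal segment.

    frame_features: list of equal-length feature vectors, one per frame.
    Returns num_segments pooled vectors, so the output length is fixed
    regardless of how many frames the video has.
    """
    n = len(frame_features)
    if num_segments <= 0 or n < num_segments:
        raise ValueError("need at least one frame per segment")

    pooled = []
    for s in range(num_segments):
        # Contiguous, near-equal segment boundaries (an assumed scheme).
        start = s * n // num_segments
        end = (s + 1) * n // num_segments
        segment = frame_features[start:end]
        # Element-wise maximum along the temporal axis of this segment.
        pooled.append([max(column) for column in zip(*segment)])
    return pooled
```

For example, five frames split into two segments produce two pooled vectors; in the abstract, a representation of this kind is then passed to a 1D convolutional network for classification.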

5. COVID-19 Diagnosis on Chest Radiographs with Enhanced Deep Neural Networks.

Author: Lee, Chin Poo and Lim, Kian Ming
Subjects: *ARTIFICIAL neural networks, *CHEST X rays, *COVID-19 testing, *DATA augmentation, *DEEP learning
Abstract: The COVID-19 pandemic has caused a devastating impact on social activity, economy and politics worldwide. Techniques to diagnose COVID-19 cases by examining anomalies in chest X-ray images are urgently needed. Inspired by the success of deep learning in various tasks, this paper evaluates the performance of four deep neural networks in detecting COVID-19 patients from their chest radiographs. The deep neural networks studied include VGG16, MobileNet, ResNet50 and DenseNet201. Preliminary experiments show that all deep neural networks perform promisingly, while DenseNet201 outshines other models. Nevertheless, the sensitivity rates of the models are below expectations, which can be attributed to several factors: limited publicly available COVID-19 images, imbalanced sample size for the COVID-19 class and non-COVID-19 class, overfitting or underfitting of the deep neural networks and that the feature extraction of pre-trained models does not adapt well to the COVID-19 detection task. To address these factors, several enhancements are proposed, including data augmentation, adjusted class weights, early stopping and fine-tuning, to improve the performance. Empirical results on DenseNet201 with these enhancements demonstrate outstanding performance with an accuracy of 0.999, precision of 0.9899, sensitivity of 0.98, specificity of 0.9997 and F1-score of 0.9849 on the COVID-Xray-5k dataset. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

6. MFRD-80K: A Dataset and Benchmark for Masked Face Recognition.

Author: Lee, Chin Poo and Lim, Kian Ming
Subjects: *CONVOLUTIONAL neural networks, *HUMAN facial recognition software, *MEDICAL masks, *PUBLIC spaces, *SUPPORT vector machines, *RANDOM forest algorithms
Abstract: Wearing face masks in public spaces has become an essential step to prevent the spread of COVID-19. This step poses some challenges to conventional face recognition due to several reasons: 1) the absence of large real-world masked face recognition dataset, and 2) the loss of some visual cues due to the occlusion by the face masks. To address these challenges, this paper presents a real-world masked face recognition dataset that consists of 80500 masked face images of 161 subjects, referred to as MFRD-80K dataset. Every subject contributes 500 masked face images, which are then partitioned into 60:20:20 for train, validation and test. Subsequently, we conduct some benchmark studies to evaluate the performance of the existing face recognition and classification methods on the MFRD-80K dataset. The methods include k-Nearest Neighbour, Multinomial Logistic Regression, Support Vector Machines, Random Forest, Multilayer Perceptron and Convolutional Neural Networks. Since the parameter settings affect the performance of each method, a grid search is performed to determine the optimal parameter settings. The empirical results demonstrate that Convolutional Neural Network achieves the highest test accuracy of 97.16% on MFRD-80K dataset. [ABSTRACT FROM AUTHOR]
Published: 2021

7. Efficient-PrototypicalNet with self knowledge distillation for few-shot learning.

Author: Lim, Jit Yan, Lim, Kian Ming, Ooi, Shih Yin, and Lee, Chin Poo
Subjects: *THEORY of self-knowledge, *MACHINE learning, *KNOWLEDGE transfer, *TRANSFER of training
Abstract: • Use a complex pre-trained model on a large scale image dataset for transfer learning. • Applied self knowledge distillation with born-again strategy into prototypical network. • A metric-based few-shot learning framework with transfer learning and knowledge distillation. The focus of recent few-shot learning research has been on the development of learning methods that can quickly adapt to unseen tasks with small amounts of data and low computational cost. In order to achieve higher performance in few-shot learning tasks, the generalizability of the method is essential to enable it to generalize well from seen tasks to unseen tasks with a limited number of samples. In this work, we investigate a new metric-based few-shot learning framework which transfers the knowledge from another effective classification model to produce well generalized embedding and improve the effectiveness in handling unseen tasks. The idea of our proposed Efficient-PrototypicalNet involves transfer learning, knowledge distillation, and few-shot learning. We employed a pre-trained model as a feature extractor to obtain useful features from tasks and decrease the task complexity. These features reduce the training difficulty in few-shot learning and increase the performance. Besides that, we further apply knowledge distillation to our framework and achieve extra performance improvement. The proposed Efficient-PrototypicalNet was evaluated on five benchmark datasets, i.e., Omniglot, mini ImageNet, tiered ImageNet, CIFAR-FS, and FC100. The proposed Efficient-PrototypicalNet achieved the state-of-the-art performance on most datasets in the 5-way K-shot image classification task, especially on the mini ImageNet dataset. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

8. SSL-ProtoNet: Self-supervised Learning Prototypical Networks for few-shot learning.

Author: Lim, Jit Yan, Lim, Kian Ming, Lee, Chin Poo, and Tan, Yong Xuan
Subjects: *SUPERVISED learning, *IMAGE recognition (Computer vision), *SOURCE code, *CLUSTER sampling
Abstract: Few-shot learning seeks to generalize well to unseen tasks with insufficient labeled samples. Existing works have achieved generalization by exploring inter-class discrimination. However, their performance is limited because sample discrimination is neglected. In this work, we propose a metric-based few-shot approach that leverages self-supervised learning, Prototypical networks, and knowledge distillation, referred to as SSL-ProtoNet, to utilize sample discrimination. The proposed SSL-ProtoNet consists of three stages: pre-training stage, fine-tuning stage, and self-distillation stage. In the pre-training stage, self-supervised learning is leveraged to cluster the samples with their augmented variants to enhance the sample discrimination. The learned representation then serves as an initial point for the next stage. In the fine-tuning stage, the model weights transferred from the pre-training stage are fine-tuned to the target few-shot tasks. A self-supervised loss and a few-shot loss are integrated to prevent overfitting during few-shot task adaptation and to maintain the embedding diversity. In the self-distillation stage, the model is arranged in a teacher–student architecture. The teacher model serves as guidance in student model training to reduce overfitting and further improve the performance. The experimental results show that the proposed SSL-ProtoNet outshines the state-of-the-art few-shot image classification methods on three benchmark few-shot datasets, namely, mini ImageNet, tiered ImageNet, and CIFAR-FS. The source code for the proposed method is available at https://github.com/Jityan/sslprotonet. • A metric-based few-shot approach that leverages self-supervised learning. • A noisy transformation is proposed to optimize the learned representation. • Self-supervised learning is proposed to enhance sample discrimination. • A self-supervised loss signal to preserve the representation diversity. • Knowledge in the model is further self-distilled for better performance. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

9. Convolutional neural network with spatial pyramid pooling for hand gesture recognition.

Author: Tan, Yong Soon, Lim, Kian Ming, Tee, Connie, Lee, Chin Poo, and Low, Cheng Yaw
Subjects: *CONVOLUTIONAL neural networks, *GESTURE, *AMERICAN Sign Language, *PYRAMIDS, *COMMUNICATION barriers, *HAND
Abstract: Hand gestures provide a means for humans to interact. While hand gestures play a significant role in human–computer interaction, they also break down the communication barrier and simplify the communication process between the general public and the hearing-impaired community. This paper outlines a convolutional neural network (CNN) integrated with spatial pyramid pooling (SPP), dubbed CNN–SPP, for vision-based hand gesture recognition. SPP mitigates the problem found in conventional pooling by stacking multi-level pooling to extend the features fed into a fully connected layer. Given inputs of varying sizes, SPP also yields a fixed-length feature representation. Extensive experiments have been conducted to evaluate the CNN–SPP performance on two well-known American Sign Language (ASL) datasets and one NUS hand gesture dataset. Our empirical results show that CNN–SPP outperforms other deep learning-driven methods. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF
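
The abstract of result 9 notes that spatial pyramid pooling turns inputs of varying sizes into a fixed-length representation. A minimal sketch of that property on a single 2D feature map is given below. It is not the CNN–SPP model itself: the function name `spatial_pyramid_pool`, the pyramid levels, the use of max pooling in every bin, and the floor/ceil bin boundaries are assumptions made for illustration.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map over an n x n grid for each pyramid level.

    The output length is sum(n * n for n in levels), independent of the
    height and width of feature_map.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceiling for the end keep every bin non-empty.
            r0 = i * height // n
            r1 = -(-(i + 1) * height // n)
            for j in range(n):
                c0 = j * width // n
                c1 = -(-(j + 1) * width // n)
                pooled.append(
                    max(feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1))
                )
    return pooled
```

With levels (1, 2) the output always has 1 + 4 = 5 values, whether the feature map is 3 x 3 or 5 x 4, which is what allows a fully connected layer of fixed input size to follow.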

10. Incorporation of tin in boron doped silicon for reduced deactivation of boron during post-laser-anneal rapid thermal processing.

Author: Liu, Fangyue, Tan, Kian-Ming, Wang, Xincai, Low, David Kuang Yong, Lai, Doreen Mei Ying, Lim, Poh Chong, Samudra, Ganesh, and Yeo, Yee-Chia
Subjects: *BORON, *RAPID thermal processing, *TIN, *SEMICONDUCTOR doping, *ANNEALING of metals, *INDUSTRIAL lasers
Abstract: The supersaturated and metastable boron produced by laser anneal could deactivate during post-laser-thermal-cycles and lead to undesirable performance degradation. The effect of tin incorporation on the thermal stability of boron was studied for the first time and suppressed boron deactivation during post-laser-rapid-thermal-anneal was observed with tin coimplantation. High resolution x-ray diffraction measurement indicates that the tensile strain caused by a high boron concentration was reduced by the introduction of tin, which effectively reduces the strain energy and therefore, enhances the thermal stability of boron in post-laser-anneal rapid thermal processing. [ABSTRACT FROM AUTHOR]
Published: 2007
Full Text: View/download PDF

11. GE15: opening up new vistas for comparative research on Malaysian politics.

Author: Ong, Kian Ming
Subjects: *COALITION governments, *POLITICAL parties, *AUTHORITARIANISM, *ELECTORAL reform, *SOCIAL movements, MALAYSIAN elections
Abstract: The article focuses on the 15th General Election (GE15) of Malaysia on November 19, 2022 that resulted in transition of power to coalition government following defeat of political party Pakatan Harapan. Topics discussed include study of Malaysia as dominant party authoritarian regime to understand consolidating democracy, electoral reform in authoritarian regimes, and electoral reform through social movements.
Published: 2023
Full Text: View/download PDF

12. Sonographic assessment of musculoskeletal causes of calf pain and swelling.

Author: Leow, Kheng Song, Chew, Kian Ming, Chawla, Ashish, and Lim, Tze Chwan
Subjects: *SOFT tissue infections, *VENOUS thrombosis, *CALVES, *HOSPITAL emergency services, *PAIN
Abstract: Calf pain or swelling is a common presentation to the emergency department. The differential diagnoses are wide. Deep vein thrombosis (DVT) is often the first diagnosis to be excluded given its potentially fatal complications. Musculoskeletal causes of calf pain or swelling such as Baker's cyst, muscle or tendon tear, soft tissue infection, and inflammation are not uncommon and can often be confidently diagnosed with ultrasonography (US). Familiarity with these conditions and the sonographic findings would be useful in making timely and correct diagnosis. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

13. LAVRF: Sign language recognition via Lightweight Attentive VGG16 with Random Forest.

Author: Ewe, Edmond Li Ren, Lee, Chin Poo, Lim, Kian Ming, Kwek, Lee Chung, and Alqahtani, Ali
Subjects: *RANDOM forest algorithms, *SIGN language, *AMERICAN Sign Language, *MISSING data (Statistics), *STAIR climbing
Abstract: Sign language recognition presents significant challenges due to the intricate nature of hand gestures and the necessity to capture fine-grained details. In response to these challenges, a novel approach is proposed—Lightweight Attentive VGG16 with Random Forest (LAVRF) model. LAVRF introduces a refined adaptation of the VGG16 model integrated with attention modules, complemented by a Random Forest classifier. By streamlining the VGG16 architecture, the Lightweight Attentive VGG16 effectively manages complexity while incorporating attention mechanisms that dynamically concentrate on pertinent regions within input images, resulting in enhanced representation learning. Leveraging the Random Forest classifier provides notable benefits, including proficient handling of high-dimensional feature representations, reduction of variance and overfitting concerns, and resilience against noisy and incomplete data. Additionally, the model performance is further optimized through hyperparameter optimization, utilizing the Optuna in conjunction with hill climbing, which efficiently explores the hyperparameter space to discover optimal configurations. The proposed LAVRF model demonstrates outstanding accuracy on three datasets, achieving remarkable results of 99.98%, 99.90%, and 100% on the American Sign Language, American Sign Language with Digits, and NUS Hand Posture datasets, respectively. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

14. A four dukkha state-space model for hand tracking.

Author: Lim, Kian Ming, Tan, Alan W.C., and Tan, Shing Chiang
Subjects: *TRACKING algorithms, *STATE-space methods, *MONTE Carlo method, *COMPUTATIONAL intelligence, *TRAJECTORIES (Mechanics), *MATHEMATICAL models
Abstract: In this paper, we propose a hand tracking method inspired by the notion of the four dukkha: birth, aging, sickness and death (BASD) in Buddhism. Based on this philosophy, we formalize the hand tracking problem in the BASD framework, and apply it to track hand gestures in isolated sign language videos. The proposed BASD method is a novel nature-inspired computational intelligence method which is able to handle complex real-world tracking problems. The proposed BASD framework operates in a manner similar to a standard state-space model, but maintains multiple hypotheses and integrates hypothesis update and propagation mechanisms that resemble the effect of BASD. The survival of a hypothesis relies upon the strength, aging and sickness of existing hypotheses, and new hypotheses are birthed by the fittest pairs of parent hypotheses. These properties resolve the sample impoverishment problem of the particle filter. The estimated hand trajectories show promising results for American Sign Language. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

15. Possibility of Gravitational Quantization Under The Teleparallel Theory of Gravitation.

Author: Kian Ming, Triyanta, and Kosasih, J. S.
Subjects: *GRAVITATIONAL fields, *QUANTIZATION (Physics), *GENERAL relativity (Physics), *GAUGE field theory, *YANG-Mills theory
Abstract: Teleparallel gravity (TG) or tele-equivalent general relativity (TEGR) is an alternative gauge theory for gravity. In TG, tetrad fields are defined to express gravitational fields and act like gauge potentials in standard gauge theory. The Lagrangians for the gravitational field in TG and for the Yang-Mills field in standard gauge theory differ because different indices are attached to the components of the corresponding fields: two external indices for the tetrad field and internal and external indices for the Yang-Mills field. Different types of indices lead to different possible contractions and thus to different expressions of the Lagrangian for the Yang-Mills field and for the tetrad field. As TG is a gauge theory, it is natural to quantize gravity in TG by applying the same quantization procedure as in standard gauge theory. Here we discuss the possibility of quantizing gravity, canonically and functionally, under the framework of TG theory. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF

16. Possibility of gravitational quantization under the teleparallel theory of gravitation.

Author: Kian Ming, Triyanta, and Kosasih, J. S.
Subjects: *GRAVITATIONAL fields, *QUANTIZATION (Physics), *GAUGE field theory, *FUNCTIONAL analysis, *LAGRANGIAN functions
Abstract: Teleparallel gravity (TG) or tele-equivalent general relativity (TEGR) is an alternative gauge theory for gravity. In TG, tetrad fields are defined to express gravitational fields and act like gauge potentials in standard gauge theory. The Lagrangians for the gravitational field in TG and for the Yang-Mills field in standard gauge theory differ because different indices are attached to the components of the corresponding fields: two external indices for the tetrad field and internal and external indices for the Yang-Mills field. Different types of indices lead to different possible contractions and thus to different expressions of the Lagrangian for the Yang-Mills field and for the tetrad field. As TG is a gauge theory, it is natural to quantize gravity in TG by applying the same quantization procedure as in standard gauge theory. Here we discuss the possibility of quantizing gravity, canonically and functionally, under the framework of TG theory. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF

17. Malapportionment and democracy: A curvilinear relationship.

Author: Ong, Kian-Ming, Kasuya, Yuko, and Mori, Kota
Subjects: *DEMOCRACY, *POLITICAL parties, *POLITICAL candidates, *POLITICAL competition, *CURVILINEAR coordinates
Abstract: This article examines electoral malapportionment by illuminating the relationship between malapportionment level and democracy. Although a seminal study rejects this relationship, we argue that a logical and empirically significant relationship exists, which is curvilinear and is based on a framework focusing on incumbent politicians' incentives and the constraints they face regarding malapportionment. Malapportionment is lowest in established democracies and electoral authoritarian regimes with an overwhelmingly strong incumbent; it is relatively high in new democracies and authoritarian regimes with robust opposition forces. The seminal study's null finding is due to the mismatch between theoretical mechanisms and choice of democracy indices. Employing an original cross-national dataset, we conduct regression analyses; the results support our claims. Furthermore, on controlling the degree of democracy, the single-member district system's effects become insignificant. Australia, Belarus, the Gambia, Japan, Malaysia, Tunisia, and the United States illustrate the political logic underlying curvilinear relations at democracy's various levels. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

18. Bidirectional Long Short-Term Memory with Temporal Dense Sampling for human action recognition.

Author: Tan, Kok Seang, Lim, Kian Ming, Lee, Chin Poo, and Kwek, Lee Chung
Subjects: *HUMAN activity recognition, *CONVOLUTIONAL neural networks, *TIME-varying networks, *HUMAN behavior
Abstract: Long Short-Term Memory networks are making significant inroads into improving time series applications, including human action recognition. In a human action video, the spatial and temporal streams carry distinctive yet prominent information, hence many researchers turn to spatio-temporal models for human action recognition. A spatio-temporal model integrates the temporal network (e.g. Long Short-Term Memory) and spatial network (e.g. Convolutional Neural Networks). There are a few challenges in existing human action recognition: (1) the uni-directional modeling of Long Short-Term Memory makes it unable to preserve the information from the future, (2) the sparse sampling strategy tends to lose prominent information when performing dimension reduction on the input of Long Short-Term Memory, and (3) the fusion strategy for consolidating the temporal network and spatial network. In view of this, we propose a Bidirectional Long Short-Term Memory with Temporal Dense Sampling and Fusion Network method to address the above-mentioned challenges. The Temporal Dense Sampling partitions the human action video into segments and then performs a max pooling operation along the temporal axis in each segment. A multi-stream bidirectional Long Short-Term Memory network is adopted to encode the long-term spatial and temporal dependencies in both forward and backward directions. Instead of assigning fixed weights to the spatial network and temporal network, we propose a fusion network where a fully-connected layer is trained to adaptively assign the weights for the networks. The empirical results demonstrate that the proposed Bidirectional Long Short-Term Memory with Temporal Dense Sampling and Fusion Network method outshines the state-of-the-art methods with an accuracy of 94.78% on UCF101 dataset and 70.72% on HMDB51 dataset. • Temporal dense sampling to extract significant activations in temporal axis. • Multi-stream bidirectional LSTM to encode spatial and temporal dependencies. • Fusion network to adaptively assign weight for each stream. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

19. Block-based histogram of optical flow for isolated sign language recognition.

Author: Lim, Kian Ming, Tan, Alan W.C., and Tan, Shing Chiang
Subjects: *OPTICAL flow, *HISTOGRAMS, *SIGN language, *IMAGE reconstruction, *IMAGE analysis, *MATHEMATICAL symmetry
Abstract: In this paper, we propose a block-based histogram of optical flow (BHOF) to generate hand representation in sign language recognition. Optical flow of the sign language video is computed in a region centered around the location of the detected hand position. The hand patches of optical flow are segmented into M spatial blocks, where each block is a cuboid of a segment of a frame across the entire sign gesture video. The histogram of each block is then computed and normalized by its sum. The feature vectors of all blocks are then concatenated as the BHOF sign gesture representation. The proposed method provides a compact scale-invariant representation of the sign language. Furthermore, the block-based histogram encodes spatial information and provides local translation invariance in the extracted optical flow. Additionally, the proposed BHOF introduces sign language length invariance into its representation, thereby producing a promising recognition rate in signer-independent problems. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF
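
The block-wise histogram construction described in the abstract of result 19 can be sketched as below. The abstract does not state how the histogram is binned, so this sketch assumes orientation bins weighted by flow magnitude; the function name `bhof`, the `flow[t][y][x] = (dx, dy)` layout, and the rectangular grid of blocks are likewise hypothetical. Optical flow computation and hand detection are outside the scope of the sketch.

```python
import math


def bhof(flow, grid_rows, grid_cols, num_bins):
    """Concatenate sum-normalized flow histograms, one per spatial block.

    flow[t][y][x] is a (dx, dy) pair. Each block spans all frames, so it is
    a cuboid over the whole video. Orientation binning with magnitude
    weighting is an assumption, not taken from the abstract.
    """
    height, width = len(flow[0]), len(flow[0][0])
    descriptor = []
    for br in range(grid_rows):
        y0, y1 = br * height // grid_rows, (br + 1) * height // grid_rows
        for bc in range(grid_cols):
            x0, x1 = bc * width // grid_cols, (bc + 1) * width // grid_cols
            hist = [0.0] * num_bins
            for frame in flow:
                for y in range(y0, y1):
                    for x in range(x0, x1):
                        dx, dy = frame[y][x]
                        magnitude = math.hypot(dx, dy)
                        if magnitude == 0.0:
                            continue  # no motion, nothing to vote
                        angle = math.atan2(dy, dx) % (2 * math.pi)
                        b = min(int(angle / (2 * math.pi) * num_bins), num_bins - 1)
                        hist[b] += magnitude
            total = sum(hist)
            # Normalizing by the sum removes the dependence on video length.
            descriptor.extend(h / total if total else 0.0 for h in hist)
    return descriptor
```

Because each histogram is divided by its own sum, repeating the same frames twice leaves the descriptor unchanged, which illustrates the length invariance the abstract refers to.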

20. A feature covariance matrix with serial particle filter for isolated sign language recognition.

Author: Lim, Kian Ming, Tan, Alan W.C., and Tan, Shing Chiang
Subjects: *FEATURE extraction, *ANALYSIS of covariance, *MONTE Carlo method, *SIGN language, *PATTERN recognition systems
Abstract: As is widely recognized, sign language recognition is a very challenging visual recognition problem. In this paper, we propose a feature covariance matrix based serial particle filter for isolated sign language recognition. At the preprocessing stage, the fusion of the median and mode filters is employed to extract the foreground and thereby enhance hand detection. We propose to serially track the hands of the signer, as opposed to tracking both hands at the same time, to reduce the misdirection of target objects. Subsequently, the region around the tracked hands is extracted to generate the feature covariance matrix as a compact representation of the tracked hand gesture, and thereby reduce the dimensionality of the features. In addition, the proposed feature covariance matrix is able to adapt to new signs due to its ability to integrate multiple correlated features in a natural way, without any retraining process. The experimental results show that the hand trajectories obtained through the proposed serial hand tracking are closer to the ground truth. The sign gesture recognition based on the proposed methods yields an 87.33% recognition rate for American Sign Language. The proposed hand tracking and feature extraction methodology is an important milestone in the development of expert systems designed for sign language recognition, such as automated sign language translation systems. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF
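
Note on the feature covariance abstract above: a region covariance descriptor summarizes a patch by the covariance of its per-pixel feature vectors, so the descriptor size depends only on the number of features, not on the patch size. The sketch below is a generic illustration under assumed features (pixel coordinates, intensity, and absolute gradients); the abstract does not list the features the authors used, and `region_covariance` is a hypothetical name.

```python
import numpy as np

def region_covariance(patch):
    """Covariance of per-pixel features over a grayscale patch.

    patch: 2-D array (H, W).
    Assumed features per pixel: x, y, intensity, |dI/dx|, |dI/dy|.
    Returns a 5 x 5 symmetric matrix regardless of H and W.
    """
    patch = np.asarray(patch, dtype=float)
    H, W = patch.shape
    ys, xs = np.mgrid[0:H, 0:W]
    grad_y, grad_x = np.gradient(patch)   # derivatives along rows, then columns
    feats = np.stack(
        [xs, ys, patch, np.abs(grad_x), np.abs(grad_y)], axis=-1
    ).reshape(-1, 5)
    # rowvar=False: each row is one pixel (observation), each column a feature.
    return np.cov(feats, rowvar=False)
```

A 20 x 30 patch and a 64 x 64 patch both map to a 5 x 5 matrix, which illustrates the dimensionality reduction the abstract mentions.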

21. Propagators and Vertices of Scalar-Gravity Interaction In The Teleparallel Theory of Gravitation.

Author: Kian Ming and Triyanta
Subjects: *GRAVITATIONAL fields, *SCALAR field theory, *COUPLING constants, *BILINEAR forms, *LAGRANGIAN functions
Abstract: In the teleparallel theory of gravitation, the interaction between a scalar field and the gravitational field is described by a scalar-gravity Lagrangian. The gravitational field is introduced into the scalar field Lagrangian using a coupling prescription. From the bilinear term of this Lagrangian we obtain the propagator of the gravitational field after introducing a gauge-fixing term similar to the Lorenz gauge. Then, by writing the tetrad field as the sum of a trivial term and a gravity term, we derive the corresponding vertices from the interaction terms. Six kinds of vertices are derived in this paper. [ABSTRACT FROM AUTHOR]
Published: 2014
Full Text: View/download PDF

22. Self-taught learning of a deep invariant representation for visual tracking via temporal slowness principle.

Author: Kuen, Jason, Lim, Kian Ming, and Lee, Chin Poo
Subjects: *MACHINE learning, *INVARIANTS (Mathematics), *TRACKING algorithms, *COMPUTER vision, *TRAINING
Abstract: Visual representation is crucial to the performance of visual tracking methods. Conventionally, visual representations adopted in visual tracking rely on hand-crafted computer vision descriptors. These descriptors were developed generically without considering tracking-specific information. In this paper, we propose to learn complex-valued invariant representations from tracked sequential image patches, via strong temporal slowness constraint and stacked convolutional autoencoders. The deep slow local representations are learned offline on unlabeled data and transferred to the observational model of our proposed tracker. The proposed observational model retains old training samples to alleviate drift, and collects negative samples which are coherent with the target's motion pattern for better discriminative tracking. With the learned representation and online training samples, a logistic regression classifier is adopted to distinguish target from background, and retrained online to adapt to appearance changes. Subsequently, the observational model is integrated into a particle filter framework to perform visual tracking. Experimental results on various challenging benchmark sequences demonstrate that the proposed tracker performs favorably against several state-of-the-art trackers. [ABSTRACT FROM AUTHOR]
Published: 2015
Full Text: View/download PDF

23. Three-by-three correlation matrices: its exact shape and a family of distributions.

Author: Kian Ming A. Chai
Subjects: *MATRICES (Mathematics), *DISTRIBUTION (Probability theory), *BAYESIAN analysis, *DIRICHLET problem, *STATISTICAL correlation, *CONVEX domains, *ISOMORPHISM (Mathematics)
Abstract: We give a novel and simple convex construction of three-by-three correlation matrices. This construction reveals the exact shape of the volume for these matrices: it is a tetrahedron point-wise transformed through the sine function. Hence the space of three-by-three correlation matrices is isomorphic to the standard three-simplex, and the matrices can be sampled by placing distributions on the three-simplex. This gives densities on the matrices that are flexible and easily interpreted; these will be useful in Bayesian analysis of correlation matrices. Examples using Dirichlet distributions are provided. We show the uniqueness of the construction, and we also prove that there is no parallel construction for higher order correlation matrices. [ABSTRACT FROM AUTHOR]
Published: 2014
Full Text: View/download PDF

24. Variants and Pitfalls in MR Imaging of Foot and Ankle Injuries.

Author: Bin Othman, Mohamad Isham, Kian Ming Chew, and Peh, Wilfred C. G.
Subjects: *ANKLE, *ANKLE injuries, *ANKLE injury treatment, *STIFLE joint, *MEDICAL sciences, *PATIENTS, *MAGNETIC resonance imaging, *WOUNDS & injuries
Abstract: Foot and ankle injuries are very common, particularly among young active athletic individuals. MR imaging has become one of the modalities of choice in the assessment of foot and ankle injuries. Accurate interpretation of MR images and diagnosis of pathology requires familiarity with normal anatomical variants and common diagnostic pitfalls. This article describes the common anatomical variants and technical pitfalls in MR imaging of the foot and ankle. [ABSTRACT FROM AUTHOR]
Published: 2014
Full Text: View/download PDF

25. DeepScene: Scene classification via convolutional neural network with spatial pyramid pooling.

Author: Yee, Pui Sin, Lim, Kian Ming, and Lee, Chin Poo
Subjects: *CONVOLUTIONAL neural networks, *PYRAMIDS
Abstract: Dissimilar to object classification, scene classification needs to consider not only the components that exist in the image but also their corresponding distribution. The greatest challenge of scene classification, especially indoor scene classification, is that many classes share the same representative components whereas the degree of similarity can be low within the same class. Some images have no clear indication that they belong to a particular class. In view of this, we propose a DeepScene model that leverages Convolutional Neural Network as the base architecture. As color cues are important for scene classification, two solutions are proposed to convert grayscale scene images to RGB images, which are replication and deep neural network based style transfer for colorization. To address the challenge of objects with varying sizes and positions in the scene, Spatial Pyramid Pooling is incorporated into the Convolutional Neural Network. The Spatial Pyramid Pooling performs multi-level pooling to enable the multi-size training of the model for improved scale and translational invariance. Ensemble learning is then adopted to boost the overall performance in scene classification. The proposed DeepScene model outshines the state-of-art methods with accuracy of 98.1% on Event-8, 95.6% on Scene-15 and 71.0% on MIT-67. • Deep neural network-based style transfer for colorization of grayscale images. • CNN with SPP to enable multi-size training for scale and translational invariance. • Weighted average ensemble to improve the performance in scene classification. • Model regularization by dropout and Ridge regression. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF
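
Note on the DeepScene abstract above: spatial pyramid pooling turns a feature map of any spatial size into a fixed-length vector by pooling over grids of several resolutions. The sketch below is a simplified version with non-overlapping cells and max pooling; the pyramid levels (1, 2, 4) and the name `spatial_pyramid_pool` are assumptions and are not taken from the paper.

```python
import numpy as np

def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map over an n x n grid for each level.

    Output length is C * sum(n * n for n in levels), independent of H and W
    (H and W must each be at least max(levels) so no cell is empty).
    """
    _, H, W = fmap.shape
    pooled = []
    for n in levels:
        row_edges = np.linspace(0, H, n + 1).astype(int)
        col_edges = np.linspace(0, W, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                cell = fmap[:, row_edges[i]:row_edges[i + 1],
                               col_edges[j]:col_edges[j + 1]]
                pooled.append(cell.max(axis=(1, 2)))   # one value per channel
    return np.concatenate(pooled)
```

Since the output length is fixed, inputs of different sizes can feed the same fully connected layer, which is what makes the multi-size training mentioned in the abstract possible.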

26. Recent Advances in Traffic Sign Recognition: Approaches and Datasets.

Author: Lim, Xin Roy, Lee, Chin Poo, Lim, Kian Ming, Ong, Thian Song, Alqahtani, Ali, and Ali, Mohammed
Subjects: *TRAFFIC signs & signals, *DEEP learning, *COMPUTER vision, *FEATURE extraction, *DRIVERLESS cars, *MACHINE learning, *AUTONOMOUS vehicles, *AUTOMOBILE license plates
Abstract: Autonomous vehicles have become a topic of interest in recent times due to the rapid advancement of automobile and computer vision technology. The ability of autonomous vehicles to drive safely and efficiently relies heavily on their ability to accurately recognize traffic signs. This makes traffic sign recognition a critical component of autonomous driving systems. To address this challenge, researchers have been exploring various approaches to traffic sign recognition, including machine learning and deep learning. Despite these efforts, the variability of traffic signs across different geographical regions, complex background scenes, and changes in illumination still poses significant challenges to the development of reliable traffic sign recognition systems. This paper provides a comprehensive overview of the latest advancements in the field of traffic sign recognition, covering various key areas, including preprocessing techniques, feature extraction methods, classification techniques, datasets, and performance evaluation. The paper also delves into the commonly used traffic sign recognition datasets and their associated challenges. Additionally, this paper sheds light on the limitations and future research prospects of traffic sign recognition. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

27. Gait-CNN-ViT: Multi-Model Gait Recognition with Convolutional Neural Networks and Vision Transformer.

Author: Mogan, Jashila Nair, Lee, Chin Poo, Lim, Kian Ming, Ali, Mohammed, and Alqahtani, Ali
Subjects: *GAIT in humans, *TRANSFORMER models, *CONVOLUTIONAL neural networks
Abstract: Gait recognition, the task of identifying an individual based on their unique walking style, can be difficult because walking styles can be influenced by external factors such as clothing, viewing angle, and carrying conditions. To address these challenges, this paper proposes a multi-model gait recognition system that integrates Convolutional Neural Networks (CNNs) and Vision Transformer. The first step in the process is to obtain a gait energy image, which is achieved by applying an averaging technique to a gait cycle. The gait energy image is then fed into three different models, DenseNet-201, VGG-16, and a Vision Transformer. These models are pre-trained and fine-tuned to encode the salient gait features that are specific to an individual's walking style. Each model provides prediction scores for the classes based on the encoded features, and these scores are then summed and averaged to produce the final class label. The performance of this multi-model gait recognition system was evaluated on three datasets, CASIA-B, OU-ISIR dataset D, and OU-ISIR Large Population dataset. The experimental results showed substantial improvement compared to existing methods on all three datasets. The integration of CNNs and ViT allows the system to learn both the pre-defined and distinct features, providing a robust solution for gait recognition even under the influence of covariates. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF
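
Note on the Gait-CNN-ViT abstract above: two steps it describes are simple enough to sketch directly, namely averaging silhouettes over a gait cycle to get a gait energy image, and averaging the per-class scores of several models before picking the label. The function names and the toy inputs are hypothetical; the actual models (DenseNet-201, VGG-16, Vision Transformer) are not reproduced here.

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Pixel-wise mean of aligned binary silhouettes over one gait cycle."""
    return np.mean(np.asarray(silhouettes, dtype=float), axis=0)

def average_scores(score_list):
    """Average per-class scores from several models and pick the top class.

    score_list: list of 1-D arrays, one per model, all of the same length.
    Returns (predicted_label, averaged_scores).
    """
    mean_scores = np.mean(np.stack(score_list), axis=0)
    return int(np.argmax(mean_scores)), mean_scores
```

For example, three models scoring a sample as [0.6, 0.3, 0.1], [0.2, 0.5, 0.3], and [0.1, 0.6, 0.3] average to roughly [0.30, 0.47, 0.23], so class 1 is chosen even though the first model alone preferred class 0.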

28. Variational Multinomial Logit Gaussian Process.

Author: Chai, Kian Ming A. and Opper, Manfred
Subjects: *VARIATIONAL principles, *GAUSSIAN processes, *LOGITS, *MAXIMUM likelihood statistics, *NONPARAMETRIC statistics, *APPROXIMATION theory, *SPARSE matrices
Abstract: Gaussian process prior with an appropriate likelihood function is a flexible non-parametric model for a variety of learning tasks. One important and standard task is multi-class classification, which is the categorization of an item into one of several fixed classes. A usual likelihood function for this is the multinomial logistic likelihood function. However, exact inference with this model has proved to be difficult because high-dimensional integrations are required. In this paper, we propose a variational approximation to this model, and we describe the optimization of the variational parameters. Experiments have shown our approximation to be tight. In addition, we provide data-independent bounds on the marginal likelihood of the model, one of which is shown to be much tighter than the existing variational mean-field bound in the experiments. We also derive a proper lower bound on the predictive likelihood that involves the Kullback-Leibler divergence between the approximating and the true posterior. We combine our approach with a recently proposed sparse approximation to give a variational sparse approximation to the Gaussian process multi-class model. We also derive criteria which can be used to select the inducing set, and we show the effectiveness of these criteria over random selection in an experiment. [ABSTRACT FROM AUTHOR]
Published: 2012

29. Pakatan Rakyat: What is Different This Time?

Author: Ong, Kian Ming
Subjects: *COALITIONS, *POLITICAL opposition, *POLITICAL parties, *IDEOLOGY, *MULTICULTURALISM, *TWENTY-first century, MALAYSIAN politics & government
Abstract: Pakatan Rakyat or the People's Coalition is the third attempt by the main opposition parties in Malaysia to form a coalition to challenge the electoral dominance of the Barisan Nasional (BN) or the National Front. This paper argues that in spite of the internal and external challenges faced by Pakatan Rakyat, including ideological differences, the institutional weaknesses of Parti Keadilan Rakyat and the continued attempts by the BN to destabilise the government in Pakatan Rakyat-controlled states, this opposition coalition will probably outlast its predecessors. The opposition's successes in the post-March 2008 by-elections have increased the vote-pooling incentives for these parties to stay together. The formalisation of Pakatan Rakyat as a registered political party mirroring the BN will increase the longevity of the opposition coalition even if it does not enjoy the same kind of electoral success in the next general election. [ABSTRACT FROM AUTHOR]
Published: 2010
Full Text: View/download PDF

30. Ultra High-Stress Liner Comprising Diamond-Like Carbon for Performance Enhancement of p-Channel Multiple-Gate Transistors.

Author: Kian-Ming Tan, Mingchu Yang, Tsung-Yang Liow, Rinus Tek Po Lee, and Yee-Chia Yeo
Subjects: *FIELD-effect transistors, *CARBON, *SILICON nitride, *STRAINS & stresses (Mechanics), *COMPLEMENTARY metal oxide semiconductors, *THIN films, *SILICON-on-insulator technology, *SUBSTRATES (Materials science)
Abstract: We report the demonstration of strained p-channel multiple-gate transistors or FinFETs with a novel liner-stressor material comprising diamond-like carbon (DLC). In this work, a DLC film with very high intrinsic compressive stress up to 6 GPa was employed. For FinFET devices having a 20 nm thin DLC liner stressor, more than 30% enhancement in saturation drain current IDsat is observed over FinFETs without a DLC liner. The performance enhancement is attributed to the coupling of compressive stress from the DLC liner to the channel, leading to hole mobility improvement. Due to its extremely high intrinsic stress value, significant IDsat enhancement is observed even when the thickness of the DLC film deposited is less than 40 nm. The DLC liner stressor is a promising stressor material for performance enhancement of p-channel transistors in future technology nodes. [ABSTRACT FROM AUTHOR]
Published: 2009
Full Text: View/download PDF

31. Strained Silicon Nanowire Transistors With Germanium Source and Drain Stressors.

Author: Tsung-Yang Liow, Kian-Ming Tan, Rinus Tek Po Lee, Ming Zhu, Ben Lian-Huat Tan, N. Balasubramanian, and Yee-Chia Yeo
Subjects: *NANOWIRES, *TRANSISTORS, *SILICON, *GERMANIUM, *COMPLEMENTARY metal oxide semiconductors, *SILICON-on-insulator technology, *FIELD-effect transistors, *PLASMA-enhanced chemical vapor deposition
Abstract: We report the first demonstration of pure germanium (Ge) source/drain (S/D) stressors on the ultranarrow or ultrathin Si S/D regions of nanowire FETs with gate lengths down to 5 nm. Ge S/D compressively strains the channel to provide up to ∼100% IDsat enhancement. We also introduce a novel Melt-Enhanced Dopant diffusion and activation technique to form fully embedded Si0.15Ge0.85 S/D stressors in nanowire FETs, further boosting the channel strain and achieving ∼125% IDsat enhancement. [ABSTRACT FROM AUTHOR]
Published: 2008
Full Text: View/download PDF

32. Strained n-Channel FinFETs Featuring In Situ Doped Silicon-Carbon (Si1-yCy) Source and Drain Stressors With High Carbon Content.

Author: Tsung-Yang Liow, Kian-Ming Tan, Weeks, Doran, Rinus Tek Po Lee, Ming Zhu, Keat-Mun Hoe, Chih-Hang Tung, Bauer, Matthias, Spear, Jennifer, Thomas, Shawn G., Samudra, Ganesh S., Balasubramanian, N., and Yee-Chia Yeo
Subjects: *SILICON, *NANOSILICON, *DOPED semiconductors, *DOPED semiconductor superlattices, *PHOSPHORUS, *GATE array circuits, *INTEGRATED circuits
Abstract: Phosphorus in situ doped Si1-yCy films (SiC:P) with substitutional carbon concentration of 1.7% and 2.1% were selectively grown in the source and drain regions of double-gate ⟨110⟩-oriented (110)-sidewall FinFETs to induce tensile strain in the silicon channel. In situ doping removes the need for a high-temperature spike anneal for source/drain (S/D) dopant activation and thus preserves the carbon substitutionality in the SiC:P films as grown. A strain-induced IDsat enhancement of ∼15% and ∼22% was obtained for n-channel FinFETs with 1.7% and 2.1% carbon incorporated in the S/D respectively. [ABSTRACT FROM AUTHOR]
Published: 2008
Full Text: View/download PDF

33. Hand gesture recognition via enhanced densely connected convolutional neural network.

Author: Tan, Yong Soon, Lim, Kian Ming, and Lee, Chin Poo
Subjects: *CONVOLUTIONAL neural networks, *DEEP learning, *DATA augmentation, *GESTURE, *AMERICAN Sign Language, *SIGNAL convolution, *SUPERVISED learning, *HAND
Abstract: • A taxonomy of vision-based hand gesture recognition in the literature is presented. • Model customization and data augmentation are explored to improve generalization. • Ablation study for the proposed model has been conducted. • Performance of the proposed model is evaluated on several hand gesture datasets. Hand gesture recognition (HGR) serves as a fundamental way of communication and interaction for human beings. While HGR can be applied in human computer interaction (HCI) to facilitate user interaction, it can also be utilized for bridging the language barrier. For instance, HGR can be utilized to recognize sign language, which is a visual language represented by hand gestures and used by the deaf and mute all over the world as a primary way of communication. The hand-crafted approach to vision-based HGR typically involves multiple stages of specialized processing, such as hand-crafted feature extraction methods, which are usually designed to deal with particular challenges specifically. Hence, the effectiveness of the system and its ability to deal with varied challenges across multiple datasets are heavily reliant on the methods being utilized. In contrast, a deep learning approach such as a convolutional neural network (CNN) adapts to varied challenges via supervised learning. However, attaining satisfactory generalization on unseen data depends not only on the architecture of the CNN, but also on the quantity and variety of the training data. Therefore, a customized network architecture dubbed enhanced densely connected convolutional neural network (EDenseNet) is proposed for vision-based hand gesture recognition. The modified transition layer in EDenseNet further strengthens feature propagation, by utilizing a bottleneck layer to propagate the features being reused to all the feature maps in a bottleneck manner, and the following Conv layer smooths out the unwanted features. Differences between EDenseNet and DenseNet are discerned, and its performance gains are scrutinized in the ablation study. Furthermore, numerous data augmentation techniques are utilized to attenuate the effect of data scarcity, by increasing the quantity of training data, and enriching its variety to further improve generalization. Experiments have been carried out on multiple datasets, namely one NUS hand gesture dataset and two American Sign Language (ASL) datasets. The proposed EDenseNet obtains 98.50% average accuracy without augmented data, and 99.64% average accuracy with augmented data, outperforming other deep learning driven instances in both settings, with and without augmented data. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

34. Gait-ViT: Gait Recognition with Vision Transformer.

Author: Mogan, Jashila Nair, Lee, Chin Poo, Lim, Kian Ming, and Muthu, Kalaiarasi Sonai
Subjects: *CONVOLUTIONAL neural networks, *IMAGE recognition (Computer vision)
Abstract: Identifying an individual based on their physical/behavioral characteristics is known as biometric recognition. Gait is one of the most reliable biometrics due to its advantages, such as being perceivable at a long distance and difficult to replicate. The existing works mostly leverage Convolutional Neural Networks for gait recognition. The Convolutional Neural Networks perform well in image recognition tasks; however, they lack the attention mechanism to emphasize more on the significant regions of the image. The attention mechanism encodes information in the image patches, which facilitates the model to learn the substantial features in the specific regions. In light of this, this work employs the Vision Transformer (ViT) with an attention mechanism for gait recognition, referred to as Gait-ViT. In the proposed Gait-ViT, the gait energy image is first obtained by averaging the series of images over the gait cycle. The images are then split into patches and transformed into sequences by flattening and patch embedding. Position embedding, along with patch embedding, are applied on the sequence of patches to restore the positional information of the patches. Subsequently, the sequence of vectors is fed to the Transformer encoder to produce the final gait representation. As for the classification, the first element of the sequence is sent to the multi-layer perceptron to predict the class label. The proposed method obtained 99.93% on CASIA-B, 100% on OU-ISIR D and 99.51% on OU-LP, which exhibit the ability of the Vision Transformer model to outperform the state-of-the-art methods. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF
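
Note on the Gait-ViT abstract above: the input pipeline it describes (split the image into patches, flatten and embed them, add position embeddings, and use the first sequence element for classification) can be sketched as below. This is a shape-level illustration only. The random matrices stand in for learned parameters, the Transformer encoder itself is omitted, and the names `patchify` and `embed` are hypothetical.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W) image into non-overlapping patch x patch tiles,
    each flattened row by row. Returns (num_patches, patch * patch)."""
    H, W = image.shape
    if H % patch or W % patch:
        raise ValueError("image size must be divisible by the patch size")
    return (image.reshape(H // patch, patch, W // patch, patch)
                 .transpose(0, 2, 1, 3)
                 .reshape(-1, patch * patch))

def embed(image, patch, dim, rng):
    """Build the encoder input: [class token; patch embeddings] + positions.

    All weights are random placeholders for parameters that would be learned.
    """
    tokens = patchify(image, patch) @ rng.normal(size=(patch * patch, dim))
    class_token = rng.normal(size=(1, dim))   # first element, later fed to the MLP head
    sequence = np.vstack([class_token, tokens])
    position = rng.normal(size=sequence.shape)
    return sequence + position
```

A 4 x 4 image with 2 x 2 patches gives four patch tokens plus the class token, so the encoder receives a sequence of five vectors.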

35. Advances in Vision-Based Gait Recognition: From Handcrafted to Deep Learning.

Author: Mogan, Jashila Nair, Lee, Chin Poo, and Lim, Kian Ming
Subjects: *DEEP learning, *BIOMETRY
Abstract: Identifying people's identity by using behavioral biometrics has attracted many researchers' attention in the biometrics industry. Gait is a behavioral trait, whereby an individual is identified based on their walking style. Over the years, gait recognition has been performed by using handcrafted approaches. However, due to several covariates' effects, the competence of the approach has been compromised. Deep learning is an emerging algorithm in the biometrics field, which has the capability to tackle the covariates and produce highly accurate results. In this paper, a comprehensive overview of the existing deep learning-based gait recognition approach is presented. In addition, a summary of the performance of the approach on different gait datasets is provided. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

36. Investigation of silicon-germanium fins fabricated using germanium condensation on vertical compliant structures.

Author: Tsung-Yang Liow, Kian-Ming Tan, Yee-Chia Yeo, Agarwal, Ajay, Du, Anyan, Chih-Hang Tung, and Balasubramanian, Narayanan
Subjects: *GERMANIUM crystals, *HETEROSTRUCTURES, *CRYSTALS, *IMAGE analysis, *TRANSMISSION electron microscopy, *DISLOCATIONS in crystals, *RELAXATION (Nuclear physics)
Abstract: We report the formation of defect-free SiGe vertical heterostructures using Ge condensation on vertical SiGe structures. To evaluate the effectiveness of substrate compliance in vertical structures, SiGe fins of various widths were subjected to Ge condensation. This formed vertical fin heterostructures comprising a SiGe core region sandwiched by Ge-rich regions. Using cross-sectional transmission electron microscopy (TEM), wide fins were found to contain more dislocations than narrower fins, in which we observed few or no dislocations. Lattice strain analysis using high-resolution TEM image analysis was used to confirm that strain relaxation has occurred. In the wide fins (noncompliant substrate), strain relaxation was dislocation mediated. In the narrow fins, substrate compliance enabled strain relaxation in the Ge-rich layer with reduced dislocation formation. Hence, we also demonstrated the formation of a strain-relaxed homogeneous SiGe fin (∼90% Ge concentration) with no observable dislocations. [ABSTRACT FROM AUTHOR]
Published: 2005
Full Text: View/download PDF

37. Text-to-image synthesis with self-supervised bi-stage generative adversarial network.

Author: Tan, Yong Xuan, Lee, Chin Poo, Neo, Mai, Lim, Kian Ming, and Lim, Jit Yan
Subjects: *GENERATIVE adversarial networks, *IMAGE registration, *SUPERVISED learning
Abstract: • Self-supervision is integrated into the bi-stage text-to-image synthesis model to provide more training sample variants. • Self-supervision is integrated to create multitask learning for optimizing the learned representation. • The L1 distance loss minimizes the distance between real and synthesized images to enhance visual realism. • The one-sided label smoothing alleviates the overconfidence of the discriminators, thus stabilizing the model training. • The feature matching mitigates mode collapse, thus diversifying the synthesized images and optimizing the model training. Text-to-image synthesis is challenging as generating images that are visually realistic and semantically consistent with the given text description involves multi-modal learning with text and image. To address the challenges, this paper presents a text-to-image synthesis model that utilizes self-supervision and bi-stage image distribution architecture, referred to as the Self-Supervised Bi-Stage Generative Adversarial Network (SSBi-GAN). The self-supervision diversifies the learned representation thus improving the quality of the synthesized images. Besides that, the bi-stage architecture with Residual network enables the generation of larger images with finer visual contents. Not only that, some enhancements including L1 distance, one-sided smoothing and feature matching are incorporated to enhance the visual realism and semantic consistency of the images as well as the training stability of the model. The empirical results on Oxford-102 and CUB datasets corroborate the ability of the proposed SSBi-GAN in generating visually realistic and semantically consistent images. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

38. Text-to-image synthesis with self-supervised learning.

Author: Tan, Yong Xuan, Lee, Chin Poo, Neo, Mai, and Lim, Kian Ming
Subjects: *GENERATIVE adversarial networks, *SUPERVISED learning, *IMAGE registration, *SIGNAL generators
Abstract: • Integrating self-supervised learning into the discriminator to improve the learned representation. • The self-supervised signal enables the generator to produce more diverse images. • Enhancing the training stability and image quality of generative adversarial nets. • A text-to-image synthesis framework with self-supervised learning. Text-to-image synthesis extracts the meaning from the text description and converts it into an image correspondingly. Text-to-image synthesis is widely leveraged in many applications, such as graphic design, image editing, etc. Text-to-image synthesis approaches are mainly built on the basis of generative adversarial networks. One of the main challenges in text-to-image synthesis is to generate images that are visually realistic. Not only that, the text-to-image synthesis model is inherently susceptible to overconfidence and training instability issues. To address these challenges, this paper proposes a self-supervised text-to-image synthesis with some enhancements, including self-supervised learning, feature matching, L1 distance loss, and one-sided label smoothing. The self-supervised learning offers more image variations thus improving the classification power of the discriminator. The feature matching and L1 distance functions motivate the generator to synthesize images that are visually more similar to the real images based on the given text description. The one-sided label smoothing adds a penalty value when the discriminator makes a correct classification to alleviate the overconfidence problem and to improve the training stability. The performance of the proposed self-supervised text-to-image synthesis is evaluated on the Oxford-102 and CUB datasets. The empirical results demonstrate that the proposed self-supervised text-to-image synthesis generates images with richer image content diversity, more visually realistic, and more semantically consistent with the given text description. The proposed self-supervised text-to-image synthesis also outshines the methods in comparison in terms of the inception score and Structural Similarity Index. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF
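
Note on the text-to-image abstract above: two of the enhancements it names, one-sided label smoothing and the L1 distance loss, have standard forms that can be sketched briefly. The smoothing target of 0.9 and both function names are assumptions; the abstract gives no numeric settings, and this is not the authors' training code.

```python
import numpy as np

def discriminator_loss(p_real, p_fake, smooth=0.9, eps=1e-12):
    """Binary cross-entropy with one-sided label smoothing.

    p_real, p_fake: discriminator probabilities that real / synthesized
    images are real. Only the real-image target is softened (1 -> smooth);
    the fake-image target stays at 0, hence "one-sided".
    """
    p_real = np.clip(np.asarray(p_real, dtype=float), eps, 1 - eps)
    p_fake = np.clip(np.asarray(p_fake, dtype=float), eps, 1 - eps)
    real_term = -(smooth * np.log(p_real) + (1 - smooth) * np.log(1 - p_real))
    fake_term = -np.log(1 - p_fake)
    return float(np.mean(real_term) + np.mean(fake_term))

def l1_distance(real, fake):
    """Mean absolute pixel difference between real and synthesized images."""
    real = np.asarray(real, dtype=float)
    fake = np.asarray(fake, dtype=float)
    return float(np.mean(np.abs(real - fake)))
```

With a smoothed target, a discriminator that outputs 0.999 for a real image incurs a higher loss than one that outputs 0.9, which is the penalty on overconfident correct classifications that the abstract refers to.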

39. The Role of Carbon and Dysprosium in Ni[Dy]Si:C Contacts for Schottky-Barrier Height Reduction and Application in N-Channel MOSFETs With Si:C Source/Drain Stressors.

Author: Tek Po Lee, Rinus, Tian-Yi Koh, Alvin, Kian-Ming Tan, Tsung-Yang Liow, Dong Zhi Chi, and Yee-Chia Yeo
Subjects: *CARBON, *DYSPROSIUM, *SILICIDES, *SCHOTTKY barrier diodes, *METAL semiconductor field-effect transistors
Abstract: We clarify the role of carbon and dysprosium in nickel-dysprosium-silicide (Ni[Dy]Si:C) contacts formed on silicon:carbon (Si1-yCy or Si:C) for Schottky-barrier height (SBH) reduction. Carbon-induced energy bandgap Eg narrowing and the segregation of dysprosium (Dy) at the Ni[Dy]Si:C/Si:C interface were shown to be responsible for SBH reduction in this paper. First, we show that electron barrier height (ΦBN) reduction of up to 69 meV (or 10.3%) for NiSi can be achieved with the scaling of substitutional carbon Csub concentration from 0% to 1.0%. Second, new evidence revealing the segregation of Dy-based interlayer at the Ni[Dy]Si:C/Si:C interface and an additional 321 meV (or 53%) reduction in ΦBN for NiSi:C are presented. This could be due to charge transfer at the Ni[Dy]Si:C/Si:C interface. The successful modulation of ΦBN for Ni[Dy]Si:C translates to an effective 41% reduction in device REXT, resulting in improved drive current performance. This opens new avenues to optimize the Si1-yCy contact interface for extending transistor performance in future technological generations. [ABSTRACT FROM AUTHOR]
Published: 2009
Full Text: View/download PDF

40. Magnetic resonance imaging of painful swollen legs in the emergency department: a pictorial essay.

Author: Chawla, Ashish, Dubey, Niraj, Chew, Kian Ming, Singh, Dinesh, Gaikwad, Vishal, and Peh, Wilfred C. G.
Subjects: *LEG pain, *NECROTIZING fasciitis, *LEG, *DIFFERENTIAL diagnosis, *EMERGENCY medical services, *DIAGNOSIS, *MAGNETIC resonance imaging, *THERAPEUTICS
Abstract: Patients presenting with a painful swollen leg are not infrequently encountered at the emergency department and can pose a diagnostic dilemma for attending physicians. The potential causes of leg pain and swelling include trauma, infection, inflammation, and neurogenic, vascular, and iatrogenic conditions; with magnetic resonance imaging (MRI) being an important tool in evaluation. We describe the MRI features of various conditions causing painful swollen legs. We also discuss the differential diagnosis and the useful clinical and laboratory findings that radiologists should be aware of, in order to arrive at an accurate diagnosis. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

41. Nonlinear mixed effects modelling for the analysis of longitudinal body core temperature data in healthy volunteers.

Author: Kok-Yong Seng, Ying Chen, Ting Wang, Adam Kian Ming Chai, David Chiok Yuen Fun, Ya Shi Teo, Pearl Min Sze Tan, Wee Hon Ang, and Jason Kai Wei Lee
Subjects: *VOLUNTEERS, *BODY temperature, *MEDICAL thermography, *PHYSICAL diagnosis, *PHYSIOLOGY
Abstract: Many longitudinal studies have collected serial body core temperature (Tc) data to understand thermal work strain of workers under various environmental and operational heat stress environments. This provides the opportunity for the development of mathematical models to analyse and forecast temporal Tc changes across populations of subjects. Such models can reduce the need for invasive methods that continuously measure Tc. This current work sought to develop a nonlinear mixed effects modelling framework to delineate the dynamic changes of Tc and its association with a set of covariates of interest (e.g. heart rate, chest skin temperature), and the structure of the variability of Tc in various longitudinal studies. Data to train and evaluate the model were derived from two laboratory investigations involving male soldiers who participated in either a 12 (N = 18) or 15 km (N = 16) foot march with varied clothing, load and heat acclimatisation status. Model qualification was conducted using nonparametric bootstrap and cross validation procedures. For cross validation, the trajectory of a new subject’s Tc was simulated via Bayesian maximum a posteriori estimation when using only the baseline Tc or using the baseline Tc as well as measured Tc at the end of every work (march) phase. The final model described Tc versus time profiles using a parametric function with its main parameters modelled as a sigmoid hyperbolic function of the load and/or chest skin temperature. Overall, Tc predictions corresponded well with the measured data (root mean square deviation: 0.16 °C), and compared favourably with those provided by two recently published Kalman filter models. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF
