Author: "Pilsung Kang" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Pilsung Kang"' showing total 166 results

1. Detection and Defense: Student-Teacher Network for Adversarial Robustness

Author: Kyoungchan Park and Pilsung Kang
Subjects: Adversarial attack, adversarial detection, adversarial defense, student-teacher network, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Defense against adversarial attacks is critical for the reliability and safety of deep neural networks (DNNs). Current state-of-the-art defense methods achieve significant robustness against adversarial attacks. However, such defense methods cannot distinguish between adversarial examples (AEs) and normal examples (NEs). Thus, they apply the same defense process for both examples to perform classification, resulting in performance degradation for NEs. In this paper, we propose a novel defense method based on the student-teacher framework that can minimize the classification performance degradation for NEs by detecting AEs and then applying the defense process only to AEs. Focusing on the fact that distortion in the hidden layer features is inevitable for the success of adversarial attacks, we train the student network to predict the undistorted hidden layer features of the teacher network (target DNN). Therefore, our method can detect AEs through the difference in the hidden layer features between the student and teacher network, and then recover the classification result of AEs using the penultimate layer features predicted by the student network. Through extensive experiments on representative image classification benchmark datasets, i.e., CIFAR-10, CIFAR-100, and TinyImagenet, we demonstrate the superiority of our method in both detection and defense compared with state-of-the-art methods. Furthermore, we show that our method achieves robust detection and defense performance for a fully white-box attack that assumes an attacker knows the information of our entire detection and defense mechanism.
Published: 2024
Full Text: View/download PDF

2. The complete mitochondrial genome of an Antarctic moss, Andreaea regularis Müll. Hal. 1890 (Andreaeaceae)

Author: Kyungwon Min, Syahril Sulaiman, Hyodong Lee, Pilsung Kang, Young-Jun Yoon, and Hyoungseok Lee
Subjects: andreaeales, bryophyte, mitogenome, phylogeny, rock moss, Genetics, QH426-470
Abstract: In the present study, we determined the complete mitochondrial genome of Andreaea regularis Müll. Hal. 1890, a lantern moss of the genus Andreaea Hedw. (Andreaeaceae). The A. regularis mitochondrial genome, with a total length of 118,833 bp, consists of 40 protein-coding genes, 3 ribosomal RNA genes, and 24 transfer RNA genes. A phylogenetic tree constructed with 19 complete mitochondrial genomes composed of liverworts, hornworts, and 15 mosses showed that Andreaeales formed the closest sister to Sphagnales before divergence of the remaining moss groups, indicating that A. regularis is one of the earliest mosses. Our findings could be useful for investigating bryophyte evolution.
Published: 2023
Full Text: View/download PDF

3. Text Embedding Augmentation Based on Retraining With Pseudo-Labeled Adversarial Embedding

Author: Myeongsup Kim and Pilsung Kang
Subjects: Text embedding augmentation, adversarial training, pseudo-label, generating, retraining, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Pre-trained language models (LMs) have been shown to achieve outstanding performance in various natural language processing tasks; however, these models have a significantly large number of parameters to handle large-scale text corpora during the pre-training process, and thus, they entail the risk of overfitting when fine-tuning for small task-oriented datasets is conducted. In this paper, we propose a text embedding augmentation method to prevent such overfitting. The proposed method applies augmentation to a text embedding by generating an adversarial embedding, which is not identical to the original input embedding but maintains its characteristics, using PGD-based adversarial training for input text data. A pseudo-label that is identical to the label of the input text is then assigned to the adversarial embedding, and the adversarial embedding and pseudo-label are used as the input embedding and label pair to retrain a separate LM. Experimental results on several text classification benchmark datasets demonstrated that the proposed method effectively prevented overfitting, which commonly occurs when adjusting a large-scale pre-trained LM to a specific task.
Published: 2022
Full Text: View/download PDF

4. AnoViT: Unsupervised Anomaly Detection and Localization With Vision Transformer-Based Encoder-Decoder

Author: Yunseung Lee and Pilsung Kang
Subjects: Anomaly detection, anomaly localization, vision transformer, MVTecAD, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Image anomaly detection problems aim to determine whether an image is abnormal, and to detect anomalous areas. These methods are actively used in various fields such as manufacturing, medical care, and intelligent information. Encoder-decoder structures have been widely used in the field of anomaly detection because they can easily learn normal patterns in an unsupervised learning environment and calculate a score to identify abnormalities through a reconstruction error indicating the difference between input and reconstructed images. Therefore, current image anomaly detection methods have commonly used convolutional encoder-decoders to extract normal information through the local features of images. However, they are limited in that only local features of the image can be utilized when constructing a normal representation owing to the characteristics of convolution operations using a filter of fixed size. Therefore, we propose a vision transformer-based encoder-decoder model, named AnoViT, designed to reflect normal information by additionally learning the global relationship between image patches, which is capable of both image anomaly detection and localization. While existing vision transformers perform image classification using only a class token, the proposed approach constructs a feature map that maintains the existing location information of individual patches by using the embeddings of all patches passed through multiple self-attention layers. Subsequently, the feature map, which has been transformed into three dimensions, is used to perform decoding. This design preserves the spatial information sufficiently by excluding the fully-connected layer, which extracts latent vectors in existing convolution-based encoder-decoders. The proposed AnoViT model performed better than the convolution-based model on three benchmark datasets. In MVTecAD, which is a representative benchmark dataset for anomaly localization, it showed improved results on 10 out of 15 classes compared with the baseline. Furthermore, the proposed method showed good performance regardless of the class and type of the anomalous area when localization results were evaluated qualitatively.
Published: 2022
Full Text: View/download PDF

5. De Novo Transcriptome Assembly and Comparative Analysis of Differentially Expressed Genes Involved in Cold Acclimation and Freezing Tolerance of the Arctic Moss Aulacomnium turgidum (Wahlenb.) Schwaegr

Author: Pilsung Kang, Yo-Han Yoo, Dong-Il Kim, Joung Han Yim, and Hyoungseok Lee
Subjects: Aulacomnium turgidum, cold acclimation, de novo assembly, freezing stress, RNA-seq, Botany, QK1-989
Abstract: Cold acclimation refers to a phenomenon in which plants become more tolerant to freezing after exposure to non-lethal low temperatures. Aulacomnium turgidum (Wahlenb.) Schwaegr is a moss found in the Arctic that can be used to study the freezing tolerance of bryophytes. To improve our understanding of the cold acclimation effect on the freezing tolerance of A. turgidum, we compared the electrolyte leakage of protonema grown at 25 °C (non-acclimation; NA) and at 4 °C (cold acclimation; CA). Freezing damage was significantly lower in CA plants frozen at −12 °C (CA-12) than in NA plants frozen at −12 °C (NA-12). During recovery at 25 °C, CA-12 demonstrated a more rapid and greater level of the maximum photochemical efficiency of photosystem II than NA-12, indicating a greater recovery capacity for CA-12 compared to NA-12. For the comparative analysis of the transcriptome between NA-12 and CA-12, six cDNA libraries were constructed in triplicate, and RNA-seq reads were assembled into 45,796 unigenes. The differential gene expression analysis showed that a significant number of AP2 transcription factor genes and pentatricopeptide repeat protein-coding genes related to abiotic stress and the sugar metabolism pathway were upregulated in CA-12. Furthermore, starch and maltose concentrations increased in CA-12, suggesting that cold acclimation increases freezing tolerance and protects photosynthetic efficiency through the accumulation of starch and maltose in A. turgidum. A de novo assembled transcriptome can be used to explore genetic sources in non-model organisms.
Published: 2023
Full Text: View/download PDF

6. Learning-Free Unsupervised Extractive Summarization Model

Author: Myeongjun Jang and Pilsung Kang
Subjects: Text summarization, natural language processing, sentence representation vector, integer linear programming, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Text summarization is an information condensation technique that abbreviates a source document to a few representative sentences with the intention of creating a coherent summary containing relevant information of source corpora. This promising subject has been rapidly developed since the advent of deep learning. However, summarization models based on deep neural networks have several critical shortcomings. First, a large amount of labeled training data is necessary. This problem is common for low-resource languages in which publicly available labeled data do not exist. In addition, a significant amount of computational ability is required to train neural models with enormous network parameters. In this study, we propose a model called Learning Free Integer Programming Summarizer (LFIP-SUM), which is an unsupervised extractive summarization model. The advantage of our approach is that parameter training is unnecessary because the model does not require any labeled training data. To achieve this, we formulate an integer programming problem based on pre-trained sentence embedding vectors. We also use principal component analysis to automatically determine the number of sentences to be extracted and to evaluate the importance of each sentence. Experimental results demonstrate that the proposed model exhibits generally acceptable performance compared with deep learning summarization models although it does not learn any parameters during the model construction process.
Published: 2021
Full Text: View/download PDF

7. Lifelong Language Learning With the Most Forgotten Knowledge

Author: Heejeong Choi and Pilsung Kang
Subjects: Lifelong language learning, natural language processing, catastrophic forgetting, a stream of text data, generative replay, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Lifelong language learning enables a language model to accumulate knowledge through training on a stream of text data. Recent research on lifelong language learning is based on samples of previous tasks from an episodic memory or generative model. LAMOL, a representative generative model-based lifelong language learning model, preserves the previous information with the generated pseudo-old samples, which are suboptimal. In this paper, we propose an improved version of LAMOL, MFK-LAMOL, which constructs a generative replay using a more effective method. When a new task is received, MFK-LAMOL replays sufficient previous data and retrieves important examples for training alongside the new task. Specifically, it selects the examples with the most forgotten knowledge learned from previous tasks based on the extent to which they include knowledge that has been forgotten after learning new information. We showed that the proposed method outperforms LAMOL on a stream of three different natural language processing tasks.
Published: 2021
Full Text: View/download PDF

8. Intrusion Detection Based on Sequential Information Preserving Log Embedding Methods and Anomaly Detection Algorithms

Author: Czangyeob Kim, Myeongjun Jang, Seungwan Seo, Kyeongchan Park, and Pilsung Kang
Subjects: System anomaly detection, cyber security, system log embedding, advanced persistent threat, ADFA-LD, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Previous methods for system intrusion detection have mainly relied on pattern matching that employs prior knowledge extracted from experts’ domain knowledge. However, pattern matching-based methods have a major drawback: they can be bypassed through various modified techniques. Advanced persistent threats limit pattern matching-based detection mechanisms because they are not only more sophisticated than usual threats but also specialized for the targeted object. A defense mechanism must comprehend unusual phenomena or behaviors to successfully handle such advanced threats. To achieve this, various security techniques based on machine learning have been developed recently. Among these, anomaly detection algorithms, which are trained in an unsupervised fashion, are capable of reducing the efforts of security experts and securing labeled datasets through post analysis. It is further possible to distinguish abnormal behaviors more precisely by training classification models if a sufficient amount of labeled data is obtained through post analysis of anomaly detection results. In this study, we proposed an end-to-end abnormal behavior detection method based on sequential information preserving log embedding algorithms and machine learning-based anomaly detection algorithms. Contrary to other machine learning-based system anomaly detection models, which borrow domain experts’ knowledge to extract significant features from the log data, raw log data are transformed into continuous vectors of a fixed size regardless of their length, and these vectors are used to train the anomaly detection models. In experiments on a real system call trace dataset, our proposed log embedding method with an unsupervised anomaly detection model yielded favorable performance, at most 0.8708 in terms of AUROC, and it can be further improved up to 0.9745 with supervised classification algorithms if sufficient labeled attack log data become available.
Published: 2021
Full Text: View/download PDF

9. Comparative Study of Deep Learning-Based Sentiment Classification

Author: Seungwan Seo, Czangyeob Kim, Haedong Kim, Kyounghyun Mo, and Pilsung Kang
Subjects: Sentiment classification, deep learning, convolutional neural network, recurrent neural network, word embedding, character embedding, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The purpose of sentiment classification is to determine whether a particular document has a positive or negative nuance. Sentiment classification is extensively used in many business domains to improve products or services by understanding the opinions of customers regarding these products. Deep learning achieves state-of-the-art results in various challenging domains. With the success of deep learning, many studies have proposed deep-learning-based sentiment classification models and achieved better performances compared with conventional machine learning models. However, one practical issue occurring in deep-learning-based sentiment classification is that the best model structure depends on the characteristics of the dataset on which the deep learning model is trained; moreover, it is manually determined based on the domain knowledge of an expert or selected from a grid search of possible candidates. Herein, we present a comparative study of different deep-learning-based sentiment classification model structures to derive meaningful implications for building sentiment classification models. Specifically, eight deep-learning models, three based on convolutional neural networks and five based on recurrent neural networks, with two types of input structures, i.e., word level and character level, are compared for 13 review datasets, and the classification performances are discussed under different perspectives.
Published: 2020
Full Text: View/download PDF

10. A Taste of Scientific Computing on the GPU-Accelerated Edge Device

Author: Pilsung Kang and Sungmin Lim
Subjects: Edge computing, scientific computing, GPU (graphics processing unit), Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: The computing power provided by edge devices has been steadily improving, largely due to technological developments that realize accelerators with a great amount of hardware parallelism. Thanks to the enhanced computing power, the application domain of the edge device is greatly expanding so that it is capable of performing new tasks previously inconceivable. In this paper, we investigate the potential of state-of-the-art edge devices in the context of scientific computing. We implement and perform stochastic biochemical simulation on a small cluster of modern, GPU-accelerated edge systems to examine the computational capability of the device. In addition, we perform and analyze the NAS Parallel Benchmarks (NPB) performance on the edge device. By comparing the performance of the edge device with other GPU (graphics processing unit) and CPU based systems across different platforms, we evaluate the applicability and usefulness of modern hardware-accelerated edge devices in the scientific computing domain.
Published: 2020
Full Text: View/download PDF

11. Programming for High-Performance Computing on Edge Accelerators

Author: Pilsung Kang
Subjects: edge computing, parallel systems, high-performance computing, GPU (Graphics Processing Unit), accelerators, programming model, Mathematics, QA1-939
Abstract: The field of edge computing has grown considerably over the past few years, with applications in artificial intelligence and big data processing, particularly due to its powerful accelerators offering a large amount of hardware parallelism. As the computing power of the latest edge systems increases, applications of edge computing are being expanded to areas that have traditionally required substantially high-performant computing resources such as scientific computing. In this paper, we review the latest literature and present the current status of research for implementing high-performance computing (HPC) on edge devices equipped with parallel accelerators, focusing on software environments including programming models and benchmark methods. We also examine the applicability of existing approaches and discuss possible improvements necessary towards realizing HPC on modern edge systems.
Published: 2023
Full Text: View/download PDF

12. An Evaluation of Modern Accelerator-Based Edge Devices for Object Detection Applications

Author: Pilsung Kang and Athip Somtham
Subjects: edge computing, object detection, GPU (graphics processing unit), TPU (tensor processing unit), deep learning, Mathematics, QA1-939
Abstract: Edge AI is one of the newly emerged application domains where networked IoT (Internet of Things) devices are deployed to perform AI computations at the edge of the cloud environments. Today’s edge devices are typically equipped with powerful accelerators within their architecture to efficiently process the vast amount of data generated in place. In this paper, we evaluate major state-of-the-art edge devices in the context of object detection, which is one of the principal applications of modern AI technology. For our evaluation study, we choose recent devices with different accelerators to compare performance behavior depending on different architectural characteristics. The accelerators studied in this work include the GPU and the edge version of the TPU, and these accelerators can be used to boost the performance of deep learning operations. By performing a set of major object detection neural network benchmarks on the devices and by analyzing their performance behavior, we assess the effectiveness and capability of modern edge devices accelerated by powerful parallel hardware. Based on the benchmark results from the perspectives of detection accuracy, inference latency, and energy efficiency, we provide an up-to-date comparative evaluation of major modern edge devices in the context of the object detection application of AI technology.
Published: 2022
Full Text: View/download PDF

13. Supervised Paragraph Vector: Distributed Representations of Words, Documents and Class Labels

Author: Eunjeong L. Park, Sungzoon Cho, and Pilsung Kang
Subjects: Class label, distributed representations, representation learning, document embedding, word embedding, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: While the traditional method of deriving representations for documents was bag-of-words, it suffered from high dimensionality and sparsity. Recently, many methods to obtain lower dimensional and densely distributed representations were proposed. Paragraph Vector is one such algorithm, which extends the word2vec algorithm by considering the paragraph as an additional word. However, it generates a single representation for all tasks, while different tasks may require different representations. In this paper, we propose a Supervised Paragraph Vector, a task-specific variant of Paragraph Vector for situations where class labels exist. Essentially, Supervised Paragraph Vector uses class labels along with words and documents and obtains corresponding representations with respect to the particular classification task. In order to prove the benefits of the proposed algorithm, three performance criteria are used: interpretability, discriminative power, and computational efficiency. To test interpretability, we find words that are close to and far from class vectors and demonstrate that such words are closely related to the corresponding class. We also use principal component analysis to visualize all words, documents, and class labels at the same time and show that our method effectively displays the related words and documents for each class label. To evaluate discriminative power and computational efficiency, we perform document classification on four commonly used datasets with various classifiers and achieve comparable classification accuracies to bag-of-words and Paragraph Vector.
Published: 2019
Full Text: View/download PDF

14. Machine Learning Classification of First-Onset Drug-Naive MDD Using Structural MRI

Author: Donghwa Kim, Pilsung Kang, Junhong Kim, Czang Yeob Kim, Jong-Ha Lee, Sangil Suh, and Moon-Soo Lee
Subjects: Adolescent, depression, diagnosis, machine learning, neuroimaging, support vector machine, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: It is hard to differentiate adolescent Major Depressive Disorder (MDD) patients from healthy adolescent controls based on structural MRI research findings, as the clinical characteristics of the patient group are heterogeneous, and the neuroimaging study results are ambiguous. We aimed to determine whether it is possible to reliably train a highly accurate predictive classification algorithm, even with the first onset of drug-naive adolescent MDD, solely using structural magnetic resonance imaging and without using any other clinical data from the patients. We also estimated the probability of the subject belonging to the predicted class to quantify the confidence of the prediction. Medication-naive adolescent patients in their first episode of MDD and healthy volunteers, matched for age, sex, and years of education, were prospectively recruited. Twenty-seven patients and 27 controls participated in the study. The two most significant variables were the standard deviations of intensity of the right ventral diencephalon and thickness of the superior segment of the circular sulcus of the insula. A participant is diagnosed as having MDD when the variation of either intensity in the right ventral diencephalon region or thickness of the superior segment of the circular sulcus of the insula increases. Structural brain changes can be used to build an accurate classification model for machine learning, even when the duration of illness is relatively short and the influence of MDD on the brain structure is minimal.
Published: 2019
Full Text: View/download PDF

15. Draw-a-Deep Pattern: Drawing Pattern-Based Smartphone User Authentication Based on Temporal Convolutional Neural Network

Author: Junhong Kim and Pilsung Kang
Subjects: mobile user authentication, behavioral biometrics, temporal convolution neural network, recurrent neural network, sequence modeling, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: Present-day smartphones provide various conveniences, owing to high-end hardware specifications and advanced network technology. Consequently, people rely heavily on smartphones for a myriad of daily-life tasks, such as work scheduling, financial transactions, and social networking, which require a strong and robust user authentication mechanism to protect personal data and privacy. In this study, we propose draw-a-deep-pattern (DDP), a deep learning-based end-to-end smartphone user authentication method using sequential data obtained from drawing a character or freestyle pattern on the smartphone touchscreen. In our model, a recurrent neural network (RNN) and a temporal convolution neural network (TCN), both of which are specialized in sequential data processing, are employed. The main advantages of the proposed DDP are (1) it is robust to the threats to which current authentication systems are vulnerable, e.g., shoulder surfing attack and smudge attack, and (2) it requires few parameters for training; therefore, the model can be consistently updated in real-time, whenever new training data are available. To verify the performance of the DDP model, we collected data from 40 participants in one of the most unfavorable environments possible, wherein all potential intruders know how the authorized users draw the characters or symbols (shape, direction, stroke, etc.) of the drawing pattern used for authentication. Of the two proposed DDP models, the TCN-based model yielded excellent authentication performance with average values of 0.99%, 1.41%, and 1.23% in terms of AUROC, FAR, and FRR, respectively. Furthermore, this model exhibited improved authentication performance and higher computational efficiency than the RNN-based model in most cases. To contribute to the research/industrial communities, we made our dataset publicly available, thereby allowing anyone studying or developing a behavioral biometric-based user authentication system to use our data without any restrictions.
Published: 2022
Full Text: View/download PDF

16. High-resolution adaptive optical imaging within thick scattering media using closed-loop accumulation of single scattering

Author: Sungsam Kang, Pilsung Kang, Seungwon Jeong, Yongwoo Kwon, Taeseok D. Yang, Jin Hee Hong, Moonseok Kim, Kyung–Deok Song, Jin Hyoung Park, Jun Ho Lee, Myoung Joon Kim, Ki Hean Kim, and Wonshik Choi
Subjects: Science
Abstract: Optical imaging deep in biological tissue is difficult due to multiple scattering and specimen induced aberrations of both the incident and reflected light. Here, Kang et al. develop an adaptive closed-loop algorithm to correct tissue aberrations in the presence of multiple scattering for deep tissue imaging.
Published: 2017
Full Text: View/download PDF

17. Solving the Cold-Start Problem in Short-Term Load Forecasting Using Tree-Based Methods

Author: Jihoon Moon, Junhong Kim, Pilsung Kang, and Eenjun Hwang
Subjects: short-term load forecasting, building electric energy consumption forecasting, cold-start problem, transfer learning, multivariate random forests, random forest, Technology
Abstract: An energy-management system requires accurate prediction of the electric load for optimal energy management. However, if the amount of electric load data is insufficient, it is challenging to perform an accurate prediction. To address this issue, we propose a novel electric load forecasting scheme using the electric load data of diverse buildings. We first divide the electric energy consumption data into training and test sets. Then, we construct multivariate random forest (MRF)-based forecasting models according to each building except the target building in the training set and a random forest (RF)-based forecasting model using the limited electric load data of the target building in the test set. In the test set, we compare the electric load of the target building with that of other buildings to select the MRF model that is the most similar to the target building. Then, we predict the electric load of the target building using its input variables via the selected MRF model. We combine the MRF and RF models by considering the different electric load patterns on weekdays and holidays. Experimental results demonstrate that combining the two models can achieve satisfactory prediction performance even if the electric data of only one day are available for the target building.
Published: 2020
Full Text: View/download PDF

18. The complete mitogenome of the Arctic moss Aulacomnium turgidum (Wahlenb.) Schwaegr

Author: Pilsung Kang, Sung Mi Cho, Jungeun Lee, Joung Han Yim, Jun Hyuck Lee, and Hyoungseok Lee
Subjects: arctic moss, aulacomnium turgidum, bryophyta, rhizogoniales, Genetics, QH426-470
Abstract: The Arctic moss Aulacomnium turgidum (Wahlenb.) Schwaegr. is distributed widely above the Arctic Circle and can regenerate successfully after 400 years of ice entombment. Here, we report the complete mitogenome sequence of A. turgidum (103,937 bp). The genome contains 3 ribosomal RNAs, 24 transfer RNAs, and 40 protein-encoding genes. In a phylogenetic tree generated using the combined amino acid sequences of 32 mitochondrial genes from A. turgidum, 25 Bryophyta, and three Marchantiophyta, the phylogenetic position of A. turgidum (Rhizogoniales) is close to that of the Hypnales and Ptychomniales, forming a monophyletic clade with perfect supporting values.
Published: 2019
Full Text: View/download PDF

19. Insider Threat Detection Based on User Behavior Modeling and Anomaly Detection Algorithms

Author: Junhong Kim, Minsik Park, Haedong Kim, Suhyoun Cho, and Pilsung Kang
Subjects: insider threat detection, anomaly detection, machine learning, behavioral model, latent dirichlet allocation, e-mail network, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: Insider threats are malicious activities by authorized users, such as theft of intellectual property or security information, fraud, and sabotage. Although the number of insider threats is much lower than external network attacks, insider threats can cause extensive damage. As insiders are very familiar with an organization’s system, it is very difficult to detect their malicious behavior. Traditional insider-threat detection methods focus on rule-based approaches built by domain experts, but they are neither flexible nor robust. In this paper, we propose insider-threat detection methods based on user behavior modeling and anomaly detection algorithms. Based on user log data, we constructed three types of datasets: user’s daily activity summary, e-mail contents topic distribution, and user’s weekly e-mail communication history. Then, we applied four anomaly detection algorithms and their combinations to detect malicious activities. Experimental results indicate that the proposed framework can work well for imbalanced datasets in which there are only a few insider threats and where no domain experts’ knowledge is provided.
Published: 2019
Full Text: View/download PDF

20. Bin2Vec: A Better Wafer Bin Map Coloring Scheme for Comprehensible Visualization and Effective Bad Wafer Classification

Author: Junhong Kim, Hyungseok Kim, Jaesun Park, Kyounghyun Mo, and Pilsung Kang
Subjects: wafer bin map (WBM), Bin2Vec, Word2Vec, bad wafer classification, convolution neural network, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: A wafer bin map (WBM), which is the result of an electrical die-sorting test, provides information on which bins failed which tests, and plays an important role in finding defective wafer patterns in semiconductor manufacturing. Current wafer inspection based on WBM has two problems: good/bad WBM classification is performed by engineers and the bin code coloring scheme does not reflect the relationship between bin codes. To solve these problems, we propose a neural network-based bin coloring method called Bin2Vec so that similar bin codes are represented by similar colors. We also build a convolutional neural network-based WBM classification model to reduce the variations in the decisions made by engineers with different expertise by learning the company-wide historical WBM classification results. Based on a real dataset with a total of 27,701 WBMs, our WBM classification model significantly outperformed benchmarked machine learning models. In addition, the visualization results of the proposed Bin2Vec method make it easier to discover meaningful WBM patterns compared with the random RGB coloring scheme. We expect the proposed framework to improve both efficiency, by automating the bad wafer classification process, and effectiveness, by assigning similar bin codes and their corresponding colors on the WBM.
Published: 2019
Full Text: View/download PDF

21. Technical Report of NICE Challenge at CVPR 2024: Caption Re-ranking Evaluation Using Ensembled CLIP and Consensus Scores.

Author: Kiyoon Jeong, Woojun Lee, Woongchan Nam, Minjeong Ma, and Pilsung Kang 0001
Published: 2024
Full Text: View/download PDF

22. Boosting Prompt-Based Self-Training With Mapping-Free Automatic Verbalizer for Multi-Class Classification.

Author: Yookyung Kho, Jaehee Kim, and Pilsung Kang 0001
Published: 2023
Full Text: View/download PDF

23. Painsight: An Extendable Opinion Mining Framework for Detecting Pain Points Based on Online Customer Reviews.

Author: Yukyung Lee, Jaehee Kim, Doyoon Kim, Yookyung Kho, Younsun Kim, and Pilsung Kang 0001
Published: 2023
Full Text: View/download PDF

24. Which is better? Exploring Prompting Strategy For LLM-based Metrics.

Author: Joonghoon Kim, Sangmin Lee, Seung Hun Han, Saeran Park, Jiyoon Lee, Kiyoon Jeong, and Pilsung Kang 0001
Published: 2023
Full Text: View/download PDF

25. K-Wav2vec 2.0: Automatic Speech Recognition based on Joint Decoding of Graphemes and Syllables.

Author: Jounghee Kim and Pilsung Kang 0001
Published: 2022
Full Text: View/download PDF

26. An Active Reference Reset Method Adapting Distribution Shift for Robust System Anomaly Detection.

Author: Seungwan Seo, Heejeong Choi, and Pilsung Kang 0001
Published: 2022
Full Text: View/download PDF

27. Mismatch between Multi-turn Dialogue and its Evaluation Metric in Dialogue State Tracking.

Author: Takyoung Kim, Hoonsang Yoon, Yukyung Lee, Pilsung Kang 0001, and Misuk Kim
Published: 2022
Full Text: View/download PDF

28. Multi2OIE: Multilingual Open Information Extraction based on Multi-Head Attention with BERT.

Author: Youngbin Ro, Yukyung Lee, and Pilsung Kang 0001
Published: 2020
Full Text: View/download PDF

29. Benchmarking GPU-Accelerated Edge Devices.

Author: Jongmin Jo, Sucheol Jeong, and Pilsung Kang 0002
Published: 2020
Full Text: View/download PDF

30. Implementing Scientific Simulations on GPU-accelerated Edge Devices.

Author: Sungmin Lim and Pilsung Kang 0002
Published: 2020
Full Text: View/download PDF

31. Anomaly Detection of Air Dryer for Radar based on Machine Learning Algorithms

Author: Yojin Kim, Pilsung Kang, Dong-Won Yeom, and Kun-Woo Kim
Published: 2023

32. Machine Learning Model-based Faulty Wafer Classification and Test Item Reduction

Author: Hoyeong Kim and Pilsung Kang
Subjects: General Earth and Planetary Sciences, General Environmental Science
Published: 2022

33. Cross-modal distillation with audio–text fusion for fine-grained emotion classification using BERT and Wav2vec 2.0

Author: Donghwa Kim and Pilsung Kang
Subjects: Artificial Intelligence, Cognitive Neuroscience, Computer Science Applications
Published: 2022

34. Purification and characterization of a new cold-active cellulolytic enzyme produced by Pseudoalteromonas sp. ArcC09 from the Arctic Beaufort Sea

Author: Min Ju Kim, Ha Ju Park, Pilsung Kang, Il Chan Kim, Joung Han Yim, and Se Jong Han
Subjects: Environmental Engineering, Bioengineering, Waste Management and Disposal
Abstract: A cold-active endoglucanase-producing bacterium was isolated from the Beaufort Sea of the Arctic Ocean and identified as Pseudoalteromonas sp. ArcC09. Cellulolytic activity of ArcC09 reached a maximum of 60 U/mg when cultivated in ZoBell medium for 72 h at 15 °C. This purified endoglucanase, with a molecular mass of 28 kDa, exhibited maximum activity at pH 7.0 and 55 °C. The ArcC09 endoglucanase exhibited 10% and 36% of its maximal activity even at low temperatures of 5 °C and 15 °C, respectively. However, it showed lower thermal stability than a mesophilic cellulase, which is characteristic of a psychrophilic enzyme. The activity was inhibited by CuSO4 and linear alkylbenzene sulfonate (LAS). These findings supplement the understanding of cold-active endoglucanases and may have commercial applications in enzymatic digestion of cellulosic biomass to fermentable sugars.
Published: 2022

35. KoBERTSEG: Local Context Based Topic Segmentation Using KoBERT

Author: Kyoosung So, Yunseung Lee, Euisuk Chung, and Pilsung Kang
Subjects: General Earth and Planetary Sciences, General Environmental Science
Published: 2022

36. Text-to-SQL for Korean Language based on Multilingual BERT

Author: Hoonsang Yoon, JaeHyuk Heo, Jeong Sub Kim, and Pilsung Kang
Subjects: General Earth and Planetary Sciences, General Environmental Science
Published: 2022

37. Building an Integrated Framework of Korean Text Summarization and Text-to-Speech

Author: Takyoung Kim, Jina Kim, Hyeongwon Kang, Subin Kim, and Pilsung Kang
Subjects: General Earth and Planetary Sciences, General Environmental Science
Published: 2022

38. Why Do Vision Transformers Have Better Adversarial Robustness than CNNs?

Author: Jaehyuk Heo, Seungwan Seo, and Pilsung Kang
Published: 2023

39. Multi-modal Korean Emotion Recognition with Consistency Regularization

Author: Jounghee Kim and Pilsung Kang
Subjects: General Earth and Planetary Sciences, General Environmental Science
Published: 2021

40. Dynamic Clustering for Wafer Map Patterns Using Self-Supervised Learning on Convolutional Autoencoders

Author: Pilsung Kang and Donghwa Kim
Subjects: Hyperparameter, business.industry, Computer science, Semiconductor device fabrication, Feature extraction, Pattern recognition, Condensed Matter Physics, Industrial and Manufacturing Engineering, Electronic, Optical and Magnetic Materials, Visualization, Data modeling, ComputingMethodologies_PATTERNRECOGNITION, Robustness (computer science), Benchmark (computing), Artificial intelligence, Electrical and Electronic Engineering, business, Cluster analysis
Abstract: Defect pattern analysis in wafer bin maps (WBM) plays a significant role in the semiconductor manufacturing process because it helps identify problematic steps or equipment so that process engineers can take appropriate actions to improve the overall yield. Clustering algorithms have been widely used to detect different defect patterns. However, most clustering algorithms, such as K-means clustering and self-organizing maps, require the number of clusters to be determined in advance. To resolve this issue, we propose a self-supervised learning-based dynamic WBM clustering method. The proposed model first uses pseudo-labeled data, whose labels are dynamically determined by the Dirichlet process mixture model (DPMM). Thereafter, it is fine-tuned using pseudo-labels in a self-supervised manner. Experimental results based on the WM-811K dataset indicate that the proposed model not only outperforms the benchmark models but also demonstrates robustness to hyperparameters. In addition, the defect patterns identified by our model are more accurately and distinctively localized than those identified by the benchmark models.
Published: 2021

41. Unsupervised Anomaly Detection with Wider and Deeper LSTM-GAN for Energy Consumption Pattern

Author: Hye-Yeon Kim, Pilsung Kang, and Hyeong-Suk Kim
Subjects: Computer science, business.industry, General Earth and Planetary Sciences, Pattern recognition, Anomaly detection, Energy consumption, Artificial intelligence, business, General Environmental Science
Published: 2021

42. Maintainable and reusable scientific software adaptation: democratizing scientific software adaptation.

Author: Pilsung Kang 0002, Eli Tilevich, Srinidhi Varadarajan, and Naren Ramakrishnan
Published: 2011
Full Text: View/download PDF

43. K-Means Clustering Seeds Initialization Based on Centrality, Sparsity, and Isotropy.

Author: Pilsung Kang 0001 and Sungzoon Cho
Published: 2009
Full Text: View/download PDF

44. Modular, Fine-Grained Adaptation of Parallel Programs.

Author: Pilsung Kang 0002, Naresh K. C. Selvarasu, Naren Ramakrishnan, Calvin J. Ribbens, Danesh K. Tafti, and Srinidhi Varadarajan
Published: 2009
Full Text: View/download PDF

45. Modular implementation of adaptive decisions in stochastic simulations.

Author: Pilsung Kang 0002, Yang Cao 0001, Naren Ramakrishnan, Calvin J. Ribbens, and Srinidhi Varadarajan
Published: 2009
Full Text: View/download PDF

46. Improving Korean Emotion Classification via Colloquial-Adaptive Pretraining

Author: Donghwa Kim, Junghoon Lee, Pilsung Kang, and Youngbin Ro
Subjects: Emotion classification, General Earth and Planetary Sciences, Psychology, General Environmental Science, Cognitive psychology
Published: 2021

47. The Adaptive Code Kitchen: Flexible Tools for Dynamic Application Composition.

Author: Pilsung Kang 0002, Mike Heffner, Joy Mukherjee, Naren Ramakrishnan, Srinidhi Varadarajan, Calvin J. Ribbens, and Danesh K. Tafti
Published: 2007
Full Text: View/download PDF

48. Continual Retraining of Keystroke Dynamics Based Authenticator.

Author: Pilsung Kang 0001, Seongseob Hwang, and Sungzoon Cho
Published: 2007
Full Text: View/download PDF

49. EUS SVMs: Ensemble of Under-Sampled SVMs for Data Imbalance Problems.

Author: Pilsung Kang 0001 and Sungzoon Cho
Published: 2006
Full Text: View/download PDF

50. Benchmarking Modern Edge Devices for AI Applications

Author: Jongmin Jo and Pilsung Kang
Subjects: Edge device, business.industry, Computer science, Deep learning, Benchmarking, Computer architecture, Artificial Intelligence, Hardware and Architecture, Computer Vision and Pattern Recognition, Applications of artificial intelligence, Artificial intelligence, Electrical and Electronic Engineering, business, Software, Edge computing
Published: 2021
