Author: "Chang Su Kim" / Topic: artificial intelligence - Searchworks@Jio Institute Digital Library Search Results

1. Monocular Human Depth Estimation Via Pose Estimation

Author: Jae-Han Lee, Jinyoung Jun, Chul Lee, and Chang-Su Kim
Subjects: Monocular, General Computer Science, Mean squared error, Computer science, business.industry, Feature extraction, General Engineering, human pose estimation, human depth estimation, Estimator, Pattern recognition, TK1-9971, Feature (computer vision), Depth map, loss rebalancing strategy, General Materials Science, Artificial intelligence, Electrical engineering. Electronics. Nuclear engineering, business, Pose, Block (data storage), Monocular depth estimation
Abstract: We propose a novel monocular depth estimator, which improves the prediction accuracy on human regions by utilizing pose information. The proposed algorithm consists of two networks — PoseNet and DepthNet — to estimate keypoint heatmaps and a depth map, respectively. We incorporate the pose information from PoseNet to improve the depth estimation performance of DepthNet. Specifically, we develop the feature blending block, which fuses the features from PoseNet and DepthNet and feeds them into the next layer of DepthNet, to make the networks learn to predict the depths of human regions more accurately. Furthermore, we develop a novel joint training scheme using partially labeled datasets, which balances multiple loss functions effectively by adjusting weights. Experimental results demonstrate that the proposed algorithm can improve depth estimation performance significantly, especially around human regions. For example, the proposed algorithm improves the depth estimation performance on the human regions of ResNet-50 by 2.8% and 7.0% in terms of $\delta _{1}$ and RMSE, respectively, on the proposed HD + P dataset.
Published: 2021
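
The abstract above reports gains in $\delta _{1}$ and RMSE. As a minimal sketch of how these two depth metrics are commonly computed (the 1.25 threshold is the usual convention and is assumed here, not stated in the record; the function name and sample depths are hypothetical):

```python
import math

def depth_metrics(pred, gt, threshold=1.25):
    """Return (delta_1, rmse) over paired positive depth values."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    # delta_1: fraction of pixels whose ratio max(p/g, g/p) is below the threshold
    inliers = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < threshold)
    delta_1 = inliers / len(pred)
    # RMSE: square root of the mean squared depth error
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(pred))
    return delta_1, rmse

# Hypothetical depths in meters for four pixels (illustration only)
pred = [1.0, 2.0, 3.0, 5.0]
gt = [1.0, 2.2, 4.0, 5.0]
delta_1, rmse = depth_metrics(pred, gt)
```

A higher $\delta _{1}$ and a lower RMSE both indicate better depth estimates, which is why the abstract reports improvements in both.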

2. Light Field Super-Resolution via Adaptive Feature Remixing

Author: Yeong Jun Koh, Soonkeun Chang, Keunsoo Ko, and Chang-Su Kim
Subjects: business.industry, Computer science, Feature extraction, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, Iterative reconstruction, Computer Graphics and Computer-Aided Design, Convolution, Feature (computer vision), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer vision, Angular resolution, Artificial intelligence, business, Image resolution, Software, Light field, Interpolation
Abstract: A novel light field super-resolution algorithm to improve the spatial and angular resolutions of light field images is proposed in this work. We develop spatial and angular super-resolution (SR) networks, which can faithfully interpolate images in the spatial and angular domains regardless of the angular coordinates. For each input image, we feed adjacent images into the SR networks to extract multi-view features using a trainable disparity estimator. We concatenate the multi-view features and remix them through the proposed adaptive feature remixing (AFR) module, which performs channel-wise pooling. Finally, the remixed feature is used to augment the spatial or angular resolution. Experimental results demonstrate that the proposed algorithm outperforms the state-of-the-art algorithms on various light field datasets. The source codes and pre-trained models are available at https://github.com/keunsoo-ko/LFSR-AFR.
Published: 2021
Full Text: View/download PDF

3. Multi-Scale Warping for Video Frame Interpolation

Author: Whan Choi, Yeong Jun Koh, and Chang-Su Kim
Subjects: Video frame interpolation, General Computer Science, Computer science, business.industry, Feature extraction, Frame (networking), General Engineering, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, convolutional neural network, kernel-based approach, multi-scale feature, deformable convolution, TK1-9971, Kernel (image processing), Benchmark (computing), General Materials Science, Computer vision, adaptive convolution, Artificial intelligence, Electrical engineering. Electronics. Nuclear engineering, Image warping, Motion interpolation, business, Encoder, Interpolation
Abstract: A novel video interpolation network to improve the temporal resolutions of video sequences is proposed in this work. We develop a multi-scale warping module to interpolate intermediate frames robustly for both small and large motions. Specifically, the proposed multi-scale warping module deals with large motions between two consecutive frames using coarse-scale features, while estimating detailed local motions by exploring fine-scale features. To this end, it takes multi-scale features from the encoder and estimates kernel weights and offset vectors for each scale. Finally, it synthesizes multi-scale warping frames and combines them to obtain an intermediate frame. Extensive experimental results demonstrate that the proposed algorithm outperforms state-of-the-art video interpolation algorithms on various benchmark datasets.
Published: 2021

4. Superpixels for image and video processing based on proximity-weighted patch matching

Author: Won-Dong Jang, Se-Ho Lee, and Chang-Su Kim
Subjects: Pixel, Matching (graph theory), Computer Networks and Communications, Computer science, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020207 software engineering, Image processing, Pattern recognition, 02 engineering and technology, Video processing, Image segmentation, Motion vector, Hardware and Architecture, Feature (computer vision), 0202 electrical engineering, electronic engineering, information engineering, Media Technology, Segmentation, Artificial intelligence, business, Software
Abstract: In this paper, a temporal superpixel algorithm using proximity-weighted patch matching (PPM) is proposed to yield temporally consistent superpixels for image and video processing. PPM estimates the motion vector of a superpixel robustly, by considering the patch matching distances of neighboring superpixels as well as the superpixel itself. In each frame, we initialize superpixels by transferring the superpixel labels of the previous frame using PPM motion vectors. Then, we update the superpixel labels of boundary pixels by minimizing a cost function, which is composed of feature distance, compactness, contour, and temporal consistency terms. Finally, we carry out superpixel splitting, merging, and relabeling to regularize superpixel sizes and correct inaccurate labels. Extensive experimental results confirm that the proposed algorithm outperforms the state-of-the-art conventional algorithms significantly. Also, it is demonstrated that the proposed algorithm can be applied to video object segmentation and video saliency detection tasks.
Published: 2020
Full Text: View/download PDF

5. Instance-Level Future Motion Estimation in a Single Image Based on Ordinal Regression and Semi-Supervised Domain Adaptation

Author: Yeong Jun Koh, Chang-Su Kim, and Kyung Rae Kim
Subjects: General Computer Science, Artificial neural network, business.industry, Computer science, General Engineering, Context (language use), Pattern recognition, Object (computer science), Ordinal regression, semi-supervised domain adaptation, Domain (software engineering), Future motion estimation, Motion estimation, Video tracking, Benchmark (computing), General Materials Science, Artificial intelligence, lcsh:Electrical engineering. Electronics. Nuclear engineering, cyclic ordinal regression, business, lcsh:TK1-9971
Abstract: A novel algorithm to estimate instance-level future motion (FM) in a single image is proposed in this paper. First, the FM of an instance is defined with its direction, speed, and action classes. Then, a deep neural network, called FM-Net, is developed to determine the FM of the instance. More specifically, the multi-context pooling layer is proposed to exploit both object and global context features, and the cyclic ordinal regression scheme is developed using binary classifiers for effective FM classification. Also, the proposed FM-Net is trained in a semi-supervised domain adaptation setting to obtain reliable FM estimation results, even when a source domain in the training process and a target domain in the inference process are different. Extensive experimental results demonstrate that the proposed algorithm provides remarkable performance and thus can be used effectively for computer vision applications, including single object tracking, multiple object tracking, and crowd analysis. Furthermore, the FM dataset, collected from diverse sources and annotated manually, is released as a benchmark for single-image FM estimation.
Published: 2020

6. Online Multiple Object Tracking Based on Open-Set Few-Shot Learning

Author: Yeong Jun Koh, Han-Ul Kim, and Chang-Su Kim
Subjects: General Computer Science, Computer science, open-set classification, Open set, 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, few-shot classification, 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, 0105 earth and related environmental sciences, Artificial neural network, business.industry, General Engineering, Pattern recognition, Multiple object tracking, Object (computer science), Frame rate, online tracking, Feature (computer vision), Video tracking, Bipartite graph, Embedding, 020201 artificial intelligence & image processing, Artificial intelligence, lcsh:Electrical engineering. Electronics. Nuclear engineering, business, lcsh:TK1-9971
Abstract: How to make an online tracking model effectively adapt to newly appearing objects and object disappearance as well as appearance variations of target objects from few examples is an essential issue in multiple object tracking (MOT). Learning target appearances from few examples is a few-shot classification problem, while the identification of newly appearing objects and object disappearance has the aspect of open-set classification. In this work, we regard online MOT as open-set few-shot classification to address both learning from few examples (few-shot classification) and unknown classes such as new objects (open-set classification). Specifically, we develop an embedding neural network, called VOFNet, consisting of convolutional and recurrent parts, to perform open-set few-shot classification. The convolutional part constructs a feature from an example of a target object and the recurrent part determines a representative feature of a target object from few examples. Then VOFNet is trained to provide effective features for open-set few-shot classification. Finally, we develop an online multiple object tracker based on the combination of VOFNet and bipartite matching. The proposed tracker achieves 49.2 multiple object tracking accuracy (MOTA) with 28.9 frames per second on the MOT17 dataset, which shows a significantly better trade-off between accuracy and speed than the existing algorithms. For example, the proposed algorithm yields about 3.17 times faster speed with 0.99 times lower accuracy than a recent existing MOT algorithm [1].
Published: 2020
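
The abstract above says the tracker combines VOFNet with bipartite matching. The sketch below illustrates only the generic matching step, assuming a cost matrix of feature distances between existing tracks and new detections (the function name and the cost values are hypothetical; the record does not say how costs are defined). It searches exhaustively, whereas a practical tracker would more likely use the Hungarian algorithm.

```python
from itertools import permutations

def match_tracks(cost):
    """Return the min-cost one-to-one assignment of tracks (rows) to detections (columns)."""
    n_tracks, n_dets = len(cost), len(cost[0])
    if n_tracks > n_dets:
        raise ValueError("this sketch assumes at least as many detections as tracks")
    best_cols, best_total = None, float("inf")
    # Exhaustive search: fine for a handful of objects, factorial cost in general
    for cols in permutations(range(n_dets), n_tracks):
        total = sum(cost[r][c] for r, c in enumerate(cols))
        if total < best_total:
            best_cols, best_total = cols, total
    return list(enumerate(best_cols)), best_total

# Hypothetical feature distances between 3 tracks and 3 detections
cost = [
    [0.1, 0.9, 0.8],
    [0.7, 0.2, 0.9],
    [0.8, 0.6, 0.3],
]
pairs, total = match_tracks(cost)
```

Here each track is paired with the detection of the same index, since the diagonal entries give the smallest total cost.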

7. Object tracking under large motion: Combining coarse-to-fine search with superpixels

Author: Donghui Song, Chang-Su Kim, Sung-Kee Park, and Chansu Kim
Subjects: Information Systems and Management, Similarity (geometry), Computer science, business.industry, 05 social sciences, 050301 education, Pattern recognition, 02 engineering and technology, Object (computer science), Computer Science Applications, Theoretical Computer Science, Active appearance model, Maxima and minima, Discriminative model, Artificial Intelligence, Control and Systems Engineering, Video tracking, 0202 electrical engineering, electronic engineering, information engineering, Benchmark (computing), 020201 artificial intelligence & image processing, Artificial intelligence, Particle filter, business, 0503 education, Software
Abstract: We propose an object tracking method under large motion in image sequences. Dense sampling and particle filtering have been widely applied to cope with this problem; however, the former is computationally expensive, and the latter is sensitive to local minima. By introducing a novel search method based on a coarse-to-fine strategy and image superpixels, we try to solve both drawbacks. In the coarse step, we first extract superpixels associated with a target object on the entire search region by using a simple generative appearance model. In the fine step, we perform a sampling and similarity measurement process within the selected superpixels to find the most accurate location of the target object. We also suggest a way to use both a discriminative appearance model and a sophisticated generative appearance model simultaneously. Extensive experiments on a popular benchmark dataset demonstrate that the proposed method outperforms other competitive approaches and also shows better results in challenging scenarios such as occlusion, deformation, out-of-view, and in-plane/out-of-plane rotation.
Published: 2019
Full Text: View/download PDF

8. Property-Specific Aesthetic Assessment With Unsupervised Aesthetic Property Discovery

Author: Chul Lee, Chang-Su Kim, and Jun-Tae Lee
Subjects: General Computer Science, Computer science, business.industry, Image aesthetics, image composition, General Engineering, convolutional neural network, 020207 software engineering, Pattern recognition, 02 engineering and technology, aesthetic assessment, Convolutional neural network, ComputerApplications_MISCELLANEOUS, unsupervised attribute clustering, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, General Materials Science, lcsh:Electrical engineering. Electronics. Nuclear engineering, Artificial intelligence, unsupervised property discovery, business, lcsh:TK1-9971, Classifier (UML)
Abstract: We propose the property-specific aesthetic assessment (PSAA) algorithm with unsupervised aesthetic property discovery. The proposed PSAA algorithm uses an aesthetic feature extractor, an aesthetic property classifier, and multiple property-specific assessment networks. The aesthetic feature extractor analyzes aesthetics of images to generate features. Using such aesthetic features, we discover diverse aesthetic properties in an unsupervised manner and develop the aesthetic property classifier to predict the aesthetic property of each image. For each discovered aesthetic property, we train a property-specific assessment network. Thus, we can assess the aesthetic quality of an image using the property-specific network that corresponds to its property. Experimental results on a large dataset show that the proposed PSAA algorithm achieves state-of-the-art aesthetic assessment performance. Furthermore, we demonstrate that PSAA is useful for improving aesthetic qualities of images in two applications: contrast enhancement and image cropping.
Published: 2019
Full Text: View/download PDF

9. ELF-Nets: Deep Learning on Point Clouds Using Extended Laplacian Filter

Author: Chang-Su Kim, Seon Ho Lee, and Han-Ul Kim
Subjects: General Computer Science, Computer science, 3D deep learning, Feature extraction, Point cloud, convolutional neural network, Image processing, 02 engineering and technology, Laplacian filter, Convolutional neural network, semantic part segmentation, Convolution, Matrix (mathematics), 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, Point (geometry), business.industry, Deep learning, General Engineering, Filter (signal processing), 021001 nanoscience & nanotechnology, Weighting, 020201 artificial intelligence & image processing, object classification, Artificial intelligence, lcsh:Electrical engineering. Electronics. Nuclear engineering, Laplacian matrix, 0210 nano-technology, business, Laplace operator, Algorithm, lcsh:TK1-9971
Abstract: We propose a deep learning framework for various 3D vision tasks, which takes a point cloud as input. The convolution is a basic operator for feature extraction in deep learning. However, it is not directly applicable to a point cloud, which is an irregular, unordered point set. This makes deep learning on point clouds challenging. To address this issue, we propose the extended Laplacian filter (ELF) for point clouds, which adopts the design principles of discrete Laplacian filters in 2D image processing. In other words, ELF extends the Laplacian filters and has the following two properties: 1) it is a two-state filter using two filter matrices (one for a center point and the other for neighboring points), and 2) it employs a scalar weighting function to predict the relative importance of the neighboring points. Then, we develop ELF-Nets, which consist of ELF convolution layers and fully connected layers. Experimental results demonstrate that the proposed ELF-Nets are capable of recognizing the 3D shape of a point cloud effectively and efficiently. In particular, ELF-Nets provide performance better than or comparable to the state-of-the-art techniques in both object classification and part segmentation tasks.
Published: 2019
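
The abstract above says ELF adopts the design principles of discrete Laplacian filters in 2D image processing. As a point of reference, the sketch below applies the classic 4-neighbor discrete Laplacian to a small grid; it is not ELF itself, and the function name and sample image are hypothetical. It shows the property the abstract builds on: one coefficient for the center pixel and a separate, shared coefficient for its neighbors.

```python
def laplacian_4(image):
    """Apply the 4-neighbor discrete Laplacian to the interior pixels of a 2D grid."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]  # border pixels are left at 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Neighbors share the coefficient +1; the center uses -4
            neighbors = (image[r - 1][c] + image[r + 1][c]
                         + image[r][c - 1] + image[r][c + 1])
            out[r][c] = neighbors - 4 * image[r][c]
    return out

# Hypothetical 3x3 image with a single bright center pixel
image = [
    [0, 0, 0],
    [0, 5, 0],
    [0, 0, 0],
]
response = laplacian_4(image)
```

On a regular pixel grid the neighbor set is fixed, which is exactly what an irregular, unordered point cloud lacks.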

10. Harmonious Semantic Line Detection via Maximal Weight Clique Selection

Author: Dongkwon Jin, Chang-Su Kim, Wonhui Park, and Seong-Gyun Jeong
Subjects: FOS: Computer and information sciences, Clique, Computer science, business.industry, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Complete graph, Construct (python library), Set (abstract data type), Line (geometry), Metric (mathematics), Artificial intelligence, Filter (mathematics), business, Algorithm, Selection (genetic algorithm)
Abstract: A novel algorithm to detect an optimal set of semantic lines is proposed in this work. We develop two networks: selection network (S-Net) and harmonization network (H-Net). First, S-Net computes the probabilities and offsets of line candidates. Second, we filter out irrelevant lines through a selection-and-removal process. Third, we construct a complete graph, whose edge weights are computed by H-Net. Finally, we determine a maximal weight clique representing an optimal set of semantic lines. Moreover, to assess the overall harmony of detected lines, we propose a novel metric, called HIoU. Experimental results demonstrate that the proposed algorithm can detect harmonious semantic lines effectively and efficiently. Our codes are available at https://github.com/dongkwonjin/Semantic-Line-MWCS. Accepted to CVPR 2021.
Published: 2021
Full Text: View/download PDF
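
The final step described in the abstract above, choosing a maximal weight clique in a complete graph, can be sketched by brute force. This sketch assumes that a clique's weight is the sum of its pairwise edge weights and that weights may be negative for conflicting pairs; neither assumption comes from the record, and the function name and weights are hypothetical stand-ins for H-Net outputs.

```python
from itertools import combinations

def best_clique(num_lines, edge_weight):
    """Return the subset of line indices with the largest total pairwise weight."""
    best_subset, best_score = (), float("-inf")
    for size in range(2, num_lines + 1):
        for subset in combinations(range(num_lines), size):
            # In a complete graph every subset is a clique; score it by its edges
            score = sum(edge_weight[frozenset(pair)]
                        for pair in combinations(subset, 2))
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score

# Hypothetical harmony scores: positive = harmonious pair, negative = conflicting pair
weights = {
    frozenset((0, 1)): 0.8,
    frozenset((0, 2)): 0.5,
    frozenset((1, 2)): 0.6,
    frozenset((0, 3)): -0.9,
    frozenset((1, 3)): -0.4,
    frozenset((2, 3)): 0.2,
}
subset, score = best_clique(4, weights)
```

Line 3 is left out of the selected set because its conflicts with lines 0 and 1 outweigh its mild agreement with line 2. The search is exponential in the number of candidates, so it is only practical after filtering has reduced them to a few lines.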

11. Fruit tree disease classification system using generative adversarial networks

Author: Hoe-Kyung Jung, Chang-Su Kim, and Hyesoo Lee
Subjects: Class (computer programming), General Computer Science, business.industry, Computer science, Disease classification, Machine learning, computer.software_genre, Field (computer science), GAN, Crop, Adversarial system, Agriculture, Feature (machine learning), Livestock, Artificial intelligence, Electrical and Electronic Engineering, business, computer, Smart farm, Generative grammar, Classification system
Abstract: A smart farm is a farm that can remotely and automatically maintain proper growth and management of crops and livestock by integrating technology with agriculture. Currently, smart farms are concentrated in the field of smart horticulture, and although they are spreading, research is being conducted only in limited spaces. In addition, it is difficult to obtain a sufficient amount of data for learning, and data imbalance occurs because it is difficult to obtain a similar amount of data for each class. In this paper, we propose a method that amplifies a small amount of data and solves the data imbalance problem by using the ability of a generative adversarial network to learn to mimic data. The proposed method can create datasets of various crops and also shows a high hit rate. The datasets generated for crops can be used in learning to solve data imbalance problems.
Published: 2021

12. Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps

Author: Yeong Jun Koh, Yuk Heo, and Chang-Su Kim
Subjects: FOS: Computer and information sciences, Computer science, business.industry, Interactive video, Computer Vision and Pattern Recognition (cs.CV), Feature extraction, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Computer Science - Computer Vision and Pattern Recognition, Object (computer science), Accurate segmentation, Interaction time, Segmentation, Computer vision, Artificial intelligence, business, Reliability (statistics)
Abstract: We propose a novel guided interactive segmentation (GIS) algorithm for video objects to improve the segmentation accuracy and reduce the interaction time. First, we design the reliability-based attention module to analyze the reliability of multiple annotated frames. Second, we develop the intersection-aware propagation module to propagate segmentation results to neighboring frames. Third, we introduce the GIS mechanism for a user to select unsatisfactory frames quickly with less effort. Experimental results demonstrate that the proposed algorithm provides more accurate segmentation results at a faster speed than conventional algorithms. Codes are available at https://github.com/yuk6heo/GIS-RAmap. Accepted to CVPR 2021 (oral).
Published: 2021

13. Inter-image Affinity based Interactive Video Object Segmentation

Author: Chang-Su Kim, Yeong Jun Koh, and Yuk Heo
Subjects: Matrix (mathematics), business.industry, Interactive video, Computer science, Frame (networking), ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Computer vision, Segmentation, Artificial intelligence, Object (computer science), business, Image (mathematics)
Abstract: Interactive video object segmentation is the procedure of cutting out a target object in a video by repeatedly providing user annotations (points, lines, etc.) to an algorithm. In this paper, the procedure is achieved by generating one inter-image affinity matrix between the target frame and the annotated frame, and another between the previous frame and the target frame. By utilizing the two inter-image affinity matrices, we transport encoded features of the annotated frame and the previous frame for the target object. Through experimental results, we show that exploiting the inter-image affinity matrices is effective for interactive video object segmentation.
Published: 2020
Full Text: View/download PDF

14. Score Prediction Network and Graph-based Selection for Semantic Line Detection

Author: Chang-Su Kim and Dongkwon Jin
Subjects: business.industry, Computer science, Graph based, Pattern recognition, 02 engineering and technology, Construct (python library), 01 natural sciences, Image (mathematics), 0103 physical sciences, Line (geometry), 0202 electrical engineering, electronic engineering, information engineering, Selection (linguistics), Computer Science::Programming Languages, Graph (abstract data type), 020201 artificial intelligence & image processing, Artificial intelligence, 010306 general physics, business
Abstract: In this paper, we propose a novel semantic line detection algorithm. For an input image, we first detect semantic lines using a semantic line detector by classifying candidate lines. Then, for each pair of detected lines, we predict a score indicating whether the two lines are harmonized or not. To this end, we develop a score prediction network (SPNet). Finally, we construct a graph consisting of the detected lines and the predicted scores between them and iteratively select the reliable semantic lines. Experimental results demonstrate that the proposed algorithm detects semantic lines accurately.
Published: 2020
Full Text: View/download PDF

15. Optimized Color Contrast Enhancement For Dichromats Using Local And Global Contrast

Author: Soo-Kyeong Kang, Chul Lee, and Chang-Su Kim
Subjects: Plane (geometry), Computer science, business.industry, Color vision, media_common.quotation_subject, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020206 networking & telecommunications, 02 engineering and technology, Measure (mathematics), Visualization, 0202 electrical engineering, electronic engineering, information engineering, Contrast (vision), 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, Color contrast, business, media_common
Abstract: We propose an optimized color contrast enhancement algorithm for dichromats with color vision deficiency using local and global information. Based on the fact that dichromats perceive colors projected onto a 2D plane, we first measure the color differences on the plane. Then, we formulate an optimization problem to minimize the perceived color differences subject to constraints on the projected plane. Finally, by solving the optimization problem, we obtain the optimal plane and perform color conversion. Simulation results show that the proposed algorithm outperforms the state-of-the-art algorithms in preserving both local details and naturalness of images.
Published: 2020
Full Text: View/download PDF

16. Adaptive Lattice-Aware Image Demosaicking Using Global And Local Information

Author: Chang-Su Kim, Keunsoo Ko, and Ji-Soo Kim
Subjects: Demosaicing, Bayer filter, Pixel, Computer science, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, Convolutional neural network, Adaptive filter, Lattice (order), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Color filter array, Computer vision, Artificial intelligence, business, 0105 earth and related environmental sciences, Interpolation
Abstract: A novel approach for image demosaicking based on adaptive lattice-aware filter (ALF) and global refinement unit (GRU) is proposed in this work. We generate ALFs dynamically, which are adaptive to positions of pixels within color lattices in a color filter array, to obtain a locally demosaicked image. We then refine the locally demosaicked image using GRU to exploit global information, as well as local information. To extend the receptive fields efficiently, we adopt dilated convolutions in GRU. Experimental results demonstrate that the proposed algorithm provides state-of-the-art performance on standard demosaicking datasets.
Published: 2020
Full Text: View/download PDF
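
The abstract above mentions dilated convolutions as a way to extend receptive fields efficiently. A minimal 1D sketch of the idea follows (the function name and sample values are hypothetical, and the paper's networks operate on 2D images, not 1D signals): with kernel size k and dilation d, one layer spans d(k - 1) + 1 input samples while still using only k weights.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1D cross-correlation with gaps of `dilation` between taps."""
    span = dilation * (len(kernel) - 1) + 1  # receptive field of one output sample
    return [
        sum(k * signal[start + i * dilation] for i, k in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]

signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]
dense = dilated_conv1d(signal, kernel, dilation=1)   # each output spans 3 samples
sparse = dilated_conv1d(signal, kernel, dilation=2)  # each output spans 5 samples
```

Both calls use the same three weights; only the spacing between the taps changes, so the wider receptive field comes at no extra parameter cost.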

17. Three-Dimensional Convolutional Neural Network for Prostate MRI Segmentation and Comparison of Prostate Volume Measurements by Use of Artificial Neural Network and Ellipsoid Formula

Author: Beom Jin Park, Yuk Heo, Jeong Yoon Lee, Deuk Jae Sung, Dong Kyu Lee, Chang-Su Kim, and Min Ju Kim
Subjects: Male, Mean squared error, Intraclass correlation, Convolutional neural network, 030218 nuclear medicine & medical imaging, 03 medical and health sciences, 0302 clinical medicine, Imaging, Three-Dimensional, Similarity (network science), Image Interpretation, Computer-Assisted, Medicine, Humans, Radiology, Nuclear Medicine and imaging, Segmentation, Retrospective Studies, Artificial neural network, business.industry, Prostatic Neoplasms, Pattern recognition, General Medicine, Ellipsoid, Magnetic Resonance Imaging, 030220 oncology & carcinogenesis, Artificial intelligence, Neural Networks, Computer, business, Volume (compression)
Abstract: OBJECTIVE. The purposes of this study were to assess the performance of a 3D convolutional neural network (CNN) for automatic segmentation of prostates on MR images and to compare the volume estimates from the 3D CNN with those of the ellipsoid formula. MATERIALS AND METHODS. The study included 330 MR image sets that were divided into 260 training sets and 70 test sets for automated segmentation of the entire prostate. Among these, 162 training sets and 50 test sets were used for transition zone segmentation. Assisted by manual segmentation by two radiologists, the following values were obtained: estimates of ground-truth volume (VGT), software-derived volume (VSW), mean of VGT and VSW (VAV), and automatically generated volume from the 3D CNN (VNET). These values were compared with the volume calculated with the ellipsoid formula (VEL). RESULTS. The Dice similarity coefficient for the entire prostate was 87.12% and for the transition zone was 76.48%. There was no significant difference between VNET and VAV (p = 0.689) in the test sets of the entire prostate, whereas a significant difference was found between VEL and VAV (p < 0.001). No significant difference was found among the volume estimates in the test sets of the transition zone. Overall intraclass correlation coefficients between the volume estimates were excellent (0.887-0.995). In the test sets of the entire prostate, the mean error between VGT and VNET (2.5) was smaller than that between VGT and VEL (3.3). CONCLUSION. The fully automated network studied provides reliable volume estimates of the entire prostate compared with those obtained with the ellipsoid formula. Fast and accurate volume measurement by use of the 3D CNN may help clinicians evaluate prostate disease.
Published: 2020
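
The abstract above relies on two standard quantities, the Dice similarity coefficient and the ellipsoid volume formula. A minimal sketch of both, using their usual textbook definitions (the function names, voxel sets, and diameters are hypothetical and not taken from the study):

```python
import math

def dice(mask_a, mask_b):
    """Dice similarity coefficient between two sets of voxel indices."""
    if not mask_a and not mask_b:
        return 1.0  # two empty masks agree perfectly
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))

def ellipsoid_volume(length, width, height):
    """Ellipsoid formula: pi/6 * L * W * H (diameters in cm give volume in mL)."""
    return math.pi / 6 * length * width * height

# Hypothetical voxel sets and diameters, for illustration only
manual = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
auto = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}
score = dice(manual, auto)
volume = ellipsoid_volume(4.0, 5.0, 3.0)
```

The ellipsoid formula needs only three diameters, whereas a segmentation-based volume sums every labeled voxel; that difference is what the study's comparison of VEL against VNET evaluates.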

18. Semantic Line Detection Using Mirror Attention and Comparative Ranking and Matching

Author: Jun-Tae Lee, Dongkwon Jin, and Chang-Su Kim
Subjects: Reflection symmetry, Matching (graph theory), Computer science, business.industry, Detector, Line (geometry), Contextual information, Pattern recognition, Pairwise comparison, Artificial intelligence, business, Parallel, Ranking (information retrieval)
Abstract: A novel algorithm to detect semantic lines is proposed in this paper. We develop three networks: detection network with mirror attention (D-Net) and comparative ranking and matching networks (R-Net and M-Net). D-Net extracts semantic lines by exploiting rich contextual information. To this end, we design the mirror attention module. Then, through pairwise comparisons of extracted semantic lines, we iteratively select the most semantic line and remove redundant ones overlapping with the selected one. For the pairwise comparisons, we develop R-Net and M-Net in the Siamese architecture. Experiments demonstrate that the proposed algorithm outperforms the conventional semantic line detector significantly. Moreover, we apply the proposed algorithm to detect two important kinds of semantic lines successfully: dominant parallel lines and reflection symmetry axes. Our codes are available at https://github.com/dongkwonjin/Semantic-Line-DRM.
Published: 2020
Full Text: View/download PDF

19. BMBC: Bilateral Motion Estimation with Bilateral Cost Volume for Video Interpolation

Author: Keunsoo Ko, Junheum Park, Chul Lee, and Chang-Su Kim
Subjects: Source code, Computer science, business.industry, media_common.quotation_subject, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Motion (geometry), 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, Filter (video), Temporal resolution, Motion estimation, 0202 electrical engineering, electronic engineering, information engineering, Benchmark (computing), 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, business, ComputingMethodologies_COMPUTERGRAPHICS, 0105 earth and related environmental sciences, Interpolation, media_common, Volume (compression)
Abstract: Video interpolation increases the temporal resolution of a video sequence by synthesizing intermediate frames between two consecutive frames. We propose a novel deep-learning-based video interpolation algorithm based on bilateral motion estimation. First, we develop the bilateral motion network with the bilateral cost volume to estimate bilateral motions accurately. Then, we approximate bi-directional motions to predict a different kind of bilateral motions. We then warp the two input frames using the estimated bilateral motions. Next, we develop the dynamic filter generation network to yield dynamic blending filters. Finally, we combine the warped frames using the dynamic blending filters to generate intermediate frames. Experimental results show that the proposed algorithm outperforms the state-of-the-art video interpolation algorithms on several benchmark datasets. The source codes and pre-trained models are available at https://github.com/JunHeum/BMBC.
Published: 2020
Full Text: View/download PDF

20. Interactive Video Object Segmentation Using Global and Local Transfer Modules

Author: Yeong Jun Koh, Yuk Heo, and Chang-Su Kim
Subjects: Artificial neural network, business.industry, Computer science, Interactive video, Deep learning, Frame (networking), ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, Object (computer science), Transfer (computing), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Segmentation, Computer vision, Artificial intelligence, business
Abstract: An interactive video object segmentation algorithm, which takes scribble annotations on query objects as input, is proposed in this paper. We develop a deep neural network, which consists of the annotation network (A-Net) and the transfer network (T-Net). First, given user scribbles on a frame, A-Net yields a segmentation result based on the encoder-decoder architecture. Second, T-Net transfers the segmentation result bidirectionally to the other frames, by employing the global and local transfer modules. The global transfer module conveys the segmentation information in an annotated frame to a target frame, while the local transfer module propagates the segmentation information in a temporally adjacent frame to the target frame. By applying A-Net and T-Net alternately, a user can obtain desired segmentation results with minimal efforts. We train the entire network in two stages, by emulating user scribbles and employing an auxiliary loss. Experimental results demonstrate that the proposed interactive video object segmentation algorithm outperforms the state-of-the-art conventional algorithms. Codes and models are available at https://github.com/yuk6heo/IVOS-ATNet.
Published: 2020
Full Text: View/download PDF

21. PieNet: Personalized Image Enhancement Network

Author: Young Jun Koh, Han-Ul Kim, and Chang-Su Kim
Subjects: business.industry, Computer science, Deep learning, Feature vector, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020207 software engineering, 02 engineering and technology, Construct (python library), Machine learning, computer.software_genre, Image (mathematics), Personalization, Metric (mathematics), 0202 electrical engineering, electronic engineering, information engineering, Embedding, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer
Abstract: Image enhancement is an inherently subjective process since people have diverse preferences for image aesthetics. However, most enhancement techniques pay less attention to the personalization issue despite its importance. In this paper, we propose the first deep learning approach to personalized image enhancement, which can enhance new images for a new user, by asking him or her to select about 10–20 preferred images from a random set of images. First, we represent various users’ preferences for enhancement as feature vectors in an embedding space, called preference vectors. We construct the embedding space based on metric learning. Then, we develop the personalized image enhancement network (PieNet) to enhance images adaptively using each user’s preference vector. Experimental results demonstrate that the proposed algorithm is capable of achieving personalization successfully, as well as outperforming conventional general image enhancement algorithms significantly. The source codes and trained models are available at https://github.com/hukim1124/PieNet.
Published: 2020
Full Text: View/download PDF

22. Global and Local Enhancement Networks for Paired and Unpaired Image Enhancement

Author: Young Jun Koh, Chang-Su Kim, and Han-Ul Kim
Subjects: Spatial filter, Computer science, business.industry, 0202 electrical engineering, electronic engineering, information engineering, 020207 software engineering, 020201 artificial intelligence & image processing, Pattern recognition, 02 engineering and technology, Artificial intelligence, Image enhancement, business
Abstract: A novel approach for paired and unpaired image enhancement is proposed in this work. First, we develop a global enhancement network (GEN) and a local enhancement network (LEN), which can faithfully enhance images. The proposed GEN performs the channel-wise intensity transforms that can be trained more easily than the pixel-wise prediction. The proposed LEN refines GEN results based on spatial filtering. Second, we propose different training schemes for paired learning and unpaired learning to train GEN and LEN. Especially, we propose a two-stage training scheme based on generative adversarial networks for unpaired learning. Experimental results demonstrate that the proposed algorithm outperforms the state of the art in paired and unpaired image enhancement. Notably, the proposed unpaired image enhancement algorithm provides better results than recent state-of-the-art paired image enhancement algorithms. The source codes and trained models are available at https://github.com/hukim1124/GleNet.
Published: 2020
Full Text: View/download PDF

23. A Study on Motion Recognition through Kinect-based Machine-Learning

Author: Hoe-Kyung Jung and Chang-Su Kim
Subjects: Control and Systems Engineering, Computer science, business.industry, Motion recognition, Computer vision, Artificial intelligence, business
Published: 2018
Full Text: View/download PDF

24. Photographic composition classification and dominant geometric element detection for outdoor scenes

Author: Jun-Tae Lee, Chang-Su Kim, Chul Lee, and Han-Ul Kim
Subjects: Contextual image classification, Computer science, business.industry, Diagonal, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020207 software engineering, Pattern recognition, 02 engineering and technology, Composition (combinatorics), Convolutional neural network, Class (biology), Rule of thirds, Bounding overwatch, Computer Science::Computer Vision and Pattern Recognition, Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, Media Technology, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, Electrical and Electronic Engineering, Symmetry (geometry), business, ComputingMethodologies_COMPUTERGRAPHICS
Abstract: Despite the practical importance of photographic composition for improving or assessing the aesthetical quality of photographs, only a few simple composition rules have been considered for its classification. In this work, we propose novel techniques to classify photographic composition rules of outdoor scenes and detect dominant geometric elements, called composition elements, for each composition class. Specifically, we first categorize composition rules of outdoor photographs into nine classes: RoT, center, horizontal, symmetric, diagonal, curved, vertical, triangle, and pattern. Then, we develop a photographic composition classification algorithm using a convolutional neural network (CNN). To train the CNN, we construct a photographic composition database, which is publicly available. Finally, for each composition class, we propose an effective scheme to locate composition elements, i.e., bounding boxes for main subjects, leading lines, axes of symmetry, triangles, and sky regions. Extensive experimental results demonstrate that the proposed algorithm classifies composition classes reliably and detects composition elements accurately.
Published: 2018
Full Text: View/download PDF

25. Tracking-by-Segmentation Using Superpixel-Wise Neural Network

Author: Se-Ho Lee, Won-Dong Jang, and Chang-Su Kim
Subjects: Conditional random field, General Computer Science, Computer science, Feature extraction, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, object segmentation, Tracking-by-segmentation, Minimum bounding box, 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, Segmentation, object tracking, Artificial neural network, business.industry, Frame (networking), General Engineering, 020207 software engineering, Pattern recognition, Object (computer science), visual tracking, Video tracking, 020201 artificial intelligence & image processing, Artificial intelligence, lcsh:Electrical engineering. Electronics. Nuclear engineering, business, lcsh:TK1-9971
Abstract: A tracking-by-segmentation algorithm, which tracks and segments a target object in a video sequence, is proposed in this paper. In the first frame, we segment out the target object in a user-annotated bounding box. Then, we divide subsequent frames into superpixels. We develop a superpixel-wise neural network for tracking-by-segmentation, called TBSNet, which extracts multi-level convolutional features of each superpixel and yields the foreground probability of the superpixel as the output. We train TBSNet in two stages. First, we perform offline training to enable TBSNet to discriminate general objects from the background. Second, during the tracking, we fine-tune TBSNet to distinguish the target object from non-targets and adapt to color change and shape variation of the target object. Finally, we perform conditional random field optimization to improve the segmentation quality further. Experimental results demonstrate that the proposed algorithm outperforms the state-of-the-art trackers on four challenging data sets.
Published: 2018

26. Multiscale Feature Extractors for Stereo Matching Cost Computation

Author: Yeong Jun Koh, Kyung-Rae Kim, and Chang-Su Kim
Subjects: Matching (statistics), matching cost computation, General Computer Science, business.industry, Computer science, Computation, Reliability (computer networking), Feature extraction, General Engineering, 020206 networking & telecommunications, Pattern recognition, 02 engineering and technology, Convolutional neural network, Stereo matching, Feature (computer vision), convolutional neural networks, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, General Materials Science, Artificial intelligence, Enhanced Data Rates for GSM Evolution, lcsh:Electrical engineering. Electronics. Nuclear engineering, multiscale feature extraction, business, lcsh:TK1-9971
Abstract: We propose four efficient feature extractors based on convolutional neural networks for stereo matching cost computation. Two of them generate multiscale features with diverse receptive field sizes. These multiscale features are used to compute the corresponding multiscale matching costs. We then determine an optimal cost by combining the multiscale costs using edge information. On the other hand, the other two feature extractors produce uni-scale features by combining multiscale features directly through fully connected layers. Finally, after obtaining matching costs using one of the four extractors, we determine optimal disparities based on the cross-based cost aggregation and the semiglobal matching. Extensive experiments on the Middlebury stereo data sets demonstrate the effectiveness and efficiency of the proposed algorithm. Specifically, the proposed algorithm provides matching performance competitive with the state of the art, while demanding lower computational complexity.
Published: 2018

27. SAF-Nets: Shape-Adaptive Filter Networks for 3D point cloud processing

Author: Seon-Ho Lee and Chang-Su Kim
Subjects: Computer science, business.industry, Noise (signal processing), Deep learning, Point cloud, Pattern recognition, Adaptive filter, Signal Processing, Media Technology, Benchmark (computing), Shape context, Segmentation, Point (geometry), Computer Vision and Pattern Recognition, Artificial intelligence, Electrical and Electronic Engineering, business
Abstract: A deep learning framework for 3D point cloud processing is proposed in this work. In a point cloud, local neighborhoods have various shapes, and the semantic meaning of each point is determined within the local shape context. Thus, we propose shape-adaptive filters (SAFs), which are dynamically generated from the distributions of local points. The proposed SAFs can extract robust features against noise or outliers, by employing local shape contexts to suppress them. Also, we develop the SAF-Nets for classification and segmentation using multiple SAF layers. Extensive experimental results demonstrate that the proposed SAF-Nets significantly outperform the state-of-the-art conventional algorithms on several benchmark datasets. Moreover, it is shown that SAFs can improve scene flow estimation performance as well.
Published: 2021
Full Text: View/download PDF

28. Subpixel rendering for diamond-shaped PenTile displays using patch-based adaptive filters

Author: Kyung Rae Kim, Jae-Han Lee, and Chang-Su Kim
Subjects: Matching (graph theory), business.industry, Computer science, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020207 software engineering, 02 engineering and technology, Filter (signal processing), Subpixel rendering, Adaptive filter, Virtual image, Signal Processing, Human visual system model, 0202 electrical engineering, electronic engineering, information engineering, Media Technology, 020201 artificial intelligence & image processing, PenTile matrix family, Computer vision, Computer Vision and Pattern Recognition, Artificial intelligence, Quadratic programming, Electrical and Electronic Engineering, business
Abstract: We propose a novel subpixel rendering algorithm for diamond-shaped PenTile displays, which reduces color distortions while improving apparent resolutions. We develop two types of subpixel rendering filters: main filter and color distortion reduction (CDR) filters. To derive the filters, we formulate a quadratic program to minimize the difference between an original input image and a virtual image that the human visual system perceives. By imposing two constraints for filter size and coefficients, we obtain the main filter, which has a suitable size and is normalized. Then, we design the CDR filters based on the analysis of various patch patterns for image areas. We define the patch patterns to classify local areas with possible color distortions. By imposing additional constraints according to the patch patterns, we derive the CDR filters. Lastly, by matching local areas in the input image into the pre-defined patch patterns, we render the image using the main filter and the CDR filters, which are applied adaptively to the local areas. Experimental results demonstrate that the proposed subpixel rendering algorithm improves apparent resolutions and suppresses color distortions effectively, thereby outperforming conventional algorithms.
Published: 2021
Full Text: View/download PDF

29. Reflection Removal Under Fast Forward Camera Motion

Author: In Kyu Park, Christian Simon, Jun Young Cheong, and Chang-Su Kim
Subjects: Motion compensation, business.industry, Image quality, Computer science, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, Iterative reconstruction, 01 natural sciences, Computer Graphics and Computer-Aided Design, 0103 physical sciences, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, 010306 general physics, business, Software, Jitter, Coherence (physics)
Abstract: The image quality of an in-vehicle black box camera is often degraded by the reflections of internal objects, dirt, and dust on the windshield. In this paper, we propose a novel algorithm that simultaneously removes the reflections and small dirt artifacts from in-vehicle black box videos under fast forward camera motion. The algorithm exploits the spatiotemporal coherence of the reflection and dirt, which remain stationary relative to the fast-moving background. Unlike previous algorithms, the algorithm first separates stationary reflection and then restores the background scene. To this end, we propose an average image prior, thereby imposing spatiotemporal coherence. The separation model is a two-layer model composed of stationary and background layers, where different gradient sparsity distributions are utilized in a region-based manner. Motion compensation in postprocessing is proposed to alleviate layer jitter due to vehicle vibrations. In evaluation experiments, the proposed algorithm successfully extracts the stationary layer from several real and synthetic black box videos.
Published: 2017
Full Text: View/download PDF

30. Contrast enhancement of noisy low-light images based on structure-texture-noise decomposition

Author: Minhyeok Heo, Chul Lee, Jaemoon Lim, and Chang-Su Kim
Subjects: Texture compression, Noise reduction, media_common.quotation_subject, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, Texture (music), Image texture, Texture filtering, 0202 electrical engineering, electronic engineering, information engineering, Media Technology, Contrast (vision), Computer vision, Electrical and Electronic Engineering, ComputingMethodologies_COMPUTERGRAPHICS, Mathematics, media_common, business.industry, 020206 networking & telecommunications, Pattern recognition, Noise, Computer Science::Graphics, Computer Science::Computer Vision and Pattern Recognition, Signal Processing, Human visual system model, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, business
Abstract: We develop an efficient structure-texture-noise decomposition method. We propose the image enhancement algorithm tailored for noisy low-light images. The proposed algorithm provides robust performance under various conditions. A noisy low-light image enhancement algorithm based on structure-texture-noise (STN) decomposition is proposed in this work. We split an input image into structure, texture, and noise components, and enhance the structure and texture components separately. More specifically, we first enhance the contrast of the structure image, by extending a 2D-histogram-based image enhancement scheme based on the characteristics of low-light images. Then, we reconstruct the texture image by retrieving residual texture components from the noise image and enhance it by exploiting the perceptual response of the human visual system (HVS). Experimental results on both synthetic and real-world images demonstrate that the proposed STN algorithm sharpens the texture and enhances the contrast more effectively than conventional algorithms, while providing robust performance under various noise and illumination conditions.
Published: 2017
Full Text: View/download PDF

31. Deep Learning Approach to Video Frame Rate Up-Conversion Using Bilateral Motion Estimation

Author: Chang-Su Kim, Junheum Park, and Chul Lee
Subjects: business.industry, Computer science, Deep learning, Frame (networking), Interpolation (computer graphics), ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Approximation algorithm, 020206 networking & telecommunications, 02 engineering and technology, Frame rate, Convolutional neural network, Motion (physics), Motion estimation, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, business, Algorithm, ComputingMethodologies_COMPUTERGRAPHICS
Abstract: We propose a deep learning-based frame rate upconversion algorithm using bilateral motion estimation. We first estimate bilateral motion fields by employing a convolutional neural network. Also, we approximate intermediate bi-directional motion fields, assuming linear motions between successive frames. Finally, we develop the synthesis network to produce an intermediate frame by merging the warped frames, which are obtained using the two kinds of motion fields. Experimental results demonstrate that the proposed algorithm generates high-quality intermediate frames on challenging sequences with large motions and occlusion, and outperforms state-of-the-art conventional algorithms.
Published: 2019
Full Text: View/download PDF
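
The linear-motion assumption mentioned in the abstract of record 31 can be illustrated with a small sketch. Under that assumption, a point that moves by `motion_0_to_1` between two successive frames has covered a fraction `t` of that displacement at an intermediate time `t`. The function name and the per-vector treatment are assumptions for illustration only. In particular, the sketch ignores that the vector is anchored at a pixel of the first frame rather than of the intermediate frame, so it is not the authors' approximation procedure.

```python
from typing import Tuple

Vector = Tuple[float, float]


def bidirectional_motion_at(t: float, motion_0_to_1: Vector) -> Tuple[Vector, Vector]:
    """Split a frame-0-to-frame-1 motion vector at time t in (0, 1).

    Returns (motion from time t back to frame 0, motion from time t on to frame 1),
    assuming constant velocity between the two frames.
    """
    dx, dy = motion_0_to_1
    to_frame0 = (-t * dx, -t * dy)               # backward: fraction t already travelled
    to_frame1 = ((1.0 - t) * dx, (1.0 - t) * dy)  # forward: remaining fraction 1 - t
    return to_frame0, to_frame1


# Midpoint of a motion of (4, -2) pixels.
backward, forward = bidirectional_motion_at(0.5, (4.0, -2.0))
```

At the midpoint the two vectors are equal in magnitude and opposite in direction, which is the symmetric case usually meant by "bilateral" motion.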

32. Robust Change Detection in High Resolution Satellite Images with Geometric Distortions

Author: Dongkwon Jin, Kyungsun Lim, and Chang-Su Kim
Subjects: Change detection algorithms, Computer science, business.industry, Feature extraction, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, High resolution, Image registration, Robustness (computer science), Iterated function, Satellite, Computer vision, Artificial intelligence, business, Change detection
Abstract: A robust change detection algorithm for high resolution satellite images, which are not perfectly registered, is proposed in this work. To achieve this goal, a change detection technique for registered images and an image registration technique are employed in a cooperative way. Specifically, we use not only hand-crafted features but also change detection results to match keypoints extracted from two images. We then align the images using the matching pairs of keypoints. Finally, we obtain a change map from the aligned images. These steps of image registration and change detection are alternately iterated until convergence. Experimental results demonstrate that the proposed algorithm outperforms the conventional change detection technique significantly, when there are geometric distortions between temporal satellite images.
Published: 2019
Full Text: View/download PDF

33. Image Aesthetic Assessment Based on Pairwise Comparison: A Unified Approach to Score Regression, Binary Classification, and Personalization

Author: Chang-Su Kim and Jun-Tae Lee
Subjects: business.industry, Computer science, Feature extraction, Pattern recognition, 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, Visualization, Image (mathematics), Statistical classification, Binary classification, ComputerApplications_MISCELLANEOUS, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Pairwise comparison, Artificial intelligence, business, 0105 earth and related environmental sciences
Abstract: We propose a unified approach to three tasks of aesthetic score regression, binary aesthetic classification, and personalized aesthetics. First, we develop a comparator to estimate the ratio of aesthetic scores for two images. Then, we construct a pairwise comparison matrix for multiple reference images and an input image, and predict the aesthetic score of the input via the eigenvalue decomposition of the matrix. By varying the reference images, the proposed algorithm can be used for binary aesthetic classification and personalized aesthetics, as well as generic score regression. Experimental results demonstrate that the proposed unified algorithm provides the state-of-the-art performances in all three tasks of image aesthetics.
Published: 2019
Full Text: View/download PDF
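
The scoring step in the abstract of record 33 (build a pairwise comparison matrix of score ratios, then read the input's score from its eigen-decomposition) can be sketched as follows. The sketch assumes entry `(i, j)` of the matrix approximates `score_i / score_j`, in which case the principal eigenvector is proportional to the scores. Power iteration and the rescaling by the known reference scores are illustrative choices, and the function names are hypothetical rather than taken from the paper.

```python
from typing import List


def principal_eigenvector(matrix: List[List[float]], iterations: int = 100) -> List[float]:
    """Power iteration for a positive matrix; the result is normalized to sum to 1."""
    n = len(matrix)
    v = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        v = [x / total for x in w]
    return v


def predict_score(ratio_matrix: List[List[float]], reference_scores: List[float]) -> float:
    """Estimate the score of the input image, stored as the last row and column.

    The eigenvector fixes the scores only up to scale, so the known
    reference scores are used to set that scale (an assumed convention).
    """
    v = principal_eigenvector(ratio_matrix)
    n_ref = len(reference_scores)
    scale = sum(reference_scores) / sum(v[:n_ref])
    return scale * v[-1]


# Toy check with a perfectly consistent matrix built from made-up scores.
toy_scores = [2.0, 4.0, 8.0, 5.0]  # three references, then the input image
toy_matrix = [[a / b for b in toy_scores] for a in toy_scores]
estimate = predict_score(toy_matrix, toy_scores[:3])
```

With a perfectly consistent ratio matrix the estimate recovers the hidden score of the input. With noisy ratios from a learned comparator the eigenvector acts as a smoothed consensus.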

34. Instance-Level Future Motion Estimation in a Single Image Based on Ordinal Regression

Author: Yeong Jun Koh, Kyung-Rae Kim, Whan Choi, Chang-Su Kim, and Seong-Gyun Jeong
Subjects: Artificial neural network, business.industry, Computer science, Reliability (computer networking), Feature extraction, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Pattern recognition, 02 engineering and technology, Ordinal regression, Motion (physics), 020204 information systems, Video tracking, Motion estimation, 0202 electrical engineering, electronic engineering, information engineering, Benchmark (computing), 020201 artificial intelligence & image processing, Artificial intelligence, business
Abstract: A novel algorithm to estimate instance-level future motion in a single image is proposed in this paper. We first represent the future motion of an instance with its direction, speed, and action classes. Then, we develop a deep neural network that exploits different levels of semantic information to perform the future motion estimation. For effective future motion classification, we adopt ordinal regression. Especially, we develop the cyclic ordinal regression scheme using binary classifiers. Experiments demonstrate that the proposed algorithm provides reliable performance and thus can be used effectively for vision applications, including single and multi object tracking. Furthermore, we release the future motion (FM) dataset, collected from diverse sources and annotated manually, as a benchmark for single-image future motion estimation.
Published: 2019
Full Text: View/download PDF

35. Interactive Image Segmentation via Backpropagating Refinement Scheme

Author: Won-Dong Jang and Chang-Su Kim
Subjects: Scheme (programming language), Pixel, Computer science, business.industry, Segmentation, Pattern recognition, Image segmentation, Artificial intelligence, Object (computer science), business, computer, Convolutional neural network, computer.programming_language
Abstract: An interactive image segmentation algorithm, which accepts user-annotations about a target object and the background, is proposed in this work. We convert user-annotations into interaction maps by measuring distances of each pixel to the annotated locations. Then, we perform the forward pass in a convolutional neural network, which outputs an initial segmentation map. However, the user-annotated locations can be mislabeled in the initial result. Therefore, we develop the backpropagating refinement scheme (BRS), which corrects the mislabeled pixels. Experimental results demonstrate that the proposed algorithm outperforms the conventional algorithms on four challenging datasets. Furthermore, we demonstrate the generality and applicability of BRS in other computer vision tasks, by transforming existing convolutional neural networks into user-interactive ones.
Published: 2019
Full Text: View/download PDF

36. Monocular Depth Estimation Using Relative Depth Maps

Author: Chang-Su Kim and Jae-Han Lee
Subjects: Monocular, Property (programming), business.industry, Pattern recognition, 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, Convolutional neural network, Depth map, 0202 electrical engineering, electronic engineering, information engineering, Relative depth, 020201 artificial intelligence & image processing, Pairwise comparison, Artificial intelligence, business, 0105 earth and related environmental sciences, Mathematics
Abstract: We propose a novel algorithm for monocular depth estimation using relative depth maps. First, using a convolutional neural network, we estimate relative depths between pairs of regions, as well as ordinary depths, at various scales. Second, we restore relative depth maps from selectively estimated data based on the rank-1 property of pairwise comparison matrices. Third, we decompose ordinary and relative depth maps into components and recombine them optimally to reconstruct a final depth map. Experimental results show that the proposed algorithm provides state-of-the-art depth estimation performance.
Published: 2019
Full Text: View/download PDF
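
The rank-1 property mentioned in the abstract of record 36 can be illustrated with a minimal sketch. If entry `(i, j)` of a pairwise comparison matrix equals `d_i / d_j`, the matrix is an outer product and therefore has rank 1, so a single fully observed column determines every other entry. The function below is only a simplest-case illustration under that assumption. The abstract says restoration uses "selectively estimated data", and the actual restoration procedure is not described there.

```python
from typing import List


def restore_from_column(observed_column: List[float]) -> List[List[float]]:
    """Rebuild a consistent ratio matrix from one observed column.

    observed_column[i] is assumed to hold d_i / d_k for a fixed reference
    region k. Then d_i / d_j = (d_i / d_k) / (d_j / d_k) for every pair.
    """
    return [[ci / cj for cj in observed_column] for ci in observed_column]


# Toy example: made-up depths 1, 2, 4 with region 0 as the reference.
toy_column = [1.0, 2.0, 4.0]
restored = restore_from_column(toy_column)
```

Every row of `restored` is a scalar multiple of every other row, which is the rank-1 structure in concrete form.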

37. Visual Object Tracking by Using Multiple Random Walkers

Author: Juhyeok Mun, Han-Ul Kim, and Chang-Su Kim
Subjects: business.industry, Computer science, Video tracking, Computer vision, Artificial intelligence, business, Tracking (particle physics)
Published: 2016
Full Text: View/download PDF

38. A MEMS-Based Finger Wearable Computer Input Devices

Author: Chang-su Kim and Se-hyun Jung
Subjects: Microelectromechanical systems, General Computer Science, Computer science, business.industry, Wearable computer, Input device, Computer vision, Artificial intelligence, business, Computer hardware
Published: 2016
Full Text: View/download PDF

39. Compressed domain video saliency detection using global and local spatiotemporal features

Author: Je-Won Kang, Se-Ho Lee, and Chang-Su Kim
Subjects: Motion analysis, Computer science, media_common.quotation_subject, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 02 engineering and technology, 050105 experimental psychology, Robustness (computer science), Computer Science::Multimedia, 0202 electrical engineering, electronic engineering, information engineering, Media Technology, Discrete cosine transform, Contrast (vision), 0501 psychology and cognitive sciences, Computer vision, Electrical and Electronic Engineering, media_common, business.industry, 05 social sciences, Frame (networking), Pattern recognition, Kadir–Brady saliency detector, Feature (computer vision), Computer Science::Computer Vision and Pattern Recognition, Signal Processing, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, business, Decoding methods
Abstract: We propose a compressed domain video saliency detection algorithm. The proposed algorithm extracts features from the partially decoded data. The proposed algorithm performs the detection in real-time with good performance. A compressed domain video saliency detection algorithm, which employs global and local spatiotemporal (GLST) features, is proposed in this work. We first conduct partial decoding of a compressed video bitstream to obtain motion vectors and DCT coefficients, from which GLST features are extracted. More specifically, we extract the spatial features of rarity, compactness, and center prior from DC coefficients by investigating the global color distribution in a frame. We also extract the spatial feature of texture contrast from AC coefficients to identify regions, whose local textures are distinct from those of neighboring regions. Moreover, we use the temporal features of motion intensity and motion contrast to detect visually important motions. Then, we generate spatial and temporal saliency maps, respectively, by linearly combining the spatial features and the temporal features. Finally, we fuse the two saliency maps into a spatiotemporal saliency map adaptively by comparing the robustness of the spatial features with that of the temporal features. Experimental results demonstrate that the proposed algorithm provides excellent saliency detection performance, while requiring low complexity and thus performing the detection in real-time.
Published: 2016
Full Text: View/download PDF

40. Change Detection in High Resolution Satellite Images Using an Ensemble of Convolutional Neural Networks

Author: Chang-Su Kim, Dongkwon Jin, and Kyungsun Lim
Subjects: business.industry, Computer science, Supervised learning, Feature extraction, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 0211 other engineering and technologies, Pattern recognition, 02 engineering and technology, Image segmentation, Convolutional neural network, 0202 electrical engineering, electronic engineering, information engineering, Preprocessor, RGB color model, 020201 artificial intelligence & image processing, Artificial intelligence, business, Change detection, Decoding methods, 021101 geological & geomatics engineering
Abstract: In this paper, we propose a novel change detection algorithm for high resolution satellite images using convolutional neural networks (CNNs), which does not require any preprocessing, such as ortho-rectification and classification. When analyzing multi-temporal satellite images, it is crucial to distinguish viewpoint or color variations of an identical object from actual changes. Especially in urban areas, the registration difficulty due to high-rise buildings makes conventional change detection techniques unreliable, if they are not combined with preprocessing schemes using digital surface models or multi-spectral information. We design three encoder-decoder-structured CNNs, which yield change maps from an input pair of RGB satellite images. For the supervised learning of these CNNs, we construct a large fully-labeled dataset using Google Earth images taken in different years and seasons. Experimental results demonstrate that the trained CNNs detect actual changes successfully, even though image pairs are neither perfectly registered nor color-corrected. Furthermore, an ensemble of the three CNNs provides excellent performance, outperforming each individual CNN.
Published: 2018
Full Text: View/download PDF

41. PAC-Net: Pairwise Aesthetic Comparison Network for Image Aesthetic Assessment

Author: Keunsoo Ko, Chang-Su Kim, and Jun-Tae Lee
Subjects: Subjectivity, Computer science, business.industry, Feature extraction, Pattern recognition, 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, Convolutional neural network, ComputerApplications_MISCELLANEOUS, 0202 electrical engineering, electronic engineering, information engineering, Task analysis, Entropy (information theory), 020201 artificial intelligence & image processing, Pairwise comparison, Artificial intelligence, business, 0105 earth and related environmental sciences
Abstract: Image aesthetic assessment is important for finding well taken and appealing photographs but is challenging due to the ambiguity and subjectivity of aesthetic criteria. We develop the pairwise aesthetic comparison network (PAC-Net), which consists of two parts: aesthetic feature extraction and pairwise feature comparison. To alleviate the ambiguity and subjectivity, we train PAC-Net to learn the relative aesthetic ranks of two images by employing a novel loss function, called aesthetic-adaptive cross entropy loss. Then, we develop simple schemes for using PAC-Net in the tasks of aesthetic ranking and aesthetic classification, respectively. Experimental results demonstrate that PAC-Net achieves the state-of-the-art performances in both the ranking and classification applications.
Published: 2018
Full Text: View/download PDF

42. Reliable Depth-of-Field Rendering Using Estimated Depth Maps

Author: Kyung-Rae Kim, Whan Choi, and Chang-Su Kim
Subjects: business.industry, Computer science, Feature extraction, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020207 software engineering, 02 engineering and technology, Rendering (computer graphics), Circle of confusion, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, Depth of field, business, ComputingMethodologies_COMPUTERGRAPHICS
Abstract: A reliable algorithm for rendering depth of field (DoF) effects using estimated depth maps, obtained through stereo matching, is proposed in this paper. The proposed algorithm generates blurring to simulate images spontaneously seen by human vision systems. We develop two types of windows: circle of confusion (CoC) blurring window and peripheral blurring window. First, the CoC blurring window is determined by comparing the depth values of a gazing point and each sample point. Second, the peripheral blurring window is obtained by calculating the distance between the gazing and sample points. Then, we combine the two windows to make the total blurring window. Finally, through a masking process, we modulate the total blurring window to provide a more natural DoF. Experimental results demonstrate that the proposed algorithm provides realistic blurring, by preserving edges clearly as well as blurring far points from the gazing point effectively.
Published: 2018
Full Text: View/download PDF

43. Sequential Clique Optimization for Video Object Segmentation

Author: Yeong Jun Koh, Young-Yoon Lee, and Chang-Su Kim
Subjects: Clique, Computer science, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 0202 electrical engineering, electronic engineering, information engineering, 020207 software engineering, 020201 artificial intelligence & image processing, Segmentation, Pattern recognition, 02 engineering and technology, Artificial intelligence, business, Graph
Abstract: A novel algorithm to segment out objects in a video sequence is proposed in this work. First, we extract object instances in each frame. Then, we select a visually important object instance in each frame to construct the salient object track through the sequence. This can be formulated as finding the maximal weight clique in a complete k-partite graph, which is NP-hard. Therefore, we develop the sequential clique optimization (SCO) technique to efficiently determine the cliques corresponding to salient object tracks. We convert these tracks into video object segmentation results. Experimental results show that the proposed algorithm significantly outperforms the state-of-the-art video object segmentation and video salient object detection algorithms on recent benchmark datasets.
Published: 2018
Full Text: View/download PDF

44. Video Stabilization Based on Feature Trajectory Augmentation and Selection and Robust Mesh Grid Warping

Author: Chulwoo Lee, Yeong Jun Koh, and Chang-Su Kim
Subjects: Matrix completion, Computer science, Mesh grid, business.industry, Feature extraction, Computer Graphics and Computer-Aided Design, Regularization (mathematics), Image stabilization, Robustness (computer science), Computer vision, Artificial intelligence, Image warping, business, Software
Abstract: We propose a video stabilization algorithm, which extracts a guaranteed number of reliable feature trajectories for robust mesh grid warping. We first estimate feature trajectories through a video sequence and transform the feature positions into rolling-free smoothed positions. When the number of the estimated trajectories is insufficient, we generate virtual trajectories by augmenting incomplete trajectories using a low-rank matrix completion scheme. Next, we detect feature points on a large moving object and exclude them so as to stabilize camera movements, rather than object movements. With the selected feature points, we set a mesh grid on each frame and warp each grid cell by moving the original feature positions to the smoothed ones. For robust warping, we formulate a cost function based on the reliability weights of each feature point and each grid cell. The cost function consists of a data term, a structure-preserving term, and a regularization term. By minimizing the cost function, we determine the robust mesh grid warping and achieve the stabilization. Experimental results demonstrate that the proposed algorithm reconstructs videos more stably than the conventional algorithms.
Published: 2015
Full Text: View/download PDF

45. Video Deraining and Desnowing Using Temporal Correlation and Low-Rank Matrix Completion

Author: Jae-Young Sim, Jin-Hwan Kim, and Chang-Su Kim
Subjects: Matrix completion, business.industry, Frame (networking), Streak, ComputerApplications_COMPUTERSINOTHERSYSTEMS, Low-rank approximation, Pattern recognition, Sparse approximation, Iterative reconstruction, Computer Graphics and Computer-Aided Design, GeneralLiterature_MISCELLANEOUS, Support vector machine, Outlier, Computer vision, Artificial intelligence, business, Software, ComputingMethodologies_COMPUTERGRAPHICS, Mathematics
Abstract: A novel algorithm to remove rain or snow streaks from a video sequence using temporal correlation and low-rank matrix completion is proposed in this paper. Based on the observation that rain streaks are too small and move too fast to affect the optical flow estimation between consecutive frames, we obtain an initial rain map by subtracting temporally warped frames from a current frame. Then, we decompose the initial rain map into basis vectors based on the sparse representation, and classify those basis vectors into rain streak ones and outliers with a support vector machine. We then refine the rain map by excluding the outliers. Finally, we remove the detected rain streaks by employing a low-rank matrix completion technique. Furthermore, we extend the proposed algorithm to stereo video deraining. Experimental results demonstrate that the proposed algorithm detects and removes rain or snow streaks efficiently, outperforming conventional algorithms.
Published: 2015
Full Text: View/download PDF

46. Spatiotemporal Saliency Detection for Video Sequences Based on Random Walk With Restart

Author: Chang-Su Kim, Young-Bae Kim, Jae-Young Sim, and Han-sang Kim
Subjects: business.industry, Feature extraction, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Stochastic matrix, Pattern recognition, Video sequence, Random walk, Computer Graphics and Computer-Aided Design, Motion (physics), Image (mathematics), Random walker algorithm, Salient, Computer Science::Computer Vision and Pattern Recognition, Computer vision, Artificial intelligence, business, Software, Mathematics
Abstract: A novel saliency detection algorithm for video sequences based on the random walk with restart (RWR) is proposed in this paper. We adopt RWR to detect spatially and temporally salient regions. More specifically, we first find a temporal saliency distribution using the features of motion distinctiveness, temporal consistency, and abrupt change. Among them, the motion distinctiveness is derived by comparing the motion profiles of image patches. Then, we employ the temporal saliency distribution as a restarting distribution of the random walker. In addition, we design the transition probability matrix for the walker using the spatial features of intensity, color, and compactness. Finally, we estimate the spatiotemporal saliency distribution by finding the steady-state distribution of the walker. The proposed algorithm detects foreground salient objects faithfully, while suppressing cluttered backgrounds effectively, by incorporating the spatial transition matrix and the temporal restarting distribution systematically. Experimental results on various video sequences demonstrate that the proposed algorithm outperforms conventional saliency detection algorithms qualitatively and quantitatively.
Published: 2015
Full Text: View/download PDF

47. Frame-level Matching for Near Duplicate Videos Using Binary Frame Descriptor

Author: Chang-Su Kim, Jun-Tae Lee, Won-Dong Jang, and Kyung-Rae Kim
Subjects: Matching (statistics), business.industry, Computer science, Frame (networking), ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Inter frame, Binary number, Function (mathematics), Residual frame, Computer vision, Artificial intelligence, business, Blossom algorithm, Block-matching algorithm
Abstract: In this paper, we propose a precise frame-level near-duplicate video matching algorithm. First, a binary frame descriptor for near-duplicate video matching is proposed. The binary frame descriptor divides a frame into patches and represents the relations between patches in bits. Second, we formulate a cost function for the matching, composed of matching costs and compensatory costs. Then, we roughly determine initial matchings and refine the matchings iteratively to minimize the cost function. Experimental results demonstrate that the proposed algorithm provides efficient performance for frame-level near-duplicate video matching. Keywords: near-duplicate video, frame-level video matching, binary frame descriptor.
Published: 2015
Full Text: View/download PDF

48. FDQM: Fast Quality Metric for Depth Maps Without View Synthesis

Author: Jae-Young Sim, Won-Dong Jang, Chang-Su Kim, and Tae-Young Chung
Subjects: Pixel, Computer science, Image quality, business.industry, media_common.quotation_subject, View synthesis, Depth map, Metric (mathematics), Media Technology, Benchmark (computing), Computer vision, Quality (business), Artificial intelligence, Electrical and Electronic Engineering, business, media_common
Abstract: We propose a fast quality metric for depth maps, called fast depth quality metric (FDQM), which efficiently evaluates the impacts of depth map errors on the qualities of synthesized intermediate views in multiview video plus depth applications. In other words, the proposed FDQM assesses view synthesis distortions in the depth map domain, without performing the actual view synthesis. First, we estimate the distortions at pixel positions, which are specified by reference disparities and distorted disparities, respectively. Then, we integrate those pixel-wise distortions into an FDQM score by employing a spatial pooling scheme, which considers occlusion effects and the characteristics of human visual attention. As a benchmark of depth map quality assessment, we perform a subjective evaluation test for intermediate views, which are synthesized from compressed depth maps at various bitrates. We compare the subjective results with objective metric scores. Experimental results demonstrate that the proposed FDQM yields scores that are highly correlated with the subjective ones. Moreover, FDQM requires at least 10 times fewer computations than conventional quality metrics, since it does not perform the actual view synthesis.
Published: 2015
Full Text: View/download PDF

49. Cubic Subalgebras and Cubic Closed Ideals of B-algebras

Author: Monoranjan Bhowmik, Madhumangal Pal, Chang Su Kim, and Tapan Senapati
Subjects: Pure mathematics, Cubic closed ideal, Inverse image, Cubic surface, Logic, 02 engineering and technology, Management Science and Operations Research, 01 natural sciences, Industrial and Manufacturing Engineering, Theoretical Computer Science, Set (abstract data type), Artificial Intelligence, 0202 electrical engineering, electronic engineering, information engineering, Cubic form, Cubic set, 0101 mathematics, Mathematics, Discrete mathematics, Mathematics::Commutative Algebra, lcsh:Mathematics, Applied Mathematics, Image (category theory), lcsh:QA1-939, 010101 applied mathematics, Cubic subalgebra, lcsh:TA1-2040, Control and Systems Engineering, Product (mathematics), 020201 artificial intelligence & image processing, B-algebra, lcsh:Engineering (General). Civil engineering (General), Information Systems
Abstract: In this paper, the concept of cubic set is introduced to subalgebras, ideals, and closed ideals of B-algebras. Relations among cubic subalgebras, cubic ideals, and cubic closed ideals of B-algebras are investigated. The homomorphic image and inverse image of cubic subalgebras and ideals are studied, and some related properties are investigated. Also, the product of cubic B-algebras is investigated.
Published: 2015
Full Text: View/download PDF

50. Large-Scale 3D Point Cloud Compression Using Adaptive Radial Distance Prediction in Hybrid Coordinate Domains

Author: Kyu-Yul Lee, Jae-Young Sim, Jae-Kyun Ahn, and Chang-Su Kim
Subjects: Pixel, Color image, business.industry, Binary image, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Point cloud, Luminance, Computer Science::Computer Vision and Pattern Recognition, Signal Processing, Computer vision, Artificial intelligence, Electrical and Electronic Engineering, Range segmentation, business, Mathematics, Color Cell Compression, Coding (social sciences)
Abstract: An adaptive range image coding algorithm for the geometry compression of large-scale 3D point clouds (LS3DPCs) is proposed in this work. A terrestrial laser scanner generates an LS3DPC by measuring the radial distances of objects in a real world scene, which can be mapped into a range image. In general, the range image exhibits different characteristics from an ordinary luminance or color image, and thus the conventional image coding techniques are not suitable for the range image coding. We propose a hybrid range image coding algorithm, which predicts the radial distance of each pixel using previously encoded neighbors adaptively in one of three coordinate domains: range image domain, height image domain, and 3D domain. We first partition an input range image into blocks of various sizes. For each block, we apply multiple prediction modes in the three domains and compute their rate-distortion costs. Then, we perform the prediction of all pixels using the optimal mode and encode the resulting prediction residuals. Experimental results show that the proposed algorithm provides significantly better compression performance on various range images than the conventional image or video coding techniques.
Published: 2015
Full Text: View/download PDF

208 results on '"Chang Su Kim"'