Author: "Cho, Nam Ik" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Cho, Nam Ik"' showing total 431 results

1. Leveraging Positional Encoding for Robust Multi-Reference-Based Object 6D Pose Estimation

Author: Park, Jaewoo, Kim, Jaeguk, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Accurately estimating the pose of an object is a crucial task in computer vision and robotics. There are two main deep learning approaches for this: geometric representation regression and iterative refinement. However, these methods have some limitations that reduce their effectiveness. In this paper, we analyze these limitations and propose new strategies to overcome them. To tackle the issue of blurry geometric representation, we use positional encoding with high-frequency components for the object's 3D coordinates. To address the local minimum problem in refinement methods, we introduce a normalized image plane-based multi-reference refinement strategy that's independent of intrinsic matrix constraints. Lastly, we utilize adaptive instance normalization and a simple occlusion augmentation method to help our model concentrate on the target object. Our experiments on Linemod, Linemod-Occlusion, and YCB-Video datasets demonstrate that our approach outperforms existing methods. We will soon release the code.
Published: 2024
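
The abstract of entry 1 mentions positional encoding with high-frequency components for an object's 3D coordinates but does not give the formula. As a rough illustration only, the sketch below uses the common sin/cos encoding at frequencies 2^k * pi; the function name `positional_encoding`, the parameter `num_freqs`, and the frequency schedule are assumptions, not the authors' released code.

```python
import math


def positional_encoding(coord, num_freqs=4):
    """Encode each component of a 3D coordinate with sin/cos pairs.

    Assumption: frequencies are 2^k * pi for k = 0 .. num_freqs - 1.
    The paper's exact schedule is not stated in the abstract.
    """
    encoded = []
    for value in coord:
        for k in range(num_freqs):
            freq = (2.0 ** k) * math.pi
            encoded.append(math.sin(freq * value))
            encoded.append(math.cos(freq * value))
    return encoded


# A hypothetical normalized object-space coordinate.
features = positional_encoding((0.5, 0.0, -0.25), num_freqs=4)
# 3 components * 4 frequencies * 2 functions (sin, cos) = 24 values
print(len(features))
```

Higher values of `num_freqs` add faster-varying components, which is the usual motivation for this kind of encoding when a regressed geometric map looks blurry.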

2. High Dynamic Range Imaging of Dynamic Scenes with Saturation Compensation but without Explicit Motion Compensation

Author: Chung, Haesoo and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: High dynamic range (HDR) imaging is a highly challenging task since a large amount of information is lost due to the limitations of camera sensors. For HDR imaging, some methods capture multiple low dynamic range (LDR) images with altering exposures to aggregate more information. However, these approaches introduce ghosting artifacts when significant inter-frame motions are present. Moreover, although multi-exposure images are given, we have little information in severely over-exposed areas. Most existing methods focus on motion compensation, i.e., alignment of multiple LDR shots to reduce the ghosting artifacts, but they still produce unsatisfying results. These methods also rather overlook the need to restore the saturated areas. In this paper, we generate well-aligned multi-exposure features by reformulating a motion alignment problem into a simple brightness adjustment problem. In addition, we propose a coarse-to-fine merging strategy with explicit saturation compensation. The saturated areas are reconstructed with similar well-exposed content using adaptive contextual attention. We demonstrate that our method outperforms the state-of-the-art methods regarding qualitative and quantitative evaluations., Comment: WACV 2022
Published: 2023

3. LAN-HDR: Luminance-based Alignment Network for High Dynamic Range Video Reconstruction

Author: Chung, Haesoo and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: As demands for high-quality videos continue to rise, high-resolution and high-dynamic range (HDR) imaging techniques are drawing attention. To generate an HDR video from low dynamic range (LDR) images, one of the critical steps is the motion compensation between LDR frames, for which most existing works employed the optical flow algorithm. However, these methods suffer from flow estimation errors when saturation or complicated motions exist. In this paper, we propose an end-to-end HDR video composition framework, which aligns LDR frames in the feature space and then merges aligned features into an HDR frame, without relying on pixel-domain optical flow. Specifically, we propose a luminance-based alignment network for HDR (LAN-HDR) consisting of an alignment module and a hallucination module. The alignment module aligns a frame to the adjacent reference by evaluating luminance-based attention, excluding color information. The hallucination module generates sharp details, especially for washed-out areas due to saturation. The aligned and hallucinated features are then blended adaptively to complement each other. Finally, we merge the features to generate a final HDR frame. In training, we adopt a temporal loss, in addition to frame reconstruction losses, to enhance temporal consistency and thus reduce flickering. Extensive experiments demonstrate that our method performs better or comparable to state-of-the-art methods on several benchmarks., Comment: ICCV 2023
Published: 2023

4. Self-supervised Image Denoising with Downsampled Invariance Loss and Conditional Blind-Spot Network

Author: Jang, Yeong Il, Lee, Keuntek, Park, Gu Yong, Kim, Seyun, and Cho, Nam Ik
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: There have been many image denoisers using deep neural networks, which outperform conventional model-based methods by large margins. Recently, self-supervised methods have attracted attention because constructing a large real noise dataset for supervised training is an enormous burden. The most representative self-supervised denoisers are based on blind-spot networks, which exclude the receptive field's center pixel. However, excluding any input pixel is abandoning some information, especially when the input pixel at the corresponding output position is excluded. In addition, a standard blind-spot network fails to reduce real camera noise due to the pixel-wise correlation of noise, though it successfully removes independently distributed synthetic noise. Hence, to realize a more practical denoiser, we propose a novel self-supervised training framework that can remove real noise. For this, we derive the theoretic upper bound of a supervised loss where the network is guided by the downsampled blinded output. Also, we design a conditional blind-spot network (C-BSN), which selectively controls the blindness of the network to use the center pixel information. Furthermore, we exploit a random subsampler to decorrelate noise spatially, making the C-BSN free of visual artifacts that were often seen in downsample-based methods. Extensive experiments show that the proposed C-BSN achieves state-of-the-art performance on real-world datasets as a self-supervised denoiser and shows qualitatively pleasing results without any post-processing or refinement., Comment: Accepted to ICCV 2023
Published: 2023

5. Lightweight Hybrid Video Compression Framework Using Reference-Guided Restoration Network

Author: Rhee, Hochang, Kim, Seyun, and Cho, Nam Ik
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Recent deep-learning-based video compression methods brought coding gains over conventional codecs such as AVC and HEVC. However, learning-based codecs generally require considerable computation time and model complexity. In this paper, we propose a new lightweight hybrid video codec consisting of a conventional video codec(HEVC / VVC), a lossless image codec, and our new restoration network. Precisely, our encoder consists of the conventional video encoder and a lossless image encoder, transmitting a lossy-compressed video bitstream along with a losslessly-compressed reference frame. The decoder is constructed with corresponding video/image decoders and a new restoration network, which enhances the compressed video in two-step processes. In the first step, a network trained with a large video dataset restores the details lost by the conventional encoder. Then, we further boost the video quality with the guidance of a reference image, which is a losslessly compressed video frame. The reference image provides video-specific information, which can be utilized to better restore the details of a compressed video. Experimental results show that the proposed method achieves comparable performance to top-tier methods, even when applied to HEVC. Nevertheless, our method has lower complexity, a faster run time, and can be easily integrated into existing conventional codecs.
Published: 2023

6. Perception-Oriented Single Image Super-Resolution using Optimal Objective Estimation

Author: Park, Seung Ho, Moon, Young Su, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Single-image super-resolution (SISR) networks trained with perceptual and adversarial losses provide high-contrast outputs compared to those of networks trained with distortion-oriented losses, such as L1 or L2. However, it has been shown that using a single perceptual loss is insufficient for accurately restoring locally varying diverse shapes in images, often generating undesirable artifacts or unnatural details. For this reason, combinations of various losses, such as perceptual, adversarial, and distortion losses, have been attempted, yet it remains challenging to find optimal combinations. Hence, in this paper, we propose a new SISR framework that applies optimal objectives for each region to generate plausible results in overall areas of high-resolution outputs. Specifically, the framework comprises two models: a predictive model that infers an optimal objective map for a given low-resolution (LR) input and a generative model that applies a target objective map to produce the corresponding SR output. The generative model is trained over our proposed objective trajectory representing a set of essential objectives, which enables the single network to learn various SR results corresponding to combined losses on the trajectory. The predictive model is trained using pairs of LR images and corresponding optimal objective maps searched from the objective trajectory. Experimental results on five benchmarks show that the proposed method outperforms state-of-the-art perception-driven SR methods in LPIPS, DISTS, PSNR, and SSIM metrics. The visual results also demonstrate the superiority of our method in perception-oriented reconstruction. The code and models are available at https://github.com/seungho-snu/SROOE., Comment: CVPR 2023 accepted. Code and trained models will be available at https://github.com/seungho-snu/SROOE
Published: 2022

7. Training Patch Analysis and Mining Skills for Image Restoration Deep Neural Networks

Author: Soh, Jae Woong and Cho, Nam Ik
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: There have been numerous image restoration methods based on deep convolutional neural networks (CNNs). However, most of the literature on this topic focused on the network architecture and loss functions, while less detailed on the training methods. Hence, some of the works are not easily reproducible because it is required to know the hidden training skills to obtain the same results. To be specific with the training dataset, few works discussed how to prepare and order the training image patches. Moreover, it requires a high cost to capture new datasets to train a restoration network for the real-world scene. Hence, we believe it is necessary to study the preparation and selection of training data. In this regard, we present an analysis of the training patches and explore the consequences of different patch extraction methods. Eventually, we propose a guideline for the patch extraction from given training images., Comment: 8 pages
Published: 2022

8. Variational Deep Image Restoration

Author: Soh, Jae Woong and Cho, Nam Ik
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper presents a new variational inference framework for image restoration and a convolutional neural network (CNN) structure that can solve the restoration problems described by the proposed framework. Earlier CNN-based image restoration methods primarily focused on network architecture design or training strategy with non-blind scenarios where the degradation models are known or assumed. For a step closer to real-world applications, CNNs are also blindly trained with the whole dataset, including diverse degradations. However, the conditional distribution of a high-quality image given a diversely degraded one is too complicated to be learned by a single CNN. Therefore, there have also been some methods that provide additional prior information to train a CNN. Unlike previous approaches, we focus more on the objective of restoration based on the Bayesian perspective and how to reformulate the objective. Specifically, our method relaxes the original posterior inference problem to better manageable sub-problems and thus behaves like a divide-and-conquer scheme. As a result, the proposed framework boosts the performance of several restoration problems compared to the previous ones. Specifically, our method delivers state-of-the-art performance on Gaussian denoising, real-world noise reduction, blind image super-resolution, and JPEG compression artifacts reduction., Comment: IEEE Transactions on Image Processing (TIP 2022)
Published: 2022

9. One-Shot Face Reenactment on Megapixels

Author: Kang, Wonjun, Lee, Geonsu, Koo, Hyung Il, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The goal of face reenactment is to transfer a target expression and head pose to a source face while preserving the source identity. With the popularity of face-related applications, there has been much research on this topic. However, the results of existing methods are still limited to low-resolution and lack photorealism. In this work, we present a one-shot and high-resolution face reenactment method called MegaFR. To be precise, we leverage StyleGAN by using 3DMM-based rendering images and overcome the lack of high-quality video datasets by designing a loss function that works without high-quality videos. Also, we apply iterative refinement to deal with extreme poses and/or expressions. Since the proposed method controls source images through 3DMM parameters, we can explicitly manipulate source images. We apply MegaFR to various applications such as face frontalization, eye in-painting, and talking head generation. Experimental results show that our method successfully disentangles identity from expression and head pose, and outperforms conventional methods., Comment: 29 pages, 19 figures
Published: 2022

10. Flexible Style Image Super-Resolution using Conditional Objective

Author: Park, Seung Ho, Moon, Young Su, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Recent studies have significantly enhanced the performance of single-image super-resolution (SR) using convolutional neural networks (CNNs). While there can be many high-resolution (HR) solutions for a given input, most existing CNN-based methods do not explore alternative solutions during the inference. A typical approach to obtaining alternative SR results is to train multiple SR models with different loss weightings and exploit the combination of these models. Instead of using multiple models, we present a more efficient method to train a single adjustable SR model on various combinations of losses by taking advantage of multi-task learning. Specifically, we optimize an SR model with a conditional objective during training, where the objective is a weighted sum of multiple perceptual losses at different feature levels. The weights vary according to given conditions, and the set of weights is defined as a style controller. Also, we present an architecture appropriate for this training scheme, which is the Residual-in-Residual Dense Block equipped with spatial feature transformation layers. At the inference phase, our trained model can generate locally different outputs conditioned on the style control map. Extensive experiments show that the proposed SR model produces various desirable reconstructions without artifacts and yields comparable quantitative performance to state-of-the-art SR methods., Comment: Will be presented in IEEE ACCESS. Code and trained models will be available at https://github.com/seungho-snu/FxSR
Published: 2022

11. Deep Hash Distillation for Image Retrieval

Author: Jang, Young Kyun, Gu, Geonmo, Ko, Byungsoo, Kang, Isaac, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Information Retrieval
Abstract: In hash-based image retrieval systems, degraded or transformed inputs usually generate different codes from the original, deteriorating the retrieval accuracy. To mitigate this issue, data augmentation can be applied during training. However, even if augmented samples of an image are similar in real feature space, the quantization can scatter them far away in Hamming space. This results in representation discrepancies that can impede training and degrade performance. In this work, we propose a novel self-distilled hashing scheme to minimize the discrepancy while exploiting the potential of augmented data. By transferring the hash knowledge of the weakly-transformed samples to the strong ones, we make the hash code insensitive to various transformations. We also introduce hash proxy-based similarity learning and binary cross entropy-based quantization loss to provide fine quality hash codes. Ultimately, we construct a deep hashing framework that not only improves the existing deep hashing approaches, but also achieves the state-of-the-art retrieval results. Extensive experiments are conducted and confirm the effectiveness of our work., Comment: ECCV2022
Published: 2021
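
The abstract of entry 11 notes that augmented samples can be close in real feature space yet land far apart in Hamming space after quantization. The sketch below illustrates that effect with sign-based binarization, which is an assumed stand-in for the paper's quantizer; the feature values and the helper names `binarize` and `hamming_distance` are hypothetical.

```python
import math


def binarize(features):
    """Sign-based quantization: non-negative values map to 1, negative to 0."""
    return [1 if v >= 0 else 0 for v in features]


def hamming_distance(code_a, code_b):
    """Count the positions where two equal-length binary codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))


# Hypothetical features of an image and a lightly augmented copy.
original = [0.05, -0.02, 0.80, -0.60]
augmented = [-0.05, 0.02, 0.78, -0.61]

# The two vectors are close in real feature space ...
euclidean = math.dist(original, augmented)
# ... but two values sit near zero and flip sign, so half the bits differ.
hamming = hamming_distance(binarize(original), binarize(augmented))
print(euclidean, hamming)
```

Only the two components near the quantization boundary change, yet they alter half of the 4-bit code, which is the kind of discrepancy the abstract describes.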

12. DProST: Dynamic Projective Spatial Transformer Network for 6D Pose Estimation

Author: Park, Jaewoo and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Predicting the object's 6D pose from a single RGB image is a fundamental computer vision task. Generally, the distance between transformed object vertices is employed as an objective function for pose estimation methods. However, projective geometry in the camera space is not considered in those methods and causes performance degradation. In this regard, we propose a new pose estimation system based on a projective grid instead of object vertices. Our pose estimation method, dynamic projective spatial transformer network (DProST), localizes the region of interest grid on the rays in camera space and transforms the grid to object space by estimated pose. The transformed grid is used as both a sampling grid and a new criterion of the estimated pose. Additionally, because DProST does not require object vertices, our method can be used in a mesh-less setting by replacing the mesh with a reconstructed feature. Experimental results show that mesh-less DProST outperforms the state-of-the-art mesh-based methods on the LINEMOD and LINEMOD-OCCLUSION dataset, and shows competitive performance on the YCBV dataset with mesh data. The source code is available at https://github.com/parkjaewoo0611/DProST, Comment: Accepted to ECCV 2022
Published: 2021

13. LC-FDNet: Learned Lossless Image Compression with Frequency Decomposition Network

Author: Rhee, Hochang, Jang, Yeong Il, Kim, Seyun, and Cho, Nam Ik
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Recent learning-based lossless image compression methods encode an image in units of subimages and achieve performance comparable to conventional non-learning algorithms. However, these methods do not consider the performance drop in the high-frequency region, giving equal consideration to the low and high-frequency areas. In this paper, we propose a new lossless image compression method that performs the encoding in a coarse-to-fine manner to separate and process low and high-frequency regions differently. We initially compress the low-frequency components and then use them as additional input for encoding the remaining high-frequency region. The low-frequency components act as a strong prior in this case, which leads to improved estimation in the high-frequency area. In addition, we design the frequency decomposition process to be adaptive to color channel, spatial location, and image characteristics. As a result, our method derives an image-specific optimal ratio of low/high-frequency components. Experiments show that the proposed method achieves state-of-the-art performance for benchmark high-resolution datasets.
Published: 2021

14. A Dynamic Residual Self-Attention Network for Lightweight Single Image Super-Resolution

Author: Park, Karam, Soh, Jae Woong, and Cho, Nam Ik
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Deep learning methods have shown outstanding performance in many applications, including single-image super-resolution (SISR). With residual connection architecture, deeply stacked convolutional neural networks provide a substantial performance boost for SISR, but their huge parameters and computational loads are impractical for real-world applications. Thus, designing lightweight models with acceptable performance is one of the major tasks in current SISR research. The objective of lightweight network design is to balance a computational load and reconstruction performance. Most of the previous methods have manually designed complex and predefined fixed structures, which generally required a large number of experiments and lacked flexibility in the diversity of input image statistics. In this paper, we propose a dynamic residual self-attention network (DRSAN) for lightweight SISR, while focusing on the automated design of residual connections between building blocks. The proposed DRSAN has dynamic residual connections based on dynamic residual attention (DRA), which adaptively changes its structure according to input statistics. Specifically, we propose a dynamic residual module that explicitly models the DRA by finding the interrelation between residual paths and input image statistics, as well as assigning proper weights to each residual path. We also propose a residual self-attention (RSA) module to further boost the performance, which produces 3-dimensional attention maps without additional parameters by cooperating with residual structures. The proposed dynamic scheme, exploiting the combination of DRA and RSA, shows an efficient trade-off between computational complexity and network performance. Experimental results show that the DRSAN performs better than or comparable to existing state-of-the-art lightweight models for SISR., Comment: Accepted for publication as a regular paper in the IEEE Transactions on Multimedia
Published: 2021

15. Self-supervised Product Quantization for Deep Unsupervised Image Retrieval

Author: Jang, Young Kyun and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Supervised deep learning-based hash and vector quantization are enabling fast and large-scale image retrieval systems. By fully exploiting label annotations, they are achieving outstanding retrieval performances compared to the conventional methods. However, it is painstaking to assign labels precisely for a vast amount of training data, and also, the annotation process is error-prone. To tackle these issues, we propose the first deep unsupervised image retrieval method dubbed Self-supervised Product Quantization (SPQ) network, which is label-free and trained in a self-supervised manner. We design a Cross Quantized Contrastive learning strategy that jointly learns codewords and deep visual descriptors by comparing individually transformed images (views). Our method analyzes the image contents to extract descriptive features, allowing us to understand image representations for accurate retrieval. By conducting extensive experiments on benchmarks, we demonstrate that the proposed method yields state-of-the-art results even without supervised pretraining., Comment: ICCV 2021
Published: 2021
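
Entry 15 builds on product quantization, in which a descriptor is split into sub-vectors and each sub-vector is replaced by the index of its nearest codeword. The sketch below shows only that generic encoding step with fixed, hand-written codebooks; the codebook values and function names are hypothetical, and the self-supervised learning of codewords described in the abstract is not reproduced.

```python
def product_quantize(vector, codebooks):
    """Return one codeword index per sub-vector (nearest by squared L2)."""
    num_sub = len(codebooks)
    if len(vector) % num_sub != 0:
        raise ValueError("vector length must be divisible by the number of codebooks")
    sub_dim = len(vector) // num_sub
    codes = []
    for m, codebook in enumerate(codebooks):
        sub = vector[m * sub_dim:(m + 1) * sub_dim]
        distances = [
            sum((s - c) ** 2 for s, c in zip(sub, codeword))
            for codeword in codebook
        ]
        codes.append(distances.index(min(distances)))
    return codes


def reconstruct(codes, codebooks):
    """Concatenate the selected codewords to approximate the original vector."""
    approx = []
    for index, codebook in zip(codes, codebooks):
        approx.extend(codebook[index])
    return approx


# Two hypothetical codebooks, each holding two 2-D codewords.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
descriptor = [0.9, 0.8, 0.1, 0.7]
codes = product_quantize(descriptor, codebooks)
print(codes, reconstruct(codes, codebooks))
```

Storing `codes` instead of the full descriptor is what makes retrieval compact: each sub-vector costs only the bits needed to index its codebook.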

16. Similarity Guided Deep Face Image Retrieval

Author: Jang, Young Kyun and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Information Retrieval
Abstract: Face image retrieval, which searches for images of the same identity from the query input face image, is drawing more attention as the size of the image database increases rapidly. In order to conduct fast and accurate retrieval, compact hash code-based methods have been proposed, and recently, deep face image hashing methods with supervised classification training have shown outstanding performance. However, the classification-based scheme has a disadvantage in that it cannot incorporate complex similarities between face images into the hash code learning. In this paper, we attempt to improve the face image retrieval quality by proposing a Similarity Guided Hashing (SGH) method, which gently considers self and pairwise-similarity simultaneously. SGH employs various data augmentations designed to explore elaborate similarities between face images, solving both intra and inter identity-wise difficulties. Extensive experimental results on the protocols with existing benchmarks and an additionally proposed large scale higher resolution face image dataset demonstrate that our SGH delivers state-of-the-art retrieval performance., Comment: 10 pages, 9 figures
Published: 2021

17. Neural Architecture Search for Image Super-Resolution Using Densely Constructed Search Space: DeCoNAS

Author: Ahn, Joon Young and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The recent progress of deep convolutional neural networks has enabled great success in single image super-resolution (SISR) and many other vision tasks. Their performances are also being increased by deepening the networks and developing more sophisticated network structures. However, finding an optimal structure for the given problem is a difficult task, even for human experts. For this reason, neural architecture search (NAS) methods have been introduced, which automate the procedure of constructing the structures. In this paper, we expand the NAS to the super-resolution domain and find a lightweight densely connected network named DeCoNASNet. We use a hierarchical search strategy to find the best connection with local and global features. In this process, we define a complexity-based penalty for solving image super-resolution, which can be considered a multi-objective problem. Experiments show that our DeCoNASNet outperforms the state-of-the-art lightweight super-resolution networks designed by handcraft methods and existing NAS-based design.
Published: 2021

18. Variational Deep Image Denoising

Author: Soh, Jae Woong and Cho, Nam Ik
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Convolutional neural networks (CNNs) have shown outstanding performance on image denoising with the help of large-scale datasets. Earlier methods naively trained a single CNN with many pairs of clean-noisy images. However, the conditional distribution of the clean image given a noisy one is too complicated and diverse, so a single CNN cannot learn such distributions well. Therefore, there have also been some methods that exploit additional noise level parameters or train a separate CNN for a specific noise level parameter. These methods separate the original problem into easier sub-problems and thus have shown better performance than the naively trained CNN. In this step, we raise two questions. The first one is whether it is an optimal approach to relate the conditional distribution only to noise level parameters. The second is what if we do not have noise level information, such as in a real-world scenario. To answer the questions and provide a better solution, we propose a novel Bayesian framework based on the variational approximation of objective functions. This enables us to separate the complicated target distribution into simpler sub-distributions. Eventually, the denoising CNN can conquer noise from each sub-distribution, which is generally an easier problem than the original. Experiments show that the proposed method provides remarkable performance on additive white Gaussian noise (AWGN) and real-noise denoising while requiring fewer parameters than recent state-of-the-art denoisers., Comment: 16 pages
Published: 2021

19. Deep Universal Blind Image Denoising

Author: Soh, Jae Woong and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Image denoising is an essential part of many image processing and computer vision tasks due to inevitable noise corruption during image acquisition. Traditionally, many researchers have investigated image priors for the denoising, within the Bayesian perspective based on image properties and statistics. Recently, deep convolutional neural networks (CNNs) have shown great success in image denoising by incorporating large-scale synthetic datasets. However, they both have pros and cons. While the deep CNNs are powerful for removing the noise with known statistics, they tend to lack flexibility and practicality for the blind and real-world noise. Moreover, they cannot easily employ explicit priors. On the other hand, traditional non-learning methods can involve explicit image priors, but they require considerable computation time and cannot exploit large-scale external datasets. In this paper, we present a CNN-based method that leverages the advantages of both methods based on the Bayesian perspective. Concretely, we divide the blind image denoising problem into sub-problems and conquer each inference problem separately. As the CNN is a powerful tool for inference, our method is rooted in CNNs, and we propose a novel network design for efficient inference. With our proposed method, we can successfully remove blind and real-world noise, with a moderate number of parameters of universal CNN., Comment: Presented in ICPR 2020 (Oral)
Published: 2021

20. Meta-Transfer Learning for Zero-Shot Super-Resolution

Author: Soh, Jae Woong, Cho, Sunwoo, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Convolutional neural networks (CNNs) have shown dramatic improvements in single image super-resolution (SISR) by using large-scale external samples. Despite their remarkable performance based on the external dataset, they cannot exploit internal information within a specific image. Another problem is that they are applicable only to the specific condition of data that they are supervised. For instance, the low-resolution (LR) image should be a "bicubic" downsampled noise-free image from a high-resolution (HR) one. To address both issues, zero-shot super-resolution (ZSSR) has been proposed for flexible internal learning. However, they require thousands of gradient updates, i.e., long inference time. In this paper, we present Meta-Transfer Learning for Zero-Shot Super-Resolution (MZSR), which leverages ZSSR. Precisely, it is based on finding a generic initial parameter that is suitable for internal learning. Thus, we can exploit both external and internal information, where one single gradient update can yield quite considerable results. (See Figure 1). With our method, the network can quickly adapt to a given image condition. In this respect, our method can be applied to a large spectrum of image conditions within a fast adaptation process., Comment: Will be presented in CVPR 2020
Published: 2020

21. Transfer Learning from Synthetic to Real-Noise Denoising with Adaptive Instance Normalization

Author: Kim, Yoonsik, Soh, Jae Woong, Park, Gu Yong, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Real-noise denoising is a challenging task because the statistics of real-noise do not follow the normal distribution, and they are also spatially and temporally changing. In order to cope with various and complex real-noise, we propose a well-generalized denoising architecture and a transfer learning scheme. Specifically, we adopt an adaptive instance normalization to build a denoiser, which can regularize the feature map and prevent the network from overfitting to the training set. We also introduce a transfer learning scheme that transfers knowledge learned from synthetic-noise data to the real-noise denoiser. From the proposed transfer learning, the synthetic-noise denoiser can learn general features from various synthetic-noise data, and the real-noise denoiser can learn the real-noise characteristics from real data. From the experiments, we find that the proposed denoising method has great generalization ability, such that our network trained with synthetic-noise achieves the best performance for Darmstadt Noise Dataset (DND) among the methods from published papers. We can also see that the proposed transfer learning scheme robustly works for real-noise images through the learning with a very small number of labeled data., Comment: CVPR accepted paper. The paper will be updated according to reviewers' comments
Published: 2020

22. Generalized Product Quantization Network for Semi-supervised Image Retrieval

Author: Jang, Young Kyun and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Image retrieval methods that employ hashing or vector quantization have achieved great success by taking advantage of deep learning. However, these approaches do not meet expectations unless expensive label information is sufficient. To resolve this issue, we propose the first quantization-based semi-supervised image retrieval scheme: Generalized Product Quantization (GPQ) network. We design a novel metric learning strategy that preserves semantic similarity between labeled data, and employ an entropy regularization term to fully exploit the inherent potential of unlabeled data. Our solution increases the generalization capacity of the quantization network, which allows overcoming previous limitations in the retrieval community. Extensive experimental results demonstrate that GPQ yields state-of-the-art performance on large-scale real image benchmark datasets., Comment: 10 pages, 10 figures, Computer Vision and Pattern Recognition (CVPR) 2020 accepted paper
Published: 2020

23. Automatic Video Object Segmentation via Motion-Appearance-Stream Fusion and Instance-aware Segmentation

Author: Choo, Sungkwon, Seo, Wonkyo, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper presents a method for automatic video object segmentation based on the fusion of motion stream, appearance stream, and instance-aware segmentation. The proposed scheme consists of a two-stream fusion network and an instance segmentation network. The two-stream fusion network in turn consists of motion and appearance stream networks, which extract long-term temporal and spatial information, respectively. Unlike the existing two-stream fusion methods, the proposed fusion network blends the two streams at the original resolution to obtain accurate segmentation boundaries. We develop a recurrent bidirectional multiscale structure with skip connection for the stream fusion network to extract long-term temporal information. Also, the multiscale structure makes it possible to obtain the original resolution features at the end of the network. As a result of two-stream fusion, we have a pixel-level probabilistic segmentation map, which has higher values at the pixels belonging to the foreground object. By combining the probability of foreground map and objectness score of instance segmentation mask, we finally obtain foreground segmentation results for video sequences without any user intervention, i.e., we achieve successful automatic video segmentation. The proposed structure shows state-of-the-art performance on the automatic video object segmentation task, and also achieves near semi-supervised performance., Comment: 8+1 pages, 5 figures
Published: 2019

24. Natural and Realistic Single Image Super-Resolution with Explicit Natural Manifold Discrimination

Author: Soh, Jae Woong, Park, Gu Yong, Jo, Junho, and Cho, Nam Ik
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Recently, many convolutional neural networks for single image super-resolution (SISR) have been proposed, which focus on reconstructing the high-resolution images in terms of objective distortion measures. However, the networks trained with objective loss functions generally fail to reconstruct the realistic fine textures and details that are essential for better perceptual quality. Recovering the realistic details remains a challenging problem, and only a few works have been proposed which aim at increasing the perceptual quality by generating enhanced textures. However, the generated fake details often cause undesirable artifacts, and the overall image looks somewhat unnatural. Therefore, in this paper, we present a new approach to reconstructing realistic super-resolved images with high perceptual quality, while maintaining the naturalness of the result. In particular, we focus on the domain prior properties of the SISR problem. Specifically, we define the naturalness prior in the low-level domain and constrain the output image in the natural manifold, which eventually generates more natural and realistic images. Our results show better naturalness compared to the recent super-resolution algorithms including perception-oriented ones., Comment: Presented in CVPR 2019
Published: 2019

25. Distill-2MD-MTL: Data Distillation based on Multi-Dataset Multi-Domain Multi-Task Frame Work to Solve Face Related Tasks, Multi Task Learning, Semi-Supervised Learning

Author: Hosseini, Sepidehsadat, Shabani, Mohammad Amin, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: We propose a new semi-supervised learning method on face-related tasks based on Multi-Task Learning (MTL) and data distillation. The proposed method exploits multiple datasets with different labels for different-but-related tasks such as simultaneous age, gender, race, facial expression estimation. Specifically, when there are only a few well-labeled data for a specific task among the multiple related ones, we exploit the labels of other related tasks in different domains. Our approach is composed of (1) a new MTL method which can deal with weakly labeled datasets and perform several tasks simultaneously, and (2) an MTL-based data distillation framework which enables network generalization for the training and test data from different domains. Experiments show that the proposed multi-task system performs each task better than the baseline single task. It is also demonstrated that using different domain datasets along with the main dataset can enhance network generalization and overcome the domain differences between datasets. Also, comparing data distillation both on the baseline and MTL framework, the latter shows more accurate predictions on unlabeled data from different domains. Furthermore, by proposing a new learning-rate optimization method, our proposed network is able to dynamically tune its learning rate.
Published: 2019

26. Handwritten Text Segmentation via End-to-End Learning of Convolutional Neural Network

Author: Jo, Junho, Koo, Hyung Il, Soh, Jae Woong, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We present a new handwritten text segmentation method by training a convolutional neural network (CNN) in an end-to-end manner. Many conventional methods addressed this problem by extracting connected components and then classifying them. However, this two-step approach has limitations when handwritten components and machine-printed parts are overlapping. Unlike conventional methods, we develop an end-to-end deep CNN for this problem, which does not need any preprocessing steps. Since there is no publicly available dataset for this goal and pixel-wise annotations are time-consuming and costly, we also propose a data synthesis algorithm that generates realistic training samples. For training our network, we develop a cross-entropy based loss function that addresses the imbalance problems. Experimental results on synthetic and real images show the effectiveness of the proposed method. Specifically, although the proposed network has been trained solely on synthetic images, the removal of handwritten text in real documents improves OCR performance from 71.13% to 92.50%, showing the generalization performance of our network and synthesized images.
Published: 2019

27. Joint High Dynamic Range Imaging and Super-Resolution from a Single Image

Author: Soh, Jae Woong, Park, Jae Sung, and Cho, Nam Ik
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition, 68T45
Abstract: This paper presents a new framework for jointly enhancing the resolution and the dynamic range of an image, i.e., simultaneous super-resolution (SR) and high dynamic range imaging (HDRI), based on a convolutional neural network (CNN). From the common trends of both tasks, we train a CNN for the joint HDRI and SR by focusing on the reconstruction of high-frequency details. Specifically, the high-frequency component in our work is the reflectance component according to the Retinex-based image decomposition, and only the reflectance component is manipulated by the CNN while another component (illumination) is processed in a conventional way. In training the CNN, we devise an appropriate loss function that contributes to the naturalness quality of resulting images. Experiments show that our algorithm outperforms the cascade implementation of CNN-based SR and HDRI., Comment: 11 pages
Published: 2019

28. PuVAE: A Variational Autoencoder to Purify Adversarial Examples

Author: Hwang, Uiwon, Park, Jaewoo, Jang, Hyemi, Yoon, Sungroh, and Cho, Nam Ik
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: Deep neural networks are widely used and exhibit excellent performance in many areas. However, they are vulnerable to adversarial attacks that compromise the network at the inference time by applying elaborately designed perturbation to input data. Although several defense methods have been proposed to address specific attacks, other attack methods can circumvent these defense mechanisms. Therefore, we propose Purifying Variational Autoencoder (PuVAE), a method to purify adversarial examples. The proposed method eliminates an adversarial perturbation by projecting an adversarial example on the manifold of each class, and determines the closest projection as a purified sample. We experimentally illustrate the robustness of PuVAE against various attack methods without any prior knowledge. In our experiments, the proposed method exhibits performance competitive with state-of-the-art defense methods, and the inference time is approximately 130 times faster than that of Defense-GAN, which is the state-of-the-art purifier model.
Published: 2019

29. DProST: Dynamic Projective Spatial Transformer Network for 6D Pose Estimation

Author: Park, Jaewoo, Cho, Nam Ik, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
Published: 2022
Full Text: View/download PDF

30. Deep Hash Distillation for Image Retrieval

Author: Jang, Young Kyun, Gu, Geonmo, Ko, Byungsoo, Kang, Isaac, Cho, Nam Ik, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
Published: 2022
Full Text: View/download PDF

31. Understanding and explaining convolutional neural networks based on inverse approach

Author: Kwon, Hyuk Jin, Koo, Hyung Il, and Cho, Nam Ik
Published: 2023
Full Text: View/download PDF

32. Feeding Hand-Crafted Features for Enhancing the Performance of Convolutional Neural Networks

Author: Hosseini, Sepidehsadat, Lee, Seok Hee, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Since the convolutional neural network (CNN) is believed to find right features for a given problem, the study of hand-crafted features is somewhat neglected these days. In this paper, we show that finding an appropriate feature for the given problem may be still important as they can enhance the performance of CNN-based algorithms. Specifically, we show that feeding an appropriate feature to the CNN enhances its performance in some face related works such as age/gender estimation, face detection and emotion recognition. We use Gabor filter bank responses for these tasks, feeding them to the CNN along with the input image. The stack of image and Gabor responses can be fed to the CNN as a tensor input, or as a fused image which is a weighted sum of image and Gabor responses. The Gabor filter parameters can also be tuned depending on the given problem, for increasing the performance. From the extensive experiments, it is shown that the proposed methods provide better performance than the conventional CNN-based methods that use only the input images., Comment: 8 pages
Published: 2018

33. Strain Analysis of Multi-Phase Steel Using In-Situ EBSD Tensile Testing and Digital Image Correlation

Author: Kim, Kyung Il, Oh, Yeonju, Kim, Dong Uk, Kang, Joo-Hee, Cho, Nam Ik, Oh, Kyu Hwan, Kang, Jun-Yun, and Han, Heung Nam
Published: 2022
Full Text: View/download PDF

34. Generation of High Dynamic Range Illumination from a Single Image for the Enhancement of Undesirably Illuminated Images

Author: Park, Jae Sung and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics
Abstract: This paper presents an algorithm that enhances undesirably illuminated images by generating and fusing multi-level illuminations from a single image. The input image is first decomposed into illumination and reflectance components by using an edge-preserving smoothing filter. Then the reflectance component is scaled up to improve the image details in bright areas. The illumination component is scaled up and down to generate several illumination images that correspond to certain camera exposure values different from the original. The virtual multi-exposure illuminations are blended into an enhanced illumination, where we also propose a method to generate appropriate weight maps for the tone fusion. Finally, an enhanced image is obtained by multiplying the equalized illumination and enhanced reflectance. Experiments show that the proposed algorithm produces visually pleasing output and also yields comparable objective results to the conventional enhancement methods, while requiring modest computational loads.
Published: 2017

35. Co-salient Object Detection Based on Deep Saliency Networks and Seed Propagation over an Integrated Graph

Author: Jeong, Dong-ju, Hwang, Insung, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper presents a co-salient object detection method to find common salient regions in a set of images. We utilize deep saliency networks to transfer co-saliency prior knowledge and better capture high-level semantic information, and the resulting initial co-saliency maps are enhanced by seed propagation steps over an integrated graph. The deep saliency networks are trained in a supervised manner to avoid online weakly supervised learning, and we exploit them not only to extract high-level features but also to produce both intra- and inter-image saliency maps. Through a refinement step, the initial co-saliency maps can uniformly highlight co-salient regions and locate accurate object boundaries. To handle input image groups inconsistent in size, we propose to pool multi-regional descriptors including both within-segment and within-group information. In addition, the integrated multilayer graph is constructed to find the regions that the previous steps may not detect by seed propagation with low-level descriptors. In this work, we utilize the useful complementary components of high- and low-level information, and several learning-based steps. Our experiments have demonstrated that the proposed approach outperforms comparable co-saliency detection methods on widely used public databases and can also be directly applied to co-segmentation tasks., Comment: 13 pages, 10 figures, 3 tables
Published: 2017
Full Text: View/download PDF

36. Self-Committee Approach for Image Restoration Problems using Convolutional Neural Network

Author: Ahn, Byeongyong and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: There have been many discriminative learning methods using convolutional neural networks (CNN) for several image restoration problems, which learn the mapping function from a degraded input to the clean output. In this letter, we propose a self-committee method that can find enhanced restoration results from multiple trials of a trained CNN with different but related inputs. Specifically, it is noted that the CNN sometimes finds different mapping functions when the input is transformed by a reversible transform and thus produces different but related outputs with the original. Hence averaging the outputs for several different transformed inputs can enhance the results as evidenced by the network committee methods. Unlike the conventional committee approaches that require several networks, the proposed method needs only a single network. Experimental results show that adding an additional transform as a committee always brings additional gain on image denoising and single image super-resolution problems., Comment: 4 pages, 5 figures
Published: 2017

37. Block-Matching Convolutional Neural Network for Image Denoising

Author: Ahn, Byeongyong and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: There are two main streams in up-to-date image denoising algorithms: non-local self similarity (NSS) prior based methods and convolutional neural network (CNN) based methods. The NSS based methods are favorable on images with regular and repetitive patterns while the CNN based methods perform better on irregular structures. In this paper, we propose a block-matching convolutional neural network (BMCNN) method that combines NSS prior and CNN. Initially, similar local patches in the input image are integrated into a 3D block. In order to prevent the noise from messing up the block matching, we first apply an existing denoising algorithm on the noisy image. The denoised image is employed as a pilot signal for the block matching, and then denoising function for the block is learned by a CNN structure. Experimental results show that the proposed BMCNN algorithm achieves state-of-the-art performance. In detail, BMCNN can restore both repetitive and irregular structures., Comment: 11 pages, 9 figures
Published: 2017

38. A New Convolutional Network-in-Network Structure and Its Applications in Skin Detection, Semantic Segmentation, and Artifact Reduction

Author: Kim, Yoonsik, Hwang, Insung, and Cho, Nam Ik
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The inception network has been shown to provide good performance on image classification problems, but there is not much evidence that it is also effective for the image restoration or pixel-wise labeling problems. For image restoration problems, the pooling is generally not used because the decimated features are not helpful for the reconstruction of an image as the output. Moreover, most deep learning architectures for the restoration problems do not use dense prediction, which needs lots of training parameters. From these observations, to exploit the performance of the inception-like structure on image-based problems, we propose a new convolutional network-in-network structure. The proposed network can be considered a modification of the inception structure where pool projection and pooling layer are removed for maintaining the entire feature map size, and a larger kernel filter is added instead. The proposed network greatly reduces the number of parameters on account of removed dense prediction and pooling, which is an advantage, but may also reduce the receptive field in each layer. Hence, we add a larger kernel than the original inception structure for not increasing the depth of layers. The proposed structure is applied to typical image-to-image learning problems, i.e., the problems where the size of input and output are the same, such as skin detection, semantic segmentation, and compression artifacts reduction. Extensive experiments show that the proposed network brings comparable or better results than the state-of-the-art convolutional neural networks for these problems., Comment: 10 pages
Published: 2017

39. Deep-learning and graph-based approach to table structure recognition

Author: Lee, Eunji, Park, Jaewoo, Koo, Hyung Il, and Cho, Nam Ik
Published: 2022
Full Text: View/download PDF

40. Learning Background Subtraction by Video Synthesis and Multi-scale Recurrent Networks

Author: Choo, Sungkwon, Seo, Wonkyo, Jeong, Dong-ju, Cho, Nam Ik, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Pandu Rangan, C., Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Jawahar, C.V., editor, Li, Hongdong, editor, Mori, Greg, editor, and Schindler, Konrad, editor
Published: 2019
Full Text: View/download PDF

41. Deep Clustering and Block Hashing Network for Face Image Retrieval

Author: Jang, Young Kyun, Jeong, Dong-ju, Lee, Seok Hee, Cho, Nam Ik, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Pandu Rangan, C., Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Jawahar, C.V., editor, Li, Hongdong, editor, Mori, Greg, editor, and Schindler, Konrad, editor
Published: 2019
Full Text: View/download PDF

42. Recent Issues in Real-World Image Restoration Using Deep Convolutional Neural Networks

Author: Cho, Nam Ik, speaker
Published: 2021
Full Text: View/download PDF

43. Handwritten Text Segmentation via End-to-End Learning of Convolutional Neural Networks

Author: Jo, Junho, Koo, Hyung Il, Soh, Jae Woong, and Cho, Nam Ik
Published: 2020
Full Text: View/download PDF

44. OCVOS: Object-Centric Representation for Video Object Segmentation

Author: Jo, Junho, primary, Wee, Dongyoon, additional, and Cho, Nam Ik, additional
Published: 2023
Full Text: View/download PDF

45. Generation of high dynamic range illumination from a single image for the enhancement of undesirably illuminated images

Author: Park, Jae Sung, Soh, Jae Woong, and Cho, Nam Ik
Published: 2019
Full Text: View/download PDF

46. Perception-Oriented Single Image Super-Resolution using Optimal Objective Estimation

Author: Park, Seung Ho, primary, Moon, Young Su, additional, and Cho, Nam Ik, additional
Published: 2023
Full Text: View/download PDF

47. Learning Background Subtraction by Video Synthesis and Multi-scale Recurrent Networks

Author: Choo, Sungkwon, primary, Seo, Wonkyo, additional, Jeong, Dong-ju, additional, and Cho, Nam Ik, additional
Published: 2019
Full Text: View/download PDF

48. Deep Clustering and Block Hashing Network for Face Image Retrieval

Author: Jang, Young Kyun, primary, Jeong, Dong-ju, additional, Lee, Seok Hee, additional, and Cho, Nam Ik, additional
Published: 2019
Full Text: View/download PDF

49. Frequency-Domain Multi-Exposure HDR Imaging Network With Representative Image Features

Author: Lee, Keuntek, primary, Park, Jaehyun, additional, Jang, Yeong Il, additional, and Cho, Nam Ik, additional
Published: 2023
Full Text: View/download PDF

50. A Dynamic Residual Self-Attention Network for Lightweight Single Image Super-Resolution

Author: Park, Karam, primary, Soh, Jae Woong, additional, and Cho, Nam Ik, additional
Published: 2023
Full Text: View/download PDF
