99 results for "Chee Sun Won"
Search Results
2. Filter pruning by image channel reduction in pre-trained convolutional neural networks
- Author
-
Chee Sun Won and Gi Su Chung
- Subjects
Brightness, Channel (digital image), Contextual image classification, Computer Networks and Communications, Computer science, Image processing and computer vision, Software engineering, Pattern recognition, Filter (signal processing), Convolutional neural network, Reduction (complexity), Hardware and Architecture, Media Technology, RGB color model, Artificial intelligence, Software, Pruning (morphology) - Abstract
There are domain-specific image classification problems, such as facial emotion and house-number classification, where the color information in the images may not be crucial for recognition. This motivates us to convert RGB images to gray-scale images with a single Y channel before feeding them into a pre-trained convolutional neural network (CNN). Since the existing CNN models are pre-trained on three-channel color images, one can expect that some trained filters are more sensitive to color than to brightness. Therefore, by adopting single-channel gray-scale images as inputs, we can prune some of the convolutional filters in the first layer of the pre-trained CNN. This first-layer pruning greatly facilitates filter compression in the subsequent convolutional layers. The pre-trained CNN with the compressed filters is then fine-tuned with single-channel images for a domain-specific dataset. Experimental results on the facial emotion and Street View House Numbers (SVHN) datasets show that the proposed method achieves a significant compression of the pre-trained CNN filters. For example, compared with the VGG-16 model fine-tuned on color images, we save 10.538 GFLOPs of computation while keeping the classification accuracy around 84% on the facial emotion RAF-DB dataset.
- Published
- 2020
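The first-layer pruning idea above can be sketched in a few lines. The scoring rule below (mean deviation of a filter's three channel slices from their channel average) is an illustrative stand-in, not the paper's actual pruning criterion:

```python
import numpy as np

def rgb_to_y(img):
    """Collapse an RGB image (H, W, 3) to a single Y channel (BT.601 luma weights)."""
    return img[..., 0] * 0.299 + img[..., 1] * 0.587 + img[..., 2] * 0.114

def color_sensitivity(filters):
    """Score first-layer conv filters (N, 3, k, k) by how much their three
    input-channel slices deviate from the channel average; a high score
    suggests a color-sensitive filter, i.e. a pruning candidate once the
    network is fed single-channel inputs."""
    mean = filters.mean(axis=1, keepdims=True)          # per-filter channel average
    return np.abs(filters - mean).mean(axis=(1, 2, 3))  # mean channel deviation

# A brightness-like filter (identical channels) scores 0; a color-opponent
# filter (R minus G) scores high.
f = np.stack([np.ones((3, 3, 3)),
              np.stack([np.ones((3, 3)), -np.ones((3, 3)), np.zeros((3, 3))])])
scores = color_sensitivity(f)
```

Filters with scores above a chosen threshold would be dropped before fine-tuning on the Y-channel inputs.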
3. Action Recognition in Videos Using Pre-Trained 2D Convolutional Neural Networks
- Author
-
Jun-Hwa Kim and Chee Sun Won
- Subjects
General Computer Science, Computer science, Convolutional neural network (CNN), Image processing and computer vision, Optical flow, Grayscale, Image (mathematics), Video analysis, Action recognition, General Materials Science, Frame (networking), General Engineering, Networking and telecommunications, Pattern recognition, Two-stream convolutional neural networks, RGB color model, Artificial intelligence - Abstract
A pre-trained 2D CNN (Convolutional Neural Network) can be used for the spatial stream in the two-stream CNN structure for videos, treating a representative frame selected from the video as the input. However, the CNN for the temporal stream in the two-stream CNN needs training from scratch using the optical flow frames, which demands expensive computations. In this paper, we propose to adopt a pre-trained 2D CNN for the temporal stream as well, to avoid the optical flow computations. Specifically, three RGB frames selected at three different times in the video sequence are converted into grayscale images and assigned to the R (red), G (green), and B (blue) channels, respectively, to form a Stacked Grayscale 3-channel Image (SG3I). Then, the pre-trained 2D CNN is fine-tuned on SG3Is for the temporal stream. Therefore, only pre-trained 2D CNNs are used for both the spatial and temporal streams. To learn long-range temporal motions in videos, we can use multiple SG3Is by partitioning the video shot into sub-shots and generating a single SG3I for each sub-shot. Experimental results show that our two-stream CNN with the proposed SG3Is is about 14.6 times faster than the first version of the two-stream CNN with optical flow, yet achieves a similar recognition accuracy on UCF-101 and a 5.7% better result on HMDB-51.
- Published
- 2020
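The SG3I construction described above is straightforward to sketch. The grayscale conversion here is a plain channel average, which may differ from the paper's exact conversion:

```python
import numpy as np

def make_sg3i(frames):
    """Build a Stacked Grayscale 3-channel Image from three RGB frames
    (each H x W x 3) sampled at different times: each frame is converted
    to grayscale and assigned to one of the three output channels."""
    assert len(frames) == 3
    grays = [f.mean(axis=-1) for f in frames]  # plain channel average as grayscale
    return np.stack(grays, axis=-1)            # (H, W, 3): one time step per channel

# Three toy frames of increasing brightness stand in for frames sampled
# from the start, middle, and end of a video shot.
frames = [np.full((4, 4, 3), v) for v in (0.2, 0.5, 0.8)]
sg3i = make_sg3i(frames)
```

The resulting three-channel image has the same shape as an RGB frame, so it can be fed to any pre-trained 2D CNN unchanged.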
4. Multi-Scale CNN for Fine-Grained Image Recognition
- Author
-
Chee Sun Won
- Subjects
General Computer Science, Contextual image classification, Fine-grained image classification, Computer science, Convolutional neural network (CNN), General Engineering, Process (computing), Scale (descriptive set theory), Object (computer science), Separable space, Image (mathematics), Line (geometry), Image resizing, General Materials Science, Computer vision, Food recognition, Artificial intelligence - Abstract
Most conventional fine-grained image recognition methods are based on a two-stream model of object-level and part-level CNNs, where the part-level CNN is responsible for learning the object-parts and their spatial relationships. To train the part-level CNN, we first need to separate parts from an object. However, there exist sub-level objects with no distinctive and separable parts. In this paper, a multi-scale CNN with a baseline Object-level CNN and multiple Part-level CNNs is proposed for fine-grained image recognition with no separable object-parts. The basic idea in training the different CNNs of the multi-scale model is to adopt different scales when resizing the training images. That is, the training images are resized such that the entire object appears as fully as possible for the Object-level CNN, while only a local part of the object is included for each Part-level CNN. This scale-specific resizing approach requires a scale-controllable parameter in the image resizing process. In this paper, such a parameter is introduced for the linear-scaling and random-cropping method. Also, a line-based image resizing method with a scale-controllable parameter is employed for the Part-level CNNs. The proposed multi-scale CNN is applied to food image classification, a fine-grained classification problem with no separable object-parts. Experimental results on public food image datasets show that the classification accuracy improves substantially when the predicted scores of the multi-scale CNN are fused together. This reveals that the object-level and part-level CNNs work harmoniously in differentiating subtle differences among the sub-level objects.
- Published
- 2020
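A minimal sketch of a scale-controllable random crop in the spirit described above. The parameterization (crop side as a fraction `scale` of the image's short side) and the nearest-neighbor resize are assumptions, not the paper's exact method:

```python
import numpy as np

def scale_crop(img, scale, out_size, rng=None):
    """Crop a square region whose side is `scale` times the image's short
    side (scale = 1.0 keeps the whole object for the Object-level CNN;
    smaller scales isolate a local part for the Part-level CNNs), then
    resize it to out_size x out_size with nearest-neighbor sampling."""
    rng = rng or np.random.default_rng(0)
    h, w = img.shape[:2]
    side = max(1, int(round(scale * min(h, w))))
    top = int(rng.integers(0, h - side + 1))
    left = int(rng.integers(0, w - side + 1))
    crop = img[top:top + side, left:left + side]
    idx = (np.arange(out_size) * side / out_size).astype(int)  # nearest-neighbor indices
    return crop[np.ix_(idx, idx)]

# A small crop (scale = 0.3) resized to the common CNN input grid.
patch = scale_crop(np.arange(100).reshape(10, 10), scale=0.3, out_size=4)
```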
5. Facial Action Units for Training Convolutional Neural Networks
- Author
-
Trinh Thi Doan Pham and Chee Sun Won
- Subjects
Data oversampling, General Computer Science, Computer science, Emotion classification, Convolutional neural network, Image (mathematics), Selection (linguistics), General Materials Science, Facial emotion recognition, Data imbalance, Training set, General Engineering, Pattern recognition, Support vector machine, Artificial intelligence, Facial action units, Focus (optics) - Abstract
This paper deals with the problem of training convolutional neural networks (CNNs) with facial action units (AUs). In particular, we focus on the imbalance problem in training datasets for facial emotion classification. Since training a CNN with an imbalanced dataset tends to yield a learning bias toward the major classes and eventually deteriorates the classification accuracy, the number of training images for the minority classes needs to be increased so that training images are evenly distributed over all classes. However, it is difficult to find images with a similar facial emotion for the oversampling. In this paper, we propose to use AU features to retrieve images with a similar emotion. The query selection from the minority classes and the AU-based retrieval process are repeated until the number of training images is balanced over all classes. Also, to improve the classification accuracy, the AU features are fused with the CNN features to train a support vector machine (SVM) for final classification. The experiments have been conducted on three imbalanced facial image datasets: RAF-DB, FER2013, and ExpW. The results demonstrate that the CNNs trained with the AU features improve the classification accuracy by 3-4%.
- Published
- 2019
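The oversampling loop can be sketched as below, with a nearest-neighbour search over AU feature vectors standing in for the paper's AU-based image retrieval:

```python
import numpy as np

def au_oversample(features, labels):
    """Balance the dataset by oversampling minority classes: repeatedly take a
    minority-class sample as the query and add its nearest neighbour in
    AU-feature space (standing in for retrieving a similar-emotion image),
    until every class matches the majority-class count. Returns indices."""
    labels = np.asarray(labels)
    counts = {c: int((labels == c).sum()) for c in np.unique(labels)}
    target = max(counts.values())
    selected = list(range(len(labels)))
    for c, n in counts.items():
        pool = np.flatnonzero(labels == c)
        for i in range(target - n):
            q = features[pool[i % len(pool)]]          # query from the minority class
            d = np.linalg.norm(features[pool] - q, axis=1)
            d[np.argmin(d)] = np.inf                   # exclude the query itself
            selected.append(int(pool[np.argmin(d)]))
    return selected
```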
6. Emotion Enhancement for Facial Images Using GAN
- Author
-
Chee Sun Won and Jun-Hwa Kim
- Subjects
Noise measurement, Computer science, Emotion classification, Speech recognition, Facial recognition system, Convolutional neural network, Reliability (statistics) - Abstract
Labeled images play an important role in training convolutional neural networks (CNNs). In particular, when training CNNs for facial emotion classification, the publicly available datasets suffer from noisy labels and an inter-class imbalance problem. In this paper, we adopt a Generative Adversarial Network (GAN) to alleviate both problems. Specifically, the noisily labeled images are identified by cross-checking the classification results of two fine-tuned CNNs, and their facial emotions are strengthened by a GAN. Also, some of the neutral-emotion images are transformed into minority emotion classes to solve the imbalance problem.
- Published
- 2020
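One plausible reading of the cross-checking step, sketched with two hypothetical classifier outputs: a label is flagged as noisy when both fine-tuned CNNs agree with each other but disagree with the dataset label:

```python
def find_noisy_labels(preds_a, preds_b, given):
    """Flag sample indices whose dataset label disagrees with the consistent
    prediction of two independently fine-tuned classifiers."""
    return [i for i, (a, b, g) in enumerate(zip(preds_a, preds_b, given))
            if a == b and a != g]

# Sample 2's dataset label ('angry') disagrees with both classifiers.
noisy = find_noisy_labels(['happy', 'sad', 'happy'],
                          ['happy', 'sad', 'happy'],
                          ['happy', 'sad', 'angry'])
```

Samples where the two classifiers disagree with each other are left unflagged, since there is no confident alternative label.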
7. Constrained Optimization for Image Reshaping With Soft Conditions
- Author
-
Chee Sun Won
- Subjects
Mathematical optimization, General Computer Science, Linear programming, Computer science, Machine vision, Image processing and computer vision, Convolutional neural network (CNN), Image (mathematics), General Materials Science, Constrained optimization, Image resolution, General Engineering, Function (mathematics), Image processing, Range (mathematics), Computer vision and pattern recognition - Abstract
Conventional image resizing problems demand hard conditions on size and aspect ratio, which must be met with no tolerance. In this paper, a generalized optimization framework is presented, which can handle soft conditions as well as hard ones. A soft condition can be given by an allowable range of an image parameter, which is incorporated as an inequality condition in the constrained optimization framework. Given the soft constraints, the proposed framework seeks the set of image parameters that minimizes the cost function. A constrained optimization via a linear programming framework is employed to manage a diverse combination of soft and hard conditions for the target image. The optimization operates on image lines, optimally selecting a set of lines (columns and rows) to be deleted for size reduction in accordance with the cost function and the constraints. As a case study, the line-based optimal image resizing method is applied to the pre-processing for the VGG-19 convolutional neural network (CNN). Although the target input size of 224×224 is a hard condition for the VGG-19 CNN, the proposed optimization framework with a soft condition on the image size first finds an optimal near-square image with a tradeoff against the saliency level of image features. Then, the optimal near-square image is linearly scaled to the final image size to meet the hard condition.
- Published
- 2018
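A toy version of the soft-constrained line-deletion problem. Exhaustive search over deletion counts stands in for the paper's linear program, and the near-square soft condition is expressed as an aspect-ratio range:

```python
import numpy as np

def reshape_lines(col_cost, row_cost, ratio_range):
    """Pick how many of the cheapest columns and rows to delete so that the
    resulting aspect ratio falls inside ratio_range (the soft condition),
    minimizing the total saliency cost of the deleted lines."""
    w, h = len(col_cost), len(row_cost)
    cs = np.cumsum(np.sort(col_cost))      # cheapest-first cumulative deletion costs
    rs = np.cumsum(np.sort(row_cost))
    best = None
    for dc in range(w):                    # always keep at least one column/row
        for dr in range(h):
            ratio = (w - dc) / (h - dr)
            if ratio_range[0] <= ratio <= ratio_range[1]:
                cost = (cs[dc - 1] if dc else 0) + (rs[dr - 1] if dr else 0)
                if best is None or cost < best[0]:
                    best = (cost, dc, dr)
    return best  # (total saliency removed, columns deleted, rows deleted)
```

With uniform line costs and a near-square ratio range, the cheapest solution is to delete just enough of the longer dimension.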
8. Near-reversible efficient image resizing for devices supporting different spatial resolutions
- Author
-
Chee Sun Won and Seung-Won Jung
- Subjects
Pixel, Computer science, Image processing and computer vision, Software engineering, Theoretical Computer Science, Image (mathematics), Hardware and Architecture, Line (geometry), Maximum a posteriori estimation, Computer vision, Artificial intelligence, Resizing, Cropping, Software, Information Systems - Abstract
Many devices in cloud environments support different spatial resolutions, necessitating image resizing of the original image contents. The goal of this paper is to combine multiple operators for image resizing in a stochastic optimization framework, seeking an optimal balance among essential resizing operators such as cropping and linear scaling. Specifically, we formulate image resizing as a MAP (maximum a posteriori) optimization problem with a Gibbs energy function. To reduce computational complexity, we seek a sub-optimal solution of the MAP criterion with a deterministic implementation of the Metropolis algorithm. Since the optimization is carried out on the basis of straight horizontal or vertical lines in the image instead of curved seams of pixels, it converges quickly, yielding fast image resizing. In addition, our image resizing can be associated with various user-defined content filtering such as color masking. Finally, our resizing method is near-reversible, meaning that the image at the original size can be reconstructed from the retargeted image. This allows us to apply the proposed image resizing method to prioritized image transmission with a scalable and progressive structure.
- Published
- 2016
9. Order-Preserving Condensation of Moving Objects in Surveillance Videos
- Author
-
Chee Sun Won, Seung-Won Jung, and Hai Thanh Nguyen
- Subjects
Pixel, Computational complexity theory, Computer science, Mechanical Engineering, Condensation, Video sequence, Computer Science Applications, Order (business), Automotive Engineering, Computer vision, Condensation algorithm, Artificial intelligence, Short duration, Intelligent transportation system - Abstract
Vision-based detection of illegal or accidental activities in urban traffic has attracted great interest. Since state-of-the-art online automated detection algorithms are far from perfect, much research effort on offline video surveillance has been devoted to sparing police or security staff from unnecessarily reviewing all recorded video frames. To this end, this study focuses on video condensation, which provides fast monitoring of moving objects in long surveillance videos. Considering computational complexity and the condensation ratio as the two main criteria for efficient video condensation, we propose a video condensation algorithm that consists of the following: 1) initial condensation by discarding frames with no moving objects; 2) intra-GoFM (group of frames with moving objects) condensation; and 3) inter-GoFM condensation. In the intra-GoFM and inter-GoFM condensation, spatiotemporal static pixels within each GoFM and temporal static pixels between two consecutive GoFMs are dropped to shorten the temporal distances between consecutive moving objects. Experimental results show that our video condensation saves a significant amount of computational load compared with previous methods without sacrificing the condensation ratio or visual quality.
- Published
- 2016
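The initial condensation stage (step 1 above) can be sketched as a simple motion-thresholded frame filter. The mean-absolute-difference test is an assumption, not the paper's detector:

```python
import numpy as np

def initial_condense(frames, thresh):
    """Keep a frame only if its mean absolute difference from the previously
    kept frame exceeds `thresh`; frames with no moving content are discarded."""
    kept = [0]
    for i in range(1, len(frames)):
        if np.abs(frames[i] - frames[kept[-1]]).mean() > thresh:
            kept.append(i)
    return kept
```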
10. Depth completion for kinect v2 sensor
- Author
-
Anh Vu Le, Seokmin Yun, Chee Sun Won, Seung-Won Jung, and Wanbin Song
- Subjects
Computer Networks and Communications, Computer science, Image processing and computer vision, Object (computer science), Image (mathematics), Hardware and Architecture, Position (vector), Media Technology, Computer vision, Artificial intelligence, Software, Computer graphics - Abstract
Kinect v2 adopts a time-of-flight (ToF) depth sensing mechanism, which causes different types of depth artifacts compared to the original Kinect v1. The goal of this paper is to propose a depth completion method designed specifically for the Kinect v2 depth artifacts. Observing the specific types of depth errors in the Kinect v2, such as thin hole-lines along object boundaries and a new type of hole in the image corners, we exploit the position information of color edges extracted from the Kinect v2 sensor to guide accurate hole-filling around object boundaries. Since our approach requires a precise registration between the color and depth images, we also introduce a transformation matrix that yields point-to-point correspondence with pixel accuracy. Experimental results demonstrate the effectiveness of the proposed depth image completion algorithm for the Kinect v2 in terms of completion accuracy and execution time.
- Published
- 2016
11. A new depth image quality metric using a pair of color and depth images
- Author
-
Thanh Ha Le, Chee Sun Won, and Seung-Won Jung
- Subjects
Ground truth, Pixel, Computer Networks and Communications, Color image, Computer science, Image quality, Image processing and computer vision, Stereoscopy, Image (mathematics), Hardware and Architecture, Distortion, Metric (mathematics), Media Technology, Computer vision, Artificial intelligence, Software - Abstract
Typical depth quality metrics require the ground truth depth image or stereoscopic color image pair, which are not always available in many practical applications. In this paper, we propose a new depth image quality metric which demands only a single pair of color and depth images. Our observations reveal that the depth distortion is strongly related to the local image characteristics, which in turn leads us to formulate a new distortion assessment method for the edge and non-edge pixels in the depth image. The local depth distortion is adaptively weighted using the Gabor filtered color image and added up to the global depth image quality metric. The experimental results show that the proposed metric closely approximates the depth quality metrics that use the ground truth depth or stereo color image pair.
- Published
- 2016
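A loose sketch of the underlying idea, using a plain horizontal gradient in place of the Gabor filtering: depth discontinuities that lack a supporting color edge are penalized:

```python
import numpy as np

def depth_quality(depth, color):
    """Mean mismatch score: horizontal depth jumps are down-weighted where the
    color image has a jump of its own (a crude stand-in for Gabor weighting);
    lower values mean the depth edges are better supported by the color image."""
    dd = np.abs(np.diff(depth, axis=1))        # depth discontinuities
    dc = np.abs(np.diff(color, axis=1))        # color discontinuities
    return (dd / (1.0 + dc)).mean()
```

A depth edge aligned with a color edge scores lower (better) than the same depth edge over a flat color region.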
12. Key-point based stereo matching and its application to interpolations
- Author
-
Anh Vu Le and Chee Sun Won
- Subjects
Image processing and computer vision, Stereoscopy, Artificial Intelligence, Image scaling, Overhead (computing), Computer vision, Mathematics, Pixel, Applied Mathematics, Stairstep interpolation, Computer Science Applications, Transformation (function), Hardware and Architecture, Signal Processing, Key (cryptography), Software, Information Systems, Interpolation - Abstract
In this paper, we propose a novel interpolation method to expand decimated stereoscopic 3D (S3D) video to its original size. The basic approach of our interpolation is to exploit key-point correspondences between the stereoscopic left and right images. Since the rectified left and right frames of S3D videos are aligned, a simple key-point detection method can be employed without requiring scale and transformation invariance. After detecting matched key-point pairs between the left and right images, we can interpolate the decimated pixels by exploiting the corresponding key-points in the opposite view as well as their neighboring pixels in the current view. The merit of our method is that no side-information overhead is required for the interpolation. Nevertheless, the proposed method yields similar or even better PSNR performance than the previous side-information method.
- Published
- 2015
13. A Survey of Human Action Recognition Approaches that use an RGB-D Sensor
- Author
-
Adnan Farooq and Chee Sun Won
- Subjects
Sketch recognition, Computer science, Signal Processing, Pattern recognition (psychology), RGB color model, Action recognition, Computer vision, Artificial intelligence, Electrical and Electronic Engineering, Focus (optics) - Abstract
Human action recognition from a video scene has remained a challenging problem in the area of computer vision and pattern recognition. The development of the low-cost RGB depth camera (RGB-D) allows new opportunities to solve the problem of human action recognition. In this paper, we present a comprehensive review of recent approaches to human action recognition based on depth maps, skeleton joints, and other hybrid approaches. In particular, we focus on the advantages and limitations of the existing approaches and on future directions.
- Published
- 2015
14. Accurate vertical road profile estimation using v-disparity map and dynamic programming
- Author
-
Chee Sun Won, Ji-Yeol Park, Sesong Kim, and Seung-Won Jung
- Subjects
Computer science, Advanced driver assistance systems, Function (mathematics), Dynamic programming, Road surface, Obstacle, Parametric model, Computer vision, Artificial intelligence, Energy (signal processing) - Abstract
Detecting obstacles on the road is crucial for advanced driver assistance systems. Obstacle detection on the road can be greatly facilitated if we have a vertical road profile. Therefore, in this paper, we present a novel method that can estimate an accurate vertical road profile of the scene from stereo images. Unlike conventional stereo-based road profile estimation methods that heavily rely on a parametric model of the road surface, our method can obtain a road profile for an arbitrarily complicated road. To this end, an energy function that includes the stereo matching fidelity and the spatio-temporal smoothness of the road profile is presented, and the road profile is extracted by maximizing this energy function via dynamic programming. The experimental results demonstrate the effectiveness of the proposed method.
- Published
- 2017
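The dynamic-programming extraction over the v-disparity map can be sketched as below, with a simplified energy (histogram support minus a linear disparity-jump penalty) standing in for the paper's formulation:

```python
import numpy as np

def road_profile(v_disparity):
    """Pick one disparity per image row of the v-disparity histogram,
    maximizing accumulated histogram support while penalizing disparity
    jumps between neighbouring rows, via dynamic programming."""
    rows, disps = v_disparity.shape
    smooth = 1.0                                   # disparity-jump penalty weight
    score = v_disparity[0].astype(float)
    back = np.zeros((rows, disps), dtype=int)
    for v in range(1, rows):
        # trans[d, d_prev] = previous score minus the jump penalty |d - d_prev|
        d_idx = np.arange(disps)
        trans = score[None, :] - smooth * np.abs(d_idx[:, None] - d_idx[None, :])
        back[v] = np.argmax(trans, axis=1)
        score = v_disparity[v] + trans[d_idx, back[v]]
    profile = np.zeros(rows, dtype=int)            # backtrack the best path
    profile[-1] = int(np.argmax(score))
    for v in range(rows - 1, 0, -1):
        profile[v - 1] = back[v, profile[v]]
    return profile
```

On a v-disparity map with a strong diagonal ridge (the typical signature of a flat road), the recovered profile follows the ridge.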
15. Directional Joint Bilateral Filter for Depth Images
- Author
-
Anh Vu Le, Seung-Won Jung, and Chee Sun Won
- Subjects
Joint trilateral filter, Computer science, Joint bilateral filter, Bilateral filter, Rendering (computer graphics), Depth map, Image filtering, Kinect, Computer vision, Electrical and Electronic Engineering, Instrumentation, Color image, Visual object recognition, Artificial intelligence - Abstract
Depth maps taken by the low-cost Kinect sensor are often noisy and incomplete. Thus, post-processing for obtaining reliable depth maps is necessary for advanced image and video applications such as object recognition and multi-view rendering. In this paper, we propose adaptive directional filters that fill the holes and suppress the noise in depth maps. Specifically, novel filters whose window shapes are adaptively adjusted based on the edge direction of the color image are presented. Experimental results show that our method yields higher-quality filtered depth maps than other existing methods, especially at edge boundaries.
- Published
- 2014
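For reference, a plain (non-directional) joint bilateral filter over a depth map with a color guide. The paper's contribution of shaping the window along the color-edge direction is omitted in this baseline sketch:

```python
import numpy as np

def joint_bilateral(depth, color, radius=2, sigma_s=2.0, sigma_r=10.0):
    """Replace each depth pixel by a weighted average of its neighbours,
    with weights from spatial distance and *color* similarity in the guide
    image, so depth edges follow color edges."""
    h, w = depth.shape
    out = np.zeros((h, w), dtype=float)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(ys**2 + xs**2) / (2 * sigma_s**2))
    pad_d = np.pad(depth.astype(float), radius, mode='edge')
    pad_c = np.pad(color.astype(float), radius, mode='edge')
    for y in range(h):
        for x in range(w):
            nd = pad_d[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            nc = pad_c[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            wgt = spatial * np.exp(-(nc - color[y, x])**2 / (2 * sigma_r**2))
            out[y, x] = (wgt * nd).sum() / wgt.sum()
    return out
```

The directional variant would replace the fixed square window with one elongated along the local color-edge orientation.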
16. Stereo Video Retargeting with Representative Seams in a Group of Stereoscopic Frames
- Author
-
Hai Thanh Nguyen and Chee Sun Won
- Subjects
General Computer Science, Frame (networking), Boundary (topology), Stereoscopy, Electronic, Optical and Magnetic Materials, Seam carving, Distortion, Video tracking, Retargeting, Computer vision, Artificial intelligence, Electrical and Electronic Engineering, Coherence (physics), Mathematics - Abstract
The important requirements for stereo video retargeting are threefold: keeping temporal coherence, preventing depth distortion, and minimizing shape distortions of the retargeted video. To meet these requirements, the left and right video sequences are divided into groups of frames (GoFs), where each GoF is the basic unit for seam carving and a set of fixed seams is assigned to all frames within the GoF. To determine the fixed seams for each GoF, we first need to find the GoF boundaries in the video. Then, the representative frame for each GoF is generated by considering spatial saliency and temporal coherence. Also, the confidence of the stereoscopic correspondence between the left and right frames is considered to prevent depth distortion.
- Published
- 2013
17. Moving object detection with Kinect v2
- Author
-
Sungmin Lee, Hai Thanh Nguyen, Chee Sun Won, and Trinh Thi Doan Pham
- Subjects
Computer science, Computer graphics (images), Computer vision, Artificial intelligence, Object (computer science), Object detection - Abstract
Kinect v2 provides depth images with reduced noise and fewer holes, which simplifies the pre-processing of eliminating noise and filling holes. This also allows us to detect moving objects using the depth information of Kinect v2 alone, without resorting to the RGB image. Based on this, a fast moving-object detection method for Kinect v2 is presented in this paper. Our method runs very quickly while extracting moving objects accurately.
- Published
- 2016
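A minimal depth-only moving-object mask in the spirit of the paper, using background subtraction on the depth map with a hole-aware validity check. The static background model and the threshold are assumptions:

```python
import numpy as np

def detect_moving(depth, background, thresh=50):
    """Mark a pixel as moving when its depth departs from a static background
    model by more than `thresh` depth units; Kinect reports 0 for holes, so
    those pixels are excluded from the comparison."""
    valid = (depth > 0) & (background > 0)
    return valid & (np.abs(depth.astype(int) - background.astype(int)) > thresh)
```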
18. Text-aware image dehazing using stroke width transform
- Author
-
Chee Sun Won, Sungmin Lee, Jinwon Park, Seung-Won Jung, and Kyumok Kim
- Subjects
Channel (digital image), Computer science, Computer vision, Artificial intelligence, Stroke width, Image (mathematics) - Abstract
Haze removal, also referred to as image dehazing, has been extensively used to improve the visibility of images captured under inclement weather. In particular, dark channel prior (DCP)-based single image dehazing has received the greatest amount of interest due to its superior performance. However, since the DCP is based on the characteristics of natural outdoor images, its reliability tends to decrease, especially when an image contains man-made textures. In this paper, we present a DCP-based single image dehazing method that is robust when text or text-like patterns are present in the image. The proposed method first estimates the text likelihood from a hazy image using the stroke width transform (SWT) and uses the estimated likelihood to correct the DCP. The experimental results show that the proposed algorithm outperforms conventional DCP-based dehazing methods.
- Published
- 2016
19. A review on dark channel prior based image dehazing algorithms
- Author
-
Ju-Hun Nam, Seok Min Yun, Seung-Won Jung, Chee Sun Won, and Sungmin Lee
- Subjects
Channel (digital image), Computer science, Process (computing), Iterative reconstruction, Inverse problem, Image (mathematics), Transmission (telecommunications), Signal Processing, Pattern recognition (psychology), Computer vision, Artificial intelligence, Electrical and Electronic Engineering, Image restoration, Information Systems - Abstract
The presence of haze in the atmosphere degrades the quality of images captured by visible camera sensors. The removal of haze, called dehazing, is typically performed under the physical degradation model, which necessitates the solution of an ill-posed inverse problem. To relieve the difficulty of the inverse problem, a novel prior called the dark channel prior (DCP) was recently proposed and has received a great deal of attention. The DCP is derived from the characteristic of natural outdoor images that the intensity value of at least one color channel within a local window is close to zero. Based on the DCP, dehazing is accomplished through four major steps: atmospheric light estimation, transmission map estimation, transmission map refinement, and image reconstruction. This four-step dehazing process makes it possible to provide a step-by-step approach to the complex solution of the ill-posed inverse problem. It also enables us to shed light on the systematic contributions of recent research related to the DCP at each step of the dehazing process. Our detailed survey and experimental analysis of DCP-based methods will help readers understand the effectiveness of the individual steps of the dehazing process and will facilitate the development of advanced dehazing algorithms.
- Published
- 2016
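The first two of the four steps named above reduce to a few lines once the atmospheric light A is assumed given. The per-patch minimum uses a naive loop here rather than an efficient erosion:

```python
import numpy as np

def dark_channel(img, patch=3):
    """Per-pixel minimum over the color channels and a patch x patch window."""
    h, w, _ = img.shape
    mins = img.min(axis=2)                 # minimum over color channels
    pad = patch // 2
    padded = np.pad(mins, pad, mode='edge')
    return np.array([[padded[y:y + patch, x:x + patch].min() for x in range(w)]
                     for y in range(h)])

def transmission(img, airlight, omega=0.95, patch=3):
    """Transmission map estimate t = 1 - omega * dark_channel(I / A)."""
    return 1.0 - omega * dark_channel(img / airlight, patch)
```

A haze-free colorful image (some channel near zero everywhere) yields a dark channel near zero and a transmission near one, as the prior predicts; the remaining two steps refine t and invert the degradation model.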
20. Mode selective interpolation for stereoscopic 3D video in frame-compatible top-bottom packing
- Author
-
Chee Sun Won
- Subjects
Offset (computer science), Applied Mathematics, Stereoscopy, Single frame, Horizontal line test, Computer Science Applications, Transition stage, Line segment, Artificial Intelligence, Hardware and Architecture, Signal Processing, Codec, Computer vision, Software, Information Systems, Mathematics - Abstract
As a transition stage from a conventional 2D TV to a full stereoscopic 3D TV system, a frame-compatible format fitting the stereoscopic left and right images into a single frame of the existing 2D TV is required to utilize the existing codec and transmission infrastructure. To meet this requirement, a frame-compatible top-bottom packing with a horizontal line offset is proposed, where the vertical resolutions of the stereoscopic left and right images are reduced by half. Then, the optimal interpolation mode for each line segment of the sub-sampled horizontal line is determined by exploiting parallax-compensated data as well as the undeleted neighboring upper and lower horizontal lines. At the receiver, the discarded horizontal lines for the left and right images are reconstructed using the interpolation modes provided by the sender. Experimental results show that the proposed algorithm improves the PSNR by as much as 1.5 to 3 dB compared to conventional interpolation filters.
- Published
- 2012
21. A new iris segmentation method for non-ideal iris images
- Author
-
Byung Jun Kang, Jaihie Kim, Dong-Kwon Park, Kang Ryoung Park, Jae Won Hwang, Dae Sik Jeong, and Chee Sun Won
- Subjects
urogenital system ,business.industry ,Computer science ,fungi ,Iris recognition ,Word error rate ,urologic and male genital diseases ,female genital diseases and pregnancy complications ,Edge detection ,medicine.anatomical_structure ,Signal Processing ,medicine ,Computer vision ,Segmentation ,cardiovascular diseases ,Computer Vision and Pattern Recognition ,Artificial intelligence ,AdaBoost ,Iris (anatomy) ,Ghosting ,business ,Focus (optics) - Abstract
Many researchers have studied iris recognition techniques in unconstrained environments, where the probability of acquiring non-ideal iris images is very high due to off-angles, noise, blurring, and occlusion by eyelashes, eyelids, glasses, and hair. Although there have been many iris segmentation methods, most focus primarily on accurate detection for iris images captured in a closely controlled environment. This paper proposes a new iris segmentation method that can accurately extract iris regions from non-ideal-quality iris images. This research has the following three novelties compared to previous works: first, the proposed method uses AdaBoost eye detection to compensate for the iris detection error caused by the two circular edge detection operations; second, it uses a color segmentation technique for detecting obstructions caused by the ghosting effects of visible light; and third, if no corneal specular reflection is extracted in the detected pupil and iris regions, the captured iris image is determined to be a "closed eye" image. The proposed method has been tested on the UBIRIS.v2 database via the NICE.I (Noisy Iris Challenge Evaluation - Part I) contest. The results show that the FP (false positive) and FN (false negative) error rates are 1.2% and 27.6%, respectively, according to the NICE.I report (the fifth-highest rank).
- Published
- 2010
22. Minimizing eyestrain on LCD TV based on edge difference and scene change
- Author
-
Kang Ryoung Park, Eui Chul Lee, Chee Sun Won, and Si Mong Lee
- Subjects
Liquid-crystal display ,Pixel ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Accommodation reflex ,law.invention ,Uncompressed video ,law ,Computer graphics (images) ,Media Technology ,medicine ,Computer vision ,Eyestrain ,Enhanced Data Rates for GSM Evolution ,Artificial intelligence ,Electrical and Electronic Engineering ,medicine.symptom ,business ,Image resolution - Abstract
Evaluations of the quality of liquid crystal displays (LCDs) reveal that human factors yield more accurate results than conventional measurements such as image resolution and response speed. In this paper, as a measurement of the human factor, eyestrain on LCD TVs is evaluated and analyzed in terms of video features such as edge difference and scene change. Our study is novel in the following three aspects. First, two measures, the average blinking rate and the average pupil accommodation speed, provide more reliable measurements of eyestrain. Second, video-based features such as edge difference and scene change are used to examine the relation between eyestrain and video on LCD TVs. Third, guidelines for LCD TV manufacturers for minimizing eyestrain are suggested. Experimental results show that more edge differences and scene changes reduce eyestrain. Thus, to boost edge difference and scene change, an amplification method for the high-frequency component of successive images in the uncompressed or compressed domain is presented, in order to manufacture LCD TVs with low eyestrain.
- Published
- 2009
23. A Thumbnail Extraction Algorithm for DMB
- Author
-
Chee Sun Won and Yongkwang Kwon
- Subjects
Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Thumbnail ,Data compression ratio ,Data_CODINGANDINFORMATIONTHEORY ,Video quality ,Video compression picture types ,Image (mathematics) ,Compression (functional analysis) ,Computer graphics (images) ,Computer vision ,Artificial intelligence ,business ,Data compression ,Image compression - Abstract
H.264/AVC for DMB is the most advanced video compression standard, providing a high compression rate and good video quality by adopting new coding technologies. However, these new technologies prevent us from applying some conventional algorithms directly in the H.264/AVC compressed domain. For example, we need to study new image resizing schemes for the compressed bit-stream. In this paper, we therefore propose a new thumbnail extraction method that can be applied to H.264/AVC. The proposed method can extract a 1/16-size thumbnail video while saving 50% to 70% of the operation time.
- Published
- 2007
24. Image Segmentation Using Hidden Markov Gauss Mixture Models
- Author
-
Kyungsuk Pyun, Robert M. Gray, Johan Lim, and Chee Sun Won
- Subjects
Models, Statistical ,Markov chain ,business.industry ,Segmentation-based object categorization ,Normal Distribution ,Scale-space segmentation ,Pattern recognition ,Image segmentation ,Image Enhancement ,Mixture model ,Markov model ,Computer Graphics and Computer-Aided Design ,Markov Chains ,Pattern Recognition, Automated ,Image texture ,Artificial Intelligence ,Computer Science::Computer Vision and Pattern Recognition ,Image Interpretation, Computer-Assisted ,Computer Simulation ,Artificial intelligence ,Hidden Markov model ,business ,Algorithms ,Software ,Mathematics - Abstract
Image segmentation is an important tool in image processing and can serve as an efficient front end to sophisticated algorithms and thereby simplify subsequent processing. We develop a multiclass image segmentation method using hidden Markov Gauss mixture models (HMGMMs) and provide examples of segmentation of aerial images and textures. HMGMMs incorporate supervised learning, fitting the observation probability distribution given each class by a Gauss mixture estimated using vector quantization with a minimum discrimination information (MDI) distortion. We formulate the image segmentation problem using a maximum a posteriori criterion and find the hidden states that maximize the posterior density given the observation. We estimate both the hidden Markov parameters and hidden states using a stochastic expectation-maximization algorithm. Our results demonstrate that HMGMM provides better classification in terms of Bayes risk and spatial homogeneity of the classified objects than do several popular methods, including classification and regression trees, learning vector quantization, causal hidden Markov models (HMMs), and multiresolution HMMs. The computational load of HMGMM is similar to that of the causal HMM.
- Published
- 2007
25. Color and Depth Image Correspondence for Kinect v2
- Author
-
Seokmin Yun, Changhee Kim, Seung-Won Jung, and Chee Sun Won
- Subjects
Transformation matrix ,Computer science ,business.industry ,Measured depth ,RGB color model ,Computer vision ,Artificial intelligence ,business ,Image resolution ,Camera resectioning ,Image (mathematics) - Abstract
Kinect v2, a new version of the Kinect sensor, provides RGB, IR (infrared), and depth images like its predecessor Kinect v1. However, the depth measurement mechanism and the image resolutions of the Kinect v2 are different from those of Kinect v1, which requires a new transformation matrix for the camera calibration of Kinect v2. In this paper, we correct the radial distortion of the RGB camera and find the transformation matrix for the correspondence between the RGB and depth images of the Kinect v2. Experimental results show that our method yields accurate correspondence between the RGB and depth images.
- Published
- 2015
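The RGB-depth correspondence in the entry above amounts to back-projecting a depth pixel into 3D, applying a rigid transform into the RGB camera frame, and re-projecting with the RGB intrinsics. A minimal sketch follows; the intrinsics and transform are placeholder values, not calibrated Kinect v2 parameters, and lens distortion is omitted.

```python
def depth_to_rgb_pixel(u, v, z, depth_K, T, rgb_K):
    """Map a depth-image pixel (u, v) with depth z (meters) to RGB pixel coords.

    depth_K / rgb_K: intrinsics as (fx, fy, cx, cy); T: 3x4 rigid transform
    [R|t] from depth-camera space to RGB-camera space.
    """
    fx, fy, cx, cy = depth_K
    # Back-project the depth pixel to a 3D point in depth-camera space.
    X = (u - cx) * z / fx
    Y = (v - cy) * z / fy
    P = (X, Y, z, 1.0)
    # Rigid transform into RGB-camera space.
    Xc = sum(T[0][i] * P[i] for i in range(4))
    Yc = sum(T[1][i] * P[i] for i in range(4))
    Zc = sum(T[2][i] * P[i] for i in range(4))
    # Project with the RGB intrinsics.
    fx2, fy2, cx2, cy2 = rgb_K
    return (fx2 * Xc / Zc + cx2, fy2 * Yc / Zc + cy2)
```

With an identity transform and identical intrinsics, a depth pixel maps back to itself, which is a convenient sanity check for the calibration chain.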
26. Effective Similarity Measurement for Key-Point Matching in Images
- Author
-
Chee Sun Won, Seung-Won Jung, and Sungmin Lee
- Subjects
Matching (statistics) ,Standard test image ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Scale-invariant feature transform ,Pattern recognition ,Measure (mathematics) ,Image (mathematics) ,Similarity (network science) ,Computer Science::Computer Vision and Pattern Recognition ,Key (cryptography) ,Artificial intelligence ,business ,Image retrieval - Abstract
Different similarity measures between key-point descriptors yield different image-matching performance. In this paper we introduce an effective similarity measurement that considers, for each key-point in a query image, the distance to its matched key-point with the smallest distance in the test image. Therefore, the distances of all key-points in the query image to their corresponding matched key-points in the test image contribute to the final similarity measurement. In contrast, the previous method considers only the distances below a threshold among all possible key-point pairs, which may ignore a significant portion of the key-points in the query image. Our experiments show that the proposed measure yields better performance for image similarity matching and retrieval.
- Published
- 2015
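The similarity measurement described above can be sketched as a sum, over every query key-point, of its smallest descriptor distance in the test image, so that all query key-points contribute. The Euclidean descriptor distance here is an assumption for illustration.

```python
def image_similarity(query_desc, test_desc):
    """Sum over every query key-point of its smallest descriptor distance
    in the test image (a smaller total means more similar).  Unlike
    thresholded pair counting, every query key-point contributes."""
    def dist(a, b):
        # Euclidean distance between two descriptor vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return sum(min(dist(q, t) for t in test_desc) for q in query_desc)
```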
27. Reduced Reference Quality Metric for Depth Images
- Author
-
Thanh Ha Le, Seung-Won Jung, Seongjo Lee, and Chee Sun Won
- Subjects
Ground truth ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image (mathematics) ,Quality (physics) ,Single camera ,Gabor filter ,Computer Science::Computer Vision and Pattern Recognition ,Distortion ,Metric (mathematics) ,Computer vision ,Enhanced Data Rates for GSM Evolution ,Artificial intelligence ,business - Abstract
In this paper, a new quality metric for depth images is proposed. Unlike conventional depth metrics, which require additional information such as the ground truth depth image or a stereo image pair, the proposed quality metric demands only a single camera image and its corresponding depth image. In this work, we first empirically observe that depth distortion is closely related to local image characteristics. Based on this observation, we introduce a method to assess the local depth distortion for edge and non-edge regions. The local distortion is then adaptively weighted by a Gabor filter and accumulated into the quality metric for the depth image.
- Published
- 2015
28. Image retrieval using color histograms generated by Gauss mixture vector quantization
- Author
-
Robert M. Gray, Sangoh Jeong, and Chee Sun Won
- Subjects
Color histogram ,Color image ,business.industry ,Color normalization ,Quantization (signal processing) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Vector quantization ,Pattern recognition ,Data_CODINGANDINFORMATIONTHEORY ,Color space ,Color quantization ,Computer Science::Computer Vision and Pattern Recognition ,Signal Processing ,Color depth ,Computer vision ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Software ,Mathematics - Abstract
Image retrieval based on color histograms requires quantization of a color space. Uniform scalar quantization of each color channel is a popular method for the reduction of histogram dimensionality. With this method, however, no spatial information among pixels is considered in constructing the histograms. Vector quantization (VQ) provides a simple and effective means for exploiting spatial information by clustering groups of pixels. We propose the use of Gauss mixture vector quantization (GMVQ) as a quantization method for color histogram generation. GMVQ is known to be robust for quantizer mismatch, which motivates its use in making color histograms for both the query image and the images in the database. Results show that the histograms made by GMVQ with a penalized log-likelihood (LL) distortion yield better retrieval performance for color images than the conventional methods of uniform quantization and VQ with squared error distortion.
- Published
- 2004
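The histogram-generation step in the entry above can be sketched with plain nearest-codeword vector quantization standing in for GMVQ; in GMVQ each codeword would be a Gaussian and the distortion a penalized log-likelihood rather than squared error, so this is a simplification for illustration.

```python
def vq_histogram(pixels, codebook):
    """Normalized histogram of nearest-codeword assignments.

    `pixels` is a list of color vectors (e.g. RGB tuples); `codebook` is a
    list of codeword vectors.  Squared-error VQ is used here as a stand-in
    for GMVQ's penalized log-likelihood distortion.
    """
    hist = [0] * len(codebook)
    for p in pixels:
        # Assign each pixel to its nearest codeword.
        d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in codebook]
        hist[d.index(min(d))] += 1
    n = float(len(pixels))
    return [h / n for h in hist]
```

Because both the query and database histograms are built with the same quantizer, quantizer-mismatch robustness (the motivation for GMVQ in the abstract) matters mainly when the codebook is trained on different data than it is applied to.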
29. Efficient Use of MPEG-7 Edge Histogram Descriptor
- Author
-
Dong Kwon Park Park, Soo-Jun Park Park, and Chee Sun Won Won
- Subjects
Color histogram ,General Computer Science ,Balanced histogram thresholding ,Local binary patterns ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Histogram matching ,Pattern recognition ,Edge detection ,Electronic, Optical and Magnetic Materials ,Histogram ,Computer vision ,Adaptive histogram equalization ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Image histogram ,Mathematics - Abstract
The MPEG-7 Visual Standard specifies a set of descriptors that can be used to measure similarity in images or video. Among them, the Edge Histogram Descriptor describes edge distribution with a histogram based on local edge distribution in an image. Since the Edge Histogram Descriptor recommended for the MPEG-7 standard represents only the local edge distribution in the image, the matching performance for image retrieval may not be satisfactory. This paper proposes the use of global and semi-local edge histograms generated directly from the local histogram bins to increase the matching performance. The global, semi-global, and local histograms of images are then combined to measure image similarity and are compared with the MPEG-7 descriptor using the local-only histogram. Since we exploit the absolute location of edges in the image as well as its global composition, the proposed matching method can retrieve semantically similar images. Experiments on MPEG-7 test images show that the proposed method improves retrieval performance by 0.04 in ANMRR, which corresponds to a noticeable difference under visual inspection.
- Published
- 2002
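The derivation of global and semi-global bins from the 80 local EHD bins (16 sub-images × 5 edge types) can be sketched as below. The row grouping shown is just one example of the semi-global groupings, and uniform averaging is an illustrative simplification of the bin combination.

```python
def global_and_semiglobal(local_bins):
    """Derive global and one family of semi-global bins from local EHD bins.

    `local_bins`: 80 values = 16 sub-images x 5 edge types (vertical,
    horizontal, 45-degree, 135-degree, non-directional), sub-image major.
    """
    assert len(local_bins) == 80
    sub = [local_bins[i * 5:(i + 1) * 5] for i in range(16)]
    # Global bins: average each edge type over all 16 sub-images.
    global_bins = [sum(s[e] for s in sub) / 16.0 for e in range(5)]
    # One semi-global grouping: average over the 4 sub-images of each row.
    rows = []
    for r in range(4):
        row_sub = sub[r * 4:(r + 1) * 4]
        rows.append([sum(s[e] for s in row_sub) / 4.0 for e in range(5)])
    return global_bins, rows
```

Concatenating the local, semi-global, and global bins yields the extended histogram used for matching in the entry above.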
30. Fast binary matching for edge histogram descriptor
- Author
-
Byoul Park and Chee Sun Won
- Subjects
Matching (graph theory) ,business.industry ,Computer science ,Local binary patterns ,Physics::Medical Physics ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Histogram matching ,Binary number ,Pattern recognition ,Similarity measure ,Image (mathematics) ,ComputingMethodologies_PATTERNRECOGNITION ,Computer Science::Computer Vision and Pattern Recognition ,Histogram ,Computer Science::Multimedia ,Artificial intelligence ,Enhanced Data Rates for GSM Evolution ,business - Abstract
Edge histogram descriptor (EHD) consists of histogram bins of local edges in an image. Instead of the conventional bin-to-bin matching for the similarity measure, this paper presents a binary string descriptor for the EHD to achieve a faster binary bit-to-bit matching.
- Published
- 2014
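A binary bit-to-bit matching along the lines of the entry above might look as follows. The single-threshold binarization is an illustrative assumption, not the paper's exact bit-string construction; the point is that a Hamming distance over packed bits replaces per-bin arithmetic.

```python
def to_binary(bins, threshold):
    """Binarize histogram bins into one integer bit-string (1 = bin >= threshold)."""
    code = 0
    for b in bins:
        code = (code << 1) | (1 if b >= threshold else 0)
    return code

def hamming(a, b):
    """Bit-to-bit distance between two binary descriptors: popcount of XOR."""
    return bin(a ^ b).count("1")
```

Comparing two 80-bin descriptors then costs one XOR and one popcount instead of 80 bin-to-bin differences.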
31. Bounding Box and Frame Resizing for Moving Object of Interest
- Author
-
Anh Vu Le and Chee Sun Won
- Subjects
Pixel ,Computer science ,business.industry ,Frame (networking) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Boundary (topology) ,Object (computer science) ,Image (mathematics) ,Minimum bounding box ,Computer graphics (images) ,Computer vision ,Artificial intelligence ,business ,TRACE (psycholinguistics) ,Data compression - Abstract
A representative frame for a GoF (group of frames) of a video is formed by taking spatial and temporal gradients sequentially over the image frames and by selecting, among all co-located pixels in the GoF, the pixel with the largest spatial-temporal gradient (STG). As a result, the boundary of the moving object in the video is highlighted by the STG operation. Therefore, an optimal bounding box for a moving object can be determined by choosing the box size that maximizes the spatial density of the STG. The bounding box encloses the boundary trace of the MOOI (moving object of interest) in the GoF and is used to differentiate the MOOI from the non-MOOI. That is, the pixels outside the bounding box belong to the non-MOOI and are the main target for the size reduction of the video frames as a pre-processing step for video compression.
- Published
- 2014
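A simplified sketch of the STG idea in the entry above: accumulate spatial plus temporal gradient magnitudes over the GoF, then box the highlighted pixels. The threshold-based box here is a stand-in for the paper's search over box sizes for maximum STG density.

```python
def stg_map(frames):
    """Per-pixel maximum over the GoF of |spatial gradient| + |temporal gradient|.

    `frames`: list of H x W gray-level grids; forward differences are used
    as a simplified stand-in for the paper's sequential STG operation.
    """
    h, w = len(frames[0]), len(frames[0][0])
    stg = [[0.0] * w for _ in range(h)]
    for k, f in enumerate(frames):
        for y in range(h):
            for x in range(w):
                gx = f[y][x + 1] - f[y][x] if x + 1 < w else 0
                gy = f[y + 1][x] - f[y][x] if y + 1 < h else 0
                gt = frames[k + 1][y][x] - f[y][x] if k + 1 < len(frames) else 0
                stg[y][x] = max(stg[y][x], abs(gx) + abs(gy) + abs(gt))
    return stg

def bounding_box(stg, thresh):
    """Tight box (x_min, y_min, x_max, y_max) around pixels whose STG
    exceeds `thresh` (the paper instead maximizes STG density over box sizes)."""
    pts = [(x, y) for y, row in enumerate(stg)
           for x, v in enumerate(row) if v > thresh]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))
```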
32. MPEG-7 TEXTURE DESCRIPTORS
- Author
-
Chee Sun Won, Yang-lim Choi, Yong Man Ro, and Peng Wu
- Subjects
Similarity (geometry) ,business.industry ,Computer science ,Local binary patterns ,Nearest neighbor search ,Texture Descriptor ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Computer Graphics and Computer-Aided Design ,Texture (geology) ,Sketch ,Computer Science Applications ,Image texture ,Histogram ,Computer vision ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business - Abstract
A texture description is useful in many applications including similarity based image search and browsing. We present here three texture descriptors that are being considered for the final committee draft of the ISO/MPEG-7 standard. A comprehensive overview of the syntax and semantics of these texture descriptors is provided. The Homogeneous Texture Descriptor (HTD) and the Edge Histogram Descriptor (EHD) are useful in similarity search. The HTD characterizes homogeneous texture regions and is also useful in texture classification and recognition. The EHD is applicable when the underlying texture is not homogeneous and can also be used in sketch based retrieval. In addition, a compact descriptor that facilitates browsing applications is also defined. These descriptors are selected after a highly competitive test and evaluation phase within the MPEG group and we briefly summarize the evaluation criteria, the datasets used and the experimental results.
- Published
- 2001
33. A watermarking sequence using parities of error control coding for image authentication and correction
- Author
-
Jaejin Lee and Chee Sun Won
- Subjects
Sequence ,Authentication ,Theoretical computer science ,Computer science ,Media Technology ,Key (cryptography) ,Message authentication code ,Data_CODINGANDINFORMATIONTHEORY ,Electrical and Electronic Engineering ,Algorithm ,Digital watermarking ,Decoding methods ,Scrambling - Abstract
A novel function for image watermarking is proposed. We show how the watermarking sequence can be used to correct illegal modifications. To make this possible, the parities generated by a conventional error control coding (ECC) technique are used as the watermarking sequence. The receiver can then correct alterations by applying ECC decoding. To increase the correction capability, we also adopt a scrambling scheme before encoding, which changes burst-type content modifications (i.e., errors) into random noise. The scrambling key can also be used as a key for authenticating the sender.
- Published
- 2000
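The parity-as-watermark idea above can be illustrated with a Hamming(7,4) code standing in for the paper's ECC (the actual code used is not specified here): the parities of the content bits form the watermark, and syndrome decoding at the receiver locates and corrects a single altered bit.

```python
def hamming74_parity(d):
    """Parity bits for 4 data bits under Hamming(7,4); these parities
    would be embedded as the watermarking sequence."""
    d1, d2, d3, d4 = d
    return [d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4]

def hamming74_correct(d, p):
    """Recompute parities from the (possibly tampered) data bits and use
    the syndrome to locate and flip a single altered data bit."""
    s1, s2, s3 = [a ^ b for a, b in zip(hamming74_parity(d), p)]
    # The syndrome value is the position of the flipped bit in the
    # codeword layout p1 p2 d1 p3 d2 d3 d4 (positions 1..7).
    pos = s1 * 1 + s2 * 2 + s3 * 4
    data_pos = {3: 0, 5: 1, 6: 2, 7: 3}
    d = list(d)
    if pos in data_pos:
        d[data_pos[pos]] ^= 1
    return d
```

Scrambling before encoding, as in the abstract, would spread a burst of tampered pixels across many codewords so that each codeword sees at most the correctable number of errors.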
34. Image Fidelity Assessment Using the Edge Histogram Descriptor of MPEG-7
- Author
-
Chee Sun Won
- Subjects
General Computer Science ,business.industry ,media_common.quotation_subject ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Histogram matching ,Fidelity ,Electronic, Optical and Magnetic Materials ,Image (mathematics) ,Image texture ,Histogram ,Computer vision ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Digital watermarking ,Image histogram ,media_common ,Feature detection (computer vision) ,Mathematics - Abstract
An image fidelity assessment using the edge histogram descriptor (EHD) of MPEG-7 is presented. Neither additional data nor fragile watermarking is needed, and there is no need to access the original image as a reference. Only the EHDs of the original image and the received image are required. The peak signal-to-noise ratio (PSNR) obtained by comparing the EHD extracted from the received image and that of the original image is used to assess the noise level of the received image. Experimental results show that the PSNRs calculated from the conventional pixel-to-pixel gray level and from the proposed bin-to-bin EHD maintain a proportional relationship. This implies that the EHD can be used instead of image data for the image fidelity assessments.
- Published
- 2007
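The bin-to-bin PSNR between two EHDs in the entry above can be sketched as below. Normalizing the bins so that the peak value is 1 is an assumption for illustration; the abstract does not specify the peak used.

```python
import math

def ehd_psnr(bins_ref, bins_recv, peak=1.0):
    """PSNR computed bin-to-bin between two edge histograms
    (bins assumed normalized so `peak` is the maximum bin value)."""
    mse = sum((a - b) ** 2 for a, b in zip(bins_ref, bins_recv)) / len(bins_ref)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)
```

As the abstract notes, this descriptor-level PSNR tracks the pixel-level PSNR, so larger EHD distortion indicates a noisier received image.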
35. Image Retrieval via Query-by-Layout Using MPEG-7 Visual Descriptors
- Author
-
Chee Sun Won, Sung Min Kim, and Soo Jun Park
- Subjects
General Computer Science ,business.industry ,Binary image ,InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Content-based image retrieval ,Electronic, Optical and Magnetic Materials ,Automatic image annotation ,Image texture ,Color layout descriptor ,Computer vision ,Visual Word ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Image retrieval ,Mathematics ,Feature detection (computer vision) - Abstract
Query-by-example (QBE) is a well-known method for image retrieval. In reality, however, an example image to be used for the query is rarely available. Therefore, it is often necessary to find a good example image to be used for the query before applying the QBE method. Query-by-layout (QBL) is our proposal for that purpose. In particular, we make use of the visual descriptors such as the edge histogram descriptor (EHD) and the color layout descriptor (CLD) in MPEG-7. Since image features of the CLD and the EHD can be localized in terms of a 4×4 sub-image, we can specify image features such as color and edge distribution on each sub-image separately for image retrieval without a query image. Experimental results show that the proposed query method can be used to retrieve a good image as a starting point for further QBE-based image retrieval.
- Published
- 2007
36. A block-based MAP segmentation for image compressions
- Author
-
Chee Sun Won
- Subjects
A priori probability ,Contextual image classification ,business.industry ,Conditional probability ,Image processing ,Pattern recognition ,Image texture ,Media Technology ,Maximum a posteriori estimation ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Data compression ,Image compression ,Mathematics - Abstract
A novel block-based image segmentation algorithm using the maximum a posteriori (MAP) criterion is proposed. The conditional probability in the MAP criterion, which is formulated by the Bayesian framework, is in charge of classifying image blocks into edge, monotone, and textured blocks. On the other hand, the a priori probability is responsible for edge connectivity and homogeneous region continuity. After a few iterations to achieve a deterministic MAP optimization, we can obtain a block-based segmented image in terms of edge, monotone, or textured blocks. Then, using a connected block-labeling algorithm, we can assign a number to all connected homogeneous blocks to define an interior of a region. Finally, uncertainty blocks, which are not given any region number yet, are assigned to one of the neighboring homogeneous regions by a block-based region-growing method. During this process, we can also check the balance between the accuracy and the cost of the contour coding by adjusting the size of the uncertainty blocks. Experimental results show that the proposed algorithm yields larger homogeneous regions which are suitable for the object-based image compression.
- Published
- 1998
37. Compressing stereo images in discrete Fourier transform domain
- Author
-
Chee Sun Won and Shahram Shirani
- Subjects
Computer science ,business.industry ,Property (programming) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Phase (waves) ,Short-time Fourier transform ,Discrete Fourier transform ,Domain (software engineering) ,Discrete Fourier transform (general) ,symbols.namesake ,Fourier transform ,Discrete sine transform ,Computer Science::Computer Vision and Pattern Recognition ,Physics::Space Physics ,Physics::Atomic and Molecular Clusters ,symbols ,Astrophysics::Solar and Stellar Astrophysics ,Computer vision ,Artificial intelligence ,Physics::Chemical Physics ,business ,Image compression - Abstract
Frequency features of stereo images are investigated in the DFT (Discrete Fourier Transform) domain by characterizing phase and magnitude properties originated from the horizontal disparities in the stereo images. Also, the well-known DFT properties including the conjugate symmetry property are utilized to identify essential frequency components of stereo images. Our investigation reveals that the DFT of the stereo images has useful properties that can prioritize the DFT coefficients for compact representations and compressions.
- Published
- 2013
38. Fast selective interpolation for 3D depth images
- Author
-
Chee Sun Won, Hai Nguyen, and Anh Vu Le
- Subjects
Demosaicing ,Color image ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Bilinear interpolation ,Depth map ,Filter (video) ,Image scaling ,Computer vision ,Artificial intelligence ,business ,Image resolution ,Interpolation - Abstract
Depth maps taken from depth cameras are often available at a lower resolution than the corresponding color images. In this paper, we propose a selective interpolation method to expand a depth image to the size of the corresponding color image. To achieve this, we selectively adopt either a bilinear smoothing filter or an edge-preserving filter to expand the low-resolution depth map. The edge-preserving filters considered in this paper are the joint bilateral up-sampling (JBU) filter and the new edge-directed interpolation (NEDI) filter. Our method not only maintains the visual quality of the interpolated depth image in edge regions but also reduces the computational complexity for real-time execution.
- Published
- 2012
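The selective choice between a smoothing and an edge-preserving interpolator in the entry above can be illustrated in 1D. Nearest-neighbor copying stands in for the JBU/NEDI filters at depth discontinuities, and the edge threshold is an illustrative parameter.

```python
def selective_upsample_row(row, edge_thresh=10):
    """2x upsampling of one depth row: bilinear averaging between smooth
    neighbors, nearest-neighbor copying (an edge-preserving stand-in for
    JBU/NEDI) across depth discontinuities."""
    out = []
    for i, d in enumerate(row):
        out.append(d)
        if i + 1 < len(row):
            nxt = row[i + 1]
            if abs(nxt - d) < edge_thresh:
                out.append((d + nxt) / 2.0)  # smooth region: bilinear
            else:
                out.append(d)                # depth edge: copy nearer sample
    return out
```

Averaging across the 12-to-100 jump would invent a depth that belongs to neither surface; copying preserves the discontinuity, which is exactly why the selective scheme helps at edges.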
39. Design considerations for pico-projector based on LCoS and 3-LEDs
- Author
-
Soo Jun Park, Chee Sun Won, and Sung Min Kim
- Subjects
Materials science ,Liquid-crystal display ,Led illumination ,business.industry ,Color balance ,law.invention ,Liquid crystal on silicon ,Projector ,Duty cycle ,law ,Electronic engineering ,Optoelectronics ,Constant current ,business ,Light-emitting diode - Abstract
In this paper, two design considerations are presented for a pico-projector based on an LCoS panel and LEDs. To obtain a clear picture, the LED duty cycle must be less than 50% of the total possible LED illumination time. In addition, a constant-current circuit is needed to maintain the white balance of the picture.
- Published
- 2011
40. Adaptive interpolation for 3D stereoscopic video in frame-compatible top-bottom packing
- Author
-
Chee Sun Won
- Subjects
Demosaicing ,Video post-processing ,business.industry ,Image processing ,Stairstep interpolation ,Stereoscopy ,law.invention ,law ,Computer graphics (images) ,Image scaling ,Computer vision ,Artificial intelligence ,Motion interpolation ,business ,Mathematics ,Interpolation - Abstract
Stereoscopic 3D video is decimated according to the frame-compatible top-bottom packing structure. At the receiver, the discarded horizontal lines are reconstructed by interpolation functions using parallax-compensated data as well as the remaining neighboring lines.
- Published
- 2011
41. An optimization for classification maximum likelihood criterion
- Author
-
Chee Sun Won
- Subjects
Degenerate energy levels ,k-means clustering ,Structure (category theory) ,Covariance ,Combinatorics ,Artificial Intelligence ,Bayesian information criterion ,Signal Processing ,Pattern recognition (psychology) ,Applied mathematics ,Partition (number theory) ,Computer Vision and Pattern Recognition ,Cluster analysis ,Software ,Mathematics - Abstract
A clustering criterion introduced by Symons (1981), which is called Classification Maximum Likelihood (CML) criterion in this paper, is designed to consider the cluster size and the covariance structure of samples. The CML criterion is optimized by the ‘Moving method’ suggested by Duda and Hart (1973, p. 226). When the Moving method is applied to the CML criterion with an arbitrary initial cluster, it often yields degenerate clusters. To avoid such degenerate cases, we propose two stages of clustering. In the first stage, we roughly partition samples with respect to ‘the covariance structure component’ in the CML criterion. The resulting partition is then further clustered with the full CML criterion.
- Published
- 1993
42. Size-controllable region-of-interest in scalable image representation
- Author
-
Shahram Shirani and Chee Sun Won
- Subjects
Pixel ,business.industry ,Computer science ,Phantoms, Imaging ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Iterative reconstruction ,Grid ,Computer Graphics and Computer-Aided Design ,Scalable Video Coding ,Display device ,ComputingMethodologies_PATTERNRECOGNITION ,Transformation (function) ,Region of interest ,Scalability ,Image Processing, Computer-Assisted ,Computer vision ,Pyramid (image processing) ,Artificial intelligence ,business ,Software ,Algorithms ,Interpolation - Abstract
Differentiating region-of-interest (ROI) from non-ROI in an image in terms of relative size as well as fidelity becomes an important functionality for future visual communication environment with a variety of display devices. In this paper, we propose a scalable image representation with the ROI functionality in the spatial domain, which allows us to generate a hierarchy of images with arbitrary sizes. The ROI functionality of our scalable representation is a result of a nonuniform grid transformation in the spatial domain, where only the center of ROI and an expansion parameter are to be known. Our grid transformation guarantees no loss of information within the area of ROI.
- Published
- 2010
43. Unsupervised segmentation of noisy and textured images using Markov random fields
- Author
-
Haluk Derin and Chee Sun Won
- Subjects
Random field ,Markov chain ,Segmentation-based object categorization ,business.industry ,Estimation theory ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,General Engineering ,Scale-space segmentation ,Pattern recognition ,Image segmentation ,symbols.namesake ,Gaussian noise ,Computer Science::Computer Vision and Pattern Recognition ,symbols ,General Earth and Planetary Sciences ,Segmentation ,Artificial intelligence ,business ,General Environmental Science ,Mathematics - Abstract
This paper proposes a general unsupervised segmentation algorithm which estimates all model parameters, including the number of regions, as part of the segmentation. The general image model is a hierarchical one, consisting of Markov random fields as components of the model. The MAP criterion is adopted, in principle, for the simultaneous image segmentation and parameter estimation procedures. Due to the difficulty of implementing the MAP segmentation with a large number of unknown parameters, a novel modification of the MAP criterion is proposed. For the model parameters having closed-form ML estimates, these estimates are substituted back into the objective function to reduce the difficulty of the maximization. The remaining maximization is implemented by a recursive segmentation-parameter estimation algorithm, which yields a partial optimal solution (POS) to the maximization problem. In the special case where all model parameters have closed-form ML estimates, the proposed algorithm is equivalent to implementing the MAP criterion. The number of regions in the image is determined through a model fitting criterion tagged on to the segmentation algorithm. Special forms of the general unsupervised segmentation algorithm are developed for the segmentation of noisy and textured images. For noisy images, the image is assumed to consist of uniform graylevel regions modeled by a class of Gibbs random fields and corrupted by additive, white, region-dependent, Gaussian noise. For textured images, the image is assumed to consist of regions, modeled by a class of Gibbs random fields, which are filled with textures, modeled by Gaussian Markov random fields. The algorithms for both classes of images are applied to a wide range of images—generated according to the model, hand-drawn, natural and Brodatz textures, their combinations, and outdoor images—with notable success. 
Despite the large number of unknown parameters (as many as 14 for some noisy images and 36 for some textured images), the algorithms yield good segmentations, accurate estimates for the parameters, and the correct number of regions.
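The recursive segmentation-parameter estimation loop can be sketched in a much-reduced form for the noisy-image case: a two-region image corrupted by Gaussian noise, where the region means have closed-form ML estimates that are re-substituted each pass, and the label update is an ICM sweep under a Potts-style smoothness prior. The function name, the fixed two-region setting, and the shared unit variance are illustrative simplifications, not the paper's general K-region Gibbs model:

```python
import math

def segment_two_region(img, beta=1.0, iters=10):
    """Recursive segmentation / parameter estimation for a two-region
    noisy image: alternate closed-form ML estimates of the region means
    (Gaussian noise, shared unit variance assumed) with ICM label
    updates under a Potts smoothness prior weighted by beta."""
    h, w = len(img), len(img[0])
    # initialize labels by thresholding at the global mean
    mu = sum(map(sum, img)) / (h * w)
    lab = [[1 if img[i][j] > mu else 0 for j in range(w)] for i in range(h)]
    means = [0.0, 0.0]
    for _ in range(iters):
        # closed-form ML estimate of each region mean
        for r in (0, 1):
            px = [img[i][j] for i in range(h) for j in range(w) if lab[i][j] == r]
            means[r] = sum(px) / len(px) if px else 0.0
        # ICM sweep: pick the label minimizing data cost minus prior agreement
        for i in range(h):
            for j in range(w):
                best, best_cost = lab[i][j], float("inf")
                for r in (0, 1):
                    agree = sum(1 for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
                                if 0 <= i + di < h and 0 <= j + dj < w
                                and lab[i + di][j + dj] == r)
                    cost = (img[i][j] - means[r]) ** 2 - beta * agree
                    if cost < best_cost:
                        best, best_cost = r, cost
                lab[i][j] = best
    return lab, means
```

On a small synthetic image with two graylevel plateaus plus noise, the loop converges in a few sweeps, with the estimated means close to the true region levels.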
- Published
- 1992
44. Decoding strategies for Reed-Solomon product codes: application to digital video recording systems
- Author
-
Chee Sun Won, Seung-Ho Kim, and Sang Wu Kim
- Subjects
Error floor ,Computer science ,List decoding ,Data_CODINGANDINFORMATIONTHEORY ,Sequential decoding ,Soft-decision decoder ,Reed–Solomon error correction ,Media Technology ,Redundancy (engineering) ,Electrical and Electronic Engineering ,Error detection and correction ,Algorithm ,Decoding methods ,Computer Science::Information Theory ,Communication channel - Abstract
The authors propose two decoding strategies for Reed-Solomon product codes and analyze their performance on a random-error channel. The performance measures of interest are the probability of undetected error at the inner decoder output and the probability of decoding failure at the outer decoder output. The proposed decoding strategies are shown to give a significant performance improvement over the conventional decoder.
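The inner/outer interplay such strategies exploit can be illustrated with a toy product code. The sketch below substitutes single-parity-check component codes for Reed-Solomon codes (whose algebraic decoding is beyond a short example): the inner decoder flags rows whose parity check fails, and the outer decoder treats a single flagged row as an erasure it can refill from column parity. The function names and the SPC construction are illustrative assumptions, not the paper's decoders:

```python
def encode_product(data):
    """Toy single-parity-check product code: append a parity bit to each
    row (inner code) and a parity row over the columns (outer code), so
    every row sum and every column sum is even."""
    rows = [r + [sum(r) % 2] for r in data]
    parity_row = [sum(col) % 2 for col in zip(*rows)]
    return rows + [parity_row]

def decode_product(word):
    """Inner decoder flags rows whose parity fails; if exactly one row
    is flagged, the outer decoder corrects it column-by-column as an
    erasure (each column sum must come out even)."""
    flagged = [i for i, r in enumerate(word[:-1]) if sum(r) % 2 != 0]
    word = [r[:] for r in word]
    if len(flagged) == 1:
        i = flagged[0]
        for j in range(len(word[0])):
            word[i][j] = sum(word[k][j] for k in range(len(word))
                             if k != i) % 2
    return [r[:-1] for r in word[:-1]]
```

An even number of bit errors inside one row slips past the inner parity check; that is the undetected-error event whose probability the paper analyzes at the inner decoder output.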
- Published
- 1992
45. Automatic Object Extraction in Images using Embedded Labels
- Author
-
Chee Sun Won
- Subjects
Pixel ,business.industry ,Computer science ,Data_MISCELLANEOUS ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Watermark ,Data_CODINGANDINFORMATIONTHEORY ,Image segmentation ,Computer vision ,Artificial intelligence ,business ,Digital watermarking ,Decoding methods ,Transform coding ,Data compression - Abstract
To automatically generate images with the same foreground but different backgrounds, a watermark bit (e.g., binary 1 for foreground and 0 for background) can be inserted at each pixel location. Then, the embedded watermark bit can be automatically extracted and the background can be separated from the object. Note that the object extraction succeeds only if the watermarked image is intact. However, if the watermarked image goes through post-processing such as JPEG compression and cropping, the pixel-wise watermark decoding may fail. To overcome this problem, this paper proposes a block-wise watermark insertion and a block-wise MAP (maximum a posteriori) watermark decoding. Experimental results show that the proposed method is more robust than pixel-wise decoding under various post-processing attacks.
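The block-wise idea can be sketched in a simplified form. A true MAP decoder weighs a noise model against a prior on the mask; under i.i.d. bit-flip noise and a uniform prior, MAP decoding of a bit repeated across a block reduces to a majority vote, which is what this sketch implements. The function names and the repetition-style embedding are illustrative assumptions, not the paper's insertion scheme:

```python
def embed_block_watermark(mask, block=4):
    """Replicate each foreground/background mask bit over a
    block x block tile of watermark bits."""
    h, w = len(mask), len(mask[0])
    return [[mask[i // block][j // block] for j in range(w * block)]
            for i in range(h * block)]

def decode_block_watermark(wm, block=4):
    """Recover each mask bit as the majority of its block's watermark
    bits -- the MAP rule under i.i.d. bit flips and a uniform prior."""
    h, w = len(wm) // block, len(wm[0]) // block
    mask = [[0] * w for _ in range(h)]
    for bi in range(h):
        for bj in range(w):
            ones = sum(wm[bi * block + di][bj * block + dj]
                       for di in range(block) for dj in range(block))
            mask[bi][bj] = 1 if 2 * ones > block * block else 0
    return mask
```

Flipping a minority of the bits inside a block, as compression or cropping damage might, leaves the majority vote, and hence the recovered mask, unchanged.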
- Published
- 2008
46. An adaptive color image retrieval framework using Gauss mixtures
- Author
-
Chee Sun Won, Sangoh Jeong, and Robert M. Gray
- Subjects
Contextual image classification ,business.industry ,Histogram ,Gauss ,Relevance feedback ,Pattern recognition ,Visual Word ,Artificial intelligence ,business ,Image retrieval ,Classifier (UML) ,Mathematics ,Semantic gap - Abstract
To reduce the semantic gap, image retrieval systems based on users' relevance feedback have been adopted. However, since this structure requires human intervention during the retrieval process, it cannot be applied to fully automated systems. To avoid this problem, we propose a feed-forward framework instead of the feedback retrieval system: a classifier is added to the traditional system to provide feed-forward information that maximizes the average precision. That is, given a database, the proposed system improves the overall precision by selecting the best retrieval mode based on known statistics (average precision versus recall for each category). Lloyd-clustered Gauss mixtures are used both in the classifier, to provide the feed-forward category information, and in the quantization of color images for histogram generation.
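The feed-forward step, routing a query by Gauss-mixture likelihood before retrieval, can be sketched for a scalar feature. The category names, mixture parameters, and the one-dimensional feature are hypothetical; the paper works with Lloyd-clustered Gauss mixtures over color features:

```python
import math

def gmm_loglik(x, mixture):
    """Log-likelihood of scalar feature x under a Gauss mixture given
    as a list of (weight, mean, variance) components."""
    total = sum(w * math.exp(-(x - m) ** 2 / (2 * v))
                / math.sqrt(2 * math.pi * v)
                for w, m, v in mixture)
    return math.log(total)

def select_category(x, category_models):
    """Feed-forward step: route the query to the category whose Gauss
    mixture assigns it the highest likelihood, so retrieval can use
    that category's precision-recall statistics."""
    return max(category_models,
               key=lambda c: gmm_loglik(x, category_models[c]))
```

A query feature close to one category's mixture components is routed to that category, and the retrieval stage can then apply the mode known to maximize average precision for it.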
- Published
- 2008
47. Embedding Auxiliary Data in H.264/AVC Compression Domain
- Author
-
Chee Sun Won and Sang Beom Kim
- Subjects
Theoretical computer science ,Computer science ,Compression (functional analysis) ,Embedding ,Algorithm ,H 264 avc ,Domain (software engineering) - Published
- 2008
48. On generating automatic-object-extractable images
- Author
-
Chee Sun Won
- Subjects
business.industry ,Computer science ,Segmentation-based object categorization ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Boundary (topology) ,Image segmentation ,Object (computer science) ,Semantics ,Signature (logic) ,Image (mathematics) ,Computer vision ,Segmentation ,Artificial intelligence ,business - Abstract
Object segmentation and extraction play an important role in computer vision and recognition problems. Unfortunately, with current computing technologies, fully automatic object segmentation is not possible; human intervention is needed to outline the rough boundary of the object to be segmented. The goal of this paper is to make the object extraction automatic after the first semi-automatic segmentation. That is, once a semantically meaningful object such as a house or a human body is extracted from the image under human guidance, an image manipulation technique is applied. There is no noticeable difference between the original and the manipulated images. However, the signature embedded by the manipulation can be detected automatically and used to differentiate the object from the background. The manipulated images, called automatic-object-extractable images, can be used to provide training images with the same object but various backgrounds.
- Published
- 2007
49. Data Hiding on H.264/AVC Compressed Video
- Author
-
Sang Beom Kim, Sung Min Kim, Youpyo Hong, and Chee Sun Won
- Subjects
Computer science ,Information hiding ,Real-time computing ,Sign bit ,Watermark ,Visual artifact ,Algorithm ,Digital watermarking ,Scalable Video Coding ,Context-adaptive variable-length coding ,Video compression picture types - Abstract
An important issue in embedding watermark bits in a compressed video stream is to keep the bit-rate unchanged after the watermarking. This is a very difficult problem for highly efficient compression methods such as H.264/AVC, because a single bit alteration in a highly compressed bit-stream may widely affect the video content. In this paper we solve this problem by embedding the watermark bit in the sign bit of the Trailing Ones in Context Adaptive Variable Length Coding (CAVLC) of H.264/AVC. The algorithm yields no bit-rate change after the data hiding. Also, we can easily balance the capacity of the watermark bits against the fidelity of the video. The simplicity of the proposed algorithm is an added bonus for real-time applications. Our experiments show that the PSNRs of the video sequences after the data hiding are higher than 43 dB.
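The reason the bit-rate is preserved is that CAVLC spends exactly one bit on the sign of each trailing +/-1 coefficient, so rewriting that sign changes the reconstruction slightly but never the coded length. The sketch below models this at the coefficient level only; it is not an H.264 bitstream parser, and the function names are illustrative:

```python
def embed_in_trailing_ones(coeffs, bits):
    """Overwrite the signs of trailing +/-1 coefficients with watermark
    bits.  coeffs: quantized coefficients of one block in reverse scan
    order (toy model, not a parsed bitstream); bit 0 -> positive sign,
    bit 1 -> negative sign.  CAVLC signals at most three trailing ones
    and codes each sign with exactly one bit, so the bit-rate is kept."""
    out = list(coeffs)
    it = iter(bits)
    for i, c in enumerate(out):
        if abs(c) != 1 or i == 3:  # trailing ones end here
            break
        try:
            out[i] = -1 if next(it) else 1
        except StopIteration:
            break
    return out

def extract_from_trailing_ones(coeffs):
    """Read the watermark bits back from the trailing-one signs."""
    bits = []
    for i, c in enumerate(coeffs):
        if abs(c) != 1 or i == 3:
            break
        bits.append(1 if c < 0 else 0)
    return bits
```

Because only the signs of magnitude-1 coefficients change, the per-coefficient error is at most 2 quantization steps, which is consistent with the high PSNRs reported.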
- Published
- 2007
50. Using edge histogram descriptor of MPEG-7 for the measurement of image quality and modifications
- Author
-
Chee Sun Won
- Subjects
Standard test image ,business.industry ,Image quality ,Binary image ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Geography ,Image texture ,Histogram ,Computer vision ,Artificial intelligence ,business ,Image histogram ,Image gradient - Abstract
A novel image quality assessment using the edge histogram descriptor (EHD) of MPEG-7 is presented. Neither additional data nor fragile watermarking is needed for the quality assessment and image content authentication. Moreover, the original image itself is not needed as a reference; only the EHD metadata of the original image and the received (noisy or altered) image are required. The PSNR (Peak Signal-to-Noise Ratio) or mean-square error (MSE) is estimated by comparing the EHD extracted from the received image with that of the original image attached as metadata, and is then used to assess the level of image degradation and any illicit modification of the image. Experimental results show that the PSNRs calculated from the two EHDs are similar to those calculated from pixel-to-pixel comparisons of the original and received images. This implies that the EHD, instead of the image data itself, can be used to calculate the PSNR for image assessment. Also, since the EHD extracted from the received image changes with alterations of the image content, the proposed method can also be used for image authentication.
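EHD extraction and comparison can be sketched in a simplified form. The real MPEG-7 EHD keeps a separate five-bin histogram for each of 16 sub-images (80 bins in all); the sketch below keeps a single global five-bin histogram, uses the standard 2x2 edge filters, and returns only a histogram distance as a degradation proxy, not the paper's PSNR estimate. Function names and the threshold value are assumptions:

```python
import math

def edge_histogram(img, thresh=1.0):
    """Simplified MPEG-7-style edge histogram: classify each 2x2 image
    block as vertical, horizontal, 45-degree, 135-degree, or
    non-directional edge via the standard 2x2 filter responses, and
    count each type (a single global histogram instead of 16 five-bin
    sub-image histograms)."""
    hist = [0] * 5  # v, h, d45, d135, non-directional
    h, w = len(img), len(img[0])
    for i in range(0, h - 1, 2):
        for j in range(0, w - 1, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            resp = [abs(a + c - b - d),                 # vertical
                    abs(a + b - c - d),                 # horizontal
                    abs(math.sqrt(2) * (a - d)),        # 45-degree
                    abs(math.sqrt(2) * (b - c)),        # 135-degree
                    abs(2 * a - 2 * b - 2 * c + 2 * d)] # non-directional
            m = max(resp)
            if m >= thresh:
                hist[resp.index(m)] += 1
    return hist

def ehd_distance(h1, h2):
    """L1 distance between two edge histograms, used here as a crude
    proxy for the amount of degradation or alteration."""
    return sum(abs(x - y) for x, y in zip(h1, h2))
```

An unmodified image yields a zero distance to its own stored EHD, while noise or content alteration shifts blocks between edge classes and the distance grows accordingly.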
- Published
- 2006