145 results for "Video post-processing"
Search Results
2. Enhancement system of nighttime infrared video image and visible video image
- Author
-
Yan Piao and Yue Wang
- Subjects
Image fusion ,Motion compensation ,Video post-processing ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Scale-invariant feature transform ,Image processing ,Uncompressed video ,Geography ,Video tracking ,Video denoising ,Computer vision ,Artificial intelligence ,business - Abstract
Visibility of nighttime video has great significance for military and medical applications, but nighttime video quality is so poor that the target and background cannot be recognized. We therefore enhance nighttime video by fusing infrared and visible video images. According to the characteristics of infrared and visible images, we propose an improved SIFT algorithm and an αβ-weighted algorithm to fuse heterologous nighttime images. A transfer matrix is deduced from the improved SIFT algorithm and used to rapidly register heterologous nighttime images, and the αβ-weighted algorithm can be applied to any scene. In the video fusion system, the transfer matrix registers every frame and the αβ-weighted method then fuses every frame, which meets the timing requirements of video. The fused video not only retains the clear target information of the infrared video but also preserves the detail and color information of the visible video, and it plays back fluently.
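The register-then-blend pipeline described in this abstract is straightforward to prototype. Below is a minimal Python/OpenCV sketch of the two steps: a transfer matrix (homography) estimated once from SIFT matches, then reused to register every infrared frame before a fixed α/β weighted blend. The function names, the 0.75 ratio test, and the assumption of a grayscale IR frame are illustrative choices, not the paper's.

```python
import cv2
import numpy as np

def estimate_transfer_matrix(ir_gray, vis_gray):
    """Estimate a homography from SIFT matches once, then reuse it
    for every subsequent frame pair (the paper's 'transfer matrix')."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(ir_gray, None)
    kp2, des2 = sift.detectAndCompute(vis_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H

def fuse_frame(ir_gray, vis_bgr, H, alpha=0.6, beta=0.4):
    """Register the IR frame with the precomputed matrix, then blend:
    fused = alpha * visible + beta * infrared."""
    h, w = vis_bgr.shape[:2]
    ir_reg = cv2.warpPerspective(ir_gray, H, (w, h))
    ir_bgr = cv2.cvtColor(ir_reg, cv2.COLOR_GRAY2BGR)
    return cv2.addWeighted(vis_bgr, alpha, ir_bgr, beta, 0.0)
```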
- Published
- 2016
3. Real-time synchronized rendering of multi-view video for 8Kx2K three-dimensional display with spliced four liquid crystal panels
- Author
-
Binbin Yan, Shujun Xing, Jiwei Ning, Wenhua Dou, Liquan Xiao, Xinzhu Sang, and Huilong Cui
- Subjects
Liquid-crystal display ,Video post-processing ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Image segmentation ,Stereo display ,law.invention ,Rendering (computer graphics) ,CUDA ,Software ,law ,Computer graphics (images) ,Computer vision ,Artificial intelligence ,business - Abstract
A high-speed synchronized rendering system of multi-view video for an 8K×4K multi-LCD-spliced three-dimensional (3D) display based on CUDA is demonstrated. Because conventional image-processing methods are no longer applicable to this 3D display system, CUDA is used for the 3D image processing to address the problem of low efficiency. The 8K×4K screen is composed of four LCD panels, so the scene is accurately segmented to ensure correct display of the 3D content, and a controlling scheme and host software are implemented so that all of the connected processors render 3D videos simultaneously. The system, based on a master-slave synchronized communication mode and a DIBR-CUDA accelerated algorithm, realizes high-resolution, high-frame-rate, large-size, wide-viewing-angle video rendering for real-time 3D display. Experimental results show a stable frame rate of 30 frames per second and a friendly interactive interface.
- Published
- 2016
4. Video stabilization using space-time video completion
- Author
-
Nikolay Gapon, Sos S. Agaian, V. A. Frantc, I. Shrayfel, Vladimir I. Marchuk, and V. V. Voronin
- Subjects
Motion compensation ,Video post-processing ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020207 software engineering ,02 engineering and technology ,Video compression picture types ,Video tracking ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Video denoising ,Computer vision ,Artificial intelligence ,Multiview Video Coding ,business ,Reference frame ,Block-matching algorithm - Abstract
This paper proposes a video stabilization method that uses space-time video completion for effective reconstruction of static and dynamic textures instead of frame cropping. The proposed method can produce full-frame videos by naturally filling in missing image parts and locally aligning image data of neighboring frames. We propose a set of descriptors that encapsulate the information on the periodic motion of objects needed to reconstruct missing or corrupted frames. The background is filled in by extending spatial texture synthesis techniques using a set of 3D patches. Experimental results demonstrate the effectiveness of the proposed method for full-frame video stabilization.
- Published
- 2016
5. Design and implementation of H.264 based embedded video coding technology
- Author
-
Jiemin Zhang, Jinming Liu, and Jian Mao
- Subjects
Video post-processing ,Computer science ,Video capture ,business.industry ,Bink Video ,020208 electrical & electronic engineering ,020206 networking & telecommunications ,02 engineering and technology ,Video processing ,computer.file_format ,Smacker video ,computer.software_genre ,Uncompressed video ,Non-linear editing system ,Embedded system ,Video tracking ,0202 electrical engineering, electronic engineering, information engineering ,Multiview Video Coding ,business ,computer ,Computer hardware - Abstract
In this paper, an embedded system for remote online video monitoring was designed and developed to capture and record real-time conditions in an elevator. To improve the efficiency of video acquisition and processing, the system uses the Samsung S5PV210 chip, which integrates a graphics processing unit, as the core processor, and the video is encoded in H.264 format for efficient storage and transmission. Hardware video coding based on the S5PV210 chip was investigated and found to be more efficient than software coding. Running tests proved that hardware video coding can noticeably reduce system cost and produce smoother video display. The system can be widely applied to security supervision [1].
- Published
- 2016
6. Video streaming with SHVC to HEVC transcoding
- Author
-
Yan Ye, Xiaoyu Xiu, Srinivas Gudumasu, and Yuwen He
- Subjects
Video post-processing ,Computer science ,Real-time computing ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Data_CODINGANDINFORMATIONTHEORY ,Transcoding ,InformationSystems_MISCELLANEOUS ,Video quality ,computer.software_genre ,Bitstream ,Motion vector ,computer ,Scalable Video Coding - Abstract
This paper proposes an efficient Scalable High Efficiency Video Coding (SHVC) to High Efficiency Video Coding (HEVC) transcoder, which can reduce the transcoding complexity significantly and provide a desired trade-off between transcoding complexity and transcoded video quality. To reduce the transcoding complexity, some of the coding information in the SHVC bitstream, such as coding unit (CU) depth, prediction mode, merge mode, motion vector information, intra direction information, and transform unit (TU) depth information, is mapped and transcoded to a single-layer HEVC bitstream. One major difficulty in transcoding arises when trying to reuse the motion information from the SHVC bitstream, since motion vectors referring to inter-layer reference (ILR) pictures cannot be reused directly. Reusing motion information obtained from ILR pictures for those prediction units (PUs) greatly reduces the complexity of the SHVC transcoder, but a significant reduction in picture quality is observed. Pictures corresponding to the intra refresh pictures in the base layer (BL) are coded as P pictures in the enhancement layer (EL) of the SHVC bitstream, and directly reusing the intra information from the BL for transcoding does not yield good coding efficiency. To solve these problems, various transcoding technologies are proposed, offering different trade-offs between transcoding speed and transcoding quality. They are implemented on the basis of the reference software SHM-6.0 and HM-14.0 for the two-layer spatial scalability configuration. Simulations show that the proposed SHVC software transcoder reduces transcoding complexity by up to 98-99% in the low-complexity transcoding mode, compared with the cascaded re-encoding method. The transcoder's performance at various bitrates with different transcoding modes is compared in terms of transcoding speed and transcoded video quality.
- Published
- 2015
7. Display device-adapted video quality-of-experience assessment
- Author
-
Zhou Wang, Kai Zeng, and Abdul Rehman
- Subjects
Video post-processing ,Video capture ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Video processing ,Video quality ,Video compression picture types ,Uncompressed video ,Deflicking ,Video tracking ,Human visual system model ,Computer vision ,Artificial intelligence ,PEVQ ,business ,Subjective video quality - Abstract
Today's viewers consume video content on a variety of connected devices, including smart phones, tablets, notebooks, TVs, and PCs. This imposes significant challenges for managing video traffic efficiently to ensure an acceptable quality-of-experience (QoE) for end users, as the perceptual quality of video content strongly depends on the properties of the display device and the viewing conditions. State-of-the-art full-reference objective video quality assessment algorithms do not take into account the combined impact of display device properties, viewing conditions, and video resolution while performing quality assessment. We performed a subjective study to understand the impact of the aforementioned factors on perceptual video QoE. We also propose a full-reference video QoE measure, named SSIMplus, that provides real-time prediction of the perceptual quality of a video based on human visual system behaviors, video content characteristics (such as spatial and temporal complexity, and video resolution), display device properties (such as screen size, resolution, and brightness), and viewing conditions (such as viewing distance and angle). Experimental results show that the proposed algorithm outperforms state-of-the-art video quality measures in terms of accuracy and speed.
- Published
- 2015
8. Enhanced features for supervised lecture video segmentation and indexing
- Author
-
Gady Agam and Di Ma
- Subjects
Motion compensation ,Video post-processing ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image segmentation ,Video compression picture types ,Gabor filter ,Video tracking ,Video denoising ,Segmentation ,Computer vision ,Artificial intelligence ,Multiview Video Coding ,business ,Block-matching algorithm - Abstract
Lecture videos are common and their number is increasing rapidly, so automatically and efficiently indexing such videos is an important task. Video segmentation is a crucial step of video indexing that directly affects indexing quality. We are developing a system for automated video indexing, and in this paper we discuss our approach to video segmentation and to classification of video segments. The contributions of this paper are twofold. First, we develop a dynamic Gabor filter and use it to extract features for video frame classification. Second, we propose a recursive video segmentation algorithm capable of clustering video frames into video segments. We then use these to classify and index the video segments. The proposed approach achieves a higher true positive rate (TPR = 89.5%) and a lower false discovery rate (FDR = 11.2%) than a commercial system (TPR = 81.8%, FDR = 39.4%), demonstrating that performance is significantly improved by the enhanced features.
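As a rough illustration of Gabor-based frame features (the paper's dynamic Gabor filter is not public, so this is a plain static filter bank with assumed scales and orientations):

```python
import cv2
import numpy as np

def gabor_features(frame_gray, scales=(7, 11, 15), orientations=4):
    """Mean/variance responses of a small Gabor filter bank, usable as a
    per-frame feature vector for, e.g., slide-vs-speaker classification."""
    feats = []
    for ksize in scales:
        for i in range(orientations):
            theta = i * np.pi / orientations
            kern = cv2.getGaborKernel((ksize, ksize), sigma=0.5 * ksize,
                                      theta=theta, lambd=0.5 * ksize,
                                      gamma=0.5, psi=0)
            resp = cv2.filter2D(frame_gray.astype(np.float32), -1, kern)
            feats.extend([resp.mean(), resp.var()])
    return np.array(feats)
```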
- Published
- 2015
9. Improved video copy detection algorithm based on multi-scale Harris feature points
- Author
-
Yan Yan Hou
- Subjects
Motion compensation ,Video post-processing ,Computer science ,business.industry ,Video copy detection ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Fingerprint recognition ,Longest path problem ,Video tracking ,Graph (abstract data type) ,Computer vision ,Artificial intelligence ,business ,Algorithm ,Block-matching algorithm - Abstract
To meet the real-time requirements of video copy detection, a robust video copy detection algorithm is proposed. Harris feature points are extracted based on a local feature descriptor, video frames are divided into blocks, and video fingerprints are generated by calculating feature-point amplitude and angle differences. A matching-result graph is formed from the matching frames, and a copy is detected by searching the longest path of the graph. Compared with other video detection algorithms, the proposed algorithm offers good robustness and discrimination accuracy, and experiments show that detection speed is further improved.
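One plausible reading of the block-wise amplitude/angle fingerprint is sketched below in Python/OpenCV; the grid size, Harris parameters, and the use of gradient magnitude/angle statistics at corner locations are assumptions, not the paper's exact scheme.

```python
import cv2
import numpy as np

def frame_fingerprint(gray, grid=4, thresh_rel=0.01):
    """Per-block fingerprint from gradient magnitude/angle at Harris
    corner locations (hypothetical block scheme)."""
    corners = cv2.cornerHarris(np.float32(gray), 2, 3, 0.04)
    mask = corners > thresh_rel * corners.max()
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    mag, ang = cv2.cartToPolar(gx, gy)
    h, w = gray.shape
    code = []
    for by in range(grid):
        for bx in range(grid):
            blk = (slice(by * h // grid, (by + 1) * h // grid),
                   slice(bx * w // grid, (bx + 1) * w // grid))
            sel = mask[blk]
            # mean amplitude and angle over corner pixels in this block
            code.append(mag[blk][sel].mean() if sel.any() else 0.0)
            code.append(ang[blk][sel].mean() if sel.any() else 0.0)
    return np.array(code)
```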
- Published
- 2015
10. Advanced texture filtering: a versatile framework for reconstructing multi-dimensional image data on heterogeneous architectures
- Author
-
Ulrich Lang, Yvonne Percan, and Stefan Zellmann
- Subjects
Video post-processing ,Parallel rendering ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Software rendering ,Volume rendering ,3D rendering ,Real-time rendering ,Visualization ,Rendering (computer graphics) ,Texture filtering ,Computer graphics (images) ,Tiled rendering ,General-purpose computing on graphics processing units ,Alternate frame rendering ,business ,Texture memory ,ComputingMethodologies_COMPUTERGRAPHICS ,Interpolation - Abstract
Reconstruction of 2-D image primitives or of 3-D volumetric primitives is one of the most common operations performed by the rendering components of modern visualization systems. Because this operation is often aided by GPUs, reconstruction is typically restricted to first-order interpolation. With the advent of in situ visualization, the assumption that rendering algorithms are in general executed on GPUs is however no longer adequate. We thus propose a framework that provides versatile texture filtering capabilities: up to third-order reconstruction using various types of cubic filtering and interpolation primitives; cache-optimized algorithms that integrate seamlessly with GPGPU rendering or with software rendering that was optimized for cache-friendly "Structure of Arrays" (SoA) access patterns; and a memory management layer (MML) that gracefully hides the complexities of the extra data copies necessary for memory access optimizations such as swizzling, for rendering on GPGPUs, or for reconstruction schemes that rely on pre-filtered data arrays. We prove the effectiveness of our software architecture by integrating it into, and validating it using, the open source direct volume rendering (DVR) software DeskVOX.
Keywords: Image Reconstruction, Splines, Texture Filtering Library, Direct Volume Rendering, Heterogeneous HPC Architectures
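For context, the gap between first-order and third-order reconstruction is easy to see in one dimension. Here is a minimal Catmull-Rom cubic kernel evaluated at a fractional coordinate, one standard instance of the cubic filtering such a framework generalizes (not DeskVOX's actual code):

```python
import numpy as np

def catmull_rom_1d(samples, x):
    """Third-order (cubic) reconstruction of a 1-D signal at fractional
    coordinate x, versus the first-order (linear) lookup GPUs default to."""
    i = int(np.floor(x))
    t = x - i
    # four neighboring samples, clamped at the borders
    p = [samples[np.clip(i + k, 0, len(samples) - 1)] for k in (-1, 0, 1, 2)]
    return 0.5 * ((2 * p[1]) +
                  (-p[0] + p[2]) * t +
                  (2 * p[0] - 5 * p[1] + 4 * p[2] - p[3]) * t**2 +
                  (-p[0] + 3 * p[1] - 3 * p[2] + p[3]) * t**3)
```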
- Published
- 2015
11. Design of video interface conversion system based on FPGA
- Author
-
Heng Zhao and Xiangjun Wang
- Subjects
Video post-processing ,business.industry ,Computer science ,Video capture ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,S-Video ,computer.file_format ,Video processing ,Smacker video ,Uncompressed video ,Multiview Video Coding ,business ,computer ,Computer hardware ,Composite video - Abstract
This paper presents an FPGA-based video interface conversion system that enables inter-conversion between digital and analog video. A Cyclone IV series EP4CE22F17C chip from Altera Corporation is used as the main video processing chip, and a single-chip microcontroller is used as the information-interaction control unit between the FPGA and the PC. The system is able to encode/decode messages from the PC. Technologies including the video decoding/encoding circuits, the bus communication protocol, data stream de-interleaving and de-interlacing, color space conversion, and the Camera Link timing generator module of the FPGA are introduced. The system converts the Composite Video Broadcast Signal (CVBS) from the CCD camera into Low Voltage Differential Signaling (LVDS), which is collected by the video processing unit through the Camera Link interface. The processed video signals are then input to the system output board and displayed on the monitor. Current experiments show that the system achieves high-quality video conversion with minimal board size.
- Published
- 2014
12. Fast motion detection in coded video streams for a large-scale remote video sensor system
- Author
-
Yong-Sung Kim, Seung-Hwan Kim, Gyu-Hee Park, and Hyung-Joon Cho
- Subjects
Video post-processing ,Computer science ,Real-time computing ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,S-Video ,Smacker video ,Computer vision ,Block-matching algorithm ,Motion compensation ,Video capture ,business.industry ,Motion detection ,computer.file_format ,Video processing ,Motion vector ,Scalable Video Coding ,Quarter-pixel motion ,Video compression picture types ,Uncompressed video ,Rate–distortion optimization ,Video tracking ,Artificial intelligence ,Multiview Video Coding ,business ,computer ,Data compression - Abstract
A large number of remote video sensors are being deployed around the world to collect, store, and analyze real-world data. Since a remote video sensor produces very large amounts of data, the total volume of video data is extremely large in size, complexity, and capacity. Important events from a remote video sensor are closely related to motion in the video. We present in this paper a fast motion detection method based on the number of bits used to encode a video stream and on GOP-level motion detection. A low-complexity measurement of the number of bits is performed on the coded video sequence, and we store and process the coded video stream only if the total number of bits is larger than a pre-defined threshold. We also use GOP-level motion detection to reduce processing overhead compared to the conventional motion-vector-based approach, which processes every frame. Manipulating the number of bits is itself a much easier task than full reconstruction of each pixel of a video frame, and it saves storage cost because only coded video sequences with motion are stored. The proposed method also reduces computational complexity compared to manipulating motion vectors per 4x4 macroblock. To evaluate our method, we deployed a centralized single server connected to H.264-capable remote video sensors. Results on the video sequences show that the proposed approach can process more video sequences than the conventional compressed-domain approach.
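The core test is cheap enough to fit in a few lines. A minimal sketch follows; GOP parsing, byte-level frame sizes, and the threshold policy are assumptions (the paper operates on H.264 streams):

```python
def filter_gops(gops, threshold_bits):
    """Yield only GOPs whose total coded size suggests motion.
    gops: iterable of GOPs, each a list of coded frames as bytes."""
    for gop in gops:
        total_bits = sum(8 * len(frame) for frame in gop)
        if total_bits > threshold_bits:   # motion likely: store/process
            yield gop
```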
- Published
- 2014
13. Extended image differencing for change detection in UAV video mosaics
- Author
-
Günter Saur, Arne Schumann, and Wolfgang Krüger
- Subjects
Motion compensation ,Video post-processing ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image registration ,Image differencing ,Video tracking ,Computer vision ,Artificial intelligence ,Thin plate spline ,business ,Change detection ,ComputingMethodologies_COMPUTERGRAPHICS ,Block-matching algorithm - Abstract
Change detection is one of the most important tasks when using unmanned aerial vehicles (UAV) for video reconnaissance and surveillance. We address changes on a short time scale, i.e., the observations are taken at time distances from several minutes up to a few hours. Each observation is a short video sequence acquired by the UAV in near-nadir view, and the relevant changes are, e.g., recently parked or moved vehicles. In this paper we extend our previous approach of image differencing for single video frames to video mosaics. A precise image-to-image registration combined with a robust matching approach is needed to stitch the video frames into a mosaic. Additionally, this matching algorithm is applied to mosaic pairs in order to align them to a common geometry. The resulting registered video mosaic pairs are the input to the change detection procedure based on extended image differencing. A change mask is generated by an adaptive threshold applied to a linear combination of difference images of intensity and gradient magnitude. The change detection algorithm has to distinguish between relevant and non-relevant changes. Examples of non-relevant changes are stereo disparity at 3D structures of the scene, changed shadow sizes, and compression or transmission artifacts. The special effects of video mosaicking, such as geometric distortions and artifacts at moving objects, have to be considered too. In our experiments we analyze the influence of these effects on the change detection results by considering several scenes. The results show that for video mosaics this task is more difficult than for single video frames. Therefore, we extended the image registration by estimating an elastic transformation using a thin plate spline approach. The results for mosaics are comparable to those for single video frames and are useful for interactive image exploitation due to the larger scene coverage.
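A hedged sketch of the differencing step on two registered grayscale mosaics, using Otsu's method as a stand-in for the paper's adaptive threshold; the weights and normalization are assumptions:

```python
import cv2
import numpy as np

def change_mask(mosaic_a, mosaic_b, w_int=1.0, w_grad=1.0):
    """Threshold a linear combination of intensity and gradient-magnitude
    difference images to obtain a binary change mask."""
    d_int = cv2.absdiff(mosaic_a, mosaic_b).astype(np.float32)

    def grad_mag(img):
        gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
        return cv2.magnitude(gx, gy)

    d_grad = np.abs(grad_mag(mosaic_a) - grad_mag(mosaic_b))
    combined = w_int * d_int + w_grad * d_grad
    combined = cv2.normalize(combined, None, 0, 255,
                             cv2.NORM_MINMAX).astype(np.uint8)
    _, mask = cv2.threshold(combined, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask
```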
- Published
- 2014
14. An HEVC compressed domain content-based video signature for copy detection and video retrieval
- Author
-
Khalid Tahboub, Neeraj J. Gadgil, Edward J. Delp, and Mary L. Comer
- Subjects
Motion compensation ,Video post-processing ,Multimedia ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Video processing ,computer.file_format ,Smacker video ,computer.software_genre ,Scalable Video Coding ,Video compression picture types ,Uncompressed video ,Video tracking ,Computer vision ,Artificial intelligence ,Multiview Video Coding ,business ,computer - Abstract
Video sharing platforms and social networks have been growing very rapidly for the past few years. The rapid increase in the amount of video content introduces many challenges in terms of copyright violation detection and video search and retrieval. Generating and matching content-based video signatures, or fingerprints, is an effective method to detect copies or "near-duplicate" videos. Video signatures should be robust to changes, caused by common signal processing operations, in the video features used to characterize the signature. Recent work has focused on generating video signatures in the uncompressed domain. However, decompression is a computationally intensive operation. In large video databases, it becomes advantageous to create robust signatures directly from the compressed domain. The High Efficiency Video Coding (HEVC) standard has recently been ratified as the latest video coding standard, and widespread adoption is anticipated. We propose a method in which a content-based video signature is generated directly from the HEVC-coded bitstream. Motion vectors from the HEVC-coded bitstream are used as the features. A robust hashing function based on projection onto random matrices is used to generate the hashing bits, and a sequence of these bits serves as the signature for the video. Our experimental results show that our proposed method generates a signature robust to common signal processing techniques such as resolution scaling, brightness scaling, and compression.
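The hashing idea is compact: project the motion-vector field onto random directions and keep the signs. A minimal sketch follows (extracting the vectors from the HEVC bitstream is out of scope here; the dimensions and the sign rule are assumptions):

```python
import numpy as np

def signature_bits(motion_vectors, n_bits=64, seed=0):
    """Hash a frame's motion-vector field into n_bits via projection
    onto random matrices: one sign bit per random direction."""
    rng = np.random.default_rng(seed)        # fixed seed: same matrix
    v = np.asarray(motion_vectors, dtype=np.float32).ravel()
    R = rng.standard_normal((n_bits, v.size))
    return (R @ v > 0).astype(np.uint8)

def hamming(sig_a, sig_b):
    """Signature distance for near-duplicate matching."""
    return int(np.count_nonzero(sig_a != sig_b))
```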
- Published
- 2014
15. Theory and practice of perceptual video processing in broadcast encoders for cable, IPTV, satellite, and internet distribution
- Author
-
Sean T. McCarthy
- Subjects
Motion compensation ,Video post-processing ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Video processing ,computer.file_format ,Smacker video ,Video compression picture types ,Video tracking ,Computer vision ,Artificial intelligence ,Multiview Video Coding ,business ,Encoder ,computer ,Internet video ,Data compression - Abstract
This paper describes the theory and application of a perceptually-inspired video processing technology that was recently incorporated into professional video encoders now being used by major cable, IPTV, satellite, and internet video service providers. We will present data showing that this perceptual video processing (PVP) technology can improve video compression efficiency by up to 50% for MPEG-2, H.264, and High Efficiency Video Coding (HEVC). The PVP technology described in this paper works by forming predicted eye-tracking attractor maps that indicate how likely it might be that a freely viewing person would look at a particular area of an image or video. We will introduce the novel model and supporting theory used to calculate the eye-tracking attractor maps. We will show how the underlying perceptual model was inspired by electrophysiological studies of the vertebrate retina, and will explain how the model incorporates statistical expectations about natural scenes as well as a novel method for predicting error in signal estimation tasks. Finally, we will describe how the eye-tracking attractor maps are created in real time and used to modify video prior to encoding so that it is more compressible but not noticeably different from the original unmodified video.
- Published
- 2014
16. Low-delay cloud-based rendering of free viewpoint video for mobile devices
- Author
-
Chang Wen Chen, Dan Miao, and Wenwu Zhu
- Subjects
Video post-processing ,Multimedia ,Computer science ,business.industry ,Real-time computing ,Cloud computing ,Video quality ,computer.software_genre ,Rendering (computer graphics) ,Codec ,Quality of experience ,Image warping ,business ,computer ,Mobile device - Abstract
Free viewpoint video (FVV) provides immersive experiences in a truly seamless environment. Cloud computing makes it possible to watch FVV on mobile devices through remote rendering, in which the synthesized view is rendered on a cloud server and transmitted to mobile devices. However, reducing the interaction delay during viewpoint switching is a challenging problem in remote rendering. In this paper, we propose a low-delay cloud-based FVV rendering framework to support FVV on mobile devices with satisfactory video quality and low interaction delay. In our framework, a rendering allocation scheme is proposed in which local rendering is introduced on mobile devices during viewpoint switching to conceal the interaction delay. To support the local rendering, side information is generated in the cloud based on viewpoint prediction and a 3D warping rule and then compressed by a standard video codec. The experimental results show that the proposed remote rendering framework can substantially improve the quality of experience, with improved free viewpoint video quality and low interaction delay on mobile devices.
- Published
- 2013
17. Augmented video calls on mobile devices
- Author
-
Pengwei Wang, Fengjun Lv, and Fengqing Zhu
- Subjects
Video post-processing ,Computer science ,Video capture ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Video processing ,computer.file_format ,Smacker video ,Video compression picture types ,Uncompressed video ,Microsoft Video 1 ,Computer graphics (images) ,Video tracking ,Computer vision ,Segmentation ,Artificial intelligence ,Multiview Video Coding ,business ,computer - Abstract
An apparatus comprises a processor configured to: process, automatically and in real time, segmentation of a video object from a portion of a video, wherein the video object is a foreground of the video, and wherein a remaining portion of the video is a background of the video; and remove the background.
- Published
- 2013
18. Rate-adaptive compressive video acquisition with sliding-window total-variation-minimization reconstruction
- Author
-
Dimitris A. Pados and Ying Liu
- Subjects
Motion compensation ,Video post-processing ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Scalable Video Coding ,Video compression picture types ,Uncompressed video ,Distortion ,Video denoising ,Computer vision ,Artificial intelligence ,Multiview Video Coding ,business ,Block-matching algorithm - Abstract
We consider a compressive video acquisition system in which frame blocks are sensed independently. Varying block sparsity is exploited in the form of individual per-block open-loop sampling rate allocation with minimal system overhead. At the decoder, video frames are reconstructed via sliding-window inter-frame total variation minimization. Experimental results demonstrate that such rate-adaptive compressive video acquisition noticeably improves the rate-distortion performance of the video stream over fixed-rate acquisition approaches.
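As a toy illustration of the reconstruction side, here is plain gradient descent on a data-fidelity term plus a smoothed total-variation penalty for a single frame. The paper's sliding-window inter-frame solver is more elaborate; the step size, smoothing epsilon, and iteration count below are arbitrary.

```python
import numpy as np

def tv_reconstruct(A, y, shape, lam=0.1, lr=1e-3, iters=300, eps=1e-6):
    """Minimize ||A x - y||^2 + lam * TV(x) for one frame, with an
    epsilon-smoothed (differentiable) total variation."""
    x = np.zeros(np.prod(shape), dtype=np.float64)
    for _ in range(iters):
        grad_fid = 2.0 * A.T @ (A @ x - y)          # data fidelity
        img = x.reshape(shape)
        dx = np.diff(img, axis=1, append=img[:, -1:])
        dy = np.diff(img, axis=0, append=img[-1:, :])
        mag = np.sqrt(dx**2 + dy**2 + eps)
        # divergence of the normalized gradient field ~ -d(TV)/dx
        div = (np.diff(dx / mag, axis=1, prepend=0) +
               np.diff(dy / mag, axis=0, prepend=0))
        x -= lr * (grad_fid - lam * div.ravel())
    return x.reshape(shape)
```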
- Published
- 2013
19. Edge adaptive intra field de-interlacing of video images
- Author
-
Gregory Smith, Vladimir Lachine, and Louie Lee
- Subjects
Video post-processing ,Pixel ,business.industry ,Computer science ,Orientation (computer vision) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Interlacing ,Image processing ,Edge enhancement ,Video processing ,Scale factor ,Filter (video) ,Video tracking ,Computer vision ,Video denoising ,Artificial intelligence ,Visual artifact ,business ,Interpolation - Abstract
Expanding an image by an arbitrary scale factor, thereby creating an enlarged image, is a crucial image processing operation. De-interlacing is an example of such an operation, in which a video field is enlarged in the vertical direction by a 1:2 scale factor. The most advanced de-interlacing algorithms use a few consecutive input fields to generate one output frame. To save hardware resources in video processors, missing lines in each field may be generated without reference to the other fields. Line doubling, known as "bobbing", is the simplest intra-field de-interlacing method, but it may generate visual artifacts. For example, interpolating an inserted line from a few neighboring lines with a vertical filter may produce visual artifacts such as "jaggies." In this work we present an edge-adaptive image up-scaling and/or enhancement algorithm that can produce jaggy-free video output frames. As a first step, an edge and its parameters are detected at each interpolated pixel from the gradient squared tensor based on local signal variances. Then, according to the edge parameters, including orientation, anisotropy, and variance strength, the algorithm determines the footprint and frequency response of the two-dimensional interpolation filter for the output pixel. The filter's coefficients are defined by the edge parameters, so the quality of the output frame is controlled by local content. The proposed method may be used for image enlargement or enhancement (for example, anti-aliasing without resampling). It has been implemented in hardware in a video display processor for intra-field de-interlacing of video images.
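A much-simplified cousin of this idea is edge-based line averaging (ELA), which picks the best of three interpolation directions per missing pixel instead of deriving a full tensor-driven 2-D filter. It is sketched below as a point of reference, not as the authors' algorithm.

```python
import numpy as np

def ela_interpolate(field):
    """Interpolate the lines between consecutive field lines: for each
    pixel, average along whichever of three directions (45°, vertical,
    135°) best matches the lines above and below."""
    above = field[:-1].astype(np.float32)
    below = field[1:].astype(np.float32)
    h, w = above.shape
    out = np.empty_like(above)
    for x in range(w):
        xl, xr = max(x - 1, 0), min(x + 1, w - 1)
        costs = np.stack([np.abs(above[:, xl] - below[:, xr]),
                          np.abs(above[:, x] - below[:, x]),
                          np.abs(above[:, xr] - below[:, xl])])
        vals = np.stack([(above[:, xl] + below[:, xr]) / 2,
                         (above[:, x] + below[:, x]) / 2,
                         (above[:, xr] + below[:, xl]) / 2])
        best = np.argmin(costs, axis=0)
        out[:, x] = vals[best, np.arange(h)]
    return out
```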
- Published
- 2013
20. Efficient streaming of stereoscopic depth-based 3D videos
- Author
-
Ghassan AlRegib, Mashhour Solh, Dogancan Temel, and Mohammed A. Aabed
- Subjects
Video post-processing ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,2D to 3D conversion ,Color space ,Luminance ,Video compression picture types ,Depth map ,Chrominance ,Codec ,Computer vision ,Artificial intelligence ,Multiview Video Coding ,Depth perception ,business ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
In this paper, we propose a method to extract depth from motion, texture, and intensity. We first analyze the depth map to extract a set of depth cues. Then, based on these depth cues, we process the colored reference video, using its texture, motion, luminance, and chrominance content, to extract the depth map. Each channel in the YCrCb color space is processed separately. We tested this approach on different video sequences with different monocular properties. The results of our simulations show that the extracted depth maps generate a 3D video with quality close to that of the video rendered using the ground-truth depth map. We report objective results using 3VQM and subjective analysis via comparison of rendered images. Furthermore, we analyze the savings in bitrate from eliminating the need for two video codecs, one for the reference color video and one for the depth map. In this case, only the depth cues are sent as side information with the color video.
- Published
- 2013
21. Content adaptive enhancement of video images
- Author
-
Louie Lee, Gregory Smith, and Vladimir Lachine
- Subjects
Video post-processing ,Video capture ,Computer science ,business.industry ,Digital video ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,S-Video ,Video processing ,computer.file_format ,Video quality ,Smacker video ,Video compression picture types ,Uncompressed video ,Histogram ,Deflicking ,Video tracking ,Digital image processing ,Video denoising ,Computer vision ,Artificial intelligence ,Multiview Video Coding ,business ,computer - Abstract
Digital video products such as TVs, set-top boxes, and DV players have circuits that enhance the quality of incoming video content. Users may control the parameters of these circuits according to the video source for optimum quality. However, there is a need for a procedure that adjusts these parameters automatically, without user interaction. A three-stage method for content adaptive enhancement of video images (CAEVI) in display processors is proposed. The first stage measures video signal statistics, such as intensity and frequency histograms, over the image's active area. The next stage generates control parameters for the image processing blocks after analyzing the measured statistics: one of four quality classes (low, medium, high, or special) is assigned to the incoming video, and a set of predefined control parameters for that class is selected. In the third stage, the set of control parameters is applied to the corresponding image processing blocks to reduce noise, improve signal transitions, enhance spatial details, contrast, brightness, and saturation, and resample the video image. Video signal statistics are measured and accumulated for each frame, and control parameters are gradually adjusted on a scene basis. The measuring and processing blocks are implemented in hardware to provide real-time response, while the image analysis and quality classification algorithm is implemented in embedded software for flexibility. The proposed method has been implemented in a video processor as the "Auto HQV" feature. Originally developed for TVs, it is currently being adapted for hand-held devices.
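A minimal sketch of the measure-classify-configure loop; the statistics, class boundaries, and parameter sets below are invented for illustration (the product's actual classes are low/medium/high/special):

```python
import numpy as np

def classify_and_configure(frame_gray):
    """Stage 1: measure signal statistics; stage 2: pick a quality
    class and its canned control parameters (illustrative values)."""
    p5, p95 = np.percentile(frame_gray, [5, 95])
    contrast = p95 - p5
    # crude high-frequency energy estimate from horizontal differences
    hf = np.abs(np.diff(frame_gray.astype(np.int16), axis=1)).mean()
    if contrast < 60 or hf < 1.5:          # soft, flat signal
        return "low", {"noise_reduction": 0.8, "detail_gain": 0.2}
    if contrast < 140:
        return "medium", {"noise_reduction": 0.5, "detail_gain": 0.5}
    return "high", {"noise_reduction": 0.2, "detail_gain": 0.8}
```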
- Published
- 2012
22. Directional frame interpolation for MPEG compressed video
- Author
-
Debin Zhao, Chang Zhao, Xinwei Gao, and Xiaopeng Fan
- Subjects
Demosaicing ,Video post-processing ,Pixel ,business.industry ,Computer science ,Quantization (signal processing) ,Wiener filter ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Bilinear interpolation ,Stairstep interpolation ,Video compression picture types ,Multivariate interpolation ,Uncompressed video ,symbols.namesake ,Nearest-neighbor interpolation ,symbols ,Image scaling ,Bicubic interpolation ,Computer vision ,Artificial intelligence ,Motion interpolation ,business ,Interpolation - Abstract
Image interpolation is one of the most elementary imaging research topics, and a number of image interpolation methods have been developed for uncompressed images in the literature. However, many videos have already been stored in MPEG-2 format or must be transmitted in MPEG-2 format due to bandwidth limitations. The image interpolation methods developed for uncompressed images may not be effective when directly applied to compressed videos because, on the one hand, they do not utilize the information present in the coded bitstreams and, on the other hand, they do not consider quantization error, which may be dominant in some cases. Inspired by the success of intra prediction in H.264/AVC and of edge-directed image interpolation methods (such as LAZA and NEDI), we propose a directional frame interpolation for MPEG compressed video. In the proposed method, 8×8 intra blocks in I frames are first classified into the nine block directions in the transform domain, and the interpolation of each block is then performed along its block direction. For each block direction, an optimal Wiener filter is trained on representative video sequences and then used for interpolation. Similarly, for each pixel in an inter block in P or B frames, the interpolation is performed along the direction of its corresponding reference block. The experimental results demonstrate that the proposed method achieves better performance than traditional linear methods such as bicubic and bilinear and edge-directed methods such as LAZA and NEDI, while keeping the computational complexity low enough for practical applications.
- Published
- 2012
23. A study on the impact of compression and packet losses on rendered 3D views
- Author
-
Maria G. Martini, Harsha D. Appuhami, and Chaminda T. E. R. Hewage
- Subjects
Video post-processing ,Computer science ,business.industry ,Network packet ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Video quality ,Real-time rendering ,Video compression picture types ,Rendering (computer graphics) ,Video tracking ,Computer vision ,Artificial intelligence ,business ,Alternate frame rendering - Abstract
In 3D video delivery, the rendered 3D video quality at the receiver side can be affected by rendering artifacts as well as by concealment errors that occur in the process of recovering missing 3D video packets. It is therefore vital to have an understanding of these artifacts prior to transmitting data. This work proposes a model to quantify rendering and concealment errors at the sender side and to use the information generated by the model to deliver 3D video content effectively.
- Published
- 2012
24. An efficient and effective video similarity search method
- Author
-
Zheng Cao, Liuzhang Zhu, and Zimian Li
- Subjects
Motion compensation ,Video post-processing ,Information retrieval ,Computer science ,Nearest neighbor search ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,computer.file_format ,Smacker video ,computer.software_genre ,Video compression picture types ,Similarity (network science) ,Video tracking ,Data mining ,computer ,Block-matching algorithm - Abstract
The increasing popularity of online video sharing and video-on-demand systems has made video similarity search a hot research field in content-based video retrieval. There is still no satisfactory fast, scalable video similarity search method for large databases. To address the two challenging problems of similarity measurement and search method, a novel efficient video similarity search approach is proposed in this paper. The video features are represented by an image characteristic code based on the statistics of the spatial-temporal distribution of video frame sequences. Video similarity is measured by counting matching video components. To meet scalable computing requirements, an efficient search method based on a clustering index table is presented. Experimental results from query tests on a large database show that this method is highly efficient and effective for similar video search.
- Published
- 2011
25. Mosaic-guided video retargeting for video adaptation
- Author
-
Tzu-Chieh Yen, Chia-Wen Lin, and Tsai Chia-Ming
- Subjects
Motion compensation ,Video post-processing ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Video compression picture types ,Uncompressed video ,Distortion ,Video tracking ,Retargeting ,Computer vision ,Video denoising ,Artificial intelligence ,Multiview Video Coding ,business ,Block-matching algorithm - Abstract
Video retargeting from a full-resolution video to a lower-resolution display will inevitably cause information loss. Content-aware video retargeting techniques have been studied to avoid losing critical visual information while resizing a video. In this paper, we propose a mosaic-guided video retargeting scheme to ensure good spatio-temporal coherence of the downscaled video. In addition, a rate-distortion optimization framework is proposed to maximize the information retained in the downscaled video.
- Published
- 2011
26. Extending JPEG-LS for low-complexity scalable video coding
- Author
-
Anton Sergeev, Soren Forchhammer, and Anna Ukhanova
- Subjects
Video post-processing ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Data_CODINGANDINFORMATIONTHEORY ,Lossy compression ,Smacker video ,JPEG2000 ,Computer vision ,Lossless compression ,Motion compensation ,business.industry ,computer.file_format ,Video processing ,H.264/SVC ,Scalable JPEG-LS ,JPEG ,Scalable Video Coding ,Wireless video transmission ,Video compression picture types ,Uncompressed video ,Video tracking ,JPEG 2000 ,Artificial intelligence ,Multiview Video Coding ,business ,computer ,Computer hardware ,Context-adaptive binary arithmetic coding ,Image compression ,Data compression - Abstract
JPEG-LS, the well-known international standard for lossless and near-lossless image compression, was originally designed for non-scalable applications. In this paper we propose a scalable modification of JPEG-LS and compare it with the leading image and video coding standards JPEG2000 and H.264/SVC intra under the low-complexity constraints of some wireless video applications, including graphics.
- Published
- 2011
27. Image quality of up-converted 2D video from frame-compatible 3D video
- Author
-
Ronald Renaud, Phil Blanchfield, Carlos Vázquez, Filippo Speranza, and Wa James Tam
- Subjects
Motion compensation ,Video post-processing ,Computer science ,Image quality ,Video capture ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,S-Video ,Video processing ,computer.file_format ,Video quality ,Smacker video ,Scalable Video Coding ,Video compression picture types ,Uncompressed video ,Video tracking ,Computer graphics (images) ,Codec ,Computer vision ,Artificial intelligence ,Multiview Video Coding ,business ,computer ,Composite video ,Block-matching algorithm - Abstract
In the stereoscopic frame-compatible format, the separate high-definition left and right views are reduced in resolution and packed to fit within the same video frame as a conventional two-dimensional high-definition signal. This format has been suggested for 3DTV since it does not require additional transmission bandwidth and entails only small changes to the existing broadcasting infrastructure. In some instances, the frame-compatible format might be used to deliver both 2D and 3D services, e.g., for over-the-air television services. In those cases, the video quality of the 2D service is bound to decrease, since the 2D signal will have to be generated by up-converting one of the two views. In this study, we investigated such loss by measuring the perceptual image quality of 1080i and 720p up-converted video compared to that of full-resolution original 2D video. The video was encoded with either an MPEG-2 or an H.264/AVC codec at different bit rates and presented for viewing either without polarized glasses (2D viewing mode) or with polarized glasses (3D viewing mode). The results confirmed a loss of video quality in the up-converted 2D material. The loss due to the sampling processes inherent in the frame-compatible format was rather small for both 1080i and 720p video formats; the loss became more substantial with encoding, particularly MPEG-2 encoding. The 3D viewing mode yielded higher quality ratings, possibly because the visibility of the degradations was reduced.
- Published
- 2011
28. A color video compression technique using key frames and a low complexity color transfer
- Author
-
Varaprasad Gude, Rakesh Agarwal, and Sumana Gupta
- Subjects
Video post-processing ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,S-Video ,Data_CODINGANDINFORMATIONTHEORY ,Residual frame ,Smacker video ,Computer Science::Multimedia ,Color depth ,Codec ,Computer vision ,Composite video ,Block-matching algorithm ,Motion compensation ,business.industry ,Video capture ,computer.file_format ,Video processing ,Video compression picture types ,Uncompressed video ,Video tracking ,RGB color model ,Key frame ,Video denoising ,Artificial intelligence ,Multiview Video Coding ,business ,computer ,Encoder ,Data compression ,Reference frame - Abstract
In this work, a novel method for color video compression using key-frame-based color transfer is proposed. In this scheme, compression is achieved by discarding the color information of all but a few selected frames. These selected frames are either key frames (selected by a key frame selection algorithm) or intra-coded (I) frames. The partially colored video is compressed using a standard encoder, thereby achieving higher compression. In the proposed decoder, a standard decoder first reconstructs the partially colored video sequence from the compressed input, and a color transfer algorithm then generates the fully colored video sequence. The complexity of the proposed decoder is close to that of a standard decoder, allowing its use in a wide variety of applications such as video broadcasting, video streaming, and hand-held devices.
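A crude decoder-side sketch: keep each decoded frame's own luminance and borrow the chroma of the nearest colored key frame. This assumes the frames are similar in content and equal in size; the paper's actual color transfer algorithm is more sophisticated.

```python
import cv2

def colorize_from_key(gray_frame, key_frame_bgr):
    """Reuse the key frame's chroma channels while keeping the decoded
    frame's luminance (the LAB 'L' channel)."""
    lab = cv2.cvtColor(key_frame_bgr, cv2.COLOR_BGR2LAB)
    lab[:, :, 0] = gray_frame          # replace luminance only
    return cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
```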
- Published
- 2011
29. Video transcoding using GPU accelerated decoder
- Author
-
Wei-Lien Hsu
- Subjects
Video post-processing ,business.industry ,Computer science ,Video decoder ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Transcoding ,Video processing ,Graphics ,business ,computer.software_genre ,computer ,Computer hardware - Abstract
Due to the growing popularity of portable multimedia display devices and the wide availability of high-definition video content, transcoding high-resolution videos into lower-resolution ones with different formats has become a crucial challenge for PC platforms. This paper presents our study on leveraging the Unified Video Decoder (UVD) provided by the graphics processing unit (GPU) to achieve high-speed video transcoding with low CPU usage. Our experimental results show that off-loading video decoding and video scaling to the GPU can double the transcoding speed with only half the CPU usage, compared to in-box software decoders, when transcoding 1080p (1920x1080) video content on an AMD Vision processor with an integrated graphics unit.
- Published
- 2011
30. Video quality management for mobile video application
- Author
-
Khaled Helmi El-Maleh, Kai-Chieh Yang, and Vasudev Bhaskaren
- Subjects
Video post-processing ,Multimedia ,Video capture ,Computer science ,Video processing ,computer.file_format ,Video quality ,computer.software_genre ,Smacker video ,Video compression picture types ,Uncompressed video ,Video tracking ,Human visual system model ,PEVQ ,Encoder ,computer ,Subjective video quality ,Data compression - Abstract
This paper first briefly reviews the sources of visual quality degradation during video compression and different video quality assessment techniques. It then extends the discussion beyond video compression to the other modules in various video application pipelines. Each video application is composed of different processing modules, such as the sensor, video encoder, and display, and visual experience is not always determined by a single module. Hence, the way visual experience is quantified should vary by application. Furthermore, users have very different expectations of visual experience in each application, so the quality assessment approach should be chosen according to those expectations.
- Published
- 2010
31. 3D video coding: an overview of present and upcoming standards
- Author
-
Philipp Merkle, Thomas Wiegand, and Karsten Muller
- Subjects
H.262/MPEG-2 Part 2 ,Video post-processing ,Multimedia ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Signal compression ,020206 networking & telecommunications ,02 engineering and technology ,computer.file_format ,Video processing ,Smacker video ,computer.software_genre ,Scalable Video Coding ,Video compression picture types ,Uncompressed video ,Microsoft Video 1 ,Video tracking ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Multiview Video Coding ,computer ,Data compression - Abstract
An overview of existing and upcoming 3D video coding standards is given. Various 3D video formats are available, each with individual pros and cons. The 3D video formats can be separated into two classes: video-only formats (such as stereo and multiview video) and depth-enhanced formats (such as video plus depth and multiview video plus depth). Since all these formats consist of at least two video sequences and possibly additional depth data, efficient compression is essential for the success of 3D video applications and technologies. For the video-only formats, the H.264 family of coding standards already provides efficient and widely established compression algorithms: H.264/AVC simulcast, the H.264/AVC stereo SEI message, and H.264/MVC. For the depth-enhanced formats, standardized coding algorithms are currently being developed. New and specially adapted coding approaches are necessary, as the depth or disparity information included in these formats has significantly different characteristics than video and is not displayed directly but used for rendering. Motivated by evolving market needs, MPEG has started an activity to develop a generic 3D video standard within the 3DVC ad-hoc group. Key features of the standard are efficient and flexible compression of depth-enhanced 3D video representations and decoupling of content creation and display requirements.
- Published
- 2010
32. Immersive haptic interaction with media
- Author
-
A.M. Tekalp, N. Dindar, and Cagatay Basdogan
- Subjects
Haptic interaction ,Haptic motion ,Haptic structure ,Video post-processing ,Video-plus-depth representation ,InformationSystems_INFORMATIONINTERFACESANDPRESENTATION(e.g.,HCI) ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Electrical and electronic engineering ,Optics ,Cursor (databases) ,Motion (physics) ,InformationSystems_MODELSANDPRINCIPLES ,Stereotaxy ,Computer graphics (images) ,Immersion (virtual reality) ,Representation (mathematics) ,ComputingMethodologies_COMPUTERGRAPHICS ,Haptic technology - Abstract
New 3D video representations enable new modalities of interaction, such as haptic interaction, with 2D and 3D video for truly immersive media applications. Haptic interaction with video includes haptic structure and haptic motion for new immersive experiences. It is possible to compute haptic structure signals from 3D scene geometry or depth information. This paper introduces the concept of haptic motion, as well as new methods to compute haptic structure and motion signals for the 2D video-plus-depth representation. The resulting haptic signals can be rendered using a haptic cursor attached to a 2D or 3D video display.
- Published
- 2010
33. Approaches to 3D video compression
- Author
-
Toshihiko Yamasaki, Kiyoharu Aizawa, and Seung-Ryong Han
- Subjects
Motion compensation ,Video post-processing ,Computer science ,business.industry ,Video processing ,computer.file_format ,Smacker video ,Video compression picture types ,Uncompressed video ,Video tracking ,Computer vision ,Artificial intelligence ,Multiview Video Coding ,business ,computer ,Data compression ,Block-matching algorithm - Abstract
Three-dimensional (3-D) video provides an immersive experience for users. In recent years, many attempts have been made to capture the complex surface shape and highly detailed texture of real-world moving objects, which results in a huge amount of data. In this paper, we discuss compression issues for 3-D video. We introduce 3-D video, which we classify into two categories, and survey the compression methods that have been investigated for each category. We present our compression methods for temporally varying mesh sequences. In addition, we show comparison results for our algorithm with respect to previous work.
- Published
- 2010
34. Streaming video for distributed simulation
- Author
-
Steven Webster and Douglas J. Paul
- Subjects
Video post-processing ,Video capture ,Computer science ,Real-time computing ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,computer.file_format ,Video processing ,Smacker video ,Video compression picture types ,Uncompressed video ,Microsoft Video 1 ,Video tracking ,Codec ,Encoder ,computer - Abstract
Distributed simulation environments are increasingly using video to stimulate operational systems and their prototypical equivalents. Traditionally, this video has been synthesized and delivered by analog means to consuming software applications. Scene generators typically render to commodity video cards, generate out-of-band metadata, and convert their outputs to formats compatible with the stimulated systems. However, the approach becomes hardware intensive as environment scale and distribution requirements grow. Streaming video technologies can be applied to decouple video sources from their consumers, thereby enabling video channel quantities beyond the rendering hardware's outputs. Moreover, metadata describing the video content can be multiplexed, thereby ensuring temporal registration between video and its attribution. As an application of this approach, the Night Vision Image Generator (NVIG) has been extended and integrated with distribution architectures to deliver streaming video in virtual simulation environments. Video capture hardware emulation and application frame buffer reads are considered for capturing rendered scenes. Video source to encoder bindings and content multiplexing are realized by combining third-party video codec, container, and transport implementations with original metadata encoders. Readily available commercial and open source solutions are utilized for content distribution and demultiplexing to a variety of formats and clients. Connected and connectionless distribution approaches are discussed with respect to latency and reliability. Client-side scalability, latency, and initialization issues are addressed. Finally, the solution is applied to tactical system stimulation and training, showing the evolution from the analog to the streaming video approach.
- Published
- 2010
35. Temporal fusion for de-noising of RGB video received from small UAVs
- Author
-
Kyle J. Hildebrand and Amber Fischer
- Subjects
Image fusion ,Motion compensation ,Video post-processing ,business.industry ,Video capture ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Video processing ,Object detection ,Video compression picture types ,Uncompressed video ,Distortion ,Video tracking ,RGB color model ,Computer vision ,Video denoising ,Artificial intelligence ,Multiview Video Coding ,business - Abstract
Monitoring video data sources received from UAVs is especially challenging because of the quality of the video received. Due to the individual characteristics of the unmanned platform and the changing environment, the important elements in the scene are not always observable or easily identified. In addition to typical sensor noise, significant image degradation can occur during transmission of the video from an airborne platform. Interference from other transmitters, analog noise in the embedded avionics, and multi-path effects can corrupt the video signal during transmission, introducing distortion in the video received on the ground. In some cases, the loss of signal is so severe that no information is received in portions of an image frame. To improve the corrupted video, we capitalize on the oversampling in the temporal domain (across video frames), applying a data fusion approach to de-noise the video. The resulting video retains the significant scene content and dynamics without distracting noise artifacts. This allows humans to easily ingest the information from the video and makes it possible to apply further video exploitation algorithms such as object detection and tracking.
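The simplest form of such temporal fusion is a per-pixel median over a short window of aligned frames, which suppresses the impulsive dropouts described above. A minimal sketch, assuming the frames are already motion-compensated:

```python
import numpy as np

def temporal_median(frames):
    """Fuse a short window of registered frames with a per-pixel
    temporal median; frames is a list of equal-sized uint8 images."""
    stack = np.stack(frames).astype(np.float32)   # (T, H, W[, C])
    return np.median(stack, axis=0).astype(np.uint8)
```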
- Published
- 2010
36. Fast and efficient search for MPEG-4 video using adjacent pixel intensity difference quantization histogram feature
- Author
-
Koji Kotani, Feifei Lee, Tadahiro Ohmi, and Qiu Chen
- Subjects
Motion compensation ,Video post-processing ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,computer.file_format ,Smacker video ,Video compression picture types ,Video tracking ,Video denoising ,Computer vision ,Artificial intelligence ,Multiview Video Coding ,business ,computer ,Block-matching algorithm - Abstract
In this paper, a fast search algorithm for MPEG-4 video clips in a video database is proposed. An adjacent pixel intensity difference quantization (APIDQ) histogram is utilized as the feature vector of each VOP (video object plane); this feature had previously been applied reliably to human face recognition. Instead of the fully decompressed video sequence, partially decoded data, namely the DC sequence of the video object, is extracted from the video stream. Combined with active search (a temporal pruning algorithm), fast and robust video search can be realized. The proposed search algorithm has been evaluated on a total of 15 hours of video containing TV programs such as dramas, talk shows, and news, searching for 200 given MPEG-4 video clips, each 15 seconds long. Experimental results show that the proposed algorithm can detect a similar video clip in merely 80 ms, with an Equal Error Rate (EER) of 2% in the drama and news categories, which is more accurate and robust than the conventional fast video search algorithm.
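A simplified, hypothetical sketch of an APIDQ-style feature (the paper's exact quantizer design is not given in the abstract): adjacent-pixel intensity differences are quantized into a fixed number of bins and histogrammed into a per-frame feature vector, and clips are compared by the distance between their averaged histograms.

```python
import numpy as np

def apidq_histogram(gray, bins=32):
    """gray: HxW uint8 luminance plane, e.g. a decoded DC image of a VOP."""
    diff = gray[:, 1:].astype(np.int16) - gray[:, :-1].astype(np.int16)
    # Quantize the signed differences in [-255, 255] into `bins` levels.
    q = np.clip((diff + 256) * bins // 512, 0, bins - 1)
    hist = np.bincount(q.ravel(), minlength=bins).astype(np.float32)
    return hist / hist.sum()   # normalize so frames of any size compare

def clip_distance(hists_a, hists_b):
    """L1 distance between the temporally averaged histograms of two clips."""
    return float(np.abs(np.mean(hists_a, axis=0) - np.mean(hists_b, axis=0)).sum())
```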
- Published
- 2010
37. A memory-efficient and time-consistent filtering of depth map sequences
- Author
-
Karen Egiazarian, Atanas Gotchev, Sergey Smirnov, Astola, Jaakko, Egiazarian, Katen, Tampere University, Department of Signal Processing, Research group: Computational Imaging-CI, Research group: 3D MEDIA, and Research group: Algebraic and Algorithmic Methods in Signal Processing AAMSP
- Subjects
Video post-processing ,Pixel ,Computer science ,business.industry ,Color image ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,113 Computer and information sciences ,Grayscale ,Rendering (computer graphics) ,Depth map ,Video tracking ,Computer vision ,Artificial intelligence ,Bilateral filter ,business - Abstract
'View plus depth' is a 3D video representation in which a single color video channel is augmented with per-pixel depth information in the form of a gray-scale video sequence. This representation is a good candidate for 3D video delivery applications, as it is display agnostic and allows for some parallax adjustments. However, the quality of the associated depth is an issue, as the depth channel is usually the result of an estimation procedure based on stereo correspondences or comes from a noisy, low-resolution range sensor. Therefore, proper filtering of the depth channel is needed before it is used for compression and/or view rendering. The problem is even more pronounced in video, where temporal consistency of the depth sequence is required. In this paper, we propose a filtering approach to refine the quality of noisy, blocky, and temporally inconsistent depth maps. We utilize color constraints from the video channel and modify a previous super-resolution approach to tackle temporal consistency for video. Our implementation is fast and highly memory efficient. We present filtering results demonstrating the superiority of the developed technique.
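A minimal sketch of the underlying idea of color-constrained depth filtering, written as a cross-bilateral filter in which the range weight comes from the color frame so that depth edges align with color edges; this is not the authors' super-resolution formulation, and all parameters are illustrative:

```python
import numpy as np

def cross_bilateral_depth(depth, color, radius=3, sigma_s=2.0, sigma_r=12.0):
    """depth: HxW float array; color: HxWx3 uint8 frame aligned with depth."""
    H, W = depth.shape
    out = np.zeros_like(depth, dtype=np.float64)
    colf = color.astype(np.float64)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(ys ** 2 + xs ** 2) / (2 * sigma_s ** 2))
    pad_d = np.pad(depth, radius, mode='edge')
    pad_c = np.pad(colf, ((radius, radius), (radius, radius), (0, 0)), mode='edge')
    for y in range(H):
        for x in range(W):
            patch_d = pad_d[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            patch_c = pad_c[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            # Range weight from the color frame, not from the noisy depth.
            cdiff = ((patch_c - colf[y, x]) ** 2).sum(axis=-1)
            w = spatial * np.exp(-cdiff / (2 * sigma_r ** 2))
            out[y, x] = (w * patch_d).sum() / w.sum()
    return out
```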
- Published
- 2010
38. Efficient generation of holographic video of 3D objects by use of redundancy of image and look-up table methods
- Author
-
Jae-Eun Kang, Eun-Soo Kim, and Seung-Cheol Kim
- Subjects
Video post-processing ,Pixel ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Holography ,Stereo display ,Computer-generated holography ,law.invention ,law ,Video tracking ,Lookup table ,Redundancy (engineering) ,Computer vision ,Artificial intelligence ,business - Abstract
In this paper, a new method for efficient generation of video holograms for 3-D video is proposed, based on the combined use of the redundant data of 3-D video and look-up table techniques. 3-D video is a collection of sequential 3-D images carrying depth as well as intensity data, and neighboring frames differ only slightly from each other; a method for fast computation of CGH patterns is therefore proposed that combines this temporal redundancy with look-up table techniques. Furthermore, adjacent pixels of a 3-D image have very similar intensity and depth values, and some even share exactly the same values; in other words, a 3-D image also exhibits spatial redundancy in its intensity and depth data, and a method for fast computation of CGH patterns that takes this spatial redundancy into account is proposed as well. To confirm the feasibility of the proposed method, experiments with a 3-D test object are carried out, and the results are compared with those of conventional methods in terms of computational speed and required memory size.
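The core trick of the temporal-redundancy method, recomputing only what changed between consecutive frames, can be illustrated with a toy point-source hologram. The Fresnel fringe model below stands in for the paper's look-up-table entries, and every name and parameter is illustrative:

```python
import numpy as np

def fringe(point, shape, wavelength=532e-9, pitch=10e-6):
    """Fresnel zone pattern contributed by one (px, py, z, amplitude) point;
    px, py are in hologram pixel units, z is the depth in meters."""
    H, W = shape
    yy, xx = np.mgrid[0:H, 0:W]
    px, py, z, amp = point
    r2 = ((xx - px) * pitch) ** 2 + ((yy - py) * pitch) ** 2
    return amp * np.cos(np.pi * r2 / (wavelength * z))

def update_hologram(holo, prev_pts, cur_pts, shape):
    """Incrementally update frame t's hologram from frame t-1's: subtract
    fringes of points that vanished, add fringes of points that appeared."""
    for p in prev_pts - cur_pts:   # points only in the previous frame
        holo -= fringe(p, shape)
    for p in cur_pts - prev_pts:   # points only in the current frame
        holo += fringe(p, shape)
    return holo
```

For frame 0 the hologram is built by summing fringe() over all points; each later frame then costs only the set difference between consecutive point clouds.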
- Published
- 2009
39. Rate-Adaptive Video Compression (RAVC) Universal Video Stick (UVS)
- Author
-
David L. Hench
- Subjects
Video post-processing ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,S-Video ,Smacker video ,Video quality ,computer.software_genre ,Videoconferencing ,MPEG-4 ,Computer vision ,Composite video ,Block-matching algorithm ,Motion compensation ,business.industry ,Video capture ,Signal compression ,computer.file_format ,Video processing ,Scalable Video Coding ,Video compression picture types ,Video tracking ,Artificial intelligence ,Multiview Video Coding ,business ,computer ,Encoder ,Network Abstraction Layer ,Computer hardware ,Group of pictures ,Image compression ,Data compression - Abstract
The H.264 video compression standard, also known as MPEG-4 Part 10 or Advanced Video Coding (AVC), allows new flexibility in the use of video on the battlefield. Effectively exploiting the standard's increased capabilities requires dedicated encoder chips. Such chips are designed to cover the full range of the standard, with designers of individual products given the capability of selecting the parameters that differentiate a broadcast system from a video conferencing system. The SmartCapture commercial product and the Universal Video Stick (UVS) military version are about the size of a thumb drive, with analog video input and USB (Universal Serial Bus) output, and allow the user to select imaging parameters on the fly, choosing video bandwidth (and video quality) along four dimensions of quality without stopping video transmission. The four dimensions are: 1) spatial, changing from 720 x 480 pixels to 320 x 360 pixels to 160 x 180 pixels; 2) temporal, changing from 30 frames/sec to 5 frames/sec; 3) transform quality, with a 5-to-1 range; and 4) Group of Pictures (GOP) structure, which affects noise immunity. The host processor simply wraps the H.264 network abstraction layer packets into the appropriate network packets. We also discuss the recently adopted scalable amendment to H.264, which will allow RAVC at any point in the communication chain by discarding preselected packets.
- Published
- 2009
40. Compressed stereoscopic video quality metric
- Author
-
Jungdong Seo, Dong-Hyun Kim, and Kwanghoon Sohn
- Subjects
Motion compensation ,Video post-processing ,Image quality ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Video quality ,Video compression picture types ,Rate–distortion optimization ,Video tracking ,Computer vision ,Video denoising ,Artificial intelligence ,Multiview Video Coding ,PEVQ ,business ,Subjective video quality ,Image compression ,Data compression ,Block-matching algorithm - Abstract
Stereoscopic video delivers depth perception to users, unlike 2-dimensional video, so a new video quality assessment model is needed for stereoscopic content. In this paper, we propose a new method for objective quality assessment of stereoscopic video. The proposed method detects blocking artifacts and degradation in edge regions, as in conventional video quality assessment models. In addition, it detects the video quality difference between views using depth information. We performed subjective evaluation of stereoscopic video to verify the performance of the proposed method, and confirmed that the proposed algorithm is superior to PSNR with respect to correlation with the subjective scores.
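A hypothetical sketch of the abstract's inter-view idea: an overall stereoscopic score should penalize a quality imbalance between the views rather than simply averaging them. The penalty weight alpha is an assumed parameter, and the paper's actual metric also involves blocking and edge-degradation terms not modeled here:

```python
import numpy as np

def psnr(ref, dist):
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    return 10 * np.log10(255.0 ** 2 / mse) if mse > 0 else float('inf')

def stereo_quality(ref_l, dist_l, ref_r, dist_r, alpha=0.5):
    """Mean view quality minus a penalty on the left/right quality gap."""
    ql, qr = psnr(ref_l, dist_l), psnr(ref_r, dist_r)
    return 0.5 * (ql + qr) - alpha * abs(ql - qr)
```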
- Published
- 2009
41. Reduced resolution MPEG-2 to H.264 transcoder
- Author
-
Hari Kalva, Gerardo Fernández-Escribano, and Kelly Kunzelmann
- Subjects
Speedup ,Video post-processing ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Signal compression ,Image processing ,Data_CODINGANDINFORMATIONTHEORY ,Transcoding ,computer.file_format ,computer.software_genre ,MPEG-2 ,Motion estimation ,Discrete cosine transform ,Computer vision ,Artificial intelligence ,business ,computer ,Computer hardware - Abstract
This paper describes complexity reduction in MPEG-2 to H.264 transcoding with resolution reduction. The methods developed are applicable to transcoding any DCT-based video format, such as MPEG-2, MPEG-4, or H.263, to the recently standardized H.264 format at a reduced resolution. H.264 is being adopted by the mobile device industry; devices such as the iPod use H.264, but typically need the video at a reduced resolution. The proposed transcoder accelerates the H.264 encoding stage by performing motion estimation for only one block size, as determined by trained decision trees. Our solution allows conversion of MPEG-2 video to the H.264 format at a reduced resolution with substantially less computation, using machine learning to significantly reduce the complexity of the transcoding. Experimental results show a reduction in transcoding time of about 67% (a 3x speedup) with less than a 0.5 dB loss in PSNR.
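A hypothetical sketch of the machine-learning step, assuming scikit-learn: a decision tree trained offline on per-macroblock features from the MPEG-2 decode predicts a single H.264 partition class, so the encoder runs motion estimation for that size only. The features, labels, and toy training set are illustrative, not the paper's:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy training data: per-macroblock features gathered while decoding MPEG-2
# (residual energy, residual variance, motion vector magnitude), paired with
# the partition size an exhaustive H.264 encoder chose for that macroblock.
X_train = np.array([[1200.0, 400.0, 2.0],
                    [150.0, 30.0, 0.5],
                    [900.0, 800.0, 6.0]])
y_train = np.array([0, 2, 1])   # 0: 16x16, 1: 16x8/8x16, 2: 8x8 partitions

tree = DecisionTreeClassifier(max_depth=4).fit(X_train, y_train)

def choose_partition(mb_features):
    """Predict the single partition class for which motion estimation runs."""
    return int(tree.predict([mb_features])[0])
```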
- Published
- 2009
42. Design and implementation of non-standard video acquisition system
- Author
-
Tingfa Xu, Qingwang Qin, and Guoqiang Ni
- Subjects
Video post-processing ,business.industry ,Computer science ,Video capture ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,S-Video ,Video processing ,Video compression picture types ,Video tracking ,Electronic engineering ,Multiview Video Coding ,business ,Composite video ,Computer hardware - Abstract
Non-standard video acquisition is a common requirement in scientific research, industrial inspection, and medical instrumentation. A general analog video signal acquisition scheme is proposed in this paper, in which a TI DSP video port cooperates with the AD9985 video ADC. An FPGA packs the AD9985 output data into 16-bit words, halving the pixel clock frequency and thereby resolving the video port bandwidth bottleneck. Experiments show that the system can capture a variety of analog video formats, including both non-standard and standard video signals. The captured resolution can reach 2048x2048 at frame rates under 30 fps, and the frame rate can reach 400 fps at resolutions under 640x480. The image quality obtained is quite good, and system parameters can be adjusted conveniently.
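A conceptual sketch of the packing trick (in the real system this is done in FPGA fabric, not software): two 8-bit pixel samples are merged into one 16-bit word, so the DSP video port can be clocked at half the pixel rate:

```python
def pack_pixels(samples8):
    """Pack pairs of 8-bit pixel samples into 16-bit words, halving the
    word rate seen by the DSP video port."""
    return [(samples8[i] << 8) | samples8[i + 1]
            for i in range(0, len(samples8) - 1, 2)]
```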
- Published
- 2008
43. Spatio-temporal sampling for video
- Author
-
Mohan Shankar, Nikos P. Pitsianis, and David J. Brady
- Subjects
Motion compensation ,Video post-processing ,business.industry ,Computer science ,Aperture ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Astrophysics::Instrumentation and Methods for Astrophysics ,Sampling (statistics) ,Iterative reconstruction ,Video processing ,Computational photography ,Compressed sensing ,Video tracking ,Computer Science::Multimedia ,Redundancy (engineering) ,Computer vision ,Artificial intelligence ,business - Abstract
With this work we propose spatio-temporal sampling strategies for video using a lenslet-array computational imaging system, and explore the opportunities and challenges in the design of compressive video sensors and corresponding processing algorithms. The redundancies in video streams are exploited by (a) sampling the sub-apertures of a multichannel (TOMBO) camera and (b) computational reconstruction, to achieve low-power, low-complexity video sensors. A spatial and a spatio-temporal sampling strategy are considered, taking into account the feasibility of implementation in the focal-plane readout hardware. The algorithms used to reconstruct the video frames from the measurements are also presented.
- Published
- 2008
44. Using wavelets for edge directed image interpolation
- Author
-
Eric P. Lam
- Subjects
Video post-processing ,Demosaicing ,business.industry ,Computer science ,MathematicsofComputing_NUMERICALANALYSIS ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Trilinear interpolation ,Bilinear interpolation ,Image processing ,Stairstep interpolation ,Multivariate interpolation ,Wavelet ,Nearest-neighbor interpolation ,Computer Science::Computer Vision and Pattern Recognition ,Image scaling ,Bicubic interpolation ,Computer vision ,Artificial intelligence ,business ,ComputingMethodologies_COMPUTERGRAPHICS ,Interpolation - Abstract
Resizing an image is an important technique in image processing. When increasing the size of an image, some details are smeared or blurred by common interpolation techniques such as bilinear interpolation, and edges do not appear as sharp as in the original image. In addition, at high magnifications, blocking effects start to appear. In this paper, we present an approach that performs interpolation in the direction of the edges, rather than only in the horizontal and vertical directions. A wavelet preprocessing step is used to extract edge direction information before performing interpolation in multiple directions.
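A minimal, hypothetical sketch of edge-directed 2x upscaling: here simple diagonal pixel differences stand in for the wavelet detail coefficients that would normally supply the edge orientation, and each new diagonal sample is averaged along, rather than across, the stronger edge:

```python
import numpy as np

def edge_directed_upscale(img):
    """img: HxW float array; returns a (2H-1)x(2W-1) upscaled array."""
    H, W = img.shape
    out = np.zeros((2 * H - 1, 2 * W - 1))
    out[::2, ::2] = img   # original samples stay on the even grid
    for y in range(H - 1):
        for x in range(W - 1):
            a, b = img[y, x], img[y, x + 1]
            c, d = img[y + 1, x], img[y + 1, x + 1]
            d1 = abs(a - d)   # variation along the "\" diagonal
            d2 = abs(b - c)   # variation along the "/" diagonal
            # Interpolate along the diagonal with less variation (the edge).
            out[2 * y + 1, 2 * x + 1] = (a + d) / 2 if d1 < d2 else (b + c) / 2
    # Remaining in-between samples: plain averages of the known neighbors.
    out[::2, 1::2] = (img[:, :-1] + img[:, 1:]) / 2
    out[1::2, ::2] = (img[:-1, :] + img[1:, :]) / 2
    return out
```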
- Published
- 2008
45. Rate-controlled requantization transcoding for H.264/AVC video streams
- Author
-
Peter Lambert, Jan De Cock, Stijn Notebaert, and Rik Van de Walle
- Subjects
Video post-processing ,Computer science ,Quantization (signal processing) ,Real-time computing ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Constant bitrate ,Data_CODINGANDINFORMATIONTHEORY ,Transcoding ,computer.software_genre ,Scalable Video Coding ,Video compression picture types ,Distortion ,Bit rate ,computer - Abstract
Nowadays, most video material is coded using a non-scalable format. When transmitting these single-layer video bitstreams, there may be a problem for connection links with limited capacity. In order to solve this problem, requantization transcoding is often used. The requantization transcoder applies coarser quantization in order to reduce the amount of residual information in the compressed video bitstream. In this paper, we extend a requantization transcoder for H.264/AVC video bitstreams with a rate-control algorithm. A simple algorithm is proposed which limits the computational complexity. The bit allocation is based on the bit distribution in the original video bitstream. Using the bit budget and a linear model between rate and quantizer, the new quantizer is calculated. The target bit rate is attained with an average deviation lower than 6%, while the rate-distortion performance shows small improvements over transcoding without rate control.
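A hypothetical sketch of the rate-control loop consistent with the abstract's description, assuming the simple model that rate is inversely proportional to the quantization step (the paper's exact linear model may differ); the clamp values follow H.264's Qstep range:

```python
def next_qstep(prev_qstep, bits_produced, bits_budget, qmin=0.625, qmax=224.0):
    """Scale the quantization step by the ratio of produced to budgeted bits,
    assuming rate ~ C / Qstep, and clamp to H.264's valid Qstep range."""
    q = prev_qstep * (bits_produced / bits_budget)
    return min(max(q, qmin), qmax)

def frame_budgets(input_frame_bits, target_ratio):
    """Allocate each output frame a share proportional to its share of bits
    in the original bitstream (the abstract's bit-distribution heuristic)."""
    return [bits * target_ratio for bits in input_frame_bits]
```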
- Published
- 2008
46. Rate controlling for color and depth based 3D video coding
- Author
-
W.A.C. Fernando, B. Kamolrat, Marta Mrak, and Tescher, Andrew G.
- Subjects
Motion compensation ,Video post-processing ,Computer science ,business.industry ,3D Video Coding ,Rate Control ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,computer.file_format ,Smacker video ,Video quality ,Video compression picture types ,Depth map ,Video tracking ,Color depth ,Bit rate ,Computer vision ,Artificial intelligence ,Multiview Video Coding ,business ,computer - Abstract
Rate control is a very important tool for any kind of video coding and transmission. Although many rate control techniques have been designed for 2D video, little work has been carried out in this area for 3D video. In this paper, a novel rate control approach for 3D video based on color and depth maps is introduced. The aim of this rate control algorithm is to keep the 3D video quality nearly constant. Since the 3D video is synthesized from color and depth maps, and the final quality is influenced more by the color sequence than by the depth maps, the qualities of both the color and depth maps are first varied until the target 3D quality is achieved with minimal bit rate. Subsequently, the PSNR of both color and depth maps at the optimum point is maintained for the entire group of pictures (GOP). The bit-allocation problem is solved by introducing a Lagrangian optimization cost function. According to experimental results, the proposed rate control technique is capable of adaptively adjusting the bit rate allocated to the color and depth map sequences in order to maximize the 3D video quality, measured by the PSNR of the synthesized left and right views using the reconstructed color and depth map sequences.
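A hypothetical sketch of Lagrangian bit allocation between color and depth: candidate quantizer pairs are evaluated and the pair minimizing J = D + lambda * R is kept. How rate and 3D distortion are measured is assumed here, not taken from the paper:

```python
def lagrangian_allocate(candidates, lam):
    """candidates: list of (qp_color, qp_depth, rate_bits, distortion_3d),
    where distortion could be, e.g., PSNR loss of the synthesized views.
    Returns the quantizer pair minimizing J = D + lam * R."""
    best = min(candidates, key=lambda c: c[3] + lam * c[2])
    return best[0], best[1]
```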
- Published
- 2008
47. Real-time video-based iris image processing
- Author
-
Yingzi Du and Zhi Zhou
- Subjects
Video post-processing ,Biometrics ,Image quality ,business.industry ,media_common.quotation_subject ,Iris recognition ,Frame (networking) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Art ,ComputingMethodologies_PATTERNRECOGNITION ,Video tracking ,Pattern recognition (psychology) ,Computer vision ,Artificial intelligence ,business ,media_common - Abstract
Iris recognition is an important method for identifying a person. Currently, most iris recognition methods are based on individual images. For non-cooperative user identification, video-based methods can provide more information; however, iris image quality may vary from frame to frame. In this paper, we propose a real-time video-based iris image processing method that eliminates bad-quality video frames by taking advantage of the correlations among video frames.
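A minimal sketch of the kind of frame screening the abstract describes, assuming a focus measure (variance of a Laplacian response) as the quality score; the paper's actual quality measure and threshold are not specified in the abstract:

```python
import numpy as np

def focus_score(gray):
    """gray: HxW float array; higher score indicates a sharper frame."""
    lap = (-4 * gray[1:-1, 1:-1] + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return lap.var()

def keep_sharp_frames(frames, threshold):
    """Discard frames whose focus score falls below the threshold."""
    return [f for f in frames if focus_score(f.astype(np.float64)) >= threshold]
```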
- Published
- 2008
48. A mobile video surveillance system with intelligent object analysis
- Author
-
Yung-Hsiang Hu, Li-Ya Wang, and Yuan-Kai Wang
- Subjects
Background subtraction ,Video post-processing ,Computer science ,business.industry ,Video capture ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Video processing ,Transcoding ,computer.software_genre ,Object detection ,Video tracking ,Computer vision ,Artificial intelligence ,business ,computer - Abstract
A mobile video surveillance system is a video surveillance system that adopts mobile clients to visualize surveillance video over mobile networks; however, mobile networks and mobile clients have limited computational and network resources. The proposed system combines moving object detection and video transcoding techniques to help users monitor a remote site through video streaming over 3G communication networks. Moving object detection and tracking skim off the useful video clips, while the communication networking services, comprising video transcoding, short text messaging, and mobile video streaming, deliver surveillance information to mobile appliances. Moving object detection is achieved by background subtraction with adaptive Gaussian mixture modeling and particle filter tracking. A spatial-domain cascaded transcoder is developed to convert the filtered image sequence of detected objects into the 3GPP video streaming format. Experimental results show that the system can successfully detect all moving object events in a complex surveillance scene, and that the transcoder achieves high PSNR.
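A minimal sketch of the detection front end, using OpenCV's standard adaptive Gaussian-mixture background subtractor as a stand-in for the paper's implementation; all parameters are illustrative:

```python
import cv2

# The subtractor is stateful: the same instance is applied to every frame
# so the Gaussian mixture background model adapts over time.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

def detect_moving_objects(frame, min_area=200):
    """Return bounding boxes of foreground blobs large enough to matter."""
    mask = subtractor.apply(frame)
    # Morphological opening removes isolated noise pixels from the mask.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                            cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) >= min_area]
```

The detected boxes would then seed the particle-filter tracker and select the clips passed on to the transcoder.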
- Published
- 2008
49. Video fingerprinting: features for duplicate and similar video detection and query-based video retrieval
- Author
-
Emily Moxley, Anindya Sarkar, Pradiptya Ghosh, and B.S. Manjunath
- Subjects
Motion compensation ,Video post-processing ,Computer science ,business.industry ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,computer.file_format ,Video processing ,Video quality ,Smacker video ,Automatic summarization ,Video compression picture types ,Uncompressed video ,Video tracking ,Computer vision ,Video denoising ,Artificial intelligence ,Multiview Video Coding ,business ,computer ,Image compression ,Block-matching algorithm - Abstract
A video "fingerprint" is a feature extracted from the video that represents it compactly, allowing faster search without compromising retrieval accuracy. Here, we use a keyframe set to represent a video, motivated by the video summarization approach, and experiment with different features to represent each keyframe with the goal of identifying duplicate and similar videos. Various image processing operations, such as blurring, gamma correction, JPEG compression, and Gaussian noise addition, are applied to individual video frames to generate duplicate videos. Random and bursty frame drop errors of 20%, 40%, and 60% (over the entire video) are also applied to create noisier "duplicate" videos. The similar videos consist of videos with similar content but with varying camera angles, cuts, and idiosyncrasies that occur during successive retakes of a video. Among the feature sets compared, the Compact Fourier-Mellin Transform (CFMT) performs best for duplicate video detection, while Scale Invariant Feature Transform (SIFT) features are found to be better than comparable-dimension features for similar video retrieval. We also address the problem of retrieving full-length videos with shorter clip queries; for identical feature size, CFMT performs best for video retrieval.
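A hypothetical sketch of clip-in-video retrieval over per-keyframe fingerprints, whichever feature (CFMT, SIFT summary, etc.) each row holds: the query clip's feature sequence is slid over a database video's sequence and the best-matching window is returned:

```python
import numpy as np

def match_clip(db_feats, query_feats):
    """db_feats: NxD array of per-keyframe features for a database video;
    query_feats: MxD array for the query clip, with M <= N.
    Returns (best start index, distance at that offset)."""
    M = len(query_feats)
    dists = [np.linalg.norm(db_feats[i:i + M] - query_feats)
             for i in range(len(db_feats) - M + 1)]
    best = int(np.argmin(dists))
    return best, dists[best]
```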
- Published
- 2008
50. H.263 to VP6 video transcoder
- Author
-
Hari Kalva and Chris Holder
- Subjects
Video post-processing ,Multimedia ,Video capture ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Data_CODINGANDINFORMATIONTHEORY ,Video processing ,computer.file_format ,Transcoding ,Smacker video ,computer.software_genre ,Scalable Video Coding ,Video compression picture types ,Video tracking ,Codec ,Multiview Video Coding ,computer ,AMV video format ,Image compression ,Data compression - Abstract
VP6 is a video coding standard developed by On2 Technologies. It is the preferred codec in the Flash 8/9 format used by many popular online video services and user-generated content sites. The wide adoption of Flash video for video delivery on the Internet has made VP6 one of the most widely used video compression standards on the Internet, and with this adoption comes the need to transcode other video formats to VP6. This paper presents algorithms to transcode H.263 video to the VP6 format. The transcoder has applications in media adaptation, including converting older Flash video formats to the Flash 8 format. The transcoding algorithms reuse information from the H.263 decoding stage to accelerate the VP6 encoding stage. Experimental results show that the proposed algorithms reduce the encoding complexity by up to 52% while reducing PSNR by at most 0.42 dB.
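A hypothetical sketch of the reuse idea (the abstract does not detail the algorithms): the decoded H.263 motion vector seeds a small refinement search in the VP6 encoder instead of a full search. Bounds handling is omitted, so blocks and their displaced references are assumed to lie inside the frames:

```python
import numpy as np

def sad(cur, ref, y, x, mv, size=16):
    """Sum of absolute differences for the block at (y, x) displaced by mv."""
    ry, rx = y + mv[0], x + mv[1]
    return np.abs(cur[y:y + size, x:x + size].astype(np.int32)
                  - ref[ry:ry + size, rx:rx + size].astype(np.int32)).sum()

def refine_mv(cur, ref, y, x, seed_mv, radius=2):
    """Refine the inherited H.263 vector within a small window only."""
    best_mv, best_cost = seed_mv, sad(cur, ref, y, x, seed_mv)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            mv = (seed_mv[0] + dy, seed_mv[1] + dx)
            cost = sad(cur, ref, y, x, mv)
            if cost < best_cost:
                best_mv, best_cost = mv, cost
    return best_mv
```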
- Published
- 2008