37 results for "Xu, Jizheng"
Search Results
2. A cross-resolution leaky prediction scheme for in-band wavelet video coding with spatial scalability
- Author
-
Zhang, Dongdong, Zhang, Wenjun, Xu, Jizheng, Wu, Feng, and Xiong, Hongkai
- Subjects
Image coding -- Technology application, Wavelet transforms -- Evaluation, Scalability -- Evaluation, Circuit design -- Evaluation, Circuit designer, Integrated circuit design, Technology application, Business, Computers, Electronics, Electronics and electrical industries - Abstract
In most existing in-band wavelet video coding schemes, an over-complete wavelet transform is used for the motion-compensated temporal filtering (MCTF) of each spatial subband. It can overcome the shift variance of the critically sampled wavelet transform and improve the coding efficiency of the in-band scheme. However, a dilemma exists in current implementations of in-band MCTF (IBMCTF): whether or not to exploit the spatial highpass subbands in motion compensation of the spatial lowpass subband. The absence of the spatial highpass subbands will result in significant quality loss in the reconstructed full-resolution video, whereas their presence may introduce serious mismatch error in the decoded low-resolution video, since the corresponding highpass subbands may be unavailable at the decoder. In this paper, we first analyze the mismatch error propagation in decoding the low-resolution video. Based on our analysis, we then propose a frame-based cross-resolution leaky prediction scheme for IBMCTF. It makes a good tradeoff between alleviating the low-resolution mismatch and improving the full-resolution coding efficiency. Experimental results show that the proposed scheme can reduce the mismatch error by 0.3-2.5 dB for low resolution, while the performance loss is marginal for high resolution. Index Terms--Cross-resolution leaky prediction, in-band motion-compensated temporal filtering (MCTF), mismatch error analysis, spatial scalability, wavelet video coding.
- Published
- 2008
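The cross-resolution leaky prediction described above amounts to blending a drift-free prediction (built from lowpass subbands only) with a more accurate but drift-prone one (lowpass plus highpass). A minimal sketch of that blend, where the function name, the toy signals, and the leaky factor are all illustrative and not taken from the paper:

```python
import numpy as np

def leaky_prediction(lowpass_pred, full_pred, alpha):
    """Blend the drift-free prediction (from spatial lowpass subbands
    only) with the more accurate prediction that also uses highpass
    subbands. alpha in [0, 1] is the leaky factor: 0 avoids mismatch
    entirely, 1 maximizes full-resolution coding efficiency."""
    return (1.0 - alpha) * lowpass_pred + alpha * full_pred

low = np.array([10.0, 12.0, 14.0])   # toy lowpass-only prediction
full = np.array([11.0, 13.0, 15.0])  # toy full (low + high) prediction
print(leaky_prediction(low, full, 0.5))  # [10.5 12.5 14.5]
```

Because only a fraction alpha of the highpass-dependent signal is leaked into the reference, any decoder-side mismatch is attenuated by that factor at every prediction step instead of accumulating unchecked.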
3. In-scale motion compensation for spatially scalable video coding
- Author
-
Xiong, Ruiqin, Xu, Jizheng, and Wu, Feng
- Subjects
Scalability -- Methods, Image coding -- Methods, Business, Computers, Electronics, Electronics and electrical industries - Abstract
In existing pyramid-based spatially scalable coding schemes, such as H.264/MPEG-4 SVC (scalable video coding), the video frame at a certain high-resolution layer is predicted mainly either from the same frame at the next lower resolution layer or from temporally neighboring frames within the same resolution layer. These schemes fail to exploit both kinds of correlation simultaneously and therefore cannot efficiently remove the redundancy among resolution layers. This paper extends the idea of the spatiotemporal subband transform and proposes a general in-scale motion compensation technique for pyramid-based spatially scalable video coding. The video frame at each high-resolution layer is partitioned into two parts in frequency. The prediction for the lowpass part is derived from the next lower resolution layer, whereas the prediction for the highpass part is obtained from neighboring frames within the same resolution layer, to further utilize temporal correlation. In this way, both kinds of correlation are exploited simultaneously and the cross-resolution-layer redundancy can be largely removed. Furthermore, this paper also proposes a macroblock-based adaptive in-scale technique for hybrid spatial and SNR scalability. Experimental results show that the proposed techniques can significantly improve the spatial scalability performance of H.264/MPEG-4 SVC, especially when the bit-rate ratio of the lower resolution bit stream to the higher resolution bit stream is considerable. Index Terms--H.264/MPEG-4 SVC, in-scale motion compensation, inter-layer prediction, scalable video coding, spatial scalability.
- Published
- 2008
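The in-scale split described above (lowpass part predicted from the lower layer, highpass part from temporal neighbors in the same layer) can be sketched with a crude box filter standing in for the actual wavelet lowpass; the filter choice and function names are illustrative only, not the paper's method:

```python
import numpy as np

def box_lowpass(x, k=3):
    """Crude separable box filter standing in for the wavelet lowpass."""
    kernel = np.ones(k) / k
    x = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, x)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, x)

def in_scale_predict(upsampled_lower_layer, temporal_pred):
    """In-scale prediction: take the lowpass part from the (upsampled)
    lower resolution layer and the highpass part from the temporal
    prediction within the same resolution layer."""
    return box_lowpass(upsampled_lower_layer) + (
        temporal_pred - box_lowpass(temporal_pred))
```

A quick sanity check: when both inputs are the same frame, the two lowpass terms cancel and the prediction reduces to that frame exactly.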
4. Lifting-based directional DCT-like transform for image coding
- Author
-
Xu, Hao, Xu, Jizheng, and Wu, Feng
- Subjects
Image coding -- Methods, Transformations (Mathematics) -- Properties, Entropy (Information theory) -- Measurement, Business, Computers, Electronics, Electronics and electrical industries - Abstract
The traditional 2-D discrete cosine transform (DCT), implemented as separable 1-D transforms in the horizontal and vertical directions, does not take the image orientation features in a local window into account. To improve on this, we propose introducing directional primary operations into the lifting-based DCT, thereby deriving a new directional DCT-like transform whose transform matrix depends on the directional angle and the interpolation used. Furthermore, the proposed transform is compared with the straightforward approach of first rotating and then transforming. A JPEG-like image coding scheme is also proposed to evaluate the performance of the proposed directional DCT-like transform. The first 1-D transform is performed according to the image orientation features, while the second 1-D transform is still performed in the horizontal or vertical direction. In addition, an approach is proposed to optimally select the transform direction of each block, because the selected directions of neighboring blocks influence each other. The experimental results show that the proposed directional DCT-like transform can outperform the conventional DCT by up to 2 dB even without modifying entropy coding. Index Terms--Directional transform, discrete cosine transform (DCT), image coding, lifting structure.
- Published
- 2007
5. Subband coupling aware rate allocation for spatial scalability in 3-D wavelet video coding
- Author
-
Xiong, Ruiqin, Xu, Jizheng, Wu, Feng, Li, Shipeng, and Zhang, Ya-Qin
- Subjects
Image coding -- Methods, Signal processing -- Methods, Electric filters -- Usage, Digital signal processor, Business, Computers, Electronics, Electronics and electrical industries - Abstract
The motion compensated temporal filtering (MCTF) technique, which is extensively used in today's 3-D wavelet video coding schemes, leads to signal coupling among the various spatial subbands because motion alignment is introduced in the temporal filtering. Using all spatial subbands as a reference enables MCTF to take full advantage of the temporal correlation across frames but inevitably brings a drifting problem in supporting spatial scalability. This paper first analyzes the signal coupling phenomenon and then proposes a quantitative model to describe signal propagation across spatial subbands during the MCTF process. The signal propagation is modeled for a single MC step based on the shifting effect of the wavelet synthesis filters, and it is then extended to multilevel MCTF. This model is called the subband coupling aware signal propagation (SCASP) model in this paper. Based on the model, we further propose a subband coupling aware rate allocation scheme as one possible solution to the above dilemma in supporting spatial scalability. To find the optimal rate allocation among all subbands for a specified reconstruction resolution, the SCASP model is used to approximate the reconstruction process and derive the synthesis gain of each subband with regard to that reconstruction. Experimental results fully demonstrate the advantages of the proposed rate allocation scheme in improving both the objective and subjective quality of reconstructed low-resolution video, especially at middle and high bit rates. Index Terms--Motion compensated temporal filtering (MCTF), rate allocation, signal propagation model, spatial scalability, subband coupling, 3-D wavelet video coding.
- Published
- 2007
6. Barbell-lifting based 3-D wavelet coding scheme
- Author
-
Xiong, Ruiqin, Xu, Jizheng, Wu, Feng, and Li, Shipeng
- Subjects
Image coding -- Methods, Wavelet transforms -- Usage, Business, Computers, Electronics, Electronics and electrical industries - Abstract
This paper provides an overview of the Barbell-lifting coding scheme that has been adopted as common software by the MPEG ad hoc group on further exploration of wavelet video coding. The core techniques used in this scheme, such as Barbell lifting, layered motion coding, 3-D entropy coding, and base-layer embedding, are discussed. The paper also analyzes and compares the proposed scheme with the forthcoming Scalable Video Coding (SVC) standard, because the hierarchical temporal prediction technique used in SVC is closely related to motion-compensated temporal filtering (MCTF) in wavelet coding. The commonalities and differences between these two schemes are highlighted to help readers better understand modern scalable video coding technologies. Several challenges that still exist in scalable video coding, e.g., the performance of spatially scalable coding and accurate MC lifting, are also discussed. Two new techniques are presented in this paper although they are not yet integrated into the common software. Finally, experimental results demonstrate the performance of the Barbell-lifting coding scheme and compare it with SVC and another well-known 3-D wavelet coding scheme, MC embedded zero block coding (MC-EZBC). Index Terms--Barbell lifting, lifting-based wavelet transform, Scalable Video Coding (SVC), 3-D wavelet video coding.
- Published
- 2007
7. Memory-constrained 3-D wavelet transform for video coding without boundary effects
- Author
-
Xu, Jizheng, Xiong, Zixiang, Li, Shipeng, and Zhang, Ya-Qin
- Subjects
Electrical engineering -- Research, Video equipment -- Research, Wavelet transforms -- Usage, Codecs -- Usage, Boundary value problems -- Research, Image coding -- Research, Business, Computers, Electronics, Electronics and electrical industries - Abstract
Three-dimensional (3-D) wavelet-based scalable video coding provides a viable alternative to standard MC-DCT coding. However, many current 3-D wavelet coders suffer severe boundary effects across group-of-pictures (GOP) boundaries. This paper proposes a memory-efficient transform technique via lifting that effectively computes the wavelet transform of a video sequence continuously, on the fly, thus eliminating the boundary effects due to the limited length of individual GOPs. Coding results show that the proposed scheme completely eliminates the boundary effects and gives superb video playback quality. Index Terms--Boundary effects, lifting, 3-D wavelet video coding.
- Published
- 2002
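The on-the-fly transform above can be illustrated with a one-level 5/3 temporal lifting that keeps only a three-frame window instead of buffering a whole GOP. This is a scalar sketch (one number stands in for a whole motion-aligned frame), not the paper's memory-constrained scheme; the start-up choice prev_h = 0 and the missing tail flush are deliberate simplifications:

```python
def lifting_53_stream(frames):
    """Streaming one-level 5/3 temporal lifting: consume frames one at a
    time, keep only a three-frame window, and yield (lowpass, highpass)
    pairs continuously -- no GOP segmentation, hence no boundary effects
    from symmetric extension at GOP edges."""
    buf = []
    prev_h = 0.0  # simplification: no special start-up handling
    for f in frames:
        buf.append(float(f))
        if len(buf) == 3:
            x0, x1, x2 = buf
            h = x1 - 0.5 * (x0 + x2)      # predict step (highpass)
            l = x0 + 0.25 * (prev_h + h)  # update step (lowpass)
            yield l, h
            prev_h = h
            buf = buf[2:]                 # slide the window by two frames

print(list(lifting_53_stream([1, 2, 3, 4, 5])))  # [(1.0, 0.0), (3.0, 0.0)]
```

For a linear ramp of frame values the predict step cancels exactly, so every highpass output is zero, which is a handy correctness check for the lifting steps.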
8. Overview of the Screen Content Support in VVC: Applications, Coding Tools, and Performance.
- Author
-
Nguyen, Tung, Xu, Xiaozhong, Henry, Felix, Liao, Ru-Ling, Sarwer, Mohammed Golam, Karczewicz, Marta, Chao, Yung-Hsuan, Xu, Jizheng, Liu, Shan, Marpe, Detlev, and Sullivan, Gary J.
- Subjects
PULSE-code modulation, VIDEO coding, COMPUTER-generated imagery, CUSTOMER experience, PERSONAL computers - Abstract
In an increasingly connected world, consumer video experiences have diversified away from traditional broadcast video into new applications with increased use of non-camera-captured content such as computer screen desktop recordings or animations created by computer rendering, collectively referred to as screen content. There has also been increased use of graphics and character content that is rendered and mixed or overlaid together with camera-generated content. The emerging Versatile Video Coding (VVC) standard, in its first version, addresses this market change by the specification of low-level coding tools suitable for screen content. This is in contrast to its predecessor, the High Efficiency Video Coding (HEVC) standard, where highly efficient screen content support is only available in extension profiles of its version 4. This paper describes the screen content support and the five main low-level screen content coding tools in VVC: transform skip residual coding (TSRC), block-based differential pulse-code modulation (BDPCM), intra block copy (IBC), adaptive color transform (ACT), and the palette mode. The specification of these coding tools in the first version of VVC enables the VVC reference software implementation (VTM) to achieve average bit-rate savings of about 41% to 61% relative to the HEVC test model (HM) reference software implementation using the Main 10 profile for 4:2:0 screen content test sequences. Compared to the HM using the Screen-Extended Main 10 profile and the same 4:2:0 test sequences, the VTM provides about 19% to 25% bit-rate savings. The same comparison with 4:4:4 test sequences revealed bit-rate savings of about 13% to 27% for Y′CbCr and of about 6% to 14% for R′G′B′ screen content. Relative to the HM without the HEVC version 4 screen content coding extensions, the bit-rate savings for 4:4:4 test sequences are about 33% to 64% for Y′CbCr and 43% to 66% for R′G′B′ screen content. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
9. Three-Dimensional Embedded Subband Coding with Optimized Truncation (3-D ESCOT)
- Author
-
Xu, Jizheng, Xiong, Zixiang, Li, Shipeng, and Zhang, Ya-Qin
- Published
- 2001
- Full Text
- View/download PDF
10. Consistent Video Style Transfer via Relaxation and Regularization.
- Author
-
Wang, Wenjing, Yang, Shuai, Xu, Jizheng, and Liu, Jiaying
- Subjects
OPTICAL flow, IMAGE color analysis, VIDEO surveillance - Abstract
In recent years, neural style transfer has attracted more and more attention, especially for image style transfer. However, temporally consistent style transfer for videos is still a challenging problem. Existing methods, either relying on a significant amount of video data with optical flows or using single-frame regularizers, fail to handle strong motions or complex variations, therefore have limited performance on real videos. In this article, we address the problem by jointly considering the intrinsic properties of stylization and temporal consistency. We first identify the cause of the conflict between style transfer and temporal consistency, and propose to reconcile this contradiction by relaxing the objective function, so as to make the stylization loss term more robust to motions. Through relaxation, style transfer is more robust to inter-frame variation without degrading the subjective effect. Then, we provide a novel formulation and understanding of temporal consistency. Based on the formulation, we analyze the drawbacks of existing training strategies and derive a new regularization. We show by experiments that the proposed regularization can better balance the spatial and temporal performance. Based on relaxation and regularization, we design a zero-shot video style transfer framework. Moreover, for better feature migration, we introduce a new module to dynamically adjust inter-channel distributions. Quantitative and qualitative results demonstrate the superiority of our method over state-of-the-art style transfer methods. Our project is publicly available at: https://daooshee.github.io/ReReVST/. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
11. Interweaved Prediction for Video Coding.
- Author
-
Zhang, Kai, Zhang, Li, Liu, Hongbin, Xu, Jizheng, Deng, Zhipin, and Wang, Yue
- Subjects
VIDEO compression, VIDEO coding, BLOCK codes, FORECASTING, STATISTICS - Abstract
In the emerging next generation video coding standard Versatile Video Coding (VVC), developed by the Joint Video Exploration Team (JVET), sub-block-based inter-prediction plays a key role in promising coding tools such as Affine Motion Compensation (AMC) and sub-block-based Temporal Motion Vector Prediction (sbTMVP). With sub-block-based inter-prediction, a coding block is divided into sub-blocks, and the motion information of each sub-block is derived individually. Although sub-block-based inter-prediction can provide a higher quality prediction benefiting from finer motion granularity, it still suffers from two problems: uneven prediction quality and boundary discontinuity. In this paper, we present a method of interweaved prediction to further improve sub-block-based inter-prediction. With interweaved prediction, a coding block with AMC or sbTMVP mode is divided into sub-blocks with two different dividing patterns, so that a corner position of a sub-block in one dividing pattern coincides with the central position of a sub-block in the other dividing pattern. Two auxiliary predictions are then generated independently by AMC or sbTMVP with the two dividing patterns. The final prediction is calculated as a weighted sum of the two auxiliary predictions. Theoretical analysis and statistical data prove that interweaved prediction can significantly mitigate the two problems in sub-block-based inter-prediction. Simulation results show that the proposed methods can achieve 0.64% BD-rate saving on average with the random access configurations. On sequences with rich affine motions, the average BD-rate saving can be up to 2.54%. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
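The final step of the interweaved prediction above is just a sample-wise weighted sum of the two auxiliary predictions from the two dividing patterns. A minimal sketch, where the uniform weight map and the toy prediction values are illustrative assumptions, not the weights the codec would actually use:

```python
import numpy as np

def interweave(pred_a, pred_b, weight_b):
    """Weighted sum of two auxiliary sub-block predictions. pred_a uses
    the aligned dividing pattern; pred_b uses a pattern offset by half a
    sub-block, so sub-block centers of one pattern fall on the corners of
    the other. In practice weight_b would be larger near pattern-a
    sub-block boundaries, to smooth the discontinuities there."""
    return (1.0 - weight_b) * pred_a + weight_b * pred_b

pred_a = np.full((4, 4), 100.0)
pred_b = np.full((4, 4), 104.0)
w = np.full((4, 4), 0.25)  # hypothetical uniform weight map
print(interweave(pred_a, pred_b, w))  # every sample equals 101.0
```

Because each sample's weight can depend on its distance to the nearest sub-block boundary in each pattern, the blend favors whichever pattern predicts that sample from the interior of a sub-block.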
12. Direct Speech-to-Image Translation.
- Author
-
Li, Jiguo, Zhang, Xinfeng, Jia, Chuanmin, Xu, Jizheng, Zhang, Li, Wang, Yue, Ma, Siwei, and Gao, Wen
- Abstract
Direct speech-to-image translation without text is an interesting and useful topic due to the potential applications in human-computer interaction, art creation, computer-aided design, etc. Not to mention that many languages have no written form. However, as far as we know, it has not been well studied how to translate speech signals into images directly and how well they can be translated. In this paper, we attempt to translate speech signals into image signals without the transcription stage. Specifically, a speech encoder is designed to represent the input speech signals as an embedding feature, and it is trained with a pretrained image encoder using teacher-student learning to obtain better generalization ability on new classes. Subsequently, a stacked generative adversarial network is used to synthesize high-quality images conditioned on the embedding feature. Experimental results on both synthesized and real data show that our proposed method is effective in translating raw speech signals into images without the intermediate text representation. An ablation study gives more insights into our method. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
13. Spherical Domain Rate-Distortion Optimization for Omnidirectional Video Coding.
- Author
-
Li, Yiming, Xu, Jizheng, and Chen, Zhenzhong
- Subjects
VIDEO coding, VIRTUAL reality, RATE distortion theory, STREAMING technology, STREAMING video & television - Abstract
Efficient compression of omnidirectional video is important for emerging virtual reality applications. To compress this kind of video, each frame is first projected to a 2D plane [e.g., an equirectangular projection (ERP) map], adapting it to the input format of existing video coding systems. At the display side, an inverse projection is applied to the reconstructed video to restore the signals in the spherical domain. Such a projection, however, places presentation and encoding in different domains, so an encoder agnostic to the projection performs inefficiently. In this paper, we analyze how a projection influences the distortion measurements in different domains. Based on the analysis, we propose a scheme to optimize the encoding process based on the signals' distortion in the spherical domain. With the proposed optimization, an average 4.31% (up to 9.67%) luma BD-rate reduction is achieved for ERP in the random access configuration. The corresponding bit saving is 10.84% on average (up to 34.44%) when considering a viewing field of π/2. The proposed method also benefits other projections and viewport settings, with a marginal complexity increase. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
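The spherical-domain distortion underlying the optimization above is commonly computed as an area-weighted MSE over the ERP frame, with per-row cos-latitude weights in the style of WS-PSNR. The sketch below shows only that weighting for a single-channel frame; treating it as the distortion the paper's RDO feeds into mode decision is an assumption, and the paper's exact formulation may differ:

```python
import numpy as np

def spherical_mse(ref, rec):
    """Area-weighted MSE for an equirectangular (ERP) frame: each row is
    weighted by the sphere surface area it represents, cos(latitude), so
    over-sampled polar rows count for less than equatorial rows."""
    h, w = ref.shape
    rows = np.arange(h)
    weights = np.cos((rows + 0.5 - h / 2.0) * np.pi / h)  # per-row weight
    err2 = (ref.astype(float) - rec.astype(float)) ** 2
    return float((err2 * weights[:, None]).sum() / (weights.sum() * w))
```

For a uniform error the weighted and unweighted MSE coincide; the weighting only matters when errors are distributed unevenly across latitudes, which is exactly the case a projection-agnostic encoder mishandles.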
14. Variable Block-Sized Signal-Dependent Transform for Video Coding.
- Author
-
Lan, Cuiling, Xu, Jizheng, Zeng, Wenjun, Shi, Guangming, and Wu, Feng
- Subjects
VIDEO coding, SIGNAL processing, COMPUTER algorithms, ENCODING, STATISTICAL correlation - Abstract
Transform, as one of the most important modules of mainstream video coding systems, has seemed very stable over the past several decades. However, recent developments indicate that offering more transform options can bring coding efficiency benefits. In this paper, we go further and investigate how coding efficiency can be improved over the state-of-the-art method by adapting a transform to each block. We present a variable block-sized signal-dependent transform (SDT) design based on the High Efficiency Video Coding (HEVC) framework. For a coding block ranging from 4×4 to 32×32, we collect a set of similar blocks from the reconstructed area and use them to derive the Karhunen–Loève transform. We avoid sending overhead bits to signal the transform by performing the same procedure at the decoder. In this way, the transform for every block is tailored to its statistics, i.e., signal-dependent. To make the large block-sized SDTs feasible, we present a fast algorithm for transform derivation. Experimental results show the effectiveness of the SDTs for different block sizes, which leads to up to 23.3% bit saving. On average, we achieve BD-rate savings of 2.2%, 2.4%, 3.3%, and 7.1% under the AI-Main10, RA-Main10, RA-Main10, and LP-Main10 configurations, respectively, compared with the HEVC test model HM-12. The proposed scheme has also been adopted into the joint exploration test model for the exploration of a potential future video coding standard. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
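The transform derivation described above (collect similar reconstructed blocks, derive a KLT from them, and repeat the identical procedure at the decoder so no transform bits are signaled) reduces to an eigendecomposition of the sample covariance. A sketch under that reading; function names are illustrative, and the paper's fast derivation algorithm is not shown:

```python
import numpy as np

def derive_klt(similar_blocks):
    """Derive a KLT basis from a stack of similar blocks (each flattened
    to a vector). Both encoder and decoder can run this on blocks taken
    from the already-reconstructed area, so no transform bits are sent.
    Returns an orthonormal basis with rows sorted by descending variance."""
    x = np.stack([b.ravel().astype(float) for b in similar_blocks])
    x -= x.mean(axis=0)                   # remove the ensemble mean
    cov = x.T @ x / len(x)                # sample covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)  # symmetric eigendecomposition
    order = np.argsort(eigval)[::-1]      # strongest component first
    return eigvec[:, order].T             # each row is one basis vector

def klt_forward(basis, block):
    """Transform one block with the derived basis."""
    return basis @ block.ravel().astype(float)
```

Since the basis is orthonormal, the transform preserves energy and concentrates it in the first coefficients whenever the current block shares the statistics of the collected neighbors.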
15. Fully Connected Network-Based Intra Prediction for Image Coding.
- Author
-
Li, Jiahao, Li, Bin, Xu, Jizheng, Xiong, Ruiqin, and Gao, Wen
- Subjects
IMAGE compression, DEEP learning, IMAGE reconstruction, VIDEO coding, TRANSFORM coding - Abstract
This paper proposes a deep learning method for intra prediction. Different from traditional methods utilizing some fixed rules, we propose using a fully connected network to learn an end-to-end mapping from neighboring reconstructed pixels to the current block. In the proposed method, the network is fed by multiple reference lines. Compared with traditional single line-based methods, more contextual information of the current block is utilized. For this reason, the proposed network has the potential to generate better prediction. In addition, the proposed network has good generalization ability on different bitrate settings. The model trained from a specified bitrate setting also works well on other bitrate settings. Experimental results demonstrate the effectiveness of the proposed method. When compared with high efficiency video coding reference software HM-16.9, our network can achieve an average of 3.4% bitrate saving. In particular, the average result of 4K sequences is 4.5% bitrate saving, where the maximum one is 7.4%. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
16. Diversity-Based Reference Picture Management for Low Delay Screen Content Coding.
- Author
-
Li, Jiahao, Li, Bin, Xu, Jizheng, and Xiong, Ruiqin
- Subjects
IMAGE processing, VIDEO coding, COMPUTER programming, ALGORITHMS, IMAGING systems - Abstract
Screen content coding plays an important role in many applications. Conventional reference picture management (RPM) strategies developed for natural content may not work well for screen content. This is because many regions in screen content remain static for a long time, causing a lot of repetitive content to stay in the decoded picture buffer. This repetitive content is not conducive to inter prediction, but still occupies valuable memory. This paper proposes a diversity-based RPM scheme for screen content coding. The concept of diversity is introduced for the reference picture set (RPS) to help formulate the RPM problem. By maximizing the diversity of the RPS, more potentially better predictions are provided, and better compression performance can then be achieved. Meanwhile, the proposed scheme is nonnormative and compatible with existing video coding standards, such as High Efficiency Video Coding. The experimental results show that, for low delay screen content coding, the bit saving of the proposed scheme is 4.9% on average and up to 13.7%, without increasing encoding time. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
17. Fast Hash-Based Inter-Block Matching for Screen Content Coding.
- Author
-
Xiao, Wei, Shi, Guangming, Li, Bin, Xu, Jizheng, and Wu, Feng
- Subjects
VIDEO coding ,VIDEO compression software ,GRAPHICS processing units ,COMPUTER graphics ,DIGITAL image processing - Abstract
In the latest High Efficiency Video Coding (HEVC) development, i.e., HEVC screen content coding extensions (HEVC-SCC), a hash-based inter-motion search/block matching scheme is adopted in the reference test model, which brings significant coding gains to code screen content. However, the hash table generation itself may take up to half the encoding time and is thus too complex for practical usage. In this paper, we propose a hierarchical hash design and the corresponding block matching scheme to significantly reduce the complexity of hash-based block matching. The hierarchical structure in the proposed scheme allows large block calculation to use the results of small blocks. Thus, we avoid redundant computation among blocks with different sizes, which greatly reduces complexity without compromising coding efficiency. The experimental results show that compared with the hash-based block matching scheme in the HEVC-SCC test model (SCM)-6.0, the proposed scheme reduces about 77% of hash processing time, which leads to 12% and 16% encoding time savings in random access (RA) and low-delay B coding structures. The proposed scheme has been adopted into the latest SCM. A parallel implementation of the proposed hash table generation on graphics processing unit (GPU) is also presented to show the high parallelism of the proposed scheme, which achieves more than 30 frames/s for 1080p sequences and 60 frames/s for 720p sequences. With the fast hash-based block matching integrated into x265 and the hash table generated on GPU, the encoder can achieve 11.8% and 14.0% coding gains on average for RA and low-delay P coding structures, respectively, for real-time encoding. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
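The hierarchical hash idea above, where a large block's hash is built from its children's hashes rather than by rehashing all the pixels, can be sketched as follows. CRC32 stands in for whatever hash function the SCM actually uses, and both helper names are hypothetical:

```python
import zlib

def block_hash(pixels):
    """Hash a small base block given as rows of integer pixel values."""
    return zlib.crc32(bytes(bytearray(v % 256 for row in pixels for v in row)))

def combine(h_tl, h_tr, h_bl, h_br):
    """Hash of a 2Nx2N block from the hashes of its four NxN children, so
    each larger block size reuses work already done for smaller sizes
    instead of rehashing all of its pixels."""
    return zlib.crc32(h_tl.to_bytes(4, "big") + h_tr.to_bytes(4, "big")
                      + h_bl.to_bytes(4, "big") + h_br.to_bytes(4, "big"))
```

Blocks whose hashes collide become match candidates, and only those candidates need an exact pixel-wise comparison, which is what makes hash-based block matching fast for the highly repetitive content of screens.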
18. Unequal Error Protection for Scalable Video Storage in the Cloud.
- Author
-
Song, Xiaodan, Peng, Xiulian, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Abstract
Redundancy is necessary for a storage system to achieve reliability. Frequent errors in large-scale storage systems, for example, cloud, make it desirable to reduce the cost of recovery. Among all types of data in cloud storage, videos generally occupy significant amounts of space due to high volumes and the rapid development of video sharing and video-on-demand services. Unlike general data, videos can tolerate a certain level of quality degradation. This paper investigates multilayer video representations, such as scalable videos and simulcast streaming, and proposes an unequal error protection scheme based on local reconstruction codes (LRC) for video storage. By providing less protection for less important layers or video copies, a better tradeoff between storage and repair cost is achieved. Both theoretical and simulation results show that such a tradeoff can be achieved over the LRC with equal error protection, though the recovered video quality might be slightly lower in rare cases. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
19. Weighted Rate-Distortion Optimization for Screen Content Coding.
- Author
-
Xiao, Wei, Li, Bin, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Subjects
RATE distortion theory, VIDEO coding, MOTION estimation (Signal processing), HIGH definition video recording, VIDEO compression - Abstract
Unlike camera-captured video, screen content (SC) often contains a lot of repeating patterns, which makes some blocks, used as references, much more important than others. However, conventional rate-distortion optimization (RDO) schemes in video coding do not consider the dependence among image blocks, which often leads to locally optimal parameter selection, especially for SC. In this paper, we present a weighted RDO scheme for SC coding (SCC), in which the repeating characteristics are taken into account when deciding the RD tradeoff for each block. For each block, the number of times it is referenced by the current picture and following pictures is estimated, and based on this number, we set a proper weight in the RDO process to reflect its importance from a global point of view. To estimate the number of references, we propose a hash-based method to approximate the results and avoid the complexity of a direct search. Experimental results show that compared with the High Efficiency Video Coding SCC reference software, 10.1%, 14.5%, and 2.2% on average and up to 25.7%, 39.8%, and 4.6% bit saving can be achieved by considering the weights provided by our scheme for hierarchical-B, IBBB, and all intra coding structures, respectively. Thanks to our hash-based design, the complexity increase brought by the proposed scheme is marginal. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
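The weighted RDO above folds each block's estimated reference count into the Lagrangian cost. One common way to express such a weighting divides lambda by the weight, as sketched below; this formulation and the toy numbers are assumptions for illustration, and the paper's exact cost and its hash-based weight estimation may differ:

```python
def weighted_rd_cost(distortion, rate, lam, weight):
    """Importance-weighted Lagrangian cost: J = D + (lambda / weight) * R.
    A block expected to be referenced often gets weight > 1, lowering its
    effective lambda so the encoder spends more bits to code it well."""
    return distortion + (lam / weight) * rate

# A frequently referenced block (weight 2) pays less cost per bit than an
# ordinary block (weight 1), so the encoder codes it more faithfully.
print(weighted_rd_cost(100.0, 10.0, 4.0, 2.0))  # 120.0
print(weighted_rd_cost(100.0, 10.0, 4.0, 1.0))  # 140.0
```

Weighting the cost this way makes the per-block decision reflect global importance: spending extra bits on a heavily referenced block pays off across every picture that later predicts from it.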
20. Content-adaptive deblocking for high efficiency video coding
- Author
-
Xiong, Zhiwei, Sun, Xiaoyan, Xu, Jizheng, and Wu, Feng
- Published
- 2012
- Full Text
- View/download PDF
21. Distributed Compressive Sensing for Cloud-Based Wireless Image Transmission.
- Author
-
Song, Xiaodan, Peng, Xiulian, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Abstract
We consider efficient image transmission via time-varying channels. To improve the performance, we propose a new distributed compressive sensing (CS) scheme that can leverage similar images in the cloud. It features channel-SNR and bandwidth scalability, high efficiency, and low encoding complexity. For each image, a compressed thumbnail is first transmitted after forward error correction (FEC) and modulation to retrieve similar images and generate side information (SI) in the cloud. The residual image after subtracting the decompressed thumbnail is then coded and transmitted by CS through a very dense constellation without FEC. The linearly and ratelessly generated CS measurements make it capable of achieving both graceful quality degradation (GD) with the channel SNR and bandwidth scalability in a universal scheme. A mode decision and transform-domain power allocation are introduced for better bandwidth usage and protection against channel errors. At the decoder, a two-step CS decoding is performed to recover the residual signal, where both the local and nonlocal correlations within the image and the correlation with the SI are exploited. Simulations on landmark images and an AWGN channel show that the received image quality gracefully increases with the channel SNR and bandwidth. Furthermore, it outperforms existing schemes both subjectively and objectively by up to 11 dB gains compared with the state-of-the-art transmission scheme with GD, i.e., SoftCast. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
22. Hash-Based Line-by-Line Template Matching for Lossless Screen Image Coding.
- Author
-
Peng, Xiulian and Xu, Jizheng
- Subjects
TEMPLATE matching (Digital image processing), MATCHING theory, PATTERN recognition systems, IMAGE transmission, DATA transmission systems - Abstract
Template matching (TM) was proposed in the literature a decade ago to efficiently remove non-local redundancies within an image without transmitting any displacement-vector overhead. However, the large computational complexity introduced at both the encoder and the decoder, especially for a large search range, limits its widespread use. This paper proposes hash-based line-by-line template matching (hLTM) for lossless screen image coding, where non-local redundancy commonly exists in the text and graphics parts. By hash-based search, it largely reduces the search complexity of template matching without accuracy degradation. In addition, line-by-line template matching increases prediction accuracy through its fine granularity. Experimental results show that hLTM can significantly reduce the encoding and decoding complexities by 68 and 23 times, respectively, compared with traditional TM with a search radius of 128. Moreover, when compared with the High Efficiency Video Coding screen content coding test model SCM-1.0, it can improve coding efficiency by up to 12.68% bit saving on screen content with rich text/graphics. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
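The hash-based search the abstract describes can be illustrated with a minimal sketch (hypothetical helper names, a 3-pixel template from the previous line, and a toy hash; the paper's actual template definition and hash function differ): indexing previously coded positions by a template hash replaces exhaustive scanning of the search range with a constant-time lookup.

```python
# Sketch of hash-based candidate lookup for template matching (hLTM idea).
# Assumption for the sketch: a "template" is a short run of pixels from the
# previously coded row; positions sharing the template's hash are found in
# a dict instead of scanning the whole search range.

def template_hash(template):
    """Cheap polynomial hash of a pixel template (sequence of ints)."""
    h = 0
    for p in template:
        h = (h * 31 + p) & 0xFFFFFFFF
    return h

def build_hash_table(prev_row, tlen=3):
    """Index every position of the previously coded row by template hash."""
    table = {}
    for x in range(len(prev_row) - tlen + 1):
        h = template_hash(prev_row[x:x + tlen])
        table.setdefault(h, []).append(x)
    return table

def find_candidates(table, template):
    """Return candidate positions whose template matches the hash."""
    return table.get(template_hash(template), [])

prev_row = [10, 20, 30, 10, 20, 30, 40]
table = build_hash_table(prev_row)
# The template [10, 20, 30] occurs at positions 0 and 3 in the previous row.
print(find_candidates(table, [10, 20, 30]))  # [0, 3]
```

Only positions in the matching hash bucket need a full comparison, which is where the reported complexity reduction comes from.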
23. Overview of the Emerging HEVC Screen Content Coding Extension.
- Author
-
Xu, Jizheng, Joshi, Rajan, and Cohen, Robert A.
- Subjects
- *
VIDEO coding , *BIT rate , *VIDEO codecs , *STANDARDIZATION - Abstract
A screen content coding (SCC) extension to High Efficiency Video Coding (HEVC) is currently under development by the Joint Collaborative Team on Video Coding, which is a joint effort from the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goal of the HEVC-SCC standardization effort is to enable significantly improved compression performance for videos containing a substantial amount of still or moving rendered graphics, text, and animation rather than, or in addition to, camera-captured content. This paper provides an overview of the technical features and characteristics of the current HEVC-SCC test model and related coding tools, including intra-block copy, palette mode, adaptive color transform, and adaptive motion vector resolution. The performance of the SCC extension is compared against existing standards in terms of bitrate savings at equal distortion. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
24. Overview of the Range Extensions for the HEVC Standard: Tools, Profiles, and Performance.
- Author
-
Flynn, David, Marpe, Detlev, Naccari, Matteo, Nguyen, Tung, Rosewarne, Chris, Sharman, Karl, Sole, Joel, and Xu, Jizheng
- Subjects
MPEG (Video coding standard) ,VIDEO compression ,BIT rate ,RANDOM access memory ,VIDEO coding - Abstract
The Range Extensions (RExt) of the High Efficiency Video Coding (HEVC) standard have recently been approved by both ITU-T and ISO/IEC. This set of extensions targets video coding applications in areas including content acquisition, postproduction, contribution, distribution, archiving, medical imaging, still imaging, and screen content. In addition to the functionality of HEVC Version 1, RExt provide support for monochrome, 4:2:2, and 4:4:4 chroma sampling formats as well as increased sample bit depths beyond 10 bits per sample. This extended functionality includes new coding tools with a view to providing additional coding efficiency, greater flexibility, and higher throughput at high bit depths/rates. Improved lossless, near-lossless, and very high bit-rate coding is also part of the RExt scope. This paper presents the technical aspects of HEVC RExt, including a discussion of RExt profiles, tools, and applications, and provides experimental results for a performance comparison with previous relevant coding technology. When compared with the High 4:4:4 Predictive Profile of H.264/Advanced Video Coding (AVC), the corresponding HEVC 4:4:4 RExt profile provides up to approximately 25%, 32%, and 36% average bit-rate reduction at the same PSNR quality level for intra, random access, and low delay configurations, respectively. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
25. Cloud-Based Distributed Image Coding.
- Author
-
Song, Xiaodan, Peng, Xiulian, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Subjects
IMAGE compression ,CLOUD computing ,DISTRIBUTED computing ,DIGITAL image correlation ,IMAGE reconstruction ,MULTIMEDIA systems - Abstract
With multimedia flourishing on the Web, it is easy to find similar images for a query, especially landmark images. Traditional image coding, such as JPEG, cannot exploit correlations with external images. Existing vision-based approaches are able to exploit such correlations by reconstructing from local descriptors but cannot ensure the pixel-level fidelity of the reconstruction. In this paper, a cloud-based distributed image coding (Cloud-DIC) scheme is proposed to exploit external correlations for mobile photo uploading. For each input image, a thumbnail is transmitted to retrieve correlated images and reconstruct it in the cloud by geometric and illumination registration. This reconstruction serves as the side information (SI) in Cloud-DIC. The image is then compressed by transform-domain syndrome coding to correct the disparity between the original image and the SI. Once a bitplane is received in the cloud, an iterative refinement process is performed between the final reconstruction and the SI. Moreover, a joint encoder/decoder mode decision at the block, frequency, and bitplane levels is proposed to adapt to different correlations. Experimental results on a landmark image database show that Cloud-DIC can largely enhance coding efficiency both subjectively and objectively, with up to 5 dB gain and 70% bit savings over JPEG with arithmetic coding, and performs comparably at low bitrates to the intra coding of the High Efficiency Video Coding standard with much lower encoder complexity. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
26. HEVC Encoding Optimization Using Multicore CPUs and GPUs.
- Author
-
Xiao, Wei, Li, Bin, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Subjects
DECODERS & decoding ,ENCODING ,VIDEO coding ,VIDEO codecs ,GRAPHICS processing units - Abstract
Although the High Efficiency Video Coding (HEVC) standard significantly improves the coding efficiency of video compression, it is unacceptable even in offline applications to spend several hours compressing 10 s of high-definition video. In this paper, we propose using a multicore central processing unit (CPU) and an off-the-shelf graphics processing unit (GPU) with 3072 streaming processors (SPs) for fast HEVC encoding, so that the speed optimization does not result in a loss of coding efficiency. There are two key technical contributions in this paper. First, we propose an algorithm for the GPU that is both parallel and fast, which can utilize the 3072 SPs in parallel to estimate the motion vector (MV) of every prediction unit (PU) in every combination of coding unit (CU) and PU partitions. Furthermore, the proposed GPU algorithm avoids the coding efficiency loss caused by the lack of a MV predictor (MVP). Second, we propose a fast algorithm for the CPU, which fully utilizes the results from the GPU to significantly reduce the number of possible CU and PU partitions without any coding efficiency loss. Our experimental results show that, compared with the reference software, we can encode high-resolution video using only 1.9% of the CPU time and 1.0% of the GPU time, with only a 1.4% rate increase. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
27. Hash-Based Block Matching for Screen Content Coding.
- Author
-
Zhu, Weijia, Ding, Wenpeng, Xu, Jizheng, Shi, Yunhui, and Yin, Baocai
- Abstract
Given the increasing importance of screen content, the High Efficiency Video Coding (HEVC) standard includes screen content coding among its requirements. In this paper, we demonstrate that enabling frame-level block searching in HEVC can significantly improve coding efficiency on screen content. We propose a hash-based block matching scheme for the intra block copy mode and the motion estimation process, which enables frame-level block searching in HEVC without changing the HEVC syntax. In the proposed scheme, blocks sharing the same hash values as the current block are selected as prediction candidates. Hash-based block selection is then employed to select the best candidates. To achieve the best coding efficiency, rate-distortion optimization is further employed to improve the proposed scheme by balancing the coding cost of motion vectors against the prediction difference. Compared with HEVC, the proposed scheme achieves 21% and 37% bitrate savings under all-intra and low-delay configurations, respectively, while also reducing encoding time. Up to 59% bitrate saving can be achieved on sequences with large motion. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
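The frame-level hash search described above can be sketched as follows (a toy simplification with an assumed 4×4 block size and Python's built-in hash; the paper's hash design and candidate ranking are more elaborate): every block position in the frame is indexed by a hash of its pixels, and candidates sharing the current block's hash are ranked by a simple SAD cost.

```python
# Sketch of frame-level hash-based block matching: index every NxN block
# of the frame by a pixel hash, then rank same-hash candidates by SAD.

import numpy as np

N = 4  # block size (assumption for the sketch)

def block_hash(block):
    return hash(block.tobytes())

def index_frame(frame):
    """Map each block hash to the list of positions carrying it."""
    table = {}
    h, w = frame.shape
    for y in range(h - N + 1):
        for x in range(w - N + 1):
            table.setdefault(block_hash(frame[y:y+N, x:x+N]), []).append((y, x))
    return table

def best_match(table, frame, cur_block):
    """Among same-hash candidates, return the position with minimal SAD."""
    best, best_cost = None, float("inf")
    for (y, x) in table.get(block_hash(cur_block), []):
        cost = int(np.abs(frame[y:y+N, x:x+N].astype(int)
                          - cur_block.astype(int)).sum())
        if cost < best_cost:
            best, best_cost = (y, x), cost
    return best

frame = np.zeros((8, 8), dtype=np.uint8)
frame[2:6, 2:6] = np.arange(16, dtype=np.uint8).reshape(4, 4)
cur = frame[2:6, 2:6].copy()
table = index_frame(frame)
print(best_match(table, frame, cur))  # (2, 2)
```

The hash lookup prunes the full-frame search to a handful of candidates, which is what makes whole-frame intra block copy search tractable.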
28. Efficient Parallel Framework for HEVC Motion Estimation on Many-Core Processors.
- Author
-
Yan, Chenggang, Zhang, Yongdong, Xu, Jizheng, Dai, Feng, Zhang, Jun, Dai, Qionghai, and Wu, Feng
- Subjects
MOTION detectors ,FORCE & energy ,ALGORITHMS ,NUMERICAL analysis ,KINEMATICS - Abstract
High Efficiency Video Coding (HEVC) achieves higher coding efficiency than previous video coding standards at the cost of increased encoding complexity. The complexity increase of the motion estimation (ME) procedure is particularly significant, especially considering the complicated partitioning structure of HEVC. Fully exploiting the coding efficiency brought by HEVC requires a huge amount of computation. In this paper, we analyze the ME structure in HEVC and propose a parallel framework to decouple ME for different partitions on many-core processors. Based on the local parallel method (LPM), we first use a directed acyclic graph (DAG)-based order to parallelize coding tree units (CTUs) and adopt an improved LPM (ILPM) within each CTU (DAGILPM), which exploits CTU-level and prediction unit (PU)-level parallelism. We then find that there exist completely independent PUs (CIPUs) and partially independent PUs (PIPUs). When the degree of parallelism (DP) is smaller than the maximum DP of DAGILPM, we process the CIPUs and PIPUs, which further increases the DP. The data dependencies and coding efficiency stay the same as LPM. Experiments show that on a 64-core system, compared with serial execution, our proposed scheme achieves more than 30 and 40 times speedup for 1920×1080 and 2560×1600 video sequences, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
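The DAG-based CTU ordering can be illustrated with a toy wavefront model (an assumption for the sketch: each CTU depends on its left and top neighbors, so CTUs on the same anti-diagonal have all dependencies met and form one parallel "wave"; the wave size is the degree of parallelism):

```python
# Sketch of DAG-ordered parallel CTU processing. Assumed dependency model:
# CTU (r, c) depends on (r, c-1) and (r-1, c), so its DAG depth is r + c.
# All CTUs at the same depth can be processed concurrently.

def wavefront_waves(rows, cols):
    """Group CTU coordinates into waves processable in parallel."""
    waves = {}
    for r in range(rows):
        for c in range(cols):
            waves.setdefault(r + c, []).append((r, c))  # depth in the DAG
    return [waves[d] for d in sorted(waves)]

waves = wavefront_waves(3, 4)
print(len(waves))                  # 6 sequential waves for a 3x4 CTU grid
print(max(len(w) for w in waves))  # maximum degree of parallelism: 3
```

When the wave size falls below the core count, the paper's CIPU/PIPU refinement adds further PU-level work to each wave; the sketch only shows the baseline CTU-level schedule.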
29. Screen Content Coding Based on HEVC Framework.
- Author
-
Zhu, Weijia, Ding, Wenpeng, Xu, Jizheng, Shi, Yunhui, and Yin, Baocai
- Abstract
Screen content, such as cartoons, captures of typical computer screens, and video with text overlays or news tickers, is an important category of video that needs new techniques beyond existing video coding techniques. In this paper, we analyze the characteristics of screen content and the coding efficiency of HEVC on screen content. We propose a new coding scheme that adopts a non-transform representation, separating screen content into a color component and a structure component. Based on the proposed representation, two coding modes are designed for screen content to exploit the directional correlation and non-translational changes in screen video sequences. The proposed scheme is then seamlessly incorporated into the HEVC structure and implemented in the HEVC range extension reference software HM9.0. Experimental results show that the proposed scheme achieves up to 52.6% bitrate saving compared with HM9.0. On average, 35.1%, 29.2%, and 23.6% bitrate savings are achieved with intra, random-access, and low-delay configurations, respectively. The visual quality of the decoded video sequences is also significantly improved, reducing ringing artifacts around sharp edges and preserving the shape of text without blur. [ABSTRACT FROM PUBLISHER]
- Published
- 2014
- Full Text
- View/download PDF
30. Rate-Distortion Optimized Reference Picture Management for High Efficiency Video Coding.
- Author
-
Li, Houqiang, Li, Bin, and Xu, Jizheng
- Subjects
VIDEO coding ,RATE distortion theory ,IMAGE processing ,IMAGE stabilization ,MATHEMATICAL optimization ,SEARCH algorithms - Abstract
Motion compensation with multiple reference pictures has been widely used during the development of the emerging High Efficiency Video Coding (HEVC) standard, which greatly helps to improve the coding efficiency. Usually, a heuristic strategy is exploited to use the nearest reconstructed pictures as references. However, such a strategy may not be efficient on all occasions, especially when different content characteristics and coding settings are considered. In this paper, we investigate how to manage reference pictures so as to achieve better rate-distortion performance under the memory constraint of the decoded picture buffer at the decoder. We formulate the reference picture management as an optimization problem and approximate its optimal solution. Moreover, we explore how to adjust quality for each picture according to the reference structure to further improve coding efficiency. For some coding cases, where a complicated encoder optimization is unaffordable, we also develop fast algorithms to get the most benefit from reference picture selection. Among them, one strategy has been adopted by the HEVC software and common test conditions to generate the anchor. Experimental results show that the proposed full search algorithm and fast search algorithms achieve significant bitrate reduction. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
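The rate-distortion selection underlying the reference picture management can be sketched with toy numbers (hypothetical candidate sets and costs; the paper's search space, buffer constraint, and cost model are far richer): each candidate reference configuration is scored by the Lagrangian cost J = D + λR, and the minimizer is chosen.

```python
# Sketch of RD-optimized selection among candidate reference-picture sets.
# Toy numbers only: each candidate is (name, distortion, rate).

def rd_cost(distortion, rate, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate

def select_reference_set(candidates, lam):
    """Pick the candidate set minimizing the RD cost."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))[0]

# Hypothetical candidates: the 'nearest pictures' heuristic vs. alternatives.
candidates = [
    ("nearest",   100.0, 40.0),
    ("long-term",  90.0, 48.0),
    ("mixed",      95.0, 42.0),
]
print(select_reference_set(candidates, lam=0.5))   # 'long-term' (J = 114)
print(select_reference_set(candidates, lam=10.0))  # 'nearest'   (rate dominates)
```

The example shows why the nearest-picture heuristic is not always optimal: which set wins depends on λ, i.e. on the operating point, which is the motivation for the paper's optimized and fast search strategies.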
31. Visually Summarizing Web Pages Through Internal and External Images.
- Author
-
Jiao, Binxing, Yang, Linjun, Xu, Jizheng, Tian, Qi, and Wu, Feng
- Abstract
Visually summarizing web pages is an attractive approach that provides users an effective and friendly interface to identify desired content at first glance for search and re-finding tasks. Using dominant images in web pages is generally reliable for this purpose. However, dominant images are often unavailable in many web pages. To solve this problem, we first propose a new approach to summarize web pages without any dominant images by retrieving relevant external images from the Internet. However, relevant external images are sometimes unreliable. To take advantage of both kinds of images, we further propose a clustering-based algorithm to select the best summarization among all internal and external images. This algorithm leverages the relevance and dominance of images as prior information. Experimental results show that our approach achieves 0.098 and 0.082 NDCG gains on a human-labeled data set, compared with relevant external images and dominant images, respectively. Our user study also indicates that the images selected by our algorithm are useful as summarizations of web pages. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
32. Highly Parallel Line-Based Image Coding for Many Cores.
- Author
-
Peng, Xiulian, Xu, Jizheng, Zhou, You, and Wu, Feng
- Subjects
- *
IMAGE compression , *ALGORITHMS , *IMAGE segmentation , *PARALLEL processing , *IMAGE processing , *BIT rate , *IMAGE quality analysis - Abstract
Computers are developing along a new trend from dual-core and quad-core processors to ones with tens or even hundreds of cores. Multimedia, as one of the most important applications on computers, has an urgent need for parallel coding algorithms for compression. Taking intraframe/image coding as a starting point, this paper proposes a pure line-by-line coding scheme (LBLC) to meet this need. In LBLC, an input image is processed line by line sequentially, and each line is divided into small fixed-length segments. The compression of all segments, from prediction to entropy coding, is completely independent and concurrent across many cores. Results on a general-purpose computer show that our scheme achieves a 13.9 times speedup with 15 cores at the encoder and a 10.3 times speedup at the decoder. Ideally, this near-linear speedup with the number of cores can be maintained for more than 100 cores. In addition to the high parallelism, the proposed scheme performs comparably to, or even better than, the H.264 High Profile above medium bit rates. At near-lossless coding, it outperforms H.264 by more than 10 dB. At lossless coding, up to 14% bit-rate reduction is observed compared with H.264 lossless coding in the High 4:4:4 Profile. [ABSTRACT FROM PUBLISHER]
- Published
- 2012
- Full Text
- View/download PDF
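The segment-level independence that enables the parallelism can be sketched as follows (a trivial delta "compressor" stands in for the paper's real prediction and entropy coding, and the segment length of 4 is an assumption): because no data crosses segment boundaries, every segment can be compressed concurrently.

```python
# Sketch of line-by-line, fixed-length-segment parallel compression (LBLC
# idea). Each line is split into independent segments; a toy delta coder
# stands in for the real prediction + entropy coding.

from concurrent.futures import ThreadPoolExecutor

SEG = 4  # fixed segment length (assumption for the sketch)

def compress_segment(seg):
    # Delta-code within the segment only; no data is shared across
    # segment boundaries, which is what enables the parallelism.
    return [seg[0]] + [seg[i] - seg[i - 1] for i in range(1, len(seg))]

def compress_line(line):
    segments = [line[i:i + SEG] for i in range(0, len(line), SEG)]
    with ThreadPoolExecutor() as ex:
        return list(ex.map(compress_segment, segments))

line = [10, 11, 12, 12, 50, 50, 51, 53]
print(compress_line(line))  # [[10, 1, 1, 0], [50, 0, 1, 2]]
```

The trade-off the paper navigates is visible even here: each segment restarts its prediction (the raw first sample), which is the price paid for full concurrency.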
33. Exploiting Non-Local Correlation via Signal-Dependent Transform (SDT).
- Author
-
Lan, Cuiling, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Abstract
Over the past few decades, many studies on image and video compression have found various approaches to exploiting spatial and temporal local correlations. However, we believe it is imperative to find more efficient methods to advance image and video compression. In this paper, we first study spatial non-local correlation, deducing that strong correlations exist in non-local regions. However, it is rather difficult to make use of these non-local correlations while simultaneously minimizing overhead. To solve this problem, we propose the signal-dependent transform (SDT), which is derived from decoded non-local blocks that are selected by matching neighboring pixels. Since the encoder and decoder can use the same method to derive the proposed transform, overhead is successfully eliminated. Finally, we have implemented the proposed transform in the Key Technology Area (KTA) software to exploit both spatial and temporal non-local correlations. The experimental results show that the coding gain over KTA can be as high as 1.4 dB in intra-frame coding, and up to 1.0 dB in inter-frame coding. We believe this offers an effective alternative method to improve image and video compression. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
- Full Text
- View/download PDF
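The key idea of deriving a transform from decoded data, so that no transform needs signaling, can be sketched as follows (an assumed simplification: candidate blocks are already selected, and the transform is the KLT of their sample covariance; the paper's selection by neighboring-pixel matching and its integration into KTA are not shown):

```python
# Sketch of a signal-dependent transform derived from decoded blocks:
# estimate the covariance of candidate blocks and use its eigenvectors
# as a KLT-like basis. Encoder and decoder repeat the same steps on
# decoded data, so the transform itself costs no overhead bits.

import numpy as np

def derive_sdt(candidate_blocks):
    """candidate_blocks: (n, d) array of flattened decoded blocks."""
    X = candidate_blocks - candidate_blocks.mean(axis=0)
    cov = X.T @ X / max(len(X) - 1, 1)
    _, vecs = np.linalg.eigh(cov)   # orthonormal eigenvectors, ascending
    return vecs[:, ::-1]            # reorder to descending eigenvalue

rng = np.random.default_rng(0)
blocks = rng.standard_normal((32, 4))   # stand-in for decoded blocks
T = derive_sdt(blocks)
# The derived basis is orthonormal: T.T @ T is the identity.
print(np.allclose(T.T @ T, np.eye(4)))  # True
```

Orthonormality matters here: it makes the transform perfectly invertible at the decoder, so the only requirement is that both sides select the same candidate blocks.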
34. Directional Filtering Transform for Image/Intra-Frame Compression.
- Author
-
Peng, Xiulian, Xu, Jizheng, and Wu, Feng
- Abstract
When directional adaptation is introduced into traditional transforms, different orders of the two 1-D transforms result in different 2-D transforms. Based upon an anisotropic image model, this paper analyzes the effect of transform order in terms of theoretical coding gain. Our results reveal that the transform order has little effect on the coding gain given full decomposition, good directional modes, and good interpolation. However, in practical compression schemes, since high-pass bands are not fully decomposed due to complexity considerations, different transform orders yield different coding performance, which can be addressed by an adaptive transform order. Motivated by these results, a directional filtering transform (dFT, to distinguish it from the common abbreviation DFT) is proposed in this paper to better exploit correlations among samples in H.264 intraframe coding. It provides an evenly distributed set of prediction modes with an adaptive transform order. Both interblock and intrablock correlations are exploited in this scheme. Experimental results in H.264 intraframe coding demonstrate its superiority both objectively and subjectively. [ABSTRACT FROM PUBLISHER]
- Published
- 2010
- Full Text
- View/download PDF
35. Power Distortion Optimization for Uncoded Linear Transformed Transmission of Images and Videos.
- Author
-
Xiong R, Zhang J, Wu F, Xu J, and Gao W
- Abstract
Recently, there has been a resurgence of interest in uncoded transmission for wireless visual communication. While conventional coded systems suffer from the cliff effect as the channel condition varies dynamically, uncoded linear-transformed transmission (ULT) provides graceful quality degradation over a wide channel SNR range. ULT skips non-linear operations, such as quantization and entropy coding. Instead, it utilizes a linear decorrelation transform and linear-scaling power allocation to achieve optimized transmission. This paper presents a theoretical analysis of the power-distortion optimization of ULT. In addition to the observation in our previous work that a decorrelation transform can bring significant performance gain, this paper reveals that exploiting the energy diversity in the transformed signal is the key to achieving the full potential of the decorrelation transform. In particular, we investigate the efficiency of ULT with exact or inexact signal statistics, highlighting the impact of signal energy modeling accuracy. Based on that, we further propose two practical energy modeling schemes for ULT of visual signals. Experimental results show that the proposed schemes improve the quality of reconstructed images by 3~5 dB, while reducing the signal modeling overhead from hundreds or thousands of metadata elements to only a few. The perceptual quality of reconstruction is significantly improved.
- Published
- 2017
- Full Text
- View/download PDF
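The linear-scaling power allocation at the heart of ULT can be sketched numerically (a SoftCast-style allocation, used here as a simplified stand-in for the paper's analysis): for coefficient bands with energies e_i under a total power budget P, the distortion-minimizing scale is proportional to e_i^(-1/4), so the power spent on a band grows as sqrt(e_i).

```python
# Sketch of linear-scaling power allocation for uncoded transmission:
# scale band i by g_i ~ e_i**(-1/4), normalized so total power equals P.

import math

def power_allocation(energies, P):
    """Return per-band scales g_i with sum(g_i**2 * e_i) == P."""
    s = sum(math.sqrt(e) for e in energies)
    return [math.sqrt(P / s) * e ** (-0.25) for e in energies]

energies = [100.0, 25.0, 4.0]          # hypothetical band energies
g = power_allocation(energies, P=1.0)
# Power spent per band is proportional to sqrt(e_i): 10/17, 5/17, 2/17.
powers = [gi**2 * ei for gi, ei in zip(g, energies)]
print([round(p, 4) for p in powers])   # [0.5882, 0.2941, 0.1176]
```

This is exactly the "energy diversity" the abstract refers to: the allocation is only as good as the energy estimates e_i, which motivates the paper's practical energy modeling schemes.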
36. LineCast: line-based distributed coding and transmission for broadcasting satellite images.
- Author
-
Wu F, Peng X, and Xu J
- Subjects
- Numerical Analysis, Computer-Assisted, Reproducibility of Results, Sensitivity and Specificity, Algorithms, Data Compression methods, Image Enhancement methods, Image Interpretation, Computer-Assisted methods, Satellite Imagery methods, Signal Processing, Computer-Assisted
- Abstract
In this paper, we propose a novel coding and transmission scheme, called LineCast, for broadcasting satellite images to a large number of receivers. The proposed LineCast matches well with the line-scanning cameras that are widely adopted in orbiting satellites to capture high-resolution images. On the sender side, each captured line is immediately compressed by transform-domain scalar modulo quantization. Without syndrome coding, transmission power is directly allocated to the quantized coefficients by scaling the coefficients according to their distributions. Finally, the scaled coefficients are transmitted over a dense constellation. This line-based distributed scheme features low delay, low memory cost, and low complexity. On the receiver side, our proposed line-based prediction is used to generate side information from previously decoded lines, which fully utilizes the correlation among lines. The quantized coefficients are decoded by a linear least-squares estimator from the received data. The image line is then reconstructed by scalar modulo dequantization using the generated side information. Since there is neither syndrome coding nor channel coding, the proposed LineCast enables a large number of receivers to reach qualities matching their channel conditions. Our theoretical analysis shows that the proposed LineCast can achieve Shannon's optimum performance by using high-dimensional modulo-lattice quantization. Experiments on satellite images demonstrate that it achieves up to 1.9-dB gain over the state-of-the-art 2D broadcasting scheme and a gain of more than 5 dB over JPEG 2000 with forward error correction.
- Published
- 2014
- Full Text
- View/download PDF
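The scalar modulo quantization and its side-information-aided reconstruction can be sketched in a toy integer version (an assumed simplification: real LineCast operates on scaled transform coefficients, and decoding uses a linear least-squares estimator rather than exact integers): the sender transmits only the coefficient modulo a step, and the receiver picks the value in that modulo class nearest to its prediction.

```python
# Toy sketch of scalar modulo quantization with side information:
# transmit x mod step; recover x by snapping to the prediction.

def modulo_encode(x, step):
    return x % step

def modulo_decode(m, si, step):
    # Choose k so that m + k*step is closest to the side information si.
    k = round((si - m) / step)
    return m + k * step

x, si, step = 123, 120, 16
m = modulo_encode(x, step)         # only m (< step) is transmitted
print(modulo_decode(m, si, step))  # 123, recovered since |x - si| < step/2
```

Decoding succeeds whenever the prediction error stays within half a step, which is why the quality each receiver attains scales with how good its side information (and channel) is, without any channel coding.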
37. Directional lapped transforms for image coding.
- Author
-
Xu J, Wu F, Liang J, and Zhang W
- Abstract
In this paper, we present the design of directional lapped transforms for image coding. A lapped transform, which can be implemented by a prefilter followed by a discrete cosine transform (DCT), can be factorized into elementary operators. The corresponding directional lapped transform is generated by applying each elementary operator along a given direction. The proposed directional lapped transforms are not only nonredundant and perfectly reconstructing, but can also provide a basis along an arbitrary direction. These properties, along with the advantages of lapped transforms, make the proposed transforms appealing for image coding. A block-based directional transform scheme is also presented and integrated into HD Photo, one of the state-of-the-art image coding systems, to verify the effectiveness of the proposed transforms.
- Published
- 2010
- Full Text
- View/download PDF