583 results for "Xu, Jizheng"
Search Results
2. A cross-resolution leaky prediction scheme for in-band wavelet video coding with spatial scalability
- Author
- Zhang, Dongdong, Zhang, Wenjun, Xu, Jizheng, Wu, Feng, and Xiong, Hongkai
- Subjects
Image coding -- Technology application, Wavelet transforms -- Evaluation, Scalability -- Evaluation, Circuit design -- Evaluation, Circuit designer, Integrated circuit design, Technology application, Business, Computers, Electronics, Electronics and electrical industries
- Abstract
In most existing in-band wavelet video coding schemes, over-complete wavelet transform is used for the motion-compensated temporal filtering (MCTF) of each spatial subband. It can overcome the shift-variance of critical sampling wavelet transform and improve the coding efficiency of the in-band scheme. However, a dilemma exists in the current implementations of in-band MCTF (IBMCTF), which is whether or not to exploit the spatial highpass subbands in motion compensation of the spatial lowpass subband. The absence of the spatial highpass subbands will result in significant quality loss in the reconstructed full-resolution video, whereas the presence of the spatial highpass subbands may bring serious mismatch error in the decoded low-resolution video since the corresponding highpass subbands may be unavailable at the decoder. In this paper, we first analyze the mismatch error propagation in decoding the low-resolution video. Based on our analysis, we then propose a frame-based cross-resolution leaky prediction scheme for IBMCTF. It can make a good tradeoff between alleviating the low-resolution mismatch and improving the full-resolution coding efficiency. Experimental results show that the proposed scheme can dramatically reduce the mismatch error by 0.3-2.5 dB for low resolution, while the performance loss is marginal for high resolution. Index Terms--Cross-resolution leaky prediction, in-band motion-compensated temporal filtering (MCTF), mismatch error analysis, spatial scalability, wavelet video coding.
- Published
- 2008
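The leaky prediction idea summarized in this abstract can be sketched as blending the lowpass-only prediction with a fraction of the cross-resolution contribution derived from the highpass subbands. The code below is an illustrative toy under hypothetical values, not the paper's implementation; the leak factor alpha trades full-resolution efficiency against low-resolution mismatch.

```python
import numpy as np

def leaky_prediction(pred_low, pred_cross, alpha=0.5):
    """Blend two predictions of the lowpass subband.

    pred_low:   prediction using only the spatial lowpass subband
                (no mismatch at the low-resolution decoder)
    pred_cross: prediction additionally using highpass subbands
                (better full-resolution quality, but drifts when the
                 highpass subbands are absent at the decoder)
    alpha = 0 keeps only pred_low; alpha = 1 keeps only pred_cross.
    """
    return pred_low + alpha * (pred_cross - pred_low)

# Toy 2x2 subband predictions (hypothetical values).
pred_low = np.array([[10.0, 12.0], [14.0, 16.0]])
pred_cross = np.array([[11.0, 14.0], [13.0, 18.0]])
blended = leaky_prediction(pred_low, pred_cross, alpha=0.5)
```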
3. In-scale motion compensation for spatially scalable video coding
- Author
- Xiong, Ruiqin, Xu, Jizheng, and Wu, Feng
- Subjects
Scalability -- Methods, Image coding -- Methods, Business, Computers, Electronics, Electronics and electrical industries
- Abstract
In existing pyramid-based spatially scalable coding schemes, such as H.264/MPEG-4 SVC (scalable video coding), video frame at a certain high-resolution layer is mainly predicted either from the same frame at the next lower resolution layer, or from the temporal neighboring frames within the same resolution layer. But these schemes fail to exploit both kinds of correlation simultaneously and therefore cannot remove the redundancies among resolution layers efficiently. This paper extends the idea of spatiotemporal subband transform and proposes a general in-scale motion compensation technique for pyramid-based spatially scalable video coding. Video frame at each high-resolution layer is partitioned into two parts in frequency. Prediction for the lowpass part is derived from the next lower resolution layer, whereas prediction for the highpass part is obtained from neighboring frames within the same resolution layer, to further utilize temporal correlation. In this way, both kinds of correlation are exploited simultaneously and the cross-resolution-layer redundancy can be highly removed. Furthermore, this paper also proposes a macroblock-based adaptive in-scale technique for hybrid spatial and SNR scalability. Experimental results show that the proposed techniques can significantly improve the spatial scalability performance of H.264/MPEG-4 SVC, especially when the bit-rate ratio of lower resolution bit stream to higher resolution bit stream is considerable. Index Terms--H.264/MPEG-4 SVC, in-scale motion compensation, inter-layer prediction, scalable video coding, spatial scalability.
- Published
- 2008
4. Lifting-based directional DCT-like transform for image coding
- Author
- Xu, Hao, Xu, Jizheng, and Wu, Feng
- Subjects
Image coding -- Methods, Transformations (Mathematics) -- Properties, Entropy (Information theory) -- Measurement, Business, Computers, Electronics, Electronics and electrical industries
- Abstract
Traditional 2-D discrete cosine transform (DCT), implemented by separable 1-D transforms in the horizontal and vertical directions, does not take image orientation features in a local window into account. To improve this, we propose introducing directional primary operations into the lifting-based DCT, thereby deriving a new directional DCT-like transform whose transform matrix depends on the directional angle and the interpolation used. Furthermore, the proposed transform is compared with the straightforward approach of first rotating and then transforming. A JPEG-wise image coding scheme is also proposed to evaluate the performance of the proposed directional DCT-like transform. The first 1-D transform is performed according to image orientation features, while the second 1-D transform is still performed in the horizontal or vertical direction. At the same time, an approach is proposed to optimally select the transform direction of each block, because the selected directions of neighboring blocks influence each other. The experimental results show that the proposed directional DCT-like transform can outperform the conventional DCT by up to 2 dB, even without modifying the entropy coding. Index Terms--Directional transform, discrete cosine transform (DCT), image coding, lifting structure.
- Published
- 2007
5. Subband coupling aware rate allocation for spatial scalability in 3-D wavelet video coding
- Author
- Xiong, Ruiqin, Xu, Jizheng, Wu, Feng, Li, Shipeng, and Zhang, Ya-Qin
- Subjects
Image coding -- Methods, Signal processing -- Methods, Electric filters -- Usage, Digital signal processor, Business, Computers, Electronics, Electronics and electrical industries
- Abstract
The motion compensated temporal filtering (MCTF) technique, which is extensively used in 3-D wavelet video coding schemes nowadays, leads to signal coupling among various spatial subbands because motion alignment is introduced in the temporal filtering. Using all spatial subbands as a reference enables MCTF to fully take advantage of temporal correlation across frames but inevitably brings drifting problem in supporting spatial scalability. This paper first analyzes the signal coupling phenomenon and then proposes a quantitative model to describe signal propagation across spatial subbands during the MCTF process. The signal propagation is modeled for a single MC step based on the shifting effect of wavelet synthesis filters and then it is extended to multilevel MCTF. This model is called subband coupling aware signal propagation (SCASP) model in this paper. Based on the model, we further propose a subband coupling aware rate allocation scheme as one possible solution to the above dilemma in supporting spatial scalability. To find the optimal rate allocation among all subbands for a specified reconstruction resolution, the SCASP model is used to approximate the reconstruction process and derive the synthesis gain of each subband with regard to that reconstruction. Experimental results have fully demonstrated the advantages of our proposed rate allocation scheme in improving both objective and subjective qualities of reconstructed low-resolution video, especially at middle bit rates and high bit rates. Index Terms--Motion compensated temporal filtering (MCTF), rate allocation, signal propagation model, spatial scalability, subband coupling, 3-D wavelet video coding.
- Published
- 2007
6. Barbell-lifting based 3-D wavelet coding scheme
- Author
- Xiong, Ruiqin, Xu, Jizheng, Wu, Feng, and Li, Shipeng
- Subjects
Image coding -- Methods, Wavelet transforms -- Usage, Business, Computers, Electronics, Electronics and electrical industries
- Abstract
This paper provides an overview of the Barbell lifting coding scheme that has been adopted as common software by the MPEG ad hoc group on further exploration of wavelet video coding. The core techniques used in this scheme, such as Barbell lifting, layered motion coding, 3-D entropy coding and base layer embedding, are discussed. The paper also analyzes and compares the proposed scheme with the forthcoming Scalable Video Coding (SVC) standard because the hierarchical temporal prediction technique used in SVC has a close relationship with motion-compensated temporal filtering (MCTF) in wavelet coding. The commonalities and differences between these two schemes are exhibited for readers to better understand modern scalable video coding technologies. Several challenges that still exist in scalable video coding, e.g., performance of spatial scalable coding and accurate MC lifting, are also discussed. Two new techniques are presented in this paper although they are not yet integrated into the common software. Finally, experimental results demonstrate the performance of the Barbell-lifting coding scheme and compare it with SVC and another well-known 3-D wavelet coding scheme, MC embedded zero block coding (MC-EZBC). Index Terms--Barbell lifting, lifting-based wavelet transform, Scalable Video Coding (SVC), 3-D wavelet video coding.
- Published
- 2007
7. 'Partial Pruning Method For Inter Prediction' in Patent Application Approval Process (USPTO 20210203922)
- Subjects
Desktop video -- Computer programs, Desktop video software, Government, Political science
- Abstract
2021 JUL 22 (VerticalNews) -- By a News Reporter-Staff News Editor at Politics & Government Week -- A patent application by the inventors LIU, Hongbin (Beijing, CN); WANG, Yue (Beijing, [...]
- Published
- 2021
8. 'Motion Vector Precision In Merge With Motion Vector Difference Mode' in Patent Application Approval Process (USPTO 20210203945)
- Subjects
Multimedia asset management software, Government, Political science
- Abstract
2021 JUL 22 (VerticalNews) -- By a News Reporter-Staff News Editor at Politics & Government Week -- A patent application by the inventors LIU, Hongbin (Beijing, CN); WANG, Yue (Beijing, [...]
- Published
- 2021
9. Overview of the Screen Content Support in VVC: Applications, Coding Tools, and Performance.
- Author
- Nguyen, Tung, Xu, Xiaozhong, Henry, Felix, Liao, Ru-Ling, Sarwer, Mohammed Golam, Karczewicz, Marta, Chao, Yung-Hsuan, Xu, Jizheng, Liu, Shan, Marpe, Detlev, and Sullivan, Gary J.
- Subjects
PULSE-code modulation, VIDEO coding, COMPUTER-generated imagery, CUSTOMER experience, PERSONAL computers
- Abstract
In an increasingly connected world, consumer video experiences have diversified away from traditional broadcast video into new applications with increased use of non-camera-captured content such as computer screen desktop recordings or animations created by computer rendering, collectively referred to as screen content. There has also been increased use of graphics and character content that is rendered and mixed or overlaid together with camera-generated content. The emerging Versatile Video Coding (VVC) standard, in its first version, addresses this market change by the specification of low-level coding tools suitable for screen content. This is in contrast to its predecessor, the High Efficiency Video Coding (HEVC) standard, where highly efficient screen content support is only available in extension profiles of its version 4. This paper describes the screen content support and the five main low-level screen content coding tools in VVC: transform skip residual coding (TSRC), block-based differential pulse-code modulation (BDPCM), intra block copy (IBC), adaptive color transform (ACT), and the palette mode. The specification of these coding tools in the first version of VVC enables the VVC reference software implementation (VTM) to achieve average bit-rate savings of about 41% to 61% relative to the HEVC test model (HM) reference software implementation using the Main 10 profile for 4:2:0 screen content test sequences. Compared to the HM using the Screen-Extended Main 10 profile and the same 4:2:0 test sequences, the VTM provides about 19% to 25% bit-rate savings. The same comparison with 4:4:4 test sequences revealed bit-rate savings of about 13% to 27% for $Y'C_{B}C_{R}$ and of about 6% to 14% for $R'G'B'$ screen content. Relative to the HM without the HEVC version 4 screen content coding extensions, the bit-rate savings for 4:4:4 test sequences are about 33% to 64% for $Y'C_{B}C_{R}$ and 43% to 66% for $R'G'B'$ screen content. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
10. Affinity Derivation and Graph Merge for Instance Segmentation
- Author
- Liu, Yiding, Yang, Siyu, Li, Bin, Zhou, Wengang, Xu, Jizheng, Li, Houqiang, and Lu, Yan
- Subjects
FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV)
- Abstract
We present an instance segmentation scheme based on pixel affinity information, i.e., the likelihood that two pixels belong to the same instance. In our scheme, we use two neural networks with similar structures: one predicts pixel-level semantic scores, and the other derives pixel affinities. Regarding pixels as vertexes and affinities as edges, we then propose a simple yet effective graph merge algorithm to cluster pixels into instances. Experimental results show that our scheme can generate fine-grained instance masks. With Cityscapes training data, the proposed scheme achieves 27.3 AP on the test set. Published in ECCV 2018.
- Published
- 2018
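The graph merge step described in the abstract above can be sketched with a union-find clustering: pixels are vertexes, predicted affinities are edge weights, and any pair whose affinity exceeds a threshold is merged into one instance. This is an illustrative toy, not the paper's actual algorithm; the pixel count, edge list, and threshold below are hypothetical.

```python
class DisjointSet:
    """Minimal union-find with path halving."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def graph_merge(num_pixels, edges, threshold=0.5):
    """Cluster pixels into instances: merge every pixel pair whose
    predicted affinity exceeds the threshold, then return compact
    per-pixel instance ids."""
    ds = DisjointSet(num_pixels)
    for a, b, affinity in edges:
        if affinity > threshold:
            ds.union(a, b)
    roots = [ds.find(i) for i in range(num_pixels)]
    remap = {}
    return [remap.setdefault(r, len(remap)) for r in roots]

# 4 pixels; pairs 0-1 and 2-3 have strong affinity, pair 1-2 weak.
edges = [(0, 1, 0.9), (1, 2, 0.1), (2, 3, 0.8)]
labels = graph_merge(4, edges)  # two instances: [0, 0, 1, 1]
```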
11. Weakly Supervised Bilinear Attention Network for Fine-Grained Visual Classification
- Author
- Hu, Tao, Xu, Jizheng, Huang, Cong, Qi, Honggang, Huang, Qingming, and Lu, Yan
- Subjects
FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV)
- Abstract
For fine-grained visual classification, objects usually share similar geometric structure but present variant local appearance and different pose. Therefore, localizing and extracting discriminative local features play a crucial role in accurate category prediction. Existing works either pay attention to limited object parts or train isolated networks for locating and classification. In this paper, we propose Weakly Supervised Bilinear Attention Network (WS-BAN) to solve these issues. It jointly generates a set of attention maps (region-of-interest maps) to indicate the locations of object's parts and extracts sequential part features by Bilinear Attention Pooling (BAP). Besides, we propose attention regularization and attention dropout to weakly supervise the generating process of attention maps. WS-BAN can be trained end-to-end and achieves the state-of-the-art performance on multiple fine-grained classification datasets, including CUB-200-2011, Stanford Car and FGVC-Aircraft, which demonstrated its effectiveness.
- Published
- 2018
12. Three-Dimensional Embedded Subband Coding with Optimized Truncation (3-D ESCOT)
- Author
- Xu, Jizheng, Xiong, Zixiang, Li, Shipeng, and Zhang, Ya-Qin
- Published
- 2001
- Full Text
13. Consistent Video Style Transfer via Relaxation and Regularization.
- Author
- Wang, Wenjing, Yang, Shuai, Xu, Jizheng, and Liu, Jiaying
- Subjects
OPTICAL flow, IMAGE color analysis, VIDEO surveillance
- Abstract
In recent years, neural style transfer has attracted more and more attention, especially for image style transfer. However, temporally consistent style transfer for videos is still a challenging problem. Existing methods, either relying on a significant amount of video data with optical flows or using single-frame regularizers, fail to handle strong motions or complex variations, therefore have limited performance on real videos. In this article, we address the problem by jointly considering the intrinsic properties of stylization and temporal consistency. We first identify the cause of the conflict between style transfer and temporal consistency, and propose to reconcile this contradiction by relaxing the objective function, so as to make the stylization loss term more robust to motions. Through relaxation, style transfer is more robust to inter-frame variation without degrading the subjective effect. Then, we provide a novel formulation and understanding of temporal consistency. Based on the formulation, we analyze the drawbacks of existing training strategies and derive a new regularization. We show by experiments that the proposed regularization can better balance the spatial and temporal performance. Based on relaxation and regularization, we design a zero-shot video style transfer framework. Moreover, for better feature migration, we introduce a new module to dynamically adjust inter-channel distributions. Quantitative and qualitative results demonstrate the superiority of our method over state-of-the-art style transfer methods. Our project is publicly available at: https://daooshee.github.io/ReReVST/. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
14. Interweaved Prediction for Video Coding.
- Author
- Zhang, Kai, Zhang, Li, Liu, Hongbin, Xu, Jizheng, Deng, Zhipin, and Wang, Yue
- Subjects
VIDEO compression, VIDEO coding, BLOCK codes, FORECASTING, STATISTICS
- Abstract
In the emerging next generation video coding standard Versatile Video Coding (VVC) developed by the Joint Video Exploration Team (JVET), sub-block-based inter-prediction plays a key role in promising coding tools such as Affine Motion Compensation (AMC) and sub-block-based Temporal Motion Vector Prediction (sbTMVP). With sub-block-based inter-prediction, a coding block is divided into sub-blocks, and the motion information of each sub-block is derived individually. Although sub-block-based inter-prediction can provide a higher quality prediction benefiting from a finer motion granularity, it still suffers two problems: uneven prediction quality and boundary discontinuity. In this paper, we present a method of interweaved prediction to further improve sub-block-based inter-prediction. With interweaved prediction, a coding block with AMC or sbTMVP mode is divided into sub-blocks with two different dividing patterns, so that a corner position of a sub-block in one dividing pattern coincides with the central position of a sub-block in the other dividing pattern. Then two auxiliary predictions are generated by AMC or sbTMVP with the two dividing patterns, independently. The final prediction is calculated as a weighted-sum of the two auxiliary predictions. Theoretical analysis and statistical data prove that interweaved prediction can significantly mitigate the two problems in sub-block-based inter-prediction. Simulation results show that the proposed methods can achieve 0.64% BD-rate saving on average with the random access configurations. On sequences with rich affine motions, the average BD-rate saving can be up to 2.54%. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
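The core operation of interweaved prediction, as described in the abstract above, is a weighted sum of two auxiliary predictions generated with offset sub-block dividing patterns. The sketch below is purely illustrative: in the actual proposal the weights are position-dependent (favouring the pattern whose sub-block centre is nearer to the sample), whereas a constant weight and hypothetical values are used here.

```python
import numpy as np

def interweaved_prediction(pred_a, pred_b, weight_a=0.5):
    """Final prediction as a weighted sum of two auxiliary predictions.

    pred_a: prediction from dividing pattern 1 (e.g., aligned sub-blocks)
    pred_b: prediction from pattern 2, offset by half a sub-block so its
            sub-block centres coincide with pattern 1's corners
    """
    return weight_a * pred_a + (1.0 - weight_a) * pred_b

pred_a = np.full((4, 4), 100.0)  # auxiliary prediction, pattern 1
pred_b = np.full((4, 4), 104.0)  # auxiliary prediction, offset pattern 2
final = interweaved_prediction(pred_a, pred_b)  # every sample becomes 102.0
```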
15. Direct Speech-to-Image Translation.
- Author
- Li, Jiguo, Zhang, Xinfeng, Jia, Chuanmin, Xu, Jizheng, Zhang, Li, Wang, Yue, Ma, Siwei, and Gao, Wen
- Abstract
Direct speech-to-image translation without text is an interesting and useful topic due to the potential applications in human-computer interaction, art creation, computer-aided design, etc., not to mention that many languages have no written form. However, as far as we know, it has not been well-studied how to translate the speech signals into images directly and how well they can be translated. In this paper, we attempt to translate the speech signals into the image signals without the transcription stage. Specifically, a speech encoder is designed to represent the input speech signals as an embedding feature, and it is trained with a pretrained image encoder using teacher-student learning to obtain better generalization ability on new classes. Subsequently, a stacked generative adversarial network is used to synthesize high-quality images conditioned on the embedding feature. Experimental results on both synthesized and real data show that our proposed method is effective to translate the raw speech signals into images without the middle text representation. Ablation study gives more insights about our method. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
16. Efficient Multiple Line-Based Intra Prediction for HEVC
- Author
- Xiong, Ruiqin, Li, Bin, Xu, Jizheng, and Li, Jiahao
- Subjects
FOS: Computer and information sciences, Multimedia (cs.MM), Computer vision, Algorithm design
- Abstract
Traditional intra prediction usually utilizes the nearest reference line to generate the predicted block, relying on strong spatial correlation. However, this kind of single line-based method does not always work well, for at least two reasons. One is the incoherence caused by signal noise or by the texture of another object, which deviates from the inherent texture of the current block. The other is that the nearest reference line usually has worse reconstruction quality in block-based video coding. To address these two issues, this paper proposes an efficient multiple line-based intra prediction scheme to improve coding efficiency. Besides the nearest reference line, further reference lines are also utilized. The further reference lines, with relatively higher quality, can provide potentially better prediction. At the same time, residue compensation is introduced to calibrate the prediction of boundary regions in a block when further reference lines are utilized. To speed up the encoding process, this paper designs several fast algorithms. Experimental results show that, compared with HM-16.9, the proposed fast search method achieves 2.0% bit saving on average and up to 3.7%, while increasing the encoding time by 112%. Accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology.
- Published
- 2016
17. Spherical Domain Rate-Distortion Optimization for Omnidirectional Video Coding.
- Author
- Li, Yiming, Xu, Jizheng, and Chen, Zhenzhong
- Subjects
VIDEO coding, VIRTUAL reality, RATE distortion theory, STREAMING technology, STREAMING video & television
- Abstract
Efficient compression of omnidirectional video is important for emerging virtual reality applications. To compress this kind of video, each frame is first projected to a 2D plane [e.g., an equirectangular projection (ERP) map], adapting to the input format of existing video coding systems. At the display side, an inverse projection is applied to the reconstructed video to restore signals in the spherical domain. Such a projection, however, means that presentation and encoding take place in different domains, so an encoder agnostic to the projection performs inefficiently. In this paper, we analyze how a projection influences the distortion measurements in different domains. Based on the analysis, we propose a scheme to optimize the encoding process based on signals' distortion in the spherical domain. With the proposed optimization, an average 4.31% (up to 9.67%) luma BD-rate reduction is achieved for ERP in the random access configuration. The corresponding bit saving is 10.84% on average (up to 34.44%) when considering a viewing field of $\pi /2$. The proposed method also benefits other projections and viewport settings, with a marginal complexity increase. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
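A common way to approximate spherical-domain distortion for an ERP frame, in the spirit of the analysis in the abstract above, is to weight each row's squared error by the sphere area that row covers, which is proportional to the cosine of its latitude (the WS-PSNR-style weighting). This is an illustrative sketch, not necessarily the paper's exact derivation.

```python
import math

def erp_row_weights(height):
    """Area weight for each row of an equirectangular (ERP) frame:
    proportional to cos(latitude) at the row centre, normalised so the
    weights sum to 1. Rows near the poles (top/bottom) cover far less
    sphere area than rows near the equator."""
    raw = [math.cos((i + 0.5 - height / 2.0) * math.pi / height)
           for i in range(height)]
    total = sum(raw)
    return [w / total for w in raw]

weights = erp_row_weights(6)  # toy 6-row frame
```

A spherical-domain distortion estimate is then simply the per-row MSE combined with these weights instead of a uniform average.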
18. Variable Block-Sized Signal-Dependent Transform for Video Coding.
- Author
- Lan, Cuiling, Xu, Jizheng, Zeng, Wenjun, Shi, Guangming, and Wu, Feng
- Subjects
VIDEO coding, SIGNAL processing, COMPUTER algorithms, ENCODING, STATISTICAL correlation
- Abstract
Transform, as one of the most important modules of mainstream video coding systems, seems very stable over the past several decades. However, recent developments indicate that bringing more options for transform can lead to coding efficiency benefits. In this paper, we go further to investigate how the coding efficiency can be improved over the state-of-the-art method by adapting a transform for each block. We present a variable block-sized signal-dependent transforms (SDTs) design based on the High Efficiency Video Coding (HEVC) framework. For a coding block ranging from $4\times4$ to $32\times32$ , we collect a quantity of similar blocks from the reconstructed area and use them to derive the Karhunen–Loève transform. We avoid sending overhead bits to denote the transform by performing the same procedure at the decoder. In this way, the transform for every block is tailored according to its statistics, to be signal-dependent. To make the large block-sized SDTs feasible, we present a fast algorithm for transform derivation. Experimental results show the effectiveness of the SDTs for different block sizes, which leads to up to 23.3% bit-saving. On average, we achieve BD-rate saving of 2.2%, 2.4%, 3.3%, and 7.1% under AI-Main10, RA-Main10, LB-Main10, and LP-Main10 configurations, respectively, compared with the test model HM-12 of HEVC. The proposed scheme has also been adopted into the joint exploration test model for the exploration of potential future video coding standard. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
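The transform-derivation step in the abstract above (deriving a Karhunen–Loève transform from a set of similar reconstructed blocks) can be sketched as an eigendecomposition of the sample covariance of the flattened blocks. This is a minimal illustration under the assumption of a toy 2x2 block size and random "similar" blocks; the paper's fast derivation algorithm is not reproduced here.

```python
import numpy as np

def derive_klt(similar_blocks):
    """Derive a KLT basis from a stack of similar blocks, each flattened
    to a vector. Returns the transform matrix whose rows are eigenvectors
    of the sample covariance, sorted by decreasing eigenvalue."""
    x = np.asarray(similar_blocks, dtype=float)
    x = x - x.mean(axis=0)                      # remove the mean block
    cov = x.T @ x / len(x)                      # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # strongest component first
    return eigvecs[:, order].T

rng = np.random.default_rng(0)
# Hypothetical training set: 64 "similar" 2x2 blocks, flattened to length 4.
blocks = rng.normal(size=(64, 4))
T = derive_klt(blocks)
coeffs = T @ blocks[0]  # project one block onto the derived basis
```

Because the decoder sees the same reconstructed area, it can repeat this derivation and no transform bits need to be signalled, which is the scheme's key design point.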
19. Fully Connected Network-Based Intra Prediction for Image Coding.
- Author
- Li, Jiahao, Li, Bin, Xu, Jizheng, Xiong, Ruiqin, and Gao, Wen
- Subjects
IMAGE compression, DEEP learning, IMAGE reconstruction, VIDEO coding, TRANSFORM coding
- Abstract
This paper proposes a deep learning method for intra prediction. Different from traditional methods utilizing some fixed rules, we propose using a fully connected network to learn an end-to-end mapping from neighboring reconstructed pixels to the current block. In the proposed method, the network is fed by multiple reference lines. Compared with traditional single line-based methods, more contextual information of the current block is utilized. For this reason, the proposed network has the potential to generate better prediction. In addition, the proposed network has good generalization ability on different bitrate settings. The model trained from a specified bitrate setting also works well on other bitrate settings. Experimental results demonstrate the effectiveness of the proposed method. When compared with high efficiency video coding reference software HM-16.9, our network can achieve an average of 3.4% bitrate saving. In particular, the average result of 4K sequences is 4.5% bitrate saving, where the maximum one is 7.4%. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
20. Diversity-Based Reference Picture Management for Low Delay Screen Content Coding.
- Author
- Li, Jiahao, Li, Bin, Xu, Jizheng, and Xiong, Ruiqin
- Subjects
IMAGE processing, VIDEO coding, COMPUTER programming, ALGORITHMS, IMAGING systems
- Abstract
Screen content coding plays an important role in many applications. Conventional reference picture management (RPM) strategies developed for natural content may not work well for screen content. This is because many regions in screen content remain static for a long time, causing a lot of repetitive contents to stay in the decoded picture buffer. The repetitive contents are not conducive to inter prediction, but still occupy valuable memory. This paper proposes a diversity-based RPM scheme for screen content coding. The concept of diversity is introduced for the reference picture set (RPS) to help formulate the RPM problem. By maximizing the diversity of RPS, more potentially better predictions are provided. Better compression performance can then be achieved. Meanwhile, the proposed scheme is nonnormative and compatible with existing video coding standards, such as High Efficiency Video Coding. The experimental results show that, for low delay screen content coding, the bit saving of the proposed scheme is 4.9% on average and up to 13.7%, without increasing encoding time. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
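The idea of maximizing the diversity of the reference picture set (RPS), as described in the abstract above, can be sketched with a simple greedy farthest-point selection: keep the most recent picture, then repeatedly add the candidate most dissimilar to the pictures already chosen. This is an illustrative stand-in, not the paper's formulation; the picture names and the scalar "content signature" dissimilarity below are hypothetical.

```python
def select_reference_set(candidates, dissimilarity, size):
    """Greedily build a diverse reference picture set.

    candidates: list ordered newest-first; the newest picture is always kept.
    dissimilarity: function giving a distance between two pictures.
    """
    chosen = [candidates[0]]
    pool = list(candidates[1:])
    while pool and len(chosen) < size:
        # Pick the candidate farthest from everything already chosen,
        # so repetitive (static) content does not fill the buffer.
        best = max(pool, key=lambda c: min(dissimilarity(c, s) for s in chosen))
        chosen.append(best)
        pool.remove(best)
    return chosen

# Hypothetical pictures: (name, content signature). f8/f7 nearly duplicate f9.
pics = [("f9", 5.0), ("f8", 5.1), ("f7", 5.0), ("f3", 9.0), ("f1", 1.0)]
diss = lambda a, b: abs(a[1] - b[1])
rps = select_reference_set(pics, diss, size=3)  # keeps f9, then f3 and f1
```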
21. Fast Hash-Based Inter-Block Matching for Screen Content Coding.
- Author
- Xiao, Wei, Shi, Guangming, Li, Bin, Xu, Jizheng, and Wu, Feng
- Subjects
VIDEO coding, VIDEO compression software, GRAPHICS processing units, COMPUTER graphics, DIGITAL image processing
- Abstract
In the latest High Efficiency Video Coding (HEVC) development, i.e., HEVC screen content coding extensions (HEVC-SCC), a hash-based inter-motion search/block matching scheme is adopted in the reference test model, which brings significant coding gains to code screen content. However, the hash table generation itself may take up to half the encoding time and is thus too complex for practical usage. In this paper, we propose a hierarchical hash design and the corresponding block matching scheme to significantly reduce the complexity of hash-based block matching. The hierarchical structure in the proposed scheme allows large block calculation to use the results of small blocks. Thus, we avoid redundant computation among blocks with different sizes, which greatly reduces complexity without compromising coding efficiency. The experimental results show that compared with the hash-based block matching scheme in the HEVC-SCC test model (SCM)-6.0, the proposed scheme reduces about 77% of hash processing time, which leads to 12% and 16% encoding time savings in random access (RA) and low-delay B coding structures. The proposed scheme has been adopted into the latest SCM. A parallel implementation of the proposed hash table generation on graphics processing unit (GPU) is also presented to show the high parallelism of the proposed scheme, which achieves more than 30 frames/s for 1080p sequences and 60 frames/s for 720p sequences. With the fast hash-based block matching integrated into x265 and the hash table generated on GPU, the encoder can achieve 11.8% and 14.0% coding gains on average for RA and low-delay P coding structures, respectively, for real-time encoding. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
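The hierarchical hash idea in the abstract above (large-block hashes computed from the results of small blocks, so pixels are only read once) can be sketched as follows. This is a toy illustration with hypothetical tile sizes and CRC32 in place of the codec's actual hash functions.

```python
import zlib

def tile_hashes(frame, tile):
    """Hash every aligned tile x tile block of a 2-D frame
    (frame is a list of equal-length rows of 0..255 values)."""
    h, w = len(frame), len(frame[0])
    out = {}
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            data = bytes(frame[y + j][x + i]
                         for j in range(tile) for i in range(tile))
            out[(y, x)] = zlib.crc32(data)
    return out

def parent_hashes(child, tile):
    """Hierarchical step: the hash of each 2*tile block is computed from
    its four child hashes instead of re-reading the pixels."""
    out = {}
    for (y, x) in child:
        if y % (2 * tile) == 0 and x % (2 * tile) == 0:
            kids = [child.get((y, x)), child.get((y, x + tile)),
                    child.get((y + tile, x)), child.get((y + tile, x + tile))]
            if None not in kids:
                out[(y, x)] = zlib.crc32(
                    b"".join(k.to_bytes(4, "big") for k in kids))
    return out

frame = [[(x * 7 + y * 13) % 256 for x in range(8)] for y in range(8)]
h4 = tile_hashes(frame, 4)      # four 4x4 block hashes
h8 = parent_hashes(h4, 4)       # one 8x8 hash, reusing the 4x4 results
```

Block matching then becomes a dictionary lookup: blocks in the current frame are hashed the same way and matched against reference blocks with equal hash values.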
22. Unequal Error Protection for Scalable Video Storage in the Cloud.
- Author
- Song, Xiaodan, Peng, Xiulian, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Abstract
Redundancy is necessary for a storage system to achieve reliability. Frequent errors in large-scale storage systems, for example, cloud, make it desirable to reduce the cost of recovery. Among all types of data in cloud storage, videos generally occupy significant amounts of space due to high volumes and the rapid development of video sharing and video-on-demand services. Unlike general data, videos can tolerate a certain level of quality degradation. This paper investigates multilayer video representations, such as scalable videos and simulcast streaming, and proposes an unequal error protection scheme based on local reconstruction codes (LRC) for video storage. By providing less protection for less important layers or video copies, a better tradeoff between storage and repair cost is achieved. Both theoretical and simulation results show that such a tradeoff can be achieved over the LRC with equal error protection, though the recovered video quality might be slightly lower in rare cases. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
23. Weighted Rate-Distortion Optimization for Screen Content Coding.
- Author
-
Xiao, Wei, Li, Bin, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Subjects
RATE distortion theory ,VIDEO coding ,MOTION estimation (Signal processing) ,HIGH definition video recording ,VIDEO compression - Abstract
Unlike camera-captured video, screen content (SC) often contains many repeating patterns, which makes some blocks used as references much more important than others. However, conventional rate-distortion optimization (RDO) schemes in video coding do not consider the dependence among image blocks, which often leads to a locally optimal parameter selection, especially for SC. In this paper, we present a weighted RDO scheme for SC coding (SCC), in which the repeating characteristics are taken into account when deciding the RD tradeoff for each block. For each block, the number of times it is referenced by the current and following pictures is estimated, and based on that number we set a proper weight in the RDO process to reflect the block's importance from a global point of view. To estimate the number of references, we propose a hash-based method that approximates the results and avoids the complexity of a direct search. Experimental results show that compared with the High Efficiency Video Coding SCC reference software, 10.1%, 14.5%, and 2.2% bit savings on average, and up to 25.7%, 39.8%, and 4.6%, can be achieved by considering the weights provided by our scheme for the hierarchical-B, IBBB, and all intra coding structures, respectively. Thanks to our hash-based design, the complexity increase brought by the proposed scheme is marginal. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
24. Content-adaptive deblocking for high efficiency video coding
- Author
-
Xiong, Zhiwei, Sun, Xiaoyan, Xu, Jizheng, and Wu, Feng
- Published
- 2012
- Full Text
- View/download PDF
25. PCA-based adaptive color decorrelation algorithm for HEVC.
- Author
-
Zhang, Mengmeng, Guo, Yuhui, Li, Bin, and Xu, Jizheng
- Published
- 2016
- Full Text
- View/download PDF
26. Distributed Compressive Sensing for Cloud-Based Wireless Image Transmission.
- Author
-
Song, Xiaodan, Peng, Xiulian, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Abstract
We consider efficient image transmission via time-varying channels. To improve performance, we propose a new distributed compressive sensing (CS) scheme that can leverage similar images in the cloud. It features channel-SNR and bandwidth scalability, high efficiency, and low encoding complexity. For each image, a compressed thumbnail is first transmitted after forward error correction (FEC) and modulation to retrieve similar images and generate side information (SI) in the cloud. The residual image after subtracting the decompressed thumbnail is then coded and transmitted by CS through a very dense constellation without FEC. The linearly and ratelessly generated CS measurements make it capable of achieving both graceful quality degradation (GD) with the channel SNR and bandwidth scalability in a universal scheme. A mode decision and transform-domain power allocation are introduced for better bandwidth usage and protection against channel errors. At the decoder, a two-step CS decoding is performed to recover the residual signal, where both the local and nonlocal correlations within the image, as well as the correlation with the SI, are exploited. Simulations on landmark images over an AWGN channel show that the received image quality gracefully increases with the channel SNR and bandwidth. Furthermore, the scheme outperforms existing schemes both subjectively and objectively, with up to 11 dB gains over the state-of-the-art transmission scheme with GD, i.e., SoftCast. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
27. Draft Status Report on Wavelet Video Coding Exploration
- Author
-
Brangoulo, Sébastien, Leonardi, Riccardo, Mrak, Marta, Pesquet Popescu, Béatrice, and Xu, Jizheng
- Subjects
scalable video coding ,video coding standardization ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Data_CODINGANDINFORMATIONTHEORY - Abstract
Report on the status of scalable wavelet video coding exploration activities.
- Published
- 2005
28. Hash-Based Line-by-Line Template Matching for Lossless Screen Image Coding.
- Author
-
Peng, Xiulian and Xu, Jizheng
- Subjects
- *
TEMPLATE matching (Digital image processing) , *MATCHING theory , *PATTERN recognition systems , *IMAGE transmission , *DATA transmission systems - Abstract
Template matching (TM) was proposed in the literature a decade ago to efficiently remove non-local redundancies within an image without transmitting any displacement-vector overhead. However, the large computational complexity introduced at both the encoder and the decoder, especially for a large search range, limits its widespread use. This paper proposes hash-based line-by-line template matching (hLTM) for lossless screen image coding, where non-local redundancy commonly exists in text and graphics regions. The hash-based search largely reduces the search complexity of template matching without accuracy degradation, while the line-by-line template matching increases prediction accuracy through its fine granularity. Experimental results show that hLTM can significantly reduce the encoding and decoding complexities by 68 and 23 times, respectively, compared with traditional TM with a search radius of 128. Moreover, compared with the High Efficiency Video Coding screen content coding test model SCM-1.0, it can improve coding efficiency by up to 12.68% bit savings on screen content with rich text/graphics. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
29. A Fast Algorithm for Adaptive Motion Compensation Precision in Screen Content Coding.
- Author
-
Li, Bin and Xu, Jizheng
- Published
- 2015
- Full Text
- View/download PDF
30. Rate control for screen content coding in HEVC.
- Author
-
Guo, Yaoyao, Li, Bin, Sun, Songlin, and Xu, Jizheng
- Published
- 2015
- Full Text
- View/download PDF
31. Compound image compression using lossless and lossy LZMA in HEVC.
- Author
-
Lan, Cuiling, Xu, Jizheng, Zeng, Wenjun, and Wu, Feng
- Published
- 2015
- Full Text
- View/download PDF
32. Compressive sensing based image transmission with side information at the decoder.
- Author
-
Song, Xiaodan, Peng, Xiulian, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Published
- 2015
- Full Text
- View/download PDF
33. Weighted rate-distortion optimization for screen content intra coding.
- Author
-
Xiao, Wei, Li, Bin, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Published
- 2015
- Full Text
- View/download PDF
34. An adaptive hierarchical QP setting for screen content coding.
- Author
-
Li, Jiahao, Li, Bin, Xu, Jizheng, and Xiong, Ruiqin
- Published
- 2015
- Full Text
- View/download PDF
35. Rate control for screen content coding based on picture classification.
- Author
-
Guo, Yaoyao, Li, Bin, Sun, Songlin, and Xu, Jizheng
- Published
- 2015
- Full Text
- View/download PDF
36. Overview of the Emerging HEVC Screen Content Coding Extension.
- Author
-
Xu, Jizheng, Joshi, Rajan, and Cohen, Robert A.
- Subjects
- *
VIDEO coding , *BIT rate , *VIDEO codecs , *STANDARDIZATION - Abstract
A screen content coding (SCC) extension to High Efficiency Video Coding (HEVC) is currently under development by the Joint Collaborative Team on Video Coding, which is a joint effort from the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goal of the HEVC-SCC standardization effort is to enable significantly improved compression performance for videos containing a substantial amount of still or moving rendered graphics, text, and animation rather than, or in addition to, camera-captured content. This paper provides an overview of the technical features and characteristics of the current HEVC-SCC test model and related coding tools, including intra-block copy, palette mode, adaptive color transform, and adaptive motion vector resolution. The performance of the SCC extension is compared against existing standards in terms of bitrate savings at equal distortion. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
37. Overview of the Range Extensions for the HEVC Standard: Tools, Profiles, and Performance.
- Author
-
Flynn, David, Marpe, Detlev, Naccari, Matteo, Nguyen, Tung, Rosewarne, Chris, Sharman, Karl, Sole, Joel, and Xu, Jizheng
- Subjects
MPEG (Video coding standard) ,VIDEO compression ,BIT rate ,RANDOM access memory ,VIDEO coding - Abstract
The Range Extensions (RExt) of the High Efficiency Video Coding (HEVC) standard have recently been approved by both ITU-T and ISO/IEC. This set of extensions targets video coding applications in areas including content acquisition, postproduction, contribution, distribution, archiving, medical imaging, still imaging, and screen content. In addition to the functionality of HEVC Version 1, RExt provides support for monochrome, 4:2:2, and 4:4:4 chroma sampling formats as well as increased sample bit depths beyond 10 bits per sample. This extended functionality includes new coding tools with a view to providing additional coding efficiency, greater flexibility, and throughput at high bit depths/rates. Improved lossless, near-lossless, and very high bit-rate coding is also part of the RExt scope. This paper presents the technical aspects of HEVC RExt, including a discussion of RExt profiles, tools, and applications, and provides experimental results for a performance comparison with previous relevant coding technology. When compared with the High 4:4:4 Predictive Profile of H.264/Advanced Video Coding (AVC), the corresponding HEVC 4:4:4 RExt profile provides up to ~25%, ~32%, and ~36% average bit-rate reduction at the same PSNR quality level for intra, random access, and low delay configurations, respectively. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
38. Cloud-Based Distributed Image Coding.
- Author
-
Song, Xiaodan, Peng, Xiulian, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Subjects
IMAGE compression ,CLOUD computing ,DISTRIBUTED computing ,DIGITAL image correlation ,IMAGE reconstruction ,MULTIMEDIA systems - Abstract
With multimedia flourishing on the Web, it is easy to find similar images for a query, especially landmark images. Traditional image coding, such as JPEG, cannot exploit correlations with external images. Existing vision-based approaches are able to exploit such correlations by reconstructing from local descriptors but cannot ensure the pixel-level fidelity of the reconstruction. In this paper, a cloud-based distributed image coding (Cloud-DIC) scheme is proposed to exploit external correlations for mobile photo uploading. For each input image, a thumbnail is transmitted to retrieve correlated images and reconstruct it in the cloud by geometrical and illumination registrations. Such a reconstruction serves as the side information (SI) in the Cloud-DIC. The image is then compressed by a transform-domain syndrome coding to correct the disparity between the original image and the SI. Once a bitplane is received in the cloud, an iterative refinement process is performed between the final reconstruction and the SI. Moreover, a joint encoder/decoder mode decision at block, frequency, and bitplane levels is proposed to adapt to different correlations. Experimental results on a landmark image database show that the Cloud-DIC can largely enhance the coding efficiency both subjectively and objectively, with up to 5-dB gains and 70% bits saving over JPEG with arithmetic coding, and perform comparably at low bitrates with the intra coding of the High Efficiency Video Coding standard with a much lower encoder complexity. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
39. Cloud-based distributed image coding.
- Author
-
Song, Xiaodan, Peng, Xiulian, Xu, Jizheng, and Wu, Feng
- Published
- 2014
- Full Text
- View/download PDF
40. Screen content coding for HEVC by improved line-based intra block copy.
- Author
-
Zhang, Mengmeng, Zhang, Yang, Peng, Xiulian, and Xu, Jizheng
- Published
- 2014
- Full Text
- View/download PDF
41. Adaptive weighted distortion optimization for video coding in RGB color space.
- Author
-
Huang, Yue, Qi, Honggang, Li, Bin, and Xu, Jizheng
- Published
- 2014
- Full Text
- View/download PDF
42. 2-D Dictionary Based Video Coding for Screen Contents.
- Author
-
Zhu, Weijia, Ding, Wenpeng, Xu, Jizheng, Shi, Yunhui, and Yin, Baocai
- Published
- 2014
- Full Text
- View/download PDF
43. A unified framework of hash-based matching for screen content coding.
- Author
-
Li, Bin, Xu, Jizheng, and Wu, Feng
- Published
- 2014
- Full Text
- View/download PDF
44. 1-D dictionary mode for screen content coding.
- Author
-
Li, Bin, Xu, Jizheng, and Wu, Feng
- Published
- 2014
- Full Text
- View/download PDF
45. HEVC Encoding Optimization Using Multicore CPUs and GPUs.
- Author
-
Xiao, Wei, Li, Bin, Xu, Jizheng, Shi, Guangming, and Wu, Feng
- Subjects
DECODERS & decoding ,ENCODING ,VIDEO coding ,VIDEO codecs ,GRAPHICS processing units - Abstract
Although the High Efficiency Video Coding (HEVC) standard significantly improves the coding efficiency of video compression, it is unacceptable even in offline applications to spend several hours compressing 10 s of high-definition video. In this paper, we propose using a multicore central processing unit (CPU) and an off-the-shelf graphics processing unit (GPU) with 3072 streaming processors (SPs) for HEVC fast encoding, so that the speed optimization does not result in loss of coding efficiency. There are two key technical contributions in this paper. First, we propose an algorithm that is both parallel and fast for the GPU, which can utilize 3072 SPs in parallel to estimate the motion vector (MV) of every prediction unit (PU) in every combination of the coding unit (CU) and PU partitions. Furthermore, the proposed GPU algorithm can avoid coding efficiency loss caused by the lack of a MV predictor (MVP). Second, we propose a fast algorithm for the CPU, which can fully utilize the results from the GPU to significantly reduce the number of possible CU and PU partitions without any coding efficiency loss. Our experimental results show that compared with the reference software, we can encode high-resolution video that consumes 1.9% of the CPU time and 1.0% of the GPU time, with only a 1.4% rate increase. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
46. Hash-Based Block Matching for Screen Content Coding.
- Author
-
Zhu, Weijia, Ding, Wenpeng, Xu, Jizheng, Shi, Yunhui, and Yin, Baocai
- Abstract
Given the increasing importance of screen content, the High Efficiency Video Coding (HEVC) standard includes screen content coding as one of its requirements. In this paper, we demonstrate that enabling frame-level block searching in HEVC can significantly improve coding efficiency on screen content. We propose a hash-based block matching scheme for the intra block copy mode and the motion estimation process, which enables frame-level block searching in HEVC without changing the HEVC syntax. In the proposed scheme, blocks sharing the same hash values as the current block are selected as prediction candidates, and hash-based block selection is employed to choose the best candidates. To achieve the best coding efficiency, rate-distortion optimization is further employed to improve the proposed scheme by balancing the coding cost of motion vectors against the prediction difference. Compared with HEVC, the proposed scheme achieves 21% and 37% bitrate savings with the all intra and low delay configurations, respectively, together with an encoding time reduction. Up to 59% bitrate saving can be achieved on sequences with large motion. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
47. Efficient Parallel Framework for HEVC Motion Estimation on Many-Core Processors.
- Author
-
Yan, Chenggang, Zhang, Yongdong, Xu, Jizheng, Dai, Feng, Zhang, Jun, Dai, Qionghai, and Wu, Feng
- Subjects
MOTION detectors ,FORCE & energy ,ALGORITHMS ,NUMERICAL analysis ,KINEMATICS - Abstract
High Efficiency Video Coding (HEVC) provides superior coding efficiency over previous video coding standards at the cost of increased encoding complexity. The complexity increase of the motion estimation (ME) procedure is rather significant, especially when considering the complicated partitioning structure of HEVC. Fully exploiting the coding efficiency brought by HEVC requires a huge amount of computation. In this paper, we analyze the ME structure in HEVC and propose a parallel framework to decouple ME for different partitions on many-core processors. Based on the local parallel method (LPM), we first use a directed acyclic graph (DAG)-based order to parallelize coding tree units (CTUs) and adopt an improved LPM (ILPM) within each CTU (DAGILPM), which exploits CTU-level and prediction unit (PU)-level parallelism. Then, we find that there exist completely independent PUs (CIPUs) and partially independent PUs (PIPUs). When the degree of parallelism (DP) is smaller than the maximum DP of DAGILPM, we process the CIPUs and PIPUs, which further increases the DP. The data dependencies and coding efficiency stay the same as LPM. Experiments show that on a 64-core system, compared with serial execution, our proposed scheme achieves more than 30 and 40 times speedup for 1920×1080 and 2560×1600 video sequences, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
48. A fast depth-map wedgelet partitioning scheme for intra prediction in 3D video coding.
- Author
-
Zhang, Mengmeng, Zhao, Chuan, Xu, Jizheng, and Bai, Huihui
- Published
- 2013
- Full Text
- View/download PDF
49. Performance analysis of transform in uncoded wireless visual communication.
- Author
-
Xiong, Ruiqin, Wu, Feng, Xu, Jizheng, and Gao, Wen
- Published
- 2013
- Full Text
- View/download PDF
50. Rate-distortion optimization with adaptive weighted distortion in high Efficiency Video Coding.
- Author
-
Li, Bin, Xu, Jizheng, and Li, Houqiang
- Published
- 2013
- Full Text
- View/download PDF