Author: "Zhang, Shanghang" / Search Limiters: Peer Reviewed - Searchworks@Jio Institute Digital Library Search Results

Your search for "Zhang, Shanghang" returned 12 results

1. A multimodal physiological dataset for driving behaviour analysis

Author: Tao, Xiaoming, Gao, Dingcheng, Zhang, Wenqi, Liu, Tianqi, Du, Bing, Zhang, Shanghang, and Qin, Yanjun
Published: 2024
Full Text: View/download PDF

2. A lightweight multi-layer perceptron for efficient multivariate time series forecasting

Author: Wang, Zhenghong, Ruan, Sijie, Huang, Tianqiang, Zhou, Haoyi, Zhang, Shanghang, Wang, Yi, Wang, Leye, Huang, Zhou, and Liu, Yu
Published: 2024
Full Text: View/download PDF

3. Expanding the prediction capacity in long sequence time-series forecasting

Author: Zhou, Haoyi, Li, Jianxin, Zhang, Shanghang, Zhang, Shuai, Yan, Mengyi, and Xiong, Hui
Published: 2023
Full Text: View/download PDF

4. Learning graph attention-aware knowledge graph embedding

Author: Li, Chen, Peng, Xutan, Niu, Yuhang, Zhang, Shanghang, Peng, Hao, Zhou, Chuan, and Li, Jianxin
Published: 2021
Full Text: View/download PDF

5. Modeling relation paths for knowledge base completion via joint adversarial training

Author: Li, Chen, Peng, Xutan, Zhang, Shanghang, Peng, Hao, Yu, Philip S., He, Min, Du, Linfeng, and Wang, Lihong
Published: 2020
Full Text: View/download PDF

6. P²FEViT: Plug-and-Play CNN Feature Embedded Hybrid Vision Transformer for Remote Sensing Image Classification.

Author: Wang, Guanqun, Chen, He, Chen, Liang, Zhuang, Yin, Zhang, Shanghang, Zhang, Tong, Dong, Hao, and Gao, Peng
Subjects: IMAGE recognition (Computer vision), TRANSFORMER models, REMOTE sensing, CONVOLUTIONAL neural networks, DATA mining, SPATIAL ability
Abstract: Remote sensing image classification (RSIC) is a classical and fundamental task in the intelligent interpretation of remote sensing imagery, which can provide unique labeling information for each acquired remote sensing image. Thanks to the potent global context information extraction ability of the multi-head self-attention (MSA) mechanism, visual transformer (ViT)-based architectures have shown excellent capability in natural scene image classification. However, in order to achieve powerful RSIC performance, it is insufficient to capture global spatial information alone. Specifically, for fine-grained target recognition tasks with high inter-class similarity, discriminative and effective local feature representations are key to correct classification. In addition, due to the lack of inductive biases, the powerful global spatial context representation capability of ViT requires lengthy training procedures and large-scale pre-training data volume. To solve the above problems, a hybrid architecture of convolution neural network (CNN) and ViT is proposed to improve the RSIC ability, called P²FEViT, which integrates plug-and-play CNN features with ViT. In this paper, the feature representation capabilities of CNN and ViT applying for RSIC are first analyzed. Second, aiming to integrate the advantages of CNN and ViT, a novel approach embedding CNN features into the ViT architecture is proposed, which can make the model synchronously capture and fuse global context and local multimodal information to further improve the classification capability of ViT. Third, based on the hybrid structure, only a simple cross-entropy loss is employed for model training. The model can also have rapid and comfortable convergence with relatively less training data than the original ViT. Finally, extensive experiments are conducted on the public and challenging remote sensing scene classification dataset of NWPU-RESISC45 (NWPU-R45) and the self-built fine-grained target classification dataset called BIT-AFGR50. The experimental results demonstrate that the proposed P²FEViT can effectively improve the feature description capability and obtain outstanding image classification performance, while significantly reducing the high dependence of ViT on large-scale pre-training data volume and accelerating the convergence speed. The code and self-built dataset will be released at our webpages. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

7. Traffic flow from a low frame rate city camera.

Author: Toropov, Evgeny, Gui, Liangyan, Zhang, Shanghang, Kottur, Satwik, and Moura, Jose M. F.
Published: 2015
Full Text: View/download PDF

8. Bayesian model fusion: Enabling test cost reduction of analog/RF circuits via wafer-level spatial variation modeling.

Author: Zhang, Shanghang, Li, Xin, Blanton, R. D., da Silva, Jose Machado, Carulli, John M., and Butler, Kenneth M.
Published: 2014
Full Text: View/download PDF

9. A high-throughput low-latency arithmetic encoder design for HDTV.

Author: Li, Yuan, Zhang, Shanghang, Jia, Huizhu, Xie, Xiaodong, and Gao, Wen
Published: 2013
Full Text: View/download PDF

10. A flexible and high-performance hardware video encoder architecture.

Author: Wei, Kaijin, Zhang, Shanghang, Jia, Huizhu, Xie, Don, and Gao, Wen
Abstract: This paper presents a new video encoder architecture for H.264 and AVS, which adopts a novel macroblock (MB) encoding order. As a replacement of Level C+ zigzag coding order, the so-called Level C+ slash scan coding order with NOP insertion is used as MB scheduling to remove MB-level data dependency of the pipeline so that the left MB's coded results such as motion vector (MV) and reconstructed pixels can be obtained early in motion estimation (ME) stages. As a result, by sharing the reconstruction (REC) loop, sequential intra prediction (INTRA) can be split into multiple pipeline stages to explore more block-level parallelization and rate distortion optimization (RDO) based mode decision is apt to implement. The exact MV predictors (MVP) obtained in motion estimation can not only improve coding performance but also make pre-skip ME algorithm able to be applied into this architecture for low power applications. Since the proposed scheme is attributed to Level C+ data reuse, the bandwidth is decreased greatly. A real-time high-definition (HD) 1080P AVS encoder implementation on FPGA verification board with search range [−128, 128]×[−96, 96] and two reference frames at an operating frequency of 160 MHz validates the efficiency of proposed architecture. [ABSTRACT FROM PUBLISHER]
Published: 2012
Full Text: View/download PDF

11. An Optimized Hardware Video Encoder for AVS with Level C+ Data Reuse Scheme for Motion Estimation.

Author: Wei, Kaijin, Zhou, Rongwei, Zhang, Shanghang, Jia, Huizhu, Xie, Don, and Gao, Wen
Abstract: In a hardware video encoder, Level C+ data reuse for motion estimation can reuse two-dimensional overlapped search window (SW) and thus is a good choice to trade off the memory bandwidth with the on-chip buffer size. However, the irregular zigzag coding order brings some other troubles to the encoder implementation. This paper mainly focuses on the special considerations for a Level C+ zigzag encoder. First we present a guideline about how to select the Level C+ zigzag HFmVn scan for the adopted encoder pipeline. Second, according to the guideline, zigzag HF5V3 coding order is applied into our Level C+ encoder in which a new function is added to alter zigzag bit-stream into standard raster order and exact motion vector predictor (MVP) can be used for most macro blocks (MBs) except some corner MBs to increase the coding performance. Third, zigzag-aware scheduling for prefetching the SW is proposed so that the pipeline will never be disturbed by this irregular coding order and can smoothly run MB by MB. In addition, balancing the bandwidth into each MB processing period can improve the bandwidth utilization. With these techniques, a real-time high-definition (HD) 1080P AVS encoder is successfully implemented on FPGA verification board with search range [-128, 128]×[-96, 96] and two reference frames at an operating frequency of 160 MHz. [ABSTRACT FROM PUBLISHER]
Published: 2012
Full Text: View/download PDF

12. On a Highly Efficient RDO-Based Mode Decision Pipeline Design for AVS.

Author: Zhu, Chuang, Jia, Huizhu, Zhang, Shanghang, Huang, Xiaofeng, Xie, Xiaodong, and Gao, Wen
Abstract: Rate distortion optimization (RDO) is the best known mode decision method, while the high implementation complexity limits its applications and almost no real-time hardware encoder is truly full-featured RDO based. In this paper, first, a full-featured RDO-based mode decision (MD) algorithm is presented, which makes more modes enter RDO process. Second, the throughput of RDO-based MD pipeline is thoroughly analyzed and modeled. Third, a highly efficient adaptive block-level pipelining architecture of RDO-based MD for AVS video encoder is proposed which can achieve the highest throughput to alleviate the RDO burden. Our design is described in high-level Verilog/VHDL hardware description language and implemented under SMIC 0.18-μm CMOS technology with 232 K logic gates and 85 Kb SRAMs. The implementation results validate our architectural design and the proposed architecture can support real time processing of 1080P@30 fps. The coding efficiency of our adopted method far outperforms (0.57 dB PSNR gain in average) the traditional low-complexity MD (LCMD) methods and the throughput of our designed pipeline is increased by 11.3%, 19% and 17% for I, P and B frames, respectively, compared with the existed RDO-based architecture. [ABSTRACT FROM PUBLISHER]
Published: 2013
Full Text: View/download PDF
