111 results for "Feature compression"
Search Results
2. LiFSO-Net: A lightweight feature screening optimization network for complex-scale flat metal defect detection
- Author
-
Zhong, Hao, Xiao, Ling, Wang, Haifeng, Zhang, Xin, Wan, Chenhui, Hu, Youmin, and Wu, Bo
- Published
- 2024
- Full Text
- View/download PDF
3. Distributed Semantic Segmentation with Efficient Joint Source and Task Decoding
- Author
-
Nazir, Danish, Bartels, Timo, Piewek, Jan, Bagdonat, Thorsten, Fingscheidt, Tim, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
- Published
- 2025
- Full Text
- View/download PDF
4. Robust underwater object tracking with image enhancement and two-step feature compression.
- Author
-
Li, Jiaqing, Xue, Chaocan, Luo, Xuan, Fu, Yubin, and Lin, Bin
- Abstract
Developing a robust algorithm for underwater object tracking (UOT) is crucial to support the sustainable development and utilization of marine resources. In addition to open-air tracking challenges, the visual object tracking (VOT) task presents further difficulties in underwater environments due to visual distortions, color cast issues, and low-visibility conditions. To address these challenges, this study introduces a novel underwater target tracking framework based on correlation filter (CF) with image enhancement and a two-step feature compression mechanism. Underwater image enhancement mitigates the impact of visual distortions and color cast issues on target appearance modeling, while the two-step feature compression strategy addresses low-visibility conditions by compressing redundant features and combining multiple compressed features based on the peak-to-sidelobe ratio (PSR) indicator for accurate target localization. The excellent performance of the proposed method is demonstrated through evaluation on two public UOT datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
5. Robust underwater object tracking with image enhancement and two-step feature compression
- Author
-
Jiaqing Li, Chaocan Xue, Xuan Luo, Yubin Fu, and Bin Lin
- Subjects
Underwater object tracking, Correlation filter, Underwater image enhancement, Feature compression, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
- Abstract
Developing a robust algorithm for underwater object tracking (UOT) is crucial to support the sustainable development and utilization of marine resources. In addition to open-air tracking challenges, the visual object tracking (VOT) task presents further difficulties in underwater environments due to visual distortions, color cast issues, and low-visibility conditions. To address these challenges, this study introduces a novel underwater target tracking framework based on correlation filter (CF) with image enhancement and a two-step feature compression mechanism. Underwater image enhancement mitigates the impact of visual distortions and color cast issues on target appearance modeling, while the two-step feature compression strategy addresses low-visibility conditions by compressing redundant features and combining multiple compressed features based on the peak-to-sidelobe ratio (PSR) indicator for accurate target localization. The excellent performance of the proposed method is demonstrated through evaluation on two public UOT datasets.
- Published
- 2025
- Full Text
- View/download PDF
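The two-step feature compression in entries 4 and 5 above relies on the peak-to-sidelobe ratio (PSR) of correlation-filter response maps to weight and combine compressed features. The snippet below is only a minimal NumPy sketch of PSR computation and PSR-weighted response fusion; the 11x11 peak-exclusion window and the normalized-weight fusion rule are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def psr(response, exclude=11):
    """Peak-to-sidelobe ratio of a correlation response map.
    The sidelobe is everything outside an `exclude`-sized window around the peak."""
    py, px = np.unravel_index(np.argmax(response), response.shape)
    peak = response[py, px]
    half = exclude // 2
    mask = np.ones_like(response, dtype=bool)
    mask[max(0, py - half):py + half + 1, max(0, px - half):px + half + 1] = False
    sidelobe = response[mask]
    return (peak - sidelobe.mean()) / (sidelobe.std() + 1e-12)

def fuse_responses(responses):
    """Weight each feature group's correlation response by its PSR and sum them."""
    weights = np.array([psr(r) for r in responses])
    weights = weights / weights.sum()
    return sum(w * r for w, r in zip(weights, responses))

# toy usage: two 64x64 response maps from two compressed feature groups
rng = np.random.default_rng(0)
maps = [rng.random((64, 64)) for _ in range(2)]
maps[0][32, 32] += 5.0                      # one map has a sharp, confident peak
fused = fuse_responses(maps)
print(np.unravel_index(np.argmax(fused), fused.shape))   # -> (32, 32)
```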
6. Enhancing the data processing speed of a deep-learning-based three-dimensional single molecule localization algorithm (FD-DeepLoc) with a combination of feature compression and pipeline programming.
- Author
-
Guo, Shuhao, Lin, Jiaxun, Zhang, Yingjun, and Huang, Zhen-Li
- Subjects
REAL-time computing, IMAGE processing, ONLINE algorithms, ELECTRONIC data processing, SINGLE molecules, DEEP learning, SADDLEPOINT approximations
- Abstract
Three-dimensional (3D) single molecule localization microscopy (SMLM) plays an important role in biomedical applications, but its data processing is very complicated. Deep learning is a potential tool to solve this problem. As the state-of-the-art deep-learning-based 3D super-resolution localization algorithm, the recently reported FD-DeepLoc algorithm still falls short of the goal of online image processing, even though it has greatly improved data processing throughput. In this paper, a new algorithm, Lite-FD-DeepLoc, is developed on the basis of FD-DeepLoc to meet the online image processing requirements of 3D SMLM. The new algorithm uses a feature compression method to reduce the parameters of the model and combines it with pipeline programming to accelerate the inference process of the deep learning model. Simulated data processing results show that the image processing speed of Lite-FD-DeepLoc is about twice that of FD-DeepLoc, with a slight decrease in localization accuracy, enabling real-time processing of 256×256-pixel images. Results on biological experimental data imply that Lite-FD-DeepLoc can successfully analyze data based on astigmatism and saddle-point engineering, and that the global resolution of the reconstructed image is equivalent to or even better than that of FD-DeepLoc. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
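The speed-up in entry 6 above comes partly from "pipeline programming", i.e. overlapping data preparation with model inference so that throughput is limited by the slowest stage rather than their sum. The following is a generic two-stage producer/consumer sketch of that idea in Python; it is not the Lite-FD-DeepLoc implementation, and the sleep calls merely stand in for real preprocessing and GPU inference.

```python
import queue
import threading
import time

def preprocess(frames, q):
    """Stage 1: prepare raw camera frames and hand them to the inference stage."""
    for f in frames:
        time.sleep(0.01)                   # stand-in for real preprocessing work
        q.put(f)
    q.put(None)                            # sentinel: no more frames

def localize(q, results):
    """Stage 2: run the (compressed) localization model on each prepared frame."""
    while True:
        f = q.get()
        if f is None:
            break
        time.sleep(0.01)                   # stand-in for model inference
        results.append(f)

frames, results = list(range(100)), []
q = queue.Queue(maxsize=8)                 # small buffer between the two stages
t1 = threading.Thread(target=preprocess, args=(frames, q))
t2 = threading.Thread(target=localize, args=(q, results))
start = time.time(); t1.start(); t2.start(); t1.join(); t2.join()
print(f"processed {len(results)} frames in {time.time() - start:.2f}s (stages overlap)")
```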
7. Improved YOLOv8 Algorithm for Industrial Surface Defect Detection.
- Author
-
SU Jia, JIA Ze, QIN Yichang, and ZHANG Jianyan
- Subjects
SURFACE defects, FEATURE extraction, LEAK detection, ELK, ALGORITHMS
- Abstract
To address the low contrast of industrial defects and the high false-detection and missed-detection rates caused by surrounding interference, an industrial surface defect detection algorithm, EML-YOLO, is proposed based on an improved YOLOv8. A high-efficiency large convolution module, ELK, improves the model's feature extraction capability by providing a multi-scale feature representation while retaining spatial information; a parallel multi-branch feature fusion module, MCM, enables the model to acquire rich feature information and global context information; and feature compression and streamlining in the Neck module reduce the number of parameters and the computation of the model, making it more applicable to resource-limited industrial scenarios. Two industrial surface defect datasets, GC10-DET and DeepPCB, are used to validate the effectiveness of the improved EML-YOLO algorithm. The experimental results show that on the GC10-DET and DeepPCB datasets, detection accuracy is improved by 4.3 and 2.9 percentage points, respectively, and the number of parameters is only 2.7×10^6. The proposed algorithm can be better applied to industrial defect detection scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Feature compression analysis of whole-brain functional connectivity for mild cognitive impairment classification (轻度认知障碍分类中全脑功能连接的特征压缩分析).
- Author
-
马 佳, 吴海锋, and 李顺良
- Published
- 2024
- Full Text
- View/download PDF
9. Masked Feature Compression for Object Detection.
- Author
-
Dai, Chengjie, Song, Tiantian, Jin, Yuxuan, Ren, Yixiang, Yang, Bowei, and Song, Guanghua
- Subjects
OBJECT recognition (Computer vision), IMAGE compression, CLOUD storage
- Abstract
Deploying high-accuracy detection models on lightweight edge devices (e.g., drones) is challenging due to hardware constraints. To achieve satisfactory detection results, a common solution is to compress and transmit the images to a cloud server where powerful models can be used. However, the image compression process for transmission may lead to a reduction in detection accuracy. In this paper, we propose a feature compression method tailored for object detection tasks, and it can be easily integrated with existing learned image compression models. In the method, the encoding process consists of two steps. Firstly, we use a feature extractor to obtain the low-level feature, and then use a mask generator to obtain an object mask to select regions containing objects. Secondly, we use a neural network encoder to compress the masked feature. As for decoding, a neural network decoder is used to restore the compressed representation into the feature that can be directly inputted into the object detection model. The experimental results demonstrate that our method surpasses existing compression techniques. Specifically, when compared to one of the leading methods—TCM2023—our approach achieves a 25.3% reduction in compressed file size and a 6.9% increase in mAP0.5. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
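Entry 9 above describes a two-step encoder: a mask generator selects object regions in a low-level feature map, and only the masked feature is compressed. The PyTorch sketch below illustrates that data flow only; the layer sizes, the hard 0.5 threshold, and the plain convolutional encoder/decoder are assumptions, whereas the paper integrates with a learned image compression model and trains the mask generator for detection.

```python
import torch
import torch.nn as nn

class MaskedFeatureEncoder(nn.Module):
    """Toy masked feature compression: predict an object mask from a low-level
    feature, compress only the masked feature, then restore it for the detector."""
    def __init__(self, c_in=64, c_latent=16):
        super().__init__()
        self.mask_gen = nn.Sequential(nn.Conv2d(c_in, 1, 3, padding=1), nn.Sigmoid())
        self.encoder = nn.Conv2d(c_in, c_latent, 3, stride=2, padding=1)
        self.decoder = nn.ConvTranspose2d(c_latent, c_in, 4, stride=2, padding=1)

    def forward(self, feat):
        mask = (self.mask_gen(feat) > 0.5).float()   # hard object mask (not differentiable)
        latent = self.encoder(feat * mask)           # compress the masked feature only
        return self.decoder(latent), mask

feat = torch.randn(1, 64, 80, 80)                    # low-level feature from a backbone
restored, mask = MaskedFeatureEncoder()(feat)
print(restored.shape, mask.mean().item())
```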
10. Future Work
- Author
-
Li, Ge, Gao, Wei, and Gao, Wen
- Published
- 2024
- Full Text
- View/download PDF
11. A Hybrid Model for Video Compression Based on the Fusion of Feature Compression Framework and Multi-object Tracking Network
- Author
-
Chen, Yunyu, Wang, Lichuan, Zhang, Yuan, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Liu, Qingshan, editor, Wang, Hanzi, editor, Ma, Zhanyu, editor, Zheng, Weishi, editor, Zha, Hongbin, editor, Chen, Xilin, editor, Wang, Liang, editor, and Ji, Rongrong, editor
- Published
- 2024
- Full Text
- View/download PDF
12. Attention-based variable-size feature compression module for edge inference.
- Author
-
Li, Shibao, Ma, Chenxu, Zhang, Yunwu, Li, Longfei, Wang, Chengzhi, Cui, Xuerong, and Liu, Jianhang
- Subjects
ARTIFICIAL intelligence, DATA reduction
- Abstract
Artificial intelligence has made significant breakthroughs in many fields, especially with the broad deployment of edge devices, which provides opportunities to develop and apply various intelligent models in edge networks. The edge device-server co-inference system has gradually become the mainstream of edge intelligent computing. However, existing feature processing works in the edge inference framework neglect whether features are actually important, and the processed features remain redundant, affecting inference efficiency. In this paper, we propose a novel attention-based variable-size feature compression module to enhance edge systems' inference efficiency by leveraging the varying importance levels of the input data. First, a multi-scale attention mechanism is introduced, which operates jointly over the channel and spatial dimensions to effectively compute importance weights from the intermediate output features of the edge devices. These weights are then utilized to assign different transmission probabilities, filtering out irrelevant feature data and prioritizing task-relevant information. Second, a new loss algorithm and a progressive model training strategy are designed to optimize the proposed module, enabling the model to adapt gradually and effectively to the reduced feature data. Finally, experimental results on the CIFAR-10 and ImageNet datasets demonstrate the effectiveness of our proposed solution, showcasing a significant reduction in the data output volume of edge devices and minimizing communication overhead while ensuring minimal loss in model accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
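Entry 12 above computes channel-and-spatial importance weights and then transmits only the feature elements deemed task-relevant. The sketch below keeps a fixed fraction of elements by thresholding a joint attention score; the layer shapes, the hard top-k rule, and the keep ratio are assumptions, and the paper's transmission-probability assignment, loss design and progressive training are not reproduced.

```python
import torch
import torch.nn as nn

class ImportanceGate(nn.Module):
    """Illustrative channel + spatial attention that scores every feature element
    and zeroes out the low-importance ones before transmission to the server."""
    def __init__(self, channels):
        super().__init__()
        self.channel = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                     nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.spatial = nn.Sequential(nn.Conv2d(channels, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, x, keep_ratio=0.25):
        score = self.channel(x) * self.spatial(x)         # joint importance weights
        k = max(1, int(keep_ratio * score.numel()))
        thresh = score.flatten().topk(k).values.min()     # keep the k highest scores
        mask = (score >= thresh).float()
        return x * mask, mask.mean().item()               # sparse feature + actual keep rate

x = torch.randn(1, 32, 28, 28)                            # intermediate edge-device feature
sparse, rate = ImportanceGate(32)(x)
print(sparse.shape, f"kept ~{rate:.0%} of feature elements")
```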
13. MFGAN: towards a generic multi-kernel filter based adversarial generator for image restoration.
- Author
-
Chahi, Abderrazak, Kas, Mohamed, Kajo, Ibrahim, and Ruichek, Yassine
- Abstract
In recent years, there has been a growing interest in the use of Generative Adversarial Networks (GANs). Thanks to their outstanding performance in image translation and generation, they play an increasingly important role in computer vision applications. Most approaches based on GAN focus on proposing task-specific auxiliary modules or loss functions that are tailored to address various challenges of a single application, but often do not perform better when evaluated to improve other image generation tasks. Moreover, the basic ResNet and U-Net based GAN generators reach their limits in many image restoration and enhancement use cases. Therefore, in this paper, we propose a generic GAN referred to as Multi-Kernel Filter-based Conditional Generative Adversarial Network (MFGAN). We develop a new GAN generator with multiple CNN streams to extract more relevant and discriminative features related to the studied task. The proposed MFNet generator consists of two CNN modules, feature extraction and feature compression, which are combined to connect both the GAN encoder and decoder. It considers the strengths of conventional layers at different scale levels with multi-kernel filtering to capture high to low feature frequencies that reflect the complex image degradations and structural image details. Extensive experiments on five challenging applications for image enhancement, image restoration, and infrared image translation demonstrate the superiority and effectiveness of the proposed MFGAN in removing image degradation and generating visually appealing fake images. Our MFGAN quantitatively outperforms both state-of-the-art GANs and other CNN-based architectures in all tested benchmarks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Local tri directional pattern (LTDP): a novel descriptor for face recognition in unconstrained conditions.
- Author
-
Karanwal, Shekhar
- Abstract
Plentiful of local descriptors has been reported based on Local Binary Pattern (LBP). LBP and most of them establishes a uniform coordination among neighbors and center pixel to develop its code. To be precise the meaning full information located in different directions are missed in the earlier research. In addition the magnitude features are minimal used in earlier research. The invented work develop a novel local descriptor for Face Recognition (FR) called Local Tri Directional Pattern (LTDP) in various unconstrained conditions, by eliminating these problems. LTDP captures direction features from 3 × 3 patch based on first order derivatives generated in clockwise, center and anticlockwise directions, for each neighborhood position of the 3 × 3 patch. The generated first order derivatives are then conceived by novel thresholding function to form the tri directional pattern. The tri directional pattern is further split into three binary patterns, which is further transformed into three LTDP codes by weights assignment and summing values. To increase more discriminativity two magnitude features are also proposed and integrated with the previously extracted features. Eventually all five LTDP codes are merged to develop the size of LTDP for single position. Further all the generated histograms are merged to develop LTDP feature size. Principal Component Analysis (PCA) and Fishers Linear Discriminant Analysis (FLDA) are used for feature compaction and matching is done by Support Vector Machines (SVMs). Results on ORL, GT, EYB and YB illustrates the efficacy of LTDP against the compared methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Loop Closure Detection Based on Compressed ConvNet Features in Dynamic Environments.
- Author
-
Jiang, Shuhai, Zhou, Zhongkai, and Sun, Shangjie
- Subjects
CONVOLUTIONAL neural networks, FEATURE extraction
- Abstract
In dynamic environments, convolutional neural networks (CNNs) often produce image feature maps with significant redundancy due to external factors such as moving objects and occlusions. These feature maps are inadequate as precise image descriptors for similarity measurement, hindering loop closure detection. Addressing this issue, this paper proposes feature compression of convolutional neural network output. The approach is detailed as follows: (1) employing ResNet152 as the backbone feature-extraction network, a Siamese neural network is constructed to enhance the efficiency of feature extraction; (2) utilizing KL transformation to extract principal components from the backbone network's output, thereby eliminating redundant information; (3) employing the compressed features as input for NetVLAD to construct a spatially informed feature descriptor for similarity measurement. Experimental results demonstrate that, on the New College dataset, the proposed improved method exhibits an approximately 9.98% enhancement in average accuracy compared to the original network. On the City Center dataset, there is an improvement of approximately 2.64%, with an overall increase of about 23.51% in time performance. These findings indicate that the enhanced ResNet152 performs better than the original network in environments with more moving objects and occlusions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
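The loop-closure pipeline in entry 15 (and its duplicate, entry 27) compresses backbone outputs with a KL transformation, i.e. a PCA-style projection onto the leading principal components, before NetVLAD aggregation. The NumPy sketch below shows only that projection step on pooled descriptors; the 2048/128 dimensions are illustrative, and the Siamese ResNet152 and NetVLAD stages are not reproduced.

```python
import numpy as np

def kl_compress(features, n_components=128):
    """Karhunen-Loeve (PCA) compression of pooled ConvNet descriptors.
    `features` is an (n_images, dim) matrix; returns the projected descriptors."""
    mean = features.mean(axis=0)
    centered = features - mean
    cov = np.cov(centered, rowvar=False)                 # dim x dim covariance
    eigval, eigvec = np.linalg.eigh(cov)
    order = np.argsort(eigval)[::-1][:n_components]      # largest eigenvalues first
    basis = eigvec[:, order]
    return centered @ basis, (mean, basis)               # keep mean/basis for new images

descriptors = np.random.randn(500, 2048)                 # e.g. pooled ResNet152 outputs
compressed, (mean, basis) = kl_compress(descriptors, 128)
print(compressed.shape)                                  # (500, 128)
```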
16. Scaling Dimension
- Author
-
Ganter, Bernhard, Hanika, Tom, Hirth, Johannes, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Dürrschnabel, Dominik, editor, and López Rodríguez, Domingo, editor
- Published
- 2023
- Full Text
- View/download PDF
17. Image Recognition of Plants and Plant Diseases with Transfer Learning and Feature Compression
- Author
-
Ziȩba, Marcin, Przewłoka, Konrad, Grela, Michał, Szkoła, Kamil, Kuta, Marcin, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Mikyška, Jiří, editor, de Mulatier, Clélia, editor, Paszynski, Maciej, editor, Krzhizhanovskaya, Valeria V., editor, Dongarra, Jack J., editor, and Sloot, Peter M.A., editor
- Published
- 2023
- Full Text
- View/download PDF
18. Fused Local Color Pattern (FLCP): A Novel Color Descriptor for Face Recognition
- Author
-
Karanwal, Shekhar, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Abraham, Ajith, editor, Hanne, Thomas, editor, Gandhi, Niketa, editor, Manghirmalani Mishra, Pooja, editor, Bajaj, Anu, editor, and Siarry, Patrick, editor
- Published
- 2023
- Full Text
- View/download PDF
19. Video in 10 Bits: Few-Bit VideoQA for Efficiency and Privacy
- Author
-
Huang, Shiyuan, Piramuthu, Robinson, Chang, Shih-Fu, Sigurdsson, Gunnar A., Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Karlinsky, Leonid, editor, Michaeli, Tomer, editor, and Nishino, Ko, editor
- Published
- 2023
- Full Text
- View/download PDF
20. Slimmable Multi-Task Image Compression for Human and Machine Vision
- Author
-
Jiangzhong Cao, Ximei Yao, Huan Zhang, Jian Jin, Yun Zhang, and Bingo Wing-Kuen Ling
- Subjects
Image compression, feature compression, collaborative compression, intelligent analytics, machine vision, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
In the Internet of Things (IoT) communications, visual data are frequently processed among intelligent devices using artificial intelligence algorithms, replacing humans for analysis and decision-making while only occasionally requiring human scrutiny. However, due to high redundancy of compressive encoders, existing image coding solutions for machine vision are inefficient at runtime. To balance the rate-accuracy performance and efficiency of image compression for machine vision while attaining high-quality reconstructed images for human vision, this paper introduces a novel slimmable multi-task compression framework for human and machine vision in visual IoT applications. Firstly, image compression for human and machine vision under the constraint of bandwidth, latency, and computational resources is modeled as a multi-task optimization problem. Secondly, slimmable encoders are employed for multiple human and machine vision tasks in which the parameters of the sub-encoder for machine vision tasks are shared among all tasks and jointly learned. Thirdly, to solve the feature match between latent representation and intermediate features of deep vision networks, feature transformation networks are introduced as decoders of machine vision feature compression. Finally, the proposed framework is successfully applied to human and machine vision tasks’ scenarios, e.g., object detection and image reconstruction. Experimental results show that the proposed method outperforms baselines and other image compression approaches on machine vision tasks with higher efficiency (shorter latency) in two vision tasks’ scenarios while retaining comparable quality on image reconstruction.
- Published
- 2023
- Full Text
- View/download PDF
21. A Super-Resolution-Based Feature Map Compression for Machine-Oriented Video Coding
- Author
-
Jung-Heum Kang, Muhammad Salman Ali, Hye-Won Jeong, Chang-Kyun Choi, Younhee Kim, Se Yoon Jeong, Sung-Ho Bae, and Hui Yong Kim
- Subjects
Versatile video codec, video coding for machine, feature compression, deep neural network, super resolution, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Recently, video and image compression methods using neural networks have received much attention. In MPEG standardization, Video Coding for Machine (VCM) is a newly arising topic which attempts to compress features/images for the purpose of machine vision tasks. Especially, compressing features has advantages in terms of privacy protection and computation off-loading. In this paper, we propose an effective feature compression method equipped with a super-resolution (SR) module for features. Our main motivation comes from the observation that features are somewhat robust to spatial distortions (e.g., AWGN, blur, quantization distortions, coding artifacts), which leads us to integrating an SR module into the compression framework. We also further explore the best training strategy of the proposed method, i.e., finding the best combination of various losses and proper input feature shapes. Our comprehensive experiments show that the proposed method outperforms the baseline in the original VCM anchor scenario on various QP values with Versatile Video Coding (VVC). Specifically, the proposed framework achieved up to 50% BD-rate reduction compared to the conventional P-layer feature map compression method for the object detection task on the OpenImage dataset.
- Published
- 2023
- Full Text
- View/download PDF
22. Compression of Multiscale Features of FPN with Channel-Wise Reduction for VCM.
- Author
-
Kim, Dong-Ha, Yoon, Yong-Uk, Han, Gyu-Woong, Oh, Byung Tae, and Kim, Jae-Gon
- Subjects
DEEP learning, VIDEO compression, COMPUTER vision, VIDEO coding, IMAGE compression, VIDEO surveillance, SMART cities
- Abstract
With the development of deep learning technology and the abundance of sensors, machine vision applications that utilize vast amounts of image/video data are rapidly increasing in the autonomous vehicle, video surveillance and smart city fields. However, achieving a more compact image/video representation and lower latency solutions is challenging for such machine-based applications. Therefore, it is essential to develop a more efficient video coding standard for machine vision applications. Currently, the Moving Picture Experts Group (MPEG) is developing a new standard called video coding for machines (VCM) with two tracks, each mainly dealing with compression of the input image/video (Track 2) and compression of the features extracted from it (Track 1). In this paper, an enhanced multiscale feature compression (E-MSFC) method is proposed to efficiently compress multiscale features generated by a feature pyramid network (FPN), which is the backbone network of machine vision networks specified in the VCM evaluation framework. The proposed E-MSFC reduces the feature channels to be included in a single feature map and compresses the feature map using versatile video coding (VVC), the latest video standard, rather than the single stream feature compression (SSFC) module in the existing MSFC. In addition, the performance of the E-MSFC is further enhanced by adding a bottom-up structure to the multiscale feature fusion (MSFF) module, which performs the channel-wise reduction in the E-MSFC. Experimental results reveal that the proposed E-MSFC significantly outperforms the VCM image anchor with a BD-rate gain of up to 85.94%, which includes an additional gain of 0.96% achieved by the MSFF with the bottom-up structure. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
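Entry 22 above fuses the multiscale FPN features into a single reduced-channel map so that it can be coded with VVC. The sketch below shows only the generic resize, concatenate, and 1x1 channel-reduction pattern behind such channel-wise reduction; the channel counts and the plain bilinear resize are assumptions, and the E-MSFC's MSFF/SSFC modules, bottom-up refinement and VVC coding are not reproduced.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMultiScaleFusion(nn.Module):
    """Toy channel-wise reduction: resize FPN levels to a common resolution,
    concatenate them, and squeeze the channels into one compact feature map."""
    def __init__(self, c_fpn=256, n_levels=4, c_out=64):
        super().__init__()
        self.reduce = nn.Conv2d(c_fpn * n_levels, c_out, 1)   # 1x1 channel reduction

    def forward(self, fpn_feats):
        target = fpn_feats[0].shape[-2:]                      # resolution of the finest level
        resized = [F.interpolate(f, size=target, mode="bilinear", align_corners=False)
                   for f in fpn_feats]
        return self.reduce(torch.cat(resized, dim=1))

feats = [torch.randn(1, 256, 64 // 2**i, 64 // 2**i) for i in range(4)]   # P2..P5-like levels
fused = SimpleMultiScaleFusion()(feats)
print(fused.shape)                                            # torch.Size([1, 64, 64, 64])
```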
23. Human Pose Estimation via an Ultra-Lightweight Pose Distillation Network.
- Author
-
Zhang, Shihao, Qiang, Baohua, Yang, Xianyi, Wei, Xuekai, Chen, Ruidong, and Chen, Lirui
- Subjects
DISTILLATION, HUMAN beings
- Abstract
Most current pose estimation methods have a high resource cost that makes them unusable in some resource-limited devices. To address this problem, we propose an ultra-lightweight end-to-end pose distillation network, which applies some helpful techniques to suitably balance the number of parameters and predictive accuracy. First, we designed a lightweight one-stage pose estimation network, which learns from an increasingly refined sequential expert network in an online knowledge distillation manner. Then, we constructed an ultra-lightweight re-parameterized pose estimation subnetwork that uses a multi-module design with weight sharing to improve the multi-scale image feature acquisition capability of the single-module design. When training was complete, we used the first re-parameterized module as the deployment network to retain the simple architecture. Finally, extensive experimental results demonstrated the detection precision and low parameters of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
24. Color Multiscale Block-ZigZag LBP (CMB-ZZLBP): An Efficient and Discriminant Face Descriptor
- Author
-
Karanwal, Shekhar, Rushi Kumar, B., editor, Ponnusamy, S., editor, Giri, Debasis, editor, Thuraisingham, Bhavani, editor, Clifton, Christopher W., editor, and Carminati, Barbara, editor
- Published
- 2022
- Full Text
- View/download PDF
25. A survey of feature compression techniques for intermediate layers of deep learning models (深度学习模型中间层特征压缩技术综述).
- Author
-
汪 维, 徐 龙, and 陈 卓
- Subjects
BANDWIDTH allocation, DEEP learning, VIDEO coding, IMAGE compression
- Abstract
As a new research hotspot in deep learning, intermediate deep feature compression has received a great deal of attention and has been applied to edge-cloud intelligent collaboration. This paper summarizes the current state of research on intermediate deep feature compression and analyzes the problems in current methods. First, it introduces three kinds of intermediate deep feature compression from the aspects of the image/video coding framework, channel bit allocation, and network units. It then compares the dataset performance of the three kinds of intermediate deep feature compression. Finally, the paper discusses the existing challenges and solutions in intermediate deep feature compression and looks forward to future trends. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
26. An Efficient Compression Method for Lightning Electromagnetic Pulse Signal Based on Convolutional Neural Network and Autoencoder.
- Author
-
Guo, Jinhua, Wang, Jiaquan, Xiao, Fang, Zhou, Xiao, Liu, Yongsheng, and Ma, Qiming
- Subjects
CONVOLUTIONAL neural networks, ELECTROMAGNETIC pulses, LIGHTNING, OPTICAL disks, PHOTOPLETHYSMOGRAPHY, DATA transmission systems, VIDEO coding
- Abstract
Advances in technology have facilitated the development of lightning research and data processing. The electromagnetic pulse signals emitted by lightning (LEMP) can be collected by very low frequency (VLF)/low frequency (LF) instruments in real time. The storage and transmission of the obtained data is a crucial link, and a good compression method can improve the efficiency of this process. In this paper, a lightning convolutional stack autoencoder (LCSAE) model for compressing LEMP data was designed, which converts the data into low-dimensional feature vectors through the encoder part and reconstructs the waveform through the decoder part. Finally, we investigated the compression performance of the LCSAE model for LEMP waveform data under different compression ratios. The results show that the compression performance is positively correlated with the minimum feature of the neural network extraction model. When the compressed minimum feature is 64, the average coefficient of determination R² of the reconstructed waveform and the original waveform can reach 96.7%. It can effectively solve the problem regarding the compression of LEMP signals collected by the lightning sensor and improve the efficiency of remote data transmission. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
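Entry 26 above compresses lightning waveforms with a convolutional stacked autoencoder whose bottleneck (the "minimum feature") controls the compression ratio. The PyTorch sketch below is a generic 1-D convolutional autoencoder with a 64-dimensional bottleneck trained with a reconstruction loss; the layer sizes and waveform length are illustrative, not the LCSAE configuration.

```python
import torch
import torch.nn as nn

class WaveformAutoencoder(nn.Module):
    """Toy 1-D autoencoder: encode a pulse waveform into a short feature vector,
    then reconstruct the waveform from it."""
    def __init__(self, latent=64, length=1024):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, 9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, 9, stride=4, padding=4), nn.ReLU(),
            nn.Flatten(), nn.Linear(32 * (length // 16), latent))
        self.decoder = nn.Sequential(
            nn.Linear(latent, 32 * (length // 16)), nn.ReLU(),
            nn.Unflatten(1, (32, length // 16)),
            nn.ConvTranspose1d(32, 16, 8, stride=4, padding=2), nn.ReLU(),
            nn.ConvTranspose1d(16, 1, 8, stride=4, padding=2))

    def forward(self, x):
        return self.decoder(self.encoder(x))

x = torch.randn(8, 1, 1024)                       # a batch of LEMP-like waveforms
recon = WaveformAutoencoder()(x)
loss = nn.functional.mse_loss(recon, x)           # reconstruction objective
print(recon.shape, float(loss))
```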
27. Loop Closure Detection Based on Compressed ConvNet Features in Dynamic Environments
- Author
-
Shuhai Jiang, Zhongkai Zhou, and Shangjie Sun
- Subjects
loop closure detection, visual slam, feature compression, convolutional neural network, KL transformation, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
In dynamic environments, convolutional neural networks (CNNs) often produce image feature maps with significant redundancy due to external factors such as moving objects and occlusions. These feature maps are inadequate as precise image descriptors for similarity measurement, hindering loop closure detection. Addressing this issue, this paper proposes feature compression of convolutional neural network output. The approach is detailed as follows: (1) employing ResNet152 as the backbone feature-extraction network, a Siamese neural network is constructed to enhance the efficiency of feature extraction; (2) utilizing KL transformation to extract principal components from the backbone network’s output, thereby eliminating redundant information; (3) employing the compressed features as input for NetVLAD to construct a spatially informed feature descriptor for similarity measurement. Experimental results demonstrate that, on the New College dataset, the proposed improved method exhibits an approximately 9.98% enhancement in average accuracy compared to the original network. On the City Center dataset, there is an improvement of approximately 2.64%, with an overall increase of about 23.51% in time performance. These findings indicate that the enhanced ResNet152 performs better than the original network in environments with more moving objects and occlusions.
- Published
- 2023
- Full Text
- View/download PDF
28. Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence
- Author
-
Robert A. Cohen, Hyomin Choi, and Ivan V. Bajic
- Subjects
Collaborative intelligence, deep learning, neural network compression, feature compression, quantization, Electric apparatus and materials. Electric circuits. Electric networks, TK452-454.4
- Abstract
In collaborative intelligence applications, part of a deep neural network (DNN) is deployed on a lightweight device such as a mobile phone or edge device, and the remaining portion of the DNN is processed where more computing resources are available, such as in the cloud. This paper presents a novel lightweight compression technique designed specifically to quantize and compress the features output by the intermediate layer of a split DNN, without requiring any retraining of the network weights. Mathematical models for estimating the clipping and quantization error of leaky-ReLU and ReLU activations at this intermediate layer are used to compute optimal clipping ranges for coarse quantization. We also present a modified entropy-constrained design algorithm for quantizing clipped activations. When applied to popular object-detection and classification DNNs, we were able to compress the 32-bit floating point intermediate activations down to 0.6 to 0.8 bits, while keeping the loss in accuracy to less than 1%. When compared to HEVC, we found that the lightweight codec consistently provided better inference accuracy, by up to 1.3%. The performance and simplicity of this lightweight compression technique makes it an attractive option for coding an intermediate layer of a split neural network for edge/cloud applications.
- Published
- 2021
- Full Text
- View/download PDF
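Entry 28 above clips and coarsely quantizes the activations at the split layer before transmission. The paper derives optimal clipping ranges from error models of leaky-ReLU/ReLU activations and uses an entropy-constrained quantizer; the NumPy sketch below swaps those in for a hand-picked clipping range and plain uniform quantization, just to show the clip-quantize-dequantize round trip.

```python
import numpy as np

def quantize_activations(act, clip_min, clip_max, bits=2):
    """Clip a split-layer activation tensor to [clip_min, clip_max] and quantize it
    uniformly to 2**bits levels. Returns integer codes plus dequantization parameters."""
    steps = 2 ** bits - 1
    clipped = np.clip(act, clip_min, clip_max)
    scale = (clip_max - clip_min) / steps
    codes = np.round((clipped - clip_min) / scale).astype(np.uint8)
    return codes, clip_min, scale

def dequantize(codes, clip_min, scale):
    """Cloud-side reconstruction of the activations from the integer codes."""
    return codes.astype(np.float32) * scale + clip_min

act = np.random.randn(1, 256, 14, 14).astype(np.float32)        # intermediate-layer output
codes, zero, scale = quantize_activations(act, clip_min=-1.0, clip_max=4.0, bits=2)
restored = dequantize(codes, zero, scale)
print(codes.dtype, int(codes.max()),                            # uint8 codes in [0, 3]
      float(np.abs(restored - np.clip(act, -1.0, 4.0)).max()))  # error bounded by scale / 2
```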
29. Feature Compression Applications of Genetic Algorithm.
- Author
-
Zou, Meiling, Jiang, Sirong, Wang, Fang, Zhao, Long, Zhang, Chenji, Bao, Yuting, Chen, Yonghao, and Xia, Zhiqiang
- Subjects
GENETIC algorithms, SINGLE nucleotide polymorphisms, CASSAVA, POTATOES, NANOTECHNOLOGY, IMAGE compression
- Abstract
With the rapid development of molecular breeding technology and many new varieties breeding, a method is urgently needed to identify different varieties accurately and quickly. Using this method can not only help farmers feel convenient and efficient in the normal cultivation and breeding process but also protect the interests of breeders, producers and users. In this study, single nucleotide polymorphism (SNP) data of 533 Oryza sativa, 284 Solanum tuberosum and 247 Sus scrofa and 544 Manihot esculenta Crantz were used. The original SNPs were filtered and screened to remove the SNPs with deletion number more than 1% or the homozygous genotype 0/0 and 1/1 number less than 2. The correlation between SNPs were calculated, and the two adjacent SNPs with correlation R² > 0.95 were retained. The genetic algorithm program was developed to convert the genotype format and randomly combine SNPs to calculate a set of a small number of SNPs which could distinguish all varieties in different species as fingerprint data, using Matlab platform. The successful construction of three sets of fingerprints showed that the method developed in this study was effective in animals and plants. The population structure analysis showed that the genetic algorithm could effectively obtain the core SNPs for constructing fingerprints, and the fingerprint was practical and effective. At present, the two-dimensional code of Manihot esculenta Crantz fingerprint obtained by this method has been applied to field planting. This study provides a novel idea for the Oryza sativa, Solanum tuberosum, Sus scrofa and Manihot esculenta Crantz identification of various species, lays foundation for the cultivation and identification of new varieties, and provides theoretical significance for many other species fingerprints construction. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
30. Feature Compression Applications of Genetic Algorithm
- Author
-
Meiling Zou, Sirong Jiang, Fang Wang, Long Zhao, Chenji Zhang, Yuting Bao, Yonghao Chen, and Zhiqiang Xia
- Subjects
fingerprint, DNA molecular markers, SNP, genetic algorithm, feature compression, Genetics, QH426-470
- Abstract
With the rapid development of molecular breeding technology and many new varieties breeding, a method is urgently needed to identify different varieties accurately and quickly. Using this method can not only help farmers feel convenient and efficient in the normal cultivation and breeding process but also protect the interests of breeders, producers and users. In this study, single nucleotide polymorphism (SNP) data of 533 Oryza sativa, 284 Solanum tuberosum and 247 Sus scrofa and 544 Manihot esculenta Crantz were used. The original SNPs were filtered and screened to remove the SNPs with deletion number more than 1% or the homozygous genotype 0/0 and 1/1 number less than 2. The correlation between SNPs were calculated, and the two adjacent SNPs with correlation R2 > 0.95 were retained. The genetic algorithm program was developed to convert the genotype format and randomly combine SNPs to calculate a set of a small number of SNPs which could distinguish all varieties in different species as fingerprint data, using Matlab platform. The successful construction of three sets of fingerprints showed that the method developed in this study was effective in animals and plants. The population structure analysis showed that the genetic algorithm could effectively obtain the core SNPs for constructing fingerprints, and the fingerprint was practical and effective. At present, the two-dimensional code of Manihot esculenta Crantz fingerprint obtained by this method has been applied to field planting. This study provides a novel idea for the Oryza sativa, Solanum tuberosum, Sus scrofa and Manihot esculenta Crantz identification of various species, lays foundation for the cultivation and identification of new varieties, and provides theoretical significance for many other species fingerprints construction.
- Published
- 2022
- Full Text
- View/download PDF
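Entries 29 and 30 above search for a small "core" SNP subset whose genotypes give every variety a unique fingerprint, using a genetic algorithm in Matlab. The Python sketch below swaps the GA for a simple greedy search on synthetic genotype data, purely to show the selection criterion (all varieties distinguishable); the data, candidate sampling and stopping rule are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
genotypes = rng.integers(0, 3, size=(200, 5000))     # 200 varieties x 5000 SNPs, coded 0/1/2

def distinguishes_all(subset):
    """Fitness check: do the chosen SNPs give every variety a unique genotype pattern?"""
    rows = [tuple(r) for r in genotypes[:, subset]]
    return len(set(rows)) == genotypes.shape[0]

def greedy_core_snps(candidates_per_round=200):
    """Greedy stand-in for the genetic algorithm: repeatedly add the sampled SNP
    that resolves the most remaining variety collisions."""
    chosen = []
    while not (chosen and distinguishes_all(chosen)):
        best, best_groups = None, -1
        for s in rng.choice(genotypes.shape[1], candidates_per_round, replace=False):
            groups = len(set(tuple(r) for r in genotypes[:, chosen + [int(s)]]))
            if groups > best_groups:
                best, best_groups = int(s), groups
        chosen.append(best)
    return chosen

core = greedy_core_snps()
print(f"{len(core)} SNPs distinguish all {genotypes.shape[0]} varieties")
```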
31. A feature compression method based on similarity matching.
- Author
-
Jiang, Wei, Shen, Haoyu, Xu, Zitao, Yang, Cheng, and Yang, Junjie
- Subjects
IMAGE recognition (Computer vision), BIT rate, ALGORITHMS
- Abstract
In collaborative intelligent applications, it is attractive to split the neural network into two parts: the front part is deployed on an edge device and the remaining part on the cloud side. Compressing the intermediate features is a promising approach to high efficiency. In this paper, a novel feature compression method based on feature similarity matching is presented. According to the statistical characteristics of the features, the noninformative features that are not sensitive to compression distortion are determined. The similarity degree between the noninformative features and the other features is calculated, and features with a high similarity degree are removed in the encoder and replaced by the noninformative features in the decoder. In addition, a transformation is introduced to put the features into a compact form and reduce statistical redundancy. Without loss of generality, image classification is taken as the intelligent task to validate the effectiveness of the proposed method. Experimental results show that the proposed method achieves higher compression efficiency at various bit rates. In comparison with different feature compression algorithms, at least 24% bitrate saving is obtained.
• A novel feature compression framework based on feature similarity matching.
• The least important features from the vision task's point of view are determined, and a similarity matrix is introduced to measure the similarity degree.
• 2D-DCT is introduced to further squeeze the redundancy of the features. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
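Entry 31 above removes features that are highly similar to designated noninformative features and lets the decoder substitute them back. The sketch below shows only the similarity-matching idea, using cosine similarity between flattened channels of a feature map and a fixed threshold; the paper's statistical selection of noninformative features and its 2D-DCT step are not reproduced.

```python
import numpy as np

def drop_similar_channels(feat, threshold=0.95):
    """Toy similarity-based pruning: drop any channel whose cosine similarity to an
    already-kept channel exceeds `threshold`; record which kept channel replaces it."""
    c = feat.shape[0]
    flat = feat.reshape(c, -1)
    unit = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-12)
    sim = unit @ unit.T                                   # channel-to-channel similarity matrix
    kept, replacement = [], {}
    for i in range(c):
        match = next((j for j in kept if sim[i, j] > threshold), None)
        if match is None:
            kept.append(i)
        else:
            replacement[i] = match                        # decoder restores i from channel `match`
    return feat[kept], kept, replacement

feat = np.random.randn(64, 28, 28).astype(np.float32)
feat[10] = feat[3] * 1.01                                 # make one channel nearly redundant
compressed, kept, repl = drop_similar_channels(feat)
print(len(kept), repl)                                    # 63 channels kept, {10: 3}
```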
32. A Cytokine Protein Identification Model Based on the Compressed PseKRAAC Features
- Author
-
Xing Gao and Guilin Li
- Subjects
MRMD, feature compression, cytokine identification, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Cytokine proteins, which form a complex cytokine regulatory network, participate in a variety of important physiological functions of the human body. Identification of cytokine proteins is very important and has attracted the attention of many researchers. In this paper, we propose a MRMD-cosine model based on the PseKRAAC features to identify the cytokine proteins. First, the PseKRAAC feature extraction method is used to extract four kinds of feature sets from the cytokine proteins, named the type1 g-gap, type1 lambda, type2 g-gap and type2 lambda feature sets. Then the MRMD algorithm is used to remove the redundant features from the feature sets. Three kinds of metrics are used by the MRMD algorithm to measure the redundancy of a feature set: the Euclidean distance, Cosine similarity and Tanimoto coefficient. Bagging and random forest algorithms are used to construct the classification models based on the compressed feature set. The experimental results show that the MRMD-cosine model based on the type1 lambda feature set constructed by the random forest algorithm achieves the best performance among all models. Finally, we compare the performance of the MRMD-cosine model with another state-of-the-art model, the greedy-based feature compression model based on the CNT features. It shows that the MRMD-cosine model uses only 15% of the features of the greedy-based model while achieving better accuracy.
- Published
- 2020
- Full Text
- View/download PDF
33. The Feature Compression Algorithms for Identifying Cytokines Based on CNT Features
- Author
-
Guilin Li and Xing Gao
- Subjects
Cytokine identification, feature compression, feature selection, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
As signaling proteins, cytokines regulate a wide range of biological functions, so it is important to distinguish cytokines from other kinds of proteins. The 188-dimensional (188D) CNT features have been presented to identify cytokines, but they contain many redundant features. In this paper, we propose three feature compression algorithms that exclude the redundant features from the 188D features while keeping the accuracy of the algorithm: a genetic-based algorithm, a greedy-based algorithm and a brute-force-based algorithm. Experimental results demonstrate that the brute-force-based algorithm achieves the highest classification accuracy among the three, and the genetic-based algorithm yields the smallest number of compressed features, but both consume much more time than the greedy-based algorithm. The greedy-based algorithm makes a good trade-off among the three factors: classification accuracy, the number of compressed features and time consumption.
- Published
- 2020
- Full Text
- View/download PDF
34. A Multimodal End-to-End Deep Learning Architecture for Music Popularity Prediction
- Author
-
David Martin-Gutierrez, Gustavo Hernandez Penaloza, Alberto Belmonte-Hernandez, and Federico Alvarez Garcia
- Subjects
Multimedia information retrieval systems, autoencoders, deep learning, feature compression, music information retrieval, popularity prediction, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
The continuous evolution of multimedia applications is fostering applied research in order to dynamically enhance the services provided by platforms such as Spotify, Lastfm, or Billboard. Thus, innovative methods for retrieving specific information from large volumes of music-related data arise as a potential challenge within the Music Information Retrieval (MIR) framework. Moreover, despite the existence of several music datasets, there is still a lack of information to properly assess an accurate estimation of the impact or the popularity of a song within a platform. Furthermore, the aforementioned platforms measure popularity in various manners, which increases the difficulty of building generalized and comparable models. In this paper, the creation of the SpotGenTrack Popularity Dataset (SPD) is presented as an alternative to existing datasets that will facilitate researchers when comparing and promoting their models. In addition, an innovative multimodal end-to-end deep learning architecture named HitMusicNet is presented for predicting popularity in music recordings. Experiments show that the proposed architecture outperforms previous state-of-the-art studies by incorporating three main modalities into the analysis (audio, lyrics and metadata) as well as a preliminary autoencoder-based compression stage that improves the capability of the model when predicting popularity.
- Published
- 2020
- Full Text
- View/download PDF
35. Feature-Compression-Based Detection of Sea-Surface Small Targets
- Author
-
Penglang Shui, Zixun Guo, and Sainan Shi
- Subjects
High-resolution maritime ubiquitous radars, sea-surface small targets, feature compression, convexhull learning, one-class classifier, feature-compression-based detector, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
This paper aims to develop a feature-based detector using seven existing salient features of radar returns to improve the detection ability of high-resolution maritime ubiquitous radars for sea-surface small targets. Maritime ubiquitous radars form simultaneously dwelling beams at multiple azimuths by digital array receivers and allow long observation times for detection. Due to the absence or incompleteness of training samples of radar returns with various types of sea-surface small targets, the detection boils down to designing a one-class classifier in the seven-dimensional (7D) feature space mainly by using training samples of sea clutter. A feature compression method, through maximizing the interclass Bhattacharyya distance, is proposed to compress the 7D feature vector into one 3D feature vector with the help of simulated radar returns of typical targets. In the compressed 3D feature space, a modified convex-hull learning algorithm is given to determine one convex polyhedron decision region of sea clutter at a given false alarm rate. In this way, a feature-compression-based detector is constructed, which can exploit more features of radar returns to improve detection performance. It is verified on the recognized and open IPIX and CSIR radar databases for sea-surface small target detection. The results show that it attains obvious performance improvement.
- Published
- 2020
- Full Text
- View/download PDF
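The compression criterion named in entry 35 above is the interclass Bhattacharyya distance between target and clutter feature distributions. The sketch below computes the standard Gaussian form of that distance and evaluates a linear 7D-to-3D projection; the random projection `W` is only a stand-in for the optimized one, and the convex-hull classifier is not reproduced.

```python
import numpy as np

def bhattacharyya_distance(x1, x2):
    """Bhattacharyya distance between two classes of feature vectors, each modeled
    as a multivariate Gaussian (rows are samples, columns are features)."""
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    c1, c2 = np.cov(x1, rowvar=False), np.cov(x2, rowvar=False)
    c = (c1 + c2) / 2
    diff = m1 - m2
    term1 = 0.125 * diff @ np.linalg.solve(c, diff)
    term2 = 0.5 * np.log(np.linalg.det(c) /
                         np.sqrt(np.linalg.det(c1) * np.linalg.det(c2)))
    return term1 + term2

# toy example: 7D clutter vs. target features compressed by a linear map W to 3D;
# the paper searches for the projection that maximizes the distance computed below.
rng = np.random.default_rng(0)
clutter = rng.normal(0.0, 1.0, size=(1000, 7))
target = rng.normal(0.6, 1.2, size=(300, 7))
W = rng.normal(size=(7, 3))                      # stand-in for the optimized projection
print(bhattacharyya_distance(clutter @ W, target @ W))
```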
36. Large Scale Image Retrieval and Its Challenges
- Author
-
Azim, Tayyaba, Ahmed, Sarah, Zdonik, Stan, Series Editor, Shekhar, Shashi, Series Editor, Wu, Xindong, Series Editor, Jain, Lakhmi C., Series Editor, Padua, David, Series Editor, Shen, Xuemin Sherman, Series Editor, Furht, Borko, Series Editor, Subrahmanian, V.S., Series Editor, Hebert, Martial, Series Editor, Ikeuchi, Katsushi, Series Editor, Siciliano, Bruno, Series Editor, Jajodia, Sushil, Series Editor, Lee, Newton, Series Editor, Azim, Tayyaba, and Ahmed, Sarah
- Published
- 2018
- Full Text
- View/download PDF
37. Condensing Deep Fisher Vectors: To Choose or to Compress?
- Author
-
Ahmed, Sarah, Azim, Tayyaba, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Weikum, Gerhard, Series Editor, De Marsico, Maria, editor, di Baja, Gabriella Sanniti, editor, and Fred, Ana, editor
- Published
- 2018
- Full Text
- View/download PDF
38. Residual Compression Network for Faster Correlation Tracking
- Author
-
Xie, Chao, Wang, Ning, Zhou, Wengang, Li, Weiping, Li, Houqiang, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Weikum, Gerhard, Series Editor, Hong, Richang, editor, Cheng, Wen-Huang, editor, Yamasaki, Toshihiko, editor, Wang, Meng, editor, and Ngo, Chong-Wah, editor
- Published
- 2018
- Full Text
- View/download PDF
39. Compact Deep Color Features for Remote Sensing Scene Classification.
- Author
-
Anwer, Rao Muhammad, Khan, Fahad Shahbaz, and Laaksonen, Jorma
- Subjects
REMOTE sensing, DEEP learning, CONVOLUTIONAL neural networks, IMAGE representation, CLASSIFICATION, TEXT recognition
- Abstract
Aerial scene classification is a challenging problem in understanding high-resolution remote sensing images. Most recent aerial scene classification approaches are based on Convolutional Neural Networks (CNNs). These CNN models are trained on a large amount of labeled data and the de facto practice is to use RGB patches as input to the networks. However, the importance of color within the deep learning framework is yet to be investigated for aerial scene classification. In this work, we investigate the fusion of several deep color models, trained using color representations, for aerial scene classification. We show that combining several deep color models significantly improves the recognition performance compared to using the RGB network alone. This improvement in classification performance is, however, achieved at the cost of a high-dimensional final image representation. We propose to use an information theoretic compression approach to counter this issue, leading to a compact deep color feature set without any significant loss in accuracy. Comprehensive experiments are performed on five remote sensing scene classification benchmarks: UC-Merced with 21 scene classes, WHU-RS19 with 19 scene types, RSSCN7 with 7 categories, AID with 30 aerial scene classes, and NWPU-RESISC45 with 45 categories. Our results clearly demonstrate that the fusion of deep color features always improves the overall classification performance compared to the standard RGB deep features. On the large-scale NWPU-RESISC45 dataset, our deep color features provide a significant absolute gain of 4.3% over the standard RGB deep features. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
40. Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics.
- Author
-
Duan, Lingyu, Liu, Jiaying, Yang, Wenhan, Huang, Tiejun, and Gao, Wen
- Subjects
COMPUTER vision, VIDEO compression, VIDEO coding, VISUAL perception, STREAMING video & television, IMAGE compression, DEEP learning
- Abstract
Video coding, which targets to compress and reconstruct the whole frame, and feature compression, which only preserves and transmits the most critical information, stand at two ends of the scale. That is, one is with compactness and efficiency to serve for machine vision, and the other is with full fidelity, bowing to human perception. The recent endeavors in imminent trends of video compression, e.g. deep learning based coding tools and end-to-end image/video coding, and MPEG-7 compact feature descriptor standards, i.e. Compact Descriptors for Visual Search and Compact Descriptors for Video Analysis, promote the sustainable and fast development in their own directions, respectively. In this article, thanks to booming AI technology, e.g. prediction and generation models, we carry out exploration in the new area, Video Coding for Machines (VCM), arising from the emerging MPEG standardization efforts. Towards collaborative compression and intelligent analytics, VCM attempts to bridge the gap between feature coding for machine vision and video coding for human vision. Aligning with the rising Analyze then Compress instance Digital Retina, the definition, formulation, and paradigm of VCM are given first. Meanwhile, we systematically review state-of-the-art techniques in video compression and feature compression from the unique perspective of MPEG standardization, which provides the academic and industrial evidence to realize the collaborative compression of video and feature streams in a broad range of AI applications. Finally, we come up with potential VCM solutions, and the preliminary results have demonstrated the performance and efficiency gains. Further direction is discussed as well. (Mailing list: https://lists.aau.at/mailman/listinfo/mpeg-vcm) [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
41. Toward Intelligent Sensing: Intermediate Deep Feature Compression.
- Author
-
Chen, Zhuo, Fan, Kui, Wang, Shiqi, Duan, Lingyu, Lin, Weisi, and Kot, Alex Chichung
- Subjects
DEEP learning, ARTIFICIAL neural networks, VIDEO coding
- Abstract
The recent advances of hardware technology have made the intelligent analysis equipped at the front-end with deep learning more prevailing and practical. To better enable the intelligent sensing at the front-end, instead of compressing and transmitting visual signals or the ultimately utilized top-layer deep learning features, we propose to compactly represent and convey the intermediate-layer deep learning features with high generalization capability, to facilitate the collaborating approach between front and cloud ends. This strategy enables a good balance among the computational load, transmission load and the generalization ability for cloud servers when deploying the deep neural networks for large scale cloud based visual analysis. Moreover, the presented strategy also makes the standardization of deep feature coding more feasible and promising, as a series of tasks can simultaneously benefit from the transmitted intermediate layer features. We also present the results for evaluations of both lossless and lossy deep feature compression, which provide meaningful investigations and baselines for future research and standardization activities. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
42. IF-CNN: Image-Aware Inference Framework for CNN With the Collaboration of Mobile Devices and Cloud
- Author
-
Guansheng Shu, Weiqing Liu, Xiaojie Zheng, and Jing Li
- Subjects
CNN-based mobile applications, IF-CNN, model selection, half-floating optimization, feature compression, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Improving the performance of CNN-based mobile applications by offloading computation from mobile devices to the cloud has attracted the attention of the community. Generally, there are three stages in the workflow: local inference on the mobile device, data transmission of the intermediate result, and remote inference in the cloud. However, the time costs of local inference and data transmission are still the bottlenecks in reaching the desired inference performance. In this paper, we propose an image-aware inference framework called IF-CNN to enable fast inference based on computation offloading. In the framework, we first build a model pool consisting of CNN models with different complexities. The most efficient of these candidate models is selected to process the corresponding image. During the selection process, we have designed an effective model to predict the confidence based on multi-task learning. After model selection, half-floating optimization and feature compression are applied to accelerate the process of distributed inference between mobile devices and the cloud. Experimental results show that IF-CNN reliably identifies the most effective model for different images and that the total inference performance can be significantly improved. Meanwhile, IF-CNN is complementary to other inference acceleration methods for CNN models.
- Published
- 2018
- Full Text
- View/download PDF
43. Human Pose Estimation via an Ultra-Lightweight Pose Distillation Network
- Author
-
Zhang, Shihao, Qiang, Baohua, Yang, Xianyi, Wei, Xuekai, Chen, Ruidong, and Chen, Lirui
- Subjects
ultra-lightweight pose estimation, knowledge distillation, re-parameterized module, end-to-end, feature compression
- Abstract
Most current pose estimation methods have a high resource cost that makes them unusable in some resource-limited devices. To address this problem, we propose an ultra-lightweight end-to-end pose distillation network, which applies some helpful techniques to suitably balance the number of parameters and predictive accuracy. First, we designed a lightweight one-stage pose estimation network, which learns from an increasingly refined sequential expert network in an online knowledge distillation manner. Then, we constructed an ultra-lightweight re-parameterized pose estimation subnetwork that uses a multi-module design with weight sharing to improve the multi-scale image feature acquisition capability of the single-module design. When training was complete, we used the first re-parameterized module as the deployment network to retain the simple architecture. Finally, extensive experimental results demonstrated the detection precision and low parameters of our method.
- Published
- 2023
- Full Text
- View/download PDF
44. Fast and Accurate Lane Detection via Graph Structure and Disentangled Representation Learning
- Author
-
Yulin He, Wei Chen, Chen Li, Xin Luo, and Libo Huang
- Subjects
lane detection ,graph structure ,feature compression ,disentangled representation learning ,Chemical technology ,TP1-1185 - Abstract
It is desirable to maintain both high accuracy and runtime efficiency in lane detection. However, because lanes are long and thin, extracting features with both strong discrimination and perception abilities requires a huge amount of computation, which seriously slows down the running speed. We therefore design a more efficient way to extract lane features, consisting of two phases: (1) local feature extraction, which sets a series of predefined anchor lines and extracts local features at their locations; and (2) global feature aggregation, which treats the local features as graph nodes, builds a fully connected graph by adaptively learning the distances between nodes, and finally aggregates the global feature through weighted summation. Another problem that limits performance is the information loss in feature compression, mainly due to the huge dimensional gap, e.g., from 512 to 8. To handle this issue, we propose a feature compression module based on disentangled representation learning, which effectively learns the statistical information and spatial relationships between features; redundancy is greatly reduced and more critical information is retained. Extensive experimental results show that our proposed method is both fast and accurate: on the TuSimple and CULane benchmarks, at a running speed of 248 FPS, it achieves F1 values of 96.81% and 75.49%, respectively.
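A minimal sketch of the global feature aggregation phase, assuming attention-style learned affinities between anchor-line nodes and weighted summation over the fully connected graph; the projection matrices, anchor count, and feature dimension are illustrative assumptions rather than the paper's configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_global_features(local_feats, w_q, w_k):
    """Treat each anchor-line feature as a graph node, learn pairwise affinities,
    and aggregate a global feature per node by weighted summation."""
    q = local_feats @ w_q                                   # (N, d) projected queries
    k = local_feats @ w_k                                   # (N, d) projected keys
    affinity = q @ k.T / np.sqrt(q.shape[1])                # adaptively learned "distances"
    weights = softmax(affinity, axis=1)                     # edge weights of the fully connected graph
    return weights @ local_feats                            # each node sums information from all others

rng = np.random.default_rng(0)
n_anchors, dim = 72, 64                                     # hypothetical anchor count / feature dim
local = rng.standard_normal((n_anchors, dim)).astype(np.float32)
w_q = rng.standard_normal((dim, 32)).astype(np.float32)
w_k = rng.standard_normal((dim, 32)).astype(np.float32)
print(aggregate_global_features(local, w_q, w_k).shape)     # (72, 64)
```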
- Published
- 2021
- Full Text
- View/download PDF
45. BBNet: A Novel Convolutional Neural Network Structure in Edge-Cloud Collaborative Inference
- Author
-
Hongbo Zhou, Weiwei Zhang, Chengwei Wang, Xin Ma, and Haoran Yu
- Subjects
collaborative intelligence ,deep learning ,model compression ,feature compression ,cloud computing ,Chemical technology ,TP1-1185 - Abstract
Edge-cloud collaborative inference can significantly reduce the delay of a deep neural network (DNN) by dividing the network between the mobile edge and the cloud. However, the intermediate-layer data of a DNN are usually larger than the original input, so the communication time needed to send intermediate data to the cloud also increases end-to-end latency. To cope with these challenges, this paper proposes a novel convolutional neural network structure, BBNet, that accelerates collaborative inference at two levels: (1) channel pruning, which reduces the number of calculations and parameters of the original network; and (2) compression of the feature map at the split point, which further reduces the size of the data transmitted. The BBNet structure was implemented on an NVIDIA Nano and a server. Compared with the original network, BBNet achieves compression rates of up to 5.67× in FLOPs and 11.57× in parameters. In the best case, the feature compression layer reaches a bit-compression rate of 512×. BBNet's advantage in inference delay is more pronounced when network conditions are poor than when bandwidth is plentiful: for example, when the upload bandwidth is only 20 kb/s, the end-to-end latency of BBNet improves on the cloud-only approach by a factor of 38.89.
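A minimal sketch of the split-point idea, assuming a 1x1 channel-reduction layer followed by low-bit uniform quantization of the feature map; the channel counts and bit depth below are illustrative choices and do not reproduce the paper's reported 512× figure.

```python
import numpy as np

def split_point_compress(feat, w_reduce, bits=2):
    """Illustrative split-point compression: a 1x1 'bottleneck' reduces channels,
    then activations are quantized to a few bits before being sent to the cloud."""
    reduced = np.einsum('nchw,ck->nkhw', feat, w_reduce)        # channel reduction (1x1 conv)
    lo, hi = reduced.min(), reduced.max()
    levels = 2 ** bits - 1
    q = np.round((reduced - lo) / max(hi - lo, 1e-8) * levels).astype(np.uint8)
    return q, (lo, hi)

feat = np.random.rand(1, 256, 28, 28).astype(np.float32)       # feature map at the split point
w_reduce = np.random.randn(256, 16).astype(np.float32) * 0.05  # 256 -> 16 channels
q, meta = split_point_compress(feat, w_reduce, bits=2)
raw_bits = feat.size * 32                                      # float32 feature map
sent_bits = q.size * 2                                         # 2-bit codes actually transmitted
print(f"bit-compression rate: {raw_bits / sent_bits:.0f}x")    # (256/16) * (32/2) = 256x here
```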
- Published
- 2021
- Full Text
- View/download PDF
46. A pattern-based validation method for the credibility evaluation of simulation models.
- Author
-
Laili, Yuanjun, Zhang, Lin, and Luo, Yongliang
- Subjects
- *
SIMULATION methods & models , *EVALUATION methodology , *INFORMATION modeling , *DYNAMIC models , *PATTERN matching - Abstract
Measuring the credibility of a simulation model has always been challenging due to growing uncertainty and complexity. Over the past decades, plenty of metrics and evaluation procedures have been developed for evaluating different sorts of simulation models. Most of the existing research focuses on directly comparing numerical results with a group of reference data. However, this approach is sometimes unsuitable for evolving dynamic models such as multi-agent models: under the same conditions, both the practical system and the simulation model exhibit highly dynamic behavior, so the credibility of a model with insufficient information, non-stationary states, and a changing environment cannot be established through direct pairwise comparison. This paper presents a pattern-based validation method that complementarily extracts hidden patterns present in both a simulation model and its reference data, and assesses model performance from a different perspective. First, a multi-dimensional perceptually important points strategy is modified to find the patterns that exist in time-series data. Afterward, a pattern-organizing topology is applied to automatically extract the required patterns from the reference data and assess the performance of the corresponding simulation model. An extensive case study on three simulation models shows the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
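The classic single-series perceptually important points (PIP) procedure underlying the modified strategy can be sketched as follows; this greedy vertical-distance variant is an assumption for illustration, not the paper's multi-dimensional algorithm.

```python
import numpy as np

def perceptually_important_points(series: np.ndarray, n_points: int) -> list:
    """Greedy PIP extraction: repeatedly add the point with the maximum vertical
    distance to the line joining its two nearest already-selected points."""
    pips = [0, len(series) - 1]                        # always keep the two endpoints
    while len(pips) < n_points:
        best_idx, best_dist = None, -1.0
        ordered = sorted(pips)
        for left, right in zip(ordered[:-1], ordered[1:]):
            for i in range(left + 1, right):
                t = (i - left) / (right - left)
                interp = series[left] * (1 - t) + series[right] * t
                d = abs(series[i] - interp)            # vertical distance to the chord
                if d > best_dist:
                    best_idx, best_dist = i, d
        if best_idx is None:
            break
        pips.append(best_idx)
    return sorted(pips)

t = np.linspace(0, 4 * np.pi, 200)
signal = np.sin(t) + 0.1 * np.random.randn(200)        # stand-in time series
print(perceptually_important_points(signal, 7))        # indices of the 7 most salient points
```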
- Published
- 2020
- Full Text
- View/download PDF
47. Joint Feature and Texture Coding: Toward Smart Video Representation via Front-End Intelligence.
- Author
-
Ma, Siwei, Zhang, Xiang, Wang, Shiqi, Zhang, Xinfeng, Jia, Chuanmin, and Wang, Shanshe
- Subjects
- *
VIDEO coding , *ARTIFICIAL intelligence , *VIDEO surveillance , *TEXTURES , *DEEP learning , *VIDEO compression , *VIDEOS - Abstract
In this paper, we provide a systematic overview and analysis of the joint feature and texture representation framework, which aims to smartly and coherently represent visual information with front-end intelligence in the scenario of video big data applications. In particular, we first demonstrate the advantages of the joint compression framework in terms of both reconstruction quality and analysis accuracy. Subsequently, the interactions between visual features and texture during compression are illustrated. Finally, a future joint coding scheme incorporating deep learning features is envisioned, and challenges toward seamless and unified joint compression are discussed. The joint compression framework, which bridges the gap between visual analysis and signal-level representation, is expected to contribute to a series of applications, such as video surveillance and autonomous driving. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
48. TOP-SIFT: the selected SIFT descriptor based on dictionary learning.
- Author
-
Liu, Yujie, Yu, Deng, Chen, Xiaoming, Li, Zongmin, and Fan, Jianping
- Subjects
- *
SIMULATED annealing , *DESCRIPTOR systems , *IMAGE databases , *FEATURE selection , *LEARNING - Abstract
The large number of SIFT descriptors in an image and the high dimensionality of each descriptor create speed and scalability problems for large-scale image databases. In this paper, we present a descriptor selection algorithm based on dictionary learning that removes redundant features and reserves only a small set of features, which we refer to as TOP-SIFTs. In our experiments, we observed an inner connection between the descriptor selection problem and dictionary learning in sparse representation, and therefore recast the problem as dictionary learning. We designed a new dictionary learning method adapted to our problem and employed simulated annealing to obtain the optimal solution. During learning, we added a sparsity constraint and the spatial distribution characteristics of SIFT points, and finally selected a small, representative feature set with good spatial distribution. Compared with earlier methods, our method neither relies on the database nor loses important information, and the experiments show that the algorithm saves a large amount of memory and increases time efficiency while maintaining accuracy. [ABSTRACT FROM AUTHOR]
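A minimal sketch of selecting a small representative descriptor subset with simulated annealing, assuming nearest-selected-descriptor reconstruction error as a crude stand-in for the paper's dictionary-learning objective and omitting the spatial-distribution constraint; all parameters are illustrative.

```python
import numpy as np

def reconstruction_cost(descs, subset):
    """Cost of representing every descriptor by its nearest selected descriptor."""
    dictionary = descs[sorted(subset)]
    d2 = ((descs[:, None, :] - dictionary[None, :, :]) ** 2).sum(-1)
    return d2.min(axis=1).sum()

def select_top_descriptors(descs, k, iters=500, temp=1.0, cooling=0.99, seed=0):
    rng = np.random.default_rng(seed)
    current = set(rng.choice(len(descs), size=k, replace=False))
    cur_cost = reconstruction_cost(descs, current)
    for _ in range(iters):
        out_i = rng.choice(sorted(current))            # propose swapping one selected descriptor
        in_i = rng.integers(len(descs))                # for a randomly chosen unselected one
        if in_i in current:
            continue
        candidate = (current - {out_i}) | {in_i}
        cand_cost = reconstruction_cost(descs, candidate)
        # accept improvements always, worse moves with annealing probability
        if cand_cost < cur_cost or rng.random() < np.exp((cur_cost - cand_cost) / temp):
            current, cur_cost = candidate, cand_cost
        temp *= cooling
    return sorted(current)

descs = np.random.rand(300, 128).astype(np.float32)   # 300 stand-in SIFT descriptors (dim 128)
print(select_top_descriptors(descs, k=20)[:10])
```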
- Published
- 2019
- Full Text
- View/download PDF
49. LCO: Lightweight Convolution Operators for fast tracking.
- Author
-
Li, Dongdong, Wen, Gongjian, Kuai, Yangliu, and Hui, Bingwei
- Subjects
- *
OPERATOR theory , *IMAGE compression , *GRAPHICS processing units , *MATHEMATICAL optimization , *STATISTICAL correlation , *PARAMETERS (Statistics) - Abstract
In recent years, trackers based on Discriminative Correlation Filters (DCFs) have achieved continuous performance improvement through sophisticated learning models (e.g. HCF [1]) or multiple feature integration (e.g. CCOT [2]). However, the increasingly complex models introduce a massive number of trainable parameters in the correlation filter, which significantly slows down tracking and increases the risk of over-fitting. In this work, we tackle the problems of model complexity and over-fitting by introducing Lightweight Convolution Operators (LCO). Our LCO tracker applies dimensionality reduction and spatial constraints to the correlation filters to reduce model complexity and accelerate tracking. Compared with the baseline method, LCO removes over 90% of the redundant trainable parameters in the tracking model. We perform experiments on three benchmarks: OTB2013, OTB100 and VOT2016. On OTB100, LCO runs at 24 fps with hand-crafted features on a CPU and at 30 fps with shallow convolutional features on a GPU. With shallow convolutional features, LCO obtains 65.8% AUC on the OTB100 success plots. On VOT2016, our tracker ranks second in Expected Average Overlap (EAO) and first in Equivalent Filter Operations (EFO) among the top 5 trackers. Highlights: • This is the first work that conceptually combines the ideas of BACF and ECO. • We propose an optimization strategy for efficient online filter learning. • LCO scarcely suffers from residual boundary effects in circular correlation. [ABSTRACT FROM AUTHOR]
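A minimal sketch of the two ingredients named in the abstract, dimensionality reduction of the feature channels and Fourier-domain correlation filter learning, using a simple PCA projection and a per-channel ridge regression (MOSSE/KCF-style); this is an illustrative formulation under our own assumptions, not the paper's optimization strategy.

```python
import numpy as np

def gaussian_label(h, w, sigma=2.0):
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * sigma ** 2))

def train_lightweight_cf(feat, n_components=4, lam=1e-2):
    """Reduce channels with PCA, then solve a per-pixel ridge regression in the Fourier domain."""
    c, h, w = feat.shape
    x = feat.reshape(c, -1).T - feat.reshape(c, -1).T.mean(0)   # (h*w, c), centred
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    proj = vt[:n_components].T                                  # PCA projection: c -> n_components
    z = (x @ proj).T.reshape(n_components, h, w)
    zf = np.fft.fft2(z)
    yf = np.fft.fft2(gaussian_label(h, w))
    hf = (np.conj(zf) * yf) / (np.abs(zf) ** 2 + lam)           # one filter per reduced channel
    return proj, hf

def detect(feat, proj, hf):
    c, h, w = feat.shape
    x = feat.reshape(c, -1).T - feat.reshape(c, -1).T.mean(0)
    z = (x @ proj).T.reshape(-1, h, w)
    response = np.real(np.fft.ifft2(np.fft.fft2(z) * hf)).sum(0)
    return np.unravel_index(response.argmax(), response.shape)  # peak location of the response map

feat = np.random.rand(31, 64, 64).astype(np.float32)            # e.g. 31-channel HOG-like features
proj, hf = train_lightweight_cf(feat)
print(detect(feat, proj, hf))
```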
- Published
- 2018
- Full Text
- View/download PDF
50. Exploiting representations from pre-trained convolutional neural networks for high-resolution remote sensing image retrieval.
- Author
-
Ge, Yun, Jiang, Shunliang, Xu, Qingyong, Jiang, Changlong, and Ye, Famao
- Subjects
ARTIFICIAL neural networks ,REMOTE sensing ,IMAGE retrieval ,IMAGE archives ,FEATURE extraction ,IMAGE representation - Abstract
With the increasing volume of high-resolution remote sensing images, it is becoming more and more urgent to retrieve images from large archives efficiently. Existing methods mainly rely on shallow features for retrieval, and shallow features are easily affected by human intervention. Convolutional neural networks (CNNs) can learn feature representations automatically, and CNNs pre-trained on large-scale datasets are generic. This paper exploits representations from pre-trained CNNs for high-resolution remote sensing image retrieval. CNN representations from AlexNet, VGGM, VGG16, and GoogLeNet are first transferred to high-resolution remote sensing images, and CNN features are then extracted via two approaches: one extracts the outputs of high-level layers directly, and the other aggregates the outputs of mid-level layers by average pooling with different pooling regions. Given the generality and high dimensionality of the CNN features, feature combination and feature compression are also adopted to improve the feature representation. Experimental results demonstrate that aggregated features with a pooling region smaller than the feature map size perform excellently, especially for VGG16 and GoogLeNet. Shallow features contribute greatly to retrieval precision when combined with CNN features, and compressed features reduce redundancy effectively. Compared with the state-of-the-art methods, the proposed feature extraction methods are very simple, and the features significantly improve retrieval performance. [ABSTRACT FROM AUTHOR]
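A minimal sketch of the mid-level aggregation and compression steps described above, assuming random tensors as stand-ins for pre-trained CNN activations, non-overlapping average-pooling regions, and PCA for the feature compression; layer sizes and dimensions are illustrative.

```python
import numpy as np

def aggregate_mid_level(feat_map: np.ndarray, region: int) -> np.ndarray:
    """Average-pool a (C, H, W) mid-level feature map over square regions of size `region`
    and concatenate the pooled vectors (a region smaller than H gives several positions)."""
    c, h, w = feat_map.shape
    pooled = []
    for y in range(0, h - region + 1, region):
        for x in range(0, w - region + 1, region):
            pooled.append(feat_map[:, y:y+region, x:x+region].mean(axis=(1, 2)))
    return np.concatenate(pooled)

def pca_compress(features: np.ndarray, dim: int) -> np.ndarray:
    """Compress a set of image features (N, D) down to `dim` dimensions."""
    centered = features - features.mean(0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T

# Stand-ins for mid-level activations of a pre-trained CNN on 100 archive images.
rng = np.random.default_rng(1)
mid_maps = rng.random((100, 512, 14, 14)).astype(np.float32)
aggregated = np.stack([aggregate_mid_level(m, region=7) for m in mid_maps])
compressed = pca_compress(aggregated, dim=64)                  # redundancy reduction
print(aggregated.shape, compressed.shape)                      # (100, 2048) (100, 64)
```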
- Published
- 2018
- Full Text
- View/download PDF