Author: "Jiang, Shuqiang" / Topic: image recognition (computer vision) - Searchworks@Jio Institute Digital Library Search Results

8 results on '"Jiang, Shuqiang"'

1. Lightweight Food Recognition via Aggregation Block and Feature Encoding.

Author: Yang, Yancun, Min, Weiqing, Song, Jingru, Sheng, Guorui, Wang, Lili, and Jiang, Shuqiang
Subjects: IMAGE recognition (Computer vision), DATA mining, SOURCE code, ENCODING
Abstract: Food image recognition has recently been given considerable attention in the multimedia field in light of its possible implications on health. The characteristics of the dispersed distribution of ingredients in food images put forward higher requirements on the long-range information extraction ability of neural networks, leading to more complex and deeper models. Nevertheless, the lightweight version of food image recognition is essential for improved implementation on end devices and sustained server-side expansion. To address this issue, we present Aggregation Feature Net (AFNet), a lightweight network that is capable of effectively capturing both global and local features from food images. In AFNet, we develop a novel convolution based on a residual model by encoding global features through row-wise and column-wise information integration. Merging aggregation block with classic local convolution yields a framework that works as the backbone of the network. Based on the efficient use of parameters by the aggregation block, we constructed a lightweight food image recognition network with fewer layers and a smaller scale, assisted by a new type of activation function. Experimental results on four popular food recognition datasets demonstrate that our approach achieves state-of-the-art performance with higher accuracy and fewer FLOPs and parameters. For example, in comparison to the current state-of-the-art model of MobileViTv2, AFNet achieved 88.4% accuracy of the top-1 level on the ETHZ Food-101 dataset, with similar parameters and FLOPs but 1.4% more accuracy. The source code will be provided in supplementary materials. [ABSTRACT FROM AUTHOR]
Published: 2024

2. Ingredient-Guided Region Discovery and Relationship Modeling for Food Category-Ingredient Prediction.

Author: Wang, Zhiling, Min, Weiqing, Li, Zhuo, Kang, Liping, Wei, Xiaoming, Wei, Xiaolin, and Jiang, Shuqiang
Subjects: FOOD composition, IMAGE recognition (Computer vision), DATA visualization, DIET therapy
Abstract: Recognizing the category and its ingredient composition from food images facilitates automatic nutrition estimation, which is crucial to various health relevant applications, such as nutrition intake management and healthy diet recommendation. Since food is composed of ingredients, discovering ingredient-relevant visual regions can help identify its corresponding category and ingredients. Furthermore, various ingredient relationships like co-occurrence and exclusion are also critical for this task. For that, we propose an ingredient-oriented multi-task food category-ingredient joint learning framework for simultaneous food recognition and ingredient prediction. This framework mainly involves learning an ingredient dictionary for ingredient-relevant visual region discovery and building an ingredient-based semantic-visual graph for ingredient relationship modeling. To obtain ingredient-relevant visual regions, we build an ingredient dictionary to capture multiple ingredient regions and obtain the corresponding assignment map, and then pool the region features belonging to the same ingredient to identify the ingredients more accurately and meanwhile improve the classification performance. For ingredient-relationship modeling, we utilize the visual ingredient representations as nodes and the semantic similarity between ingredient embeddings as edges to construct an ingredient graph, and then learn their relationships via the graph convolutional network to make label embeddings and visual features interact with each other to improve the performance. Finally, fused features from both ingredient-oriented region features and ingredient-relationship features are used in the following multi-task category-ingredient joint learning. Extensive evaluation on three popular benchmark datasets (ETH Food-101, Vireo Food-172 and ISIA Food-200) demonstrates the effectiveness of our method. 
Further visualization of ingredient assignment maps and attention maps also shows the superiority of our method. [ABSTRACT FROM AUTHOR]
Published: 2022

3. Plant Disease Recognition: A Large-Scale Benchmark Dataset and a Visual Region and Loss Reweighting Approach.

Author: Liu, Xinda, Min, Weiqing, Mei, Shuhuan, Wang, Lili, and Jiang, Shuqiang
Subjects: IMAGE processing, PLANT diseases, IMAGE recognition (Computer vision), AGRICULTURAL productivity, RECOGNITION (Psychology), HEBBIAN memory, FEATURE extraction
Abstract: Plant disease diagnosis is very critical for agriculture due to its importance for increasing crop production. Recent advances in image processing offer us a new way to solve this issue via visual plant disease analysis. However, there is little work in this area, let alone systematic research. In this paper, we systematically investigate the problem of visual plant disease recognition for plant disease diagnosis. Compared with other types of images, plant disease images generally exhibit randomly distributed lesions, diverse symptoms and complex backgrounds, and thus it is hard to capture discriminative information from them. To facilitate the plant disease recognition research, we construct a new large-scale plant disease dataset with 271 plant disease categories and 220,592 images. Based on this dataset, we tackle plant disease recognition via reweighting both visual regions and loss to emphasize diseased parts. We first compute the weights of all the divided patches from each image based on the cluster distribution of these patches to indicate the discriminative level of each patch. Then we allocate the weight to each loss for each patch-label pair during weakly-supervised training to enable discriminative disease part learning. We finally extract patch features from the network trained with loss reweighting, and utilize the LSTM network to encode the weighted patch feature sequence into a comprehensive feature representation. Extensive evaluations on this dataset and another public dataset demonstrate the advantage of the proposed method. We expect this research will further the agenda of plant disease recognition in the community of image processing. [ABSTRACT FROM AUTHOR]
Published: 2021

4. Multi-Scale Multi-View Deep Feature Aggregation for Food Recognition.

Author: Jiang, Shuqiang, Min, Weiqing, Liu, Linhu, and Luo, Zhengdong
Subjects: *COMPUTER vision, *OBJECT recognition (Computer vision), *SPATIAL arrangement, *IMAGE processing, *IMAGE recognition (Computer vision)
Abstract: Recently, food recognition has received more and more attention in image processing and computer vision for its great potential applications in human health. Most of the existing methods directly extracted deep visual features via convolutional neural networks (CNNs) for food recognition. Such methods ignore the characteristics of food images and are, thus, hard to achieve optimal recognition performance. In contrast to general object recognition, food images typically do not exhibit distinctive spatial arrangement and common semantic patterns. In this paper, we propose a multi-scale multi-view feature aggregation (MSMVFA) scheme for food recognition. MSMVFA can aggregate high-level semantic features, mid-level attribute features, and deep visual features into a unified representation. These three types of features describe the food image from different granularity. Therefore, the aggregated features can capture the semantics of food images with the greatest probability. For that solution, we utilize additional ingredient knowledge to obtain mid-level attribute representation via ingredient-supervised CNNs. High-level semantic features and deep visual features are extracted from class-supervised CNNs. Considering food images do not exhibit distinctive spatial layout in many cases, MSMVFA fuses multi-scale CNN activations for each type of features to make aggregated features more discriminative and invariable to geometrical deformation. Finally, the aggregated features are more robust, comprehensive, and discriminative via two-level fusion, namely multi-scale fusion for each type of features and multi-view aggregation for different types of features. In addition, MSMVFA is general and different deep networks can be easily applied into this scheme. Extensive experiments and evaluations demonstrate that our method achieves state-of-the-art recognition performance on three popular large-scale food benchmark datasets in Top-1 recognition accuracy. 
Furthermore, we expect this paper will further the agenda of food recognition in the community of image processing and computer vision. [ABSTRACT FROM AUTHOR]
Published: 2020

5. Class Agnostic Image Common Object Detection.

Author: Jiang, Shuqiang, Liang, Sisi, Chen, Chengpeng, Zhu, Yaohui, and Li, Xiangyang
Subjects: *COMPUTER vision, *IMAGE registration, *IMAGE, *IMAGE recognition (Computer vision)
Abstract: Learning similarity of two images is an important problem in computer vision and has many potential applications. Most of the previous works focus on generating image similarities in three aspects: global feature distance computing, local feature matching, and image concepts comparison. However, the task of directly detecting the class agnostic common objects from two images has not been studied before, which goes one step further to capture image similarities at the region level. In this paper, we propose an end-to-end image Common Object Detection Network (CODN) to detect class agnostic common objects from two images. The proposed method consists of two main modules: locating module and matching module. The locating module generates candidate proposals of each two images. The matching module learns the similarities of the candidate proposal pairs from two images, and refines the bounding boxes of the candidate proposals. The learning procedure of CODN is implemented in an integrated way and a multi-task loss is designed to guarantee both region localization and common object matching. Experiments are conducted on PASCAL VOC 2007 and COCO 2014 datasets. The experimental results validate the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
Published: 2019

6. A survey on context-aware mobile visual recognition.

Author: Min, Weiqing, Jiang, Shuqiang, Wang, Shuhui, Xu, Ruihan, Cao, Yushan, Herranz, Luis, and He, Zhiqiang
Subjects: *MOBILE communication systems, *CELL phones, *SMARTPHONES, *AUGMENTED reality, *IMAGE recognition (Computer vision)
Abstract: The phenomenal growth of the usage of mobile devices (e.g., mobile phones and tablet PCs) opens up a new service, namely mobile visual recognition, which has been widely used in many areas, such as mobile shopping and augmented reality. The rich contextual information (e.g., location, time and direction information), easily acquired by the mobile devices, provides useful clues to facilitate mobile visual recognition, including reducing the recognition time and improving the recognition performance. This survey focuses on recent advances in Context-Aware Mobile Visual Recognition (CAMVR) and reviews related work with respect to different contextual information, recognition methods, recognition types, and various application scenarios. Finally, we discuss future research directions in this field. [ABSTRACT FROM AUTHOR]
Published: 2017

7. Category co-occurrence modeling for large scale scene recognition.

Author: Song, Xinhang, Jiang, Shuqiang, Herranz, Luis, Kong, Yan, and Zheng, Kai
Subjects: *LARGE scale systems, *PATTERN recognition systems, *FEATURE extraction, *SEMANTICS, *IMAGE recognition (Computer vision)
Abstract: Scene recognition involves complex reasoning from low-level local features to high-level scene categories. The large semantic gap leads most methods to model scenes using mid-level representations (e.g. objects, topics). However, this implies an additional mid-level vocabulary and has implications in training and inference. In contrast, the semantic multinomial (SMN) represents patches directly in the scene-level semantic space, which leads to ambiguity when aggregated to a global image representation. Fortunately, this ambiguity appears in the form of scene category co-occurrences which can be modeled a posteriori with a classifier. In this paper we observe that these patterns are essentially local rather than global, sparse, and consistent across SMNs obtained from multiple visual features. We propose a co-occurrence modeling framework where we exploit all these patterns jointly in a common semantic space, combining both supervised and unsupervised learning. Based on this framework we can integrate multiple features and design embeddings for large scale recognition directly in the scene-level space. Finally, we use the co-occurrence modeling framework to develop new scene representations, which experiments show outperform previous SMN-based representations. [ABSTRACT FROM AUTHOR]
Published: 2016

8. Guest editorial: mobile visual tagging with mobile context.

Author: Jiang, Shuqiang, Cao, Liangliang, Sang, Jitao, Luo, Jiebo, and Jain, Ramesh
Subjects: *TAGS (Metadata), *MOBILE communication systems, *IMAGE recognition (Computer vision)
Abstract: An introduction to this special issue of the journal is presented in which the editors reflect on the changes in communication capabilities brought by mobile devices, mobile visual tagging of images and videos, and image recognition techniques.
Published: 2017
