21,415 results for "rgb color model"
Search Results
102. USING ALTERNATIVE REPRESENTATIONS OF THE RGB COLOR MODEL TO SEPARATE ERGOT SCLEROTIA (CLAVICEPS PURPUREA) FROM THE IMPORTED WHEAT.
- Author
-
FOUDA, Tarek and ALBEBANY, Amina
- Subjects
- *
RGB color model , *SCLEROTIUM (Mycelium) , *WHEAT , *SPASMS , *ARITHMETIC mean - Abstract
Sclerotium of ergot is a hard fungal body which contains alkaloids toxic to humans and livestock. These alkaloids include ergotamine, ergovaline, ergocornine, ergocryptine, and ergocristine. Consumption of ergot can lead to harmful health effects in humans, including constriction of blood flow to the extremities, gangrene, hallucinations, muscle spasms and vomiting. Livestock fed ergot sclerotia can develop gangrene symptoms of the ears, hooves and tails. The alkaloids can also cause abortions and reduce mammary gland development. The study was carried out during 2019 at the Agricultural Engineering Department, Faculty of Agriculture, Tanta University, Egypt. To prevent these harmful health effects in humans and livestock, separation methods depending on physical properties and image analysis software were applied to different varieties of wheat imported to Egypt that were contaminated with ergot sclerotia. The average dimensions of the ergot sclerotia varied with place of origin: length ranged from 2.32 to 22.51 mm, width from 0.12 to 2.91 mm, thickness from 0.12 to 2.21 mm, volume from 0.0511 to 65.133 mm³, geometric mean diameter from 0.46 to 4.98 mm, arithmetic mean diameter from 1.7 to 9.10 mm, sphericity from 7.9 to 55.6%, and surface area from 0.66 to 78.02 mm². On the other hand, the average dimensions of the Russian, Ukrainian and French wheat varieties were: length 5.24, 5.24, 5.17 mm; width 1.92, 1.92, 2.35 mm; thickness 1.62, 1.61, 1.86 mm; volume 8.81, 8.75, 12.1 mm³; geometric mean diameter 2.52, 2.52, 2.82 mm; arithmetic mean diameter 2.92, 2.92, 3.13 mm; sphericity 48.2, 48.1, 57.8%; aspect ratio 36.7, 36.7, 45.9%; and surface area 20.3, 20.2, 25.2 mm², respectively. These results reveal that the differences between the physical properties of the wheat varieties and the ergot sclerotia are not strong, especially for length, width and thickness, which obstructs the separation process. [ABSTRACT FROM AUTHOR]
- Published
- 2021
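The size descriptors quoted in entry 102 (geometric and arithmetic mean diameter, sphericity, surface area, aspect ratio) follow the definitions standard in agricultural engineering, and the reported wheat values are consistent with them. A minimal sketch of those formulas; the function and variable names are ours, not the authors':

```python
import math

def grain_descriptors(L, W, T):
    """Standard size descriptors for a grain or sclerotium with
    length L, width W and thickness T (all in mm)."""
    gmd = (L * W * T) ** (1 / 3)        # geometric mean diameter, mm
    amd = (L + W + T) / 3               # arithmetic mean diameter, mm
    sphericity = 100 * gmd / L          # %, relative to the longest axis
    surface_area = math.pi * gmd ** 2   # mm^2, sphere of diameter gmd
    aspect_ratio = 100 * W / L          # %
    return gmd, amd, sphericity, surface_area, aspect_ratio

# Average Russian wheat dimensions from the abstract:
# gives roughly (2.54, 2.93, 48.4, 20.2, 36.6)
print(grain_descriptors(5.24, 1.92, 1.62))
```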
103. Development of Colorimetric Sensor Array for Instant Determination of Sodium Metabisulfite in Dried Longan.
- Author
-
Krongchai, Chanida, Jakmunee, Jaroon, and Kittiwachana, Sila
- Abstract
Sodium metabisulfite has been used as a pretreatment chemical to prevent gradual discoloration and to slow down the spoilage process of dried longan. The excessive use of this sulfiting agent can result in allergic reactions, particularly for sulfite-sensitive people. In this research study, a colorimetric strip was fabricated to determine the concentration of sodium metabisulfite in dried longan. The detection was based on reactions of a series of colorimetric reagents, including four different Fuchsine compounds and Ellman's reagent. The maximum change of color was observed 3 min after the reactions. To perform a quantitative analysis, the image of the strip was captured using a commercial scanner and converted into an RGB color model to be digitally analyzed by chemometrics. Using partial least squares (PLS) regression, the calibration range (5.00–250.00 mg/kg) provided R² and Q² values of 0.994 and 0.908, respectively. The limits of detection and quantitation were 5.48 and 16.6 mg/kg, which were sensible for the detection of the sulfiting agent in dried longan samples. The developed colorimetric strip was applied to determine the concentration of metabisulfite in dried longan fruit purchased from local markets in Chiang Mai, Thailand. The results agreed well with the findings obtained from the standard reference method. The developed strip could be an alternative solution for rapid, simple, and low-cost detection of the sulfiting agent in dried agricultural products, where test results could be instantly obtained without the requirement of sophisticated analytical instruments. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
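The quantitative step in entry 103 regresses scanned-strip RGB values against known metabisulfite concentrations with partial least squares. A minimal sketch using scikit-learn as a stand-in for the authors' chemometrics software; the calibration readings below are hypothetical:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Hypothetical calibration set: mean RGB of each strip vs. known
# sodium metabisulfite concentration (mg/kg).
X = np.array([[180, 150, 140], [170, 135, 130], [155, 118, 120],
              [140, 100, 110], [120,  80,  95], [100,  60,  80]], float)
y = np.array([5.0, 25.0, 50.0, 100.0, 175.0, 250.0])

pls = PLSRegression(n_components=2)
pls.fit(X, y)

# Predict the concentration for a new strip's RGB reading
print(pls.predict(np.array([[150, 110, 115]], dtype=float)))
```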
104. Correlation between Color of Subsurface Soil Horizons and Ground-Penetrating Radar Data.
- Author
-
Voronin, A. Ya. and Savin, I. Yu.
- Subjects
- *
SOIL horizons , *SOIL color , *CLUSTER analysis (Statistics) , *ELECTROMAGNETIC pulses , *SOIL profiles , *GROUND penetrating radar , *CHERNOZEM soils , *COLORIMETRY - Abstract
The aim of the research was to analyze the relationship between color indicators of soil layers and the parameters obtained by ground-penetrating radar (GPR) profiling. These parameters were the data of the spectral Fourier transform of the amplitude of the reflected pulse in the subsurface horizons of the soils of the Kamennaya Steppe area. GPR profiling was carried out with a Loza-V geophysical device. The diffraction points of reflected pulse amplitude corresponding to local soil horizons and the groundwater surface were determined. The variability in the thickness of horizons and layers was calculated through the modules of amplitude maxima, taking into account the conductivity of the horizon (dielectric permittivity and the rate of signal passage in a given medium). The differentiation of sounding points was verified by comparing them with real soil profiles and patterns. Statistical processing included the calculation of mean values, standard deviations, and frequencies of signal amplitude distribution; the principal component method; and cluster analysis. Two polynomial models were constructed relating changes in the parameters of the spectral Fourier transform of electromagnetic pulses in the subsurface medium of chernozems to the attributive parameters of the formation of spectrally pure colors of the dominant wavelength (λi). The indicative parameters included the conditional coefficient of pure color (Kλi), which corresponds to the color-matching functions of the CIE XYZ 1931 tristimulus colorimetric system, and its saturation, which is completely determined by the properties of the reflection coefficient (R) in the spectral range of 580–720 nm (red band). The soil chromaticity parameters calculated from these models provided the basis for constructing the RGB model of soil layers and its visualization in graphic editors. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
105. Single image dehazing by dark channel prior and luminance adjustment.
- Author
-
Rafid Hashim, Ahmed, Daway, Hazim G., and Kareem, Hana H.
- Subjects
- *
IMAGE enhancement (Imaging systems) , *RGB color model , *IMAGE intensifiers - Abstract
This paper presents a new and simple algorithm to eliminate haze from a single image using the YIQ colour space and the Dark Channel Prior (DCP). The suggested method consists of two parts. The first is enhancement by adjusting contrast and lightness using contrast-limited adaptive histogram equalization and a gamma transform. The second part involves applying the DCP to the hazy image in the RGB colour model to obtain a haze-free image. The resulting image is then converted into YIQ, and the chrominance components (I and Q) are extracted and combined with the enhanced luminance component (Y) to obtain an enhanced haze-free image. The suggested method is applied to many outdoor images, and the quality of the resulting images is compared with that of images from other methods using several quality measures. Analysis of the results shows that the suggested method achieves higher quality values than the other methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
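The recombination step of entry 105 keeps the chrominance of the DCP-dehazed image and swaps in the luminance of the separately enhanced image. A generic sketch with the standard NTSC RGB/YIQ matrices, not the authors' code:

```python
import numpy as np

RGB2YIQ = np.array([[0.299,  0.587,  0.114],
                    [0.596, -0.274, -0.322],
                    [0.211, -0.523,  0.312]])
YIQ2RGB = np.linalg.inv(RGB2YIQ)

def rgb_to_yiq(img):   # img: H x W x 3 floats in [0, 1]
    return img @ RGB2YIQ.T

def yiq_to_rgb(yiq):
    return np.clip(yiq @ YIQ2RGB.T, 0.0, 1.0)

def recombine(dehazed_rgb, enhanced_rgb):
    """Keep the I/Q chrominance of the DCP-dehazed image; replace its
    luminance with the Y channel of the contrast-enhanced image."""
    yiq = rgb_to_yiq(dehazed_rgb)
    yiq[..., 0] = rgb_to_yiq(enhanced_rgb)[..., 0]
    return yiq_to_rgb(yiq)
```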
106. GOOD TASTE: 20 alluring F+B concepts from around the world.
- Subjects
INTERIOR decoration ,ARCHITECTURAL details ,RGB color model - Abstract
The article discusses the architectural design of several restaurants including Akaari Premium in Hanoi, Vietnam designed by NH Village Architects, the street bar Akachochin in New Delhi, India designed by Studio Dangg, and Sushi Senju in Tokyo, Japan designed by Kubo Tsushima Architects.
- Published
- 2023
107. CBIR algorithm development using RGB histogram-based block contour method to improve the retrieval performance
- Author
-
Savita Sonoli and Manasa K Chigateri
- Subjects
Computer science ,Carry (arithmetic) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,General Medicine ,HSL and HSV ,Image (mathematics) ,Histogram ,RGB color model ,MATLAB ,Precision and recall ,Algorithm ,Block (data storage)
In this research paper, the development of a CBIR algorithm using an RGB histogram-based block contour method to improve retrieval performance is presented, along with simulation results. CBIR automatically extracts the visual content of an image, such as its colour, texture, size, direction, distance, nearness or shape; here, HSV colour, RGB histogram and block contour methods are proposed. The work presented in this paper is a new methodology for picture retrieval that combines different parameters of the query image with the help of RGB histograms and block contours. Simulations are carried out in the Matlab environment and the simulation results are observed. The CBIR-based system is tested using different types of test images, and the nearest test outcomes are tabulated. Using different block sizes, the number of characteristics and the time required to extract those parameters are determined, and parameters such as recall and precision are calculated and compared with existing systems. Precision and recall are used to evaluate the performance of the designed system. The reported precision and recall values of the developed and existing CBIR systems show the efficacy of the proposed methodology and its superiority: the simulation results give improved retrieval performance over other systems, demonstrating the effectiveness of the research work done.
- Published
- 2023
- Full Text
- View/download PDF
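A minimal sketch of the block-wise RGB histogram features that entry 107 builds on (the block contour refinement itself is not reproduced); block and bin counts are illustrative:

```python
import numpy as np

def block_rgb_histogram(img, blocks=4, bins=8):
    """Divide an H x W x 3 uint8 image into blocks x blocks tiles and
    concatenate normalized per-channel histograms into one feature vector."""
    h, w, _ = img.shape
    feats = []
    for by in range(blocks):
        for bx in range(blocks):
            tile = img[by*h//blocks:(by+1)*h//blocks,
                       bx*w//blocks:(bx+1)*w//blocks]
            for c in range(3):
                hist, _ = np.histogram(tile[..., c], bins=bins, range=(0, 256))
                feats.append(hist / max(hist.sum(), 1))
    return np.concatenate(feats)

def retrieve(query, database, top_k=5):
    """Rank database images by Euclidean distance of histogram features."""
    q = block_rgb_histogram(query)
    dists = [np.linalg.norm(q - block_rgb_histogram(im)) for im in database]
    return np.argsort(dists)[:top_k]
```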
108. S³Net: Self-Supervised Self-Ensembling Network for Semi-Supervised RGB-D Salient Object Detection
- Author
-
Xin Yang, C.L. Philip Chen, Carola-Bibiane Schönlieb, Ping Li, Xiaoqiang Wang, Lei Zhu, Zhang Qing, and Weiming Wang
- Subjects
Computer science ,business.industry ,Detector ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Convolutional neural network ,Computer Science Applications ,Image (mathematics) ,Task (project management) ,Signal Processing ,Media Technology ,Benchmark (computing) ,RGB color model ,Leverage (statistics) ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Rotation (mathematics) - Abstract
RGB-D salient object detection aims to detect visually distinctive objects or regions from a pair of an RGB image and a depth image. State-of-the-art RGB-D saliency detectors are mainly based on convolutional neural networks, but almost all suffer from an intrinsic limitation of relying on labeled data, thus degrading detection accuracy in complex cases. In this work, we present a self-supervised self-ensembling network (S³Net) for semi-supervised RGB-D salient object detection by leveraging the unlabeled data and exploring a self-supervised learning mechanism. To be specific, we first build a self-guided convolutional neural network (SG-CNN) as a baseline model by developing a series of three-layer cross-model feature fusion (TCF) modules to leverage complementary information among depth and RGB modalities, and by formulating an auxiliary task that predicts a self-supervised image rotation angle. After that, to further explore the knowledge from unlabeled data, we assign SG-CNN to a student network and a teacher network, and encourage the saliency predictions and self-supervised rotation predictions from these two networks to be consistent on the unlabeled data. Experimental results on seven widely-used benchmark datasets demonstrate that our network quantitatively and qualitatively outperforms the state-of-the-art methods.
- Published
- 2023
- Full Text
- View/download PDF
109. Dome-shaped mode lasing from liquid crystals for full-color lasers and high-sensitivity detection
- Author
-
Rui Duan, Zitong Zhang, Lian Xiao, Tianhua Ren, Xuehong Zhou, Yi Tian Thung, Van Duong Ta, Jun Yang, Handong Sun, School of Physical and Mathematical Sciences, Guangdong University of Technology, and Le Quy Don Technical University
- Subjects
Color Lasers ,Materials Chemistry ,Metals and Alloys ,Ceramics and Composites ,General Chemistry ,Physics::Optics and light [Science] ,RGB Color Model ,Catalysis ,Materials::Photonics and optoelectronics materials [Engineering] ,Surfaces, Coatings and Films ,Electronic, Optical and Magnetic Materials - Abstract
In this communication, we report a new class of oscillation mode, the dome-shaped mode (DSM), in liquid crystal (LC) microlasers. A record-high Q-factor over 24 000 is achieved in LC soft-matter microlasers. We present a proof-of-concept demonstration of red, green, blue (RGB) LC-DSM microlaser pixels with a 74% broader achievable color gamut than the standard RGB color space. Besides, the detection limit for acetone vapor molecules is as low as 0.5 ppm, confirming the excellent potential of the proposed LC-DSM microlaser in ultra-high-sensitivity detection. This work is supported by A*Star-AME-IRG-A20E5c0083 and NRF-CRP23-2019-0007. V. D. T. acknowledges the support from the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 103.03-2021.62. J. Y. also acknowledges the support from the National Science Fund for Distinguished Young Scholars of China (61925501), the Guangdong Introducing Innovative and Entrepreneurial Teams (2019ZT08X340) and Introducing Leading Talents (2019CX01X010) of "The Pearl River Talent Recruitment Program" of Guangdong Province.
- Published
- 2023
- Full Text
- View/download PDF
110. Eco-friendly screening method for sulfonamides using a 3D handheld smartphone-based fluorescence detection device and graphene nanoplatelet-packed pipette tip microextraction.
- Author
-
Barzallo, Diego, Ferrer, Laura, and Palacio, Edwin
- Subjects
RGB color model ,SMARTPHONES ,SULFONAMIDES ,WATER supply management ,DIGITAL image processing ,OCHRATOXINS - Abstract
A sensitive, miniaturized, and low-cost method combining pipette-tip solid-phase microextraction and smartphone-based fluorescence detection has been developed for the determination of total sulfonamides in water samples. Sulfonamide antibiotics (SAs) are contaminants commonly found in water matrices, leading to antibiotic-resistant bacteria and risks to human health and the environment. Thus, their real-time monitoring is essential to the risk assessment, and the subsequent management, of any water supply. Sample preparation consisted of extraction/preconcentration of SAs using graphene nanoplatelets packed inside a pipette tip, followed by fluorescent derivatization using fluorescamine inside the microplate reader, both 3D printed. Subsequently, a 3D-printed detection platform that houses monochromatic LED strips as the radiation source and a smartphone as the detector has been used for the determination of total SAs. Digital image processing was based on the RGB colour model using ImageJ software with its ReadPlate plugin, and the green intensity channel was used as the analytical signal due to its higher sensitivity. Several factors that affect the extraction efficiency and detection have been optimized. Under the optimized conditions, good linearity for the SAs studied was obtained in a range of 10–60 µg L⁻¹ with r ≥ 0.990 and limits of detection between 2.5 and 3.1 µg L⁻¹ for a sample volume of 10 mL. The recoveries of sulfamethoxazole (as a model compound to express total SAs) spiked in diverse water matrices were tested at two different levels and showed good recoveries from 94% to 102% with RSD ≤ 7.6%. The results obtained with the proposed method compared satisfactorily with those obtained with a conventional spectrofluorometer (P ≥ 0.13). This easy-to-operate system features a simple extraction procedure (up to 20-fold enrichment) and excellent sensitivity and precision, which is very useful and practical for on-site analysis. Furthermore, three greenness evaluation methodologies (GAPI, AGREE, and AGREEprep) were used to assess the environmental friendliness of the proposed method, demonstrating its superior performance compared to previously published HPLC and spectrophotometric methods. • A handheld screening platform is proposed for the in-situ extraction and detection of total sulfonamides. • Fast and semi-automated processing of digital images. • Environmentally sustainable screening method. • Good recoveries are obtained in water samples. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
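The readout step of entry 110 reduces to averaging the green channel over each well's region of interest and fitting a calibration line. A minimal sketch with hypothetical file names and ROI coordinates, not the authors' ImageJ/ReadPlate macro:

```python
import numpy as np
from PIL import Image

def green_intensity(path, roi):
    """Mean green-channel intensity inside roi = (left, top, right, bottom)."""
    img = np.asarray(Image.open(path).convert("RGB"), dtype=float)
    left, top, right, bottom = roi
    return img[top:bottom, left:right, 1].mean()  # channel index 1 = G

# Hypothetical linear calibration: green signal vs. SA concentration (ug/L)
concs = np.array([10, 20, 30, 40, 50, 60], float)
signals = np.array([green_intensity(f"std_{c:.0f}.png", (100, 100, 200, 200))
                    for c in concs])
slope, intercept = np.polyfit(concs, signals, 1)
```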
111. First-Person Hand Action Recognition Using Multimodal Data
- Author
-
Hongyu Wang, Hongye Xie, Na Cheng, Zhenyu Liu, and Rui Li
- Subjects
Modality (human–computer interaction) ,Computer science ,business.industry ,Pattern recognition ,Recurrent neural network ,Action (philosophy) ,Discriminative model ,Artificial Intelligence ,Simple (abstract algebra) ,RGB color model ,Graph (abstract data type) ,Artificial intelligence ,Representation (mathematics) ,business ,Software - Abstract
Extensive studies have been conducted on human action recognition, whereas relatively few methods have been proposed for hand action recognition. Although it is very natural and straightforward to apply a human action recognition method to hand action recognition, this approach cannot always lead to state-of-the-art performance. One of the important reasons is that both the between-class difference and the within-class difference in hand actions are much smaller than those in human actions. In this paper, we study first-person hand action recognition from RGB-D sequences. To explore whether pretrained networks substantially influence accuracy, 8 classic pretrained networks and one pretrained network designed by us are introduced for extracting RGB-D features. A Lie group is introduced for hand pose representation. Ablation studies are conducted to compare the discriminative power of the RGB modality, depth modality, pose modality, and their combinations. In our method, a fixed number of frames are randomly sampled to represent an action. This temporal modeling strategy is simple but is proven more effective than both the graph convolutional network (GCN) and the recurrent neural network (RNN), which are widely adopted by conventional methods. Evaluation experiments on two public datasets demonstrate that our method markedly outperforms recent baselines.
- Published
- 2022
- Full Text
- View/download PDF
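The temporal modeling strategy of entry 111, representing an action by a fixed number of randomly sampled frames, is easy to reproduce; a minimal sketch, with the frame count as an assumed parameter:

```python
import numpy as np

def sample_frames(sequence_length, num_frames=16, training=True):
    """Pick num_frames indices from a sequence, kept in temporal order.
    Random during training, evenly spaced at test time; short clips
    are padded by repeating evenly spaced indices."""
    if sequence_length <= num_frames or not training:
        return np.linspace(0, sequence_length - 1, num_frames).astype(int)
    idx = np.random.choice(sequence_length, num_frames, replace=False)
    return np.sort(idx)
```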
112. A Novel Color Recognition Model for Improvement on Color Differences in Products via Grey Relational Grade
- Author
-
Jeih-Jang Liou
- Subjects
LED light ,color cast ,mechanic visual perception ,grey relational grade ,RGB color model ,Mathematics ,QA1-939 - Abstract
LED light, a green energy-saving light source, can cause color cast. For this reason, LED light is seldom favored by designers. The purpose of the paper is to provide shoppers who are observing product colors in an LED-lighted setting with an innovative color identification model. Based on designers' product color comparison, the paper employs high-reliability mechanic visual perception in combination with the grey relational grade. The grey relational grade is applied to eliminate electrical fault pertaining to mechanic visual perception, whereby appropriate LED parameters and color cast inclination can be obtained. The paper first mimics retail store display windows. The color temperature and illuminance of the LED light sources are adjustable. Two degrees of illuminance, high (1500 lux) and low (500 lux), and two light source color temperatures, yellow light (2700 K) and white light (4000 K), were assigned for study. Four colors of the natural color system, red, yellow, blue and green, were selected as product colors. The mechanic visual perception sensor was used to identify the object (product) color, which was then converted into an RGB color model to serve as research data for color cast measurement, and the grey relational grade was applied to obtain the most appropriate LED light parameters and the color cast of the four colors. The data analysis reveals that green shows the least color cast when lighted by a yellow LED light source with low illuminance, yellow and blue have the least color cast when lighted by a white LED light source with high illuminance, and red displays the least color cast when lighted by a white LED light source with low illuminance. The analysis also indicates each color's cast inclination in blackness, chromaticness and hue. As a result, LED light that is more acceptable to designers is suggested for display windows, thus reducing problems with product color cast.
- Published
- 2021
- Full Text
- View/download PDF
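The grey relational grade used in entry 112 can be computed with Deng's standard formulation; a minimal sketch assuming min-max normalized RGB readings and the usual distinguishing coefficient ζ = 0.5:

```python
import numpy as np

def grey_relational_grades(reference, comparisons, zeta=0.5):
    """Deng's grey relational grades between one reference series (e.g. the
    target RGB reading) and several comparison series, all in [0, 1].
    Delta-min/max are taken over all comparisons, as is conventional."""
    ref = np.asarray(reference, float)
    comp = np.asarray(comparisons, float)          # shape (n_series, n_values)
    delta = np.abs(comp - ref)
    dmin, dmax = delta.min(), delta.max()
    coeff = (dmin + zeta * dmax) / (delta + zeta * dmax + 1e-12)
    return coeff.mean(axis=1)                      # one grade per series

# Hypothetical normalized RGB readings under two LED settings
target = [0.80, 0.20, 0.15]
readings = [[0.78, 0.24, 0.18],    # small color cast
            [0.60, 0.35, 0.30]]    # large color cast
print(grey_relational_grades(target, readings))    # first grade is higher
```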
113. SP-SLAM: Surfel-Point Simultaneous Localization and Mapping
- Author
-
Hae Min Cho, Euntai Kim, and HyungGi Jo
- Subjects
Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Bundle adjustment ,Simultaneous localization and mapping ,Computer Science Applications ,Control and Systems Engineering ,Feature (computer vision) ,Surfel ,Benchmark (computing) ,RGB color model ,Computer vision ,Point (geometry) ,Central processing unit ,Artificial intelligence ,Electrical and Electronic Engineering ,business - Abstract
In this article, a novel method for simultaneous localization and mapping (SLAM), named surfel-point SLAM (SP-SLAM), is proposed using an RGB-D camera. The key idea of SP-SLAM is to use not only keypoints but also surface elements (surfels) as features to cope with both high-texture and low-texture environments. By decomposing a surface into a small number of surfels, the method can represent spacious environments using a relatively small amount of memory. To optimize the poses of points, surfels, and cameras altogether, new objective functions are proposed, and a new bundle adjustment using these objective functions is developed. The proposed SP-SLAM runs in real time on a central processing unit, as in other feature-based visual SLAM methods, but works better than them not only in high-texture but also in low-texture environments, overcoming the well-known degradation of feature-based visual SLAM in low-texture environments. The proposed method is applied to benchmark datasets and its effectiveness is demonstrated by comparison against previous methods in terms of localization accuracy.
- Published
- 2022
- Full Text
- View/download PDF
114. A leaf image localization based algorithm for different crops disease classification
- Author
-
Suchi Gangwar and Yashwant Kurmi
- Subjects
Contextual image classification ,Forestry ,Aquatic Science ,RANSAC ,Color space ,Computer Science Applications ,Support vector machine ,Region growing ,Feature (computer vision) ,Bag-of-words model in computer vision ,RGB color model ,Animal Science and Zoology ,Agronomy and Crop Science ,Algorithm ,Mathematics
Agricultural crop production is a major contributing element to any country's economy. To maintain the economic growth of any country, plant disease detection is a leading factor in agriculture. The contribution of the proposed algorithm is to optimize the information extracted from the available resources for the betterment of the result without any additional complexity. The proposed technique localizes the leaf region prior to classifying the image as healthy or diseased. The novelty of this work is to fuse the information extracted from the available resources and optimize it to enhance the expected outcome. The leaf colors are analyzed using a color transformation for seed-region identification. The mapping of a low-dimensional RGB color image into L*a*b color space provides an expansion of the spectral range. Neighboring-pixel-based leaf region growing is applied to the initial seeds. In order to refine the leaf boundary and the disease-affected areas, we employ random sample consensus (RANSAC) for suitable curve fitting. Feature sets using bag of visual words, Fisher vectors, and handcrafted features are extracted, followed by classification using logistic regression, a multilayer perceptron model, and a support vector machine. The performance of the proposal is analyzed on PlantVillage datasets of apple, bell pepper, cherry, corn, grape, potato, and tomato. The simulation-based analysis of the proposed contextualization-based image categorization process outperforms the state of the art. The proposed approach provides an average accuracy and area under the curve of 0.932 and 0.903, respectively.
- Published
- 2022
- Full Text
- View/download PDF
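The color-space expansion step of entry 114 maps RGB into L*a*b* before seed selection; a minimal sketch using scikit-image as a stand-in for the paper's own transformation, with assumed threshold values:

```python
import numpy as np
from skimage import io, color

img = io.imread("leaf.jpg")      # H x W x 3 uint8 RGB (hypothetical file)
lab = color.rgb2lab(img)         # L* in [0,100]; a*, b* roughly [-128,127]

# Simple green-seed mask: negative a* marks green-dominant pixels, a
# plausible starting point for leaf-region growing (thresholds assumed).
seed_mask = (lab[..., 1] < -10) & (lab[..., 0] > 20)
print(seed_mask.mean())          # fraction of candidate leaf pixels
```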
115. An Efficient Deep Learning Accelerator Architecture for Compressed Video Analysis
- Author
-
Yongchen Wang, Xiaowei Li, Huawei Li, and Ying Wang
- Subjects
Speedup ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Data_CODINGANDINFORMATIONTHEORY ,Video processing ,Computer Graphics and Computer-Aided Design ,Motion vector ,RGB color model ,Overhead (computing) ,Codec ,Computer vision ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Software ,Decoding methods ,Reference frame - Abstract
Previous neural network accelerators tailored to video analysis only accept data in the RGB/YUV domain, requiring decompression of videos that are often compressed before being transmitted from the edge sensors. A compressed-video processing accelerator can alleviate the decoding overhead and gain a performance speedup by operating on more compact input data. This work proposes a novel deep learning accelerator architecture, Alchemist, which is able to predict results directly from the compressed video bitstream instead of reconstructing the full RGB images. By utilizing the metadata of motion vectors and critical blocks extracted from bitstreams, Alchemist achieves a remarkable 5x performance speedup with negligible accuracy loss. Nevertheless, we still find that original compressed video coded by standard algorithms such as H.264 is not suitable for direct manipulation, due to diverse compressed structures. Although obviating the requirement to recover all RGB frames, the accelerator must parse the entire compressed video bitstream to locate reference frames and extract useful metadata. If we combine the video codec with the proposed compressed video analysis, additional optimizations can be obtained. Therefore, to cope with the mismatch between current video coding algorithms such as H.264 and neural network-based video analysis, we propose a specialized coding strategy to generate compressed video bitstreams more suitable for transmission and analysis, which further simplifies the decoding stage of video analysis and achieves significant storage reduction.
- Published
- 2022
- Full Text
- View/download PDF
116. A General Method of Realistic Avatar Modeling and Driving for Head-Mounted Display Users
- Author
-
Ting Lu, Xiangmin Xu, Zhengfu Peng, Xiaofen Xing, and Jianxin Pang
- Subjects
UV mapping ,Facial expression ,Landmark ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Optical head-mounted display ,Virtual reality ,Artificial Intelligence ,Face (geometry) ,RGB color model ,Computer vision ,Artificial intelligence ,business ,Software ,ComputingMethodologies_COMPUTERGRAPHICS ,Avatar - Abstract
Head-mounted displays (HMDs) provide immersive experiences in virtual reality (VR). However, face interactions are limited due to the serious occlusion of the user's face. Existing approaches try to recover the user's facial expression by adding additional sensors to the HMD. In this paper, we develop a novel framework to reconstruct the user's 3D face in VR using only an RGB camera. Given a reference face, a realistic fully textured avatar is created by fitting a 3D Morphable Model (3DMM). A self-supervised UV-map Generative Adversarial Network (GAN) is proposed to make the facial texture look more realistic. Next, we propose a novel landmark detection method to locate the landmark positions under HMD occlusion, since facial landmarks are commonly used for driving the avatar. To this end, we synthesize a face dataset with HMDs. Our method is easy to build and popularize at low cost. The experiments on synthetic and real HMD data demonstrate that the proposed method can detect landmarks accurately and restore facial expressions faithfully despite the large occlusion of the HMD.
- Published
- 2022
- Full Text
- View/download PDF
117. Color intensity variations and art prices: An examination of Latin American art
- Author
-
Urbi Garay, Eduardo Pérez, and Fredy Pulga
- Subjects
Art market ,Marketing ,History ,Painting ,Latin Americans ,Polymers and Plastics ,Fetch ,Color intensity ,Industrial and Manufacturing Engineering ,Visual arts ,Latin American art ,RGB color model ,Business and International Management ,Consumer behaviour - Abstract
Most existing literature has ignored the potential effects that color intensity may have on art prices (barring a few recent exceptions). We examine 1,627 paintings executed by the “Big five” Latin American artists (Rivera, Tamayo, Lam, Matta, and Botero) and sold at Sotheby's and Christie's between 2003 and 2017 to analyze this impact. We find strong evidence that paintings that are more intense in color fetch higher prices, but only up to a certain degree (paintings whose color is “too intense”, “too vivid” or “too dark” actually fetch lower prices). To the best of our knowledge, these results are the first to confirm, for the case of the art market, early experimental evidence in the psychology literature pointing to the existence of an inverse “U” pattern in preferences for color intensity. Our findings have implications for other areas such as psychology and consumer behavior.
- Published
- 2022
- Full Text
- View/download PDF
118. Bidirectional Posture-Appearance Interaction Network for Driver Behavior Recognition
- Author
-
Mingkui Tan, Xu Liu, Gengqin Ni, Xiangmiao Wu, Runhao Zeng, Shiliang Zhang, and Yaowei Wang
- Subjects
Source code ,Exploit ,business.industry ,Computer science ,Mechanical Engineering ,media_common.quotation_subject ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Convolutional neural network ,Computer Science Applications ,Discriminative model ,Margin (machine learning) ,Interaction network ,Automotive Engineering ,RGB color model ,Graph (abstract data type) ,Artificial intelligence ,business ,media_common - Abstract
Driver behavior recognition has become one of the most important tasks for intelligent vehicles. This task, however, is very challenging since the background contents in real-world driving scenarios are often very complex. More critically, the difference between driving behaviors is often very minor, making it extremely difficult to distinguish them. Existing methods often rely only on RGB frames (or skeleton data), which may fail to capture the minor differences between behaviors and appearance information of objects simultaneously and thus fail to achieve promising performance. To address the above issues, in this paper, we propose a bidirectional posture-appearance interaction network (BPAI-Net), which simultaneously considers RGB frames and skeleton (i.e., posture) data for driver behavior recognition. Specifically, we propose a posture-guided convolutional neural network (PG-CNN) and an appearance-guided graph convolutional network (AG-GCN) to extract appearance and posture features, respectively. To exploit the complementary information between appearance and posture, we use the appearance features from PG-CNN for guiding AG-GCN to exploit the contextual information (e.g., nearby objects) to enhance posture features. Then, we use the enhanced posture features from AG-GCN to help PG-CNN focus on critical local areas of video frames that are related to driver behaviors. In this sense, we are able to use the interaction between two modalities to extract more discriminative features and thus improve the recognition accuracy. Experimental results on the Drive&Act dataset show that our method outperforms state-of-the-art methods by a large margin (67.83% vs. 63.64%). Furthermore, we collect a bus driver behavior recognition dataset and yield consistent performance gains against baseline methods, demonstrating the effectiveness of our method in real-world applications. The source code and trained models are available at github.com/SCUT-AILab/BPAI-Net/.
- Published
- 2022
- Full Text
- View/download PDF
119. Capitalizing on RGB-FIR Hybrid Imaging for Road Detection
- Author
-
Jian Yang, Yigong Zhang, Hui Kong, Jose M. Alvarez, Cheng-Zhong Xu, and Jin Xie
- Subjects
Computer science ,business.industry ,Mechanical Engineering ,Automotive Engineering ,RGB color model ,Computer vision ,Artificial intelligence ,business ,Computer Science Applications
- Published
- 2022
- Full Text
- View/download PDF
120. P‐12.6: Research on Color Conversion Model of Multi‐Primary‐Color Display.
- Author
-
Sun, Yan, Xi, Yanhui, Zhang, Xiaomang, Shi, Tiankuo, Hou, Yifan, Ji, Zhihua, Chu, Minglei, Sun, Wei, Chen, Ming, and Dong, Xue
- Subjects
RGB color model ,PROBLEM solving ,COLOR display systems - Abstract
The tristimulus values (X, Y, Z) of a color need to be faithfully matched when a specified color is to be reproduced on a display. In the RGB color system, there is only one solution, but in a multi-primary-color system there are numerous combinations due to color redundancy. In this paper, a 5.5-inch 6-primary-color module was made with FHD resolution (1920 x 1080), and a color conversion model was established to solve the problem of color reproduction redundancy. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
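The color redundancy described in entry 120 falls out of the linear matching equations: with three primaries the 3×3 system XYZ = M·c has a unique solution, while with six primaries it is underdetermined. A sketch with an illustrative primary matrix (not the module's measured one), picking the minimum-norm member of the solution family:

```python
import numpy as np

# Columns = XYZ tristimulus values of each primary (illustrative numbers).
M_rgb = np.array([[0.41, 0.36, 0.18],
                  [0.21, 0.72, 0.07],
                  [0.02, 0.12, 0.95]])
M_6p = np.hstack([M_rgb, 0.5 * M_rgb])   # toy 6-primary set for illustration

target_xyz = np.array([0.4, 0.5, 0.3])

c_rgb = np.linalg.solve(M_rgb, target_xyz)                # unique RGB solution
c_6p, *_ = np.linalg.lstsq(M_6p, target_xyz, rcond=None)  # one of infinitely many

print(c_rgb)        # the only RGB drive values matching the target
print(c_6p)         # minimum-norm 6-primary drive values
print(M_6p @ c_6p)  # still reproduces target_xyz exactly
```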
121. Comparison and Analysis of Several Quantitative Identification Models of Pesticide Residues Based on Quick Detection Paperboard
- Author
-
Yao Zhang, Qifu Zheng, Xiaobin Chen, Yingyi Guan, Jingbo Dai, Min Zhang, Yunyuan Dong, and Haodong Tang
- Subjects
pesticide residue ,image processing ,RGB color model ,prediction model ,data averaging - Abstract
Pesticide residues have long been a significant aspect of food safety, which has always been a major social concern. This study presents research and analysis on the identification of pesticide residue quick detection cards based on the enzyme inhibition approach. Image recognition technology is used to extract the RGB color feature values from the detection results of the quick detection card, and four regression models are established to quantitatively predict the pesticide residue concentration indicated by the card from these RGB feature values. The four regression models are a linear regression model, a quadratic polynomial regression model, an exponential regression model and an RBF neural network model. Comparative study has shown that the exponential regression model is superior at predicting the pesticide residue concentration indicated by the quick detection card: the correlation value is 0.900, the root mean square error is 0.106, and there is no negative prediction value when the expected concentration is close to 0. This gives a novel concept and data support for the development of image recognition equipment for pesticide residue quick detection cards based on the enzyme inhibition approach.
- Published
- 2023
- Full Text
- View/download PDF
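The exponential model preferred in entry 121 can be fitted with SciPy; a minimal sketch with hypothetical color-feature/concentration pairs, including the RMSE used for model comparison:

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_model(x, a, b):
    return a * np.exp(b * x)

# Hypothetical data: a scalar color feature from the card vs. concentration
x = np.array([0.10, 0.25, 0.40, 0.55, 0.70, 0.85])
y = np.array([0.02, 0.05, 0.11, 0.22, 0.45, 0.90])   # mg/kg

params, _ = curve_fit(exp_model, x, y, p0=(0.01, 5.0))
pred = exp_model(x, *params)
rmse = np.sqrt(np.mean((y - pred) ** 2))
# With a > 0 the model never predicts negative concentrations near x = 0,
# matching the behavior the study highlights.
print(params, rmse)
```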
122. LayART: Generating indoor layout using ARCore Transformations
- Author
-
Chiranjoy Chattopadhyay, Naimul Mefraz Khan, Shreya Goyal, and Gaurav Bhatnagar
- Subjects
Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,RGB color model ,Computer vision ,Mobile phone camera ,Artificial intelligence ,Floor plan ,Simultaneous localization and mapping ,business ,Rgb image - Abstract
Reconstructing an indoor scene and generating a layout/floor plan in 3D or 2D is a widely known problem, and quite a few algorithms have been proposed in the literature recently. However, most of the existing methods either use RGB-D images, thus requiring a depth camera, or depend on panoramic photos with the assumption that there is little to no occlusion in the rooms. In this work, we propose generating the layout from an RGB image captured using a simple mobile phone camera. We take advantage of Simultaneous Localization and Mapping (SLAM) to assess the 3D transformations required for layout generation. SLAM technology is built into recent mobile libraries such as ARCore by Google. Hence, the proposed method is fast and efficient, while giving the user freedom to generate a layout by simply taking a few conventional photos, rather than relying on specialized depth hardware or occlusion-free panoramic photos.
- Published
- 2023
- Full Text
- View/download PDF
123. Blood vessel segmentation of retinal image using Clifford matched filter and Clifford convolution.
- Author
-
Roy, Somasis, Mitra, Anirban, Roy, Sudipta, and Setua, Sanjit Kumar
- Subjects
RETINAL blood vessels ,RETINAL imaging ,IMAGE segmentation ,CLIFFORD algebras ,MATCHED filters ,BLOOD vessels ,MATHEMATICAL convolutions - Abstract
The appearance and structure of blood vessels in retinal fundus images is a fundamental part of diagnosing issues related to conditions such as diabetes and hypertension. The proposed blood vessel segmentation for fundus images using a Clifford algebra approach is divided into three steps. Image vectorization, as the first step, converts the image space into Clifford space. The next step introduces the Clifford matched filter, a proposed mask for retinal blood vessel extraction. The third and final step applies the Clifford convolution operation. This mask generates edge points along the boundaries of the blood vessels. The edge points are represented as grade-0 vectors, or scalar units. The result consists of discrete edge pixels along the boundary of the blood vessels rather than continuous edges, so the output of this method differs in its representation of the vessel tree compared to other existing methods; the output image can be defined as an edge point set. This method achieves blood vessel segmentation accuracies of 94.88% and 92.95% on the two publicly available datasets STARE and DRIVE, respectively, in less than 0.5 s per image. The proposed matched filter and segmentation technique open many windows for reliable and faster processing in further image processing steps on retinal fundus images. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
124. Temporal Head Pose Estimation From Point Cloud in Naturalistic Driving Conditions
- Author
-
Carlos Busso, Sumit Jha, and Tiancheng Hu
- Subjects
Computer science ,business.industry ,Mechanical Engineering ,Deep learning ,Point cloud ,Context (language use) ,Computer Science Applications ,Discriminative model ,Feature (computer vision) ,Face (geometry) ,Automotive Engineering ,RGB color model ,Computer vision ,Artificial intelligence ,business ,Pose - Abstract
Head pose estimation is an important problem as it facilitates tasks such as gaze estimation and attention modeling. In the automotive context, head pose provides crucial information about the driver's mental state, including drowsiness, distraction and attention. It can also be used for interaction with in-vehicle infotainment systems. While computer vision algorithms using RGB cameras are reliable in controlled environments, head pose estimation is a challenging problem in the car due to sudden illumination changes, occlusions and large head rotations that are common in a vehicle. These issues can be partially alleviated by using depth cameras. Head rotation trajectories are continuous with important temporal dependencies. Our study leverages this observation, proposing a novel temporal deep learning model for head pose estimation from point cloud. The approach extracts discriminative feature representation directly from point cloud data, leveraging the 3D spatial structure of the face. The frame-based representations are then combined with bidirectional long short term memory (BLSTM) layers. We train this model on the newly collected multimodal driver monitoring (MDM) dataset, achieving better results compared to non-temporal algorithms using point cloud data, and state-of-the-art models using RGB images. We further show quantitatively and qualitatively that incorporating temporal information provides large improvements not only in accuracy, but also in the smoothness of the predictions.
- Published
- 2022
- Full Text
- View/download PDF
125. Adaptive Fusion CNN Features for RGBT Object Tracking
- Author
-
Huanlong Zhang, Xian Wei, Yong Wang, Xuan Tang, and Hao Shen
- Subjects
Source data ,Modality (human–computer interaction) ,business.industry ,Computer science ,Mechanical Engineering ,Tracking (particle physics) ,Convolutional neural network ,Computer Science Applications ,Minimum bounding box ,Video tracking ,Automotive Engineering ,RGB color model ,Computer vision ,Artificial intelligence ,business ,Intelligent transportation system - Abstract
Thermal sensors play an important role in intelligent transportation systems. This paper studies the problem of RGB and thermal (RGBT) tracking in challenging situations by leveraging multimodal data. An RGBT object tracking method is proposed in a correlation filter tracking framework based on short-term historical information. Given the initial object bounding box, a hierarchical convolutional neural network (CNN) is employed to extract features. The target is tracked separately in the RGB and thermal modalities. Then backward tracking is implemented in the two modalities, and the difference between each forward-backward pair is computed as an indicator of the tracking quality in each modality. Considering the temporal continuity of sequence frames, we also incorporate historical data into the weight computation to achieve a robust fusion of the different source data. Experiments on three RGBT datasets show the proposed method achieves results comparable to state-of-the-art methods.
- Published
- 2022
- Full Text
- View/download PDF
126. LDNet: End-to-End Lane Marking Detection Approach Using a Dynamic Vision Sensor
- Author
-
Byung-Geun Lee, Farzeen Munir, Witold Pedrycz, Shoaib Azam, and Moongu Jeon
- Subjects
FOS: Computer and information sciences ,Computer science ,business.industry ,Computer Vision and Pattern Recognition (cs.CV) ,Mechanical Engineering ,Deep learning ,Frame (networking) ,Motion blur ,Computer Science - Computer Vision and Pattern Recognition ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer Science Applications ,Automotive Engineering ,RGB color model ,Computer vision ,Pyramid (image processing) ,Artificial intelligence ,business ,Encoder ,Image resolution ,Block (data storage) - Abstract
Modern vehicles are equipped with various driver-assistance systems, including automatic lane keeping, which prevents unintended lane departures. Traditional lane detection methods incorporate handcrafted or deep learning-based features followed by postprocessing techniques for lane extraction using frame-based RGB cameras. The utilization of frame-based RGB cameras for lane detection tasks is prone to illumination variations, sun glare, and motion blur, which limits the performance of lane detection methods. Incorporating an event camera for lane detection tasks in the perception stack of autonomous driving is one of the most promising solutions for mitigating challenges encountered by frame-based RGB cameras. The main contribution of this work is the design of the lane marking detection model, which employs the dynamic vision sensor. This paper explores the novel application of lane marking detection using an event camera by designing a convolutional encoder followed by an attention-guided decoder. The spatial resolution of the encoded features is retained by a dense atrous spatial pyramid pooling (ASPP) block. The additive attention mechanism in the decoder improves performance for high-dimensional input encoded features that promote lane localization and relieve postprocessing computation. The efficacy of the proposed work is evaluated using the DVS dataset for lane extraction (DET). The experimental results show a significant improvement of 5.54% and 5.03% in F1 scores in multiclass and binary-class lane marking detection tasks. Additionally, the intersection over union (IoU) scores of the proposed method surpass those of the best-performing state-of-the-art method by 6.50% and 9.37% in multiclass and binary-class tasks, respectively.
- Published
- 2022
- Full Text
- View/download PDF
127. TMFNet: Three-Input Multilevel Fusion Network for Detecting Salient Objects in RGB-D Images
- Author
-
Pan Sijia, Lu Yu, Wujie Zhou, and Jingsheng Lei
- Subjects
Fusion ,Control and Optimization ,business.industry ,Computer science ,Deep learning ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Computer Science Applications ,Image (mathematics) ,Computational Mathematics ,Artificial Intelligence ,Salient ,Depth map ,Benchmark (computing) ,RGB color model ,Artificial intelligence ,business ,Representation (mathematics) - Abstract
The use of depth information, acquired by depth sensors, for salient object detection (SOD) is being explored. Despite the remarkable results from recent deep learning approaches for RGB-D SOD, they fail to fully incorporate original and accurate information to express details of RGB-D images in salient objects. Here, we propose an RGB-D SOD model using a three-input multilevel fusion network (TMFNet), which differs from existing methods based on double-stream networks. In addition to the RGB input (first input) and depth input (second input), the RGB image and depth map are combined into a four-channel representation (RGBD input) that constitutes the third input to the TMFNet. The RGBD input generates multilevel features that reflect details of the RGB-D image. In addition, the proposed TMFNet aggregates diverse region-based contextual information without discarding RGB and depth features. To this end, we introduce a cross-fusion module; benefiting from rich low- and high-level information from salient features, the feature fusion improves the localization of salient objects. The proposed TMFNet achieves state-of-the-art performance on six benchmark datasets for SOD.
- Published
- 2022
- Full Text
- View/download PDF
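The third input of entry 127 is simply the depth map stacked as a fourth channel onto the RGB image; a minimal sketch, with the normalization scheme assumed:

```python
import numpy as np

def make_rgbd_input(rgb, depth):
    """Stack an H x W x 3 RGB image and an H x W depth map into the
    four-channel RGBD representation used as a network's third input."""
    rgb = rgb.astype(np.float32) / 255.0
    d = depth.astype(np.float32)
    d = (d - d.min()) / max(d.max() - d.min(), 1e-6)      # scale to [0, 1]
    return np.concatenate([rgb, d[..., None]], axis=-1)   # H x W x 4
```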
128. Depth Map Super-Resolution Based on Dual Normal-Depth Regularization and Graph Laplacian Prior
- Author
-
Baocai Yin, Yunhui Shi, Ruiqin Xiong, Longhua Sun, Jin Wang, and Qing Zhu
- Subjects
Smoothness (probability theory) ,Color image ,business.industry ,Computer science ,Pattern recognition ,Regularization (mathematics) ,Depth map ,Normal mapping ,Media Technology ,RGB color model ,Graph (abstract data type) ,Artificial intelligence ,Electrical and Electronic Engineering ,Laplacian matrix ,business - Abstract
Edge information plays a key role in the restoration of a depth map. Most conventional methods assume that the color image and depth map are consistent in edge areas. However, complex texture regions in the color image do not match exactly with edges in the depth map. In this paper, we first point out that in most cases the consistency between normal map and depth map is much higher than that between RGB-D pairs. We then propose a dual normal-depth regularization term to guide the restoration of the depth map, which constrains the edge consistency between normal map and depth map back and forth. Moreover, considering the bimodal characteristic of the weight distribution in depth-discontinuous areas, a reweighted graph Laplacian regularizer is proposed to promote this bimodal characteristic. This regularization is incorporated into a unified optimization framework to effectively protect the piece-wise smoothness (PWS) characteristics of the depth map. By treating the depth image as a graph signal, the weight between two nodes is adapted according to its content. The proposed method is tested for both noise-free and noisy cases, and is compared against the state-of-the-art methods on both synthetic and real captured datasets. Extensive experimental results demonstrate the superior performance of our method compared with most state-of-the-art works in terms of both objective and subjective quality evaluations. Specifically, our method is more effective on edge areas and more robust to noise.
- Published
- 2022
- Full Text
- View/download PDF
129. Visual-Depth Matching Network: Deep RGB-D Domain Adaptation With Unequal Categories
- Author
-
Ling Shao, Xiao-Yuan Jing, and Ziyun Cai
- Subjects
Matching (statistics) ,Computer science ,business.industry ,Pattern recognition ,Bottleneck ,Computer Science Applications ,Domain (software engineering) ,Task (project management) ,Human-Computer Interaction ,Control and Systems Engineering ,Component (UML) ,Outlier ,RGB color model ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Software ,Information Systems ,Test data - Abstract
Existing domain adaptation (DA) methods generally assume that different domains have identical label spaces and that the training data are sampled from a single domain. This unrealistic assumption is quite restrictive for real-world applications, since it neglects the more practical scenario where the source domain can contain categories that are not shared by the target domain, and the training data can be collected from multiple modalities. In this article, we address a more difficult but practical problem: recognizing RGB images by training on RGB-D data under a label-space inequality scenario. There are three challenges in this task: 1) source and target domains are affected by the domain mismatch issue, so the trained models perform imperfectly on the test data; 2) depth images are absent in the target domain (e.g., target images are captured by smartphones) while the source domain contains both RGB and depth data, which makes ordinary visual recognition approaches hard to apply to this task; and 3) in the real world, the source and target domains always have different numbers of categories, which makes the negative transfer bottleneck more prominent. Toward tackling the above challenges, we formulate a deep model, called visual-depth matching network (VDMN), where two new modules and a matching component can be trained jointly in an end-to-end fashion to identify the common and outlier categories effectively. The significance of VDMN is that it can take advantage of depth information and handle the domain distribution mismatch under label inequality simultaneously. The experimental results reveal that VDMN exceeds the state-of-the-art performance on various DA datasets, especially under the label inequality scenario.
- Published
- 2022
- Full Text
- View/download PDF
130. A New Semi-Supervised Fault Diagnosis Method via Deep CORAL and Transfer Component Analysis
- Author
-
Xinyu Li, Liang Gao, Zhao Zhang, and Long Wen
- Subjects
Control and Optimization ,Computer science ,business.industry ,Deep learning ,SIGNAL (programming language) ,Pattern recognition ,Fault (power engineering) ,Bearing (navigation) ,Convolutional neural network ,Computer Science Applications ,Data-driven ,Computational Mathematics ,Component analysis ,Artificial Intelligence ,RGB color model ,Artificial intelligence ,business - Abstract
Data-driven methods have been investigated in fault diagnosis. These methods usually need a large number of labeled data to obtain a good model. However, unlabeled data dominate the data available in real industries, which impedes the successful application of these methods. To remedy this situation, this research investigates a new semi-supervised fault diagnosis method that aims at training an adaptive model using a small amount of labeled data and a large amount of unlabeled data. Firstly, the fault signal is transformed into the RGB format. Then, a Convolutional Neural Network (CNN) is trained to generate the fault features in combination with the Correlation Alignment (CORAL) loss. Finally, Transfer Component Analysis (TCA) is adopted to classify the fault features to improve the prediction accuracy. The proposed semi-supervised method is evaluated on three bearing datasets. Experiments with different proportions of labeled data are conducted to validate the proposed CNN with CORAL and TCA, and comparisons between the proposed CNN model and other popular deep learning methods are also presented. The results show that CNN with CORAL and TCA is promising for fault diagnosis, and the prediction accuracies of the proposed method outperform the other comparison methods.
- Published
- 2022
- Full Text
- View/download PDF
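The CORAL term used alongside the CNN in entry 130 penalizes the distance between second-order statistics of source and target features; a minimal sketch of the standard Deep CORAL loss (Sun and Saenko's formulation), not the authors' exact code:

```python
import numpy as np

def coral_loss(source, target):
    """Deep CORAL loss: squared Frobenius distance between the feature
    covariances of source (ns x d) and target (nt x d) batches,
    scaled by 1 / (4 d^2)."""
    d = source.shape[1]
    cs = np.cov(source, rowvar=False)
    ct = np.cov(target, rowvar=False)
    return np.sum((cs - ct) ** 2) / (4 * d * d)

src = np.random.randn(32, 64)          # labeled-batch features
tgt = np.random.randn(32, 64) * 1.5    # unlabeled-batch features
print(coral_loss(src, tgt))
```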
131. Multi-Stream Interaction Networks for Human Action Recognition
- Author
-
Baosheng Yu, Linlin Zhang, Jiaqi Li, Dongyue Chen, and Haoran Wang
- Subjects
genetic structures ,Computer science ,business.industry ,Human body ,Multi stream ,Skeleton (category theory) ,Object (computer science) ,Human skeleton ,medicine.anatomical_structure ,Robustness (computer science) ,Media Technology ,medicine ,RGB color model ,Action recognition ,Computer vision ,Artificial intelligence ,Electrical and Electronic Engineering ,business - Abstract
Skeleton-based human action recognition has received extensive attention due to its efficiency and robustness to complex backgrounds. Though the human skeleton can accurately capture the dynamics of human poses, it fails to recognize human actions induced by the interaction between humans and objects, making it of great importance to further explore this interaction for human action recognition. In this paper, we devise multi-stream interaction networks (MSIN) to simultaneously explore the dynamics of the human skeleton, objects, and the interaction between human and objects. Specifically, apart from the traditional human skeleton stream, 1) the second stream explores the dynamics of object appearance from the objects surrounding the human body joints; and 2) the third stream captures the dynamics of object position in terms of the distance between the object and different human body joints. Experimental results on three popular skeleton-based human action recognition datasets, NTU RGB+D, NTU RGB+D 120, and SYSU, demonstrate the effectiveness of the proposed method, especially for recognizing human actions with human-object interactions.
- Published
- 2022
- Full Text
- View/download PDF
132. CGFNet: Cross-Guided Fusion Network for RGB-T Salient Object Detection
- Author
-
Yunhui Yan, Kechen Song, Yanqi Bao, Jie Wang, and Liming Huang
- Subjects
Source code ,Computer science ,business.industry ,media_common.quotation_subject ,Context (language use) ,Image (mathematics) ,Modal ,Salient ,Media Technology ,RGB color model ,Computer vision ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Decoding methods ,media_common ,Block (data storage) - Abstract
RGB salient object detection (SOD) has made great progress. However, the performance of single-modal salient object detection decreases significantly in challenging scenes such as low light or darkness. To deal with these challenges, the thermal infrared (T) image is introduced into salient object detection; the resulting multi-modal task is called RGB-T salient object detection. To achieve deep mining of the unique characteristics of each single modality and full integration of cross-modality information, a novel Cross-Guided Fusion Network (CGFNet) for RGB-T salient object detection is proposed. Specifically, a Cross-Scale Alternate Guiding Fusion (CSAGF) module is proposed to mine high-level semantic information and provide global context support. Subsequently, we design a Guidance Fusion Module (GFM) to achieve sufficient cross-modality fusion by using one modality as the main guidance and the other as auxiliary. Finally, the Cross-Guided Fusion Module (CGFM) is presented and serves as the main decoding block. Each decoding block consists of two parts, with each of the two modalities in turn serving as the main guidance, i.e., cross-shared Cross-Level Enhancement (CLE) and Global Auxiliary Enhancement (GAE); the main difference between the two parts is which modality the GFM uses as the main guide. Comprehensive experimental results show that our method achieves better performance than state-of-the-art salient detection methods. The source code has been released at: https://github.com/wangjie0825/CGFNet.git.
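To give a feel for "main guidance plus auxiliary" fusion in general, here is a toy gated-fusion module in PyTorch. It is a generic sketch of this family of designs, not a reconstruction of CGFNet's actual GFM; all layer choices are assumptions:

```python
import torch
import torch.nn as nn

class GuidedFusion(nn.Module):
    """Toy cross-modal fusion: the main modality gates the auxiliary one."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),   # per-pixel, per-channel attention weights
        )
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, main: torch.Tensor, aux: torch.Tensor) -> torch.Tensor:
        weighted_aux = aux * self.gate(main)   # main modality guides the aux one
        return self.merge(torch.cat([main, weighted_aux], dim=1))

# Example: fuse RGB (main) and thermal (aux) feature maps of matching shape.
fuse = GuidedFusion(64)
out = fuse(torch.randn(1, 64, 56, 56), torch.randn(1, 64, 56, 56))
```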
- Published
- 2022
- Full Text
- View/download PDF
133. Intelligent Mechanical Fault Diagnosis Using Multisensor Fusion and Convolution Neural Network
- Author
-
Tingli Xie, Xufeng Huang, and Seung-Kyum Choi
- Subjects
Computer science ,System of measurement ,Residual ,Fault (power engineering) ,computer.software_genre ,Convolutional neural network ,Computer Science Applications ,Control and Systems Engineering ,Principal component analysis ,Fuse (electrical) ,RGB color model ,Data mining ,Electrical and Electronic Engineering ,computer ,Information Systems ,Data transmission - Abstract
Diagnosis of mechanical faults in manufacturing systems is critical for ensuring safety and saving costs. With the development of data-transmission and sensor technologies, measuring systems can acquire massive amounts of multi-sensor data. Although Deep Learning (DL) provides an end-to-end way to address the drawbacks of traditional methods, intelligent fault diagnosis based on multi-sensor data still calls for deeper research. In this project, a novel intelligent diagnosis method based on Multi-Sensor Fusion (MSF) and a Convolutional Neural Network (CNN) is explored. Firstly, a Multi-Signals-to-RGB-Image conversion method based on Principal Component Analysis (PCA) is applied to fuse multi-signal data into three-channel RGB images. Then, an improved CNN with residual networks is proposed, which balances computational cost against accuracy. Two datasets are used to verify the effectiveness of the proposed method. The results show the proposed method outperforms other DL-based methods in terms of accuracy.
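The abstract does not spell out the conversion details, but one plausible reading of a PCA-based multi-signal-to-RGB conversion is sketched below: project each time window of multi-sensor data onto its first three principal components and use them as the R, G, and B channels. Window length, image size, and normalization are assumptions:

```python
import numpy as np
from sklearn.decomposition import PCA

def signals_to_rgb(signals: np.ndarray, size: int = 64) -> np.ndarray:
    """Fuse multi-sensor signals into one RGB image via PCA.

    signals: (n_samples, n_sensors) windowed time series, n_samples >= size*size.
    Returns: (size, size, 3) uint8 image; the three principal components
             become the R, G, and B channels.
    """
    window = signals[: size * size]                    # one analysis window
    comps = PCA(n_components=3).fit_transform(window)  # (size*size, 3)
    comps -= comps.min(axis=0)                         # min-max normalize
    comps /= comps.max(axis=0) + 1e-12                 # each component to 0..1
    return (comps.reshape(size, size, 3) * 255).astype(np.uint8)

# Example: 4096 time steps from 8 sensors -> one 64x64 RGB image.
img = signals_to_rgb(np.random.randn(4096, 8))
```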
- Published
- 2022
- Full Text
- View/download PDF
134. Capsule Boundary Network With 3D Convolutional Dynamic Routing for Temporal Action Detection
- Author
-
Xinhua Suo, Yan Shen, Wei Wang, Yaosen Chen, Bing Guo, and Weichen Lu
- Subjects
Computer science ,Intersection (set theory) ,business.industry ,Optical flow ,Process (computing) ,Boundary (topology) ,Pattern recognition ,Convolutional neural network ,Action (philosophy) ,Feature (computer vision) ,Media Technology ,RGB color model ,Artificial intelligence ,Electrical and Electronic Engineering ,business - Abstract
Temporal action detection is a challenging task in video understanding, owing to complex backgrounds and rich action content that hinder high-quality temporal proposal generation in untrimmed videos. Capsule networks can avoid some limitations of convolutional neural networks, such as the invariance introduced by pooling, and can better model the temporal relations needed for temporal action detection. However, because of their extremely computationally expensive routing procedure, capsule networks are difficult to apply to temporal action detection. To address this issue, this paper proposes a novel U-shaped capsule network framework with a k-Nearest Neighbor (k-NN) mechanism for 3D convolutional dynamic routing, which we name U-BlockConvCaps. Furthermore, we build a Capsule Boundary Network (CapsBoundNet) based on U-BlockConvCaps for dense temporal action proposal generation. Specifically, the first module is a 1D convolutional layer that fuses two-stream RGB and optical flow video features. The sampling module further processes the fused features to generate 2D start-end action proposal feature maps. Then, a multi-scale U-Block convolutional capsule module with 3D convolutional dynamic routing processes the proposal feature map. Finally, the feature maps generated by CapsBoundNet are used to predict starting, ending, action classification, and action regression score maps, which help capture boundary and intersection-over-union features. Our work innovatively improves the dynamic routing algorithm of capsule networks and extends capsule networks to the temporal action detection task for the first time in the literature. Experimental results on the THUMOS14 benchmark show that CapsBoundNet clearly surpasses state-of-the-art methods, e.g., mAP@tIoU=0.3, 0.4, 0.5 on THUMOS14 improves from 63.6% to 70.0%, 57.8% to 63.1%, and 51.3% to 52.9%, respectively. We also obtained competitive results on the ActivityNet1.3 action detection dataset.
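For readers unfamiliar with the tIoU metric quoted in these results, temporal IoU is the 1D analogue of bounding-box IoU; a minimal sketch:

```python
def temporal_iou(a: tuple, b: tuple) -> float:
    """Intersection over union of two temporal segments (start, end) in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

# A proposal counts as correct at tIoU=0.5 if it overlaps a ground-truth
# segment by at least 50%:
print(temporal_iou((12.0, 20.0), (14.0, 22.0)))  # 0.6 -> a hit at tIoU=0.5
```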
- Published
- 2022
- Full Text
- View/download PDF
135. Efficient Context-Guided Stacked Refinement Network for RGB-T Salient Object Detection
- Author
-
Fushuo Huo, Xuegui Zhu, Yu Shu, Qifeng Liu, and Lei Zhang
- Subjects
business.industry ,Computer science ,Context (language use) ,Filter (signal processing) ,FLOPS ,Task (computing) ,Media Technology ,Fuse (electrical) ,RGB color model ,Computer vision ,Noise (video) ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Spatial analysis - Abstract
RGB-T salient object detection (SOD) aims at utilizing the complementary cues of the RGB and thermal (T) modalities to detect and segment common objects. However, on one hand, existing methods simply fuse the features of the two modalities without fully considering their distinct characteristics. On the other hand, the high computational cost of existing methods prevents their use in real-world applications (e.g., automatic driving, anomaly detection, person re-ID). To this end, we propose an efficient encoder-decoder network named Context-guided Stacked Refinement Network (CSRNet). Specifically, we utilize a lightweight backbone and design efficient decoder parts, which greatly reduces the computational cost. To fuse the RGB and T modalities, we propose an efficient Context-guided Cross Modality Fusion (CCMF) module to filter noise and exploit the complementarity of the two modalities. In addition, the Stacked Refinement Network (SRN) progressively refines features from top to bottom via the interaction of semantic and spatial information. Extensive experiments show that our method performs favorably against state-of-the-art algorithms on the RGB-T SOD task with a small model size (4.6M), few FLOPs (4.2G), and real-time speed (38fps). Our code is available at: https://github.com/huofushuo/CSRNet.
- Published
- 2022
- Full Text
- View/download PDF
136. Report Summarizes Breast Cancer Study Findings from College of Engineering (Rapid Tri-net: Breast Cancer Classification From Histology Images Using Rapid Tri-attention Network).
- Subjects
TUMOR classification ,BREAST cancer ,ENGINEERING schools ,HISTOLOGY ,RGB color model - Abstract
A report from the College of Engineering in Maharashtra, India, discusses the importance of early detection in breast cancer and proposes an automated breast cancer detection system using deep learning and histopathological images. The proposed approach involves several steps, including image filtering, dual-stage segmentation, feature extraction, feature selection, and classification. The research concludes that the proposed system, called Rapid Tri-Net, performs better than existing approaches. The study has been peer-reviewed and provides valuable insights into breast cancer detection using advanced technology. [Extracted from the article]
- Published
- 2024
137. Critical importance of RGB color space specificity for colorimetric bio/chemical sensing: A comprehensive study.
- Author
-
Fay, Cormac D. and Wu, Liang
- Subjects
- *
COLOR space , *RGB color model , *COLORIMETRIC analysis , *CHEMICAL models - Abstract
The use of the RGB color model in colorimetric chemical sensing via imaging techniques is widely prevalent in the literature. However, the lack of specificity in the selection of RGB color space during capture and analysis presents a significant challenge to creating standardised methods for this field and can lead to discrepancies. In this study, we conducted a comprehensive comparison and contrast of a total of 68 RGB color spaces to evaluate their respective impacts on colorimetric bio/chemical sensing. We explore the impact on dynamic range, sensitivity, and limit of detection, and show that the lack of specificity in RGB color space selection can affect colorimetric chemical sensing by 42–77%. We also explore the impact of underlying RGB comparisons and demonstrate a further 18.3% discrepancy between RGB color spaces. By emphasising the importance of proper RGB color space selection and handling, our findings contribute to a better understanding of this critical area and present valuable opportunities for future research. We further provide valuable insights for creating standardised methods in this field, which can be utilised to avoid discrepancies and ensure accurate and reliable analysis in colorimetric bio/chemical sensing.
• Unspecified RGB color spaces can compromise chemical sensing via imaging techniques.
• Sensory characteristics significantly affected by RGB color space selection (42–77%).
• Conversion between RGB spaces can lead to further discrepancies (18%).
• 68 RGB color spaces compared - proper selection is critical for accurate analysis.
• Proposed method herein can contribute to standardised methods/analysis in the field. [ABSTRACT FROM AUTHOR]
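To make the stakes concrete: the same stored RGB triplet maps to different physical intensities depending on the assumed color space's transfer function. A small illustrative sketch using the standard sRGB transfer function (not a computation from the paper) shows how large the gap can be when a stored value is naively treated as linear:

```python
def srgb_to_linear(c: float) -> float:
    """Invert the sRGB transfer function for one channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

stored = 0.5  # an 8-bit value of ~128, normalized to [0, 1]
print(srgb_to_linear(stored))  # ~0.214: far below the naive linear reading of 0.5
```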
- Published
- 2024
- Full Text
- View/download PDF
138. Exploiting digital images and videos for urea determination in milk based on enzymatic hydrolysis monitoring.
- Author
-
Gonçalves, Isabela C., Fernandes, Gabriel M., and Rocha, Fábio R.P.
- Subjects
- *
DIGITAL images , *DIGITAL video , *RGB color model , *VIDEO monitors , *UREA , *SKIM milk , *IMAGE analysis - Abstract
Since urea is both an endogenous species and a potential milk adulterant, its determination is important for quality control in the dairy industry. This work proposes a novel, simple, and cost-effective method to determine urea by the analysis of digital images and videos. Urea hydrolysis under urease catalysis releases ammonia, and the resulting increase in pH makes photometric measurements feasible based on the dissociation of the phenol red acid-base indicator. Digital images were acquired by a smartphone camera under controlled conditions, and intensities of the reflected radiation were converted to the RGB color system using a free app (PhotoMetrix®). Digital videos were exploited for procedure optimization and for the evaluation of matrix effects from RGB values obtained with ImageJ® software. A linear response was achieved within 2.0–25.0 mg L-1 urea (R2 = 0.996), with a coefficient of variation (n = 10) of 1.6%, and limits of detection (95% confidence level) and quantification of 0.5 and 1.5 mg L-1, respectively. Recoveries from 95% to 112% were estimated from samples spiked with urea, and results for commercial milk samples obtained by the standard additions method agreed with the reference values at the 95% confidence level. The procedure is applicable to whole, semi-skim, and skim milk, whether pasteurized or ultra-heat treated.
• Exploitation of enzymatic hydrolysis for urea determination in milk.
• Digital videos for kinetic monitoring and detection of matrix effects.
• Application to different kinds of milk without pretreatment.
• An environmentally friendly approach with microliter amounts of sample and reagents.
• Accurate results achieved by the standard additions method. [ABSTRACT FROM AUTHOR]
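The general workflow behind image-based colorimetry of this kind — extract mean RGB values from a region of interest, then fit a linear calibration — can be sketched as follows. The file name, choice of the green channel, and all calibration numbers are hypothetical placeholders, not the paper's data:

```python
import numpy as np
from PIL import Image

def mean_rgb(path: str, box: tuple) -> np.ndarray:
    """Mean R, G, B of a rectangular ROI given as (left, top, right, bottom)."""
    roi = np.asarray(Image.open(path).convert("RGB").crop(box), dtype=float)
    return roi.reshape(-1, 3).mean(axis=0)

# Hypothetical calibration: channel intensity vs. analyte concentration.
conc = np.array([2.0, 5.0, 10.0, 15.0, 25.0])         # mg/L standards
green = np.array([180.0, 162.0, 133.0, 104.0, 46.0])  # made-up responses
slope, intercept = np.polyfit(conc, green, 1)          # least-squares line

# Predict the concentration of an unknown from its ROI green channel.
unknown = (mean_rgb("sample.jpg", (100, 100, 200, 200))[1] - intercept) / slope
```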
- Published
- 2024
- Full Text
- View/download PDF
139. Zenopontonia soror caridean shrimp exhibits dynamical mimicry with sea star host switching.
- Author
-
Fernández-Lereé, Carla G., Ávila-García, Ariadna, Sánchez, Carlos, Borda, Elizabeth, López-Vivas, Juan M., Huato-Soberanis, Leonardo, and Gómez-Gutiérrez, Jaime
- Subjects
- *
STARFISHES , *RGB color model , *SHRIMPS , *BODY size , *CHROMATOPHORES - Abstract
Zenopontonia soror (Nobili, 1904) (Caridea: Palaemonidae) is a shrimp symbiont of several sea star species that exhibits homochromy, with distinct mimetic coloration according to host species. To date, populations of Z. soror have been studied extensively in the Indo-Pacific and the Gulf of Panama; however, those present in the Gulf of California (Mexico) have not been evaluated. Here we conduct the first evaluation of the diversity of chromotypes present in Z. soror populations in Bahía de La Paz (Baja California Sur, Mexico), and experimentally test how long shrimp take to achieve mimicry after a sea star switch. We hypothesized that Z. soror individuals rapidly change coloration (within minutes to a few hours) throughout their ontogeny to mimic the coloration of the host species and decrease predation risk. This hypothesis was tested by translocating Z. soror individuals to a different sea star species with different coloration and background, in the field and under laboratory conditions, for three consecutive days. At least 18 chromotypes of Z. soror were identified among 386 observed specimens based on the RGB color system, and colors were independent of body size. Each chromotype has a different relative abundance in the population. Coloration was not distinctive of each individual, while color intensity increased throughout ontogenetic development, from semi-transparent (juveniles) to solid coloration (adult phase). Dynamic contraction or dispersion of the chromatophores (pigment migration) caused changes in coloration to resemble the coloration of the basibiont sea star. Full mimicry of the new host was achieved within 36 h in field and laboratory conditions. Zenopontonia soror individuals were observed to mimic the white spines of the sea star host Acanthaster solaris by exhibiting a white dorsal band, while others selected color-concealing hosts to stay cryptic for a short time. We conclude that Z. soror dynamically shifts coloration as an adaptation to match the symbiont shrimp's coloration to its sea star host and thereby minimize or prevent predation, but individuals are vulnerable to detection by visual predators during the color transition (<36 h period). The combined use of the Hogben and Slome index and the RGB color system from digital images provides a precise method to assign intensity and color in epibiont-host organisms and to understand the complex process of homochromy.
• Zenopontonia soror has at least 18 chromotypes, which change independently of body size.
• Each chromotype has a distinct relative abundance in the population.
• Color intensity increased with ontogenetic development, from transparent (juveniles) to solid coloration (adults).
• Dynamic contraction or dispersion of the chromatophores causes changes in coloration to resemble the coloration of the basibiont.
• Full homochromy with the new host was achieved within 36 h in field and laboratory conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
140. Smartphone-assisted real-time estimation of chlorophyll and carotenoid concentrations and ratio using the inverse of red and green digital color features
- Author
-
Agarwal, Avinash, Dongre, Piyush Kumar, and Dutta Gupta, Snehasish
- Published
- 2021
- Full Text
- View/download PDF
141. Unified Information Fusion Network for Multi-Modal RGB-D and RGB-T Salient Object Detection
- Author
-
Siwei Ma, Guibiao Liao, Weisi Lin, Yongsheng Liang, Wei Gao, Ge Li, and School of Computer Science and Engineering
- Subjects
Dynamic Cross-Modal Guided Mechanism ,Modality (human–computer interaction) ,Generalization ,business.industry ,Computer science ,RGBD/RGB-T Multi-Modal Data ,Pattern recognition ,Modal ,Robustness (computer science) ,Feature (computer vision) ,Human visual system model ,Media Technology ,Benchmark (computing) ,Computer science and engineering [Engineering] ,RGB color model ,Artificial intelligence ,Electrical and Electronic Engineering ,business - Abstract
The use of complementary information, namely depth or thermal information, has shown its benefits for salient object detection (SOD) in recent years. However, the RGB-D and RGB-T SOD problems are currently only solved independently, and most methods directly extract and fuse raw features from backbones. Such methods can easily be restricted by low-quality modality data and redundant cross-modal features. In this work, a unified end-to-end framework is designed to simultaneously analyze RGB-D and RGB-T SOD tasks. Specifically, to effectively handle multi-modal features, we propose a novel multi-stage and multi-scale fusion network (MMNet), which consists of a cross-modal multi-stage fusion module (CMFM) and a bi-directional multi-scale decoder (BMD). Similar to the visual color stage doctrine in the human visual system (HVS), the proposed CMFM aims to explore important feature representations in the feature response stage and to integrate them into cross-modal features in the adversarial combination stage. Moreover, the proposed BMD learns the combination of multi-level cross-modal fused features to capture both local and global information of salient objects, further boosting multi-modal SOD performance. The proposed unified cross-modality feature analysis framework based on two-stage and multi-scale information fusion can be used for diverse multi-modal SOD tasks. Comprehensive experiments (∼92K image-pairs) demonstrate that the proposed method consistently outperforms the other 21 state-of-the-art methods on nine benchmark datasets. This validates that our proposed method works well on diverse multi-modal SOD tasks with good generalization and robustness, and provides a good multi-modal SOD benchmark. Accepted version. This work was supported by Ministry of Science and Technology of China - Science and Technology Innovations 2030 (2019AAA0103501), Natural Science Foundation of China (61801303 and 62031013), Guangdong Basic and Applied Basic Research Foundation (2019A1515012031), and Shenzhen Science and Technology Plan Basic Research Project (JCYJ20190808161805519).
- Published
- 2022
- Full Text
- View/download PDF
142. Enhancing RGB-D SLAM Performances Considering Sensor Specifications for Indoor Localization
- Author
-
Abdelhafid El Ouardi, Sergio Rodriguez Florez, Imad El Bouazzaoui, Systèmes et Applications des Technologies de l'Information et de l'Energie (SATIE), École normale supérieure - Cachan (ENS Cachan)-Université Paris-Sud - Paris 11 (UP11)-Institut Français des Sciences et Technologies des Transports, de l'Aménagement et des Réseaux (IFSTTAR)-École normale supérieure - Rennes (ENS Rennes)-Université de Cergy Pontoise (UCP), and Université Paris-Seine-Université Paris-Seine-Conservatoire National des Arts et Métiers [CNAM] (CNAM)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
robotics ,business.industry ,Computer science ,RGB-D cameras ,010401 analytical chemistry ,Context (language use) ,Robotics ,Simultaneous localization and mapping ,01 natural sciences ,0104 chemical sciences ,Stereopsis ,Depth map ,Sensors specifications ,SLAM ,[INFO.INFO-RB]Computer Science [cs]/Robotics [cs.RO] ,Robot ,RGB color model ,Computer vision ,Artificial intelligence ,Depth Map ,Electrical and Electronic Engineering ,business ,Indoor localization ,Instrumentation ,Protocol (object-oriented programming) - Abstract
Several works have focused on Simultaneous Localization and Mapping (SLAM), a topic studied for more than a decade to meet robots' need to navigate unknown environments. SLAM is an essential perception functionality in several applications, especially robotics and autonomous vehicles. RGB-D cameras are among the sensors commonly used with recent SLAM algorithms: they provide an RGB image and the associated depth map, making it possible to resolve scale drift with less complexity and to create a dense 3D representation of the environment. Many RGB-D SLAM algorithms have been studied and evaluated on publicly available datasets without considering sensor specifications or image acquisition modes that could improve or degrade localization accuracy. In this work, we address indoor localization while taking sensor specifications into account. In this context, our contribution is a thorough experimental study highlighting the impact of sensor acquisition modes on localization accuracy, together with a parametric optimization protocol for precise localization in a given environment. Furthermore, we apply the proposed protocol to optimize a depth-related parameter of the SLAM algorithm. The study is based on a publicly available indoor dataset acquired with a depth sensor. The analysis of the reconstruction results is founded on metrics involving translational and rotational errors, which are compared with those obtained by a state-of-the-art stereo-vision-based SLAM algorithm.
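Translational trajectory error of the kind evaluated here is commonly summarized as the RMSE of the Absolute Trajectory Error (ATE). A minimal sketch, assuming the two trajectories are already time-associated and rigidly aligned (a full pipeline would first estimate the alignment, e.g. via the Umeyama method):

```python
import numpy as np

def ate_rmse(gt: np.ndarray, est: np.ndarray) -> float:
    """RMSE of the Absolute Trajectory Error.

    gt, est: (N, 3) arrays of corresponding, pre-aligned camera positions.
    """
    errs = np.linalg.norm(gt - est, axis=1)   # per-pose translational error
    return float(np.sqrt((errs ** 2).mean()))

# Example: a constant 5 cm offset yields an ATE RMSE of ~0.05 m.
gt = np.random.rand(100, 3)
print(ate_rmse(gt, gt + np.array([0.05, 0.0, 0.0])))
```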
- Published
- 2022
- Full Text
- View/download PDF
143. Spatiotemporal Multimodal Learning With 3D CNNs for Video Action Recognition
- Author
-
Yibin Li, Xin Ma, and Hanbo Wu
- Subjects
Modality (human–computer interaction) ,Computer science ,business.industry ,Data stream mining ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,ENCODE ,Convolutional neural network ,Multimodal learning ,Discriminative model ,Media Technology ,RGB color model ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Pose - Abstract
Extracting effective spatial-temporal information is critically important for video-based action recognition. Recently, 3D convolutional neural networks (3D CNNs), which can simultaneously encode spatial and temporal dynamics in videos, have made considerable progress in action recognition. However, almost all existing 3D CNN-based methods recognize human actions using RGB videos alone; this single modality may limit the performance capacity of 3D networks. In this paper, we extend 3D CNNs to depth and pose data in addition to RGB data to evaluate their capacity for spatiotemporal multimodal learning in video action recognition. We propose a novel multimodal two-stream 3D network framework that can exploit complementary multimodal information to improve recognition performance. Specifically, we first construct two discriminative video representations under the depth and pose modalities, referred to as the depth residual dynamic image sequence (DRDIS) and the pose estimation map sequence (PEMS). DRDIS captures the spatial-temporal evolution of actions in depth videos by progressively aggregating local motion information. PEMS eliminates the interference of cluttered backgrounds and intuitively describes the spatial configuration of body parts. The multimodal two-stream 3D CNN processes two separate data streams to learn spatiotemporal features from the DRDIS and PEMS representations. Finally, the classification scores from the two streams are fused for action recognition. We conduct extensive experiments on four challenging action recognition datasets. The experimental results verify the effectiveness and superiority of our proposed method.
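The two-stream, late score-fusion pattern described above can be illustrated with a deliberately tiny PyTorch sketch; the layer sizes, clip shapes, and class count are assumptions for illustration, not the paper's architecture:

```python
import torch
import torch.nn as nn

class TinyStream3D(nn.Module):
    """Minimal 3D-CNN stream: (batch, channels, frames, H, W) -> class scores."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),   # global spatio-temporal pooling
        )
        self.fc = nn.Linear(16, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(self.features(x).flatten(1))

# Two modality streams whose class scores are fused by averaging.
depth_stream, pose_stream = TinyStream3D(1, 60), TinyStream3D(3, 60)
depth_clip = torch.randn(2, 1, 16, 112, 112)  # DRDIS-like input (shape assumed)
pose_clip = torch.randn(2, 3, 16, 112, 112)   # PEMS-like input (shape assumed)
scores = (depth_stream(depth_clip) + pose_stream(pose_clip)) / 2
```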
- Published
- 2022
- Full Text
- View/download PDF
144. Multi-Graph Fusion and Learning for RGBT Image Saliency Detection
- Author
-
Yunhui Yan, Menghui Niu, Liming Huang, Kechen Song, and Jie Wang
- Subjects
Fusion ,Computer science ,business.industry ,Deep learning ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Boundary (topology) ,Pattern recognition ,Field (computer science) ,Image (mathematics) ,Ranking (information retrieval) ,Media Technology ,RGB color model ,Graph (abstract data type) ,Artificial intelligence ,Electrical and Electronic Engineering ,business - Abstract
RGB and thermal infrared (RGBT) image saliency detection is a relatively new direction in the field of computer vision. Combining the advantages of RGB images and T images can significantly improve detection performance. Currently, only a few methods address RGBT saliency detection, and the number of available image samples cannot meet the training requirements of deep learning, so an effective unsupervised method remains valuable. In this paper, we present an unsupervised RGBT saliency detection method based on multi-graph fusion and learning. Firstly, RGB images and T images are adaptively fused based on boundary information to produce more accurate superpixels. Next, a multi-graph fusion model is proposed to selectively learn useful information from multi-modal images. Finally, we apply the principle of finding good neighbors in the graph affinity and propose different algorithms for the two stages of saliency ranking. Experimental results on three RGBT datasets show that the proposed method is effective compared with state-of-the-art algorithms.
- Published
- 2022
- Full Text
- View/download PDF
145. SiamCDA: Complementarity- and Distractor-Aware RGB-T Tracking Based on Siamese Network
- Author
-
Jungong Han, Xueru Liu, Qiang Zhang, and Tianlu Zhang
- Subjects
BitTorrent tracker ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Frame rate ,Feature (computer vision) ,Minimum bounding box ,Robustness (computer science) ,Video tracking ,Media Technology ,RGB color model ,Computer vision ,Pyramid (image processing) ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
Recent years have witnessed the prevalence of Siamese networks for RGB-T tracking because of their remarkable success in RGB object tracking. Despite their faster-than-real-time speeds, existing RGB-T Siamese trackers suffer from low accuracy and poor robustness compared to other state-of-the-art RGB-T trackers. To address these issues, a new complementarity- and distractor-aware RGB-T tracker based on a Siamese network (referred to as SiamCDA) is developed in this paper. To this end, several modules are presented: the feature pyramid network (FPN) is incorporated into the Siamese network to capture cross-level information within unimodal features extracted from the RGB or thermal images. Next, a complementarity-aware multi-modal feature fusion module (CA-MF) is specially designed to capture cross-modal information between RGB features and thermal features. In the final bounding-box selection phase, a distractor-aware region proposal selection module (DAS) further enhances the robustness of our tracker. On top of these technical modules, we also build a large-scale, diverse synthetic RGB-T tracking dataset containing more than 4831 pairs of synthetic RGB-T videos and 12K synthetic RGB-T images. Extensive experiments on three RGB-T tracking benchmark datasets demonstrate the outstanding performance of our proposed tracker, with a tracking speed over 37 frames per second (FPS).
- Published
- 2022
- Full Text
- View/download PDF
146. An efficient method for acquisition of spectral BRDFs in real-world scenarios
- Author
-
J. Roberto Jiménez-Pérez, Joaquim J. Sousa, Luís Pádua, Juan Manuel Jurado, and Francisco R. Feito
- Subjects
Human-Computer Interaction ,Computer graphics ,Computer science ,Spectral rendering ,General Engineering ,Point cloud ,Process (computing) ,RGB color model ,Hyperspectral imaging ,Bidirectional reflectance distribution function ,Computer Graphics and Computer-Aided Design ,Sample (graphics) ,Remote sensing - Abstract
Modelling of material appearance from reflectance measurements has become increasingly prevalent due to the development of novel methodologies in Computer Graphics. In the last few years, advances have been made in measuring light-material interactions by employing goniometers/reflectometers under specific laboratory constraints. A wide range of applications benefit from data-driven appearance modelling techniques and material databases to create photorealistic scenarios and physically based simulations. However, important limitations arise from the current material scanning process, mostly related to the high diversity of materials existing in the real world, the tedious material scanning procedure, and the characterisation of spectral behaviour. Consequently, new approaches are required both for the automatic material acquisition process and for the generation of measured material databases. In this study, a novel approach for material appearance acquisition using hyperspectral data is proposed. A dense 3D point cloud filled with spectral data was generated from images obtained by an unmanned aerial vehicle (UAV) equipped with an RGB camera and a hyperspectral sensor. The observed hyperspectral signatures were used to recognise natural and artificial materials in the 3D point cloud according to spectral similarity. Then, a parametrisation of the Bidirectional Reflectance Distribution Function (BRDF) was carried out by sampling the BRDF space for each material. Consequently, each material is characterised by multiple samples with different incoming and outgoing angles. Finally, an analysis of BRDF sample completeness is performed, considering four sunlight positions and a 16x16 resolution for each material. The results demonstrate the capability of the technology used and the effectiveness of our method for applications such as spectral rendering and real-world material acquisition and classification.
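The paper groups points by spectral similarity; one widely used similarity measure for hyperspectral signatures (though not necessarily the one the authors used) is the Spectral Angle Mapper, sketched here with made-up example spectra:

```python
import numpy as np

def spectral_angle(a: np.ndarray, b: np.ndarray) -> float:
    """Spectral Angle Mapper (SAM): angle in radians between two spectra.

    Smaller angles mean more similar signatures, independent of overall
    brightness (the measure depends only on spectral shape)."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

# Example: compare an observed signature to a material reference spectrum.
observed = np.array([0.12, 0.18, 0.35, 0.52, 0.48])
reference = np.array([0.10, 0.17, 0.33, 0.55, 0.50])
print(spectral_angle(observed, reference))  # small angle -> likely same material
```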
- Published
- 2022
- Full Text
- View/download PDF
147. Nonuniformity Measurement of Image Resolution under Effect of Color Speckle for Raster-Scan RGB Laser Mobile Projector
- Author
-
Kazuo Kuroda, Koji Suzuki, Akira Takamori, Keisuke Hieda, Kazuhisa Yamamoto, and Junichi Kinoshita
- Subjects
Computer science ,business.industry ,Laser ,Electronic, Optical and Magnetic Materials ,law.invention ,Speckle pattern ,Optics ,Projector ,law ,RGB color model ,Electrical and Electronic Engineering ,business ,Raster scan ,Image resolution
- Published
- 2022
- Full Text
- View/download PDF
148. NeuroIV: Neuromorphic Vision Meets Intelligent Vehicle Towards Safe Driving With a New Database and Baseline Evaluations
- Author
-
Jörg Conradt, Alois Knoll, Zhenyan Zhang, Weijun Li, Fa Wang, Yiwen Lu, Lin Hong, Jieneng Chen, and Guang Chen
- Subjects
Focus (computing) ,Database ,Computer science ,Event (computing) ,Mechanical Engineering ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,computer.software_genre ,Field (computer science) ,Computer Science Applications ,Neuromorphic engineering ,Asynchronous communication ,Automotive Engineering ,RGB color model ,Distracted driving ,Motion perception ,computer - Abstract
Neuromorphic vision sensors such as the Dynamic and Active-pixel Vision Sensor (DAVIS), which use a silicon retina, are inspired by biological vision; they generate streams of asynchronous events indicating local log-intensity brightness changes. Their high temporal resolution, low bandwidth, lightweight computation, and low latency make them a good fit for many motion-perception applications in the intelligent vehicle. However, as a younger and smaller research field than classical computer vision, neuromorphic vision is rarely connected with the intelligent vehicle. For this purpose, we present three novel datasets recorded with DAVIS sensors and a depth sensor for distracted-driving research, focusing on driver drowsiness detection, driver gaze-zone recognition, and driver hand-gesture recognition. To facilitate comparison with classical computer vision, we record RGB, depth, and infrared data with a depth sensor simultaneously. The dataset comprises 27,360 samples in total. To unlock the potential of neuromorphic vision for the intelligent vehicle, we utilize three popular event-encoding methods to convert asynchronous event slices into event frames and adapt state-of-the-art convolutional architectures to extensively evaluate their performance on this dataset. Together with qualitative and quantitative results, this work provides a new database and baseline evaluations, named NeuroIV, in the cross-cutting areas of neuromorphic vision and the intelligent vehicle.
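A simple and common event-encoding scheme of the kind referenced above accumulates each slice of asynchronous events into a two-channel count frame (one channel per polarity); the sketch below is one such encoding, not necessarily one of the three methods evaluated in the paper:

```python
import numpy as np

def events_to_frame(x, y, polarity, width=346, height=260):
    """Accumulate a slice of DAVIS events into a 2-channel count frame.

    x, y:     integer pixel coordinates of the events
    polarity: +1 (brightness increase) or -1/0 (decrease) per event
    Returns a (height, width, 2) image: channel 0 = ON, channel 1 = OFF.
    (346x260 matches the DAVIS346 sensor; other models differ.)
    """
    frame = np.zeros((height, width, 2), dtype=np.float32)
    on = polarity > 0
    np.add.at(frame[:, :, 0], (y[on], x[on]), 1)     # count ON events
    np.add.at(frame[:, :, 1], (y[~on], x[~on]), 1)   # count OFF events
    return frame

# Example: three events, two ON and one OFF.
f = events_to_frame(np.array([10, 11, 10]), np.array([5, 5, 6]), np.array([1, -1, 1]))
```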
- Published
- 2022
- Full Text
- View/download PDF
149. Detecting Specular Reflections and Cast Shadows to Estimate Reflectance and Illumination of Dynamic Indoor Scenes
- Author
-
Eric Marchand, Salma Jiddi, Philippe Robert, Technicolor [Cesson Sévigné], Technicolor, Geomagical Labs, InterDigital R&D France, Sensor-based and interactive robotics (RAINBOW), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SIGNAUX ET IMAGES NUMÉRIQUES, ROBOTIQUE (IRISA-D5), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-CentraleSupélec-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Bretagne Sud (UBS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-RÉALITÉ VIRTUELLE, HUMAINS VIRTUELS, INTERACTIONS ET ROBOTIQUE (IRISA-D5), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), and Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique)
- Subjects
diffuse ,reflectance ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,Virtual reality ,Metaverse ,shadow ,Shadow ,0202 electrical engineering, electronic engineering, information engineering ,[INFO.INFO-RB]Computer Science [cs]/Robotics [cs.RO] ,specular ,Computer vision ,Specular reflection ,mixed reality ,ComputingMethodologies_COMPUTERGRAPHICS ,business.industry ,illumination ,020207 software engineering ,Photometric registration ,Computer Graphics and Computer-Aided Design ,Reflectivity ,Mixed reality ,Range (mathematics) ,Specularity ,Signal Processing ,retexturing ,RGB color model ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,texture ,Software - Abstract
The goal of Mixed Reality (MR) is to achieve a seamless and realistic blending between real and virtual worlds. This requires estimating the reflectance properties and lighting characteristics of the real scene. One of the main challenges in this task is to recover such properties using a single RGB-D camera. In this paper, we introduce a novel framework to recover both the position and color of multiple light sources as well as the specular reflectance of real scene surfaces. This is achieved by detecting and incorporating information from both specular reflections and cast shadows. Our approach can handle any textured surface and considers both static and dynamic light sources. Its effectiveness is demonstrated through a range of applications, including visually consistent mixed reality scenarios (e.g., correct removal of real specularities, shadows coherent in shape and intensity) and retexturing, where the texture of the scene is altered while the incident lighting is preserved.
- Published
- 2022
- Full Text
- View/download PDF
150. Enhancement Layer Coding for Chroma Sub-Sampled Screen Content Video
- Author
-
Andre Kaup, Alexander Gehlert, Benjamin Prestele, and Andreas Heindel
- Subjects
Lossless compression ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,YCbCr ,Media Technology ,Chrominance ,RGB color model ,Codec ,Computer vision ,Artificial intelligence ,Electrical and Electronic Engineering ,Layer (object-oriented design) ,Joint (audio engineering) ,business ,Coding (social sciences) - Abstract
Prevalent video codec implementations deployed in the field often support only chrominance-sub-sampled video data in the YCbCr 4:2:0 format. For certain applications such as screen sharing, however, chroma sub-sampling leads to disturbing artifacts, especially for text or graphics with thin lines. It is desirable to reduce these artifacts while maintaining compatibility with all user devices. For this reason, an enhancement-layer coding framework for chroma-format scalable coding with a specific focus on screen content is proposed in this paper. Based on an analysis of the characteristics of screen content data, the enhancement-layer codec is optimized specifically for this content class, is of low algorithmic complexity, and is applicable with any image or video codec for base-layer compression in a joint coding system. The system is intentionally not designed as a general-purpose lossless YCbCr 4:4:4 coding scheme; instead, it closes a quality gap that prevalent video codecs do not address. Experimental analysis reveals an average BD-PSNR gain in the RGB domain of 1.0 dB when comparing the proposed two-layer scalable coding approach to single-layer compression using base-layer coding only. Relative to simulcast coding, an average BDR RGB difference of 8.1% is observed.
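To see where the artifacts come from, the sketch below converts an RGB image to YCbCr and applies 4:2:0 sub-sampling by averaging each 2x2 block of the chroma planes; this averaging is exactly what blurs thin colored lines in screen content. It uses standard BT.601 full-range coefficients as an illustrative assumption (the paper does not prescribe these):

```python
import numpy as np

def rgb_to_ycbcr420(rgb: np.ndarray):
    """RGB image (H, W, 3, floats in 0..1) -> (Y, Cb, Cr) with 4:2:0 chroma.

    H and W are assumed even; BT.601 full-range coefficients are used."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.5 + (b - y) / 1.772
    cr = 0.5 + (r - y) / 1.402
    # 4:2:0 sub-sampling: average each 2x2 block of the chroma planes.
    def sub(c):
        return c.reshape(c.shape[0] // 2, 2, c.shape[1] // 2, 2).mean(axis=(1, 3))
    return y, sub(cb), sub(cr)

# A 1-pixel-wide red line on white loses chroma sharpness after sub-sampling.
img = np.ones((4, 4, 3))
img[:, 1] = [1.0, 0.0, 0.0]
y, cb, cr = rgb_to_ycbcr420(img)
```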
- Published
- 2022
- Full Text
- View/download PDF