315 results
Search Results
2. Foreword to the special section on best papers of the Eurographics 2022 Education Papers Program
- Author
- Eric Paquette and Jean-Jacques Bourdin
- Subjects
- Human-Computer Interaction, General Engineering, Computer Graphics and Computer-Aided Design
- Published
- 2023
3. GRSI Best Paper Award 2021
- Published
- 2022
- Full Text
- View/download PDF
4. GRSI Best Paper Award
- Published
- 2021
- Full Text
- View/download PDF
5. GRSI Best Paper Award
- Published
- 2021
- Full Text
- View/download PDF
6. GRSI Best Paper Award 2021
- Published
- 2022
- Full Text
- View/download PDF
7. Visualising geospatial time series datasets in realtime with the Digital Earth Viewer.
- Author
- Buck, Valentin, Stäbler, Flemming, Mohrmann, Jochen, González, Everardo, and Greinert, Jens
- Subjects
- Time series analysis, Underwater exploration, Internet servers, Conference papers, Visualization, Geospatial data
- Abstract
A comprehensive study of the Earth System and its different environments requires understanding of multi-dimensional data acquired with a multitude of different sensors or produced by various models. Here we present a component-wise scalable web-based framework for simultaneous visualisation of multiple data sources. It helps contextualise mixed observation and simulation data in time and space. This work is an extended version of the conference paper (Buck et al., 2021).
• Open-source (EUPL) hybrid application for realtime visualisation of 4D geoscientific data.
• Split into native server and web client to utilize the strengths of both platforms.
• Desktop builds are released for Windows, Linux, and macOS.
• Viewer used on expedition cruises to plan underwater exploration missions.
• Presentation and data validation capabilities used by GLODAP. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
8. Computing and analyzing decision boundaries from shortest path maps.
- Author
- Sharma, Ritesh and Kallmann, Marcelo
- Subjects
- Civilian evacuation, Scalar field theory, Emergency management, Topological fields, Data visualization
- Abstract
This paper proposes a methodology for computing, visualizing, and analyzing critical decision boundaries for the selection of shortest paths in a given environment. Decision boundaries are defined as the points in a map from which two or more different shortest paths exist towards a destination. This paper introduces the problem of visualizing their evolution, taking into account moving obstacles, moving goals, and multiple goals. The proposed visualizations enable analyzing which paths should be taken and at which departure times, such that a destination can be reached by the shortest possible path when taking into account a moving target or time-varying areas to be avoided. The proposed techniques are also applied to the analysis and improvement of exit placement in a given environment, in order to improve the evacuation flow in emergency situations.
• This research presents a unique method for detecting decision boundaries in a given environment, based on the analysis of the generator points of the Shortest Path Map (SPM), rather than employing traditional scalar field topological methods that rely on cell neighborhood information, which can be affected by the representation resolution.
• The proposed approach introduces tools and techniques to visualize the evolution of decision boundaries when considering dynamically-changing obstacles and targets, and to design exit placement to equalize the escape flow distribution.
• This novel approach supports decision-making applications related to navigation and environment modeling in emergency evacuation planning.
• By analyzing and visualizing SPM decision boundaries, the lengths of globally-optimal Euclidean shortest paths are taken into account, instead of the grid-based accumulated distances used in other approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
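As an editor's aside, the decision-boundary definition above (points from which two or more different shortest paths exist) can be illustrated with a short sketch. Note the paper computes exact Shortest Path Maps rather than grid distances; this BFS-on-a-grid toy, with all names hypothetical, only shows where nearest-exit choices tie, as in the exit-placement application:

```python
from collections import deque

def bfs_dist(grid, source):
    """4-connected BFS distances on a free-space grid (None = unreachable)."""
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    dist[source[0]][source[1]] = 0
    q = deque([source])
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                q.append((nr, nc))
    return dist

def decision_boundary(grid, exits):
    """Cells whose shortest distances to two (or more) exits tie."""
    fields = [bfs_dist(grid, e) for e in exits]
    boundary = set()
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            ds = sorted(f[r][c] for f in fields if f[r][c] is not None)
            if len(ds) >= 2 and ds[0] == ds[1]:
                boundary.add((r, c))
    return boundary

# 1x7 open corridor with exits at both ends: only the middle cell ties.
grid = [[0] * 7]
b = decision_boundary(grid, [(0, 0), (0, 6)])
```

On this corridor, `b` contains only the centre cell, the point where the two exit choices are equally good.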
9. Image deraining based on dual-channel component decomposition.
- Author
- Lin, Xiao, Xu, Duojiu, Tan, Peiwen, Ma, Lizhuang, and Wang, Zhi-Jie
- Subjects
- Image reconstruction, Image processing, Visibility
- Abstract
Image deraining aims to remove rain streaks from images and reduce the information loss in outdoor images caused by rain. As a fundamental task in image processing, image deraining not only enhances the visibility of images but also provides necessary image restoration for advanced vision tasks. Existing image deraining models mostly train end-to-end by maximizing the similarity between the model's output image and the rain-free ground truth. Although these methods have achieved significant results, they often perform poorly in the face of dense and changing rain streak scenes. In this paper, we propose a novel method, called Dual-Channel Component Decomposition Network (DCD-Net). The basic idea of DCD-Net is to leverage the separability prior of rainy images, treating the rain-free background layer and the rain streak mask layer as two parallel component extraction tasks. To this end, it builds a dual-branch parallel network that extracts the rain-free background image and decouples the reconstruction information of the rain streak mask, respectively. It finally applies composite multi-level contrastive supervision to the output of the dual-branch parallel network, thereby achieving rain streak removal. Extensive experiments on various datasets demonstrate that the proposed model outperforms existing methods in deraining dense rain streak images.
• This paper proposes an image deraining method, called Dual-Channel Component Decomposition Network (DCD-Net).
• DCD-Net treats the rain-free background layer and the rain streak mask layer as two parallel component extraction tasks.
• DCD-Net obtains competitive performance in deraining complex and dense rain streak images. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
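The separability prior the abstract describes — a rainy image decomposing into a rain-free background plus a rain-streak layer, each predicted by its own branch — can be sketched minimally in numpy. This is not the authors' network or loss; it only shows the two supervision terms such a dual-branch decomposition would minimize (names are illustrative):

```python
import numpy as np

def decomposition_losses(rainy, bg_pred, rain_pred, bg_gt):
    """Toy losses for a dual-branch decomposition: one branch predicts the
    rain-free background, the other the rain-streak layer, and together
    they must recompose into the observed rainy image."""
    recompose = np.mean((bg_pred + rain_pred - rainy) ** 2)  # separability prior
    fidelity = np.mean((bg_pred - bg_gt) ** 2)               # background supervision
    return recompose, fidelity

rng = np.random.default_rng(0)
bg = rng.random((4, 4))
rain = rng.random((4, 4)) * 0.1
rainy = bg + rain                      # synthetic rainy observation
rc, fd = decomposition_losses(rainy, bg, rain, bg)
```

A perfect decomposition drives both terms to zero; real training would add the multi-level contrastive supervision the abstract mentions.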
10. W²GAN: Importance Weight and Wavelet feature guided Image-to-Image translation under limited data.
- Author
- Yang, Qiuxia, Pu, Yuanyuan, Zhao, Zhengpeng, Xu, Dan, and Li, Siqi
- Subjects
- Generative adversarial networks, Machine translating
- Abstract
Image-to-Image (I2I) translation methods based on generative adversarial networks (GANs) require large amounts of training data, without which they suffer from over-fitting and training divergence, and the trained models are sub-optimal. In addition, it is very difficult for such models to synthesize high-frequency signals, deteriorating the synthesis quality. To address these issues, this paper proposes W²GAN, which introduces the ideas of the Importance Weight and the Wavelet transformation to achieve I2I translation trained on limited data. Concretely, this paper first alleviates over-fitting and training divergence through an adversarial loss with importance weights, which increases the influence of high-quality generated images while training the generator, thus helping the generator deceive the discriminator. Then, the high-frequency features of the wavelet transformation are applied to the decoder, and wavelet-AdaIN normalization is proposed to prevent a deficiency of high-frequency information; it adaptively integrates high-frequency statistical characteristics from generated features with real-image high-frequency information. Qualitative and quantitative results on the AFHQ and CelebA-HQ datasets demonstrate the merits of W²GAN. Notably, this paper achieves state-of-the-art FID and KID on the AFHQ and CelebA-HQ datasets.
• In this paper, the GANs are trained under limited data. The paper overcomes discriminator over-fitting during training, which otherwise leads to model divergence and degraded results, stabilizing the training process and achieving higher-quality results. From both qualitative and quantitative perspectives, the paper achieves competitive results relative to current research.
• This paper introduces a learnable importance weight in the adversarial loss, so that high-quality generated images exert a higher influence while training the generator. It relieves the problems of training divergence and over-fitting.
• This paper proposes Wavelet-AdaIN Normalization to learn high-frequency features, adaptively integrating high-frequency statistical characteristics from generated features with real-image high-frequency information. It encourages the generator to produce precise high-frequency signals with fine details. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
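One reading of the importance-weighted adversarial loss above: weight each sample's generator loss by how realistic the discriminator already finds it, so high-quality fakes dominate the gradient. The toy numpy sketch below follows that interpretation only — the paper's exact (learnable) weighting scheme may differ:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def weighted_generator_loss(disc_scores):
    """Importance-weighted non-saturating generator loss (toy version):
    samples the discriminator scores as more realistic get larger weights."""
    w = softmax(disc_scores)                                  # importance weights
    per_sample = -np.log(1.0 / (1.0 + np.exp(-disc_scores)))  # -log sigmoid(D(G(z)))
    return float(np.sum(w * per_sample))

# With identical scores the weights are uniform and the loss is -log sigmoid(0).
uniform = weighted_generator_loss(np.zeros(4))
```

With all-zero discriminator scores the weights are uniform and the loss reduces to the plain non-saturating loss, log 2.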
11. LBARNet: Lightweight bilateral asymmetric residual network for real-time semantic segmentation.
- Author
- Hu, Xuegang and Zhou, Baoman
- Subjects
- Data mining, Computer vision, Image segmentation, Visual fields, Pixels, Learning modules
- Abstract
Real-time semantic segmentation, as a key technique for scene understanding, has been an important research topic in the field of computer vision in recent years. However, existing models are unable to achieve good segmentation accuracy on mobile devices due to their huge computational overhead, which makes it difficult to meet actual industrial requirements. To address the problems faced by current semantic segmentation tasks, this paper proposes a lightweight bilateral asymmetric residual network (LBARNet) for real-time semantic segmentation. First, we propose the bilateral asymmetric residual (BAR) module. This module learns multi-scale feature representations with strong semantic information at different stages of the semantic information extraction branch, thus improving pixel classification performance. Second, the spatial information extraction (SIE) module is constructed in the spatial detail extraction branch to capture multi-level local features of the shallow network, compensating for the geometric information lost in the downsampling stage. At the same time, we design the attention mechanism perception (AMP) module in the skip connection part to enhance the contextual representation. Finally, we design the dual branch feature fusion (DBF) module to exploit the correspondence between higher-order and lower-order features, fusing spatial and semantic information appropriately. The experimental results show that LBARNet, without any pre-training or pre-processing and using only 0.6M parameters, achieves 70.8% mIoU and 67.2% mIoU on the Cityscapes and CamVid datasets, respectively. LBARNet maintains high segmentation accuracy while using fewer parameters than most existing state-of-the-art models.
• This paper proposes a Bilateral Asymmetric Residual (BAR) module, a Spatial Information Extraction (SIE) module, an Attention Mechanism Perception (AMP) module, and a Dual Branch Feature Fusion (DBF) module.
• A Lightweight Bilateral Asymmetric Residual Network (LBARNet) for real-time image semantic segmentation is proposed in this article.
• The experimental results show that LBARNet achieves 70.8% and 67.2% segmentation accuracy on two challenging datasets (Cityscapes and CamVid) using only 0.6M parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
12. LFPeers: Temporal similarity search and result exploration.
- Author
- Sachdeva, Madhav, Burmeister, Jan, Kohlhammer, Jörn, and Bernard, Jürgen
- Subjects
- Concept learning, Visual analytics, Behavioral assessment, COVID-19 pandemic, Peers, Natural gas prospecting
- Abstract
In this paper, we introduce a general concept for the analysis of temporal and multivariate data and the system LFPeers, which applies this concept to temporal similarity search and result exploration. The conceptual workflow divides the analysis into two phases: a search phase to find the objects most similar to a query object before a time point t₀ in the temporal data, and an exploration phase to analyze and contextualize this subset of objects after t₀. LFPeers enables users to search for peers through interactive similarity search and filtering, explore interesting behavior of this peer group, and learn from peers through the assessment of diverging behaviors. We present the conceptual workflow to learn from peers and the LFPeers system with novel interfaces for search and exploration in temporal and multivariate data. An earlier workshop publication on LFPeers included a usage scenario targeting epidemiologists and the public who want to learn from the Covid-19 pandemic and distinguish successful from ineffective measures. In this extended paper, we show how our concept is generalized and applied by domain experts in two case studies, including a novel case on stock data. Finally, we reflect on the new state of development and on the insights gained by the experts in the case studies on the search and exploration of temporal data to learn from peers.
• Conceptual framework for temporal and multimodal similarity search and exploration.
• Visual Analytics system for cause–effect analysis on temporal data.
• Interactive user-defined search and exploration of data objects by similarity. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
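In its simplest form, the search phase of the two-phase workflow above (find the most similar objects before t₀, then study them after t₀) reduces to a windowed nearest-neighbour query. A minimal sketch with hypothetical names — LFPeers itself adds interactive filtering and far richer similarity controls:

```python
import numpy as np

def top_k_peers(series, query_id, t0, k):
    """Rank objects by Euclidean distance to the query over the window
    before t0; return the k nearest peer ids (excluding the query)."""
    q = series[query_id][:t0]
    dists = {
        oid: float(np.linalg.norm(ts[:t0] - q))
        for oid, ts in series.items() if oid != query_id
    }
    return sorted(dists, key=dists.get)[:k]

# "b" matches "a" closely before t0=3 even though they diverge afterwards.
series = {
    "a": np.array([1.0, 2.0, 3.0, 9.0]),
    "b": np.array([1.0, 2.1, 3.0, 0.0]),
    "c": np.array([5.0, 5.0, 5.0, 9.0]),
}
peers = top_k_peers(series, "a", t0=3, k=1)
```

The exploration phase would then compare the post-t₀ segments of `peers` against the query, which is exactly where the diverging behaviors become visible.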
13. Deep learning of curvature features for shape completion.
- Author
- Hernández-Bautista, Marina and Melero, Francisco Javier
- Subjects
- Deep learning, Curvature, Geometric surfaces, Surface reconstruction, Inpainting, Parameterization, Surface geometry
- Abstract
The paper presents a novel solution to the issue of incomplete regions in 3D meshes obtained through digitization. Traditional methods for estimating the surface of missing geometry and topology often yield unrealistic outcomes for intricate surfaces. To overcome this limitation, the paper proposes a neural network-based approach that generates points in areas where geometric information is lacking. The method employs 2D inpainting techniques on color images obtained from the original mesh parameterization and curvature values. The network used in this approach can reconstruct the curvature image, which then serves as a reference for generating a polygonal surface that closely resembles the predicted one. The paper's experiments show that the proposed method effectively fills complex holes in 3D surfaces with a high degree of naturalness and detail. This paper improves on the previous work with a more in-depth explanation of the different stages of the approach as well as an extended results section with exhaustive experiments.
• We perform 3D surface reconstructions using generative inpainting techniques.
• 2D representation of a 3D surface geometry based on its curvature.
• Application of a general-purpose neural network for inpainting.
• Our approach requires neither a dataset nor training time.
• Results outperform the state of the art in quality and naturalness of the reconstructions. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
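The pipeline above maps the mesh to a 2D curvature image, inpaints the hole region, and reconstructs the surface from the completed image. The inpainting stage can be caricatured by iterated neighbour averaging — the authors use a general-purpose generative inpainting network, so this diffusion-style fill is only a stand-in for that step:

```python
import numpy as np

def diffuse_fill(img, mask, iters=200):
    """Fill masked (hole) pixels by iterated 4-neighbour averaging — a crude
    stand-in for a generative inpainting stage. Wrap-around at the borders
    (np.roll) is ignored in this toy."""
    out = img.copy()
    out[mask] = out[~mask].mean()              # initialise holes
    for _ in range(iters):
        up    = np.roll(out, -1, 0); down  = np.roll(out, 1, 0)
        left  = np.roll(out, -1, 1); right = np.roll(out, 1, 1)
        avg = (up + down + left + right) / 4.0
        out[mask] = avg[mask]                  # only hole pixels change
    return out

# A flat curvature image with a punched-out hole diffuses back to flat.
img = np.full((8, 8), 0.5)
mask = np.zeros((8, 8), dtype=bool)
mask[3:5, 3:5] = True
img[mask] = 0.0
filled = diffuse_fill(img, mask)
```

A diffusion fill like this smears detail across the hole, which is precisely the limitation that motivates using a learned inpainting network for intricate surfaces.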
14. A systematic review on open-set segmentation.
- Author
- Nunes, Ian, Laranjeira, Camila, Oliveira, Hugo, and dos Santos, Jefersson A.
- Subjects
- Computer vision, Remote sensing, Research personnel, Autonomous vehicles, Visual learning
- Abstract
Open-set semantic segmentation remains a challenging task, due not only to the inherent challenges of pixel-wise classification but also to the precise segmentation of categories not seen during training. The pursuit of this task is rapidly growing in the Computer Vision community, urging the need to organize the literature. In this paper, we extend our previous work by conducting a more comprehensive systematic mapping of the open-set segmentation literature between January 2001 and January 2023 and proposing a novel taxonomy. Our goal is to provide a broad understanding of current trends for the open-set semantic segmentation (OSS) task defined by existing approaches that may influence future methods. By characterizing methodologies in terms of open-set identification strategies, data inputs, and other relevant aspects, we present a structured view of how researchers are advancing the field of open-set semantic segmentation. To the best of the authors' knowledge, this is the first systematic review of OSS methods. Moreover, we apply the proposed taxonomy to selected methods for open-set recognition, outlining important similarities and differences of such a closely related field.
• Systematic review of papers related to open-set semantic segmentation over the past 20 years.
• The proposed taxonomy aims to organize the literature on open-set segmentation.
• Seminal papers on open-set recognition are classified under the proposed taxonomy.
• Applications like autonomous driving and remote sensing were found to commonly resort to the open-set strategy.
• Methods tackling the open-world setting are becoming more common. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
15. Teaching the basics of computer graphics in virtual reality.
- Author
- Heinemann, Birte, Görzen, Sergej, and Schroeder, Ulrik
- Subjects
- Virtual reality, Computers in education, Technological innovations, Computer graphics, School environment, Teleportation
- Abstract
New technology such as virtual reality can help computer graphics education, for example, by providing the opportunity to illustrate challenging 3D procedures. RePiX VR is a virtual reality tool for computer graphics education that focuses on teaching the core ideas of the rendering pipeline. This paper describes the development and two initial evaluations, which aimed to strengthen the usability, review requirements for different stakeholders, and build infrastructure for learning analytics and research. The integration of learning analytics raises the question of appropriate indicators, which is approached through exploratory data analysis. In addition to learning analytics, the evaluation includes quantitative techniques to gain insights about usability, along with didactical feedback. This paper discusses advanced aspects of learning in VR and looks specifically at movement behavior. According to the evaluations, even learners without prior experience can utilize the VR tool to pick up the fundamentals of computer graphics.
• Evaluated educational VR environment for teaching Computer Graphics.
• Teaching Computer Graphics in Virtual Reality is promising.
• Learners have various movement and teleportation patterns and different interaction behavior.
• A comparison of desktop and VR users shows differences between groups, as does a comparison of novices and experts.
• The evaluation contains Multimodal Learning Analytics, Quantitative Feedback, and Usability aspects. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
16. A framework for the efficient enhancement of non-uniform illumination underwater image using convolution neural network.
- Author
- Zhang, Wenbo, Liu, Weidong, Li, Le, Jiao, Huifeng, Li, Yanli, Guo, Liwei, and Xu, Jingming
- Subjects
- Convolutional neural networks, Generative adversarial networks, Light sources, Image enhancement (imaging systems), Color in nature, Gaussian function
- Abstract
In this paper, the non-uniform illumination enhancement problem of underwater images under artificial light source conditions is investigated based on a Convolutional Neural Network (CNN). First, we propose a trainable end-to-end enhancer called NUIENet for enhancing the non-uniform illumination of underwater images. The proposed model consists of a correction network and fusion layers. The correction network adopts an encoder–decoder structure with skip connections to enhance the features of different channels in the HSV domain, and these enhanced features are then fused by the fusion layers to obtain the desired high-quality images. Second, we built an underwater image dataset using a Generative Adversarial Network (GAN) and a Gaussian function. Finally, both qualitative and quantitative experimental results show that the proposed method produces better performance than other state-of-the-art enhancement methods on both real-world and synthetic underwater datasets.
• This paper proposes a CNN-based non-uniform illumination enhancer that uses an encoder–decoder structure with skip connections to enhance underwater images with NUI into the desired high-quality images.
• To boost underwater image processing, we construct a dataset of underwater images with NUI based on a GAN and a Gaussian function, containing NUI images and their corresponding high-quality reference images.
• Compared with other state-of-the-art NUIE methods, the proposed network achieves natural color correction and superior or equivalent visibility improvement. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
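The Gaussian-function half of the dataset construction above — degrading a clean image with a Gaussian-shaped artificial light spot — can be sketched directly; the GAN half of the authors' pipeline is omitted here, and all names are illustrative:

```python
import numpy as np

def add_gaussian_light(img, cx, cy, sigma, strength=1.0):
    """Synthesise non-uniform illumination: modulate a clean grayscale image
    with a 2D Gaussian 'artificial light source' centred at (cx, cy)."""
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w]
    mask = np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
    return np.clip(img * (strength * mask), 0.0, 1.0)

# Brightest under the "lamp" centre, falling off toward the corners.
clean = np.ones((5, 5))
lit = add_gaussian_light(clean, cx=2, cy=2, sigma=1.5)
```

Pairing `lit` with `clean` yields exactly the kind of (NUI image, high-quality reference) training pair the abstract describes.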
17. DCU-NET: Self-supervised monocular depth estimation based on densely connected U-shaped convolutional neural networks.
- Author
- Zheng, Qiumei, Yu, Tao, and Wang, Fenghua
- Subjects
- Convolutional neural networks, Monoculars, Information networks
- Abstract
Depth estimation is crucial for scene understanding and downstream tasks, with self-supervised training methods showing particular potential. The overall structure and local details of the scene are essential for improving the quality of depth estimation. The proposal of Monodepth2 led to significant progress in self-supervised monocular depth estimation. However, Monodepth2 uses the most basic encoder–decoder architecture. The limited data flow information of the network leads to a large semantic gap between the encoder and the decoder, which reduces the accuracy of the network for fine-grained feature recognition. Monodepth2 adopts ResNet18 pre-trained on the ImageNet dataset as the encoder. This traditional convolution-pooling structure results in a loss of pixel information in the network at every scale. To solve this problem, this paper proposes an improved DepthNet. The network adopts HRNet from semantic segmentation as the base encoder, which uses an advanced multi-scale fusion method throughout, thus avoiding the loss of pixel information. An additional densely connected U-Net is employed at the decoder side to provide more information flow. Furthermore, the semantic gap between the encoder and decoder is reduced by adding different numbers of residual connections and channel attention on each layer. The network structure can be regarded as a collection of fully convolutional networks. Since the deep features of the network have a higher correlation with the vertical position, we add a spatial location attention module to the deep-level network to reduce this semantic gap. The approach performs significantly well on the KITTI dataset benchmark, with several performance criteria comparable to supervised monocular depth inference methods.
• This work is a depth estimation network for scene reconstruction and scene understanding. The network redesigns the self-supervised monocular depth framework from an entirely new perspective. It uses HRNet from the field of semantic segmentation as the base encoder, which employs a progressive multi-scale fusion approach throughout, thus avoiding the loss of pixel information.
• An additional densely connected U-Net is used at the decoder side to provide further information flow. To reduce the semantic gap between the encoder and decoder, we add different numbers of residual connections and channel attention on each layer. The network is not trained with the help of auxiliary networks; the performance of the depth estimation is improved only by stimulating the network's potential.
• This work achieves best-in-class accuracy in monocular depth estimation. When the model in this paper is used for 3D scene reconstruction, it can fully recover the scene structure. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
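For context on the self-supervision the abstract builds on: networks in the Monodepth2 line are trained with a photometric reconstruction loss, a weighted SSIM + L1 error between the target frame and a source frame warped by the predicted depth and pose. The sketch below uses global image statistics for SSIM rather than the standard sliding window, so it is a simplification, not the exact published loss:

```python
import numpy as np

def photometric_loss(target, reprojected, alpha=0.85):
    """Monodepth2-style photometric error: alpha-weighted SSIM + L1 between
    the target frame and the reprojected source frame.
    (Simplified: global SSIM statistics instead of a sliding window.)"""
    l1 = np.mean(np.abs(target - reprojected))
    mu_x, mu_y = target.mean(), reprojected.mean()
    var_x, var_y = target.var(), reprojected.var()
    cov = np.mean((target - mu_x) * (reprojected - mu_y))
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return alpha * (1 - ssim) / 2 + (1 - alpha) * l1

# A perfect reprojection gives zero loss.
img = np.linspace(0, 1, 16).reshape(4, 4)
perfect = photometric_loss(img, img)
```

A better depth/pose estimate makes the warped source frame match the target more closely, which is what drives this loss down during training.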
18. BrightFormer: A transformer to brighten the image.
- Author
- Wang, Yong, Li, Bo, and Yuan, Xinlin
- Subjects
- Image intensifiers, Pixels
- Abstract
Low-light image enhancement algorithms need to recover the overall information of images, including local details and global information. However, existing image enhancement methods mainly focus on either local details or global information, and it is challenging to balance the two at the same time. This paper proposes a local dual-branch network (BrightFormer) for image enhancement that combines convolutions and transformers as a solution. The salient features of this paper are: (1) convolution is adopted to refine high-frequency information so that local features are preserved and propagated throughout the network; (2) combining gated parameters with prior information on illumination (ill-map) in self-attention not only improves the flexibility of feature expression but also extracts global features more easily; (3) the obtained local details and global features are fused by spatial and channel attention in a Feature Equalization Fusion Unit (FEFU); (4) a Deep Feedforward Network (DFN) is utilized to encode the location information between adjacent pixels, and the GELU activation function is used to retain useful features and eliminate useless features with an attention-like mechanism. Experimental results show that BrightFormer achieves competitive performance on quantitative metrics and visual perception on datasets such as LOL, MEF, and LIME.
• Using cross-convolution can extract rich local features of the image.
• Incorporates the gating mechanism and prior information on the illumination.
• Different attention mechanisms are adopted for local and global features.
• Uses the GELU activation function as an attention mechanism. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
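Point (4) above — GELU acting as an attention-like gate that keeps useful features and suppresses useless ones — is commonly realized as a gated feed-forward unit: one half of the features, passed through GELU, multiplicatively gates the other half. A numpy sketch of that general pattern, not necessarily the paper's exact DFN:

```python
import numpy as np

def gelu(x):
    """tanh approximation of the GELU activation."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def gated_ffn(x):
    """Gated feed-forward unit: GELU(one half of the channels) multiplicatively
    gates the other half, passing useful features and suppressing the rest."""
    a, b = np.split(x, 2, axis=-1)
    return gelu(a) * b

# A strongly positive gate passes its partner feature; a strongly negative one kills it.
x = np.array([[10.0, -10.0, 1.0, 1.0]])
y = gated_ffn(x)
```

Because GELU is near-zero for large negative inputs and near-identity for large positive ones, the gate behaves like a soft, per-channel attention weight.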
19. Visualization of 3D forest fire spread based on the coupling of multiple weather factors.
- Author
- Meng, Qingkuo, Huai, Yongjian, You, Jiawei, and Nie, Xiaoying
- Subjects
- Forest fires, Weather control, Weather, Visualization, Wind speed, Rainfall
- Abstract
Combustibles, topography, and weather are the three essential factors affecting forest fire behavior, yet current forest fire spread models do not fully consider weather factors. This paper proposes a forest fire spread method based on environmental weather factors to present a visualized simulation of forest fire spread in the natural environment. Forest pyrolysis differs based on water content, so a single-tree pyrolysis model with temperature as its core is constructed to visually describe the differences in forest pyrolysis across seasons. In addition, with the improved Huygens principle as the theoretical basis for forest fire spread, weather factors such as wind speed, wind direction, and precipitation are coupled with the forest fire spread process, and forest fire spread in three-dimensional scenarios is simulated with environmental factors taken into account. The visualization of the fire-extinguishing process caused by precipitation is also realized. Finally, the interaction between rain and snow, terrain, and trees is realized when precipitation affects the corresponding landscape and vegetation texture, enhancing the realism of the constructed forest environment. In short, this paper proposes a forest fire spread method based on environmental weather factors, which intuitively expresses the influence of different weather factors on forest fire spread, thereby improving the immersive experience of the related senses and realizing realistic scene roaming.
• Based on the single-tree pyrolysis model, differentiating forest burning.
• Visualization of the influence of simulated weather factors on forest fire behavior.
• Use of texture mixing technology to construct 3D forest weather scenes. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
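The Huygens-principle step — every point of the fire front expanding as a wind-stretched ellipse, with the new front as the envelope — can be sketched for a discrete front. This is a symmetric toy (a realistic fire ellipse is also offset downwind, and the paper's improved model couples pyrolysis and precipitation); all rate formulas here are illustrative assumptions:

```python
import math

def huygens_step(front, wind_dir, wind_speed, dt=1.0, base_ros=1.0):
    """One Huygens expansion step: each front vertex advances by the radius of
    an ellipse whose long axis follows the wind. Stronger wind -> more
    elongated ellipse. Toy version: symmetric about the ignition point."""
    a = base_ros * (1.0 + wind_speed)   # spread rate along the wind axis
    b = base_ros                        # spread rate across the wind axis
    cx, cy = math.cos(wind_dir), math.sin(wind_dir)
    new_front = []
    for (x, y), theta in front:
        dx, dy = math.cos(theta), math.sin(theta)        # outward direction
        along = dx * cx + dy * cy                        # along-wind component
        cross = -dx * cy + dy * cx                       # cross-wind component
        step = math.hypot(a * along, b * cross) * dt
        new_front.append(((x + step * dx, y + step * dy), theta))
    return new_front

# Point ignition, 8 outward directions, wind blowing along +x.
front = [((0.0, 0.0), i * math.pi / 4) for i in range(8)]
front = huygens_step(front, wind_dir=0.0, wind_speed=2.0)
```

With wind along +x, the downwind vertex travels three times as far as the crosswind ones, producing the elongated front shape that wind coupling is meant to capture.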
20. A novel isosurface segmentation method using common boundary tests.
- Author
- Wang, Cuilan
- Subjects
- Teaching aids, Visualization, Geographic boundaries
- Abstract
Visualizing the isosurfaces that represent material boundaries is an important technique for understanding the features of interest in a scalar volumetric dataset. However, one isosurface may contain multiple types of boundaries, i.e., boundaries between different pairs of materials. In this paper, we present a novel isosurface segmentation method that aids in learning structural information of a dataset by separating the different types of boundaries in one isosurface. This method uses common boundary tests to classify a point on the isosurface. The test determines whether a point on the isosurface lies at a boundary shared by both the isosurface and a reference isosurface. It uses a gradient-guided sampling approach and is based on material boundary properties. A new region growing algorithm is developed to improve the segmentation results. Our new method can also be used to segment an isosurface that passes through both material boundaries and the interior of a material. Two applications of the new method are also demonstrated in the paper. One is to render and segment section planes to enhance visualization. The other is to obtain more accurate and meaningful isosurface statistics.
• Segment an isosurface that contains multiple types of material boundaries.
• Use a region growing approach to improve the isosurface segmentation results.
• Correctly segment the section planes to show the interior structure of the object.
• Segment an isosurface that passes through both material boundaries and material interior.
• Obtain more accurate isosurface statistics by averaging them over different portions of the segmented isosurface. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
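The region-growing pass that improves the segmentation can be sketched generically: flood-fill adjacent isosurface points that share the same boundary classification. The `labels` and `adjacency` inputs below are hypothetical stand-ins for the results of the paper's common boundary tests and the surface connectivity:

```python
from collections import deque

def grow_regions(labels, adjacency):
    """Group points into connected regions whose boundary label agrees —
    a generic region-growing (BFS flood-fill) pass over classified points."""
    region = {}
    next_id = 0
    for seed in labels:
        if seed in region:
            continue
        region[seed] = next_id
        q = deque([seed])
        while q:
            p = q.popleft()
            for n in adjacency.get(p, ()):
                if n not in region and labels[n] == labels[seed]:
                    region[n] = next_id
                    q.append(n)
        next_id += 1
    return region

# 4 points on a chain, labelled A, A, B, B -> two connected regions.
labels = {0: "A", 1: "A", 2: "B", 3: "B"}
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
regions = grow_regions(labels, adjacency)
```

Per-region statistics (the paper's second application) then follow by averaging over each region id instead of over the whole isosurface.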
21. Immersive presentations of real-world medical equipment through interactive VR environment populated with the high-fidelity 3D model of mobile MRI unit
- Author
- Tadeja, Sławomir Konrad, Bohné, Thomas, Godula, Kacper, Cybulski, Artur, and Woźniak, Magdalena Maria
- Published
- 2024
- Full Text
- View/download PDF
22. Developing an immersive virtual farm simulation for engaging and effective public education about the dairy industry.
- Author
- Nguyen, Anh, Francis, Michael, Windfeld, Emma, Lhermie, Guillaume, and Kim, Kangsoo
- Subjects
- Dairy industry, Public education, Dairy farms, Agricultural education, Dairy processing
- Abstract
Growing public interest in understanding the origins and production methods of dairy products, driven by concerns related to environmental impact, local sourcing, and ethics, highlights an important trend. Nevertheless, a knowledge-trust gap persists between consumers and the dairy industry. Addressing this gap, in this paper, we developed an immersive virtual farm simulation to provide realistic on-farm experiences to the public. Within the virtual farm, users can explore various sites where dairy cows are raised and gain insights into dairy production processes using a head-mounted display (HMD). This simulation was demonstrated at local libraries, involving 48 public participants. We collected and analyzed participants' feedback on various aspects, including usability and their overall perceptions, to assess the simulation's effectiveness as an agricultural education tool. We investigated the impact of the virtual experience on participants' perceived knowledge gain and their awareness of the dairy industry. The results indicate that our dairy farm simulation was positively received as an effective tool for public education. Emphasizing the potential of virtual reality (VR) simulations in agricultural education and the industry, we discuss our key findings and future plans.
• Development process to build a realistic immersive simulation of a dairy farm for public education purposes.
• Findings on the usability and user perception of the immersive experience for education purposes.
• The system is an effective and useful tool for learning and the provision of information about the dairy industry. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Mixed reality human teleoperation with device-agnostic remote ultrasound: Communication and user interaction.
- Author
-
Black, David, Nogami, Mika, and Salcudean, Septimiu
- Subjects
- *
MIXED reality , *REMOTE control , *ULTRASONIC imaging , *TELECOMMUNICATION systems , *TELEROBOTICS , *HUMAN beings - Abstract
For many applications, remote guidance and telerobotics provide great advantages. For example, tele-ultrasound can bring much-needed expert healthcare to isolated communities. However, existing tele-guidance methods have serious limitations including either low precision for video conference-based systems, or high complexity and cost for telerobotics. A new concept called human teleoperation leverages mixed reality, haptics, and high-speed communication to provide tele-guidance that gives an expert nearly-direct remote control without requiring a robot. This paper provides an overview of the human teleoperation concept and its application to tele-ultrasound. The concept and its impact are discussed. A new approach to remote streaming and control of point-of-care ultrasound systems independent of their manufacturer is described, as is a high-speed communication system for the HoloLens 2 that is compatible with ResearchMode API sensor stream access. Details of these systems are shown in supplementary video demonstrations. Novel interaction methods enabled by HoloLens 2-based pose tracking are also introduced and tests of the communication and user interaction are presented. The results show continued improvement of the system compared to previous work in instrumentation, HCI, and communication. The system thus has good potential for tele-ultrasound, as well as possible other applications of human teleoperation including remote maintenance, inspection, and training. The remote ultrasound streaming and control application is made available open source. [Display omitted] • System improvements to human teleoperation demonstrate its feasibility. • Other devices such as the Nreal Light can be used for implementation. • New device-agnostic remote ultrasound streaming and control demonstrated. • HoloLens pose tracking enables human teleoperation with limited compute resources. • HoloLens-based communication system provides effective sensor data streaming. 
[ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Local geometry-perceptive mesh convolution with multi-ring receptive field.
- Author
-
Liu, Shanghuan, Chen, Xunhao, Gai, Shaoyan, and Da, Feipeng
- Subjects
- *
COMPUTER vision - Abstract
Learning 3D mesh representations is necessary for many computer vision and graphics tasks. Recently, some works have studied convolution methods for directly processing input meshes. However, these methods are usually weak in extracting local geometry information because of disadvantages such as isotropic filters, neglect of mesh topology, and small convolution fields. In this paper, we introduce a local geometry-perceptive mesh convolution, which attends to irregular mesh structures to efficiently capture geometry features in a multi-ring receptive field. Specifically, we define dynamic neighbor-attention weights for each template node, used in multiple attention aggregation operations to obtain local mesh change information of different vertices in the multi-ring field. After each aggregation, a shared anisotropic filter maps the concatenation of each new vertex and its neighbors to extract geometry features of the current ring. Then, complete local geometry features of each vertex in its large local field are obtained by summing the mapped results of each aggregation. Moreover, the position features of each vertex are added to its local geometry features to get the final representation vector of the vertex. We demonstrate the proposed mesh convolution method's strong ability in modeling 3D mesh shapes. [Display omitted] • Utilizing local irregular mesh structures, the dynamic neighbor-attention weights of template nodes perceive local mesh change information of different vertices. • Multiple attention aggregations and shared anisotropic filters help extract geometry features of large, multi-ring local receptive fields. • The proposed method lets our deep 3D morphable models (3DMMs) have smaller sizes than previous models while achieving state-of-the-art performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Voting-based patch sequence autoregression network for adaptive point cloud completion.
- Author
-
Wu, Hang and Miao, Yubin
- Subjects
- *
POINT cloud , *PETRI nets , *NETWORK performance - Abstract
Point cloud completion aims to estimate the whole shapes of objects from their partial scans. One of the main obstacles preventing current methods from being applied in real-world scenarios is the variety of structural losses in real-scanned objects, which can hardly be fully included and reflected by the training samples. In this paper, we introduce the Patch Sequence Autoregression Network (PSA-Net), a learning-based method that can be trained without partial point clouds in the dataset and is inherently adaptable to input scans with different levels of shape incompleteness: it makes restoring the unseen parts of objects equivalent to predicting the missing tokens in local patch embedding sequences, and such prediction can start from any initial state. Specifically, we first introduce a Sequential Patch AutoEncoder that reconstructs complete point clouds from quantized patch feature sequences. Second, we establish a Mixed Patch Autoregression pipeline that can flexibly infer the whole sequence from any number of known tokens at any positions. Third, we propose a Voting-Based Mapping module that makes input points softly vote for their possible related tokens in sequences based on their local areas, transforming partial point clouds into masked sequences at test time. Quantitative and qualitative evaluations on two synthetic and four real-world datasets illustrate the competitive performance of our network compared with existing approaches. [Display omitted] • A Sequential Patch AutoEncoder for shape generation from quantized feature sequences. • A Mixed Patch Autoregression pipeline for token prediction from any initial state. • A Voting-based Mapping module for transforming partial shapes into sequences. • Competitive performance on two synthetic and four real-world datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. Frequency-aware network for low-light image enhancement.
- Author
-
Shang, Kai, Shao, Mingwen, Qiao, Yuanjian, and Liu, Huan
- Subjects
- *
IMAGE intensifiers , *IMAGE enhancement (Imaging systems) , *IMAGE reconstruction , *FREQUENCY-domain analysis - Abstract
Low-light images often suffer from severe visual degradation, affecting both human perception and high-level computer vision tasks. Most existing methods process images in the spatial domain, making it challenging to simultaneously improve brightness while suppressing noise. In this paper, we present a novel perspective to enhance images based on frequency domain characteristics. Specifically, we reveal that the low-frequency components are closely related to luminance and color, whereas the high-frequency components are not. Based on this observation, we propose the Frequency-aware Network (FaNet) for low-light image enhancement. By selectively adjusting low-frequency components, FaNet preserves more high-frequency details while achieving low-light image enhancement. Additionally, we employ a multi-scale framework and selective fusion for effective feature learning and image reconstruction. Experimental results demonstrate the superiority of the proposed method. [Display omitted] • We reveal that the luminance is closely related to low-frequency components. • We design a frequency-aware network to utilize frequency domain features. • A multi-scale framework and selective fusion is proposed for feature learning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
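The core intuition in the FaNet abstract above — that luminance lives in the low-frequency band, so brightness can be adjusted there without disturbing high-frequency detail — can be sketched with a plain Fourier-domain split. This is an illustrative NumPy reconstruction of the idea only; the hard radial mask, the cutoff value, and the scalar gain are assumptions, not the paper's learned components.

```python
import numpy as np

def split_frequencies(img, cutoff=0.1):
    """Split a grayscale image into low- and high-frequency components
    via a hard radial mask in the Fourier domain (illustrative only)."""
    F = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    radius = np.hypot((yy - h / 2) / h, (xx - w / 2) / w)
    mask = radius <= cutoff                        # low-frequency region
    low = np.fft.ifft2(np.fft.ifftshift(F * mask)).real
    high = img - low                               # residual = high frequencies
    return low, high

def enhance(img, gain=1.8, cutoff=0.1):
    """Brighten by amplifying only the low-frequency (luminance-carrying)
    band, leaving the high-frequency detail band untouched."""
    low, high = split_frequencies(img, cutoff)
    return low * gain + high
```

Because the high-frequency residual is kept unchanged, edges and textures survive the brightening; a learned network like FaNet replaces the fixed `gain` with content-dependent adjustment.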
27. GMM-ICQ: A GMM vertex-optimization-based implicitly-connected quadrilateral format for 3D mesh storage.
- Author
-
Lin, Dayong, Zhao, Chunhui, Tian, Qihang, Xu, Yunfei, Wang, Ruilin, and Qu, Zonghua
- Subjects
- *
GAUSSIAN mixture models , *MICROSOFT Surface (Computer) , *QUADRILATERALS , *COMPUTER graphics , *STORAGE - Abstract
3D meshes are commonly utilized and may be considered the most popular surface representation in computer graphics due to their simplicity, efficiency and flexibility. However, the explicit storage of mesh vertices and connectivity, as in the widely-used PLY and OBJ file formats, leads to substantial memory consumption. This, in turn, directly affects processing and transmission in downstream applications. Though mesh simplification and mesh compression are common strategies to lessen memory consumption, they exhibit inherent limitations: they either struggle to balance accuracy, efficiency, memory usage and mesh quality, or they break the simplicity of explicit storage and struggle to optimize the trade-off between compression performance and computational resource consumption. To overcome these limitations, inspired by the Gaussian Mixture Model (GMM), this paper proposes a GMM vertex-optimization-based implicitly-connected quadrilateral format for 3D mesh storage, named GMM-ICQ. Extensive qualitative and quantitative evaluations demonstrate that the GMM-ICQ format achieves efficient compression by retaining only a small amount of vertex information, while preserving sharp features and maintaining relatively high mesh quality. It also exhibits a certain degree of robustness in the presence of noise interference. Furthermore, benefiting from its inherent grid-based connectivity, the GMM-ICQ format maintains the simplicity of explicit storage and can be implemented as a progressive variant without incurring additional computational overhead. [Display omitted] • We present a GMM vertex-optimization-based implicitly-connected quadrilateral format for 3D mesh storage. • Simultaneously balances accuracy, efficiency, memory usage, and mesh quality. • Preserves the simplicity of explicit storage (such as PLY and OBJ). • No additional computational overhead needed for progressive variant implementation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Linear time manageable edge-aware filtering on complementary tree structures.
- Author
-
Bu, Penghui, Wang, Hang, Yang, Tao, and Zhao, Hong
- Subjects
- *
TIME complexity , *IMAGE denoising , *GEODESIC distance , *SPANNING trees , *TREES - Abstract
Typical non-local edge-aware filtering methods build long-range connections by deriving a minimum spanning tree (MST) from the input image. Each pixel on the MST connects only to a subset of pixels in its 8-connected neighborhood, resulting in piecewise-constant output with fake edges among sub-trees due to unbalanced information propagation along the eight directions. In this paper, we propose two complementary spatial trees to incorporate information from the entire image. The structure of each tree depends on the spatial relationships of neighboring pixels. The distances between any two pixels in both spatial and intensity space are the shortest distances on each tree. We introduce an efficient algorithm to recursively compute the output and the normalization constant on each tree with linear time complexity. For each pixel, we first calculate the outputs from eight subtrees and then fuse them to obtain the result on each tree structure. The final filtering output of our method is the weighted average of the results from the two complementary spatial trees. Moreover, we present a distance mapping scheme to adjust the intensity distance between neighboring pixels, enabling our method to filter out a manageable degree of low-amplitude structures while sharpening major edges. Extensive experiments in graphics applications, such as image denoising, JPEG artifact removal, tone mapping, detail enhancement, and colorization, demonstrate the effectiveness and versatility of our method. [Display omitted] • Novel complementary trees to estimate the geodesic distance between any two pixels. • Efficient algorithms with linear time complexity to compute the weighted average of all pixels in the input image. • A distance mapping scheme in intensity space to manageably filter out low-amplitude structures. • Quantitative evaluation and qualitative comparison on various graphics applications to show the effectiveness and the versatility of our approach. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
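The linear-time recursion described in the abstract above follows the classic tree-filtering pattern: one pass from the leaves to the root and one pass back down give every node the full weighted sum over all other nodes in O(n). Below is a generic sketch of that two-pass aggregation on a rooted tree. The topological node ordering and the exponential weight kernel are simplifying assumptions; the paper's complementary-tree construction, eight-subtree fusion, and distance mapping scheme are not reproduced here.

```python
import math

def tree_filter(values, parent, edge_len, sigma=0.1):
    """O(n) weighted aggregation over a tree: for every node i return
    sum_j exp(-D(i, j) / sigma) * values[j], where D(i, j) is the path
    length between i and j along the tree.  Nodes must be topologically
    ordered: parent[i] < i for i > 0, parent[0] == -1 (root), and
    edge_len[i] is the length of the edge (i, parent[i])."""
    n = len(values)
    w = [0.0] + [math.exp(-edge_len[i] / sigma) for i in range(1, n)]
    up = list(values)                  # aggregate over each node's subtree
    for i in range(n - 1, 0, -1):      # leaves -> root
        up[parent[i]] += w[i] * up[i]
    out = [0.0] * n
    out[0] = up[0]
    for i in range(1, n):              # root -> leaves
        # add the contribution from outside i's subtree, routed through
        # the parent, minus the part of out[parent] that came from i
        out[i] = up[i] + w[i] * (out[parent[i]] - w[i] * up[i])
    return out
```

Running the same recursion on an all-ones signal yields the normalization constant mentioned in the abstract, so the normalized weighted average also costs only two linear passes.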
29. Human image animation via semantic guidance.
- Author
-
Guo, Congwei, Ke, Yongken, Wan, Zhenkai, Jia, Minrui, Wang, Kai, and Yang, Shuai
- Subjects
- *
PARSING (Computer grammar) , *HUMAN body , *HUMAN beings - Abstract
Image animation creates visually compelling effects by animating still source images according to driving videos. Recent work performs animation on arbitrary objects using unsupervised methods and can relatively robustly perform motion transfer on human bodies. However, the complex representation of motion and the unknown correspondence between human bodies often lead to issues such as distorted limbs and missing semantics, which make human animation challenging. In this paper, we propose a semantically guided, unsupervised method of motion transfer that uses semantic information to model motion and appearance. Specifically, we use a pre-trained human parsing network to encode the rich and diverse foreground semantic information, thus generating fine details. Secondly, we use a cross-modal attention layer to learn the correspondence of semantic regions between human bodies, guiding the network to select appropriate input features and generate accurate results. Experiments demonstrate that our method outperforms state-of-the-art methods in motion-related metrics, while effectively addressing the problems of missing semantics and unclear limb structures prevalent in human motion transfer. These improvements can facilitate its applications in various fields, such as education and entertainment. [Display omitted] • Proposes a novel framework for human image animation using semantic features and cross-modal attention. • Introduces semantic segmentation features to represent motion and identity information. • Employs a cross-modal attention mechanism to establish correspondences between semantic regions. • Achieves state-of-the-art performance on complex human motions like TaiChi poses. • Reduces issues of missing semantics and distorted limbs common in human animation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
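The cross-modal attention layer mentioned in the abstract above is, at its core, scaled dot-product attention in which features from one stream (e.g. source-image semantics) query features from another (e.g. the driving frame). A minimal NumPy sketch, assuming flattened per-region feature vectors and omitting the learned projection matrices a real layer would have:

```python
import numpy as np

def cross_attention(q_feats, kv_feats):
    """Minimal scaled dot-product cross-attention: each query row attends
    over all key/value rows and returns a convex combination of the
    values.  Illustrative sketch only, not the paper's exact layer."""
    d = q_feats.shape[-1]
    scores = q_feats @ kv_feats.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)       # softmax over kv rows
    return attn @ kv_feats
```

Because each output row is a softmax-weighted average of the key/value features, regions of the source can "look up" the best-matching semantic regions of the driving frame, which is the correspondence-learning role the abstract describes.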
30. RepDehazeNet: Dual subnets image dehazing network based on structural re-parameterization.
- Author
-
Luo, Xiaozhong, Zhong, Han, Lu, Junjie, Meng, Chen, and Han, Xu
- Subjects
- *
DECODING algorithms , *HAZE , *DEEP learning , *PARAMETERIZATION - Abstract
The field of image dehazing has recently seen rapid progress, and several deep learning techniques have demonstrated remarkable proficiency on homogeneous dehazing problems. Nonetheless, current dehazing approaches are generally formulated for homogeneous haze, an assumption that often breaks down in real-world scenarios because haze dispersion is uncertain. In this paper, we propose a dehazing model named RepDehazeNet that combines a structurally Reparameterization Encoder-Decoder subnet and a Full-Resolution Attention subnet. Specifically, the structural reparameterization idea is introduced into the encoder–decoder subnet to strengthen feature extraction and improve its speed. RepDehazeNet is compared with seven SOTA models on different datasets in terms of PSNR, SSIM, parameter count, and inference time. Compared to the DW-GAN model, the proposed RepDehazeNet reduces the number of parameters by 2.7 million and improves inference speed by 90.3%, while achieving a PSNR 0.5 dB higher on the NH-Haze2021 dataset. The experimental results demonstrate that RepDehazeNet effectively improves both the real-time performance and the accuracy of dehazing on synthetic and nonhomogeneous haze images. [Display omitted] • Structural reparameterization dehazenet: outstanding performance, faster speed. • Replacing Tanh with ReLU leads to better results. • Transfer learning addresses the problem of insufficient samples. • Dual subnets method proves highly effective in datasets of different scales. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
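Structural re-parameterization, as used in RepDehazeNet's encoder–decoder subnet, trains with parallel convolution branches and then folds them into a single convolution for fast inference. The sketch below shows the standard RepVGG-style merge for a single channel, without biases or batch-norm folding — a simplified illustration of the general technique, not the paper's exact blocks.

```python
import numpy as np

def conv2d(x, k):
    """'Same'-padded 2-D cross-correlation (single channel, stride 1)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def merge_branches(k3, k1):
    """Fold a parallel [3x3 conv + 1x1 conv + identity] block into one
    equivalent 3x3 kernel (RepVGG-style, single channel, bias-free)."""
    merged = k3.copy()
    merged[1, 1] += k1[0, 0]   # a 1x1 kernel is a 3x3 kernel with zero border
    merged[1, 1] += 1.0        # identity branch = centred delta kernel
    return merged
```

Because convolution is linear, summing the branch outputs equals convolving once with the summed kernels, so the merged network is mathematically identical but runs a single convolution per block at inference time.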
31. 3D face reconstruction from a single image based on hybrid-level contextual information with weak supervision.
- Author
-
Liu, Yang, Ran, Teng, Yuan, Liang, Lv, Kai, and Zheng, Guoquan
- Subjects
- *
TRANSFORMER models , *SUPERVISION - Abstract
Currently, deep learning-based 3D face reconstruction methods have shown promising results. However, they ignore the contextual information of the face, even though the face is a topologically unified whole. This paper proposes a 3D face reconstruction approach based on hybrid-level contextual information. Firstly, we suggest a regression network with contextual modeling capability at the feature level, PPR-CNet, which adopts preferential parameter regression to regress the 3DMM parameters dynamically based on their various impacts on the reconstructed 3D face. Furthermore, we design a contextual landmark loss to constrain the face geometry at the landmark level. We introduce a differentiable renderer combined with multiple loss functions for weakly-supervised training. Quantitative experiments on two benchmarks show our method outperforms several SOTA methods. Extensive qualitative experiments indicate that our method performs well in terms of realism, facial proportion, and occlusion handling. [Display omitted] • We propose an approach to reconstruct a 3D face from a single image based on hybrid-level contextual information. • We propose a regression network, PPR-CNet, with contextual modeling capability, which regresses 3DMM parameters dynamically. • We design a contextual landmark loss to constrain face geometry employing landmark contextual information. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. α-Curves: Extended log-aesthetic curves with variable shape parameter.
- Author
-
Tsuchie, Shoichi and Yoshida, Norimasa
- Subjects
- *
CURVES , *AUTOMOBILE industry , *CURVATURE - Abstract
This paper proposes a novel curve called the α-curve, which is specified by a variable shape parameter α and can be applied in curve reconstruction involving curve fairing, fitting, and segmentation. In contrast to classical log-aesthetic curves in which α is constant, this study contributes a novel formulation of α that varies monotonically within a specified segment. By introducing the variable α, the logarithmic curvature graph (LCG) becomes piecewise linear and parabolic, or any other function if necessary. Consequently, in comparison to conventional log-aesthetic curves that have limited field usage based on their rigidity, α-curves have various LCGs and flexible representations, thereby opening up many practical applications. Experimental results and comparative studies on curves created by CAD experts in the automotive industry demonstrate the theoretical and practical validity of α-curves. [Display omitted] • α-Curves are proposed by extending the mathematical framework of log-aesthetic curves • An α-curve is specified by a variable shape parameter • α-Curves have various log-curvature graphs (LCG) and flexible representations • α-Curves are applied to a new fairing, avoiding conventional issues of side effects [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. LARFNet: Lightweight asymmetric refining fusion network for real-time semantic segmentation.
- Author
-
Hu, Xuegang and Gong, Juelin
- Subjects
- *
PROBLEM solving , *MULTICASTING (Computer networks) , *PYRAMIDS , *PIXELS - Abstract
In this paper, we propose a lightweight asymmetric refining fusion network (LARFNet) for real-time semantic segmentation, addressing the problem that some existing models cannot achieve good segmentation accuracy at real-time inference speed on mobile devices due to their huge computational overhead. Specifically, LARFNet adopts an asymmetric encoder–decoder structure. The depth-wise separable asymmetric interaction module (DSAI module) is designed for the encoding process; it effectively extracts local and surrounding information under different receptive fields with optimized convolutions while ensuring communication between channels. In the decoder, we design the bilateral pyramid pooling attention module (BPPA module) and the multi-stage refinement fusion module (MRF module). The BPPA module integrates the multi-scale context information of the high-level output. Based on spatial and channel attention mechanisms, the MRF module refines the feature maps of different resolutions and guides the feature fusion. Experimental results show that LARFNet achieves 69.2% mIoU and 65.6% mIoU on the Cityscapes and CamVid datasets at 127 FPS and 222 FPS respectively, using only a single NVIDIA GeForce GTX2080Ti GPU and 0.72M parameters without any pre-training or pre-processing. Compared with most existing state-of-the-art models, the proposed method makes efficient use of network parameters at a faster speed, reduces the number of network parameters, and still achieves good segmentation accuracy. [Display omitted] • A lightweight real-time semantic segmentation network, LARFNet, which jointly considers inference speed, the number of model parameters, and segmentation accuracy. • The DSAI module extracts different features. • The BPPA module provides pixel-level attention for features. • The MRF module guides feature fusion after optimizing features. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
34. Foreword to the Special Section on Smart Tools and Applications in Graphics (STAG 2023).
- Author
-
Capece, Nicola, Lupinetti, Katia, Erra, Ugo, and Banterle, Francesco
- Subjects
- *
SHAPE analysis (Computational geometry) , *RAPID prototyping , *COMPUTATIONAL geometry , *MACHINE learning , *USER experience - Abstract
The Special Section contains extended and revised versions of the best papers presented at the 10th Conference on Smart Tools and Applications in Graphics (STAG 2023), held in Matera on November 16–17, 2023. Four papers were selected by appointed members from the Program Committee; extended versions were submitted and further reviewed by external experts. The result is a rich collection of papers spanning diverse domains: from shape analysis and computational geometry to advanced applications in machine learning, virtual interaction, and digital fabrication. Topics include shape modeling, functional maps, and point clouds, highlighting cutting-edge research in user experience and interaction design. [Display omitted] • 10th Int. Conference on Smart Tools and Applications in Graphics (STAG 2023). • STAG 2023 received 22 submissions, 14 of which were accepted as full papers and 3 as short papers. • Extended versions of 4 selected papers, further reviewed by externals. • Shape analysis, computational geometry, machine learning, virtual interaction. • Digital fabrication, user experience, and computational design. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Assessing the landscape of toolkits, frameworks, and authoring tools for urban visual analytics systems.
- Author
-
Ferreira, Leonardo, Moreira, Gustavo, Hosseini, Maryam, Lage, Marcos, Ferreira, Nivan, and Miranda, Fabio
- Subjects
- *
VISUAL analytics , *DATA analytics , *EXPERTISE , *TAXONOMY , *PROTOTYPES - Abstract
Over the past decade, there has been a significant increase in the development of visual analytics systems dedicated to addressing urban issues. These systems distill intricate urban analysis workflows into intuitive, interactive visual representations and interfaces, enabling users to explore, understand, and derive insights from large and complex data, including street-level imagery, street networks, and building geometries. Developing urban visual analytics systems, however, is a challenging endeavor that requires considerable programming expertise and interaction between various multidisciplinary stakeholders. This situation often leads to monolithic and isolated prototypes that are hard to reproduce, combine, or extend. Concurrently, there has been an increase in the availability of general and urban-specific toolkits, frameworks, and authoring tools that are open source and abstract away the need to implement low-level visual analytics functionalities. This paper provides a hierarchical taxonomy of urban visual analytics systems to contextualize how they are usually designed, implemented, and evaluated. We develop this taxonomy across three distinct levels (i.e. , dimensions, categories, and tags), juxtaposing visualization with analytics, data, and system dimensions. We then assess the extent to which current open-source toolkits, frameworks, and authoring tools can effectively support the development of components tailored to urban visual analytics, identifying their strengths and limitations in addressing the unique challenges posed by urban data. In doing so, we offer a roadmap that can guide the effective employment of existing resources and chart a pathway for developing and refining future systems. [Display omitted] • Review of over 135 papers proposing urban visual analytics systems. • Hierarchical taxonomy of urban visual analytics systems considering over 160 tags. • Evaluation of toolkits, frameworks, and authoring tools for urban visual analytics. 
• We highlight the need for interoperability and sustainable cyberinfrastructures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. SHREC 2024: Recognition of dynamic hand motions molding clay.
- Author
-
Veldhuijzen, Ben, Veltkamp, Remco C., Ikne, Omar, Allaert, Benjamin, Wannous, Hazem, Emporio, Marco, Giachetti, Andrea, LaViola, Joseph J., He, Ruiwen, Benhabiles, Halim, Cabani, Adnane, Fleury, Anthony, Hammoudi, Karim, Gavalas, Konstantinos, Vlachos, Christoforos, Papanikolaou, Athanasios, Romanelis, Ioannis, Fotis, Vlassis, Arvanitis, Gerasimos, and Moustakas, Konstantinos
- Subjects
- *
MIXED reality , *MOTION capture (Human mechanics) , *GESTURE , *RESEARCH teams , *SKELETON - Abstract
Gesture recognition is a tool to enable novel interactions with different techniques and applications, like Mixed Reality and Virtual Reality environments. Despite all the recent advancements in gesture recognition from skeletal data, it is still unclear how well state-of-the-art techniques perform in a scenario involving precise motions with two hands. This paper presents the results of the SHREC 2024 contest organized to evaluate methods for the recognition of highly similar hand motions using the skeletal spatial coordinate data of both hands. The task is the recognition of 7 motion classes given their spatial coordinates in a frame-by-frame motion. The skeletal data has been captured using a Vicon system and pre-processed into a coordinate system using Blender and Vicon Shogun Post. We created a small, novel dataset with a high variety of durations in frames. The paper describes the techniques created by the 5 research groups for this challenging task and compares them to our baseline method. [Display omitted] [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Foreword to the special section on 3D object retrieval 2023 symposium (3DOR2023).
- Author
-
Biasotti, Silvia, Daoudi, Mohamed, Fugacci, Ulderico, Lavoué, Guillaume, and Veltkamp, Remco C.
- Subjects
- *
RESEARCH methodology evaluation , *CONFERENCES & conventions - Abstract
• Foreword to the 16th Eurographics Symposium on 3D Object Retrieval (3DOR2023). • Research into methods and evaluation techniques for shape retrieval are presented. • It contains 9 full technical papers and 3 SHREC full papers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. SketchCleanNet — A deep learning approach to the enhancement and correction of query sketches for a 3D CAD model retrieval system.
- Author
-
Manda, Bharadwaj, Kendre, Prasad Pralhad, Dey, Subhrajit, and Muthuganapathy, Ramanathan
- Subjects
- *
DEEP learning , *ARTIFICIAL neural networks , *COMPUTER vision , *COMPUTER graphics , *ENGINEERING design , *SEARCH algorithms - Abstract
Search and retrieval remains a major research topic in several domains, including computer graphics, computer vision, engineering design, etc. A search engine requires primarily an input search query and a database of items to search from. In engineering, which is the primary context of this paper, the database consists of 3D CAD models, such as washers, pistons, connecting rods, etc. A query from a user is typically in the form of a sketch, which attempts to capture the details of a 3D model. However, sketches have certain typical defects such as gaps, over-drawn portions (multi-strokes), etc. Since the retrieved results are only as good as the input query, sketches need cleaning-up and enhancement for better retrieval results. In this paper, a deep learning approach is proposed to improve or clean the query sketches. Initially, sketches from various categories are analysed in order to understand the many possible defects that may occur. A dataset of cleaned-up or enhanced query sketches is then created based on an understanding of these defects. Consequently, an end-to-end training of a deep neural network is carried out in order to provide a mapping between the defective and the clean sketches. This network takes the defective query sketch as the input and generates a clean or an enhanced query sketch. Qualitative and quantitative comparisons of the proposed approach with other state-of-the-art techniques show that the proposed approach is effective. The results of the search engine are reported using both the defective and enhanced query sketches, and it is shown that using the enhanced query sketches from the developed approach yields improved search results. 
[Display omitted] • The first learning-based strategy to clean rough query sketches of 3D CAD models • Introduces SketchCleanNet — an end-to-end image translation scheme • SketchCleanNet aims to understand the mapping between rough sketches and clean query images • A novel scheme to calculate the loss is introduced • Dataset Contribution: The resulting enhanced query sketch dataset is made available publicly. • This paper will significantly contribute to the research community and give researchers opportunities to develop new algorithms for search and retrieval of 3D mechanical components. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
39. SHREC'22 track: Open-Set 3D Object Retrieval.
- Author
-
Feng, Yifan, Gao, Yue, Zhao, Xibin, Guo, Yandong, Bagewadi, Nihar, Bui, Nhat-Tan, Dao, Hieu, Gangisetty, Shankar, Guan, Ripeng, Han, Xie, Hua, Cong, Hunakunti, Chidambar, Jiang, Yu, Jiao, Shichao, Ke, Yuqi, Kuang, Liqun, Liu, Anan, Nguyen, Dinh-Huan, Nguyen, Hai-Dang, and Nie, Weizhi
- Subjects
- *
OBJECT recognition (Computer vision) , *POINT cloud - Abstract
This paper reports the results of the SHREC'22 track: Open-Set 3D Object Retrieval, the goal of which is to evaluate the performance of different retrieval algorithms under the open-set setting and the modality-missing setting, respectively. Since objects from unseen categories are very common in real-world applications, we design open-set 3D object retrieval to expand the scope of traditional 3D object retrieval. In this track, we generate the open-set 3D object retrieval datasets OS-MN40 and OS-MN40-Miss based on the ModelNet40 dataset, collected for the open-set setting alone and for both the open-set and modality-missing settings, respectively. Both datasets include a training set (2822 objects from 8 categories) and a retrieval set (960 query objects and 8527 target objects from the other 32 categories). The categories of the retrieval (query/target) sets are not seen in the training set. For each object in OS-MN40, four types of modalities, including mesh, point cloud, multi-view, and voxel, are provided. Each object in OS-MN40-Miss is represented with incomplete modality information, collected to simulate retrieval tasks in the real world. This track attracted eight participants from four countries and 191 runs across all submissions. The evaluation results show a promising scenario for open-set retrieval of 3D objects with multi-modal and multi-resolution representations, and reveal interesting insights into retrieving 3D objects from unknown categories. [Display omitted] • This paper reports the results of the SHREC'22 track: Open-Set 3D Object Retrieval. • Retrieval of 3D objects with unknown categories. • Multi-modal and multi-resolution 3D object retrieval. • The modality-missing problem in 3D object retrieval. • Transfer learning and novelty detection in 3D object retrieval. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
40. Probabilistic summarization via importance-driven sampling for large-scale patch-based scientific data visualization.
- Author
-
Yang, Yang, Wu, Yu, and Cao, Yi
- Subjects
- *
SCIENTIFIC visualization , *DATA visualization , *ENTROPY (Information theory) , *DATA reduction - Abstract
Probabilistic summarization is the process of creating compact statistical representations of the original data. It is used for data reduction, and to facilitate efficient post-hoc visualization for large-scale patch-based data generated in parallel numerical simulation. To ensure high reconstruction accuracy, existing methods typically merge and repartition data patches stored across multiple processor cores, which introduces time-consuming processing. Therefore, this paper proposes a novel probabilistic summarization method for large-scale patch-based scientific data. It considers neighborhood statistical properties by importance-driven sampling guided by the information entropy, thus eliminating the requirement of patch merging and repartitioning. In addition, the reconstruction value of a given spatial location is estimated by coupling the statistical representations of each data patch and the sampling results, thereby maintaining high reconstruction accuracy. We demonstrate the effectiveness of our method using five datasets, with a maximum grid size of one billion. The experimental results show that the method presented in this paper reduced the amount of data by about one order of magnitude. Compared with the current state-of-the-art methods, our method had higher reconstruction accuracy and lower computational cost. [Display omitted] • A novel probabilistic summarization method for large-scale patch-based scientific data. • The neighborhood statistical properties are considered by importance-driven sampling guided by the information entropy. • Compared with the current state-of-the-art methods, our method had higher reconstruction accuracy and lower computational cost. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
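The core idea behind the importance-driven sampling above — weighting samples by their information content so that rare, high-entropy values are kept preferentially — can be illustrated with a short sketch. The function name, the global histogram, and the bin count are illustrative assumptions; the paper's per-patch neighborhood statistics and reconstruction step are not reproduced here.

```python
import numpy as np

def entropy_guided_sample(data, n_samples, bins=16, rng=None):
    """Sample indices of `data` with probability proportional to the
    Shannon information (-log p) of each value's histogram bin, so that
    rare (high-information) values are preferentially retained."""
    rng = np.random.default_rng(rng)
    flat = data.ravel()
    hist, edges = np.histogram(flat, bins=bins)
    p = hist / hist.sum()
    # Map each value to its histogram bin (inner edges give 0..bins-1).
    idx = np.clip(np.digitize(flat, edges[1:-1]), 0, bins - 1)
    info = -np.log(p[idx] + 1e-12)          # information content per value
    w = info / info.sum()
    return rng.choice(flat.size, size=n_samples, replace=False, p=w)
```

On a dataset of 990 zeros and 10 rare outliers, a weighted draw of 100 samples retains nearly all outliers, whereas uniform sampling would keep one on average.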
41. Weighted guided image filtering with entropy evaluation weighting.
- Author
-
Jia, Hongbin, Yin, Qingbo, and Lu, Mingyu
- Subjects
- *
IMAGE fusion , *IMAGE denoising , *EXTREME value theory , *HYPERBOLIC functions , *REGULARIZATION parameter , *STATISTICAL weighting , *ENTROPY - Abstract
Although the guided image filter (GIF) is an excellent edge-preserving filter, it generally suffers from halo artifacts due to its local nature and fixed regularization parameter. To address this problem, the weighted guided image filter (WGIF) was proposed, incorporating an edge-aware weighting into the GIF. In the filtering process, WGIF employs an averaging strategy for edge-aware weighting. Although averaging is highly efficient, it is susceptible to extreme values and tends to obscure critical factors, often leading to inaccurate results. Consequently, the quality of the WGIF's output is often degraded. To remedy this deficiency, a weighted guided image filter with entropy evaluation weighting (EEW-WGIF) is proposed in this paper. EEW-WGIF employs an edge-aware weighting strategy based on an entropy evaluation method to detect edges more accurately, and incorporates an explicit constraint based on gradient variation to better preserve edges. To verify the filtering effectiveness of EEW-WGIF, it was applied to edge-preserving smoothing, exposure image fusion, single-image detail enhancement, structure-transferring filtering, and image denoising. Experimental results show that the proposed filter achieves excellent performance in both visual quality and objective evaluation. [Display omitted] • An edge-aware weighting strategy based on an entropy evaluation method is proposed, which is more reliable in calculating the importance of each edge-aware factor. • The proposed EEW-WGIF incorporates an explicit constraint based on gradient variation and a hyperbolic function to handle edges so that they are better preserved. • The EEW-WGIF was applied to edge-preserving smoothing, exposure image fusion, single-image detail enhancement, structure-transferring filtering, and image denoising. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
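The baseline guided image filter that WGIF and EEW-WGIF extend admits a compact NumPy sketch. The `box_mean` helper, window radius, and fixed `eps` are illustrative; the entropy evaluation weighting is the paper's own contribution and is not reproduced here.

```python
import numpy as np

def box_mean(a, r):
    """Mean over a (2r+1)x(2r+1) window via 2D cumulative sums, edge-padded."""
    k = 2 * r + 1
    p = np.pad(a, r, mode='edge')
    c = np.cumsum(np.cumsum(p, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))          # zero row/column for window sums
    s = c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]
    return s / (k * k)

def guided_filter(I, p, r, eps):
    """Standard guided image filter: fit a local linear model q = a*I + b
    in each window, then average the coefficients. WGIF replaces the
    fixed eps with an edge-aware weighted one."""
    mean_I, mean_p = box_mean(I, r), box_mean(p, r)
    cov_Ip = box_mean(I * p, r) - mean_I * mean_p
    var_I = box_mean(I * I, r) - mean_I ** 2
    a = cov_Ip / (var_I + eps)               # per-window linear coefficient
    b = mean_p - a * mean_I
    return box_mean(a, r) * I + box_mean(b, r)
```

Self-guided filtering (`I == p`) smooths flat regions while keeping strong edges, since `a` stays near 1 where local variance dominates `eps`.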
42. An overview of Eulerian video motion magnification methods.
- Author
-
Ahmed, Ahmed Mohamed, Abdelrazek, Mohamed, Aryal, Sunil, and Nguyen, Thanh Thi
- Subjects
- *
CONVOLUTIONAL neural networks , *RANGE of motion of joints , *IMAGE processing , *MOTION - Abstract
The concept of video motion magnification has become increasingly relevant due to its ability to detect small and invisible motions that can be of great value in a variety of applications. A variety of approaches have been developed to magnify these motions and variations. While both Eulerian and Lagrangian processing methods are widely used for motion magnification, Eulerian approaches are more commonly employed due to their lower computational cost. This paper provides an overview of the powerful Eulerian motion magnification techniques. We begin with a brief introduction to technical concepts associated with Eulerian motion techniques such as pyramids and filters in image processing. Additionally, we provide a comparison between the Lagrangian and Eulerian perspectives, followed by a comprehensive overview of the various Eulerian motion magnification (EVM) techniques available. Finally, we present implementation results and a comparative analysis of some of the Eulerian motion techniques. [Display omitted] • Detecting Imperceptible Motions: Explore motion magnification's role in revealing subtle movements. • Demystifying Eulerian Processing: Simplify complex concepts, mathematical foundations. • Real-World Applications: Illustrate motion magnification's utility in healthcare, construction, etc. • Eulerian Advantages: Compare and emphasize the strength of Eulerian over Lagrangian methods. • Comprehensive Survey: Cover a range of Eulerian motion magnification techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
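The Eulerian principle surveyed above — band-pass filtering each pixel's time series and amplifying the result — can be sketched in a few lines. An ideal FFT band-pass is used here for clarity; practical EVM pipelines filter levels of a spatial pyramid, which this sketch omits.

```python
import numpy as np

def eulerian_magnify(frames, fps, f_lo, f_hi, alpha):
    """Amplify subtle temporal variations in a grayscale video volume.

    frames: float array of shape (T, H, W). An ideal temporal band-pass
    (zeroing FFT bins outside [f_lo, f_hi] Hz) isolates the motion band;
    the filtered signal is scaled by alpha and added back to the input.
    """
    T = frames.shape[0]
    freqs = np.fft.rfftfreq(T, d=1.0 / fps)
    spectrum = np.fft.rfft(frames, axis=0)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    spectrum[~band] = 0.0                    # keep only the pass band
    filtered = np.fft.irfft(spectrum, n=T, axis=0)
    return frames + alpha * filtered
```

A 2 Hz oscillation of amplitude 0.01, magnified with `alpha=10` over a 1–3 Hz band, comes out roughly eleven times larger while the static background is untouched.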
43. Self-report user interfaces for patients with Rheumatic and Musculoskeletal Diseases: App review and usability experiments with mobile user interface components.
- Author
-
Nunes, Francisco, Rato Grego, Petra, Araújo, Ricardo, and Silva, Paula Alexandra
- Subjects
- *
RHEUMATISM , *USER interfaces , *MUSCULOSKELETAL system diseases , *PATIENT reported outcome measures , *SELF-evaluation , *IPHONE (Smartphone) - Abstract
Rheumatic and Musculoskeletal Diseases (RMDs) affect 120 million Europeans and are responsible for joint inflammation, stiffness, pain, and fatigue. Patient-Reported Outcome Measures (PROMs), essential to diagnosis and treatment adjustments, are expected to revolutionise rheumatology care if mobile apps reach clinical practice. However, patients often experience finger dexterity issues that can hinder their interaction with mobile apps. This paper investigates the interaction of patients with RMDs with mobile apps for self-report. We started by reviewing existing iPhone and Android apps for RMDs, to identify common user interface (UI) components, and conducted usability experiments with 20 patients with RMDs to record their performance. The usability experiments showed that in-line selectors are the best-performing UI component and that column selectors are considered the most usable by patients. Sliders perform worse than in-line selectors, with significant differences. Results also showed little difference between test conditions aligned with mobile UI design guidelines and those that provided larger or more spaced targets, leading us to conclude that following existing Apple Human Interface Guidelines and Android Material Design will lead to apps with UIs that are appropriate for patients with RMDs. [Display omitted] • In-line selectors are the UI component that affords the best user performance for patients with Rheumatic and Musculoskeletal Diseases. • Column selectors are perceived as the most usable by patients with Rheumatic and Musculoskeletal Diseases. • Sliders perform worse than in-line selectors, with significant differences. • Following UI Apple and Android guidelines is appropriate for patients with RMDs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
44. A multimodal smartwatch-based interaction concept for immersive environments.
- Author
-
Lang, Matěj, Strobel, Clemens, Weckesser, Felix, Langlois, Danielle, Kasneci, Enkelejda, Kozlíková, Barbora, and Krone, Michael
- Subjects
- *
SMARTWATCHES , *EYE tracking , *PERSONAL computers , *UNITS of measurement , *USER experience , *VIRTUAL reality - Abstract
Augmented and Virtual Reality (AR/VR) environments require user interaction concepts beyond the traditional mouse-and-keyboard setup for seated desktop computer usage. Although advanced input modalities such as hand or gaze tracking have been developed, they have yet to be widely adopted in available hardware. Modern smartwatches have been shown to provide a powerful and intuitive means of input, thereby overcoming the limitations of current AR/VR headsets. They typically offer a set of interesting input modalities, such as a touchscreen, rotary buttons, and an Inertial Measurement Unit (IMU), which can be used for mid-air gesture recognition. Compared to other input devices, they have the benefit of being hands-free as soon as the user stops interacting, since they are worn on the wrist. Although many concepts have been proposed, comparative evaluations of their effectiveness and user-friendliness are still rare. In this paper, we evaluate the usability of two commonly found approaches for using a smartwatch as an interaction device, specifically in immersive environments provided by AR/VR HMDs: using the physical inputs of the watch (touchscreen, rotary buttons) or mid-air gestures. We conducted a user study with 20 participants who tested both interaction methods within a prototypical AR application, and we compared the two smartwatch-based interaction concepts in terms of performance and user experience. We found that input using the touchscreen and buttons was generally favored by the participants and led to shorter task completion times. [Display omitted] • We assessed smartwatch interaction: buttons, touchscreens, and gestures for AR/VR. • We implemented an AR app with common concepts and ran a user study (20 participants). • Users preferred touchscreens over gestures; they're faster and less taxing. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
45. Lightweight fully connected network-based fast CU size decision for video-based point cloud compression.
- Author
-
Que, Shicheng and Li, Yue
- Subjects
- *
VIDEO coding , *POINT cloud , *FEATURE extraction , *DECISION making - Abstract
Video-based point cloud compression (V-PCC) utilizes high efficiency video coding (HEVC) to compress the geometry and attribute videos generated from dynamic point cloud projection. However, HEVC's exhaustive coding unit (CU) size decision process is complex and hinders the real-time application of V-PCC. To reduce the coding complexity of V-PCC, this paper proposes a method that combines hand-crafted features and a lightweight neural network to accurately predict the best CU partition in advance. First, we extract hand-crafted features, including direct features (DFs) and an indirect feature (IF), as mixed features. DFs are simple and require no additional calculation, while the IF is obtained indirectly by transforming the global and local distortions of the CU, extracted before the size decision is made. Second, we propose a lightweight fully connected network (LFCN) as the backbone network; the two feature types are used as inputs to the LFCN to predict whether the CU should be split into sub-CUs, and the LFCN can be fully integrated into the encoder with only about 1.58 KB of additional parameters. Experimental results show that the proposed method reduces coding complexity by an average of 51.2% while Luma's BD-TotalRate increases by only 0.1% on average under the All Intra (AI) configuration. [Display omitted] • A lightweight neural network-based fast coding method is proposed for V-PCC. • Direct and indirect features are jointly extracted for fast CU partitioning. • The proposed method can be effectively used in I frames and P frames. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
46. Perceptual thresholds of visual size discrimination in augmented and virtual reality.
- Author
-
Wang, Liwen, Cai, Shaoyu, and Sandor, Christian
- Subjects
- *
DISCRIMINATION against overweight persons , *VISUAL discrimination , *VIRTUAL reality , *AUGMENTED reality , *COMPUTING platforms - Abstract
The perception of the size of virtual objects in Augmented Reality (AR) and Virtual Reality (VR) is not a trivial issue, as the effectiveness of manipulating and interacting with virtual content depends on the accuracy of size perception. However, straightforward comparisons between VR and AR in terms of size perception, needed for a deep understanding of perceptual differences, are still missing. Understanding these perceptual differences can inform designers on how to adapt content when transitioning between these two spatial computing platforms. In this paper, we conducted two psychophysical experiments to measure the perceptual thresholds of size discrimination for virtual objects. Our results indicate that users are more sensitive to size changes in VR than in video see-through AR, suggesting that size differences are easier to perceive in VR than in AR. Additionally, for increases or decreases in size, the accuracy of judgments showed an asymmetric trend in video see-through AR. [Display omitted] • A comparative experiment to understand the difference in size perception in AR vs VR. • The thresholds for size discrimination in AR and VR are not the same. • The accuracy of judgments is asymmetric for increases and decreases of sizes in AR. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
47. Efficient boundary surface reconstruction from multi-label volumetric data with mathematical morphology.
- Author
-
N'Guyen, Franck, Kanit, Toufik, Maisonneuve, F., and Imad, Abdellatif
- Subjects
- *
MATHEMATICAL morphology , *SURFACE reconstruction , *LEANNESS , *CUBES , *AMBIGUITY - Abstract
This paper proposes a new, fully automatic and robust approach to generating triangular meshes directly from volumetric data (scanned images), particularly when these images contain multiple adjacent labels. Current meshing techniques produce a number of mesh elements directly related to the number of components (voxels) in the image, which can be considerable if the image is large. The proposed methodology produces significantly fewer elements than marching cubes methods. It presents no configuration ambiguity and is faithful to the original morphology of the images, regardless of the thinness of the topologies or the presence of erratic morphological configurations that could otherwise lead to geometric interpretation indecision. [Display omitted] • A new, fully automatic and robust approach to generate triangular meshes directly from volumetric data, in particular when these images contain multiple adjacent labels. • The proposed methodology produces significantly fewer elements than marching cubes methods. • The proposed method presents no configuration ambiguity and is faithful to the original morphology of the images, regardless of the thinness of the topologies or the presence of erratic morphological configurations that could otherwise lead to geometric interpretation indecision. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
48. Computational design of planet regolith sampler based on Bayesian optimization.
- Author
-
Li, Mingyu, Zhu, Lifeng, Yan, Yibing, Zhao, Ziyi, and Song, Aiguo
- Subjects
- *
REGOLITH , *STRAINS & stresses (Mechanics) , *GRANULAR materials , *COMPUTER-aided design , *SPACE exploration , *ENGINEERING design - Abstract
Regolith sampling is one of the core missions in deep space exploration. The design, optimization, and fabrication of samplers that meet the requirements of deep space exploration are challenging tasks, often necessitating complex modeling with computer-aided design tools and the expertise of experienced space engineers over lengthy design iterations. We propose an interactive design framework in which designers collaborate with an optimization tool to streamline the design process. As the operator adjusts the design goals, Bayesian optimization automatically suggests the next sets of parameters to explore. This approach is suitable for optimization scenarios in which the design goals cannot be well established as analytical functions and fewer design iterations are desired. In this paper, we design and optimize the core structure of the sampler under both stress analysis and discrete element analysis, considering lower stress, greater sampling volume per unit power consumption, and smaller size. Both simulation and physical experimental results show that the design proposed by our framework outperforms existing designs within a small number of design iterations. [Display omitted] • A computational design framework for searching forms of planetary regolith samplers. • The introduction of Bayesian optimization to reduce the cost of optimizing the shape for efficient interaction with granular material. • Simulation and physical experiments to validate the design proposed by our framework. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
49. Simulating hyperelastic materials with anisotropic stiffness models in a particle-based framework.
- Author
-
Wang, Tiancheng, Xu, Yanrui, Li, Ruolan, Wang, Haoping, Xiong, Yuege, and Wang, Xiaokun
- Subjects
- *
POISSON'S ratio , *HUMAN mechanics , *ENERGY function , *MUSCLE contraction , *ELASTIC scattering - Abstract
We present a particle-based smoothed particle hydrodynamics (SPH) framework for simulating hyperelastic materials with anisotropic stiffness models. While most elastic simulations predominantly rely on mesh-based approaches, such as the Finite Element method, the relationship between Lamé's first parameter and Poisson's ratio complicates the strict enforcement of volume conservation, making it challenging to stabilize simulations for common biological tissues like fat and muscle. In this paper, we couple an implicit divergence-free SPH solver with particle-based deformation gradient computation and apply various elastic energy functions to achieve incompressible elastic simulations. The incompressibility of elastic objects and collisions between different bodies are managed by the implicit SPH algorithm. We further incorporate anisotropic energy functions, constructed from the extrapolation of Cauchy–Green invariants, to introduce anisotropic properties to the objects. By integrating activation and contraction coefficients into the energy functions, particles can simulate muscle contractions and lift heavy objects. Our method can effectively represent elastic objects with varying mechanical properties across different directions and be further employed to mimic muscle contractions. Experiments demonstrate that our approach provides realistic simulations for a wide range of animal and human body movements. [Display omitted] • Leveraging a Lagrangian-based approach for the simulation of anisotropic elasticity. • Integration of Smoothed Particle Hydrodynamics with anisotropic energy functions for advanced modeling. • Adaptable simulation of muscle contraction, effectively mimicking a diverse range of movement behaviors. • Enforcing strict incompressibility in the simulation of muscle-like tissues, ensuring highly accurate representations. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
50. A survey of deep learning methods and datasets for hand pose estimation from hand-object interaction images.
- Author
-
Woo, Taeyun, Park, Wonjung, Jeong, Woohyun, and Park, Jinah
- Subjects
- *
POSE estimation (Computer vision) , *DEEP learning , *JOINTS (Anatomy) , *IMPLICIT functions , *VIRTUAL reality , *COMPUTER vision - Abstract
The research topic of estimating hand pose from images of hand-object interaction has the potential to replicate natural hand behavior in many practical applications of virtual reality and robotics. However, the intricacy of hand-object interaction, combined with mutual occlusion and the need for physical plausibility, brings many challenges to the problem. This paper provides a comprehensive survey of state-of-the-art deep learning-based approaches for estimating hand pose (joint and shape) in the context of hand-object interaction. We discuss various deep learning-based approaches to image-based hand tracking, including hand joint and shape estimation. In addition, we review the hand-object interaction dataset benchmarks that are widely used in hand joint and shape estimation methods. Deep learning has emerged as a powerful technique for solving many problems, including hand pose estimation. While we cover extensive research in the field, we also discuss the remaining challenges leading to future research directions. [Display omitted] • Deep learning is effectively used for estimating hand pose from images. • The correlation between a hand and an object helps in estimating hand-object pose. • A hand model helps estimate hand shape, but it is constrained by the model's prior. • Implicit function methods have emerged in hand-object pose estimation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF