22,698 results on '"Solid Modeling"'
Search Results
2. Transformational Regional-Scale Earthquake Simulations with the DOE EarthQuake SIMulation Exascale Framework
- Author
-
McCallen, David, Pitarka, Arben, Tang, Houjun, Pankajakshan, Ramesh, Petersson, Anders, and Miah, Mamun
- Subjects
Civil Engineering, Information and Computing Sciences, Engineering, Bioengineering, Earthquakes, Computational modeling, Finite element analysis, Motion measurement, Solid modeling, Soil measurement, Seismic waves, US Department of Energy, Exascale computing, Safety, Disaster management, Numerical and Computational Mathematics, Computation Theory and Mathematics, Distributed Computing, Fluids & Plasmas, Information and computing sciences
- Abstract
Earthquakes present a worldwide risk to economic and human safety. The 2023 earthquakes in Turkiye provided a reminder of the potential for catastrophic consequences, with 50,700 deaths and 15.7 million people affected. The ability to predict ground motions and infrastructure damage for earthquakes continues to be a challenging problem for scientists and engineers. Until now, estimates of ground motions have been made empirically from sparse data on past earthquakes. This approach can provide statistical information on intensity amplitudes but cannot provide the site-specific ground-motion estimates essential to developing the most effective resilience. Interest has therefore grown in large-scale computational models that simulate earthquakes at regional scale. The U.S. Department of Energy EarthQuake SIMulation (EQSIM) framework was developed for regional-scale earthquake simulations at unprecedented fidelity, taking advantage of emerging GPU-accelerated systems. This article describes the EQSIM workflow and demonstrates regional-scale simulations with the new computational capability available to scientists in their quest to mitigate future disasters.
- Published
- 2024
3. Explain Vision Focus: Blending Human Saliency Into Synthetic Face Images.
- Author
-
Zhang, Kaiwei, Zhu, Dandan, Min, Xiongkuo, Duan, Huiyu, and Zhai, Guangtao
- Published
- 2025
- Full Text
- View/download PDF
4. Straw: A Stress-Aware WL-Based Read Reclaim Technique for High-Density NAND Flash-Based SSDs.
- Author
-
Chun, Myoungjun, Lee, Jaeyong, Choi, Inhyuk, Park, Jisung, Kim, Myungsuk, and Kim, Jihong
- Abstract
Although read disturbance has emerged as a major reliability concern, managing read disturbance in modern NAND flash memory has not yet been thoroughly investigated. From a device characterization study using real modern NAND flash memory, we observe that reading a page incurs heterogeneous reliability impacts on each word line (WL), which makes the existing block-level read reclaim extremely inefficient. We propose a new WL-level read-reclaim technique, called Straw, which keeps track of the accumulated read-disturbance effect on each WL and reclaims only heavily-disturbed WLs. By avoiding unnecessary read-reclaim operations, Straw reduces read-reclaim-induced page writes by 83.6% with negligible storage overhead. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
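Straw's core bookkeeping, per the abstract above, is to accumulate a read-disturbance estimate per word line (WL) and reclaim only the WLs that cross a threshold, rather than the whole block. A minimal Python sketch of that idea; the neighbor weights and the reclaim threshold are illustrative assumptions, not values from the paper:

```python
from collections import defaultdict

# Hypothetical disturbance weights: reading a WL stresses its immediate
# neighbours most; values are illustrative only, not from the paper.
NEIGHBOR_WEIGHTS = {-2: 0.1, -1: 1.0, 0: 0.2, 1: 1.0, 2: 0.1}
RECLAIM_THRESHOLD = 100_000  # assumed accumulated-disturbance budget


class WordLineReclaimTracker:
    """Tracks accumulated read disturbance for every word line in a block."""

    def __init__(self, num_wls: int):
        self.num_wls = num_wls
        self.disturbance = defaultdict(float)  # WL index -> accumulated stress

    def record_read(self, wl: int) -> None:
        """Credit heterogeneous disturbance to the WLs around the one read."""
        for offset, weight in NEIGHBOR_WEIGHTS.items():
            victim = wl + offset
            if 0 <= victim < self.num_wls:
                self.disturbance[victim] += weight

    def wls_to_reclaim(self) -> list[int]:
        """Return only heavily disturbed WLs, instead of the whole block."""
        return [wl for wl, d in self.disturbance.items()
                if d >= RECLAIM_THRESHOLD]
```

Reclaiming (rewriting) a WL would then reset its counter; block-level read reclaim corresponds to the degenerate case where one WL crossing the threshold triggers a rewrite of all of them.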
5. Boosting Micro-Expression Recognition via Self-Expression Reconstruction and Memory Contrastive Learning.
- Author
-
Bao, Yongtang, Wu, Chenxi, Zhang, Peng, Shan, Caifeng, Qi, Yue, and Ben, Xianye
- Abstract
Micro-expression (ME) is an instinctive reaction that is not controlled by thought. It reveals one's inner feelings, which is significant in sentiment analysis and lie detection. Since micro-expressions appear as subtle facial changes within particular facial action units, learning discriminative and generalized features for Micro-expression Recognition (MER) is challenging. To this end, this paper proposes a novel MER framework that integrates supervised Prototype-based Memory Contrastive Learning (PMCL) for discriminative feature mining and adds Self-expression Reconstruction (SER) as an auxiliary task and regularization for better generalization. In particular, the proposed SER module acts as a regularizer by reconstructing the input ME from randomly dropped patch-wise features in the bottleneck. The PMCL module globally compares historical and current cluster agents learned from training instances to enhance intra-class compactness and inter-class separability. Extensive experiments are conducted on three benchmarks, i.e., SMIC, CASME II, and SAMM, under both the Composite Database Evaluation (CDE) and Single Database Evaluation (SDE) protocols. The results show our method surpasses other state-of-the-art approaches under various evaluation metrics, achieving an overall 86.30% unweighted F1-score and 88.30% unweighted average recall on the composite dataset. Furthermore, the ablation studies verify the effectiveness of our SER for better generalization and PMCL for better discrimination in learning feature representations from limited micro-expression samples. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. AdaMix: Adaptive Resampling of Multiscale Object Mixup for Lidar Data Augmentation.
- Author
-
Zhai, Ruifeng, Gao, Fengli, Guo, Yanliang, Huang, Wuling, Song, Junfeng, Li, Xueyan, and Ma, Rui
- Abstract
Lidar data, which describe the 3D spatial information of the environment in the form of point clouds, play an important role in autonomous driving and related downstream tasks such as 3D object detection. However, unlike image data, lidar data are often very expensive to collect and label. As an effective means of increasing the quantity of annotated data for training deep learning models, data augmentation (DA) has been widely used in the image field, but studies on augmenting lidar point clouds are still at an early stage. In this article, we propose AdaMix, a novel framework for lidar DA via adaptive resampling of multiscale object mixup. AdaMix contains two object mixup schemes, i.e., object-level and part-level mixup, to augment the lidar data with existing object instances from different scenes. For object-level mixup, a learning-based point upsampling operation is employed to obtain a set of dense objects, such as vehicles and pedestrians. For part-level mixup, parts from different vehicles are composed together and upsampled to generate vehicles of complete and dense shape. To mix the dense objects into a new scene, AdaMix introduces a novel projection-based downsampling method that adaptively downsamples the objects based on the location generated by a location sampling module. We evaluate the performance of AdaMix with several 3D object detection models on the KITTI dataset. Experimental results demonstrate that AdaMix consistently surpasses state-of-the-art lidar DA methods in improving the average precision of vehicle and pedestrian detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Computational Study of Chemical Uniformity Impacts on Electrodeposition.
- Author
-
Chalupa, Adam, Warner, Joel, and Martin, Jarett
- Subjects
POWER semiconductors, CHEMICAL processes, COMPUTATIONAL fluid dynamics, LAMINAR flow, REACTIVE power
- Abstract
Industrial semiconductor electrodeposition plating cells require recirculation of process chemicals with consistent flow and minimal contaminants to prevent defects from developing during film deposition. This manuscript investigates how recirculation nozzle quality and nozzle machining can affect bath chemical uniformity. Computational fluid dynamics simulations are utilized to visualize bath chemical velocities under variable nozzle conditions in four case studies. Results show that strict quality control of inlet nozzles, in conjunction with proper mounting angles, induces laminar bath flow. Greater fluid uniformity and laminar flow translate to a reduction in in-line defects and increased wafer yield. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Continuously Controllable Facial Expression Editing in Talking Face Videos.
- Author
-
Sun, Zhiyao, Wen, Yu-Hui, Lv, Tian, Sun, Yanan, Zhang, Ziyang, Wang, Yaoyuan, and Liu, Yong-Jin
- Abstract
Recently, audio-driven talking face video generation has attracted considerable attention. However, very little research addresses the emotional editing of these talking face videos with continuously controllable expressions, which is in strong demand in industry. The challenge is that speech-related expressions and emotion-related expressions are often highly coupled. Meanwhile, traditional image-to-image translation methods do not work well in our application due to the coupling of expressions with other attributes such as pose, i.e., translating the expression of the character in each frame may simultaneously change the head pose because of biases in the training data distribution. In this paper, we propose a high-quality facial expression editing method for talking face videos that allows the user to continuously control the target emotion in the edited video. We present a new perspective on this task as a special case of motion information editing, where we use a 3DMM to capture major facial movements and an associated texture map modeled by a StyleGAN to capture appearance details. Both representations (3DMM and texture map) contain emotional information and can be continuously modified by neural networks and easily smoothed by averaging in coefficient/latent spaces, making our method simple yet effective. We also introduce a mouth shape preservation loss to control the trade-off between lip synchronization and the degree of exaggeration of the edited expression. Extensive experiments and a user study show that our method achieves state-of-the-art performance across various evaluation criteria. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Style-exprGAN: Diverse Smile Style Image Generation Via Attention-Guided Adversarial Networks.
- Author
-
Tu, Ching-Ting and Chen, Kuan-Lin
- Abstract
This article proposes a data-driven approach for generating personalized smile style images for neutral expressions, which aims to produce diverse smile styles while preserving individual features. Unlike other generator models that require expensive manual facial attribute labeling, we designed an auxiliary expression attention Siamese network (EASN) to extract identity-irrelevant facial expression attention regions and guide the proposed two-stage style-expression generative adversarial network (style-exprGAN). The first generator stage generates the overall facial geometry and virtual smile features, while the second stage refines the image quality. Additionally, we introduced traditional geometry warping methods to include registered neutral expression images for consistent transformation and realistic texture fusion. Results show that the proposed method effectively synthesizes realistic and diverse smile styles while preserving individual features. Furthermore, we demonstrate the potential of our data-driven approach by applying the generated personalized smile style images to image augmentation tasks, improving the stability and robustness of facial recognition models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Focus on Cooperation: A Face-to-Face VR Serious Game for Relationship Enhancement.
- Author
-
Bian, Yulong, Zhou, Chao, Zhang, Yang, Liu, Juan, Sheng, Jenny, and Liu, Yong-Jin
- Abstract
Exploring effective approaches to enhance face-to-face interactions and interpersonal relationships is an important topic in applications of affective computing. Following the co-actualization model, we propose a face-to-face co-participation serious game for relationship enhancement, themed around battling COVID-19. A prototype system is developed using an immersive virtual environment and a low-cost brain-computer interface, and a dynamic flow-experience enhancement tool is used to involve partners in the cooperative task. To evaluate the system, two studies are conducted with schoolmates as participants. Study 1 compares the cooperative and competitive modes and demonstrates that the former elicited a higher level of decision-making challenge and affection, which is beneficial for forming relationships. Study 2 further examines the effect of the dynamic flow enhancement tool in the cooperative task, and the results show its effectiveness in promoting flow experience, perceived closeness, and intimacy. Even after this short-term participation, participants felt a greater sense of closeness and intimacy than they had before the test. In conclusion, our proposed system is effective in enhancing schoolmate relationships. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Solid modelling approach for 3D tolerance analysis of linear dimension applied to planar faces in an assembly.
- Author
-
Tounsi, Nejah and Louhichi, Borhen
- Subjects
GEOMETRIC modeling, LINEAR statistical models, CENTROID, SYMMETRY, COMPUTER software
- Abstract
This paper presents a 3D tolerance analysis approach for linear dimensions applied to planar faces in an assembly. Assembly variations are generated and visualized as an explicit geometric stack-up of the component variations using the solid modeller Solidworks®. The feature variations are obtained by adapting the geometric solid model of each component, either by offsetting the target planar face or by tilting it within the tolerance zone. A concept of the Oriented Minimum Bounding Box (OMBB) is introduced to generate individual component variations for any generalized shape of the target planar face. Analysis of the OMBB extents, the tilting angles, and the corresponding pivot points has revealed symmetry in these data. Rigorous mathematical formulations are implemented to handle the general case of large and small displacements. An approach is suggested to evaluate the functional dimensions and the target face's centroid and normal for each assembly variation. Functional dimensions of the assembly variations obtained by the software '3DCS Variation Analyst' were found to deviate from those obtained by the proposed approach by up to 40% of the assembly tolerance size. The 3DCS tool also failed to detect out-of-specification assembly variations that were identified by the proposed approach. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
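A worked example of the tilting geometry behind the OMBB concept in the entry above: if a planar face pivots about one edge of its oriented minimum bounding box, the largest admissible tilt follows directly from the tolerance-zone width and the face extent normal to that edge. This is a small-displacement simplification of my own construction; the paper's formulation also covers large displacements and general pivot points.

```python
import math

def max_tilt_angle(extent_mm: float, zone_mm: float) -> float:
    """Largest tilt (radians) of a planar face about a pivot on one edge of
    its OMBB that keeps the whole face inside a tolerance zone of width
    zone_mm; extent_mm is the OMBB extent perpendicular to the pivot axis."""
    return math.atan2(zone_mm, extent_mm)

# An 80 mm face inside a 0.2 mm zone may tilt by at most ~0.143 degrees.
print(math.degrees(max_tilt_angle(80.0, 0.2)))
```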
12. Anatomical Model Reconstruction (Solid Modeling) Using a Reverse Engineering Approach
- Author
-
Adugna, Yosef W., Kurukkal, Navaneethan S., Lemu, Hirpa G., Correia, José A. F. O., Series Editor, De Jesus, Abílio M. P., Series Editor, Ayatollahi, Majid Reza, Advisory Editor, Berto, Filippo, Advisory Editor, Fernández-Canteli, Alfonso, Advisory Editor, Hebdon, Matthew, Advisory Editor, Kotousov, Andrei, Advisory Editor, Lesiuk, Grzegorz, Advisory Editor, Murakami, Yukitaka, Advisory Editor, Carvalho, Hermes, Advisory Editor, Zhu, Shun-Peng, Advisory Editor, Bordas, Stéphane, Advisory Editor, Fantuzzi, Nicholas, Advisory Editor, Susmel, Luca, Advisory Editor, Dutta, Subhrajit, Advisory Editor, Maruschak, Pavlo, Advisory Editor, Fedorova, Elena, Advisory Editor, Pavlou, Dimitrios, editor, Adeli, Hojjat, editor, Georgiou, Georgios C., editor, Giljarhus, Knut Erik, editor, and Sha, Yanyan, editor
- Published
- 2024
- Full Text
- View/download PDF
13. Utilization of Machine Learning for the Objective Assessment of Rhinoplasty Outcomes
- Author
-
Topsakal, Oguzhan, Dobratz, Eric J, Akbas, Mustafa Ilhan, Dougherty, William M, Akinci, Tahir Cetin, and Celikoyar, Mehmet Mazhar
- Subjects
Information and Computing Sciences, Human-Centred Computing, Bioengineering, Three-dimensional displays, Solid modeling, Nose, Machine learning, Surgery, Artificial intelligence, Machine learning algorithms, evaluation, machine learning, plastic surgery, rhinoplasty, Engineering, Technology, Information and computing sciences
- Published
- 2023
14. Detecting Facial Landmarks on 3D Models Based on Geometric Properties—A Review of Algorithms, Enhancements, Additions and Open-Source Implementations
- Author
-
Topsakal, Oguzhan, Akinci, Tahir Cetin, Murphy, Joshua, Preston, Taylor Lee-James, and Celikoyar, Mehmet Mazhar
- Subjects
Bioengineering, Three-dimensional displays, Solid modeling, Face recognition, Surgery, Facial recognition, Open source software, 3D, landmarks detection, face analysis, geometric, open source, review, Information and Computing Sciences, Engineering, Technology
- Published
- 2023
15. Are 3D Face Shapes Expressive Enough for Recognising Continuous Emotions and Action Unit Intensities?
- Author
-
Tellamekala, Mani Kumar, Sumer, Omer, Schuller, Bjorn W., Andre, Elisabeth, Giesbrecht, Timo, and Valstar, Michel
- Abstract
Recognising continuous emotions and action unit (AU) intensities from face videos requires a spatial and temporal understanding of expression dynamics. Existing works primarily rely on 2D face appearance features to extract such dynamics. This work focuses on a promising alternative based on parametric 3D face alignment models, which disentangle different factors of variation, including expression-induced shape variations. We aim to understand how expressive 3D face shapes are in estimating valence-arousal and AU intensities compared to state-of-the-art 2D appearance-based models. We benchmark five recent 3D face models: ExpNet, 3DDFA-V2, RingNet, DECA, and EMOCA. In valence-arousal estimation, expression features of 3D face models consistently surpassed previous works, yielding average concordance correlations of .745 and .574 on the SEWA and AVEC 2019 CES corpora, respectively. We also study how 3D face shapes perform on AU intensity estimation on the BP4D and DISFA datasets, and report that 3D face features were on par with 2D appearance features in recognising AUs 4, 6, 10, 12, and 25, but not the entire set of AUs. To understand this discrepancy, we conduct a correspondence analysis between valence-arousal and AUs, which points out that accurate prediction of valence-arousal may require knowledge of only a few AUs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
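The concordance correlation reported in the entry above is Lin's concordance correlation coefficient (CCC), the standard metric for continuous valence-arousal estimation; unlike Pearson correlation it also penalizes systematic bias and scale mismatch. A minimal sketch:

```python
import numpy as np

def concordance_correlation(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Lin's concordance correlation coefficient between two 1-D traces."""
    mu_t, mu_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = ((y_true - mu_t) * (y_pred - mu_p)).mean()
    return float(2.0 * cov / (var_t + var_p + (mu_t - mu_p) ** 2))
```

Identical traces score 1.0; shifting or rescaling the predictions lowers the CCC even when the Pearson correlation stays at 1.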
16. Generating Multiple 4D Expression Transitions by Learning Face Landmark Trajectories.
- Author
-
Otberdout, Naima, Ferrari, Claudio, Daoudi, Mohamed, Berretti, Stefano, and Bimbo, Alberto Del
- Abstract
In this article, we address the problem of 4D facial expression generation. This is usually addressed by animating a neutral 3D face to reach an expression peak and then returning to the neutral state. In the real world, though, people show more complex expressions and switch from one expression to another. We thus propose a new model that generates transitions between different expressions and synthesizes long, composed 4D expressions. This involves three sub-problems: (1) modeling the temporal dynamics of expressions, (2) learning transitions between them, and (3) deforming a generic mesh. We propose to encode the temporal evolution of expressions using the motion of a set of 3D landmarks, which we learn to generate by training a manifold-valued GAN (Motion3DGAN). To allow the generation of composed expressions, this model accepts two labels encoding the starting and ending expressions. The final sequence of meshes is generated by a Sparse2Dense mesh Decoder (S2D-Dec) that maps the landmark displacements to dense, per-vertex displacements of a known mesh topology. By explicitly working with motion trajectories, the model is fully independent of identity. Extensive experiments on five public datasets show that our proposed approach brings significant improvements over previous solutions, while retaining good generalization to unseen data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. A Unified Machine Learning Through Focus Resist 3-D Structure Model.
- Author
-
Xia, Mingyang, Yan, Yan, Li, Chen, and Shi, Xuelong
- Subjects
CONVOLUTIONAL neural networks, MACHINE learning, DEEP learning, CROSS correlation, LITHOGRAPHY
- Abstract
To ensure post-OPC data quality, examination based on estimated resist contours at the resist bottom alone is insufficient; reliable prediction of lithography performance within the process window must rely on complete information about on-wafer resist 3D structures. In this regard, a resist 3D structure model, in particular a through-focus resist 3D structure model with full-chip capability, will be the ultimate model in demand. To develop machine learning resist 3D structure models, we have proposed a physics-based information encoding scheme, together with a carefully chosen deep convolutional neural network and model training strategies. Our proposed through-focus resist 3D structure model is based on a conditional U-net structure, with the first five eigen images as the model's main inputs and the focus setting as the conditional input. The average normalized cross correlation (NCC), or mean structural similarity index, between ground truth and model-predicted resist 3D structures can reach 0.92. With a single GPU (Tesla M60), the model takes 6.1 ms to produce a resist 3D structure covering an area of 1.8 µm × 1.8 µm. The model is fast enough that it can be engineered for full-chip implementation, and it extends the capability of detecting process-window-aware, resist-loss-related lithography hotspots. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
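The 0.92 figure in the entry above is an average normalized cross correlation (NCC) between ground-truth and predicted resist volumes. A plain global zero-mean NCC over two equally shaped 3D arrays might look as follows; whether the paper computes it globally or per pattern before averaging is an assumption here:

```python
import numpy as np

def normalized_cross_correlation(a: np.ndarray, b: np.ndarray) -> float:
    """Global zero-mean NCC between two equally shaped resist 3D volumes."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0.0 else 0.0
```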
18. A Deep Invertible 3-D Facial Shape Model for Interpretable Genetic Syndrome Diagnosis
- Author
-
Bannister, Jordan J, Wilms, Matthias, Aponte, J David, Katz, David C, Klein, Ophir D, Bernier, Francois PJ, Spritz, Richard A, Hallgrímsson, Benedikt, and Forkert, Nils D
- Subjects
Information and Computing Sciences, Human-Centred Computing, Bioengineering, Diagnosis, Computer-Assisted, Face, Humans, Three-dimensional displays, Solid modeling, Surface morphology, Shape, Genetics, Data models, Morphology, Genetic syndrome, normalizing flow, interpretable machine learning, 3D shape model, Engineering, Medical and Health Sciences, Medical Informatics, Health services and systems, Applied computing
- Abstract
One of the primary difficulties in treating patients with genetic syndromes is diagnosing their condition. Many syndromes are associated with characteristic facial features that can be imaged and utilized by computer-assisted diagnosis systems. In this work, we develop a novel 3D facial surface modeling approach with the objective of maximizing diagnostic model interpretability within a flexible deep learning framework. Therefore, an invertible normalizing flow architecture is introduced to enable both inferential and generative tasks in a unified and efficient manner. The proposed model can be used (1) to infer syndrome diagnosis and other demographic variables given a 3D facial surface scan and (2) to explain model inferences to non-technical users via multiple interpretability mechanisms. The model was trained and evaluated on more than 4700 facial surface scans from subjects with 47 different syndromes. For the challenging task of predicting syndrome diagnosis given a new 3D facial surface scan, age, and sex of a subject, the model achieves a competitive overall top-1 accuracy of 71%, and a mean sensitivity of 43% across all syndrome classes. We believe that invertible models such as the one presented in this work can achieve competitive inferential performance while greatly increasing model interpretability in the domain of medical diagnosis.
- Published
- 2022
19. An Ex Vivo Study of Outward Electrical Impedance Tomography (OEIT) for Intravascular Imaging
- Author
-
Luo, Yuan, Huang, Dong, Huang, Zi-Yu, Hsiai, Tzung K, and Tai, Yu-Chong
- Subjects
Engineering, Biomedical Engineering, Heart Disease, Atherosclerosis, Biomedical Imaging, Heart Disease - Coronary Heart Disease, Bioengineering, Cardiovascular, 4.2 Evaluation of markers and technologies, Detection, screening and diagnosis, Animals, Electric Impedance, Phantoms, Imaging, Swine, Tomography, Tomography, X-Ray Computed, Imaging, Electrical impedance tomography, Conductivity, Lesions, Electrodes, Solid modeling, electrical impedance tomography, intravascular imaging, intravascular navigation, Artificial Intelligence and Image Processing, Electrical and Electronic Engineering, Biomedical engineering, Electronics, sensors and digital hardware, Computer vision and multimedia computation
- Abstract
Objective: Atherosclerosis is a chronic immuno-inflammatory condition emerging in arteries and considered the cause of a myriad of cardiovascular diseases. Atherosclerotic lesion characterization through invasive imaging modalities is essential in disease evaluation and determining intervention strategy. Recently, electrical properties of the lesions have been utilized in assessing their vulnerability, mainly owing to their capability to differentiate lipid content existing in the lesion, albeit with limited detection resolution. Electrical impedance tomography is the natural extension of conventional spectrometric measurement: by incorporating a larger number of interrogating electrodes and advanced algorithms, it achieves imaging of target objects and thus provides significantly richer information. It is within this context that we develop Outward Electrical Impedance Tomography (OEIT), aimed at intravascular imaging for atherosclerotic lesion characterization. Methods: We utilized flexible electronics to establish the 32-electrode OEIT device with an outward-facing configuration suitable for imaging of vessels. We conducted comprehensive studies through a simulation model and an ex vivo setup to demonstrate the functionality of OEIT. Results: Quantitative characterization of OEIT regarding its proximity sensing and conductivity differentiation was achieved using well-controlled experimental conditions. The imaging capability of OEIT was further verified with a phantom setup using porcine aorta to emulate the in vivo environment. Conclusion: We have successfully demonstrated a novel tool for intravascular imaging, OEIT, with unique advantages for atherosclerosis detection. Significance: This study demonstrates for the first time a novel electrical tomography-based platform for intravascular imaging, and we believe it paves the way for further adaptation of OEIT for intravascular detection in more translational settings and offers great potential as an alternative imaging tool for medical diagnosis.
- Published
- 2022
20. A Miniature Flexible Coil for High-SNR MRI of the Pituitary Gland
- Author
-
Lin, Jiahao, Liu, Siyuan, Bergsneider, Marvin, Hadley, J Rock, Prashant, Giyarpuram N, Peeters, Sophie, Candler, Robert N, and Sung, Kyunghyun
- Subjects
Engineering, Brain Cancer, Rare Diseases, Bioengineering, Biomedical Imaging, Cancer, Brain Disorders, Neurosciences, Magnetic resonance imaging, Signal to noise ratio, Pituitary gland, Phantoms, Solid modeling, Radio frequency, Numerical models, Flexible RF-coil, miniature, coil simulation, pituitary microadenomas, signal-to-noise ratio, high-resolution, endoscopic endonasal surgery, Information and Computing Sciences, Technology, Information and computing sciences
- Published
- 2022
21. Improved Gamma-Ray Point Source Quantification in Three Dimensions by Modeling Attenuation in the Scene
- Author
-
Bandstra, MS, Hellfeld, D, Vavrek, JR, Quiter, BJ, Meehan, K, Barton, PJ, Cates, JW, Moran, A, Negut, V, Pavlovsky, R, and Joshi, THY
- Subjects
Maximum likelihood estimation, Maximum likelihood detection, Solid modeling, Parameter estimation, Laser radar, Gamma-rays, Attenuation, Attenuation correction, gamma-ray imaging, maximum likelihood estimation, radiological source localization, radiological source search, Atomic, Molecular, Nuclear, Particle and Plasma Physics, Other Physical Sciences, Biomedical Engineering, Nuclear & Particles Physics
- Abstract
Using a series of detector measurements taken at different locations to localize a source of radiation is a well-studied problem. The source of radiation is sometimes constrained to a single point-like source, in which case the location of the point source can be found using techniques such as maximum likelihood. Recent advancements have shown the ability to locate point sources in 2-D and even 3-D but few have studied the effect of intervening material on the problem. In this work, we examine gamma-ray data taken from a freely moving system and develop voxelized 3-D models of the scene using data from its onboard light detection and ranging (LiDAR) unit. Ray casting is used to compute the distance each gamma ray travels through the scene material, which is then used to calculate attenuation assuming a single attenuation coefficient for solids within the geometry. Parameter estimation using maximum likelihood is performed to simultaneously find the attenuation coefficient, source activity, and source position that best match the data. Using a simulation, we validate the ability of this method to reconstruct the true location and activity of a source, along with the true attenuation coefficient of the structure it is inside, and then we apply the method to measured data with sources and find good agreement.
- Published
- 2021
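The estimation problem in the entry above is a joint maximum-likelihood fit of source position, activity, and a single attenuation coefficient, with material path lengths supplied by ray casting through the LiDAR-derived voxel model. A toy grid-search version under Poisson counting statistics; the `path_len` callable and the candidate grids are stand-ins for the paper's ray caster and scene voxels, and detector efficiency is folded into the activity:

```python
import numpy as np

def fit_point_source(det_pos, counts, path_len, candidates, mu_grid):
    """Grid-search ML estimate of (position, activity, attenuation coeff.).

    det_pos    -- (N, 3) detector positions along the measurement path
    counts     -- (N,)   measured gamma-ray counts, assumed Poisson
    path_len   -- callable(src, det) -> material path length (ray casting)
    candidates -- (M, 3) candidate source positions
    mu_grid    -- 1-D array of candidate attenuation coefficients
    """
    best = (None, None, None, -np.inf)
    for src in candidates:
        r2 = np.sum((det_pos - src) ** 2, axis=1)   # inverse-square falloff
        d = np.array([path_len(src, p) for p in det_pos])
        for mu in mu_grid:
            s = np.exp(-mu * d) / r2                # per-pose sensitivity
            activity = counts.sum() / s.sum()       # closed-form Poisson MLE
            lam = activity * s
            loglike = np.sum(counts * np.log(lam) - lam)
            if loglike > best[-1]:
                best = (src, activity, mu, loglike)
    return best  # (position, activity, attenuation coefficient, log-likelihood)
```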
22. One Transform to Compute Them All: Efficient Fusion-Based Full-Reference Video Quality Assessment.
- Author
-
Venkataramanan, Abhinau K., Stejerean, Cosmin, Katsavounidis, Ioannis, and Bovik, Alan C.
- Subjects
VIDEO compression, STREAMING media, VIDEOS, MASS media industry, CONTRAST sensitivity (Vision), SOCIAL media
- Abstract
The Visual Multimethod Assessment Fusion (VMAF) algorithm has recently emerged as a state-of-the-art approach to video quality prediction, that now pervades the streaming and social media industry. However, since VMAF requires the evaluation of a heterogeneous set of quality models, it is computationally expensive. Given other advances in hardware-accelerated encoding, quality assessment is emerging as a significant bottleneck in video compression pipelines. Towards alleviating this burden, we propose a novel Fusion of Unified Quality Evaluators (FUNQUE) framework, by enabling computation sharing and by using a transform that is sensitive to visual perception to boost accuracy. Further, we expand the FUNQUE framework to define a collection of improved low-complexity fused-feature models that advance the state-of-the-art of video quality performance with respect to both accuracy, by 4.2% to 5.3%, and computational efficiency, by factors of 3.8 to 11 times! [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. An Ensemble Learning Algorithm for Cognitive Evaluation by an Immersive Virtual Reality Supermarket.
- Author
-
Wang, Yifan, Yang, Ping, Yu, Jiangtao, Zhang, Shang, Gong, Liang, Liu, Chunfeng, Zhou, Wenjun, and Peng, Bo
- Subjects
MACHINE learning, ENSEMBLE learning, MILD cognitive impairment, OLDER people, COGNITIVE testing
- Abstract
Early screening for Mild Cognitive Impairment (MCI) is crucial in delaying cognitive deterioration and treating dementia. Conventional neuropsychological tests, commonly used for MCI detection, often lack ecological validity due to their simplistic and quiet testing environments. To address this gap, our study developed an immersive VR supermarket cognitive assessment program (IVRSCAP), simulating daily cognitive activities to enhance the ecological validity of MCI detection. This program involved elderly participants from Chengdu Second People’s Hospital and various communities, comprising both MCI patients (N=301) and healthy elderly individuals (N=1027). They engaged in the VR supermarket cognitive test, generating complex datasets including User Behavior Data, Tested Cognitive Dimension Game Data, Trajectory Data, and Regional Data. To analyze this data, we introduced an adaptive ensemble learning method for imbalanced samples. Our study’s primary contribution is demonstrating the superior performance of this algorithm in classifying MCI and healthy groups based on their performance in IVRSCAP. Comparative analysis confirmed its efficacy over traditional imbalanced sample processing methods and classic ensemble learning voting algorithms, significantly outperforming in metrics such as recall, F1-score, AUC, and G-mean. Our findings advocate the combined use of IVRSCAP and our algorithm as a technologically advanced, ecologically valid approach for enhancing early MCI detection strategies. This aligns with our broader aim of integrating realistic simulations with advanced computational techniques to improve diagnostic accuracy and treatment efficacy in cognitive health assessments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
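The abstract above does not spell out the adaptive ensemble algorithm, so as a stand-in the sketch below shows the balanced-bagging recipe (class-balanced bootstrap resampling plus soft voting) that imbalanced-sample ensembles of this kind typically build on; class labels are assumed to be 0..C-1:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def balanced_bagging(X, y, n_estimators=25, seed=0):
    """Fit one tree per class-balanced bootstrap sample of the training set."""
    rng = np.random.default_rng(seed)
    per_class = np.bincount(y).min()   # size of the rarest class (e.g. MCI)
    models = []
    for _ in range(n_estimators):
        idx = np.concatenate([
            rng.choice(np.flatnonzero(y == c), size=per_class, replace=True)
            for c in np.unique(y)
        ])
        models.append(DecisionTreeClassifier(random_state=seed).fit(X[idx], y[idx]))
    return models

def soft_vote(models, X):
    """Average per-tree class probabilities and predict the argmax class."""
    probs = np.mean([m.predict_proba(X) for m in models], axis=0)
    return probs.argmax(axis=1)
```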
24. M2M-InvNet: Human Motor Cortex Mapping From Multi-Muscle Response Using TMS and Generative 3D Convolutional Network.
- Author
-
Akbar, Md Navid, Yarossi, Mathew, Rampersad, Sumientra, Lockwood, Kyle, Masoomi, Aria, Tunik, Eugene, Brooks, Dana, and Erdogmus, Deniz
- Subjects
CONVOLUTIONAL neural networks, TRANSCRANIAL magnetic stimulation, EVOKED potentials (Electrophysiology), MOTOR neurons, ELECTRIC fields
- Abstract
Transcranial magnetic stimulation (TMS) is often applied to the motor cortex to stimulate a collection of motor evoked potentials (MEPs) in groups of peripheral muscles. The causal interface between TMS and MEP is the selective activation of neurons in the motor cortex; moving around the TMS ‘spot’ over the motor cortex causes different MEP responses. A question of interest is whether a collection of MEP responses can be used to identify the stimulated locations on the cortex, which could potentially be used to then place the TMS coil to produce chosen sets of MEPs. In this work we leverage our previous report on a 3D convolutional neural network (CNN) architecture that predicted MEPs from the induced electric field, to tackle an inverse imaging task in which we start with the MEPs and estimate the stimulated regions on the motor cortex. We present and evaluate five different inverse imaging CNN architectures, both conventional and generative, in terms of several measures of reconstruction accuracy. We found that one architecture, which we propose as M2M-InvNet, consistently achieved the best performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Omnidirectional Video Super-Resolution Using Deep Learning.
- Author
-
Agrahari Baniya, Arbind, Lee, Tsz-Kwan, Eklund, Peter W., and Aryal, Sunil
- Published
- 2024
- Full Text
- View/download PDF
26. Fluid-Structure Interaction Within Models of Patient-Specific Arteries: Computational Simulations and Experimental Validations.
- Author
-
Schoenborn, Sabrina, Pirola, Selene, Woodruff, Maria A., and Allenby, Mark C.
- Abstract
Cardiovascular disease (CVD) is the leading cause of mortality worldwide and its incidence is rising due to an aging population. The development and progression of CVD is directly linked to adverse vascular hemodynamics and biomechanics, whose in-vivo measurement remains challenging but can be simulated numerically and experimentally. The ability to evaluate these parameters in patient-specific CVD cases is crucial to better predict future disease progression, risk of adverse events, and treatment efficacy. While significant progress has been made toward patient-specific hemodynamic simulations, blood vessels are often assumed to be rigid, which does not consider the compliant mechanical properties of vessels whose malfunction is implicated in disease. In an effort to simulate the biomechanics of flexible vessels, fluid-structure interaction (FSI) simulations have emerged as promising tools for the characterization of hemodynamics within patient-specific cardiovascular anatomies. Since FSI simulations combine the blood's fluid domain with the arterial structural domain, they pose novel challenges for their experimental validation. This paper reviews the scientific work related to FSI simulations for patient-specific arterial geometries and the current standard of FSI model validation including the use of compliant arterial phantoms, which offer novel potential for the experimental validation of FSI results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Impact of Dynamic Tip-Surface Interactions on Microdroplet Formation via Fluid Force Microscopy.
- Author
-
Yang, Zhuobo, Zhang, Xianmin, Li, Hai, Zhu, Benliang, Feng, Ke, and Wang, Rixin
- Abstract
The emergence of FluidFM technology has presented a novel opportunity for selective aspiration and distribution of liquids at the sub-micron scale. However, precise control over droplet generation volume remains a challenging issue. This study investigates the impact of dynamic parameters between the tip and surface on droplet formation. Initially, we establish a finite element model to simulate the process from the formation of a liquid bridge between the hollow cantilever probe and the substrate to the rupture resulting in droplet formation. Additionally, we propose an image-based method for quantifying droplet volume. Subsequently, we construct a comprehensive experimental framework to delve into the influence of these parameters on microdroplet formation by monitoring the dynamic interaction parameters between the probe tip and the liquid surface. Finally, we optimize the parameters during the droplet generation process, enabling the formation of uniform arrays of microdroplets under controlled experimental conditions. Our study provides a reference for automation parameters of serial experiments and improves experimental throughput. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Pulse Wave Modeling Using Bio-Impedance Simulation Platform Based on a 3D Time-Varying Circuit Model
- Author
-
Ibrahim, Bassem, Hall, Drew A, and Jafari, Roozbeh
- Subjects
Engineering, Electronics, Sensors and Digital Hardware, Biomedical Engineering, Bioengineering, Cardiovascular, Rehabilitation, Detection, screening and diagnosis, 4.1 Discovery and preclinical testing of markers and technologies, Electric Impedance, Electrodes, Heart Rate, Pulse, Pulse Wave Analysis, Integrated circuit modeling, Solid modeling, Biological system modeling, Arteries, Blood, Sensors, Impedance, Bio-impedance, simulation, 3D tissue model, pulse wave, Electrical and Electronic Engineering, Electrical & Electronic Engineering, Biomedical engineering, Electronics, sensors and digital hardware
- Abstract
Cardiovascular disease (CVD) threatens the lives of many and affects their productivity. Wearable sensors can enable continuous monitoring of hemodynamic parameters to improve the diagnosis and management of CVD. Bio-Impedance (Bio-Z) is an effective non-invasive sensor for arterial pulse wave monitoring based on blood volume changes in the artery due to the deep penetration of its current signal inside the tissue. However, the measured data are significantly affected by the placement of electrodes relative to the artery and the electrode configuration. In this work, we created a Bio-Z simulation platform that models the tissue, arterial pulse wave, and Bio-Z sensing configuration using a 3D circuit model based on a time-varying impedance grid. A new method is proposed to accurately simulate the different tissue types such as blood, fat, muscles, and bones in a 3D circuit model in addition to the pulsatile activity of the arteries through a variable impedance model. This circuit model is simulated in SPICE and can be used to guide design decisions (i.e. electrode placement relative to the artery and electrode configuration) to optimize the monitoring of pulse wave prior to experimentation. We present extensive simulations of the arterial pulse waveform for different sensor locations, electrode sizes, current injection frequencies, and artery depths. These simulations are validated by experimental Bio-Z measurements.
- Published
- 2021
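The entry above solves a 3D grid of tissue impedances as a circuit in SPICE. The same idea can be prototyped directly by assembling the grid's conductance (graph Laplacian) matrix and solving the nodal equations; sweeping the artery voxels' conductivity over a cardiac cycle then traces out the pulse waveform. A resistive-only sketch; the paper's model is a time-varying impedance grid with electrode models, which are omitted here, and all conductivity values are illustrative:

```python
import numpy as np

def assemble_laplacian(sigma):
    """Weighted graph Laplacian of a 3D conductivity grid; each edge takes
    the average of its two voxel conductivities (a simplification)."""
    idx = np.arange(sigma.size).reshape(sigma.shape)
    flat = sigma.ravel()
    G = np.zeros((sigma.size, sigma.size))
    for axis in range(3):
        a = idx.take(range(sigma.shape[axis] - 1), axis=axis).ravel()
        b = idx.take(range(1, sigma.shape[axis]), axis=axis).ravel()
        g = 0.5 * (flat[a] + flat[b])
        G[a, b] -= g
        G[b, a] -= g
        G[a, a] += g
        G[b, b] += g
    return G

def transfer_impedance(sigma, src, sink, i_inj=1e-3):
    """Inject current between two electrode nodes; return V/I at the pair."""
    G = assemble_laplacian(sigma)
    rhs = np.zeros(sigma.size)
    rhs[src], rhs[sink] = i_inj, -i_inj
    G[sink, :] = 0.0
    G[sink, sink] = 1.0
    rhs[sink] = 0.0                       # ground the sink electrode
    v = np.linalg.solve(G, rhs)
    return (v[src] - v[sink]) / i_inj

sigma = np.full((6, 6, 6), 0.3)           # background tissue, S/m (illustrative)
for blood in (0.6, 0.7, 0.8):             # artery conductivity over the pulse
    sigma[2:4, 2:4, :] = blood
    print(transfer_impedance(sigma, src=0, sink=215))
```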
29. MBD Based 3D CAD Model Automatic Feature Recognition and Similarity Evaluation
- Author
-
Ding, Shuhui, Feng, Qiang, Sun, Zhaoyang, and Ma, Fai
- Subjects
Networking and Information Technology R&D (NITRD), Solid modeling, Faces, Three-dimensional displays, Feature extraction, Shape, Computational modeling, Face recognition, Automatic feature recognition, similarity evaluation, multi-dimensional attributed adjacency matrix, weighted complete bipartite graph, Kuhn-Munkres algorithm, Information and Computing Sciences, Engineering, Technology
- Abstract
Automatic Feature Recognition (AFR) is considered the key enabling technique for integrating Computer Aided Design (CAD) and Computer Aided Process Planning (CAPP). At present, there is a lack of a systematic method to identify and evaluate the local features of 3D CAD models while considering process information such as topological structure, shape and size, tolerance, and surface roughness. Therefore, a novel Model Based Definition (MBD)-based 3D CAD model AFR and similarity evaluation method is proposed in this paper. A Multi-Dimensional Attributed Adjacency Matrix (MDAAM) is established with full consideration of the topological structure, shape and size, surface roughness, tolerance, and other process information of the B-rep model. Based on the MDAAM, a two-stage local feature similarity evaluation method is proposed, which combines optimal matching and adjacency judgment. First, the faces of the source feature and the target model are used as independent sets to construct a bipartite graph. Second, the independent set of the source feature is supplemented with vertices so that the two independent sets have equal numbers of vertices. Third, based on the MDAAM data, a weighted complete bipartite graph is constructed with the face similarities between the two independent sets as weights. Fourth, the Kuhn-Munkres algorithm is used to compute the optimal matching between the faces of the source feature and the target model. Fifth, the adjacency between matching faces in the target model is judged. Finally, the similarity between the matching faces of the two models is calculated and used as the similarity evaluation result. The effectiveness of this method is verified by three applications.
- Published
- 2021
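The fourth step of the pipeline above is an optimal assignment on a weighted complete bipartite graph. SciPy's `linear_sum_assignment` solves the same assignment problem as the Kuhn-Munkres (Hungarian) algorithm; a sketch that assumes the face similarities have already been computed from the MDAAM and the matrix padded square, as the abstract prescribes:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_faces(face_sim: np.ndarray):
    """Maximum-weight matching on a (padded, square) face-similarity matrix.

    face_sim[i, j] is the similarity in [0, 1] between face i of the source
    feature and face j of the target model.
    """
    rows, cols = linear_sum_assignment(face_sim, maximize=True)
    pairs = list(zip(rows.tolist(), cols.tolist()))
    score = float(face_sim[rows, cols].mean())   # aggregate similarity
    return pairs, score

# Toy 3-face example: the optimal matching is (0, 1), (1, 0), (2, 2).
sim = np.array([[0.2, 0.9, 0.1],
                [0.8, 0.3, 0.2],
                [0.1, 0.2, 0.7]])
print(match_faces(sim))
```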
30. Precision Machining Technology of Jewelry on CNC Machine Tool Based on Mathematical Modeling
- Author
-
Qian Nianhua and Zhou Ningrui
- Subjects
cnc machine tools, simulation manufacturing, five-axis linkage equation, solid modeling, precision jewelry processing, 00a69, Mathematics, QA1-939
- Abstract
This article establishes a mathematical model of the actual motion of CNC machine tools for the precision machining of jewelry. By analyzing the general geometric error model of CNC machine tools with fewer than five axes and the method of solving precision CNC instructions, the operating principle of the CNC machine tool is studied. At the same time, we use transformation matrices to express the relationships between the various moving bodies, abstracting the complex motion relationships between physical entities as relationships between matrices. The experimental results show that the theoretical method proposed in this paper can increase the machining accuracy of the machine tool by more than 50%.
- Published
- 2023
- Full Text
- View/download PDF
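The matrix model described above reduces to composing 4x4 homogeneous transforms, one per machine axis. A sketch for a hypothetical five-axis chain (X/Y/Z translations plus A/C rotations); the axis ordering and the tool offset are assumptions for illustration:

```python
import numpy as np

def rot_x(a):
    """Rotation about X (A-axis) as a 4x4 homogeneous transform."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]])

def rot_z(a):
    """Rotation about Z (C-axis)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1]])

def trans(x, y, z):
    """Pure translation."""
    T = np.eye(4)
    T[:3, 3] = (x, y, z)
    return T

def tool_in_workpiece(x, y, z, a, c):
    """Tool pose in the workpiece frame: chain the per-axis motions."""
    return trans(x, y, z) @ rot_x(a) @ rot_z(c)

# Position of a point 50 mm along the tool after a combined move (mm, rad).
tip = tool_in_workpiece(10.0, 5.0, -2.0, np.deg2rad(15), np.deg2rad(30))
print((tip @ np.array([0.0, 0.0, 50.0, 1.0]))[:3])
```

Geometric error modeling of the kind the abstract analyzes inserts small error transforms between these ideal ones, which can then be compensated in the CNC instructions.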
31. POCE: Pose-Controllable Expression Editing.
- Author
-
Wu, Rongliang, Yu, Yingchen, Zhan, Fangneng, Zhang, Jiahui, Liao, Shengcai, and Lu, Shijian
- Subjects
ARTIFICIAL neural networks, GENERATIVE adversarial networks, FACIAL expression, COMPUTER graphics, THREE-dimensional modeling
- Abstract
Facial expression editing has attracted increasing attention with the advance of deep neural networks in recent years. However, most existing methods suffer from compromised editing fidelity and limited usability, as they either ignore pose variations (unrealistic editing) or require paired training data (not easy to collect) for pose control. This paper presents POCE, an innovative pose-controllable expression editing network that can generate realistic facial expressions and head poses simultaneously from just unpaired training images. POCE achieves more accessible and realistic pose-controllable expression editing by mapping face images into UV space, where facial expressions and head poses can be disentangled and edited separately. POCE has two novel designs. The first is self-supervised UV completion, which completes UV maps sampled under different head poses that often suffer from self-occlusion and missing facial texture. The second is weakly-supervised UV editing, which generates new facial expressions with minimal modification of facial identity, where the synthesized expression can be controlled either by an expression label or by direct transplantation from a reference UV map via feature transfer. Extensive experiments show that POCE can learn from unpaired face images effectively, and the learned model can generate realistic and high-fidelity facial expressions under various new poses. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
32. Rethinking Lightweight Salient Object Detection via Network Depth-Width Tradeoff.
- Author
-
Li, Jia, Qiao, Shengye, Zhao, Zhirui, Xie, Chenxi, Chen, Xiaowu, and Xia, Changqun
- Subjects
OBJECT recognition (Computer vision), FEATURE extraction, TASK analysis, NETWORK performance, SEMANTICS
- Abstract
Existing salient object detection methods often adopt deeper and wider networks for better performance, resulting in heavy computational burden and slow inference speed. This inspires us to rethink saliency detection to achieve a favorable balance between efficiency and accuracy. To this end, we design a lightweight framework that maintains competitive accuracy. Specifically, we propose a novel trilateral decoder framework that decouples the U-shape structure into three complementary branches, devised to confront the dilution of semantic context, the loss of spatial structure, and the absence of boundary detail, respectively. As the three branches are fused, the coarse segmentation results are gradually refined in structural detail and boundary quality. Without adding extra learnable parameters, we further propose a Scale-Adaptive Pooling Module to obtain multi-scale receptive fields. In particular, on the premise of inheriting this framework, we rethink the relationship among accuracy, parameters, and speed via a network depth-width tradeoff. With these insightful considerations, we comprehensively design shallower and narrower models to explore the maximum potential of lightweight SOD. Our models are proposed for different application environments: 1) a tiny version, CTD-S (1.7M, 125 FPS), for resource-constrained devices; 2) a fast version, CTD-M (12.6M, 158 FPS), for speed-demanding scenarios; 3) a standard version, CTD-L (26.5M, 84 FPS), for high-performance platforms. Extensive experiments validate the superiority of our method, which achieves a better efficiency-accuracy balance across five benchmarks. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
33. Sketch-Segformer: Transformer-Based Segmentation for Figurative and Creative Sketches.
- Author
-
Zheng, Yixiao, Xie, Jiyang, Sain, Aneeshan, Song, Yi-Zhe, and Ma, Zhanyu
- Subjects
TRANSFORMER models, FEATURE extraction, TASK analysis, POINT set theory, SEMANTICS
- Abstract
Sketch is by now a well-researched topic in the vision community. Sketch semantic segmentation in particular serves as a fundamental step towards finer-level sketch interpretation. Recent works use various means of extracting discriminative features from sketches and have achieved considerable improvements in segmentation accuracy. Common approaches include attending to the sketch-image as a whole, its stroke-level representation, or the sequence information embedded in it. However, they mostly focus on only a part of such multi-facet information. In this paper, we demonstrate for the first time that there is complementary information to be explored across all three facets of sketch data, and that segmentation performance benefits as a result of such exploration of sketch-specific information. Specifically, we propose Sketch-Segformer, a transformer-based framework for sketch semantic segmentation that inherently treats sketches as stroke sequences rather than pixel-maps. In particular, Sketch-Segformer introduces two types of self-attention modules with similar structures that work with different receptive fields (i.e., the whole sketch or an individual stroke). The order embedding is then further synergized with spatial embeddings learned from the entire sketch as well as localized stroke-level information. Extensive experiments show that our sketch-specific design not only obtains state-of-the-art performance on traditional figurative sketches (such as the SPG and SketchSeg-150K datasets), but also performs well on creative sketches that do not conform to conventional object semantics (the CreativeSketch dataset), thanks to our use of multi-facet sketch information. Ablation studies, visualizations, and invariance tests further justify our design choices and the effectiveness of Sketch-Segformer. Codes are available at https://github.com/PRIS-CV/Sketch-SF. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
34. Regular Splitting Graph Network for 3D Human Pose Estimation.
- Author
-
Hassan, Md. Tanvir and Ben Hamza, A.
- Subjects
HUMAN skeleton, REGULAR graphs, UNDIRECTED graphs, TRANSFORMER models, THREE-dimensional modeling
- Abstract
In human pose estimation methods based on graph convolutional architectures, the human skeleton is usually modeled as an undirected graph whose nodes are body joints and whose edges are connections between neighboring joints. However, most of these methods tend to focus on learning relationships between body joints using first-order neighbors, ignoring higher-order neighbors and hence limiting their ability to exploit relationships between distant joints. In this paper, we introduce a higher-order regular splitting graph network (RS-Net) for 2D-to-3D human pose estimation using matrix splitting in conjunction with weight and adjacency modulation. The core idea is to capture long-range dependencies between body joints using multi-hop neighborhoods and to learn different modulation vectors for different body joints, as well as a modulation matrix added to the adjacency matrix associated with the skeleton. This learnable modulation matrix helps adjust the graph structure by adding extra graph edges in an effort to learn additional connections between body joints. Instead of using a shared weight matrix for all neighboring body joints, the proposed RS-Net model applies weight unsharing before aggregating the feature vectors associated with the joints in order to capture the different relations between them. Experiments and ablation studies performed on two benchmark datasets demonstrate the effectiveness of our model, which achieves superior performance over recent state-of-the-art methods for 3D human pose estimation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
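The weight and adjacency modulation in the entry above amounts to (i) adding a learnable matrix to the skeleton adjacency so extra joint-to-joint edges can be learned, and (ii) rescaling each joint's output features with its own modulation vector rather than sharing one weight matrix uniformly. A single-layer NumPy sketch of that computation; the shapes and the ReLU are illustrative assumptions:

```python
import numpy as np

def modulated_gcn_layer(X, A, W, M, Q):
    """One graph convolution with weight and adjacency modulation.

    X: (J, F)  per-joint input features    A: (J, J) normalized adjacency
    W: (F, K)  shared weight matrix        M: (J, K) per-joint modulation
    Q: (J, J)  learnable additive adjacency modulation
    """
    A_mod = A + Q                  # extra learned edges between body joints
    H = (A_mod @ (X @ W)) * M      # aggregate neighbours, modulate per joint
    return np.maximum(H, 0.0)      # ReLU

# 17-joint skeleton, 2D joint inputs, 64-D outputs (illustrative sizes).
rng = np.random.default_rng(0)
J, F, K = 17, 2, 64
X = rng.normal(size=(J, F))
A = np.eye(J)                      # stand-in for the normalized skeleton graph
W = rng.normal(size=(F, K)) * 0.1
M = np.ones((J, K))                # learned in practice, constant here
Q = np.zeros((J, J))               # learned in practice, zero here
print(modulated_gcn_layer(X, A, W, M, Q).shape)   # (17, 64)
```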
35. Efficient 3D Scene Semantic Segmentation via Active Learning on Rendered 2D Images.
- Author
-
Rong, Mengqi, Cui, Hainan, and Shen, Shuhan
- Subjects
IMAGE segmentation, COMPUTER graphics, POINT cloud, THREE-dimensional modeling, PREDICTION models
- Abstract
Inspired by active learning and 2D-3D semantic fusion, we propose a novel framework for 3D scene semantic segmentation based on rendered 2D images, which can efficiently achieve semantic segmentation of any large-scale 3D scene with only a few 2D image annotations. In our framework, we first render perspective images at certain positions in the 3D scene. Then we continuously fine-tune a pre-trained network for image semantic segmentation and project all dense predictions onto the 3D model for fusion. In each iteration, we evaluate the 3D semantic model and re-render images in several representative areas where the 3D segmentation is not stable, sending them to the network for training after annotation. Through this iterative process of rendering-segmentation-fusion, the method effectively generates difficult-to-segment image samples in the scene while avoiding complex 3D annotation, thereby achieving label-efficient 3D scene segmentation. Experiments on three large-scale indoor and outdoor 3D datasets demonstrate the effectiveness of the proposed method compared with other state-of-the-art approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
36. Weakly-Supervised 3D Spatial Reasoning for Text-Based Visual Question Answering.
- Author
-
Li, Hao, Huang, Jinfa, Jin, Peng, Song, Guoli, Wu, Qi, and Chen, Jie
- Subjects
OPTICAL character recognition, TRANSFORMER models, TASK analysis, COGNITION, MODEL airplanes, QUESTION answering systems
- Abstract
Text-based Visual Question Answering (TextVQA) aims to produce correct answers to given questions about images containing multiple scene texts. In most cases, the texts naturally attach to the surface of objects, so spatial reasoning between texts and objects is crucial in TextVQA. However, existing approaches are constrained to 2D spatial information learned from the input images and rely on transformer-based architectures to reason implicitly during the fusion process. Under this setting, these 2D spatial reasoning approaches cannot distinguish the fine-grained spatial relations between visual objects and scene texts on the same image plane, thereby impairing the interpretability and performance of TextVQA models. In this paper, we introduce 3D geometric information into the spatial reasoning process to capture the contextual knowledge of key objects step by step. Specifically, (i) we propose a relation prediction module for accurately locating the regions of interest of critical objects, and (ii) we design a depth-aware attention calibration module for calibrating the OCR tokens' attention according to critical objects. Extensive experiments show that our method achieves state-of-the-art performance on the TextVQA and ST-VQA datasets. More encouragingly, our model surpasses others by clear margins of 5.7% and 12.1% on questions that involve spatial reasoning in the TextVQA and ST-VQA validation splits. We also verify the generalizability of our model on the text-based image captioning task. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
37. Single-View 3D Mesh Reconstruction for Seen and Unseen Categories.
- Author
-
Yang, Xianghui, Lin, Guosheng, and Zhou, Luping
- Subjects
COMPUTER vision, IMAGE reconstruction, FEATURE extraction, DEEP learning, POINT cloud
- Abstract
Single-view 3D object reconstruction is a fundamental and challenging computer vision task that aims at recovering 3D shapes from single-view RGB images. Most existing deep learning based reconstruction methods are trained and evaluated on the same categories, and they cannot work well when handling objects from novel categories that are not seen during training. Focusing on this issue, this paper tackles Single-view 3D Mesh Reconstruction, to study the model generalization on unseen categories and encourage models to reconstruct objects literally. Specifically, we propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction. Firstly, we factorize the complicated image-to-mesh mapping into two simpler mappings, i.e., image-to-point mapping and point-to-mesh mapping, while the latter is mainly a geometric problem and less dependent on object categories. Secondly, we devise a local feature sampling strategy in 2D and 3D feature spaces to capture the local geometry shared across objects to enhance model generalization. Thirdly, apart from the traditional point-to-point supervision, we introduce a multi-view silhouette loss to supervise the surface generation process, which provides additional regularization and further relieves the overfitting problem. The experimental results show that our method significantly outperforms the existing works on the ShapeNet and Pix3D under different scenarios and various metrics, especially for novel objects. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
38. Study of Spatio-Temporal Modeling in Video Quality Assessment.
- Author
- Fang, Yuming, Li, Zhaoqian, Yan, Jiebin, Sui, Xiangjie, and Liu, Hantao
- Subjects
- *RECURRENT neural networks, *VISUAL learning, *FEATURE extraction, *VIDEO recording, *STREAMING media - Abstract
Video quality assessment (VQA) has received remarkable attention recently. Most popular VQA models employ recurrent neural networks (RNNs) to capture the temporal quality variation of videos. However, each long-term video sequence is commonly labeled with a single quality score, with which RNNs might not learn long-term quality variation well. What is the real role of RNNs in learning the visual quality of videos? Do they learn spatio-temporal representations as expected, or just redundantly aggregate spatial features? In this study, we conduct a comprehensive analysis by training a family of VQA models with carefully designed frame sampling strategies and spatio-temporal fusion methods. Our extensive experiments on four publicly available in-the-wild video quality datasets lead to two main findings. First, the plausible spatio-temporal modeling module (i.e., RNNs) does not facilitate quality-aware spatio-temporal feature learning. Second, sparsely sampled video frames achieve performance competitive with using all video frames as input. In other words, spatial features play a vital role in capturing video quality variation for VQA. To the best of our knowledge, this is the first work to explore the issue of spatio-temporal modeling in VQA. [ABSTRACT FROM AUTHOR]
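A minimal sketch of the kind of sparse frame sampling the study compares; the segment count and per-segment randomization are assumptions rather than the exact protocol:

```python
import numpy as np

def sparse_sample(num_frames, num_segments=8, seed=0):
    """Pick one random frame from each of `num_segments` equal video segments.

    A sketch of sparse frame sampling of the kind the study compares; the
    segment count and randomization are assumptions, not the exact protocol.
    """
    rng = np.random.default_rng(seed)
    edges = np.linspace(0, num_frames, num_segments + 1, dtype=int)
    return np.array([int(rng.integers(lo, max(lo + 1, hi)))
                     for lo, hi in zip(edges[:-1], edges[1:])])
```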
- Published
- 2023
- Full Text
- View/download PDF
39. Cognition Guided Human-Object Relationship Detection.
- Author
- Zeng, Zhitao, Dai, Pengwen, Zhang, Xuan, Zhang, Lei, and Cao, Xiaochun
- Subjects
- *COGNITIVE science, *FEATURE extraction, *THREE-dimensional modeling, *COGNITION, *GENOMES - Abstract
Human-object relationship detection reveals the fine-grained relationships between humans and objects, supporting the comprehensive understanding of videos. Previous human-object relationship detection approaches are mainly developed with object features and relation features, without exploiting human-specific information. In this paper, we propose a novel Relation-Pose Transformer (RPT) for human-object relationship detection. Inspired by the coordination of eye-head-body movements in cognitive science, we employ the head pose to find the crucial objects that humans focus on and use the body pose with skeleton information to represent multiple actions. Then, we utilize a spatial encoder to capture spatially contextualized information about the relation pair, integrating the relation features and pose features. Next, a temporal decoder models the temporal dependency of the relationship. Finally, we adopt multiple classifiers to predict different types of relationships. Extensive experiments on the Action Genome benchmark validate the effectiveness of our proposed method and show state-of-the-art performance compared with related methods. [ABSTRACT FROM AUTHOR]
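One plausible reading of the spatial encoder is a standard transformer encoder over concatenated relation and pose tokens; the sketch below is generic PyTorch with all dimensions assumed:

```python
import torch
import torch.nn as nn

class SpatialEncoder(nn.Module):
    """Transformer encoder over concatenated relation and pose tokens.

    A generic reading of the abstract's spatial encoder (dimensions, layer
    counts, and tokenization are assumptions; RPT also adds a temporal
    decoder on top of this fusion).
    """
    def __init__(self, dim=256, heads=4, layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)

    def forward(self, relation_feats, pose_feats):
        # relation_feats: (B, T_r, dim); pose_feats: (B, T_p, dim)
        tokens = torch.cat([relation_feats, pose_feats], dim=1)
        return self.encoder(tokens)            # fused (B, T_r + T_p, dim)
```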
- Published
- 2023
- Full Text
- View/download PDF
40. Deep Face Video Inpainting via UV Mapping.
- Author
- Yang, Wenqi, Chen, Zhenfang, Chen, Chaofeng, Chen, Guanying, and Wong, Kwan-Yee K.
- Subjects
- *IMAGE reconstruction, *DEEP learning, *INPAINTING, *TASK analysis, *STEREOLITHOGRAPHY - Abstract
This paper addresses the problem of face video inpainting. Existing video inpainting methods primarily target natural scenes with repetitive patterns. They do not make use of any prior knowledge of the face to help retrieve correspondences for the corrupted face, and therefore achieve only sub-optimal results, particularly for faces under large pose and expression variations where face components appear very differently across frames. In this paper, we propose a two-stage deep learning method for face video inpainting. We employ a 3DMM as our 3D face prior to transform a face between the image space and the UV (texture) space. In Stage I, we perform face inpainting in the UV space. This largely removes the influence of face poses and expressions and makes the learning task much easier with well-aligned face features. We introduce a frame-wise attention module to fully exploit correspondences in neighboring frames to assist the inpainting task. In Stage II, we transform the inpainted face regions back to the image space and perform face video refinement, which inpaints any background regions not covered in Stage I and also refines the inpainted face regions. Extensive experiments show that our method significantly outperforms methods based merely on 2D information, especially for faces under large pose and expression variations. Project page: https://ywq.github.io/FVIP. [ABSTRACT FROM AUTHOR]
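The image-to-UV transform at the heart of Stage I can be sketched as grid sampling; how the sampling grid is rendered from the 3DMM, and occlusion handling, are omitted assumptions here:

```python
import torch.nn.functional as F

def image_to_uv(frame, uv_grid):
    """Warp a face frame into UV (texture) space.

    Sketch only: `uv_grid` is a (B, H_uv, W_uv, 2) sampling grid in [-1, 1]
    telling each UV texel where it projects in the image, assumed to be
    rendered from the fitted 3DMM; occlusion handling is omitted.
    """
    return F.grid_sample(frame, uv_grid, mode='bilinear', align_corners=True)
```

Working in UV space means the same facial feature lands at the same texel across frames, which is what makes neighboring-frame correspondences easy to exploit.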
- Published
- 2023
- Full Text
- View/download PDF
41. LSTM-MSA: A Novel Deep Learning Model With Dual-Stage Attention Mechanisms for Forearm EMG-Based Hand Gesture Recognition.
- Author
- Zhang, Haotian, Qu, Hang, Teng, Long, and Tang, Chak-Yin
- Subjects
LONG short-term memory, CONVOLUTIONAL neural networks, DEEP learning, FEATURE extraction, SIGNAL processing - Abstract
This paper introduces the Long Short-Term Memory with Dual-Stage Attention (LSTM-MSA) model, an approach for analyzing electromyography (EMG) signals. EMG signals are crucial in applications such as prosthetic control, rehabilitation, and human-computer interaction, but they come with inherent challenges such as non-stationarity and noise. The LSTM-MSA model addresses these challenges by combining LSTM layers with attention mechanisms to effectively capture relevant signal features and accurately predict intended actions. Notable features of this model include dual-stage attention, end-to-end integration of feature extraction and classification, and personalized training. Extensive evaluations across diverse datasets consistently demonstrate the LSTM-MSA’s superiority in terms of F1 score, accuracy, recall, and precision. This research provides a model for real-world EMG signal applications, offering improved accuracy, robustness, and adaptability. [ABSTRACT FROM AUTHOR]
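A stripped-down, single-stage sketch of the LSTM-plus-attention idea (the paper uses two attention stages; every size below is a placeholder assumption):

```python
import torch
import torch.nn as nn

class LSTMAttention(nn.Module):
    """LSTM with a single temporal-attention stage over EMG windows.

    A stripped-down sketch of the LSTM-plus-attention idea; LSTM-MSA uses
    two attention stages, and every size here is a placeholder assumption.
    """
    def __init__(self, channels=8, hidden=64, classes=10):
        super().__init__()
        self.lstm = nn.LSTM(channels, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)        # attention score per time step
        self.head = nn.Linear(hidden, classes)

    def forward(self, x):                        # x: (B, T, channels) EMG window
        h, _ = self.lstm(x)                      # (B, T, hidden)
        w = torch.softmax(self.score(h), dim=1)  # (B, T, 1) attention weights
        context = (w * h).sum(dim=1)             # attention-pooled summary
        return self.head(context)                # gesture logits
```

The attention pooling lets the classifier focus on the informative bursts of a noisy, non-stationary window instead of averaging over all of it.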
- Published
- 2023
- Full Text
- View/download PDF
42. Neural Stimulation Hardware for the Selective Intrafascicular Modulation of the Vagus Nerve.
- Author
- Strauss, I., Agnesi, F., Zinno, C., Giannotti, A., Dushpanova, A., Casieri, V., Terlizzi, D., Bernini, F., Gabisonia, K., Wu, Y., Jiang, D., Paggi, V., Lacour, S., Recchia, F., Demosthenous, A., Lionetti, V., and Micera, S.
- Subjects
VAGUS nerve stimulation, ACTION potentials, EVOKED potentials (Electrophysiology), VAGUS nerve, INNERVATION of the heart - Abstract
Neural stimulation of the vagus nerve can modulate various functions of the parasympathetic response in different organs, and is a promising approach to treating inflammatory diseases, obesity, diabetes, heart failure, and hypertension. The complexity of the vagus nerve requires highly selective stimulation, allowing the modulation of target-specific organs without side effects. Here, we address this issue by adapting a neural stimulator and developing an intraneural electrode for the selective modulation of the vagus nerve. Neurostimulator parameters such as amplitude, pulse width, and pulse shape were modulated, and single- and multi-channel stimulation was performed at different amplitudes. For the first time, a polyimide thin-film neural electrode was designed for the specific stimulation of the vagus nerve. In vivo experiments in the adult minipig validated the system's ability to elicit electrically evoked compound action potentials (ECAPs) and to modulate physiological functions, confirming the spatial selectivity of intraneural stimulation. Electrochemical tests of the electrode and the neurostimulator showed that the stimulation hardware worked correctly. Stimulating the porcine vagus nerve resulted in spatially selective modulation, and ECAPs belonging to alpha and beta fibers could be distinguished during single- and multi-channel stimulation. We have shown that the presented system can activate the vagus nerve and thereby modulate heart rate, diastolic pressure, and systolic pressure. The system may be used to restore the cardiac loop after denervation by implementing biomimetic stimulation patterns, and the presented methods may be used to develop intraneural electrodes adapted for various applications. [ABSTRACT FROM AUTHOR]
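As a rough illustration of the swept stimulation parameters, the sketch below synthesizes a charge-balanced biphasic pulse; the rectangular shape, units, and sampling rate are assumptions, and real stimulators enforce hardware safety limits:

```python
import numpy as np

def biphasic_pulse(amplitude_ua, width_us, fs_hz=1_000_000):
    """Synthesize a charge-balanced biphasic pulse (cathodic then anodic).

    Illustrative only: amplitude, pulse width, and shape were swept per
    channel in the paper, and real stimulators enforce hardware safety
    limits; the rectangular shape and sampling rate here are assumptions.
    """
    n = max(1, int(width_us * 1e-6 * fs_hz))       # samples per phase
    cathodic = -amplitude_ua * np.ones(n)          # leading cathodic phase
    anodic = +amplitude_ua * np.ones(n)            # charge-balancing phase
    return np.concatenate([cathodic, anodic])
```

The equal-and-opposite second phase recovers the injected charge, which is the standard way to keep net charge at the electrode-tissue interface near zero.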
- Published
- 2023
- Full Text
- View/download PDF
43. An Upper-Limb Rehabilitation Exoskeleton System Controlled by MI Recognition Model With Deep Emphasized Informative Features in a VR Scene.
- Author
- Tang, Zhichuan, Wang, Hang, Cui, Zhixuan, Jin, Xiaoneng, Zhang, Lekai, Peng, Yuxin, and Xing, Baixi
- Subjects
CONVOLUTIONAL neural networks, MOTOR imagery (Cognition), ROBOTIC exoskeletons, HEMIPLEGICS, FEATURE extraction - Abstract
The prevalence of stroke continues to increase with global population aging. Based on the motor imagery (MI) brain-computer interface (BCI) paradigm and virtual reality (VR) technology, we designed and developed an upper-limb rehabilitation exoskeleton system (VR-ULE) in VR scenes for stroke patients. The VR-ULE system uses an MI electroencephalogram (EEG) recognition model with a convolutional neural network and squeeze-and-excitation (SE) blocks to obtain the patient’s motion intentions and control the exoskeleton during rehabilitation training movements. Because of individual differences in EEG, the frequency bands with optimal MI EEG features differ for each patient. Therefore, the weights of different feature channels are learned with SE blocks to emphasize informative frequency-band features. The MI cues in the VR-based virtual scenes can improve interhemispheric balance and the neuroplasticity of patients. This also compensates for disadvantages of current MI-BCIs, such as single usage scenarios, poor individual adaptability, and many interfering factors. We designed an offline training experiment to evaluate the feasibility of the EEG recognition strategy and an online control experiment to verify the effectiveness of the VR-ULE system. The results showed that the MI classification method with MI cues in the VR scenes improved the accuracy of MI classification (86.49% ± 3.02%); all subjects performed two types of rehabilitation training tasks with their own models trained in the offline experiment, with highest average completion rates of 86.82% ± 4.66% and 88.48% ± 5.84%. The VR-ULE system can efficiently help stroke patients with hemiplegia complete upper-limb rehabilitation training tasks, and provides new methods and strategies for BCI-based rehabilitation devices. [ABSTRACT FROM AUTHOR]
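The SE block the recognition model builds on is the standard squeeze-and-excitation design; a compact sketch, with the reduction ratio assumed:

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard squeeze-and-excitation block over feature channels.

    The VR-ULE model pairs blocks like this with a CNN so that informative
    frequency-band channels are re-weighted; the reduction ratio is an
    assumption.
    """
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                  # x: (B, C, H, W) feature maps
        b, c = x.shape[:2]
        w = x.mean(dim=(2, 3))             # squeeze: global average pooling
        w = self.fc(w).view(b, c, 1, 1)    # excitation: channel weights in (0, 1)
        return x * w                       # re-weight informative channels
```

Because the sigmoid gate keeps weights in (0, 1), the block can only attenuate or preserve channels, which is what lets it emphasize the frequency bands useful for a given patient.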
- Published
- 2023
- Full Text
- View/download PDF
44. Current Topics in Technology-Enabled Stroke Rehabilitation and Reintegration: A Scoping Review and Content Analysis.
- Author
- Cisek, Katryna and Kelleher, John D.
- Subjects
REHABILITATION technology, INFORMATION technology, STROKE, STROKE rehabilitation, SOCIAL participation - Abstract
Background. There is a worldwide health crisis stemming from the rising incidence of various debilitating chronic diseases, with stroke as a leading contributor. Chronic stroke management encompasses rehabilitation and reintegration, and can require decades of personalized medicine and care. Information technology (IT) tools have the potential to support individuals managing chronic stroke symptoms. Objectives. This scoping review identifies prevalent topics and concepts in the research literature on IT for stroke rehabilitation and reintegration, using content analysis based on topic modelling techniques from natural language processing, to identify gaps in this literature. Eligibility Criteria. Our methodological search initially identified over 14,000 publications from the last two decades in the Web of Science and Scopus databases, which we filter, using keywords and a qualitative review, to a core corpus of 1062 documents. Results. We generate 3-topic, 4-topic, and 5-topic models and interpret the resulting topics as four distinct thematics in the literature, which we label Robotics, Software, Functional, and Cognitive. We analyze the prevalence and distinctiveness of each thematic and identify areas relatively neglected by the field. These lie mainly in the Cognitive thematic, especially systems and devices for sensory-loss rehabilitation, performance of daily-living tasks, and social participation. Conclusion. The results indicate that the IT-enabled stroke literature has focused on Functional outcomes and Robotic technologies, with less emphasis on Cognitive outcomes and combined interventions. We hope this review broadens awareness, usage, and mainstream acceptance of novel technologies in rehabilitation and reintegration among clinicians, carers, and patients. [ABSTRACT FROM AUTHOR]
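The content-analysis step can be reproduced in outline with an off-the-shelf topic model; the sketch below uses scikit-learn's LDA as a stand-in, with vectorizer thresholds assumed:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def fit_topics(abstracts, n_topics=4, n_top_words=10):
    """Fit a topic model over a corpus of abstracts and print top words.

    A stand-in for the review's content-analysis step using scikit-learn's
    LDA; the vectorizer thresholds and the LDA choice are assumptions.
    """
    vec = CountVectorizer(stop_words='english', max_df=0.9, min_df=5)
    X = vec.fit_transform(abstracts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    lda.fit(X)
    vocab = vec.get_feature_names_out()
    for k, comp in enumerate(lda.components_):
        top = vocab[comp.argsort()[::-1][:n_top_words]]
        print(f"topic {k}: {' '.join(top)}")
```

Running this with n_topics set to 3, 4, and 5 mirrors the review's comparison before its four thematics were interpreted.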
- Published
- 2023
- Full Text
- View/download PDF
45. A Three-Dimensional Finger-Tapping Framework for Recognition of Patients With Mild Parkinson’s Disease.
- Author
- Li, Junjie, Zhu, Huaiyu, Wang, Haotian, Wang, Bo, Cen, Zhidong, Yang, Dehao, Liu, Peng, Luo, Wei, and Pan, Yun
- Subjects
PARKINSON'S disease, SUPPORT vector machines, FEATURE extraction, MOTOR ability, MOVEMENT disorders - Abstract
The finger tapping test is a widely used and important examination in the Movement Disorder Society clinical diagnostic criteria for Parkinson’s disease. However, finger tapping motion can be affected by age, medication, and other conditions. As a result, Parkinson’s disease patients with mild signs and healthy people may receive similar scores on the Movement Disorder Society-sponsored revision of the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS), making it difficult for community doctors to perform diagnosis. We therefore propose a three-dimensional finger-tapping framework to recognize mild PD patients. Specifically, we first derive the three-dimensional finger-tapping motion using a self-designed three-dimensional finger-tapping measurement system. We then propose a segmentation algorithm to segment the three-dimensional finger-tapping motion. We next extract three-dimensional pattern features of motor coordination, imbalance impairment, and entropy. We finally adopt a support vector machine as the classifier to recognize PD patients. We evaluated the proposed framework on 49 PD patients and 29 healthy controls and reached an accuracy of 94.9% for the right hand and 89.4% for the left hand. Moreover, the framework reached an accuracy of 95.0% for the right hand and 97.8% for the left hand on 17 mild PD patients and 28 healthy controls who were all rated 0 or 1 on the MDS-UPDRS. The results demonstrate that the proposed framework relies less on traditional features and performs well in recognizing mild PD patients by incorporating three-dimensional pattern features. [ABSTRACT FROM AUTHOR]
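Once the 3-D pattern features are extracted, the final stage reduces to a standard classification pipeline; a sketch, with X and y as the assumed per-subject feature matrix and labels:

```python
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def evaluate_tapping_features(X, y):
    """Cross-validate a PD-vs-healthy classifier on tapping features.

    Sketch of the final stage, assuming X holds the extracted 3-D
    coordination/imbalance/entropy features (one row per subject) and y
    the diagnostic labels; the RBF kernel and C value are assumptions.
    """
    clf = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0))
    return cross_val_score(clf, X, y, cv=5).mean()
```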
- Published
- 2023
- Full Text
- View/download PDF
46. MDTL: A Novel and Model-Agnostic Transfer Learning Strategy for Cross-Subject Motor Imagery BCI.
- Author
- Li, Ang, Wang, Zhenyu, Zhao, Xi, Xu, Tianheng, Zhou, Ting, and Hu, Honglin
- Subjects
MOTOR imagery (Cognition), BRAIN-computer interfaces, FEATURE extraction, LEARNING strategies, ELECTROENCEPHALOGRAPHY, MOTOR learning, DEEP learning - Abstract
In recent years, deep neural network-based transfer learning (TL) has shown outstanding performance in EEG-based motor imagery (MI) brain-computer interfaces (BCIs). However, due to the long preparation required for pre-trained models and the arbitrariness of source domain selection, applying deep transfer learning across different datasets and models remains challenging. In this paper, we propose a multi-direction transfer learning (MDTL) strategy for cross-subject MI EEG-based BCI. This strategy transfers knowledge from multiple source domains to the target domain, as well as from one source domain to another. It is model-agnostic, so it can be quickly deployed on existing models. Three generic deep learning models for MI classification (DeepConvNet, ShallowConvNet, and EEGNet) and two public motor imagery datasets (BCIC IV dataset 2a and Lee2019) are used to verify the proposed strategy. For the four-class dataset BCIC IV 2a, MDTL achieves 80.86%, 81.95%, and 75.00% mean prediction accuracy with the three models, outperforming the same models without MDTL by 5.79%, 6.64%, and 11.42%. For the binary-class dataset Lee2019, MDTL achieves 88.2% mean accuracy with DeepConvNet, outperforming the accuracy without MDTL by 23.48%. The achieved 81.95% and 88.2% are also better than the existing deep transfer learning strategy, and the training time of MDTL is reduced by 93.94%. MDTL is an easy-to-deploy, scalable, and reliable transfer learning strategy for existing deep learning models, which significantly improves model performance and reduces preparation time without changing model architecture. [ABSTRACT FROM AUTHOR]
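One plausible reading of the multi-direction schedule, sketched in generic PyTorch (the paper's exact transfer directions and hyper-parameters are not reproduced here):

```python
import copy
import torch

def mdtl_pretrain(model, source_loaders, loss_fn, epochs=1, lr=1e-3):
    """Sequentially transfer one model across multiple source subjects.

    One plausible reading of a multi-direction schedule (source -> source,
    then ready for the target); the paper's exact transfer directions and
    hyper-parameters are assumptions not reproduced here.
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for loader in source_loaders:            # each loader: one source subject
        for _ in range(epochs):
            for x, y in loader:
                optimizer.zero_grad()
                loss = loss_fn(model(x), y)  # e.g., cross-entropy on MI classes
                loss.backward()
                optimizer.step()
    return copy.deepcopy(model)              # snapshot for target fine-tuning
```

Because the loop touches only the optimizer and data loaders, it works unchanged with DeepConvNet, ShallowConvNet, or EEGNet, which is the sense in which such a strategy is model-agnostic.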
- Published
- 2023
- Full Text
- View/download PDF
47. An Effective 3D Text Recurrent Voting Generator for Metaverse.
- Author
- Park, Woo Hyun, Qureshi, Nawab Muhammad Faseeh, and Shin, Dong Ryeol
- Abstract
Metaverse is a novel platform that connects users worldwide in a distributed virtual environment, where people share their interests, opinions, and resources. Besides other fundamental techniques, language generation methods are therefore needed to regulate such VR environments. Several types of 3D language generation methods exist, including neural approaches such as GRU, RNN, and GPT-3, as well as transfer learning. This paper proposes a recurrent voting generator (RVG) system that understands the 3D text of a book and performs emotion analytics within a metaverse space. RVG evaluates emotions through three modules: a recurrent sentiment generator (RSG) that analyzes emotions and generates their distributions; a sentiment decomposition (SD) module that optimizes the higher dimensions of Big Data; and a compound voting learning (CVL) module that performs the final calculations with an emphasis on optimal performance. The dataset used to evaluate RVG is based on movie reviews and book content. The performance evaluation shows that the proposed approach outperforms existing 2D RNN models in the metaverse. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
48. Weakly-Supervised Learning for Fine-Grained Emotion Recognition Using Physiological Signals.
- Author
- Zhang, Tianyi, El Ali, Abdallah, Wang, Chen, Hanjalic, Alan, and Cesar, Pablo
- Abstract
Instead of predicting just one emotion for one activity (e.g., video watching), fine-grained emotion recognition enables temporally more precise recognition. Previous works on fine-grained emotion recognition require segment-by-segment, fine-grained emotion labels to train the recognition algorithm. However, experiments to collect these labels are costly and time-consuming compared with collecting only one emotion label after the user has watched the stimulus (i.e., the post-stimuli emotion label). To recognize emotions at a finer granularity when trained with only post-stimuli labels, we propose an emotion recognition algorithm based on Deep Multiple Instance Learning (EDMIL) using physiological signals. EDMIL recognizes fine-grained valence and arousal (V-A) labels by identifying which instances represent the post-stimuli V-A annotated by users after watching the videos. Instead of fully-supervised training, the instances are weakly supervised by the post-stimuli labels in the training stage. The V-A labels of instances are estimated from the instance gains, which indicate the probability of each instance predicting the post-stimuli label. We tested EDMIL on three different datasets, CASE, MERCA, and CEAP-360VR, collected in three different environments: desktop, mobile, and HMD-based Virtual Reality, respectively. Recognition results validated against the fine-grained V-A self-reports show that for subject-independent 3-class classification (high/neutral/low), EDMIL obtains promising recognition accuracies: 75.63% and 79.73% for V-A on CASE, 70.51% and 67.62% for V-A on MERCA, and 65.04% and 67.05% for V-A on CEAP-360VR. Our ablation study shows that all components of EDMIL contribute to both the classification and regression tasks. Our experiments also show that (1) compared with fully-supervised learning, weakly-supervised learning can reduce the overfitting caused by the temporal mismatch between fine-grained annotations and physiological signals, (2) instance segment lengths between 1-2 s result in the highest recognition accuracies, and (3) EDMIL performs best when post-stimuli annotations cover less than 30% or more than 60% of the entire video-watching session. [ABSTRACT FROM AUTHOR]
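The instance-gain mechanism belongs to the family of attention-based multiple-instance pooling; a generic sketch of that family (dimensions and the softmax gain are assumptions, not EDMIL's exact architecture):

```python
import torch
import torch.nn as nn

class MILPooling(nn.Module):
    """Attention-style pooling of instance features into a bag prediction.

    A generic member of the multiple-instance family EDMIL belongs to:
    per-instance gains weight each signal segment's support for the
    post-stimuli label (dimensions and the softmax gain are assumptions).
    """
    def __init__(self, dim=64, classes=3):
        super().__init__()
        self.gain = nn.Linear(dim, 1)          # scalar gain per instance
        self.head = nn.Linear(dim, classes)

    def forward(self, instances):              # (B, N, dim) instance features
        a = torch.softmax(self.gain(instances), dim=1)   # (B, N, 1) gains
        bag = (a * instances).sum(dim=1)       # weighted bag embedding
        return self.head(bag), a.squeeze(-1)   # bag logits + instance gains
```

Training only needs the bag-level (post-stimuli) label, while the returned per-instance gains provide the fine-grained, segment-level estimates at inference time.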
- Published
- 2023
- Full Text
- View/download PDF
49. Modeling and Compensation for Repositioning Error in Discontinuous GBSAR Monitoring.
- Author
- Mo, Yuanhui, Lai, Tao, Wang, Qingsong, and Huang, Haifeng
- Abstract
To compensate for the repositioning error (RE) introduced by radar position offsets in discontinuous ground-based synthetic aperture radar (GBSAR) monitoring, this letter develops a new mathematical framework for modeling the baseline error based on the Taylor expansion. A novel 3-D model, the multiparameter nonlinear trigonometric model (MNTM), is then proposed to accurately compensate for the RE. Furthermore, to improve compensation efficiency, we develop an efficient 2-D model, the linear trigonometric model (LTM). Both simulation and field experiments verify the superiority and feasibility of the proposed methods, which measure displacement with sub-millimeter accuracy. [ABSTRACT FROM AUTHOR]
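For orientation, the standard first-order relations underlying any such RE model can be written out; this is generic GBSAR interferometry, which the letter's MNTM and LTM refine with trigonometric parameterizations:

```latex
% Two-way interferometric phase of a range change \Delta r at wavelength \lambda:
\Delta\varphi = \frac{4\pi}{\lambda}\,\Delta r .
% First-order Taylor expansion of the target range under a small
% repositioning offset (\delta x, \delta y, \delta z):
r' = \sqrt{(x-\delta x)^2 + (y-\delta y)^2 + (z-\delta z)^2}
   \;\approx\; r - \frac{x\,\delta x + y\,\delta y + z\,\delta z}{r},
% so the repositioning-error phase is approximately
\Delta\varphi_{\mathrm{RE}} \;\approx\; -\frac{4\pi}{\lambda}\,
   \frac{x\,\delta x + y\,\delta y + z\,\delta z}{r}.
```

At typical microwave wavelengths a millimeter-scale offset already produces a large phase term, which is why the RE must be modeled and removed before sub-millimeter displacements can be read from the phase.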
- Published
- 2023
- Full Text
- View/download PDF
50. SAR-to-Virtual Optical Image Translation for Improving SAR Automatic Target Recognition.
- Author
- Lee, In Ho and Park, Chan Gook
- Abstract
This letter addresses the challenges of interpreting synthetic aperture radar (SAR) images, which provide limited visual information compared with optical images. We propose a new method for generating virtual data and an SAR-to-optical image translation neural network to recognize targets in SAR images. The virtual data are divided into targets and backgrounds: target data are rendered as virtual optical images from a 3-D model, while background data are synthesized by combining the target with existing SAR imagery. For image translation using the virtual dataset, we design a modified dense nested U-net that converts images for target recognition. By incorporating the proposed translation network into the YOLO v4 detection algorithm, we verify the impact of virtual optical images on target recognition. The experimental results demonstrate that our proposed method outperforms the conventional approach, which relies solely on SAR target image data for learning. [ABSTRACT FROM AUTHOR]
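The paired supervision that the virtual optical images enable can be sketched in a few lines; the L1 objective and names below are assumptions standing in for the paper's training details:

```python
import torch
import torch.nn as nn

def translation_step(net, sar, virtual_optical, optimizer, loss_fn=nn.L1Loss()):
    """One paired-supervision step for SAR-to-virtual-optical translation.

    Sketch only: the paper trains a modified dense nested U-net; any
    image-to-image network fed (SAR, virtual-optical) pairs fits this loop,
    and the L1 objective is an assumption standing in for the paper's losses.
    """
    optimizer.zero_grad()
    pred = net(sar)                       # translated (virtual) optical image
    loss = loss_fn(pred, virtual_optical)
    loss.backward()
    optimizer.step()
    return loss.item()
```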
- Published
- 2023
- Full Text
- View/download PDF