11,888 results on '"Stricker P"'
Search Results
2. Diagnostic accuracy of the Stricker Learning Span and Mayo Test Drive Composite for amnestic Mild Cognitive Impairment.
- Author
- Stricker, Nikki H., Twohy, Erin L, Albertson, Sabrina M., Christianson, Teresa J., Stricker, John L, Machulda, Mary M., Karstens, Aimee J, Patel, Jay S, Kremers, Walter K., Hassenstab, Jason J., Jack, Clifford R., Knopman, David S., Mielke, Michelle M., and Petersen, Ronald C.
- Abstract
Background: Remote assessment tools offer significant promise for aiding early detection of cognitive impairment. Mayo Test Drive (MTD): Test Development through Rapid Iteration, Validation and Expansion, is a web‐based platform for remote self‐administered assessment that includes a computer adaptive word list memory test (Stricker Learning Span; SLS) and a measure of processing speed (Symbols Test). We examined the diagnostic accuracy of the SLS and an MTD composite (SLS max span, SLS trials 1–5 total correct, SLS delay correct, [Symbols correct item response time*‐1]) for amnestic mild cognitive impairment (aMCI). We also explored diagnostic accuracy for a broader group that included individuals with possible MCI (pMCI; see Figure 1). Methods: Participants were recruited from the Mayo Clinic Study of Aging for this ancillary remote study. 226 were cognitively unimpaired (CU; concordant CU diagnosis by 3 independent raters). Fifty‐six participants had possible MCI (at least 1 of 3 raters indicated MCI) and 16 had a consensus diagnosis of aMCI. Primary outcome variables were SLS sum of trials, AVLT sum of trials, Symbols correct items response time, and MTD composite. Mean difference analyses used linear model ANOVAs (alpha = .05). Receiver operating characteristic (ROC) curves were applied; we derived optimal cutoff scores based on the Youden index method. Results: Both aMCI and possible MCI groups showed significantly lower performance than the CU group on SLS (Hedges' g aMCI = ‐1.72, pMCI = ‐1.04), Symbols (Hedges' g aMCI = 1.38, pMCI = 0.64), and MTD composite (Hedges' g aMCI = ‐1.89, pMCI = ‐1.09); see Tables 1 and 2 (all p's < .01 even when additionally covarying age, education, sex). Total area under the curve was high for differentiating CU vs. aMCI (MTD Composite = 0.91, SLS = 0.91) and acceptable for CU vs. pMCI (MTD Composite = 0.77, SLS = 0.77); see Table 2. Consistent with our flexible platform, a variety of device types were used to complete remote testing (32% smartphone, 12% tablet, 56% PC). Conclusions: MTD and the SLS show high diagnostic accuracy for aMCI. MTD is a flexible, brief, and easy‐to‐use remote cognitive assessment tool with great potential for scalable use in future studies seeking to maximize inclusion of individuals with aMCI in clinical trials or for cognitive screening in clinical settings. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
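Note on result 2: the abstract says cutoff scores were derived with the Youden index method. As a rough illustration only (not the authors' code or data), the sketch below picks the cutoff that maximizes J = sensitivity + specificity - 1. The function name `youden_cutoff`, the toy scores, and the convention that lower scores indicate impairment are all assumptions made for this example.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    Assumptions for this sketch: labels are 1 (impaired) or 0 (unimpaired),
    and a case is flagged as impaired when score <= cutoff.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best_cutoff, best_j = None, -1.0
    # Try every observed score as a candidate cutoff, lowest first.
    for c in sorted(set(scores)):
        sensitivity = sum(s <= c for s in pos) / len(pos)
        specificity = sum(s > c for s in neg) / len(neg)
        j = sensitivity + specificity - 1
        if j > best_j:  # strict '>' keeps the lowest cutoff on ties
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Hypothetical toy data: three impaired and three unimpaired test scores.
toy_scores = [10, 12, 15, 20, 22, 25]
toy_labels = [1, 1, 1, 0, 0, 0]
cutoff, j = youden_cutoff(toy_scores, toy_labels)
print(f"cutoff={cutoff}, J={j:.2f}")
```

With perfectly separated toy groups the best cutoff is the highest impaired score and J reaches 1; with overlapping groups J falls below 1 and several cutoffs may tie.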
3. Vom guten Recht des Teufels. Kasus, Tropus und die Macht der Sprache beim Stricker und im Erzählmotiv »The Devil and the Lawyer« (AT 1186; Mot M 215)
- Author
- Bleumer, Hartmut
- Published
- 2011
- Full Text
- View/download PDF
4. Liebe und Ehe : Lehrgedichte von dem Stricker
- Author
- Moelleken, Wolfgang Wilfried (editor)
- Published
- 2020
- Full Text
- View/download PDF
5. Composition-property extrapolation for compositionally complex solid solutions based on word embeddings
- Author
- Zhang, Lei, Banko, Lars, Schuhmann, Wolfgang, Ludwig, Alfred, and Stricker, Markus
- Subjects
Condensed Matter - Materials Science
- Abstract
Mastering the challenge of predicting properties of unknown materials with multiple principal elements (high entropy alloys/compositionally complex solid solutions) is crucial for the speedup in materials discovery. We show and discuss three models, using property data from two ternary systems (Ag-Pd-Ru; Ag-Pd-Pt), to predict material performance in the shared quaternary system (Ag-Pd-Pt-Ru). First, we apply Gaussian Process Regression (GPR) based on composition, which includes both Ag and Pd, achieving an initial correlation coefficient for the prediction (r) of 0.63 and a determination coefficient (r²) of 0.08. Second, we present a version of the GPR model using word embedding-derived materials vectors as representations. Using materials-specific embedding vectors significantly improves the predictive capability, evident from an improved r² of 0.65. The third model is based on a 'standard vector method' which synthesizes weighted vector representations of material properties, then creating a reference vector that results in a very good correlation with the quaternary system's material performance (resulting r of 0.89). Our approach demonstrates that existing experimental data combined with latent knowledge of word embedding-based representations of materials can be used effectively for materials discovery where data is typically sparse. Comment: 17 pages, 12 figures, pre-print
- Published
- 2024
6. SurgeoNet: Realtime 3D Pose Estimation of Articulated Surgical Instruments from Stereo Images using a Synthetically-trained Network
- Author
- Aboukhadra, Ahmed Tawfik, Robertini, Nadia, Malik, Jameel, Elhayek, Ahmed, Reis, Gerd, and Stricker, Didier
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Surgery monitoring in Mixed Reality (MR) environments has recently received substantial focus due to its importance in image-based decisions, skill assessment, and robot-assisted surgery. Tracking hands and articulated surgical instruments is crucial for the success of these applications. Due to the lack of annotated datasets and the complexity of the task, only a few works have addressed this problem. In this work, we present SurgeoNet, a real-time neural network pipeline to accurately detect and track surgical instruments from a stereo VR view. Our multi-stage approach is inspired by state-of-the-art neural-network architectural design, like YOLO and Transformers. We demonstrate the generalization capabilities of SurgeoNet in challenging real-world scenarios, achieved solely through training on synthetic data. The approach can be easily extended to any new set of articulated surgical instruments. SurgeoNet's code and data are publicly available.
- Published
- 2024
7. Classroom-Inspired Multi-Mentor Distillation with Adaptive Learning Strategies
- Author
- Sarode, Shalini, Khan, Muhammad Saif Ullah, Shehzadi, Tahira, Stricker, Didier, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition, I.2.6
- Abstract
We propose ClassroomKD, a novel multi-mentor knowledge distillation framework inspired by classroom environments to enhance knowledge transfer between student and multiple mentors. Unlike traditional methods that rely on fixed mentor-student relationships, our framework dynamically selects and adapts the teaching strategies of diverse mentors based on their effectiveness for each data sample. ClassroomKD comprises two main modules: the Knowledge Filtering (KF) Module and the Mentoring Module. The KF Module dynamically ranks mentors based on their performance for each input, activating only high-quality mentors to minimize error accumulation and prevent information loss. The Mentoring Module adjusts the distillation strategy by tuning each mentor's influence according to the performance gap between the student and mentors, effectively modulating the learning pace. Extensive experiments on image classification (CIFAR-100 and ImageNet) and 2D human pose estimation (COCO Keypoints and MPII Human Pose) demonstrate that ClassroomKD significantly outperforms existing knowledge distillation methods. Our results highlight that a dynamic and adaptive approach to mentor selection and guidance leads to more effective knowledge transfer, paving the way for enhanced model performance through distillation.
- Published
- 2024
8. Continual Human Pose Estimation for Incremental Integration of Keypoints and Pose Variations
- Author
- Khan, Muhammad Saif Ullah, Khan, Muhammad Ahmed Ullah, Afzal, Muhammad Zeshan, and Stricker, Didier
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
This paper reformulates cross-dataset human pose estimation as a continual learning task, aiming to integrate new keypoints and pose variations into existing models without losing accuracy on previously learned datasets. We benchmark this formulation against established regularization-based methods for mitigating catastrophic forgetting, including EWC, LFL, and LwF. Moreover, we propose a novel regularization method called Importance-Weighted Distillation (IWD), which enhances conventional LwF by introducing a layer-wise distillation penalty and dynamic temperature adjustment based on layer importance for previously learned knowledge. This allows for a controlled adaptation to new tasks that respects the stability-plasticity balance critical in continual learning. Through extensive experiments across three datasets, we demonstrate that our approach outperforms existing regularization-based continual learning strategies. IWD shows an average improvement of 3.60% over the state-of-the-art LwF method. The results highlight the potential of our method to serve as a robust framework for real-world applications where models must evolve with new data without forgetting past knowledge.
- Published
- 2024
9. Text2CAD: Generating Sequential CAD Models from Beginner-to-Expert Level Text Prompts
- Author
- Khan, Mohammad Sadil, Sinha, Sankalp, Sheikh, Talha Uddin, Stricker, Didier, Ali, Sk Aziz, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics
- Abstract
Prototyping complex computer-aided design (CAD) models in modern software can be very time-consuming. This is due to the lack of intelligent systems that can quickly generate simpler intermediate parts. We propose Text2CAD, the first AI framework for generating text-to-parametric CAD models using designer-friendly instructions for all skill levels. Furthermore, we introduce a data annotation pipeline for generating text prompts based on natural language instructions for the DeepCAD dataset using Mistral and LLaVA-NeXT. The dataset contains ~170K models and ~660K text annotations, from abstract CAD descriptions (e.g., generate two concentric cylinders) to detailed specifications (e.g., draw two circles with center (x, y) and radius r₁, r₂, and extrude along the normal by d...). Within the Text2CAD framework, we propose an end-to-end transformer-based auto-regressive network to generate parametric CAD models from input texts. We evaluate the performance of our model through a mixture of metrics, including visual quality, parametric precision, and geometrical accuracy. Our proposed framework shows great potential in AI-aided design applications. Our source code and annotations will be publicly available. Comment: Accepted in NeurIPS 2024 (Spotlight)
- Published
- 2024
10. BRep Boundary and Junction Detection for CAD Reverse Engineering
- Author
- Ali, Sk Aziz, Khan, Mohammad Sadil, and Stricker, Didier
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Computer Science - Multimedia
- Abstract
In the machining process, 3D reverse engineering of the mechanical system is an integral, highly important, and yet time-consuming step to obtain parametric CAD models from 3D scans. Therefore, deep learning-based Scan-to-CAD modeling can offer designers enormous editability to quickly modify a CAD model, being able to parse all its structural compositions and design steps. In this paper, we propose a supervised boundary representation (BRep) detection network, BRepDetNet, from 3D scans of the CC3D and ABC datasets. We have carefully annotated the 50K and 45K scans of the two datasets with appropriate topological relations (e.g., next, mate, previous) between the geometrical primitives (i.e., boundaries, junctions, loops, faces) of their BRep data structures. The proposed solution decomposes the Scan-to-CAD problem into Scan-to-BRep, ensuring the right step towards feature-based modeling and therefore leveraging other existing BRep-to-CAD modeling methods. Our proposed Scan-to-BRep neural network learns to detect BRep boundaries and junctions by minimizing focal-loss and non-maximal suppression (NMS) during training time. Experimental results show that our BRepDetNet with NMS-Loss achieves impressive results. Comment: 6 pages, 5 figures
- Published
- 2024
- Full Text
- View/download PDF
11. ShapeAug++: More Realistic Shape Augmentation for Event Data
- Author
- Bendig, Katharina, Schuster, René, and Stricker, Didier
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
The novel Dynamic Vision Sensors (DVSs) gained a great amount of attention recently as they are superior compared to RGB cameras in terms of latency, dynamic range and energy consumption. This is particularly of interest for autonomous applications since event cameras are able to alleviate motion blur and allow for night vision. One challenge in real-world autonomous settings is occlusion where foreground objects hinder the view on traffic participants in the background. The ShapeAug method addresses this problem by using simulated events resulting from objects moving on linear paths for event data augmentation. However, the shapes and movements lack complexity, making the simulation fail to resemble the behavior of objects in the real world. Therefore in this paper, we propose ShapeAug++, an extended version of ShapeAug which involves randomly generated polygons as well as curved movements. We show the superiority of our method on multiple DVS classification datasets, improving the top-1 accuracy by up to 3.7% compared to ShapeAug. Comment: accepted in Lecture Notes in Computer Science (LNCS)
- Published
- 2024
12. Continuous Associations between Remote Self-Administered Cognitive Measures and Imaging Biomarkers of Alzheimer’s Disease
- Author
- Boots, E. A., Frank, R. D., Fan, W. Z., Christianson, T. J., Kremers, W. K., Stricker, J. L., Machulda, M. M., Fields, J. A., Hassenstab, J., Graff-Radford, J., Vemuri, P., Jack, C. R., Knopman, D. S., Petersen, R. C., and Stricker, Nikki H.
- Published
- 2024
- Full Text
- View/download PDF
13. Stereotype threat, gender and mathematics attainment: A conceptual replication of Stricker & Ward.
- Author
- Inglis, Matthew and O'Hagan, Steven
- Subjects
Medicine, Science
- Abstract
Stereotype threat has been proposed as one cause of gender differences in post-compulsory mathematics participation. Danaher and Crandall argued, based on a study conducted by Stricker and Ward, that enquiring about a student's gender after they had finished a test, rather than before, would reduce stereotype threat and therefore increase the attainment of women students. Making such a change, they argued, could lead to nearly 5000 more women receiving AP Calculus AB credit per year. We conducted a preregistered conceptual replication of Stricker and Ward's study in the context of the UK Mathematics Trust's Junior Mathematical Challenge, finding no evidence of this stereotype threat effect. We conclude that the 'silver bullet' intervention of relocating demographic questions on test answer sheets is unlikely to provide an effective solution to systemic gender inequalities in mathematics education.
- Published
- 2022
- Full Text
- View/download PDF
14. GenFormer -- Generated Images are All You Need to Improve Robustness of Transformers on Small Datasets
- Author
- Oehri, Sven, Ebert, Nikolas, Abdullah, Ahmed, Stricker, Didier, and Wasenmüller, Oliver
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Recent studies showcase the competitive accuracy of Vision Transformers (ViTs) in relation to Convolutional Neural Networks (CNNs), along with their remarkable robustness. However, ViTs demand a large amount of data to achieve adequate performance, which makes their application to small datasets challenging, falling behind CNNs. To overcome this, we propose GenFormer, a data augmentation strategy utilizing generated images, thereby improving transformer accuracy and robustness on small-scale image classification tasks. In our comprehensive evaluation we propose Tiny ImageNetV2, -R, and -A as new test set variants of Tiny ImageNet by transferring established ImageNet generalization and robustness benchmarks to the small-scale data domain. Similarly, we introduce MedMNIST-C and EuroSAT-C as corrupted test set variants of established fine-grained datasets in the medical and aerial domain. Through a series of experiments conducted on small datasets of various domains, including Tiny ImageNet, CIFAR, EuroSAT and MedMNIST datasets, we demonstrate the synergistic power of our method, in particular when combined with common train and test time augmentations, knowledge distillation, and architectural design choices. Additionally, we prove the effectiveness of our approach under challenging conditions with limited training data, demonstrating significant improvements in both accuracy and robustness, bridging the gap between CNNs and ViTs in the small-scale dataset domain. Comment: This paper has been accepted at International Conference on Pattern Recognition (ICPR), 2024
- Published
- 2024
15. G3FA: Geometry-guided GAN for Face Animation
- Author
- Javanmardi, Alireza, Pagani, Alain, and Stricker, Didier
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Animating human face images aims to synthesize a desired source identity in a natural-looking way mimicking a driving video's facial movements. In this context, Generative Adversarial Networks have demonstrated remarkable potential in real-time face reenactment using a single source image, yet are constrained by limited geometry consistency compared to graphic-based approaches. In this paper, we introduce Geometry-guided GAN for Face Animation (G3FA) to tackle this limitation. Our novel approach empowers the face animation model to incorporate 3D information using only 2D images, improving the image generation capabilities of the talking head synthesis model. We integrate inverse rendering techniques to extract 3D facial geometry properties, improving the feedback loop to the generator through a weighted average ensemble of discriminators. In our face reenactment model, we leverage 2D motion warping to capture motion dynamics along with orthogonal ray sampling and volume rendering techniques to produce the ultimate visual output. To evaluate the performance of our G3FA, we conducted comprehensive experiments using various evaluation protocols on VoxCeleb2 and TalkingHead benchmarks to demonstrate the effectiveness of our proposed framework compared to the state-of-the-art real-time face animation methods. Comment: BMVC 2024, Accepted
- Published
- 2024
16. Semi-Supervised Object Detection: A Survey on Progress from CNN to Transformer
- Author
- Shehzadi, Tahira, Ifza, Stricker, Didier, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
The impressive advancements in semi-supervised learning have driven researchers to explore its potential in object detection tasks within the field of computer vision. Semi-Supervised Object Detection (SSOD) leverages a combination of a small labeled dataset and a larger, unlabeled dataset. This approach effectively reduces the dependence on large labeled datasets, which are often expensive and time-consuming to obtain. Initially, SSOD models encountered challenges in effectively leveraging unlabeled data and managing noise in generated pseudo-labels for unlabeled data. However, numerous recent advancements have addressed these issues, resulting in substantial improvements in SSOD performance. This paper presents a comprehensive review of 27 cutting-edge developments in SSOD methodologies, from Convolutional Neural Networks (CNNs) to Transformers. We delve into the core components of semi-supervised learning and its integration into object detection frameworks, covering data augmentation techniques, pseudo-labeling strategies, consistency regularization, and adversarial training methods. Furthermore, we conduct a comparative analysis of various SSOD models, evaluating their performance and architectural differences. We aim to ignite further research interest in overcoming existing challenges and exploring new directions in semi-supervised learning for object detection.
- Published
- 2024
17. CLEO: Continual Learning of Evolving Ontologies
- Author
- Muralidhara, Shishir, Bukhari, Saqib, Schneider, Georg, Stricker, Didier, and Schuster, René
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Continual learning (CL) addresses the problem of catastrophic forgetting in neural networks, which occurs when a trained model tends to overwrite previously learned information when presented with a new task. CL aims to instill the lifelong learning characteristic of humans in intelligent systems, making them capable of learning continuously while retaining what was already learned. Current CL problems involve either learning new domains (domain-incremental) or new and previously unseen classes (class-incremental). However, general learning processes are not just limited to learning information, but also refinement of existing information. In this paper, we define CLEO - Continual Learning of Evolving Ontologies, as a new incremental learning setting under CL to tackle evolving classes. CLEO is motivated by the need for intelligent systems to adapt to real-world ontologies that change over time, such as those in autonomous driving. We use Cityscapes, PASCAL VOC, and Mapillary Vistas to define the task settings and demonstrate the applicability of CLEO. We highlight the shortcomings of existing CIL methods in adapting to CLEO and propose a baseline solution, called Modelling Ontologies (MoOn). CLEO is a promising new approach to CL that addresses the challenge of evolving ontologies in real-world applications. MoOn surpasses previous CL approaches in the context of CLEO. Comment: Accepted to ECCV 2024
- Published
- 2024
18. EgoFlowNet: Non-Rigid Scene Flow from Point Clouds with Ego-Motion Support
- Author
- Battrawy, Ramy, Schuster, René, and Stricker, Didier
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Recent weakly-supervised methods for scene flow estimation from LiDAR point clouds are limited to explicit reasoning on object-level. These methods perform multiple iterative optimizations for each rigid object, which makes them vulnerable to clustering robustness. In this paper, we propose our EgoFlowNet - a point-level scene flow estimation network trained in a weakly-supervised manner and without object-based abstraction. Our approach predicts a binary segmentation mask that implicitly drives two parallel branches for ego-motion and scene flow. Unlike previous methods, we provide both branches with all input points and carefully integrate the binary mask into the feature extraction and losses. We also use a shared cost volume with local refinement that is updated at multiple scales without explicit clustering or rigidity assumptions. On realistic KITTI scenes, we show that our EgoFlowNet performs better than state-of-the-art methods in the presence of ground surface points. Comment: This paper is published in BMVC2023 (pp. 441-443)
- Published
- 2024
19. RMS-FlowNet++: Efficient and Robust Multi-Scale Scene Flow Estimation for Large-Scale Point Clouds
- Author
- Battrawy, Ramy, Schuster, René, and Stricker, Didier
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
The proposed RMS-FlowNet++ is a novel end-to-end learning-based architecture for accurate and efficient scene flow estimation that can operate on high-density point clouds. For hierarchical scene flow estimation, existing methods rely on expensive Farthest-Point-Sampling (FPS) to sample the scenes, must find large correspondence sets across the consecutive frames and/or must search for correspondences at a full input resolution. While this can improve the accuracy, it reduces the overall efficiency of these methods and limits their ability to handle large numbers of points due to memory requirements. In contrast to these methods, our architecture is based on an efficient design for hierarchical prediction of multi-scale scene flow. To this end, we develop a special flow embedding block that has two advantages over the current methods: First, a smaller correspondence set is used, and second, the use of Random-Sampling (RS) is possible. In addition, our architecture does not need to search for correspondences at a full input resolution. Exhibiting high accuracy, our RMS-FlowNet++ provides a faster prediction than state-of-the-art methods, avoids high memory requirements and enables efficient scene flow on dense point clouds of more than 250K points at once. Our comprehensive experiments verify the accuracy of RMS-FlowNet++ on the established FlyingThings3D data set with different point cloud densities and validate our design choices. Furthermore, we demonstrate that our model has a competitive ability to generalize to the real-world scenes of the KITTI data set without fine-tuning. Comment: This version of the article has been accepted by International Journal of Computer Vision (IJCV), and published on 23.05.2024
- Published
- 2024
- Full Text
- View/download PDF
20. Unlocking the Potential of Operations Research for Multi-Graph Matching
- Author
- Kahl, Max, Stricker, Sebastian, Hutschenreiter, Lisa, Bernard, Florian, and Savchynskyy, Bogdan
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
We consider the incomplete multi-graph matching problem, which is a generalization of the NP-hard quadratic assignment problem for matching multiple finite sets. Multi-graph matching plays a central role in computer vision, e.g., for matching images or shapes, so that a number of dedicated optimization techniques have been proposed. While the closely related NP-hard multi-dimensional assignment problem (MDAP) has been studied for decades in the operations research community, it only considers complete matchings and has a different cost structure. We bridge this gap and transfer well-known approximation algorithms for the MDAP to incomplete multi-graph matching. To this end, we revisit respective algorithms, adapt them to incomplete multi-graph matching, and propose their extended and parallelized versions. Our experimental validation shows that our new method substantially outperforms the previous state of the art in terms of objective and runtime. Our algorithm matches, for example, 29 images with more than 500 keypoints each in less than two minutes, whereas the fastest considered competitor requires at least half an hour while producing far worse results.
- Published
- 2024
21. Shape2.5D: A Dataset of Texture-less Surfaces for Depth and Normals Estimation
- Author
- Khan, Muhammad Saif Ullah, Sinha, Sankalp, Stricker, Didier, Liwicki, Marcus, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Reconstructing texture-less surfaces poses unique challenges in computer vision, primarily due to the lack of specialized datasets that cater to the nuanced needs of depth and normals estimation in the absence of textural information. We introduce "Shape2.5D," a novel, large-scale dataset designed to address this gap. Comprising 1.17 million frames spanning over 39,772 3D models and 48 unique objects, our dataset provides depth and surface normal maps for texture-less object reconstruction. The proposed dataset includes synthetic images rendered with 3D modeling software to simulate various lighting conditions and viewing angles. It also includes a real-world subset comprising 4,672 frames captured with a depth camera. Our comprehensive benchmarks demonstrate the dataset's ability to support the development of algorithms that robustly estimate depth and normals from RGB images and perform voxel reconstruction. Our open-source data generation pipeline allows the dataset to be extended and adapted for future research. The dataset is publicly available at https://github.com/saifkhichi96/Shape25D. Comment: Accepted for publication in IEEE Access
- Published
- 2024
- Full Text
- View/download PDF
22. Dislocation cartography: Representations and unsupervised classification of dislocation networks with unique fingerprints
- Author
- Udofia, Benjamin, Jogi, Tushar, and Stricker, Markus
- Subjects
Condensed Matter - Materials Science, Computer Science - Machine Learning
- Abstract
Detecting structure in data is the first step to arrive at meaningful representations for systems. This is particularly challenging for dislocation networks evolving as a consequence of plastic deformation of crystalline systems. Our study employs Isomap, a manifold learning technique, to unveil the intrinsic structure of high-dimensional density field data of dislocation structures from different compression axes. The resulting maps provide a systematic framework for quantitatively comparing dislocation structures, offering unique fingerprints based on density fields. Our novel, unbiased approach contributes to the quantitative classification of dislocation structures which can be systematically extended. Comment: 26 pages, 7 figures
- Published
- 2024
23. Enhanced Bank Check Security: Introducing a Novel Dataset and Transformer-Based Approach for Detection and Verification
- Author
- Khan, Muhammad Saif Ullah, Shehzadi, Tahira, Noor, Rabeya, Stricker, Didier, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Automated signature verification on bank checks is critical for fraud prevention and ensuring transaction authenticity. This task is challenging due to the coexistence of signatures with other textual and graphical elements on real-world documents. Verification systems must first detect the signature and then validate its authenticity, a dual challenge often overlooked by current datasets and methodologies focusing only on verification. To address this gap, we introduce a novel dataset specifically designed for signature verification on bank checks. This dataset includes a variety of signature styles embedded within typical check elements, providing a realistic testing ground for advanced detection methods. Moreover, we propose a novel approach for writer-independent signature verification using an object detection network. Our detection-based verification method treats genuine and forged signatures as distinct classes within an object detection framework, effectively handling both detection and verification. We employ a DINO-based network augmented with a dilation module to detect and verify signatures on check images simultaneously. Our approach achieves an AP of 99.2 for genuine and 99.4 for forged signatures, a significant improvement over the DINO baseline, which scored 93.1 and 89.3 for genuine and forged signatures, respectively. This improvement highlights our dilation module's effectiveness in reducing both false positives and negatives. Our results demonstrate substantial advancements in detection-based signature verification technology, offering enhanced security and efficiency in financial document processing. Comment: Accepted for publication in 16th IAPR International Workshop on Document Analysis Systems 2024
- Published
- 2024
24. Situational Instructions Database: Task Guidance in Dynamic Environments
- Author
- Khan, Muhammad Saif Ullah, Sinha, Sankalp, Stricker, Didier, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
The Situational Instructions Database (SID) addresses the need for enhanced situational awareness in artificial intelligence (AI) systems operating in dynamic environments. By integrating detailed scene graphs with dynamically generated, task-specific instructions, SID provides a novel dataset that allows AI systems to perform complex, real-world tasks with improved context sensitivity and operational accuracy. This dataset leverages advanced generative models to simulate a variety of realistic scenarios based on the 3D Semantic Scene Graphs (3DSSG) dataset, enriching it with scenario-specific information that details environmental interactions and tasks. SID facilitates the development of AI applications that can adapt to new and evolving conditions without extensive retraining, supporting research in autonomous technology and AI-driven decision-making processes. This dataset is instrumental in developing robust, context-aware AI agents capable of effectively navigating and responding to unpredictable settings. Available for research and development, SID serves as a critical resource for advancing the capabilities of intelligent systems in complex environments. Dataset available at https://github.com/mindgarage/situational-instructions-database., Comment: 9 pages, 6 figures
- Published
- 2024
25. UnSupDLA: Towards Unsupervised Document Layout Analysis
- Author
-
Sheikh, Talha Uddin, Shehzadi, Tahira, Hashmi, Khurram Azeem, Stricker, Didier, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Document layout analysis is a key area in document research, involving techniques like text mining and visual analysis. Despite various methods developed to tackle layout analysis, a critical but frequently overlooked problem is the scarcity of labeled data needed for analyses. With the rise of internet use, an overwhelming number of documents are now available online, making the process of accurately labeling them for research purposes increasingly challenging and labor-intensive. Moreover, the diversity of documents online presents a unique set of challenges in maintaining the quality and consistency of these labels, further complicating document layout analysis in the digital era. To address this, we employ a vision-based approach for analyzing document layouts designed to train a network without labels. Instead, we focus on pre-training, initially generating simple object masks from the unlabeled document images. These masks are then used to train a detector, enhancing object detection and segmentation performance. The model's effectiveness is further amplified through several unsupervised training iterations, continuously refining its performance. This approach significantly advances document layout analysis, particularly in precision and efficiency, without labels., Comment: ICDAR 2024 - Workshop
- Published
- 2024
26. Unifying atoms and colloids near the glass transition through bond-order topology
- Author
-
Stricker, Laura, Derlet, Peter M., Demirörs, Ahmet Faik, Vutukuri, Hanumantha Rao, and Vermant, Jan
- Subjects
Condensed Matter - Soft Condensed Matter ,82D30 - Abstract
In this combined experimental and simulation study, we utilize bond-order topology to quantitatively match particle volume fraction in mechanically uniformly compressed colloidal suspensions with temperature in atomistic simulations. The obtained mapping temperature is above the dynamical glass transition temperature, indicating that the colloidal systems examined are structurally most like simulated undercooled liquids. Furthermore, the structural mapping procedure offers a unifying framework for quantifying relaxation in arrested colloidal systems., Comment: Main: 6 pages, 3 figures. Supplementary Material: 10 pages, 14 figures
- Published
- 2024
27. Rolf Bergmann & Stefanie Stricker (Hg.). 2018. Namen und Wörter. Theoretische Grenzen – Übergänge im Sprachwandel (Germanistische Bibliothek 64). Heidelberg: Winter. 262 S.
- Author
-
Thöny Luzius
- Subjects
Germanic languages. Scandinavian languages ,PD1-7159 - Published
- 2019
28. Performative Ding-Bedeutung: Der Stricker und sein metaphysisches Dinge-Verständnis in seiner Kleinepik ('Von Edelsteinen', 'Der wunderbare Stein', 'Der Hahn und die Perle')
- Author
-
Silvan Wagner
- Subjects
kleinepik ,magie ,metaphysik ,materialität ,performanz ,weisheit ,mittelalter ,History (General) ,D1-2009 ,Medieval history ,D111-203 - Abstract
In seiner Kleinepik behandelt der Stricker die Macht von Preziosen in kritischer, dialektischer und unterhaltsamer Art und Weise. Bislang wurde in der Forschung dabei vor allem sein aufklärerischer, magiekritischer Impetus betont. Im hermeneutischen Vergleich seiner Kleinepik zeigt sich aber, dass der Stricker durchaus an einer magischen Dimension der Dinge festhält, die sich allerdings nur im Zusammenspiel der Physis der Dinge, der Erzählungen von ihren Mächten und der weisen Performanz mit ihnen entfaltet. | Within his short narratives, the Stricker tells about the power of precious things in a critical, dialectical, and – last but not least – entertaining manner. Research has hitherto emphasized his modern way of thinking whilst criticizing magical practices. But when comparing his short narratives in a hermeneutical way, one can see that the Stricker indeed affirms the magical dimensions of things. However, this magic only develops when the physique of things, the stories about their powers, and a wise performance with them come together.
- Published
- 2020
29. Estimating Human Poses Across Datasets: A Unified Skeleton and Multi-Teacher Distillation Approach
- Author
-
Khan, Muhammad Saif Ullah, Limbachiya, Dhavalkumar, Stricker, Didier, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Human pose estimation is a key task in computer vision with various applications such as activity recognition and interactive systems. However, the lack of consistency in the annotated skeletons across different datasets poses challenges in developing universally applicable models. To address this challenge, we propose a novel approach integrating multi-teacher knowledge distillation with a unified skeleton representation. Our networks are jointly trained on the COCO and MPII datasets, containing 17 and 16 keypoints, respectively. We demonstrate enhanced adaptability by predicting an extended set of 21 keypoints, 4 (COCO) and 5 (MPII) more than original annotations, improving cross-dataset generalization. Our joint models achieved an average accuracy of 70.89 and 76.40, compared to 53.79 and 55.78 when trained on a single dataset and evaluated on both. Moreover, we also evaluate all 21 predicted points by our two models by reporting an AP of 66.84 and 72.75 on the Halpe dataset. This highlights the potential of our technique to address one of the most pressing challenges in pose estimation research and application - the inconsistency in skeletal annotations., Comment: 15 pages (with references)
- Published
- 2024
30. End-to-End Semi-Supervised approach with Modulated Object Queries for Table Detection in Documents
- Author
-
Ehsan, Iqraa, Shehzadi, Tahira, Stricker, Didier, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Table detection, a pivotal task in document analysis, aims to precisely recognize and locate tables within document images. Although deep learning has shown remarkable progress in this realm, it typically requires an extensive dataset of labeled data for proficient training. Current CNN-based semi-supervised table detection approaches use the anchor generation process and Non-Maximum Suppression (NMS) in their detection process, limiting training efficiency. Meanwhile, transformer-based semi-supervised techniques adopted a one-to-one match strategy that provides noisy pseudo-labels, limiting overall efficiency. This study presents an innovative transformer-based semi-supervised table detector. It improves the quality of pseudo-labels through a novel matching strategy combining one-to-one and one-to-many assignment techniques. This approach significantly enhances training efficiency during the early stages, ensuring superior pseudo-labels for further training. Our semi-supervised approach is comprehensively evaluated on benchmark datasets, including PubLayNet, ICDAR-19, and TableBank. It achieves new state-of-the-art results, with a mAP of 95.7% and 97.9% on TableBank (word) and PubLayNet with 30% label data, marking 7.4 and 7.6 point improvements over the previous semi-supervised table detection approach, respectively. The results clearly show the superiority of our semi-supervised approach, surpassing all existing state-of-the-art methods by substantial margins. This research represents a significant advancement in semi-supervised table detection methods, offering a more efficient and accurate solution for practical document analysis tasks., Comment: ICDAR-IJDAR 2024
- Published
- 2024
31. CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification
- Author
-
Sinha, Sankalp, Khan, Muhammad Saif Ullah, Sheikh, Talha Uddin, Stricker, Didier, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Zero-shot learning has been extensively investigated in the broader field of visual recognition, attracting significant interest recently. However, the current work on zero-shot learning in document image classification remains scarce. The existing studies either focus exclusively on zero-shot inference, or their evaluation does not align with the established criteria of zero-shot evaluation in the visual recognition domain. We provide a comprehensive document image classification analysis in Zero-Shot Learning (ZSL) and Generalized Zero-Shot Learning (GZSL) settings to address this gap. Our methodology and evaluation align with the established practices of this domain. Additionally, we propose zero-shot splits for the RVL-CDIP dataset. Furthermore, we introduce CICA (pronounced 'ki-ka'), a framework that enhances the zero-shot learning capabilities of CLIP. CICA consists of a novel 'content module' designed to leverage any generic document-related textual information. The discriminative features extracted by this module are aligned with CLIP's text and image features using a novel 'coupled-contrastive' loss. Our module improves CLIP's ZSL top-1 accuracy by 6.7% and GZSL harmonic mean by 24% on the RVL-CDIP dataset. Our module is lightweight and adds only 3.3% more parameters to CLIP. Our work sets the direction for future research in zero-shot document classification., Comment: 18 Pages, 4 Figures and Accepted in ICDAR 2024
- Published
- 2024
32. Towards End-to-End Semi-Supervised Table Detection with Semantic Aligned Matching Transformer
- Author
-
Shehzadi, Tahira, Sarode, Shalini, Stricker, Didier, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Table detection within document images is a crucial task in document processing, involving the identification and localization of tables. Recent strides in deep learning have substantially improved the accuracy of this task, but it still heavily relies on large labeled datasets for effective training. Several semi-supervised approaches have emerged to overcome this challenge, often employing CNN-based detectors with anchor proposals and post-processing techniques like non-maximal suppression (NMS). However, recent advancements in the field have shifted the focus towards transformer-based techniques, eliminating the need for NMS and emphasizing object queries and attention mechanisms. Previous research has focused on two key areas to improve transformer-based detectors: refining the quality of object queries and optimizing attention mechanisms. However, increasing object queries can introduce redundancy, while adjustments to the attention mechanism can increase complexity. To address these challenges, we introduce a semi-supervised approach employing SAM-DETR, a novel approach for precise alignment between object queries and target features. Our approach demonstrates remarkable reductions in false positives and substantial enhancements in table detection performance, particularly in complex documents characterized by diverse table structures. This work provides more efficient and accurate table detection in semi-supervised settings., Comment: ICDAR 2024
- Published
- 2024
33. A Hybrid Approach for Document Layout Analysis in Document images
- Author
-
Shehzadi, Tahira, Stricker, Didier, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Document layout analysis involves understanding the arrangement of elements within a document. This paper navigates the complexities of understanding various elements within document images, such as text, images, tables, and headings. The approach employs an advanced Transformer-based object detection network as an innovative graphical page object detector for identifying tables, figures, and displayed elements. We introduce a query encoding mechanism to provide high-quality object queries for contrastive learning, enhancing efficiency in the decoder phase. We also present a hybrid matching scheme that integrates the decoder's original one-to-one matching strategy with the one-to-many matching strategy during the training phase. This approach aims to improve the model's accuracy and versatility in detecting various graphical elements on a page. Our experiments on PubLayNet, DocLayNet, and PubTables benchmarks show that our approach outperforms current state-of-the-art methods. It achieves an average precision of 97.3% on PubLayNet, 81.6% on DocLayNet, and 98.6% on PubTables, demonstrating its superior performance in layout analysis. These advancements not only enhance the conversion of document images into editable and accessible formats but also streamline information retrieval and data extraction processes., Comment: ICDAR 2024
- Published
- 2024
34. Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection
- Author
-
Shehzadi, Tahira, Hashmi, Khurram Azeem, Stricker, Didier, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
In this paper, we address the limitations of the DETR-based semi-supervised object detection (SSOD) framework, particularly focusing on the challenges posed by the quality of object queries. In DETR-based SSOD, the one-to-one assignment strategy provides inaccurate pseudo-labels, while the one-to-many assignment strategy leads to overlapping predictions. These issues compromise training efficiency and degrade model performance, especially in detecting small or occluded objects. We introduce Sparse Semi-DETR, a novel transformer-based, end-to-end semi-supervised object detection solution to overcome these challenges. Sparse Semi-DETR incorporates a Query Refinement Module to enhance the quality of object queries, significantly improving detection capabilities for small and partially obscured objects. Additionally, we integrate a Reliable Pseudo-Label Filtering Module that selectively filters high-quality pseudo-labels, thereby enhancing detection accuracy and consistency. On the MS-COCO and Pascal VOC object detection benchmarks, Sparse Semi-DETR achieves a significant improvement over current state-of-the-art methods, highlighting its effectiveness in semi-supervised object detection, particularly in challenging scenarios involving small or partially obscured objects., Comment: CVPR2024
- Published
- 2024
35. SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks
- Author
-
Xie, Yaxu, Pagani, Alain, and Stricker, Didier
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Robotics - Abstract
Scene graphs have been recently introduced into 3D spatial understanding as a comprehensive representation of the scene. The alignment between 3D scene graphs is the first step of many downstream tasks such as scene graph aided point cloud registration, mosaicking, overlap checking, and robot navigation. In this work, we treat 3D scene graph alignment as a partial graph-matching problem and propose to solve it with a graph neural network. We reuse the geometric features learned by a point cloud registration method and associate the clustered point-level geometric features with the node-level semantic feature via our designed feature fusion module. Partial matching is enabled by using a learnable method to select the top-k similar node pairs. Subsequent downstream tasks such as point cloud registration are achieved by running a pre-trained registration network within the matched regions. We further propose a point-matching rescoring method that uses the node-wise alignment of the 3D scene graph to reweight the matching candidates from a pre-trained point cloud registration method. It reduces the number of false point correspondences, especially in low-overlap cases. Experiments show that our method improves the alignment accuracy by 10-20% in low-overlap and random transformation scenarios and outperforms the existing work in multiple downstream tasks., Comment: 16 pages, 10 figures
- Published
- 2024
36. Employing constrained non-negative matrix factorization for microstructure segmentation
- Author
-
Chauniyal, Ashish, Thome, Pascal, and Stricker, Markus
- Subjects
Condensed Matter - Materials Science - Abstract
Materials characterization using electron backscatter diffraction (EBSD) requires indexing the orientation of the measured region from Kikuchi patterns. The quality of Kikuchi patterns can degrade due to pattern overlaps arising from two or more orientations, in the presence of defects or grain boundaries. In this work we employ constrained non-negative matrix factorization to segment a microstructure with small grain misorientations (< 1 degree) and predict the amount of pattern overlap. First, we implement the method on mixed simulated patterns that replicate a pattern overlap scenario, and demonstrate the resolution limit of pattern mixing, or factorization resolution, using a weight metric. Subsequently, we segment a single-crystal dendritic microstructure and compare the results with high resolution EBSD. By utilizing weight metrics across a low angle grain boundary we demonstrate how very small misorientations/low-angle grain boundaries can be resolved at a pixel level. Our approach constitutes a versatile and robust tool, complementing other fast indexing methods for microstructure characterization., Comment: 22 pages, 7 figures
- Published
- 2024
37. Human Pose Descriptions and Subject-Focused Attention for Improved Zero-Shot Transfer in Human-Centric Classification Tasks
- Author
-
Khan, Muhammad Saif Ullah, Naeem, Muhammad Ferjad, Tombari, Federico, Van Gool, Luc, Stricker, Didier, and Afzal, Muhammad Zeshan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
We present a novel LLM-based pipeline for creating contextual descriptions of human body poses in images using only auxiliary attributes. This approach facilitates the creation of the MPII Pose Descriptions dataset, which includes natural language annotations for 17,367 images containing people engaged in 410 distinct activities. We demonstrate the effectiveness of our pose descriptions in enabling zero-shot human-centric classification using CLIP. Moreover, we introduce the FocusCLIP framework, which incorporates Subject-Focused Attention (SFA) in CLIP for improved text-to-image alignment. Our models were pretrained on the MPII Pose Descriptions dataset and their zero-shot performance was evaluated on five unseen datasets covering three tasks. FocusCLIP outperformed the baseline CLIP model, achieving an average accuracy increase of 8.61% (33.65% compared to CLIP's 25.04%). Notably, our approach yielded improvements of 3.98% in activity recognition, 14.78% in age classification, and 7.06% in emotion recognition. These results highlight the potential of integrating detailed pose descriptions and subject-level guidance into general pretraining frameworks for enhanced performance in downstream tasks.
- Published
- 2024
38. MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding
- Author
-
Chang, Chun-Peng, Wang, Shaoxiang, Pagani, Alain, and Stricker, Didier
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
3D visual grounding involves matching natural language descriptions with their corresponding objects in 3D spaces. Existing methods often face challenges with accuracy in object recognition and struggle in interpreting complex linguistic queries, particularly with descriptions that involve multiple anchors or are view-dependent. In response, we present the MiKASA (Multi-Key-Anchor Scene-Aware) Transformer. Our novel end-to-end trained model integrates a self-attention-based scene-aware object encoder and an original multi-key-anchor technique, enhancing object recognition accuracy and the understanding of spatial relationships. Furthermore, MiKASA improves the explainability of decision-making, facilitating error diagnosis. Our model achieves the highest overall accuracy in the Referit3D challenge for both the Sr3D and Nr3D datasets, particularly excelling by a large margin in categories that require viewpoint-dependent descriptions.
- Published
- 2024
39. Continental-scale nutrient and contaminant delivery by Pacific salmon
- Author
-
Brandt, Jessica E., Wesner, Jeff S., Ruggerone, Gregory T., Jardine, Timothy D., Eagles-Smith, Collin A., Ruso, Gabrielle E., Stricker, Craig A., Voss, Kristofor A., and Walters, David M.
- Published
- 2024
40. A multi-center international study to evaluate the safety, functional and oncological outcomes of irreversible electroporation for the ablation of prostate cancer
- Author
-
Zhang, Kai, Stricker, Phillip, Löhr, Martin, Stehling, Michael, Suberville, Michel, Cussenot, Olivier, Lunelli, Luca, Ng, Chi-Fai, Teoh, Jeremy, Laguna, Pilar, and de la Rosette, Jean
- Published
- 2024
41. Associations of continuum beliefs with personality disorder stigma: correlational and experimental evidence
- Author
-
Stricker, Johannes, Jakob, Louisa, and Pietrowsky, Reinhard
- Published
- 2024
42. Chitchat as Interference: Adding User Backstories to Task-Oriented Dialogues
- Author
-
Stricker, Armand and Paroubek, Patrick
- Subjects
Computer Science - Computation and Language - Abstract
During task-oriented dialogues (TODs), human users naturally introduce chitchat that is beyond the immediate scope of the task, interfering with the flow of the conversation. To address this issue without the need for expensive manual data creation, we use few-shot prompting with Llama-2-70B to enhance the MultiWOZ dataset with user backstories, a typical example of chitchat interference in TODs. We assess the impact of this addition by testing two models: one trained solely on TODs and another trained on TODs with a preliminary chitchat interaction. Our analysis demonstrates that our enhanced dataset poses a challenge for these systems. Moreover, we demonstrate that our dataset can be effectively used for training purposes, enabling a system to consistently acknowledge the user's backstory while also successfully moving the task forward in the same turn, as confirmed by human evaluation. These findings highlight the benefits of generating novel chitchat-TOD scenarios to test TOD systems more thoroughly and improve their resilience to natural user interferences., Comment: Accepted @ LREC-COLING 2024
- Published
- 2024
43. Speech foundation models in healthcare: Effect of layer selection on pathological speech feature prediction
- Author
-
Wiepert, Daniela A., Utianski, Rene L., Duffy, Joseph R., Stricker, John L., Barnard, Leland R., Jones, David T., and Botha, Hugo
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing ,Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Accurately extracting clinical information from speech is critical to the diagnosis and treatment of many neurological conditions. As such, there is interest in leveraging AI for automatic, objective assessments of clinical speech to facilitate diagnosis and treatment of speech disorders. We explore transfer learning using foundation models, focusing on the impact of layer selection for the downstream task of predicting pathological speech features. We find that selecting an optimal layer can greatly improve performance (~15.8% increase in balanced accuracy per feature as compared to worst layer, ~13.6% increase as compared to final layer), though the best layer varies by predicted feature and does not always generalize well to unseen data. A learned weighted sum offers comparable performance to the average best layer in-distribution (only ~1.2% lower) and had strong generalization for out-of-distribution data (only 1.5% lower than the average best layer)., Comment: Accepted to INTERSPEECH 2024
- Published
- 2024
44. A Unified Approach to Emotion Detection and Task-Oriented Dialogue Modeling
- Author
-
Stricker, Armand and Paroubek, Patrick
- Subjects
Computer Science - Computation and Language - Abstract
In current text-based task-oriented dialogue (TOD) systems, user emotion detection (ED) is often overlooked or is typically treated as a separate and independent task, requiring additional training. In contrast, our work demonstrates that seamlessly unifying ED and TOD modeling brings about mutual benefits, and is therefore an alternative to be considered. Our method consists in augmenting SimpleToD, an end-to-end TOD system, by extending belief state tracking to include ED, relying on a single language model. We evaluate our approach using GPT-2 and Llama-2 on the EmoWOZ benchmark, a version of MultiWOZ annotated with emotions. Our results reveal a general increase in performance for ED and task results. Our findings also indicate that user emotions provide useful contextual conditioning for system responses, and can be leveraged to further refine responses in terms of empathy., Comment: Accepted @ IWSDS 2024
- Published
- 2024
45. ShapeAug: Occlusion Augmentation for Event Camera Data
- Author
-
Bendig, Katharina, Schuster, René, and Stricker, Didier
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Recently, Dynamic Vision Sensors (DVSs) have sparked a lot of interest due to their inherent advantages over conventional RGB cameras. These advantages include low latency, high dynamic range, and low energy consumption. Nevertheless, the processing of DVS data using Deep Learning (DL) methods remains a challenge, particularly since the availability of event training data is still limited. This leads to a need for event data augmentation techniques in order to improve accuracy as well as to avoid over-fitting on the training data. Another challenge, especially in real-world automotive applications, is occlusion, meaning one object blocks the view of the object behind it. In this paper, we present a novel event data augmentation approach, which addresses this problem by introducing synthetic events for randomly moving objects in a scene. We test our method on multiple DVS classification datasets, resulting in a relative improvement of up to 6.5% in top-1 accuracy. Moreover, we apply our augmentation technique on the real-world Gen1 Automotive Event Dataset for object detection, where we especially improve the detection of pedestrians by up to 5%., Comment: Accepted at ICPRAM 2024
- Published
- 2024
46. Rémy Stricker (1936-2019).
- Author
-
Piéjus, Anne
- Published
- 2020
47. Learned Fusion: 3D Object Detection using Calibration-Free Transformer Feature Fusion
- Author
-
Fürst, Michael, Jakkamsetty, Rahul, Schuster, René, and Stricker, Didier
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
The state of the art in 3D object detection using sensor fusion heavily relies on calibration quality, which is difficult to maintain in large scale deployment outside a lab environment. We present the first calibration-free approach for 3D object detection, thus eliminating the need for complex and costly calibration procedures. Our approach uses transformers to map the features between multiple views of different sensors at multiple abstraction levels. In an extensive evaluation for object detection, we not only show that our approach outperforms single modal setups by 14.1% in BEV mAP, but also that the transformer indeed learns mapping. By showing calibration is not necessary for sensor fusion, we hope to motivate other researchers to follow the direction of calibration-free fusion. Additionally, the resulting approaches have substantial resilience against rotation and translation changes., Comment: 11 pages, 5 figures
- Published
- 2023
48. A Relational Frame Theory-Based Intervention for Improving Reading and Mathematical Competencies Among School Children
- Author
-
Stricker, Charles, Mao, Jin, Cassidy, Sarah, Colbert, Dylan, and Roche, Bryan
- Published
- 2024
49. Delphi consensus project on prostate-specific membrane antigen (PSMA)–targeted surgery—outcomes from an international multidisciplinary panel
- Author
-
Berrens, Anne-Claire, Scheltema, Matthijs, Maurer, Tobias, Hermann, Ken, Hamdy, Freddie C., Knipper, Sophie, Dell’Oglio, Paolo, Mazzone, Elio, de Barros, Hilda A., Sorger, Jonathan M., van Oosterom, Matthias N., Stricker, Philip D., van Leeuwen, Pim J., Rietbergen, Daphne D. D., Valdes Olmos, Renato A., Vidal-Sicart, Sergi, Carroll, Peter R., Buckle, Tessa, van der Poel, Henk G., and van Leeuwen, Fijs W. B.
- Published
- 2024
50. To detach or not to detach the umbo in type I tympanoplasty: functional results
- Author
-
Lotto, Cecilia, Fink, Raffael, Stricker, Daniel, Fernandez, Ignacio J., Beckmann, Sven, Presutti, Livio, Caversaccio, Marco, Molinari, Giulia, and Anschuetz, Lukas
- Published
- 2024