Author: "Possegger, Horst" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Possegger, Horst"' showing total 165 results


1. GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models

Author: Mirza, M. Jehanzeb, Zhao, Mengjie, Mao, Zhuoyuan, Doveh, Sivan, Lin, Wei, Gavrikov, Paul, Dorkenwald, Michael, Yang, Shiqi, Jha, Saurav, Wakaki, Hiromi, Mitsufuji, Yuki, Possegger, Horst, Feris, Rogerio, Karlinsky, Leonid, and Glass, James
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this work, we propose a novel method (GLOV) enabling Large Language Models (LLMs) to act as implicit Optimizers for Vision-Language Models (VLMs) to enhance downstream vision tasks. Our GLOV meta-prompts an LLM with the downstream task description, querying it for suitable VLM prompts (e.g., for zero-shot classification with CLIP). These prompts are ranked according to a purity measure obtained through a fitness function. In each respective optimization step, the ranked prompts are fed as in-context examples (with their accuracies) to equip the LLM with the knowledge of the type of text prompts preferred by the downstream VLM. Furthermore, we also explicitly steer the LLM generation process in each optimization step by specifically adding an offset difference vector of the embeddings from the positive and negative solutions found by the LLM, in previous optimization steps, to the intermediate layer of the network for the next generation step. This offset vector steers the LLM generation toward the type of language preferred by the downstream VLM, resulting in enhanced performance on the downstream vision tasks. We comprehensively evaluate our GLOV on 16 diverse datasets using two families of VLMs, i.e., dual-encoder (e.g., CLIP) and encoder-decoder (e.g., LLaVa) models -- showing that the discovered solutions can enhance the recognition performance by up to 15.0% and 57.5% (3.8% and 21.6% on average) for these models., Comment: Code: https://github.com/jmiemirza/GLOV
Published: 2024

2. Efficient Motion Prediction: A Lightweight & Accurate Trajectory Prediction Model With Fast Training and Inference Speed

Author: Prutsch, Alexander, Bischof, Horst, and Possegger, Horst
Subjects: Computer Science - Robotics, Computer Science - Computer Vision and Pattern Recognition
Abstract: For efficient and safe autonomous driving, it is essential that autonomous vehicles can predict the motion of other traffic agents. While highly accurate, current motion prediction models often impose significant challenges in terms of training resource requirements and deployment on embedded hardware. We propose a new efficient motion prediction model, which achieves highly competitive benchmark results while training only a few hours on a single GPU. Due to our lightweight architectural choices and the focus on reducing the required training resources, our model can easily be applied to custom datasets. Furthermore, its low inference latency makes it particularly suitable for deployment in autonomous applications with limited computing resources., Comment: Accepted to IROS 2024
Published: 2024

3. Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection

Author: Fruhwirth-Reisinger, Christian, Lin, Wei, Malić, Dušan, Bischof, Horst, and Possegger, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Accurate 3D object detection in LiDAR point clouds is crucial for autonomous driving systems. To achieve state-of-the-art performance, the supervised training of detectors requires large amounts of human-annotated data, which is expensive to obtain and restricted to predefined object categories. To mitigate manual labeling efforts, recent unsupervised object detection approaches generate class-agnostic pseudo-labels for moving objects, subsequently serving as supervision signal to bootstrap a detector. Despite promising results, these approaches do not provide class labels or generalize well to static objects. Furthermore, they are mostly restricted to data containing multiple drives from the same scene or images from a precisely calibrated and synchronized camera setup. To overcome these limitations, we propose a vision-language-guided unsupervised 3D detection approach that operates exclusively on LiDAR point clouds. We transfer CLIP knowledge to classify point clusters of static and moving objects, which we discover by exploiting the inherent spatio-temporal information of LiDAR point clouds for clustering, tracking, as well as box and label refinement. Our approach outperforms state-of-the-art unsupervised 3D object detectors on the Waymo Open Dataset ($+23~\text{AP}_{3D}$) and Argoverse 2 ($+7.9~\text{AP}_{3D}$) and provides class labels not solely based on object size assumptions, marking a significant advancement in the field., Comment: Accepted to BMVC 2024
Published: 2024

4. Into the Fog: Evaluating Multiple Object Tracking Robustness

Author: Kirillova, Nadezda, Mirza, M. Jehanzeb, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: State-of-the-art (SOTA) trackers have shown remarkable Multiple Object Tracking (MOT) performance when trained and evaluated on current benchmarks. However, these benchmarks primarily consist of clear scenarios, overlooking adverse atmospheric conditions such as fog, haze, smoke and dust. As a result, the robustness of SOTA trackers remains underexplored. To address these limitations, we propose a pipeline for physics-based volumetric fog simulation in arbitrary real-world MOT datasets, utilizing frame-by-frame monocular depth estimation and a fog formation optical model. Moreover, we enhance our simulation by rendering both homogeneous and heterogeneous fog effects. We propose to use the dark channel prior method to estimate fog (smoke) color, which shows promising results even in night and indoor scenes. We present the leading tracking benchmark MOTChallenge (MOT17 dataset) overlaid by fog (smoke for indoor scenes) of various intensity levels and conduct a comprehensive evaluation of SOTA MOT methods, revealing their limitations under fog and fog-similar challenges.
Published: 2024

5. MULDE: Multiscale Log-Density Estimation via Denoising Score Matching for Video Anomaly Detection

Author: Micorek, Jakub, Possegger, Horst, Narnhofer, Dominik, Bischof, Horst, and Kozinski, Mateusz
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We propose a novel approach to video anomaly detection: we treat feature vectors extracted from videos as realizations of a random variable with a fixed distribution and model this distribution with a neural network. This lets us estimate the likelihood of test videos and detect video anomalies by thresholding the likelihood estimates. We train our video anomaly detector using a modification of denoising score matching, a method that injects training data with noise to facilitate modeling its distribution. To eliminate hyperparameter selection, we model the distribution of noisy video features across a range of noise levels and introduce a regularizer that tends to align the models for different levels of noise. At test time, we combine anomaly indications at multiple noise scales with a Gaussian mixture model. Running our video anomaly detector induces minimal delays as inference requires merely extracting the features and forward-propagating them through a shallow neural network and a Gaussian mixture model. Our experiments on five popular video anomaly detection benchmarks demonstrate state-of-the-art performance, both in the object-centric and in the frame-centric setup.
Published: 2024

6. Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs

Author: Mirza, M. Jehanzeb, Karlinsky, Leonid, Lin, Wei, Doveh, Sivan, Micorek, Jakub, Kozinski, Mateusz, Kuehne, Hilde, and Possegger, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Prompt ensembling of Large Language Model (LLM) generated category-specific prompts has emerged as an effective method to enhance zero-shot recognition ability of Vision-Language Models (VLMs). To obtain these category-specific prompts, the present methods rely on hand-crafting the prompts to the LLMs for generating VLM prompts for the downstream tasks. However, this requires manually composing these task-specific prompts and still, they might not cover the diverse set of visual concepts and task-specific styles associated with the categories of interest. To effectively take humans out of the loop and completely automate the prompt generation process for zero-shot recognition, we propose Meta-Prompting for Visual Recognition (MPVR). Taking as input only minimal information about the target task, in the form of its short natural language description, and a list of associated class labels, MPVR automatically produces a diverse set of category-specific prompts resulting in a strong zero-shot classifier. MPVR generalizes effectively across various popular zero-shot image recognition benchmarks belonging to widely different domains when tested with multiple LLMs and VLMs. For example, MPVR obtains a zero-shot recognition improvement over CLIP by up to 19.8% and 18.2% (5.0% and 4.5% on average over 20 datasets) leveraging GPT and Mixtral LLMs, respectively., Comment: ECCV Camera Ready. Code & Data: https://jmiemirza.github.io/Meta-Prompting/
Published: 2024

7. Identifying and Extracting Pedestrian Behavior in Critical Traffic Situations

Author: Schachner, Martin, Schneider, Bernd, Weissenbacher, Fabian, Kirillova, Nadezda, Possegger, Horst, Bischof, Horst, and Klug, Corina
Subjects: Computer Science - Robotics
Abstract: A better understanding of interactive pedestrian behavior in critical traffic situations is essential for the development of enhanced pedestrian safety systems. Real-world traffic observations play a decisive role in this, since they represent behavior in an unbiased way. In this work, we present an approach of how a subset of very considerable pedestrian-vehicle interactions can be derived from a camera-based observation system. For this purpose, we have examined road user trajectories automatically for establishing temporal and spatial relationships, using 110 hours of video recordings. In order to identify critical interactions, our approach combines the metric post-encroachment time with a newly introduced motion adaption metric. From more than 11,000 reconstructed pedestrian trajectories, 259 potential scenarios remained, using a post-encroachment time threshold of 2s. However, in 95% of cases, no adaptation of the pedestrian behavior was observed due to avoiding criticality. Applying the proposed motion adaption metric, only 21 critical scenarios remained. Manual investigations revealed that critical pedestrian vehicle interactions were present in 7 of those. They were further analyzed and made publicly available for developing pedestrian behavior models. The results indicate that critical interactions in which the pedestrian perceives and reacts to the vehicle at a relatively late stage can be extracted using the proposed method., Comment: 7 pages, 8 figures, ITSC 2023 accepted
Published: 2024

8. Meta-prompting for Automating Zero-Shot Visual Recognition with LLMs

Author: Mirza, M. Jehanzeb, Karlinsky, Leonid, Lin, Wei, Doveh, Sivan, Micorek, Jakub, Kozinski, Mateusz, Kuehne, Hilde, Possegger, Horst, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
Published: 2025

9. GACE: Geometry Aware Confidence Enhancement for Black-Box 3D Object Detectors on LiDAR-Data

Author: Schinagl, David, Krispel, Georg, Fruhwirth-Reisinger, Christian, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Widely-used LiDAR-based 3D object detectors often neglect fundamental geometric information readily available from the object proposals in their confidence estimation. This is mostly due to architectural design choices, which were often adopted from the 2D image domain, where geometric context is rarely available. In 3D, however, considering the object properties and its surroundings in a holistic way is important to distinguish between true and false positive detections, e.g. occluded pedestrians in a group. To address this, we present GACE, an intuitive and highly efficient method to improve the confidence estimation of a given black-box 3D object detector. We aggregate geometric cues of detections and their spatial relationships, which enables us to properly assess their plausibility and consequently, improve the confidence estimation. This leads to consistent performance gains over a variety of state-of-the-art detectors. Across all evaluated detectors, GACE proves to be especially beneficial for the vulnerable road user classes, i.e. pedestrians and cyclists., Comment: ICCV 2023, code is available at https://github.com/dschinagl/gace
Published: 2023

10. TAP: Targeted Prompting for Task Adaptive Generation of Textual Training Instances for Visual Classification

Author: Mirza, M. Jehanzeb, Karlinsky, Leonid, Lin, Wei, Possegger, Horst, Feris, Rogerio, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Vision and Language Models (VLMs), such as CLIP, have enabled visual recognition of a potentially unlimited set of categories described by text prompts. However, for the best visual recognition performance, these models still require tuning to better fit the data distributions of the downstream tasks, in order to overcome the domain shift from the web-based pre-training data. Recently, it has been shown that it is possible to effectively tune VLMs without any paired data, and in particular to effectively improve VLMs visual recognition performance using text-only training data generated by Large Language Models (LLMs). In this paper, we dive deeper into this exciting text-only VLM training approach and explore ways it can be significantly further improved taking the specifics of the downstream task into account when sampling text data from LLMs. In particular, compared to the SOTA text-only VLM training approach, we demonstrate up to 8.4% performance improvement in (cross) domain-specific adaptation, up to 8.7% improvement in fine-grained recognition, and 3.1% overall average improvement in zero-shot classification compared to strong baselines., Comment: Code is available at: https://github.com/jmiemirza/TAP
Published: 2023

11. Sit Back and Relax: Learning to Drive Incrementally in All Weather Conditions

Author: Leitner, Stefan, Mirza, M. Jehanzeb, Lin, Wei, Micorek, Jakub, Masana, Marc, Kozinski, Mateusz, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In autonomous driving scenarios, current object detection models show strong performance when tested in clear weather. However, their performance deteriorates significantly when tested in degrading weather conditions. In addition, even when adapted to perform robustly in a sequence of different weather conditions, they are often unable to perform well in all of them and suffer from catastrophic forgetting. To efficiently mitigate forgetting, we propose Domain-Incremental Learning through Activation Matching (DILAM), which employs unsupervised feature alignment to adapt only the affine parameters of a clear weather pre-trained network to different weather conditions. We propose to store these affine parameters as a memory bank for each weather condition and plug-in their weather-specific parameters during driving (i.e. test time) when the respective weather conditions are encountered. Our memory bank is extremely lightweight, since affine parameters account for less than 2% of a typical object detector. Furthermore, contrary to previous domain-incremental learning approaches, we do not require the weather label when testing and propose to automatically infer the weather condition by a majority voting linear classifier., Comment: Intelligent Vehicle Conference (oral presentation)
Published: 2023

12. LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections

Author: Mirza, M. Jehanzeb, Karlinsky, Leonid, Lin, Wei, Kozinski, Mateusz, Possegger, Horst, Feris, Rogerio, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Abstract: Recently, large-scale pre-trained Vision and Language (VL) models have set a new state-of-the-art (SOTA) in zero-shot visual classification enabling open-vocabulary recognition of a potentially unlimited set of categories defined as simple language prompts. However, despite these great advances, the performance of these zero-shot classifiers still falls short of the results of dedicated (closed category set) classifiers trained with supervised fine tuning. In this paper we show, for the first time, how to reduce this gap without any labels and without any paired VL data, using an unlabeled image collection and a set of texts auto-generated using a Large Language Model (LLM) describing the categories of interest and effectively substituting labeled visual instances of those categories. Using our label-free approach, we are able to attain significant performance improvements over the zero-shot performance of the base VL model and other contemporary methods and baselines on a wide variety of datasets, demonstrating absolute improvement of up to 11.7% (3.8% on average) in the label-free setting. Moreover, despite our approach being label-free, we observe 1.3% average gains over leading few-shot prompting baselines that do use 5-shot supervision., Comment: NeurIPS 2023 (Camera Ready) - Project Page: https://jmiemirza.github.io/LaFTer/
Published: 2023

13. MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge

Author: Lin, Wei, Karlinsky, Leonid, Shvetsova, Nina, Possegger, Horst, Kozinski, Mateusz, Panda, Rameswar, Feris, Rogerio, Kuehne, Hilde, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Large scale Vision-Language (VL) models have shown tremendous success in aligning representations between visual and text modalities. This enables remarkable progress in zero-shot recognition, image generation & editing, and many other exciting tasks. However, VL models tend to over-represent objects while paying much less attention to verbs, and require additional tuning on video data for best zero-shot action recognition performance. While previous work relied on large-scale, fully-annotated data, in this work we propose an unsupervised approach. We adapt a VL model for zero-shot and few-shot action recognition using a collection of unlabeled videos and an unpaired action dictionary. Based on that, we leverage Large Language Models and VL models to build a text bag for each unlabeled video via matching, text expansion and captioning. We use those bags in a Multiple Instance Learning setup to adapt an image-text backbone to video data. Although finetuned on unlabeled video data, our resulting models demonstrate high transferability to numerous unseen zero-shot downstream tasks, improving the base VL model performance by up to 14%, and even comparing favorably to fully-supervised baselines in both zero-shot and few-shot video recognition transfer. The code will be released later at https://github.com/wlin-at/MAXI., Comment: Accepted at ICCV 2023
Published: 2023

14. TAEC: Unsupervised Action Segmentation with Temporal-Aware Embedding and Clustering

Author: Lin, Wei, Kukleva, Anna, Possegger, Horst, Kuehne, Hilde, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Temporal action segmentation in untrimmed videos has gained increased attention recently. However, annotating action classes and frame-wise boundaries is extremely time consuming and cost intensive, especially on large-scale datasets. To address this issue, we propose an unsupervised approach for learning action classes from untrimmed video sequences. In particular, we propose a temporal embedding network that combines relative time prediction, feature reconstruction, and sequence-to-sequence learning, to preserve the spatial layout and sequential nature of the video features. A two-step clustering pipeline on these embedded feature representations then allows us to enforce temporal consistency within, as well as across videos. Based on the identified clusters, we decode the video into coherent temporal segments that correspond to semantically meaningful action classes. Our evaluation on three challenging datasets shows the impact of each component and, furthermore, demonstrates our state-of-the-art unsupervised action segmentation results., Comment: Computer Vision Winter Workshop 2023
Published: 2023

15. MAELi: Masked Autoencoder for Large-Scale LiDAR Point Clouds

Author: Krispel, Georg, Schinagl, David, Fruhwirth-Reisinger, Christian, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The sensing process of large-scale LiDAR point clouds inevitably causes large blind spots, i.e. regions not visible to the sensor. We demonstrate how these inherent sampling properties can be effectively utilized for self-supervised representation learning by designing a highly effective pre-training framework that considerably reduces the need for tedious 3D annotations to train state-of-the-art object detectors. Our Masked AutoEncoder for LiDAR point clouds (MAELi) intuitively leverages the sparsity of LiDAR point clouds in both the encoder and decoder during reconstruction. This results in more expressive and useful initialization, which can be directly applied to downstream perception tasks, such as 3D object detection or semantic segmentation for autonomous driving. In a novel reconstruction approach, MAELi distinguishes between empty and occluded space and employs a new masking strategy that targets the LiDAR's inherent spherical projection. Thereby, without any ground truth whatsoever and trained on single frames only, MAELi obtains an understanding of the underlying 3D scene geometry and semantics. To demonstrate the potential of MAELi, we pre-train backbones in an end-to-end manner and show the effectiveness of our unsupervised pre-trained weights on the tasks of 3D object detection and semantic segmentation., Comment: Accepted to WACV 2024, 16 pages
Published: 2022

16. Sparse Message Passing Network with Feature Integration for Online Multiple Object Tracking

Author: Wang, Bisheng, Possegger, Horst, Bischof, Horst, and Cao, Guo
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Existing Multiple Object Tracking (MOT) methods design complex architectures for better tracking performance. However, without a proper organization of input information, they still fail to perform tracking robustly and suffer from frequent identity switches. In this paper, we propose two novel methods together with a simple online Message Passing Network (MPN) to address these limitations. First, we explore different integration methods for the graph node and edge embeddings and put forward a new IoU (Intersection over Union) guided function, which improves long term tracking and handles identity switches. Second, we introduce a hierarchical sampling strategy to construct sparser graphs which allows to focus the training on more difficult samples. Experimental results demonstrate that a simple online MPN with these two contributions can perform better than many state-of-the-art methods. In addition, our association method generalizes well and can also improve the results of private detection based methods., Comment: 8 pages, 2 figures
Published: 2022

17. A Pedestrian Detection Case Study for a Traffic Light Controller

Author: Wendt, Alexander, Possegger, Horst, Bittner, Matthias, Schnöll, Daniel, Wess, Matthias, Malić, Dušan, Bischof, Horst, Jantsch, Axel, Pasricha, Sudeep, editor, and Shafique, Muhammad, editor
Published: 2024

18. Video Test-Time Adaptation for Action Recognition

Author: Lin, Wei, Mirza, Muhammad Jehanzeb, Kozinski, Mateusz, Possegger, Horst, Kuehne, Hilde, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Although action recognition systems can achieve top performance when evaluated on in-distribution test points, they are vulnerable to unanticipated distribution shifts in test data. However, test-time adaptation of video action recognition models against common distribution shifts has so far not been demonstrated. We propose to address this problem with an approach tailored to spatio-temporal models that is capable of adaptation on a single video sample at a step. It consists in a feature distribution alignment technique that aligns online estimates of test set statistics towards the training statistics. We further enforce prediction consistency over temporally augmented views of the same test video sample. Evaluations on three benchmark action recognition datasets show that our proposed technique is architecture-agnostic and able to significantly boost the performance on both, the state of the art convolutional architecture TANet and the Video Swin Transformer. Our proposed method demonstrates a substantial performance gain over existing test-time adaptation approaches in both evaluations of a single distribution shift and the challenging case of random distribution shifts. Code will be available at https://github.com/wlin-at/ViTTA., Comment: Accepted at CVPR 2023
Published: 2022

19. ActMAD: Activation Matching to Align Distributions for Test-Time-Training

Author: Mirza, Muhammad Jehanzeb, Soneira, Pol Jané, Lin, Wei, Kozinski, Mateusz, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Test-Time-Training (TTT) is an approach to cope with out-of-distribution (OOD) data by adapting a trained model to distribution shifts occurring at test-time. We propose to perform this adaptation via Activation Matching (ActMAD): We analyze activations of the model and align activation statistics of the OOD test data to those of the training data. In contrast to existing methods, which model the distribution of entire channels in the ultimate layer of the feature extractor, we model the distribution of each feature in multiple layers across the network. This results in a more fine-grained supervision and makes ActMAD attain state of the art performance on CIFAR-100C and Imagenet-C. ActMAD is also architecture- and task-agnostic, which lets us go beyond image classification, and score 15.4% improvement over previous approaches when evaluating a KITTI-trained object detector on KITTI-Fog. Our experiments highlight that ActMAD can be applied to online adaptation in realistic scenarios, requiring little data to attain its full performance., Comment: CVPR 2023 - Project Page: https://jmiemirza.github.io/ActMAD/
Published: 2022

20. MATE: Masked Autoencoders are Online 3D Test-Time Learners

Author: Mirza, M. Jehanzeb, Shin, Inkyu, Lin, Wei, Schriebl, Andreas, Sun, Kunyang, Choe, Jaesung, Possegger, Horst, Kozinski, Mateusz, Kweon, In So, Yoon, Kun-Jin, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Our MATE is the first Test-Time-Training (TTT) method designed for 3D data, which makes deep networks trained for point cloud classification robust to distribution shifts occurring in test data. Like existing TTT methods from the 2D image domain, MATE also leverages test data for adaptation. Its test-time objective is that of a Masked Autoencoder: a large portion of each test point cloud is removed before it is fed to the network, tasked with reconstructing the full point cloud. Once the network is updated, it is used to classify the point cloud. We test MATE on several 3D object classification datasets and show that it significantly improves robustness of deep networks to several types of corruptions commonly occurring in 3D point clouds. We show that MATE is very efficient in terms of the fraction of points it needs for the adaptation. It can effectively adapt given as few as 5% of tokens of each test sample, making it extremely lightweight. Our experiments show that MATE also achieves competitive performance by adapting sparsely on the test data, which further reduces its computational overhead, making it ideal for real-time applications., Comment: Code is available at this repository: https://github.com/jmiemirza/MATE
Published: 2022

21. Test-time adversarial detection and robustness for localizing humans using ultra wide band channel impulse responses

Author: Kolli, Abhiram, Mirza, Muhammad Jehanzeb, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Keyless entry systems in cars are adopting neural networks for localizing their operators. Using test-time adversarial defences equips such systems with the ability to defend against adversarial attacks without prior training on adversarial samples. We propose a test-time adversarial example detector which detects the input adversarial example through quantifying the localized intermediate responses of a pre-trained neural network and confidence scores of an auxiliary softmax layer. Furthermore, in order to make the network robust, we extenuate the non-relevant features by non-iterative input sample clipping. Using our approach, mean performance over 15 levels of adversarial perturbations is increased by 55.33% for the fast gradient sign method (FGSM) and 6.3% for both the basic iterative method (BIM) and the projected gradient method (PGD)., Comment: 5 pages, 4 figures, ICASSP Conference
Published: 2022

22. SAILOR: Scaling Anchors via Insights into Latent Object Representation

Author: Malić, Dušan, Fruhwirth-Reisinger, Christian, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: LiDAR 3D object detection models are inevitably biased towards their training dataset. The detector clearly exhibits this bias when employed on a target dataset, particularly towards object sizes. However, object sizes vary heavily between domains due to, for instance, different labeling policies or geographical locations. State-of-the-art unsupervised domain adaptation approaches outsource methods to overcome the object size bias. Mainstream size adaptation approaches exploit target domain statistics, contradicting the original unsupervised assumption. Our novel unsupervised anchor calibration method addresses this limitation. Given a model trained on the source data, we estimate the optimal target anchors in a completely unsupervised manner. The main idea stems from an intuitive observation: by varying the anchor sizes for the target domain, we inevitably introduce noise or even remove valuable object cues. The latent object representation, perturbed by the anchor size, is closest to the learned source features only under the optimal target anchors. We leverage this observation for anchor size optimization. Our experimental results show that, without any retraining, we achieve competitive results even compared to state-of-the-art weakly-supervised size adaptation approaches. In addition, our anchor calibration can be combined with such existing methods, making them completely unsupervised., Comment: WACV 2023; code is available at https://github.com/malicd/sailor
Published: 2022

23. An Efficient Domain-Incremental Learning Approach to Drive in All Weather Conditions

Author: Mirza, M. Jehanzeb, Masana, Marc, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Although deep neural networks enable impressive visual perception performance for autonomous driving, their robustness to varying weather conditions still requires attention. When adapting these models for changed environments, such as different weather conditions, they are prone to forgetting previously learned information. This catastrophic forgetting is typically addressed via incremental learning approaches which usually re-train the model by either keeping a memory bank of training samples or keeping a copy of the entire model or model parameters for each scenario. While these approaches show impressive results, they can be prone to scalability issues and their applicability for autonomous driving in all weather conditions has not been shown. In this paper we propose DISC -- Domain Incremental through Statistical Correction -- a simple online zero-forgetting approach which can incrementally learn new tasks (i.e., weather conditions) without requiring re-training or expensive memory banks. The only information we store for each task is the set of statistical parameters, as we categorize each domain by the change in first and second order statistics. Thus, as each task arrives, we simply 'plug and play' the statistical vectors for the corresponding task into the model and it immediately starts to perform well on that task. We show the efficacy of our approach by testing it for object detection in a challenging domain-incremental autonomous driving scenario where we encounter different adverse weather conditions, such as heavy rain, fog, and snow., Comment: Accepted to CVPR Workshops - Camera Ready Version
Published: 2022

24. OccAM's Laser: Occlusion-based Attribution Maps for 3D Object Detectors on LiDAR Data

Author: Schinagl, David, Krispel, Georg, Possegger, Horst, Roth, Peter M., and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: While 3D object detection in LiDAR point clouds is well-established in academia and industry, the explainability of these models is a largely unexplored field. In this paper, we propose a method to generate attribution maps for the detected objects in order to better understand the behavior of such models. These maps indicate the importance of each 3D point in predicting the specific objects. Our method works with black-box models: We do not require any prior knowledge of the architecture nor access to the model's internals, like parameters, activations or gradients. Our efficient perturbation-based approach empirically estimates the importance of each point by testing the model with randomly generated subsets of the input point cloud. Our sub-sampling strategy takes into account the special characteristics of LiDAR data, such as the depth-dependent point density. We show a detailed evaluation of the attribution maps and demonstrate that they are interpretable and highly informative. Furthermore, we compare the attribution maps of recent 3D object detection architectures to provide insights into their decision-making processes., Comment: CVPR 2022, code is available at https://github.com/dschinagl/occam
Published: 2022

25. CycDA: Unsupervised Cycle Domain Adaptation from Image to Video

Author: Lin, Wei, Kukleva, Anna, Sun, Kunyang, Possegger, Horst, Kuehne, Hilde, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Although action recognition has achieved impressive results over recent years, both collection and annotation of video training data are still time-consuming and cost intensive. Therefore, image-to-video adaptation has been proposed to exploit labeling-free web image source for adapting on unlabeled target videos. This poses two major challenges: (1) spatial domain shift between web images and video frames; (2) modality gap between image and video data. To address these challenges, we propose Cycle Domain Adaptation (CycDA), a cycle-based approach for unsupervised image-to-video domain adaptation by leveraging the joint spatial information in images and videos on the one hand and, on the other hand, training an independent spatio-temporal model to bridge the modality gap. We alternate between the spatial and spatio-temporal learning with knowledge transfer between the two in each cycle. We evaluate our approach on benchmark datasets for image-to-video as well as for mixed-source domain adaptation achieving state-of-the-art results and demonstrating the benefits of our cyclic adaptation. Code is available at https://github.com/wlin-at/CycDA., Comment: Accepted at ECCV2022. Supplementary included
Published: 2022

26. 3D Human Pose Estimation Using Möbius Graph Convolutional Networks

Author: Azizi, Niloofar, Possegger, Horst, Rodolà, Emanuele, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: 3D human pose estimation is fundamental to understanding human behavior. Recently, promising results have been achieved by graph convolutional networks (GCNs), which achieve state-of-the-art performance and provide rather light-weight architectures. However, a major limitation of GCNs is their inability to encode all the transformations between joints explicitly. To address this issue, we propose a novel spectral GCN using the Möbius transformation (MöbiusGCN). In particular, this allows us to directly and explicitly encode the transformation between joints, resulting in a significantly more compact representation. Compared to even the lightest architectures so far, our novel approach requires 90-98% fewer parameters, i.e. our lightest MöbiusGCN uses only 0.042M trainable parameters. Besides the drastic parameter reduction, explicitly encoding the transformation of joints also enables us to achieve state-of-the-art results. We evaluate our approach on the two challenging pose estimation benchmarks, Human3.6M and MPI-INF-3DHP, demonstrating both state-of-the-art results and the generalization capabilities of MöbiusGCN.
Published: 2022

27. The Norm Must Go On: Dynamic Unsupervised Domain Adaptation by Normalization

Author: Mirza, M. Jehanzeb, Micorek, Jakub, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Domain adaptation is crucial to adapt a learned model to new scenarios, such as domain shifts or changing data distributions. Current approaches usually require a large amount of labeled or unlabeled data from the shifted domain. This can be a hurdle in fields which require continuous dynamic adaptation or suffer from scarcity of data, e.g. autonomous driving in challenging weather conditions. To address this problem of continuous adaptation to distribution shifts, we propose Dynamic Unsupervised Adaptation (DUA). By continuously adapting the statistics of the batch normalization layers we modify the feature representations of the model. We show that by sequentially adapting a model with only a fraction of unlabeled data, a strong performance gain can be achieved. With even less than 1% of unlabeled data from the target domain, DUA already achieves competitive results to strong baselines. In addition, the computational overhead is minimal in contrast to previous approaches. Our approach is simple, yet effective and can be applied to any architecture which uses batch normalization as one of its components. We show the utility of DUA by evaluating it on a variety of domain adaptation datasets and tasks including object recognition, digit recognition and object detection., Comment: Accepted to CVPR 2022 - Camera Ready Version - Code: https://github.com/jmiemirza/DUA
Published: 2021

28. FAST3D: Flow-Aware Self-Training for 3D Object Detectors

Author: Fruhwirth-Reisinger, Christian, Opitz, Michael, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Abstract: In the field of autonomous driving, self-training is widely applied to mitigate distribution shifts in LiDAR-based 3D object detectors. This eliminates the need for expensive, high-quality labels whenever the environment changes (e.g., geographic location, sensor setup, weather condition). State-of-the-art self-training approaches, however, mostly ignore the temporal nature of autonomous driving data. To address this issue, we propose a flow-aware self-training method that enables unsupervised domain adaptation for 3D object detectors on continuous LiDAR point clouds. In order to get reliable pseudo-labels, we leverage scene flow to propagate detections through time. In particular, we introduce a flow-based multi-target tracker, that exploits flow consistency to filter and refine resulting tracks. The emerged precise pseudo-labels then serve as a basis for model re-training. Starting with a pre-trained KITTI model, we conduct experiments on the challenging Waymo Open Dataset to demonstrate the effectiveness of our approach. Without any prior target domain knowledge, our results show a significant improvement over the state-of-the-art., Comment: Accepted to BMVC 2021
Published: 2021

29. FuseSeg: LiDAR Point Cloud Segmentation Fusing Multi-Modal Data

Author: Krispel, Georg, Opitz, Michael, Waltner, Georg, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics, I.4.6, I.4.8, I.2.9, I.2.10
Abstract: We introduce a simple yet effective fusion method of LiDAR and RGB data to segment LiDAR point clouds. Utilizing the dense native range representation of a LiDAR sensor and the setup calibration, we establish point correspondences between the two input modalities. Subsequently, we are able to warp and fuse the features from one domain into the other. Therefore, we can jointly exploit information from both data sources within one single network. To show the merit of our method, we extend SqueezeSeg, a point cloud segmentation network, with an RGB feature branch and fuse it into the original structure. Our extension called FuseSeg leads to an improvement of up to 18% IoU on the KITTI benchmark. In addition to the improved accuracy, we also achieve real-time performance at 50 fps, five times as fast as the KITTI LiDAR data recording speed., Comment: Accepted for publication in WACV 2020
Published: 2019

30. CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video

Author: Lin, Wei, Kukleva, Anna, Sun, Kunyang, Possegger, Horst, Kuehne, Hilde, Bischof, Horst, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
Published: 2022
Full Text: View/download PDF

31. An Intelligent Scanning Vehicle for Waste Collection Monitoring

Author: Waltner, Georg, Jaschik, Malte, Rinnhofer, Alfred, Possegger, Horst, Bischof, Horst, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Sclaroff, Stan, editor, Distante, Cosimo, editor, Leo, Marco, editor, Farinella, Giovanni M., editor, and Tombari, Federico, editor
Published: 2022
Full Text: View/download PDF

32. Deep 2.5D Vehicle Classification with Sparse SfM Depth Prior for Automated Toll Systems

Author: Waltner, Georg, Maurer, Michael, Holzmann, Thomas, Ruprecht, Patrick, Opitz, Michael, Possegger, Horst, Fraundorfer, Friedrich, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Automated toll systems rely on proper classification of the passing vehicles. This is especially difficult when the images used for classification only cover parts of the vehicle. To obtain information about the whole vehicle, we reconstruct the vehicle as a 3D object and exploit this additional information within a Convolutional Neural Network (CNN). However, when using deep networks for 3D object classification, large amounts of dense 3D models are required for good accuracy, which are often neither available nor feasible to process due to memory requirements. Therefore, in our method we reproject the 3D object onto the image plane using the reconstructed points, lines or both. We utilize this sparse depth prior within an auxiliary network branch that acts as a regularizer during training. We show that this auxiliary regularizer helps to improve accuracy compared to 2D classification on a real-world dataset. Furthermore, due to the design of the network, at test time only the 2D camera images are required for classification, which enables the usage in portable computer vision systems., Comment: Submitted to the IEEE International Conference on Intelligent Transportation Systems 2018 (ITSC), 6 pages, 4 figures; changed format in compliance with adapted IEEE template
Published: 2018

33. Deep Metric Learning with BIER: Boosting Independent Embeddings Robustly

Author: Opitz, Michael, Waltner, Georg, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Learning similarity functions between image pairs with deep neural networks yields highly correlated activations of embeddings. In this work, we show how to improve the robustness of such embeddings by exploiting the independence within ensembles. To this end, we divide the last embedding layer of a deep network into an embedding ensemble and formulate training this ensemble as an online gradient boosting problem. Each learner receives a reweighted training sample from the previous learners. Further, we propose two loss functions which increase the diversity in our ensemble. These loss functions can be applied either for weight initialization or during training. Together, our contributions leverage large embedding sizes more effectively by significantly reducing correlation of the embedding and consequently increase retrieval accuracy of the embedding. Our method works with any differentiable loss function and does not introduce any additional parameters during test time. We evaluate our metric learning method on image retrieval tasks and show that it improves over state-of-the-art methods on the CUB 200-2011, Cars-196, Stanford Online Products, In-Shop Clothes Retrieval and VehicleID datasets., Comment: Extension to our paper BIER: Boosting Independent Embeddings Robustly (ICCV 2017 oral) - submitted to PAMI
Published: 2018

34. Robust Localization of Key Fob Using Channel Impulse Response of Ultra Wide Band Sensors for Keyless Entry Systems

Author: Kolli, Abhiram, primary, Casamassima, Filippo, additional, Possegger, Horst, additional, and Bischof, Horst, additional
Published: 2024
Full Text: View/download PDF

35. Grid Loss: Detecting Occluded Faces

Author: Opitz, Michael, Waltner, Georg, Poier, Georg, Possegger, Horst, and Bischof, Horst
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Detection of partially occluded objects is a challenging computer vision problem. Standard Convolutional Neural Network (CNN) detectors fail if parts of the detection window are occluded, since not every sub-part of the window is discriminative on its own. To address this issue, we propose a novel loss layer for CNNs, named grid loss, which minimizes the error rate on sub-blocks of a convolution layer independently rather than over the whole feature map. This results in parts being more discriminative on their own, enabling the detector to recover if the detection window is partially occluded. By mapping our loss layer back to a regular fully connected layer, no additional computational cost is incurred at runtime compared to standard CNNs. We demonstrate our method for face detection on several public face detection benchmarks and show that our method outperforms regular CNNs, is suitable for realtime applications and achieves state-of-the-art performance., Comment: accepted to ECCV 2016
Published: 2016

36. 3D Human Pose Estimation Using Möbius Graph Convolutional Networks

Author: Azizi, Niloofar, primary, Possegger, Horst, additional, Rodolà, Emanuele, additional, and Bischof, Horst, additional
Published: 2022
Full Text: View/download PDF

37. An Intelligent Scanning Vehicle for Waste Collection Monitoring

Author: Waltner, Georg, primary, Jaschik, Malte, additional, Rinnhofer, Alfred, additional, Possegger, Horst, additional, and Bischof, Horst, additional
Published: 2022
Full Text: View/download PDF

38. Robust Localization of Key Fob Using Channel Impulse Response of Ultra Wide Band Sensors for Keyless Entry Systems

Author: Kolli, Abhiram, Casamassima, Filippo, Possegger, Horst, and Bischof, Horst
Abstract: Using neural networks to localize a key fob within and around a car as a security feature for keyless entry is fast emerging. In this paper we: 1) study the performance of pre-computed features of neural-network-based UWB (ultra wide band) localization classification, which forms the baseline of our experiments; 2) investigate the inherent robustness of various neural networks, and therefore include a study of robustness to adversarial examples without any adversarial training; 3) propose a multi-head self-supervised neural network architecture which outperforms the baseline neural networks without any adversarial training. The model's performance improved by 67% at certain ranges of adversarial magnitude for fast gradient sign method and 37% each for basic iterative method and projected gradient descent method.
Published: 2024

39. The Sixth Visual Object Tracking VOT2018 Challenge Results

Author: Kristan, Matej, Leonardis, Aleš, Matas, Jiří, Felsberg, Michael, Pflugfelder, Roman, Zajc, Luka Čehovin, Vojíř, Tomáš, Bhat, Goutam, Lukežič, Alan, Eldesokey, Abdelrahman, Fernández, Gustavo, García-Martín, Álvaro, Iglesias-Arias, Álvaro, Alatan, A. Aydin, González-García, Abel, Petrosino, Alfredo, Memarmoghadam, Alireza, Vedaldi, Andrea, Muhič, Andrej, He, Anfeng, Smeulders, Arnold, Perera, Asanka G., Li, Bo, Chen, Boyu, Kim, Changick, Xu, Changsheng, Xiong, Changzhen, Tian, Cheng, Luo, Chong, Sun, Chong, Hao, Cong, Kim, Daijin, Mishra, Deepak, Chen, Deming, Wang, Dong, Wee, Dongyoon, Gavves, Efstratios, Gundogdu, Erhan, Velasco-Salido, Erik, Khan, Fahad Shahbaz, Yang, Fan, Zhao, Fei, Li, Feng, Battistone, Francesco, De Ath, George, Subrahmanyam, Gorthi R. K. S., Bastos, Guilherme, Ling, Haibin, Galoogahi, Hamed Kiani, Lee, Hankyeol, Li, Haojie, Zhao, Haojie, Fan, Heng, Zhang, Honggang, Possegger, Horst, Li, Houqiang, Lu, Huchuan, Zhi, Hui, Li, Huiyun, Lee, Hyemin, Chang, Hyung Jin, Drummond, Isabela, Valmadre, Jack, Martin, Jaime Spencer, Chahl, Javaan, Choi, Jin Young, Li, Jing, Wang, Jinqiao, Qi, Jinqing, Sung, Jinyoung, Johnander, Joakim, Henriques, Joao, Choi, Jongwon, van de Weijer, Joost, Herranz, Jorge Rodríguez, Martínez, José M., Kittler, Josef, Zhuang, Junfei, Gao, Junyu, Grm, Klemen, Zhang, Lichao, Wang, Lijun, Yang, Lingxiao, Rout, Litu, Si, Liu, Bertinetto, Luca, Chu, Lutao, Che, Manqiang, Maresca, Mario Edoardo, Danelljan, Martin, Yang, Ming-Hsuan, Abdelpakey, Mohamed, Shehata, Mohamed, Kang, Myunggu, Lee, Namhoon, Wang, Ning, Miksik, Ondrej, Moallem, P., Vicente-Moñivar, Pablo, Senna, Pedro, Li, Peixia, Torr, Philip, Raju, Priya Mariam, Ruihe, Qian, Wang, Qiang, Zhou, Qin, Guo, Qing, Martín-Nieto, Rafael, Gorthi, Rama Krishna, Tao, Ran, Bowden, Richard, Everson, Richard, Wang, Runling, Yun, Sangdoo, Choi, Seokeon, Vivas, Sergio, Bai, Shuai, Huang, Shuangping, Wu, Sihang, Hadfield, Simon, Wang, Siwen, Golodetz, Stuart, Ming, Tang, Xu, Tianyang, Zhang, Tianzhu, Fischer, Tobias, Santopietro, Vincenzo, Štruc, Vitomir, Wei, Wang, Zuo, Wangmeng, Feng, Wei, Wu, Wei, Zou, Wei, Hu, Weiming, Zhou, Wengang, Zeng, Wenjun, Zhang, Xiaofan, Wu, Xiaohe, Wu, Xiao-Jun, Tian, Xinmei, Li, Yan, Lu, Yan, Law, Yee Wei, Wu, Yi, Demiris, Yiannis, Yang, Yicai, Jiao, Yifan, Li, Yuhong, Zhang, Yunhua, Sun, Yuxuan, Zhang, Zheng, Zhu, Zheng, Feng, Zhen-Hua, Wang, Zhihui, He, Zhiqun, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Leal-Taixé, Laura, editor, and Roth, Stefan, editor
Published: 2019
Full Text: View/download PDF

40. MAELi: Masked Autoencoder for Large-Scale LiDAR Point Clouds

Author: Krispel, Georg, primary, Schinagl, David, additional, Fruhwirth-Reisinger, Christian, additional, Possegger, Horst, additional, and Bischof, Horst, additional
Published: 2024
Full Text: View/download PDF

41. Towards Large-Scale Video-Based Highway Traffic Monitoring

Author: Spoecklberger, Johannes, primary, Micorek, Jakub, additional, Possegger, Horst, additional, and Bischof, Horst, additional
Published: 2023
Full Text: View/download PDF

42. Identifying and Extracting Pedestrian Behavior in Critical Traffic Situations

Author: Schachner, Martin, primary, Schneider, Bernd, additional, Weissenbacher, Fabian, additional, Kirillova, Nadezda, additional, Possegger, Horst, additional, Bischof, Horst, additional, and Klug, Corina, additional
Published: 2023
Full Text: View/download PDF

43. Test-Time Adversarial Detection and Robustness for Localizing Humans Using Ultra Wide Band Channel Impulse Responses

Author: Kolli, Abhiram, primary, Mirza, Muhammad Jehanzeb, additional, Possegger, Horst, additional, and Bischof, Horst, additional
Published: 2023
Full Text: View/download PDF

44. Efficient Model Averaging for Deep Neural Networks

Author: Opitz, Michael, Possegger, Horst, Bischof, Horst, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Lai, Shang-Hong, editor, Lepetit, Vincent, editor, Nishino, Ko, editor, and Sato, Yoichi, editor
Published: 2017
Full Text: View/download PDF

45. Video Test-Time Adaptation for Action Recognition

Author: Lin, Wei, primary, Mirza, Muhammad Jehanzeb, additional, Kozinski, Mateusz, additional, Possegger, Horst, additional, Kuehne, Hilde, additional, and Bischof, Horst, additional
Published: 2023
Full Text: View/download PDF

46. ActMAD: Activation Matching to Align Distributions for Test-Time-Training

Author: Mirza, M. Jehanzeb, primary, Soneira, Pol Jané, additional, Lin, Wei, additional, Kozinski, Mateusz, additional, Possegger, Horst, additional, and Bischof, Horst, additional
Published: 2023
Full Text: View/download PDF

47. The Visual Object Tracking VOT2016 Challenge Results

Author: Kristan, Matej, Leonardis, Aleš, Matas, Jiři, Felsberg, Michael, Pflugfelder, Roman, Čehovin, Luka, Vojíř, Tomáš, Häger, Gustav, Lukežič, Alan, Fernández, Gustavo, Gupta, Abhinav, Petrosino, Alfredo, Memarmoghadam, Alireza, Garcia-Martin, Alvaro, Solís Montero, Andrés, Vedaldi, Andrea, Robinson, Andreas, Ma, Andy J., Varfolomieiev, Anton, Alatan, Aydin, Erdem, Aykut, Ghanem, Bernard, Liu, Bin, Han, Bohyung, Martinez, Brais, Chang, Chang-Ming, Xu, Changsheng, Sun, Chong, Kim, Daijin, Chen, Dapeng, Du, Dawei, Mishra, Deepak, Yeung, Dit-Yan, Gundogdu, Erhan, Erdem, Erkut, Khan, Fahad, Porikli, Fatih, Zhao, Fei, Bunyak, Filiz, Battistone, Francesco, Zhu, Gao, Roffo, Giorgio, Subrahmanyam, Gorthi R. K. Sai, Bastos, Guilherme, Seetharaman, Guna, Medeiros, Henry, Li, Hongdong, Qi, Honggang, Bischof, Horst, Possegger, Horst, Lu, Huchuan, Lee, Hyemin, Nam, Hyeonseob, Chang, Hyung Jin, Drummond, Isabela, Valmadre, Jack, Jeong, Jae-chan, Cho, Jae-il, Lee, Jae-Yeong, Zhu, Jianke, Feng, Jiayi, Gao, Jin, Choi, Jin Young, Xiao, Jingjing, Kim, Ji-Wan, Jeong, Jiyeoup, Henriques, João F., Lang, Jochen, Choi, Jongwon, Martinez, Jose M., Xing, Junliang, Gao, Junyu, Palaniappan, Kannappan, Lebeda, Karel, Gao, Ke, Mikolajczyk, Krystian, Qin, Lei, Wang, Lijun, Wen, Longyin, Bertinetto, Luca, Rapuru, Madan Kumar, Poostchi, Mahdieh, Maresca, Mario, Danelljan, Martin, Mueller, Matthias, Zhang, Mengdan, Arens, Michael, Valstar, Michel, Tang, Ming, Baek, Mooyeol, Khan, Muhammad Haris, Wang, Naiyan, Fan, Nana, Al-Shakarji, Noor, Miksik, Ondrej, Akin, Osman, Moallem, Payman, Senna, Pedro, Torr, Philip H. S., Yuen, Pong C., Huang, Qingming, Martin-Nieto, Rafael, Pelapur, Rengarajan, Bowden, Richard, Laganière, Robert, Stolkin, Rustam, Walsh, Ryan, Krah, Sebastian B., Li, Shengkun, Zhang, Shengping, Yao, Shizeng, Hadfield, Simon, Melzi, Simone, Lyu, Siwei, Li, Siyi, Becker, Stefan, Golodetz, Stuart, Kakanuru, Sumithra, Choi, Sunglok, Hu, Tao, Mauthner, Thomas, Zhang, Tianzhu, Pridmore, Tony, Santopietro, Vincenzo, Hu, Weiming, Li, Wenbo, Hübner, Wolfgang, Lan, Xiangyuan, Wang, Xiaomeng, Li, Xin, Li, Yang, Demiris, Yiannis, Wang, Yifan, Qi, Yuankai, Yuan, Zejian, Cai, Zexiong, Xu, Zhan, He, Zhenyu, Chi, Zhizhen, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Hua, Gang, editor, and Jégou, Hervé, editor
Published: 2016
Full Text: View/download PDF

48. The Thermal Infrared Visual Object Tracking VOT-TIR2016 Challenge Results

Author: Felsberg, Michael, Kristan, Matej, Matas, Jiři, Leonardis, Aleš, Pflugfelder, Roman, Häger, Gustav, Berg, Amanda, Eldesokey, Abdelrahman, Ahlberg, Jörgen, Čehovin, Luka, Vojíř, Tomáš, Lukežič, Alan, Fernández, Gustavo, Petrosino, Alfredo, Garcia-Martin, Alvaro, Montero, Andrés Solís, Varfolomieiev, Anton, Erdem, Aykut, Han, Bohyung, Chang, Chang-Ming, Du, Dawei, Erdem, Erkut, Khan, Fahad Shahbaz, Porikli, Fatih, Zhao, Fei, Bunyak, Filiz, Battistone, Francesco, Zhu, Gao, Seetharaman, Guna, Li, Hongdong, Qi, Honggang, Bischof, Horst, Possegger, Horst, Nam, Hyeonseob, Valmadre, Jack, Zhu, Jianke, Feng, Jiayi, Lang, Jochen, Martinez, Jose M., Palaniappan, Kannappan, Lebeda, Karel, Gao, Ke, Mikolajczyk, Krystian, Wen, Longyin, Bertinetto, Luca, Poostchi, Mahdieh, Maresca, Mario, Danelljan, Martin, Arens, Michael, Tang, Ming, Baek, Mooyeol, Fan, Nana, Al-Shakarji, Noor, Miksik, Ondrej, Akin, Osman, Torr, Philip H. S., Huang, Qingming, Martin-Nieto, Rafael, Pelapur, Rengarajan, Bowden, Richard, Laganière, Robert, Krah, Sebastian B., Li, Shengkun, Yao, Shizeng, Hadfield, Simon, Lyu, Siwei, Becker, Stefan, Golodetz, Stuart, Hu, Tao, Mauthner, Thomas, Santopietro, Vincenzo, Li, Wenbo, Hübner, Wolfgang, Li, Xin, Li, Yang, Xu, Zhan, He, Zhenyu, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Hua, Gang, editor, and Jégou, Hervé, editor
Published: 2016
Full Text: View/download PDF

49. The Visual Object Tracking VOT2014 Challenge Results

Author: Kristan, Matej, Pflugfelder, Roman, Leonardis, Aleš, Matas, Jiri, Čehovin, Luka, Nebehay, Georg, Vojíř, Tomáš, Fernández, Gustavo, Lukežič, Alan, Dimitriev, Aleksandar, Petrosino, Alfredo, Saffari, Amir, Li, Bo, Han, Bohyung, Heng, CherKeng, Garcia, Christophe, Pangeršič, Dominik, Häger, Gustav, Khan, Fahad Shahbaz, Oven, Franci, Possegger, Horst, Bischof, Horst, Nam, Hyeonseob, Zhu, Jianke, Li, JiJia, Choi, Jin Young, Choi, Jin-Woo, Henriques, João F., van de Weijer, Joost, Batista, Jorge, Lebeda, Karel, Öfjäll, Kristoffer, Yi, Kwang Moo, Qin, Lei, Wen, Longyin, Maresca, Mario Edoardo, Danelljan, Martin, Felsberg, Michael, Cheng, Ming-Ming, Torr, Philip, Huang, Qingming, Bowden, Richard, Hare, Sam, Lim, Samantha YueYing, Hong, Seunghoon, Liao, Shengcai, Hadfield, Simon, Li, Stan Z., Duffner, Stefan, Golodetz, Stuart, Mauthner, Thomas, Vineet, Vibhav, Lin, Weiyao, Li, Yang, Qi, Yuankai, Lei, Zhen, Niu, Zhi Heng, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Agapito, Lourdes, editor, Bronstein, Michael M., editor, and Rother, Carsten, editor
Published: 2015
Full Text: View/download PDF

50. SAILOR: Scaling Anchors via Insights into Latent Object Representation

Author: Malic, Dusan, primary, Fruhwirth-Reisinger, Christian, additional, Possegger, Horst, additional, and Bischof, Horst, additional
Published: 2023
Full Text: View/download PDF
