<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Publications &#8211; Jan-Nico Zaech</title>
	<atom:link href="/category/publications/feed/" rel="self" type="application/rss+xml" />
	<link>/</link>
	<description></description>
	<lastBuildDate>Tue, 11 Jun 2024 08:26:17 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.5.4</generator>
	<item>
		<title>Probabilistic Sampling of Balanced K-Means using Adiabatic Quantum Computing</title>
		<link>/probabilistic-sampling-of-balanced-k-means-using-adiabatic-quantum-computing/</link>
		
		<dc:creator><![CDATA[zaech]]></dc:creator>
		<pubDate>Sat, 01 Jun 2024 00:00:06 +0000</pubDate>
				<category><![CDATA[Publications]]></category>
		<category><![CDATA[Computer Vision]]></category>
		<category><![CDATA[Quantum Computing]]></category>
		<guid isPermaLink="false">/?p=256</guid>

					<description><![CDATA[Jan-Nico Zaech, Martin Danelljan, Tolga Birdal, Luc Van Gool IEEE Conference on Computer Vision and Pattern Recognition 2024 (CVPR) Abstract Adiabatic quantum computing (AQC) is a promising approach for discrete and often NP-hard optimization problems. Current AQCs allow the implementation of problems of research interest, which has sparked the development of quantum representations for many computer [&#8230;]]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-full is-resized"><img fetchpriority="high" decoding="async" width="1071" height="315" src="/wp-content/uploads/2023/11/teaser.png" alt="" class="wp-image-258" style="width:1242px;height:auto" srcset="/wp-content/uploads/2023/11/teaser.png 1071w, /wp-content/uploads/2023/11/teaser-300x88.png 300w, /wp-content/uploads/2023/11/teaser-1024x301.png 1024w, /wp-content/uploads/2023/11/teaser-768x226.png 768w" sizes="(max-width: 1071px) 100vw, 1071px" /></figure>



<p>Jan-Nico Zaech, Martin Danelljan, Tolga Birdal, Luc Van Gool</p>



<p>IEEE Conference on Computer Vision and Pattern Recognition 2024 (CVPR)</p>



<h3 class="wp-block-heading">Abstract</h3>



<p>Adiabatic quantum computing (AQC) is a promising approach for discrete and often NP-hard optimization problems. Current AQCs allow the implementation of problems of research interest, which has sparked the development of quantum representations for many computer vision tasks. Despite requiring multiple measurements from the noisy AQC, current approaches only utilize the best measurement, discarding the information contained in the remaining ones. In this work, we explore the potential of using this information for probabilistic balanced k-means clustering. Instead of discarding non-optimal solutions, we propose to use them to compute calibrated posterior probabilities with little additional compute cost. This allows us to identify ambiguous solutions and data points, which we demonstrate on a D-Wave AQC on synthetic tasks and real visual data.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Optimizing Long-Term Player Tracking and Identification in NAO Robot Soccer by fusing Game-state and External Video</title>
		<link>/optimizing-long-term-player-tracking-and-identification-in-nao-robot-soccer-by-fusing-game-state-and-external-video/</link>
		
		<dc:creator><![CDATA[zaech]]></dc:creator>
		<pubDate>Fri, 02 Jun 2023 15:45:51 +0000</pubDate>
				<category><![CDATA[Publications]]></category>
		<category><![CDATA[RoboCup]]></category>
		<category><![CDATA[Tracking]]></category>
		<guid isPermaLink="false">/?p=83</guid>

					<description><![CDATA[A collaborative sensing approach for multi-object tracking of robots.]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-1 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:53%">
<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="505" src="/wp-content/uploads/2023/06/method-1-1024x505.jpg" alt="" class="wp-image-84" srcset="/wp-content/uploads/2023/06/method-1-1024x505.jpg 1024w, /wp-content/uploads/2023/06/method-1-300x148.jpg 300w, /wp-content/uploads/2023/06/method-1-768x379.jpg 768w, /wp-content/uploads/2023/06/method-1-1536x757.jpg 1536w, /wp-content/uploads/2023/06/method-1.jpg 1588w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>
</div>



<div class="wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow">
<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="576" src="/wp-content/uploads/2023/06/field-edited-1024x576.jpg" alt="" class="wp-image-41" srcset="/wp-content/uploads/2023/06/field-edited-1024x576.jpg 1024w, /wp-content/uploads/2023/06/field-edited-300x169.jpg 300w, /wp-content/uploads/2023/06/field-edited-768x432.jpg 768w, /wp-content/uploads/2023/06/field-edited.jpg 1307w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>
</div>
</div>



<p>Giuliano Albanese*, Arka Mitra*, Jan-Nico Zaech*, Yupeng Zhao*, Ajad Chhatkuli, and Luc Van Gool</p>



<p>International Conference on Robotics and Automation Workshops, ICRA 2023 (<a href="https://coperception.github.io/index.html">CoPerception: Collaborative Perception and Learning</a>)</p>



<h3 class="wp-block-heading">Abstract</h3>



<p>Monitoring a fleet of robots requires stable long-term tracking with re-identification, which remains an unsolved challenge in many scenarios. One application of this is the analysis of autonomous robotic soccer games at RoboCup. Tracking these games requires handling identical-looking players, strong occlusions, and non-professional video recordings, but also offers state information estimated by the robots. In order to make effective use of the information coming from the robot sensors, we propose a robust tracking and identification pipeline. It fuses external non-calibrated camera data with the robots’ internal states using quadratic optimization for tracklet matching. The approach is validated using game recordings from previous RoboCup World Cups.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Adiabatic Quantum Computing for Multi Object Tracking</title>
		<link>/adiabatic-quantum-computing-for-multi-object-tracking/</link>
		
		<dc:creator><![CDATA[zaech]]></dc:creator>
		<pubDate>Sun, 19 Jun 2022 00:00:18 +0000</pubDate>
				<category><![CDATA[Publications]]></category>
		<category><![CDATA[Quantum Computing]]></category>
		<category><![CDATA[Tracking]]></category>
		<guid isPermaLink="false">/?p=74</guid>

					<description><![CDATA[A Multi-Object Tracking formulation that can be solved with Adiabatic Quantum Computing.]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" src="/wp-content/uploads/2023/06/teaser-1-1024x576.jpg" alt="" class="wp-image-75" width="1239" height="697" srcset="/wp-content/uploads/2023/06/teaser-1-1024x576.jpg 1024w, /wp-content/uploads/2023/06/teaser-1-300x169.jpg 300w, /wp-content/uploads/2023/06/teaser-1-768x432.jpg 768w, /wp-content/uploads/2023/06/teaser-1-1536x863.jpg 1536w, /wp-content/uploads/2023/06/teaser-1.jpg 1612w" sizes="(max-width: 1239px) 100vw, 1239px" /></figure>



<p>Jan-Nico Zaech, Alexander Liniger, Martin Danelljan, Dengxin Dai, Luc Van Gool</p>



<p><em>Conference on Computer Vision and Pattern Recognition, CVPR 2022</em></p>



<h3 class="wp-block-heading">Abstract</h3>



<p>Multi-Object Tracking (MOT) is most often approached in the tracking-by-detection paradigm, where object detections are associated through time. The association step naturally leads to discrete optimization problems. As these optimization problems are often NP-hard, they can only be solved exactly for small instances on current hardware. Adiabatic quantum computing (AQC) offers a solution for this, as it has the potential to provide a considerable speedup on a range of NP-hard optimization problems in the near future. However, current MOT formulations are unsuitable for quantum computing due to their scaling properties. In this work, we therefore propose the first MOT formulation designed to be solved with AQC. We employ an Ising model that represents the quantum mechanical system implemented on the AQC. We show that our approach is competitive with state-of-the-art optimization-based approaches, even when using off-the-shelf integer programming solvers. Finally, we demonstrate that our MOT problem is already solvable on the current generation of real quantum computers for small examples, and analyze the properties of the measured solutions.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Learnable Online Graph Representations for 3D Multi-Object Tracking</title>
		<link>/learnable-online-graph-representations-for-3d-multi-object-tracking/</link>
		
		<dc:creator><![CDATA[zaech]]></dc:creator>
		<pubDate>Mon, 23 May 2022 16:41:28 +0000</pubDate>
				<category><![CDATA[Publications]]></category>
		<category><![CDATA[Autonomous Systems]]></category>
		<category><![CDATA[Tracking]]></category>
		<guid isPermaLink="false">/?p=109</guid>

					<description><![CDATA[An online 3D Multi-Object Tracking method based on graph neural networks.]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="575" src="/wp-content/uploads/2023/06/teaser-3-1024x575.jpg" alt="" class="wp-image-111" srcset="/wp-content/uploads/2023/06/teaser-3-1024x575.jpg 1024w, /wp-content/uploads/2023/06/teaser-3-300x168.jpg 300w, /wp-content/uploads/2023/06/teaser-3-768x431.jpg 768w, /wp-content/uploads/2023/06/teaser-3-1536x862.jpg 1536w, /wp-content/uploads/2023/06/teaser-3.jpg 1916w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Jan-Nico Zaech, Dengxin Dai, Alexander Liniger, Martin Danelljan, Luc Van Gool</p>



<p><em>International Conference on Robotics and Automation Workshops, ICRA 2022</em></p>



<h3 class="wp-block-heading">Abstract</h3>



<p>Tracking of objects in 3D is a fundamental task in computer vision that finds use in a wide range of applications such as autonomous driving, robotics, and augmented reality. Most recent approaches for 3D multi-object tracking (MOT) from LIDAR use object dynamics together with a set of handcrafted features to match detections of objects. However, manually designing such features and heuristics is cumbersome and often leads to suboptimal performance. In this work, we instead strive towards a unified, learning-based approach to the 3D MOT problem. We design a graph structure to jointly process detection and track states in an online manner. To this end, we employ a Neural Message Passing network for data association that is fully trainable. Our approach provides a natural way for track initialization and handling of false positive detections, while significantly improving track stability. We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Unsupervised robust domain adaptation without source data</title>
		<link>/unsupervised-robust-domain-adaptation-without-source-data/</link>
		
		<dc:creator><![CDATA[zaech]]></dc:creator>
		<pubDate>Tue, 04 Jan 2022 20:39:34 +0000</pubDate>
				<category><![CDATA[Publications]]></category>
		<category><![CDATA[Domain Adaptation]]></category>
		<guid isPermaLink="false">/?p=123</guid>

					<description><![CDATA[A method that keeps a network robust against adversarial images during source-free domain adaptation.]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="576" src="/wp-content/uploads/2023/06/teaser-4-1024x576.jpg" alt="" class="wp-image-124" srcset="/wp-content/uploads/2023/06/teaser-4-1024x576.jpg 1024w, /wp-content/uploads/2023/06/teaser-4-300x169.jpg 300w, /wp-content/uploads/2023/06/teaser-4-768x432.jpg 768w, /wp-content/uploads/2023/06/teaser-4-1536x864.jpg 1536w, /wp-content/uploads/2023/06/teaser-4.jpg 1920w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Peshal Agarwal, Danda Pani Paudel, Jan-Nico Zaech, Luc Van Gool</p>



<p><em>Winter Conference on Applications of Computer Vision, WACV 2022</em></p>



<h3 class="wp-block-heading">Abstract</h3>



<p>We study the problem of robust domain adaptation in the context of unavailable target labels and source data. The considered robustness is against adversarial perturbations. This paper aims at answering the question of finding the right strategy to make the target model robust and accurate in the setting of unsupervised domain adaptation without source data. The major findings of this paper are: (i) robust source models can be transferred robustly to the target; (ii) robust domain adaptation can greatly benefit from non-robust pseudo-labels and the pair-wise contrastive loss. The proposed method of using non-robust pseudo-labels performs surprisingly well on both clean and adversarial samples, for the task of image classification. We show a consistent performance improvement of over 10% in accuracy against the tested baselines on four benchmark datasets.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Decoder fusion RNN: Context and interaction aware decoders for trajectory prediction</title>
		<link>/decoder-fusion-rnn-context-and-interaction-aware-decoders-for-trajectory-prediction/</link>
		
		<dc:creator><![CDATA[zaech]]></dc:creator>
		<pubDate>Mon, 27 Sep 2021 20:47:00 +0000</pubDate>
				<category><![CDATA[Publications]]></category>
		<category><![CDATA[Action prediction]]></category>
		<category><![CDATA[Autonomous Systems]]></category>
		<guid isPermaLink="false">/?p=129</guid>

					<description><![CDATA[A multi-headed attention-based method for vehicle trajectory prediction using map data encoded as a graph.]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="576" src="/wp-content/uploads/2023/06/teaser-5-1024x576.jpg" alt="" class="wp-image-130" srcset="/wp-content/uploads/2023/06/teaser-5-1024x576.jpg 1024w, /wp-content/uploads/2023/06/teaser-5-300x169.jpg 300w, /wp-content/uploads/2023/06/teaser-5-768x432.jpg 768w, /wp-content/uploads/2023/06/teaser-5-1536x864.jpg 1536w, /wp-content/uploads/2023/06/teaser-5.jpg 1920w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Edoardo Mello Rella, Jan-Nico Zaech, Alexander Liniger, Luc Van Gool</p>



<p><em>International Conference on Intelligent Robots and Systems, IROS 2021</em></p>



<h2 class="wp-block-heading">Abstract</h2>



<p>Forecasting the future behavior of all traffic agents in the vicinity is a key task to achieve safe and reliable autonomous driving systems. It is a challenging problem as agents adjust their behavior depending on their intentions, the others’ actions, and the road layout. In this paper, we propose Decoder Fusion RNN (DF-RNN), a recurrent, attention-based approach for motion forecasting. Our network is composed of a recurrent behavior encoder, an inter-agent multi-headed attention module, and a context-aware decoder. We design a map encoder that embeds polyline segments, combines them to create a graph structure, and merges their relevant parts with the agents’ embeddings. We fuse the encoded map information with further inter-agent interactions only inside the decoder and propose to use explicit training as a method to effectively utilize the information available. We demonstrate the efficacy of our method by testing it on the Argoverse motion forecasting dataset and show its state-of-the-art performance on the public benchmark.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Action sequence predictions of vehicles in urban environments using map and social context</title>
		<link>/action-sequence-predictions-of-vehicles-in-urban-environments-using-map-and-social-context/</link>
		
		<dc:creator><![CDATA[zaech]]></dc:creator>
		<pubDate>Tue, 27 Oct 2020 21:05:00 +0000</pubDate>
				<category><![CDATA[Publications]]></category>
		<category><![CDATA[Action prediction]]></category>
		<category><![CDATA[Autonomous Systems]]></category>
		<guid isPermaLink="false">/?p=135</guid>

					<description><![CDATA[Views the traffic agent trajectory prediction task from a classification perspective and proposes a method to automatically annotate trajectory data by using graph-based maps.]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="576" src="/wp-content/uploads/2023/06/teaser-6-1024x576.jpg" alt="" class="wp-image-137" srcset="/wp-content/uploads/2023/06/teaser-6-1024x576.jpg 1024w, /wp-content/uploads/2023/06/teaser-6-300x169.jpg 300w, /wp-content/uploads/2023/06/teaser-6-768x432.jpg 768w, /wp-content/uploads/2023/06/teaser-6-1536x864.jpg 1536w, /wp-content/uploads/2023/06/teaser-6.jpg 1920w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Jan-Nico Zaech, Dengxin Dai, Alexander Liniger, Luc Van Gool</p>



<p><em>International Conference on Intelligent Robots and Systems, IROS 2020</em></p>



<h3 class="wp-block-heading">Abstract</h3>



<p>This work studies the problem of predicting the sequence of future actions for surrounding vehicles in real-world driving scenarios. To this aim, we make three main contributions. The first contribution is an automatic method to convert the trajectories recorded in real-world driving scenarios to action sequences with the help of HD maps. The method enables automatic dataset creation for this task from large-scale driving data. Our second contribution lies in applying the method to the well-known traffic agent tracking and prediction dataset Argoverse, resulting in 228,000 action sequences. Additionally, 2,245 action sequences were manually annotated for testing. The third contribution is to propose a novel action sequence prediction method by integrating past positions and velocities of the traffic agents, map information and social context into a single end-to-end trainable neural network. Our experiments prove the merit of the data creation method and the value of the created dataset – prediction performance improves consistently with the size of the dataset – and show that our action prediction method outperforms competing models.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Texture Underfitting for Domain Adaptation</title>
		<link>/texture-underfitting-for-domain-adaptation/</link>
		
		<dc:creator><![CDATA[zaech]]></dc:creator>
		<pubDate>Mon, 28 Oct 2019 19:05:45 +0000</pubDate>
				<category><![CDATA[Publications]]></category>
		<category><![CDATA[Domain Adaptation]]></category>
		<guid isPermaLink="false">/?p=150</guid>

					<description><![CDATA[A method to use image structure in domain adaptation, implemented as a two-stage training process.]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="576" src="/wp-content/uploads/2023/06/teaser-8-1024x576.jpg" alt="" class="wp-image-151" srcset="/wp-content/uploads/2023/06/teaser-8-1024x576.jpg 1024w, /wp-content/uploads/2023/06/teaser-8-300x169.jpg 300w, /wp-content/uploads/2023/06/teaser-8-768x432.jpg 768w, /wp-content/uploads/2023/06/teaser-8-1536x864.jpg 1536w, /wp-content/uploads/2023/06/teaser-8.jpg 1920w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Jan-Nico Zaech, Dengxin Dai, Martin Hahner, Luc Van Gool</p>



<p><em>Intelligent Transportation Systems Conference (IEEE), ITSC 2019</em></p>



<h3 class="wp-block-heading">Abstract</h3>



<p>Comprehensive semantic segmentation is one of the key components for robust scene understanding and a requirement to enable autonomous driving. Driven by large-scale datasets, convolutional neural networks show impressive results on this task. However, a segmentation algorithm generalizing to various scenes and conditions would require an enormously diverse dataset, making the labour-intensive data acquisition and labeling process prohibitively expensive. Under the assumption of structural similarities between segmentation maps, domain adaptation promises to resolve this challenge by transferring knowledge from existing, potentially simulated datasets to new environments where no supervision exists. While the performance of this approach is contingent on the concept that neural networks learn a high-level understanding of scene structure, recent work suggests that neural networks are biased towards overfitting to texture instead of learning structural and shape information. Considering the ideas underlying semantic segmentation, we employ random image stylization to augment the training dataset and propose a training procedure that facilitates texture underfitting to improve the performance of domain adaptation. In experiments with supervised as well as unsupervised methods for the task of synthetic-to-real domain adaptation, we show that our approach outperforms conventional training methods.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Learning to avoid poor images: Towards task-aware C-arm cone-beam CT trajectories</title>
		<link>/learning-to-avoid-poor-images-towards-task-aware-c-arm-cone-beam-ct-trajectories/</link>
		
		<dc:creator><![CDATA[zaech]]></dc:creator>
		<pubDate>Sun, 13 Oct 2019 08:48:48 +0000</pubDate>
				<category><![CDATA[Publications]]></category>
		<category><![CDATA[Autonomous Systems]]></category>
		<category><![CDATA[Medical Imaging]]></category>
		<guid isPermaLink="false">/?p=146</guid>

					<description><![CDATA[A robotic CBCT system that predicts an acquisition trajectory optimized online during a scan.]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="576" src="/wp-content/uploads/2023/06/teaser-7-1024x576.jpg" alt="" class="wp-image-147" srcset="/wp-content/uploads/2023/06/teaser-7-1024x576.jpg 1024w, /wp-content/uploads/2023/06/teaser-7-300x169.jpg 300w, /wp-content/uploads/2023/06/teaser-7-768x432.jpg 768w, /wp-content/uploads/2023/06/teaser-7-1536x864.jpg 1536w, /wp-content/uploads/2023/06/teaser-7.jpg 1920w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Jan-Nico Zaech, Cong Gao, Bastian Bier, Russell Taylor, Andreas Maier, Nassir Navab, Mathias Unberath</p>



<p><em>Medical Image Computing and Computer Assisted Intervention, MICCAI 2019</em></p>



<h3 class="wp-block-heading">Abstract</h3>



<p>Metal artifacts in computed tomography (CT) arise from a mismatch between the physics of image formation and idealized assumptions during tomographic reconstruction. These artifacts are particularly strong around metal implants, inhibiting widespread adoption of 3D cone-beam CT (CBCT) despite a clear opportunity for intra-operative verification of implant positioning, e.g., in spinal fusion surgery. On synthetic and real data, we demonstrate that much of the artifact can be avoided by acquiring better data for reconstruction in a task-aware and patient-specific manner, and describe the first step towards the envisioned task-aware CBCT protocol. The traditional short-scan CBCT trajectory is planar, with little room for scene-specific adjustment. We extend this trajectory by autonomously adjusting out-of-plane angulation. This enables C-arm source trajectories that are scene-specific in that they avoid acquiring “poor images”, characterized by beam hardening, photon starvation, and noise. The recommendation of ideal out-of-plane angulation is performed on-the-fly using a deep convolutional neural network that regresses a detectability-rank derived from imaging physics.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
