List of publications

Selected publications

On the Application of Convex Transforms to Metric Search

Scalable similarity search in metric spaces relies on using the mathematical properties of the space in order to allow efficient querying. Most important in this context is the triangle inequality property, which can allow the majority of individual similarity comparisons to be avoided for a given query. However, many important metric spaces, typically those with high dimensionality, are not amenable to such techniques. In the past, convex transforms have been studied as a pragmatic mechanism which can overcome this effect; the problem with this approach, however, is that the metric properties may be lost, leading to loss of accuracy. Here, we study the underlying properties of such transforms and their effect on metric indexing mechanisms. We show that there are some spaces where certain transforms may be applied without loss of accuracy, and further spaces where we can understand the engineering tradeoffs between accuracy and efficiency. We back these observations with experimental analysis. To highlight the value of the approach, we show three large spaces deriving from practical domains whose dimensionality prevents normal indexing techniques, but where the transforms applied give scalable access with a relatively small loss of accuracy.
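The role of the triangle inequality described in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes Euclidean distance, a single reference object (here called `pivot`), and a hypothetical helper `range_search`. For any object x, the triangle inequality gives |d(q, p) - d(p, x)| <= d(q, x), so x can be skipped without computing d(q, x) whenever that lower bound already exceeds the query radius.

```python
import math


def range_search(query, radius, data, pivot, dist=math.dist):
    """Return (objects within `radius` of `query`, number of distances computed).

    Illustrative only: one pivot, linear scan, pivot distances computed here
    although a real index would store them at build time.
    """
    pivot_dists = [dist(pivot, x) for x in data]
    d_qp = dist(query, pivot)

    results = []
    computed = 0
    for x, d_px in zip(data, pivot_dists):
        # Lower bound on d(query, x) from the triangle inequality.
        if abs(d_qp - d_px) > radius:
            continue  # x cannot be within the radius, so skip the comparison
        computed += 1
        if dist(query, x) <= radius:
            results.append(x)
    return results, computed


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
    found, n = range_search((0.5, 0.0), 1.0, points, pivot=(0.0, 0.0))
    print(found, n)
```

If the distance function is replaced by one that violates the triangle inequality, the lower bound is no longer guaranteed and the skip step may discard true results, which is the loss of accuracy the abstract refers to.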

The Mathematical Language of Quantum Theory

This book presents a clear and detailed exposition of the fundamental concepts of quantum theory: states, effects, observables, channels and instruments. It introduces several up-to-date topics, such as state discrimination, quantum tomography, measurement disturbance and entanglement distillation. A separate chapter is devoted to quantum entanglement. The theory is illustrated with numerous examples, reflecting recent developments in the field. The treatment emphasises quantum information, though its general approach makes it a useful resource for graduate students and researchers in all subfields of quantum theory. Focusing on mathematically precise formulations, the book summarises the relevant mathematics.

Redakční systém odborného časopisu s podporou exportu do digitální knihovny (Editorial System for a Scholarly Journal with Support for Export to a Digital Library)

The production workflow for publishing scientific, and especially mathematical, journals is based on TeX and related technologies. Publishers usually prepare papers and make them available electronically in a digital library, optimized for digital delivery and eventually for reading too. The paper describes the designed and implemented production workflow of several mathematical journals that archive their output in the Czech Digital Mathematics Library DML-CZ, which is subsequently available in the European Digital Mathematics Library EuDML.

Combining Cache and Priority Queue to Enhance Evaluation of Similarity Search Queries

A variety of applications use content-based similarity search techniques. In some cases, higher search effectiveness can be achieved by submitting multiple similar queries. We propose new approximation techniques that are specially designed to enhance the trade-off between the effectiveness and the efficiency of multiple k-nearest-neighbors queries. They combine the probability that an indexed object is part of the precise query result with the time needed to examine the object. This enables us to improve processing times while maintaining the same query precision as the traditional approximation technique without the proposed optimizations.

MotionMatch: Motion Recognition Technology

MotionMatch is a software technology for recognizing persons according to the way they walk. The recognition process is based on the analysis of motion capture data, which can be acquired by motion capturing devices, including the popular Microsoft Kinect and ASUS Xtion. The acquired data are first preprocessed by detecting walking cycles and extracting movement features in the form of relative velocities of specific joints for each walking cycle. Individual walking cycles can then be mutually compared to calculate their similarity. Finally, a proposed classification method is used to recognize the person who has performed a query motion. The current version of the MotionMatch technology is demonstrated via a web application that allows users to select a query motion and verify whether the technology recognizes the query person correctly.

DISA at ImageCLEF 2014 Revised: Search-based Image Annotation with DeCAF Features

This paper constitutes an extension to the report on DISA MU team participation in the ImageCLEF 2014 Scalable Concept Image Annotation Task as published in [3]. Specifically, we introduce a new similarity search component that was implemented into the system, report on the results achieved by utilizing this component, and analyze the influence of different similarity search parameters on the annotation quality.

Employing Subsequence Matching in Audio Data Processing

We give an overview of current problems in audio retrieval and time-series subsequence matching. We discuss the use of subsequence matching approaches in audio data processing, especially in the area of automatic speech recognition (ASR), and we aim at improving the performance of the retrieval process. To overcome problems known from the time-series area, such as the occurrence of implementation bias and data bias, we present a Subsequence Matching Framework as a tool for fast prototyping, building, and testing of similarity search subsequence matching applications. The framework is built on top of MESSIF (Metric Similarity Search Implementation Framework), and thus the subsequence matching algorithms can exploit advanced similarity indexes in order to significantly increase their query processing performance. To prove our concept, we provide a design of a query-by-example spoken term detection application that uses phonetic posteriorgrams and a subsequence matching approach.

Binary Sketches for Secondary Filtering

This paper addresses the problem of matching the most similar data objects to a given query object. We adopt a generic model of similarity that involves the domain of objects and metric distance functions only. We examine the case of a large dataset in a complex data space, which makes this problem inherently difficult. Many indexing and searching approaches have been proposed, but they often fail to prune complex search spaces efficiently and therefore access large portions of the dataset when evaluating queries. We propose an approach to enhancing the existing search techniques so as to significantly reduce the number of accessed data objects while preserving the quality of the search results. In particular, we extend each data object with its sketch, a short binary string in Hamming space. These sketches approximate the similarity relationships in the original search space, and we use them to filter out non-relevant objects not pruned by the original search technique. We provide a probabilistic model to tune the parameters of the sketch-based filtering separately for each query object. Experiments conducted with different similarity search techniques and real-life datasets demonstrate that the secondary filtering can speed up similarity search several times.
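The idea of filtering candidates by binary sketches in Hamming space can be illustrated with a minimal sketch. It is not the paper's construction: it assumes Euclidean distance, one bit per hypothetical pivot (set when the object lies inside a ball of a chosen radius around that pivot), and a fixed Hamming threshold `max_hamming` instead of the per-query probabilistic tuning described in the abstract. The names `make_sketch`, `hamming` and `sketch_filter` are illustrative.

```python
import math


def make_sketch(obj, pivots, radii, dist=math.dist):
    """Build an integer bit string: bit i is 1 if obj lies within radii[i] of pivots[i]."""
    bits = 0
    for i, (p, r) in enumerate(zip(pivots, radii)):
        if dist(obj, p) <= r:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of differing bits between two sketches."""
    return bin(a ^ b).count("1")


def sketch_filter(query, candidates, pivots, radii, max_hamming):
    """Keep candidates whose sketch is within max_hamming bits of the query sketch.

    Sketches are recomputed here for brevity; in practice they would be
    stored alongside each object when the dataset is indexed.
    """
    q_sketch = make_sketch(query, pivots, radii)
    return [
        c for c in candidates
        if hamming(q_sketch, make_sketch(c, pivots, radii)) <= max_hamming
    ]


if __name__ == "__main__":
    pivots = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    radii = [5.0, 5.0, 5.0]
    candidates = [(2.0, 2.0), (9.0, 1.0), (3.0, 3.0)]
    print(sketch_filter((1.0, 1.0), candidates, pivots, radii, max_hamming=1))
```

Only the candidates that survive this cheap bitwise test would then be compared with the query using the original, more expensive distance function.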

Searching for Variable-Speed Motions in Long Sequences of Motion Capture Data

Motion capture data digitally represent human movements by sequences of body configurations in time. Subsequence searching in long sequences of such spatio-temporal data is difficult, as query-relevant motions can vary in execution speed and style and can occur anywhere in a very long data sequence. To deal with these problems, we employ a fast and effective similarity measure that is elastic. The property of elasticity enables two overlapping but slightly misaligned subsequences to be matched with high confidence. Based on the elasticity, the long data sequence is partitioned into overlapping segments that are organized in multiple levels. The number of levels and the sizes of overlaps are optimized to generate a modest number of segments while still being able to trace an arbitrary query. In the retrieval phase, a query is always represented as a single segment and quickly matched against segments within a relevant level without any costly post-processing. Moreover, visiting adjacent levels makes it possible to search for time-warped (i.e., faster or slower executed) queries. To search efficiently on a large scale, segment features can be binarized and segmentation levels independently indexed. We experimentally demonstrate the effectiveness and efficiency of the proposed approach for subsequence searching on a real-life dataset.

All publications



Total number of publications: 2


Responsible contact: test@email.cz
