Knowing: A Generic Data Analysis Application

We got another demo accepted:

Knowing: A Generic Data Analysis Application

Thomas Bernecker, Franz Graf, Hans-Peter Kriegel, Nepomuk Seiler, Christoph Türmer, Dieter Dill
To appear at 15th International Conference on Extending Database Technology (2012)
March 27-30, 2012, Berlin, Germany

Abstract:

Extracting knowledge from data is, in most cases, not restricted to the analysis itself but accompanied by preparation and post-processing steps. Handling data coming directly from the source, e.g. a sensor, often requires preconditioning like parsing and removing irrelevant information before data mining algorithms can be applied to analyze the data. Stand-alone data mining frameworks in general do not provide such components since they require a specified input data format. Furthermore, they are often restricted to the available algorithms or a rapid integration of new algorithms for the purpose of quick testing is not possible. To address this shortcoming, we present the data analysis framework Knowing, which is easily extendible with additional algorithms by using an OSGi compliant architecture. In this demonstration, we apply the Knowing framework to a medical monitoring system recording physical activity. We use the data of 3D accelerometers to detect activities and perform data mining techniques and motion detection to classify and evaluate the quality and amount of physical activities. In the presented use case, patients and physicians can analyze the daily activity processes and perform long term data analysis by using an aggregated view of the results of the data mining process. Developers can integrate and evaluate newly developed algorithms and methods for data mining on the recorded database.

BibTex

@INPROCEEDINGS{BerGraKriSeietal12,
  AUTHOR     = {T. Bernecker and F. Graf and H.-P. Kriegel and N. Seiler and C. Tuermer and D. Dill},
  TITLE      = {Knowing: A Generic Data Analysis Application},
  BOOKTITLE  = {Proceedings of the 15th International Conference on Extending Database Technology (EDBT), Berlin, Germany},
  YEAR       = {2012}
}

More informations will be published at the official publication site at the LMU.

Research Idea: Evaluation of Traffic Lane Detection with OpenStreetMap GPS Data

I am soon leaving University and thus the time for pure research will soon be over. Unfortunately I still have some ideas for possible research. I’ve tried getting them out of my head as this has not yet worked out, I’ll try to write them down – maybe somewone finds them interesting enough for a Bachelor-/Masterthesis or something like that …

Introduction

OpenStreetMap creates and provides free geographic data such as street maps to anyone who wants them. The project was started because most maps you think of as free actually have legal or technical restrictions on their use, holding back people from using them in creative, productive, or unexpected ways. The OpenStreetMap approach is comparable to Wikipedia where everyone can contribute content. In openStreetMap, registered users can edit the map directly by using different editors or indirectly by providing ground truth data in terms of GPS tracks following pathes or roads. A recent study shows, that the difference between OpenStreetMap’s street network coverage for car navigation in Germany and a comparable proprietary dataset was only 9% in June 2011.

In 2010, Yihua Chen and John Krumm have published a paper at ACM GIS about “Probabilistic Modeling of Traffic Lanes from GPS Traces“. Chen and Krum apply Gaussian micture Models (GMM) on a data set of 55 shuttle vehicles driving between the Microsoft corporate buildings in the Seattle area. The vehicles were tracked for an average of 12.7 days resulting in about 20 million GPS points. By applying their algorithm to this data, they were able to infer lane structures from the given GPS tracks.

Adding and validating lane attributes completely manually is a rather tedious task for humans – especially in cases of data sets like OpenStreetMap. Therefore it should be evaluated if the proposed algorithm could be applied to OpenStreetMap data in order to infer and/or validate lane attributes on existing data in an automatic or semiautomatic way.

Continue reading Research Idea: Evaluation of Traffic Lane Detection with OpenStreetMap GPS Data

SRTM Plugin for OpenStreetMap

One of the features of our TrafficMining Project at the LMU was to use additional attributes in routing queries. Standart routing queries usually just use time and distance for calculating the fastest/shortest routes. In order to have an additional attribute we decided to use evelation data as this might be an issue if you also want to take fuel cost into account or if you’re planning a bike tour (instead of crossing a hill, you might want to consider a longer tour that avoids crossing the hill).

The problem just was that data nodes from OpenStreetMap are defined mostly by id, latitude and longitude, which is totally enough for painting 2D maps and standard routing queries. As the elevation is not added to OpenStreetMap data directly (and it is also not intended to be added to the OSM data base), a component was needed that parses both Nasa SRTM data as well as OSM data files in order to combine the data.

In the first version, we parsed the SRTM data directly and addied it to the nodes of the OSM-Graph directly. During one refactoring, we decided to integtrate Osmosis into the project in order to be able to read PBF files (another OpenStreetMap file format). During this integration we decided to separate the SRTM parsing into a separate module so that other people can make use of it as well. The plugin was open sourced some time ago at google code as the “osmosis-srtm-plugin” under an LGPL licence.

Relevant Links:

TrafficMining Project goes open source

Quite some time ago I wrote about a little demo that was published at SIGMOD 2010 and SSTD 2011 (see post1 and post2).

The TrafficMining project could be described shortly as:

An academic framework for routing algorithms based on OpenStreetMapdata. Actually this framework is not intended to replace current routing applications but to provide an easy to use GUI for testing and developing new routing algorithms on real OpenStreetMap data.

Well, what makes this worth a post is the fact that we finally switched development over to GoogleCode with a discussion group at Google Groups.
GoogleCode has the major advantage of a Mercurial repository, an issue tracker, easy code reviews and an miproved possibility to contribute code. If you just want to follow the development, just join the google group or keep a bookmark to the project’s update feed.

By the way: the PAROS and MARiO downloads can be found there in the downloads section.

Finished my Posters for ICIP and MICCAI

Finally finished the posters for my publications:

F. Graf, H.-P. Kriegel, M. Schubert, S. Poelsterl, A. Cavallaro
2D Image Registration in CT Images using Radial Image Descriptors
In Medical Image Computing and Computer-Assisted Intervention (MICCAI), Toronto, Canada, 2011.

and

F. Graf, H.-P. Kriegel, M. Weiler
Robust Segmentation of Relevant Regions in Low Depth of Field Images
In Proceedings of the IEEE International Conference on Image Processing (ICIP), Brussels, Belgium, 2011.

Maximum Gain Round Trips with Cost Constraints

The idea is the following: Finding the shortest/fastes path from A to B is rather exploited. But if you start a hike, knowing that you want to spend 4 hours and then come back to the starting point. Then the problem suddenly starts to become a bit complex (NP-hard to be honest if you do not add any constraints).

We propose a solution to do this kind of search a bit more efficient. but don’t expect linear search time 😉 And – in contrast to quite some other research – we are operating on REAL data obtained from OpenStreetMap.

Abstract:

Searching for optimal ways in a network is an important task in multiple application areas such as social networks, co-citation graphs or road networks. In the majority of applications, each edge in a network is associated with a certain cost and an optimal way minimizes the cost while fulfilling a certain property, e.g connecting a start and a destination node. In this paper, we want to extend pure cost networks to so-called cost-gain networks. In this type of network, each edge is additionally associated with a certain gain. Thus, a way having a certain cost additionally provides a certain gain. In the following, we will discuss the problem of finding ways providing maximal gain while costing less than a certain budget. An application for this type of problem is the round trip problem of a traveler: Given a certain amount of time, which is the best round trip traversing the most scenic landscape or visiting the most important sights? In the following, we distinguish two cases of the problem. The first does not control any redundant edges and the second allows a more sophisticated handling of edges occurring more than once. To answer the maximum round trip queries on a given graph data set, we propose unidirectional and bidirectional search algorithms. Both types of algorithms are tested for the use case named above on real world spatial networks.

Documents

At our project site you can find:

Bibtex

@TECHREPORT{GraKriSchu11,
  AUTHOR      = {F. Graf and H.-P. Kriegel and M. Schubert},
  TITLE       = {Maximum Gain Round Trips with Cost Constraints},
  INSTITUTION = {Institute for Informatics, Ludwig-Maximilians-University, Munich, Germany},
  YEAR        = {2011},
  LINK        = {http://arxiv.org/abs/1105.0830v1}
}

MARiO: Multi Attribute Routing in Open Street Map

Yeah, I got a new Publication accepted at Symposium on Spatial and Temporal Databases (SSTD) 2011 that is dealing with OpenStreetMap Data (using the JXMapKit and JXMapViewer).

MARiO: Multi Attribute Routing in Open Street Map

Franz Graf, Hans-Peter Kriegel, Matthias Schubert, Matthias Renz

Published at Symposium on Spatial and Temporal Databases (SSTD) 2011
Conference Date: August 24th – 26th, 2011
Conference Location: Minneapolis, MN, USA.

Abstract:

In recent years, the Open Street Map (OSM) project collected a large repository of spatial network data containing a rich variety of information about traffic lights, road types, points of interest etc.. Formally, this network can be described as a multi-attribute graph, i.e. a graph considering multiple attributes when describing the traversal of an edge. In this demo, we present our framework for Multi Attribute Routing in Open Street Map (MARiO). MARiO includes methods for preprocessing OSM data by deriving attribute information and integrating additional data from external sources. There are several routing algorithms already available and additional methods can be easily added by using a plugin mechanism. Since routing in a multi-attribute environment often results in large sets of potentially interesting routes, our graphical fronted allows various views to interactively explore query results.

Documents:

Bibtex

@INPROCEEDINGS{GraKriRenSch11,
  AUTHOR      = {F. Graf and H.-P. Kriegel and M. Renz and M. Schubert},
  TITLE       = {{MARiO}: Multi Attribute Routing in Open Street Map},
  BOOKTITLE   = {Proceedings of the 12th International Symposium on Spatial and Temporal Databases (SSTD), Minneapolis, MN, USA},
  YEAR        = {2011}
}

Robust Segmentation of Relevant Regions in Low Depth of Field Images

Great, we got accepted (as a poster) on the ICIP 2011 with the paper “Robust Segmentation of Relevant Regions in Low Depth of Field Images”:

Low depth of field (DOF) is an important technique to emphasize the object of interest (OOI) within an image. When viewing a low depth of field image, the viewer implicitly segments the image into region of interest and non regions of interest which has major impact on the perception of the image. Thus, robust algorithms for the detection of the OOI in low DOF images provide valuable information for subsequent image processing and image retrieval. In this paper we propose a robust and parameterless algorithm for the fully automatic segmentation of low depth of field images. We compare our method with three similar methods and show the superior robustness even though our algorithm does not require any parameters to be set by hand. The experiments are conducted on a real world data set with high and low depth of field images. (Abstract from the paper)

The work is a result of a collaboration with Michael Weiler. We extended his Diploma thesis and produced an improved segmentation algorithm for Low Depth Of Field images. Compared to the other 3 competing algorithms, ours is a bit slower but at least it works. The other algorithms turned out to be extremely unstable and/or sensitive to parameters.

On the project site you can find

  • an online demo
  • the test images,
  • the masks
  • the NetBeans project including the full Java source code for our algorithm and the reimplementation of the comparison partners (of course we had to re-implement as we didn’t even get binaries – as usual)

So if you plan to do some image segmentation, just go there download the stuff and cite our work 😉

Fully automatic detection of the vertebrae in 2D CT images – the Talk

Yea finally I gave the talk for my Publication “Fully automatic detection of the vertebrae in 2D CT images” Paper 7962-11 at SPIE Medical Imaging 2011, Conference 7962 Image Processing (see index) in front of about 200 people.

Everything went fine. Just some nice questions right after the talk and some hints afterwards. Hey – some guys even remembered the talk 2 days later! 🙂

Thanks, SPIE Medical Imaging.

Fully automatic detection of the vertebrae in 2D CT images

FINALLY submitted the paper which we sent to the SPIE Medical Imaging Conference  (and got accepted!)

Abstract:

Knowledge about the vertebrae is a valuable source of information for several annotation tasks. In recent years, the research community spent a considerable effort for detecting, segmenting and analyzing the vertebrae and the spine in various image modalities like CT or MR. Most of these methods rely on prior knowledge like the location of the vertebrae or other initial information like the manual detection of the spine. Furthermore, the majority of these methods require a complete volume scan. With the existence of use cases where only a single slice is available, there arises a demand for methods allowing the detection of the vertebrae in 2D images. In this paper, we propose a fully automatic and parameterless algorithm for detecting the vertebrae in 2D CT images. Our algorithm starts with detecting candidate locations by taking the density of bone-like structures into account. Afterwards, the candidate locations are extended into candidate regions for which certain image features are extracted. The resulting feature vectors are compared to a sample set of previously annotated and processed images in order to determine the best candidate region. In a final step, the result region is readjusted until convergence to a locally optimal position. Our new method is validated on a real world data set of more than 9 329 images of 34 patients being annotated by a clinician in order to provide a realistic ground truth.

More Information and the paper as PDF can be found at my publication site.