Semantic Perception, Mapping and Exploration (SPME)
St. Paul, Minnesota, USA, May 14th, 2012


  • Presentations of the workshop speakers are now available online!
  • The program for SPME 2012 is now available below!
  • The workshop proceedings will be made available soon.


As robots and autonomous systems move away from laboratory setups towards complex real-world scenarios, both the perception capabilities of these systems and their abilities to acquire and model semantic information must become more powerful. The autonomous acquisition of information, the extraction of semantic models, and exploration strategies for deciding where and how to acquire the most relevant information pertinent to a specific semantic model are the research foci of an annual series of workshops at ICRA, called Semantic Perception, Mapping and Exploration (SPME).

Semantic perception for intelligent systems such as robots has seen a lot of progress recently, with many new and interesting techniques being developed in parallel by different research groups. Moreover, with the advent of inexpensive and accurate 3D imaging sensors, interest in 3D point clouds has exploded across a broad range of research communities. Following this trend, this edition of the workshop series puts a special focus on (3D) Semantic Perception.

While a lot of work on 3D perception is freely available, and initiatives such as PCL and 3DTK enable the community to build on previous results and push the frontiers of research further, several open questions remain. No consensus has yet emerged on the standard solutions, features and algorithms needed for semantic perception, mapping and exploration, or on whether the current approaches are viable in the long run. This workshop provides a venue for discussing the definition and uses of semantic information for and by perception, and for identifying the most important directions of future research and the new tools that would aid it.

Please download and distribute our call for papers.

Important Dates

  • April 01, 2012 - Submissions Due
  • April 16, 2012 - Notification of Acceptance
  • May 01, 2012 - Final Papers Due
  • May 14, 2012 - Workshop at ICRA


09:00 Introduction (by the workshop organizers) 
Marianna Madry, Dan Song, Carl Henrik Ek and Danica Kragic
09:30  Invited talk: Andrzej Pronobis, Alper Aydemir, Kristoffer Sjöö and Patric Jensfelt 
Exploiting Semantics in Mobile Robotics

Robots have finally escaped from industrial workplaces and made their way into our homes, offices and public spaces. In order to realize the dream of robot assistants performing tasks together with humans in a seamless fashion, we need to provide them with the fundamental capability of understanding complex and unstructured environments. In this talk, we will provide an overview of our recent work on semantic spatial understanding and exploiting semantic knowledge for generating more informed and efficient robot behavior in human environments. We will start by presenting our spatial knowledge modeling framework and continue with methods of acquiring semantic world descriptions, abstracting and reasoning about object locations, topology and segments of space. Finally, we will show that semantic knowledge can indeed improve the performance of a robot on the task of large-scale object search and make the robot's behavior much more intuitive and human-like.
Invited talk: Wolfram Burgard
Techniques for Object Recognition from Range Data

In this talk we address the problem of object recognition in 3D point cloud data. We first present a novel interest point extraction method that operates on range images generated from arbitrary 3D point clouds. Our approach explicitly considers the borders of the objects according to transitions from foreground to background. We furthermore introduce a corresponding feature descriptor. We present rigorous experiments in which we analyze the usefulness of our method for object detection. We furthermore describe a novel algorithm for constructing a compact representation of 3D point clouds. Our approach extracts an alphabet of local scans from the scene. The words of this alphabet are then used to replace recurrent local 3D structures, which leads to a substantial compression of the entire point cloud. We optimize our model in terms of complexity and accuracy by minimizing the Bayesian information criterion (BIC). Experimental evaluations on large real-world data show that our method allows us to accurately reconstruct environments with as few as 70 words. We finally discuss how this method can be utilized for object recognition and loop closure detection in SLAM (Simultaneous Localization and Mapping).
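The abstract above trades off dictionary size against reconstruction accuracy by minimizing the BIC. As a rough illustration of that kind of model selection (all numbers below — log-likelihoods, parameter counts, sample size — are made-up placeholders, not values from the talk), a minimal sketch might look like:

```python
import math

def bic(log_likelihood, num_params, num_samples):
    # BIC = k * ln(n) - 2 * ln(L); lower is better.
    return num_params * math.log(num_samples) - 2.0 * log_likelihood

# Hypothetical trade-off: a larger dictionary ("more words") fits the
# point cloud better (higher log-likelihood) but costs more parameters.
candidates = {
    # dictionary_size: (log_likelihood, total_parameter_count)
    10: (-30000.0, 500),
    70: (-8000.0, 3500),
    500: (-7800.0, 25000),
}
n_points = 100_000

scores = {size: bic(ll, k, n_points) for size, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best)  # the dictionary size minimizing the BIC
```

With these illustrative numbers the mid-sized dictionary wins: the small one fits poorly, while the large one is penalized by its parameter count.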
Max Bajracharya, Jeremy Ma, Andrew Howard and Larry Matthies
Henrik Andreasson and Todor Stoyanov
Torsten Fiolka, Joerg Stueckler, Dominik Alexander Klein, Dirk Schulz and Sven Behnke
Invited talk: Dave Meger and Jim Little
Finding Objects in Cluttered Scenes for Home Robotics

Semantic perception for robots is a primary focus at the Laboratory for Computational Intelligence at UBC. We have built systems that enable mobile robots to find objects using visual cues and to learn about shared workspaces. We've demonstrated these abilities on Curious George, our visually-guided mobile robot that has competed in and won the Semantic Robot Vision Challenge, a completely autonomous visual search task, at AAAI (2007), CVPR (2008) and ISVC (2009).
Recently, our focus has turned towards several particular challenges facing semantic perception systems in real homes, where objects are often hidden by other objects and can only be seen from a subset of viewpoints. We have presented methods to model visual appearance under varying viewpoint and occlusion. These models allow reasoning about objects and scenes in 3D, and give rise to active control strategies to aid perception. Our results show the promise of robust semantic perception in unstructured homes in the near future.
Invited talk: Dieter Fox
Features for RGB-D Object Recognition

Features are important components of object recognition systems. The combination of color and depth information given by RGB-D cameras provides opportunities for the development of improved features. In this talk, I will discuss our recent work on learning features for object recognition. Kernel descriptors provide a flexible framework for incorporating manually designed point features. Hierarchical matching pursuit uses sparse coding to learn features from raw, unlabeled RGB-D data. Both approaches achieve high accuracy on RGB-D object recognition tasks.
Evan Herbst, Xiaofeng Ren and Dieter Fox
Shane Griffith, Vladimir Sukhoy, Todd Wegter and Alexander Stoytchev
Parnian Alimi, David Meger and James Little
Invited talk: Aitor Aldoma, Walter Wohlkinger and Markus Vincze
3D Object Recognition and Categorization

Recognizing free-form shapes in clutter and occlusion is currently one of the most ambitious and challenging tasks in the field of 3D computer vision and robotics, given the typical distortions which 3D data undergoes due to noisy sensors, viewpoint changes and point density variations. Recently, research in the field of 3D object recognition has been fostered not only by the development of the aforementioned scenarios, but also by the availability of low-cost, real-time 3D sensors such as the Microsoft Kinect and the Asus Xtion. Several 3D object recognition approaches exist, commonly grouped into global and local approaches: the former are highly efficient but require segmentation capabilities to hypothesize about objects, while local approaches can handle cluttered and occluded objects without segmentation, at the price of higher computational times due to the less compact local representation of the models. In this talk, we present recent 3D features and descriptors, algorithms and complete recognition pipelines (both local and global) that have been successfully deployed for the task of 3D object recognition and categorization, as well as the estimation of objects' 6-DOF poses, which enables robotics applications such as grasping.
17:00  Panel discussion and closing remarks (by the workshop organizers) 

The program for SPME 2012 is also available here:


We solicit paper submissions, optionally accompanied by a video, both of which will be reviewed (not double-blind) by the program committee. The review criteria will be: technical quality, significance of system demonstration, and topicality. We aim to accept 9 to 12 papers for oral presentation at the meeting. Papers should be up to 6 pages in length, and formatted according to the IEEE ICRA style. Videos will be shown during an afternoon video session open to the public.

Accepted papers and videos will be assembled into proceedings that will be published online and distributed on CD at the workshop. In addition, we will pursue publication of a special journal issue to include the best papers.

This edition of the annual workshop series focuses on (3D) semantic perception. Topics of interest include, but are not necessarily limited to: 
  • Extracting semantic information from visual sensors, 3D sensors, or different sensor modalities
  • Semantic scene interpretation (and decomposition into parts of interest)
  • Semantic object perception (incl. localization, identification, anchoring)
  • Categorization or classification of objects, rooms, and environments
  • Modeling (of objects and environments), registration using semantic information, etc.
  • Specifying and exploiting background knowledge for semantic perception and mapping
All papers must be submitted electronically as PDF files through the EasyChair submission system. In case of problems or large video attachments, contact the organizers: [email protected]

Invited Talks

The workshop will feature several invited talks from key researchers in the field:
  • Wolfram Burgard, University of Freiburg, Germany
  • Aitor Aldoma, Walter Wohlkinger and Markus Vincze, TU Vienna, Austria
  • Dieter Fox, University of Washington, USA
  • Andrzej Pronobis, Alper Aydemir, Kristoffer Sjöö and Patric Jensfelt, KTH Royal Institute of Technology, Sweden
  • Dave Meger and Jim Little, University of British Columbia, Canada

Program Committee

  • Francesco Amigoni, Politecnico di Milano, Italy
  • Michael Beetz, Technische Universitaet Muenchen, Germany
  • Sven Behnke, University of Bonn, Germany
  • Wolfram Burgard, University of Freiburg, Germany
  • Henrik Christensen, KUKA Chair of Robotics, RIM Georgia Tech, USA
  • Tom Duckett, University of Lincoln, UK
  • Joachim Hertzberg, University of Osnabrueck, Germany
  • Patric Jensfelt, KTH Royal Institute of Technology, Sweden
  • Kurt Konolige, Willow Garage, USA
  • Jim Little, University of British Columbia, Canada
  • Bhaskara Marthi, Willow Garage, USA
  • Alessandro Saffiotti, Örebro University, Sweden
  • Markus Vincze, TU Vienna, Austria


Contact the organizers: [email protected]

Partner workshop at ICRA

This is the first of two workshops at ICRA dealing with semantic perception, and we would also like to draw your attention to the other workshop on Friday. It is a hands-on workshop focusing, among other topics, on narrowing down the definition of semantics, life-long learning in the context of semantic mapping, knowledge representations, and higher-level perception. Posters and demos are accepted. For more information see: