Archive for the 'New Research' Category

An Insider’s Guide to BigDog

In common with half of YouTube, I was mesmerized by the BigDog videos from Boston Dynamics earlier in the year, though I couldn’t say much about how the robot worked. For everyone hungry for more technical details, check out the talk by Marc Raibert at Carnegie Mellon’s Field Robotics 25 event. There’s some interesting discussion of the design of the system and where it’s headed, plus more great video.

There are a bunch of other worthwhile talks from the event. I particularly enjoyed Hugh Durrant-Whyte’s description of building a fully automated container terminal “without a graduate student in 1000km”.

Off to ICRA

For the next week I’ll be at the International Conference on Robotics and Automation in Pasadena. On Monday and Tuesday I’m going to the Future of Visual Navigation workshop. At the main conference I’ll be presenting my paper “Accelerated Appearance-Only SLAM”, with some new ideas for very fast inference in our FAB-MAP framework. We’ve also just released the software.

If you’re at the conference, come say hi!

OpenGL Invades the Real World

Augmented reality systems are beginning to look pretty good these days. The videos below show some recent results from an ISMAR paper by Georg Klein. The graphics shown are inserted directly into the live video stream, so that you can play with them as you wave the camera around. To do this, the system needs to know where the camera is, so that it can render the graphics with the right size and position. Figuring out the camera motion by tracking features in the video turns out to be hard, and people have been working on it for years. As you can see below, the current crop of solutions is pretty solid, and runs at framerate too. More details on Georg’s website.

[Embedded YouTube videos]
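
To make the "right size and position" point concrete, here is a minimal sketch of the rendering side, assuming a simple pinhole camera. It is not taken from Georg's system; the function name, the focal length and the image centre are all made up for illustration. Given the camera pose from the tracker, each 3D point of a virtual object is projected into the image like this:

```python
def project_point(point_world, rotation, translation, focal_px, principal_point):
    """Project a 3D world point into pixel coordinates with a pinhole model.

    rotation is a 3x3 world-to-camera matrix (list of rows) and translation
    a 3-vector, so that p_cam = R * p_world + t. These names and conventions
    are assumptions for this sketch. Returns None if the point lies behind
    the camera.
    """
    # Transform the point from world coordinates into the camera frame.
    x, y, z = (
        sum(rotation[r][c] * point_world[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if z <= 0:
        return None
    # Perspective division: points further away land closer to the image centre.
    u = focal_px * x / z + principal_point[0]
    v = focal_px * y / z + principal_point[1]
    return (u, v)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
centre = (320.0, 240.0)  # hypothetical image centre for a 640x480 frame

# The same virtual point seen from 2 units away, then from 4 units away.
near = project_point((1.0, 0.0, 2.0), identity, (0.0, 0.0, 0.0), 500.0, centre)
far = project_point((1.0, 0.0, 2.0), identity, (0.0, 0.0, 2.0), 500.0, centre)
print(near, far)
```

Pulling the camera back halves the point's offset from the image centre, which is why a bad pose estimate makes the graphics visibly slide or change size against the real scene.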

Back in 2005, Andy Davison’s original augmented reality system got me excited enough that I decided to do a PhD. The robustness of these systems has improved a lot since then, to the point where they’re a fairly short step from making good AR games possible. In fact, there are a few other cool computer-vision based game demos floating around the lab at the moment. It’s easy to see this starting a new gaming niche. Basic vision-based games have been around for a while, but the new systems really are a shift in gear.

There are still some problems to be ironed out - current systems don’t deal with occlusion at all, for example. You can see some other issues in the video involving moving objects and repetitive texture. Still, it looks like they’re beginning to work well enough to start migrating out of the lab. First applications will definitely be of the camera-and-screen variety. Head-mounted display style systems are still some way off, because decent displays just don’t seem to exist right now.

(For people who wonder what this has to do with robotics - the methods used for tracking the environment here are basically identical to those used for robot navigation over larger scales.)

Citation: “Parallel Tracking and Mapping for Small AR Workspaces”, Georg Klein and David Murray, ISMAR 2007.

Deep Learning

After working in robotics for a while, it becomes apparent that despite all the recent progress, the underlying machine learning tools we have at our disposal are still quite primitive. Our standard techniques, Support Vector Machines and boosting methods, are both more than ten years old, and while you can do some neat things with them, in practice they are limited in the kinds of things they can learn efficiently. There’s been lots of progress since the techniques were first published, particularly through careful design of features, but to get beyond the current plateau it feels like we’re going to need something really new.

For a glimmer of what “something new” might look like, I highly recommend this wonderful Google Tech Talk by Geoff Hinton: “The Next Generation of Neural Networks”, where he discusses restricted Boltzmann machines. There are some stunning results, and an entertaining history of learning algorithms, during which he amusingly dismisses SVMs as “a very clever type of Perceptron”. There’s a more technical version of the talk in this NIPS tutorial, along with a workshop on the topic. Clearly the approach scales beyond toy problems - they have an entry sitting high on the Netflix Prize leaderboard.

These results with deep architectures are very exciting. Neural network research has effectively been abandoned by most of the machine learning community for years, partly because SVMs work so well, and partly because there was no good way to train multi-layer networks. SVMs were very pleasant to work with - there was no parameter tuning or black magic involved, you just threw data at them and pressed start. However, it seems clear that to make real progress we’re going to have to return to multi-layer learning architectures at some point. It’s good to see progress in that direction.

Hat tip: Greg Linden

More from ISRR

ISRR finished today. It’s been a good conference, low on detailed technical content, but high on interaction and good for an overview of parts of robotics I rarely get to see.

One of the highlights of the last two days was a demo from Japanese robotics legend Shigeo Hirose, who put on a show with his ACM R5 swimming snake robot in the hotel’s pool. Like many Japanese robots, it’s remote controlled rather than autonomous, but it’s a marvellous piece of mechanical design. Also on show was a hybrid roller-walker robot and some videos of a massive seven-ton climbing robot for highway construction.

[Embedded YouTube video]

Another very interesting talk with some neat visual results was given by Shree Nayar, on understanding illumination in photographs. If you take a picture of a scene, the light that reaches the camera can be thought of as having two components - direct and global. The “direct light” leaves the light source and arrives at the camera via a single reflection off the object. The “global light” takes more complicated paths, for example via multiple reflections, subsurface scatter, volumetric scatter, etc. What Nayar showed was that by controlling the illumination, it’s possible to separate the direct and global components of the lighting. Actually, this turns out to be almost embarrassingly simple to do - and it produces some very interesting results. Some are shown below, and many more here. It’s striking how much the direct-only photographs look like renderings from simple computer graphics systems like OpenGL. Early computer graphics looked unrealistic largely because of the difficulty of modelling the global illumination component. The full paper is here.

[Figure panels: Scene, Direct, Global]
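
For the curious, here is a rough sketch of how simple the separation can be. It follows the max/min formulation as I understand it from Nayar and colleagues' published work on high-frequency illumination, and is not necessarily the exact procedure shown in the talk. The assumptions: the scene is photographed several times under a fine, shifting pattern (a checkerboard, say) that lights half the scene at a time, and every pixel is fully lit in at least one image and fully unlit in at least one other. A lit pixel then records the direct light plus half the global light, and an unlit pixel records half the global light only. The function name and data layout are made up for illustration.

```python
def separate_direct_global(images):
    """Split a stack of pattern-lit images into direct and global components.

    images is a list of same-sized 2D lists (rows of floats), one per shifted
    illumination pattern. Assumes each pattern lights half of the scene, so
    per pixel: brightest = direct + global / 2, darkest = global / 2.
    """
    height = len(images[0])
    width = len(images[0][0])
    direct = [[0.0] * width for _ in range(height)]
    global_light = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            values = [image[r][c] for image in images]
            brightest = max(values)  # pixel directly lit by the pattern
            darkest = min(values)    # pixel not directly lit by the pattern
            direct[r][c] = brightest - darkest
            global_light[r][c] = 2.0 * darkest
    return direct, global_light


# Toy 1x2 "scene": both pixels have direct = 0.5 and global = 0.5.
# The two patterns light opposite pixels.
pattern_a = [[0.75, 0.25]]
pattern_b = [[0.25, 0.75]]
direct, global_light = separate_direct_global([pattern_a, pattern_b])
print(direct, global_light)
```

With real photographs you would want more than two shifts of the pattern, and some care at pattern edges where a pixel is only partly lit, but the per-pixel arithmetic stays this small.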

Lots of other great technical talks too, but obviously I’m biased towards posting about the ones with pretty pictures!

Citation: “Visual Chatter in the Real World”, S. Nayar et al., ISRR 2007

ISRR Highlights - Day 1

I’m currently in Hiroshima, Japan at ISRR. It’s been a good conference so far, with lots of high quality talks. I’m also enjoying the wonderful Japanese food (though fish for breakfast is a little strange).

One of the most interesting talks from Day 1 was about designing a skin-like touch sensor. The design is ingeniously simple, consisting of a layer of urethane foam with some embedded LEDs and photodiodes. The light from the LED scatters into the foam and is detected by the photodiode. When the foam is deformed by pressure, the amount of light reaching the photodiode changes. By arranging an array of these sensing sites under a large sheet of foam, you get a skin-like large-area pressure sensor. The design is simple, cheap, and appears to be quite effective.

Principle of the Sensor
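
As a rough illustration of the read-out side (my own guess at a minimal version, not the authors' implementation), each sensing site can be compared against its resting photodiode level, with a site flagged as touched when the signal shifts by more than some fraction. The function name, the 5% threshold and the example values are all assumptions.

```python
def contact_map(baseline, readings, threshold=0.05):
    """Flag sensing sites whose photodiode signal has shifted from rest.

    baseline and readings are equal-length lists of positive photodiode
    levels, one per sensing site. threshold is a hypothetical relative
    change (5% by default) above which a site counts as pressed.
    """
    changes = []
    for rest_level, current_level in zip(baseline, readings):
        # Use the magnitude of the change: the text only says the amount
        # of light changes under pressure, not in which direction.
        changes.append(abs(current_level - rest_level) / rest_level)
    touched = [change > threshold for change in changes]
    return changes, touched


# Three sites: untouched, lightly pressed, firmly pressed (made-up values).
changes, touched = contact_map([1.0, 1.0, 2.0], [1.0, 1.2, 1.0])
print(touched)
```

A real skin would also need calibration from signal change to actual pressure, which this sketch does not attempt.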

Having a decent touch sensor like this is important. People rely on their sense of touch much more than they realize - one of the presenters demonstrated this by showing some videos of people trying to perform simple mechanical tasks with anaesthetised sensory neurons (they weren’t doing well). Walking robots weren’t getting very far until people realized the importance of having pressure sensors in the soles of the feet.

The authors were able to show some impressive new abilities with a humanoid robot using their sensor. Unfortunately I can’t find their videos online, but the figure below shows a few frames of the robot picking up a 30 kg load. Using its touch sensor the robot can steady itself against the table, which helps with stability.

Touching the washing

I get the impression that the sensor is limited by the thickness of the foam - too thick to use on fingers for example. It’s also a long way from matching the abilities of human skin, which has much higher resolution and sensitivity to other stimuli like heat, etc. Still, it’s a neat technology!

Update: Here’s another image of the robot using its touch sensor to help with a roll-and-rise manoeuvre. There’s a video over at BotJunkie.

Citation: “Whole body haptics for augmented humanoid task capabilities”, Yasuo Kuniyoshi, Yoshiyuki Ohmura, and Akihiko Nagakubo, International Symposium on Robotics Research 2007.