Google Street View – Soon in 3D?

Some Google Street View cars were spotted in Italy this morning. Anyone who works in robotics will immediately notice the SICK laser scanners. It looks like we can expect 3D city data from Google sometime soon. Very interesting!

Street View car spotted in Rome

More pictures of the car here, here and here.

The cars have two side-facing vertical scanners, and another forward-facing horizontal scanner. Presumably they will do scan matching with the horizontal laser, and use that to align the data from the side-facing lasers into 3D point clouds. Typical output will look like this (the video shows data collected from a similar system built by one of my labmates).
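For anyone curious what that alignment step looks like in practice, here is a minimal sketch (Python, with made-up function names) of stacking the side-facing 2D scans into a single world-frame point cloud, assuming the per-scan vehicle poses have already been estimated by scan matching on the horizontal laser:

```python
import numpy as np

def scan_to_points(ranges, angles):
    """Convert one 2D laser scan (range/bearing pairs) to 3D points in the sensor frame."""
    ranges = np.asarray(ranges, dtype=float)
    angles = np.asarray(angles, dtype=float)
    return np.column_stack([ranges * np.cos(angles),
                            ranges * np.sin(angles),
                            np.zeros_like(ranges)])

def assemble_cloud(scans, poses, T_vehicle_laser):
    """Stack side-facing scans into a single world-frame point cloud.

    scans           : list of (ranges, angles) tuples, one per vertical scan
    poses           : list of 4x4 vehicle poses in the world frame, e.g. produced
                      by scan matching on the horizontal laser (assumed given here)
    T_vehicle_laser : fixed 4x4 transform from the laser frame to the vehicle frame
    """
    cloud = []
    for (ranges, angles), T_world_vehicle in zip(scans, poses):
        pts = scan_to_points(ranges, angles)                 # N x 3, laser frame
        pts_h = np.hstack([pts, np.ones((len(pts), 1))])     # homogeneous coordinates
        world = (T_world_vehicle @ T_vehicle_laser @ pts_h.T).T
        cloud.append(world[:, :3])
    return np.vstack(cloud)
```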

The other sensors on the pole seem to have been changed too. Gone are the Ladybug2 omnidirectional cameras used on the American and Australian vehicles, replaced by what looks like a custom camera array. This photo also shows a third sensor, which I can’t identify.

So, what is Google doing with 3D laser data? The obvious application is 3D reconstruction for Google Earth. Their current efforts rely on user-generated 3D models from SketchUp. They have quite a lot of contributed models, but there is only so far you can get with that approach. With an automated solution, they could go for blanket 3D coverage. For an idea of what the final output might look like, have a look at the work of Frueh and Zakhor at Berkeley, who combined aerial and ground-based laser scans with photo data to create full 3D city models. I am not sure Google will go to quite this length, but it certainly looks like they've made a start on collecting the street-level data. Valleywag claims Google are hiring 300 drivers for their European data gathering effort, so they will soon be swimming in laser data.

Frueh and Zakhor 3D city model


Google aren’t alone in their 3D mapping efforts. Startup Earthmine has been working on this for a while, using a stereo-vision based approach (check out their slick video demonstrating the system). I also recently built a street-view car myself, to gather data for my PhD research. One way or another, it looks like online maps are headed to a new level in the near future.

Update:  Loads more sightings of these cars, all over the world. San Francisco, Oxford, all over Spain. Looks like this is a full-scale data gathering effort, rather than a small test project.

Clever Feet

Check out this great TED talk by UC Berkeley biologist Robert Full. His subject is feet – or rather, all the clever ways animals have evolved to turn leg power into forward motion.
It’s a short, fun talk, and rather nicely makes the point that the secret to success for many of nature’s creations resides not in sensing or intelligence, but in good mechanical design. The nice thing about this is that nature’s mechanical innovations are much easier to duplicate than her neurological ones. The talk ends with examples of robotic applications, such as Boston Dynamics’ cockroach-inspired RHex and Stanford’s gecko-inspired climbing robots.

Hat tip: Milan

OpenGL Invades the Real World

Augmented reality systems are beginning to look pretty good these days. The videos below show some recent results from an ISMAR paper by Georg Klein. The graphics shown are inserted directly into the live video stream, so that you can play with them as you wave the camera around. To do this, the system needs to know where the camera is, so that it can render the graphics with the right size and position. Figuring out the camera motion by tracking features in the video turns out to be not that easy, and people have been working on it for years. As you can see below, the current crop of solutions are pretty solid, and run at framerate too. More details on Georg’s website.
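To make the rendering step concrete, here is a minimal sketch of the projection involved – a plain pinhole camera model, not Georg's actual implementation – showing how a tracked camera pose plus the calibrated intrinsics is enough to place virtual geometry in the image:

```python
import numpy as np

def project_points(points_world, R, t, K):
    """Project 3D points into pixel coordinates with a pinhole camera model.

    points_world : N x 3 vertices of the virtual object, in the world frame
    R, t         : world-to-camera rotation (3x3) and translation (3,), i.e. the
                   pose the tracker estimates for every video frame
    K            : 3x3 camera intrinsic matrix (from a one-off calibration)
    """
    cam = points_world @ R.T + t        # transform into the camera frame
    cam = cam[cam[:, 2] > 1e-6]         # keep only points in front of the camera
    pix = cam @ K.T                     # apply the intrinsics
    return pix[:, :2] / pix[:, 2:3]     # perspective divide -> pixel coordinates
```

Draw the projected vertices over each video frame and, as long as the tracker's (R, t) estimate is accurate, the virtual object appears glued to the real scene.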

[Embedded YouTube video]

[Embedded YouTube video]

Back in 2005, Andy Davison’s original augmented reality system got me excited enough that I decided to do a PhD. The robustness of these systems has improved a lot since then, to the point where they’re a fairly short step from making good AR games possible. In fact, there are a few other cool computer-vision based game demos floating around the lab at the moment. It’s easy to see this starting a new gaming niche. Basic vision-based games have been around for a while, but the new systems really are a shift in gear.

There are still some problems to be ironed out – current systems don’t deal with occlusion at all, for example. You can see some other issues in the video involving moving objects and repetitive texture. Still, it looks like they’re beginning to work well enough to start migrating out of the lab. First applications will definitely be of the camera-and-screen variety. Head-mounted display style systems are still some way off; the reason being that decent displays just don’t seem to exist right now.

(For people who wonder what this has to do with robotics – the methods used for tracking the environment here are basically identical to those used for robot navigation over larger scales.)

Citation: "Parallel Tracking and Mapping for Small AR Workspaces", Georg Klein and David Murray, ISMAR 2007.

Big Dog on Ice

Boston Dynamics just released a new video of Big Dog, their very impressive walking robot. This time it tackles snow, ice and jumping, as well as its old party trick of recovering after being kicked. Apparently it can carry 150 kg too. This is an extremely impressive demo – it seems light-years ahead of any other walking robot I've seen.

[Embedded YouTube video]

I must admit to having almost no idea how the robot works. Apparently it uses joint sensors, foot pressure, gyroscope and stereo vision. Judging from the speed of the reactions, I doubt vision plays much of a role. It looks like the control is purely reactive – the robot internally generates a simple gait (ignoring the environment), and then responds to disturbances to try and keep itself stable. While they’ve obviously got a pretty awesome controller, even passive mechanical systems can be surprisingly stable with good design – have a look at this self-stabilizing bicycle.
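As a toy illustration of that "open-loop gait plus reactive correction" idea – purely my speculation, nothing to do with Big Dog's actual controller – something like the following already pushes back against a kick without any knowledge of the environment:

```python
import numpy as np

def reactive_leg_targets(t, roll, pitch, k_roll=0.5, k_pitch=0.5):
    """Toy reactive controller: open-loop gait plus a correction from body tilt.

    t            : current time in seconds
    roll, pitch  : body tilt from the IMU, in radians
    Returns target hip angles for a hypothetical four-legged robot.
    """
    freq, amp = 1.5, 0.3                     # gait frequency (Hz) and amplitude (rad)
    phases = {"fl": 0.0, "fr": np.pi,        # trot: diagonal legs move in phase
              "rl": np.pi, "rr": 0.0}
    targets = {}
    for leg, phase in phases.items():
        gait = amp * np.sin(2 * np.pi * freq * t + phase)   # nominal open-loop gait
        # Reactive term: lean the legs against the measured tilt, so a push or a
        # slip produces an immediate corrective response with no planning at all.
        side = -1.0 if leg.endswith("l") else 1.0
        front = 1.0 if leg.startswith("f") else -1.0
        targets[leg] = gait + side * k_roll * roll + front * k_pitch * pitch
    return targets
```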

The one part of the video where it looks like the control isn’t purely reactive is the sped-up sequence towards the end where it climbs over building rubble. There it does seem to be choosing its foot placement. I would guess they’re just beginning to integrate some vision information. Unsurprisingly, walking with planning is currently much slower than “walking by moving your legs”.

Either way, I guess DARPA will be suitably impressed.

Update: More details on how the robot works here.

Deep Learning

After working in robotics for a while, it becomes apparent that despite all the recent progress, the underlying machine learning tools at our disposal are still quite primitive. Our standard techniques, like Support Vector Machines and boosting, are more than ten years old, and while you can do some neat things with them, in practice they are limited in what they can learn efficiently. There's been lots of progress since the techniques were first published, particularly through careful design of features, but to get beyond the current plateau it feels like we're going to need something really new.

For a glimmer of what "something new" might look like, I highly recommend this wonderful Google Tech Talk by Geoff Hinton: "The Next Generation of Neural Networks", where he discusses restricted Boltzmann machines. There are some stunning results, and an entertaining history of learning algorithms, during which he amusingly dismisses SVMs as "a very clever type of Perceptron". There's a more technical version of the talk in this NIPS tutorial, along with a workshop on the topic. Clearly the approach scales beyond toy problems – they have an entry sitting high on the Netflix Prize leaderboard.
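For anyone who wants a feel for how these networks are trained, here is a bare-bones sketch of the contrastive divergence (CD-1) update Hinton describes, for a single layer of binary units; stacking such RBMs layer by layer is what gives the deep architecture:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.05, rng=None):
    """One contrastive-divergence (CD-1) update for a binary RBM.

    v0   : batch of visible vectors, shape (batch, n_visible)
    W    : weight matrix, shape (n_visible, n_hidden)
    b, c : visible and hidden bias vectors
    """
    rng = rng or np.random.default_rng()
    # Positive phase: hidden activations driven by the data.
    h0_prob = sigmoid(v0 @ W + c)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    # Negative phase: one Gibbs step back to a "reconstruction" of the data.
    v1_prob = sigmoid(h0 @ W.T + b)
    h1_prob = sigmoid(v1_prob @ W + c)
    # Update with the difference between data and reconstruction statistics.
    n = v0.shape[0]
    W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / n
    b += lr * (v0 - v1_prob).mean(axis=0)
    c += lr * (h0_prob - h1_prob).mean(axis=0)
    return W, b, c
```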

These results with deep architectures are very exciting. Neural network research has effectively been abandoned by most of the machine learning community for years, partly because SVMs work so well, and partly because there was no good way to train multi-layer networks. SVMs were very pleasant to work with – no parameter tuning or black magic involved, you just throw data at them and press start. However, it seems clear that to make real progress we're going to have to return to multi-layer learning architectures at some point. It's good to see progress in that direction.

Hat tip: Greg Linden

More from ISRR

ISRR finished today. It’s been a good conference, low on detailed technical content, but high on interaction and good for an overview of parts of robotics I rarely get to see.

One of the highlights of the last two days was a demo from Japanese robotics legend Shigeo Hirose, who put on a show with his ACM R5 swimming snake robot in the hotel’s pool. Like many Japanese robots, it’s remote controlled rather than autonomous, but it’s a marvellous piece of mechanical design. Also on show was a hybrid roller-walker robot and some videos of a massive seven-ton climbing robot for highway construction.

[Embedded YouTube video]

Another very interesting talk with some neat visual results was given by Shree Nayar, on understanding illumination in photographs. If you take a picture of a scene, the light that reaches the camera can be thought of as having two components – direct and global. The "direct light" leaves the light source and arrives at the camera via a single reflection off the object. The "global light" takes more complicated paths, for example via multiple reflections, subsurface scatter, volumetric scatter, etc. What Nayar showed was that by controlling the illumination, it's possible to separate the direct and global components of the lighting. Actually, this turns out to be almost embarrassingly simple to do – and it produces some very interesting results. Some are shown below, and many more here. It's striking how much the direct-only photographs look like renderings from simple computer graphics systems like OpenGL. Most of the reason early computer graphics looked unrealistic was due to the difficulty of modelling the global illumination component. The full paper is here.
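The "embarrassingly simple" part boils down to a per-pixel max and min over images taken under shifted high-frequency illumination patterns (e.g. checkerboards with half the projector pixels lit). A rough sketch of the idea – my paraphrase, not the authors' code:

```python
import numpy as np

def separate_direct_global(images):
    """Separate direct and global illumination from images captured under
    shifted high-frequency illumination patterns.

    images : stack of shape (n_patterns, H, W), each taken with roughly half
             of the projector/source pixels lit.
    With half the source on, a lit scene point sees direct + global/2 while an
    unlit one sees only global/2, so the per-pixel max and min over the shifted
    patterns recover both components.
    """
    imgs = np.asarray(images, dtype=float)
    L_max = imgs.max(axis=0)
    L_min = imgs.min(axis=0)
    direct = L_max - L_min
    global_ = 2.0 * L_min
    return direct, global_
```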

Scene, direct component and global component

Lots of other great technical talks too, but obviously I’m biased towards posting about the ones with pretty pictures!

Citation: "Visual Chatter in the Real World", S. Nayar et al., ISRR 2007.

ISRR Highlights – Day 1

I’m currently in Hiroshima, Japan at ISRR. It’s been a good conference so far, with lots of high quality talks. I’m also enjoying the wonderful Japanese food (though fish for breakfast is a little strange).

One of the most interesting talks from Day 1 was about designing a skin-like touch sensor. The design is ingeniously simple, consisting of a layer of urethane foam with some embedded LEDs and photodiodes. The light from the LED scatters into the foam and is detected by the photodiode. When the foam is deformed by pressure, the amount of light reaching the photodiode changes. By arranging an array of these sensing sites under a large sheet of foam, you get a skin-like large-area pressure sensor. The design is simple, cheap, and appears to be quite effective.
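Reading such a sensor out is correspondingly simple. A toy sketch, under the (assumed) model that foam deformation changes the scattered light reaching each photodiode roughly monotonically with applied pressure:

```python
import numpy as np

def pressure_map(readings, baseline, gain=1.0):
    """Toy conversion of photodiode readings into a pressure image.

    readings : 2D array of photodiode values from the grid of sensing sites
    baseline : readings taken with no load (foam undeformed)
    gain     : calibration constant mapping light change to pressure
    """
    delta = np.asarray(readings, dtype=float) - np.asarray(baseline, dtype=float)
    return gain * np.clip(delta, 0.0, None)   # crude monotonic model, per sensing site
```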

Principle of the Sensor

Having a decent touch sensor like this is important. People rely on their sense of touch much more than they realize – one of the presenters demonstrated this by showing some videos of people trying to perform simple mechanical tasks with anaesthetised sensory neurons (they weren’t doing well). Walking robots weren’t getting very far until people realized the importance of having pressure sensors in the soles of the feet.

The authors were able to show some impressive new abilities with a humanoid robot using their sensor. Unfortunately I can't find their videos online, but the figure below shows a few frames of the robot picking up a 30 kg load. Using its touch sensor, the robot can steady itself against the table, which helps with stability.

Touching the washing

I get the impression that the sensor is limited by the thickness of the foam – too thick to use on fingers for example. It’s also a long way from matching the abilities of human skin, which has much higher resolution and sensitivity to other stimuli like heat, etc. Still, it’s a neat technology!

Update: Here's another image of the robot using its touch sensor to help with a roll-and-rise manoeuvre. There's a video over at BotJunkie.

Citation: "Whole body haptics for augmented humanoid task capabilities", Yasuo Kuniyoshi, Yoshiyuki Ohmura, and Akihiko Nagakubo, International Symposium on Robotics Research 2007.

Off to ISRR

For the next week I'll be in Japan attending the International Symposium on Robotics Research. Should be lots of fun, and a good time to find out about all the new developments outside of my little corner of the robot research universe. Come say hello if you're at the conference.

Urban Challenge Winners Announced

1st Place – Tartan Racing (Carnegie Mellon)

2nd Place – Stanford Racing Team

3rd Place – Victor Tango (Virginia Tech)

That’s all the info on the web at the moment. More details should be available soon. Check Wired or TGDaily.

Update:
The details are out (video, photos, more photos). The biggest surprise was that the final ordering came down to time alone; no team was penalized for violating the rules of the road (I wonder if this can be correct – the webcast showed Victor Tango mounting the kerb at one point). On adjusted time, Tartan was about 20 minutes ahead of Stanford, with Victor Tango another 20 minutes behind. MIT placed fourth.

DARPA director Tony Tether seemed to state quite strongly that this will be the final Grand Challenge. I’d wait and see on that front. I seem to remember similar statements after the 2005 Challenge. It’s possible the event will continue, but not under DARPA. What exactly the subject of a future challenge would be is not obvious. There’s still a lot of work to be done to build reliable autonomous vehicles, but many of the basics have now been covered. The Register reports that Red Whittaker is proposing an endurance event to test performance over a longer period with variable weather conditions. I think maybe a more interesting challenge would be to raise the bar on sensing issues. Right now the teams are heavily reliant on pre-constructed digital maps and GPS. In principle, there’s no reason they couldn’t run without GPS using only a normal road-map, but taking the crutches away would force the teams to deal with some tough issues. It’s a significant step up, but no worse than the jump between the 2005 Challenge and yesterday.

Whatever DARPA decides to do, I hope they don’t make the mistake of walking away from this prematurely. The Grand Challenges have built up a big community of researchers around autonomous vehicles. They’re also priceless PR for science and engineering in general. I think the teams are resourceful enough to find funding for themselves, but without the crucial ingredient of a public challenge to work toward, things may lose momentum. The next time a politician frets about the low uptake of science courses, I hope someone suggests funding another Grand Challenge.

Six Robots Cross the Line

The Urban Challenge is over – six of the eleven finalists completed the course. Stanford, Tartan Racing and Victor Tango all finished within a few minutes of each other, just before DARPA’s 6-hour time limit. Ben Franklin, MIT and Cornell also finished, but it looks like they were outside the time limit. The DARPA judges have now got to collate all the data about how well the robots obeyed the rules of the road, and will announce a winner tomorrow morning. It’s going to be very close. From watching the webcast, it looks like either Stanford or Tartan Racing will take the top spot, but making the call between them will be very hard. Both put in almost flawless performances.

Junior Finishes

Six hours of urban driving, without any human intervention, is quite a remarkable feat. In fact, watching the best cars, I quickly forgot that they weren’t being driven by humans. That’s really quite amazing. I can’t think of any higher praise for the achievement of the competitors.

There were some thrills and spills along the way – TerraMax rammed a building, TeamUCF wandered into a garden, and MIT and Cornell had a minor collision. Once the weaker bots were eliminated though, everything went remarkably smoothly. The last four or five hours passed almost without event. MIT clearly had some trouble, randomly stopping and going very slowly on the off-road sections (looks like their sensor thresholds were set too low), but they’re a first time entry, so getting to the finish line at all is a major achievement.

DARPA put on an extremely professional event. In an interview after the finish, DARPA director Tony Tether said he didn't expect to run another Challenge. It will be interesting to see where autonomous driving goes from here. The top teams have clearly made huge progress, but the technology is still a long way from the point where you'd let it drive the kids to school. Many things about the challenge were simplified – there were no pedestrians to avoid, and the vehicles all had precise pre-constructed digital maps of the race area, specifying things like where the stop lines were. Putting this technology to use in the real world is still some distance away, but much closer than anyone would have imagined five years ago.