Mobile Manipulation Made Easy

GetRobo has an interesting interview with Brian Gerkey of Willow Garage. Willow Garage are a strange outfit – a not-for-profit company developing open source robotic hardware and software, closely linked to Stanford. They’re funded privately by a dot com millionaire. They started with several projects including autonomous cars and autonomous boats, but now concentrate on building a new robot called PR2.

The key thing PR2 is designed to support is mobile manipulation. Basically, research robots right now come in two varieties – sensors on wheels, that move about but can’t interact with anything, and fixed robotic arms, that manipulate objects but are rooted in place. A few research groups have built mobile manipulation systems where the robot can both move about and interact with objects, but the barrier to entry here is really high. There’s a massive amount of infrastructure you need to get a decent mobile manipulation platform going – navigation, obstacle avoidance, grasping, cross-calibration, etc. As a result, there are very, very few researchers in this area. This is a terrible shame, because there are all sorts of interesting possibilities opened up by having a robot that can both move and interact. Willow Garage’s PR2 is designed to fill the gap – an off-the-shelf system that provides basic mobile manipulation capabilities.

Brian: We have a set of demos that we hope the robot can do out of the box. So things like basic navigation around the environment so that it doesn’t run into things, basic motion planning with the arms, and basic identification – looking at an object sitting on the table, picking it out, picking it up and moving it somewhere. So the idea is that it should have some basic mobile manipulation capabilities, so that the researcher who’s interested in object recognition doesn’t have to touch the arm part in order to make the object recognizer better. That’s not to say the arm part can’t be improved, but it’s good enough.

If they can pull this off it’ll be great for robotics research. All the pieces don’t have to be perfect, just enough so that say a computer vision group could start exploring interactive visual learning without having to worry too much about arm kinematics, or a manipulation group could experiment on a mobile platform without having to write a SLAM engine.

Another interesting part of the interview was the discussion of software standards. Brian is one of the lead authors of Player/Stage, the most popular robot OS. Player is popular, but very far from universal – there are nearly as many robot OSes as there are robot research groups (e.g. CARMEN, Orca, MRPT, MOOS, Orocos, CLARAty, MS Robotics Studio, etc, etc). It seems PR2 will have yet another OS, for which there are no apologies:

I think it’s probably still too early in robotics to come up with a standard. I don’t think we have enough deployed systems that do real work to have a meaningful standard. Most of the complex robots we have are in research labs. A research lab is the first place we throw away a standard. They’re building the next thing. So in robotics labs, a standard won’t be of much use. Standards are much more useful when you get to the commercialization side, to build interoperable pieces. At that point we may want to talk about standards, and I think it’s still a little early. Right now I’m much more interested in getting a large user community and a large developer community. I’m less interested in whether it’s blessed as a standard by a standards body.

Anyone working in robotics will recognise the truth of this. Very much a sensible attitude for the moment.

Google Street View – Soon in 3D?

Some Google Street View cars were spotted in Italy this morning. Anyone who works in robotics will immediately notice the SICK laser scanners. It looks like we can expect 3D city data from Google sometime soon. Very interesting!

Street View car spotted in Rome

More pictures of the car here, here and here.

The cars have two side-facing vertical scanners, and another forward-facing horizontal scanner. Presumably they will do scan matching with the horizontal laser, and use that to align the data from the side-facing lasers to get some 3D point clouds. Typical output will look like this (the video shows data collected from a similar system built by one of my labmates).
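For the curious, here’s a rough sketch of how that alignment might work. Everything here is illustrative – the mounting offset, the example poses and ranges are all made up – but the core idea is just this: scan matching on the horizontal laser yields a 2D vehicle pose per timestamp, and each side-facing scan gets projected into the world frame at that pose.

```python
import math

def project_vertical_scan(pose, scan, lateral_offset=0.3):
    """Project one side-facing vertical scan into world coordinates.

    pose: (x, y, heading) of the vehicle at scan time, e.g. from
          scan matching on the horizontal laser (heading in radians).
    scan: list of (range_m, elevation_rad) returns from the
          side-facing 2D scanner.
    lateral_offset: the scanner's sideways mounting offset from the
          vehicle centre -- an assumed value.
    """
    x, y, heading = pose
    side = heading + math.pi / 2.0       # scan plane faces sideways
    points = []
    for r, elev in scan:
        horiz = r * math.cos(elev)       # horizontal distance in the scan plane
        px = x + (lateral_offset + horiz) * math.cos(side)
        py = y + (lateral_offset + horiz) * math.sin(side)
        pz = r * math.sin(elev)          # height above the scanner
        points.append((px, py, pz))
    return points

# As the car drives, successive scans sweep out a 3D point cloud:
cloud = []
trajectory = [((0.0, 0.0, 0.0), [(5.0, 0.0), (5.2, 0.3)]),
              ((1.0, 0.0, 0.0), [(5.1, 0.0), (5.3, 0.3)])]
for pose, scan in trajectory:
    cloud.extend(project_vertical_scan(pose, scan))
```

The quality of the final cloud depends almost entirely on how good the pose estimates are, which is why the scan matching step matters so much.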

The other sensors on the pole seem to have been changed too. Gone are the Ladybug2 omnidirectional cameras used on the American and Australian vehicles, replaced by what looks like a custom camera array. This photo also shows a third sensor, which I can’t identify.

So, what is Google doing with 3D laser data? The obvious application is 3D reconstruction for Google Earth. Their current efforts to do this involve user-generated 3D models from Sketchup. They have quite a lot of contributed models, but there is only so far you can get with an approach like that. With an automated solution, they could go for blanket 3D coverage. For an idea of what the final output might look like, have a look at the work of Frueh and Zakhor at Berkeley. They combined aerial and ground based laser with photo data to create full 3D city models. I am not sure Google will go to quite this length, but it certainly looks like they’ve made a start on collecting the street-level data. Valleywag claims Google are hiring 300 drivers for their European data gathering efforts, so they will soon be swimming in laser data.

Frueh and Zakhor 3D city model


Google aren’t alone in their 3D mapping efforts. Startup Earthmine has been working on this for a while, using a stereo-vision based approach (check out their slick video demonstrating the system). I also recently built a street-view car myself, to gather data for my PhD research. One way or another, it looks like online maps are headed to a new level in the near future.

Update:  Loads more sightings of these cars, all over the world. San Francisco, Oxford, all over Spain. Looks like this is a full-scale data gathering effort, rather than a small test project.

Big Dog on Ice

Boston Dynamics just released a new video of Big Dog, their very impressive walking robot. This time it tackles snow, ice and jumping, as well as its old party trick of recovering after being kicked. Apparently it can carry 150 kg too. This is an extremely impressive demo – it seems light-years ahead of any other walking robot I’ve seen.


I must admit to having almost no idea how the robot works. Apparently it uses joint sensors, foot pressure, gyroscope and stereo vision. Judging from the speed of the reactions, I doubt vision plays much of a role. It looks like the control is purely reactive – the robot internally generates a simple gait (ignoring the environment), and then responds to disturbances to try and keep itself stable. While they’ve obviously got a pretty awesome controller, even passive mechanical systems can be surprisingly stable with good design – have a look at this self-stabilizing bicycle.
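To make that idea concrete, here’s a toy sketch of what “open-loop gait plus reactive correction” might look like. To be clear, everything in it – the gains, the sinusoidal gait, the planar state – is invented for illustration, not anything Boston Dynamics has published:

```python
import math

def reactive_leg_command(gait_phase, roll, pitch, roll_rate, pitch_rate,
                         kp=0.8, kd=0.1):
    """One control tick: open-loop gait plus reactive balance.

    The gait term ignores the environment completely; the PD terms
    push back against measured body tilt from the gyros. All gains
    and the sinusoidal gait pattern are made-up illustrative values.
    """
    nominal_extension = math.sin(gait_phase)          # blind gait pattern
    pitch_correction = -(kp * pitch + kd * pitch_rate)
    roll_correction = -(kp * roll + kd * roll_rate)
    return nominal_extension + pitch_correction, roll_correction
```

The appeal of this structure is that the gait generator never needs to know anything about the terrain – all the robustness comes from the correction layer reacting fast enough to disturbances.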

The one part of the video where it looks like the control isn’t purely reactive is the sped-up sequence towards the end where it climbs over building rubble. There it does seem to be choosing its foot placement. I would guess they’re just beginning to integrate some vision information. Unsurprisingly, walking with planning is currently much slower than “walking by moving your legs”.

Either way, I guess DARPA will be suitably impressed.

Update: More details on how the robot works here.

Urban Challenge Winners Announced

1st Place – Tartan Racing (Carnegie Mellon)

2nd Place – Stanford Racing Team

3rd Place – Victor Tango (Virginia Tech)

That’s all the info on the web at the moment. More details should be available soon. Check Wired or TGDaily.

The details are out (video, photos, more photos). The biggest surprise was that final ordering came down to time alone; no team was penalized for violating the rules of the road (I wonder if this can be correct – the webcast showed Victor Tango mounting the kerb at one point). On adjusted time, Tartan was about 20 minutes ahead of Stanford, with Victor Tango another 20 minutes behind. MIT placed fourth.

DARPA director Tony Tether seemed to state quite strongly that this will be the final Grand Challenge. I’d wait and see on that front. I seem to remember similar statements after the 2005 Challenge. It’s possible the event will continue, but not under DARPA. What exactly the subject of a future challenge would be is not obvious. There’s still a lot of work to be done to build reliable autonomous vehicles, but many of the basics have now been covered. The Register reports that Red Whittaker is proposing an endurance event to test performance over a longer period with variable weather conditions. I think maybe a more interesting challenge would be to raise the bar on sensing issues. Right now the teams are heavily reliant on pre-constructed digital maps and GPS. In principle, there’s no reason they couldn’t run without GPS using only a normal road-map, but taking the crutches away would force the teams to deal with some tough issues. It’s a significant step up, but no worse than the jump between the 2005 Challenge and yesterday.

Whatever DARPA decides to do, I hope they don’t make the mistake of walking away from this prematurely. The Grand Challenges have built up a big community of researchers around autonomous vehicles. They’re also priceless PR for science and engineering in general. I think the teams are resourceful enough to find funding for themselves, but without the crucial ingredient of a public challenge to work toward, things may lose momentum. The next time a politician frets about the low uptake of science courses, I hope someone suggests funding another Grand Challenge.

Six Robots Cross the Line

The Urban Challenge is over – six of the eleven finalists completed the course. Stanford, Tartan Racing and Victor Tango all finished within a few minutes of each other, just before DARPA’s 6-hour time limit. Ben Franklin, MIT and Cornell also finished, but it looks like they were outside the time limit. The DARPA judges have now got to collate all the data about how well the robots obeyed the rules of the road, and will announce a winner tomorrow morning. It’s going to be very close. From watching the webcast, it looks like either Stanford or Tartan Racing will take the top spot, but making the call between them will be very hard. Both put in almost flawless performances.

Junior Finishes

Six hours of urban driving, without any human intervention, is quite a remarkable feat. In fact, watching the best cars, I quickly forgot that they weren’t being driven by humans. That’s really quite amazing. I can’t think of any higher praise for the achievement of the competitors.

There were some thrills and spills along the way – TerraMax rammed a building, TeamUCF wandered into a garden, and MIT and Cornell had a minor collision. Once the weaker bots were eliminated though, everything went remarkably smoothly. The last four or five hours passed almost without event. MIT clearly had some trouble, randomly stopping and going very slowly on the off-road sections (looks like their sensor thresholds were set too low), but they’re a first time entry, so getting to the finish line at all is a major achievement.

DARPA put on an extremely professional event. In an interview after the finish, DARPA director Tony Tether said he didn’t expect to run another Challenge. It will be interesting to see where autonomous driving goes from here. The top teams have clearly made huge progress, but the technology is still a long way from the point where you’d let it drive the kids to school. Many things about the challenge were simplified – there were no pedestrians to avoid, and the vehicles all had precise pre-constructed digital maps of the race area, specifying things like where the stop lines were. Putting this technology to use in the real world is still some distance away, but much closer than anyone would have imagined five years ago.

Spot the Sensor

Watching all the videos and photography coming out of the Urban Challenge, the thing I pay most attention to are the racks of sensors sitting on top of the robots. The sensor choices the teams have made are quite different to the previous 2005 Grand Challenge. There has been a big move to more sophisticated laser systems, and some notable absences in the use of cameras. Here’s a short sensor-spotter’s guide:


For almost all the competitors, the primary sensor is the time-of-flight lidar, often called simply a laser scanner. These are very popular in robotics, because they provide accurate distances to obstacles with higher robustness and less complexity than alternatives such as stereo vision. Some models to look out for:


SICK Lidar

Used by 26 of the 36 semi-finalists, these blue boxes are ubiquitous in robotics research labs around the world because they’re the cheapest decent lidar available and are more than accurate enough for most applications. Typically they are operated with a maximum range of 25m, with distances accurate to a few centimetres. They’re 2D scanners, so they only see a slice through the world. This is normally fine for dealing with obstacles like walls or trees which extend vertically from the ground, but can land you in trouble with overhanging obstacles that aren’t at the same height as the laser. In the previous Challenge, these were the primary laser sensors for many teams. This time around they seem to be mostly relegated to providing some extra sensing in blind-spots.
SICK scanners have a list price of around $6,000, but there is a low-price deal for Grand Challenge entries. Indeed, the SICK corporation has had so much business and publicity from the Grand Challenge, that this year they decided to enter a team of their own.
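For anyone who hasn’t played with one: a 2D scan is just a fan of ranges at evenly spaced bearings, and turning it into points in the scan plane is one line of trigonometry per return. A minimal sketch, assuming a 180° field of view and the typical 25m maximum range mentioned above:

```python
import math

def scan_to_points(ranges, fov_deg=180.0, max_range=25.0):
    """Convert one 2D lidar scan (polar ranges) to Cartesian points
    in the scan plane.

    Assumes evenly spaced bearings across the field of view, centred
    on the scanner's forward axis; out-of-range or zero returns are
    dropped.
    """
    n = len(ranges)
    step = math.radians(fov_deg) / (n - 1)
    start = -math.radians(fov_deg) / 2.0
    points = []
    for i, r in enumerate(ranges):
        if 0.0 < r <= max_range:
            bearing = start + i * step
            points.append((r * math.cos(bearing), r * math.sin(bearing)))
    return points
```

That flat fan of points is exactly the “slice through the world” limitation: anything above or below the scan plane is simply invisible.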


Velodyne Lidar

New kid on the block for the Urban Challenge, the Velodyne scanner is conspicuously popular this year. It’s used by 12 of the 36 semi-finalists, including most of the top teams. With a list price of $75,000, the Velodyne is quite a bit more pricey than the common SICK. However, instead of just containing a single laser, the Velodyne has a fan of 64 lasers, giving a genuine 3D picture of the surroundings.

There’s an interesting story behind the Velodyne sensor. Up until two years ago Velodyne was a company that made subwoofers. Its founders decided to enter the 2005 Grand Challenge as a hobby project. Back then, the SICK scanner was about the best available, but it didn’t provide enough data, so many teams were loading up their vehicles with racks of SICKs. Team DAD instead produced a custom laser scanner that was a big improvement on what was available at the time. Their website illustrates the change quite nicely. For the Urban Challenge, they decided to concentrate on selling their new scanner to other teams instead of entering themselves. I’m sure this is exactly the kind of ecosystem of technology companies DARPA dreams about creating with these challenges.

I understand that the Velodyne data is a bit noisier than a typical SICK because of cross-talk between the lasers, but it’s obviously more than good enough to do the job. These sensors produce an absolute flood of data – more than a million points a second – and dealing with that is driving a lot of teams’ computing requirements.
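One standard trick for taming a flood like that is to downsample into a coarse voxel grid, keeping one representative point per occupied cell. A minimal sketch (the 20cm cell size is an arbitrary choice, not anything any team has published):

```python
def voxel_downsample(points, cell=0.2):
    """Thin a point cloud by keeping the centroid of each occupied voxel.

    points: iterable of (x, y, z) tuples.
    cell:   voxel edge length in metres (illustrative default).
    """
    cells = {}
    for x, y, z in points:
        # Bucket each point by which voxel it falls into.
        key = (int(x // cell), int(y // cell), int(z // cell))
        sx, sy, sz, n = cells.get(key, (0.0, 0.0, 0.0, 0))
        cells[key] = (sx + x, sy + y, sz + z, n + 1)
    # One centroid per voxel, however many raw points landed there.
    return [(sx / n, sy / n, sz / n) for sx, sy, sz, n in cells.values()]
```

The nice property is that the output size is bounded by the volume of space seen, not by the sensor’s data rate, so downstream algorithms get a predictable workload.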

Teams who couldn’t afford the hefty price tag of this sensor have improvised Velodyne-like scanners by putting SICKs on turntables or pan-tilt units, but the SICK wasn’t designed for applications like this, so the data is quite sparse and it’s tricky to synchronize the laser data with the pan-tilt position.
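The synchronization problem usually comes down to timestamp bookkeeping: log the pan-tilt angles with timestamps, then interpolate to each laser scan’s capture time. A sketch of the interpolation step, assuming both streams share a common clock (which is itself the hard part in practice):

```python
def pan_angle_at(t, samples):
    """Linearly interpolate the pan-tilt unit's angle at time t.

    samples: time-sorted list of (timestamp, angle) readings from
             the pan-tilt unit.
    t:       the laser scan's capture timestamp.
    """
    for (t0, a0), (t1, a1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            frac = (t - t0) / (t1 - t0)
            return a0 + frac * (a1 - a0)
    raise ValueError("timestamp outside the sampled interval")
```

Even with careful interpolation, any latency or clock skew between the two devices shows up directly as smeared geometry in the point cloud, which is why the improvised rigs never quite match a purpose-built 3D scanner.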


Riegl Lidar

Some of the more well-funded competitors are using these high-end lidar systems from Riegl. These are 2D scanners similar to the SICK, but have longer range and more sophisticated processing to deal with confusing multiple returns. However, they will set you back a hefty $28,000.

Ibeo is a subsidiary of SICK that makes sensors for the automotive market. They produce several models of laser scanner, such as the flying-saucer-like attachments seen here on the front of team CarOLO. I’m not too familiar with these sensors, but I believe they are rotating laser fans – something like a scaled-down Velodyne.


Vision is less prevalent this year than I was expecting. As far as I can gather, none of the teams have gone in for a computer-vision based approach to recognising other cars. I suppose with a good laser sensor it’s mostly unnecessary, plus you have the advantage of being immune to illumination problems which can foil vision techniques. Many teams have cameras for detecting lane markings, but that appears to be the extent of it.
Some teams, such as Stanford’s Junior, are all-laser systems with no cameras at all. Given that vision was the core of the secret sauce that helped Stanford win the 2005 Grand Challenge, and their early press photos prominently showed a Ladybug2 camera, I was pretty surprised by this. The reason is revealed in this interview with Mike Montemerlo where he shows plenty of results using the Ladybug2, but explains that they had to abandon the sensor after their lead vision programmer left for a job with Google (who have some interest in Ladybugs). The final version of Junior uses laser reflectance information to find the road markings, and judging by the results so far, seems to be getting on just fine without vision.
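The reflectance trick is conceptually simple: painted markings are retroreflective, so their returns come back much brighter than bare asphalt, and thresholding picks out the candidates. A toy sketch (the threshold is an invented value – a real system like Junior’s would adapt it to the surface and fit lane models to the candidates afterwards):

```python
def find_marking_candidates(scan, threshold=0.8):
    """Flag ground returns whose reflectance stands out from the road.

    scan:      list of (x, y, reflectance) ground returns, with
               reflectance normalised to [0, 1].
    threshold: illustrative cut-off separating paint from asphalt.
    """
    return [(x, y) for x, y, refl in scan if refl > threshold]
```

Because the laser supplies its own illumination, this works identically at noon and at midnight – exactly the immunity to lighting conditions mentioned above.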

Cameras come in all shapes and sizes, but a few to look out for:

PointGrey Bumblebee

Bumblebee Stereo Camera

Point Grey are a popular supplier of stereo vision systems, and you can see these cameras attached to a number of vehicles. Princeton’s Team Prowler has a system based entirely around these stereo cameras – a choice they made for budget reasons.


Point Grey Ladybug2

This is a spherical vision system composed of 6 tightly packed cameras, also produced by Point Grey. After Stanford abandoned their vision system, I don’t think any entries are using this camera – but there’s one sitting on my desk as I type this, so I’m including the picture anyway.


Though not very visible, several cars are sporting radar units. The MIT vehicle has 16! Radar is good for long range, out to hundreds of meters, but it’s noisy and has poor resolution. However, when you’re travelling fast and just want to know if there’s a major obstacle up ahead, it does the job. It’s already used in several commercial automotive safety systems.


GPS is obviously a core sensor for every entrant. Most vehicles have several redundant GPS units on their roofs, popular suppliers being companies like Trimble who sell rugged, high-accuracy units developed for applications in precision agriculture.


Though not visible on the outside, many of the entrants have inertial measurement units tucked inside. These little packages of gyroscopes and accelerometers help the vehicles keep track of position during GPS outages. High-end IMUs can be amazingly precise, but have a price tag to match.
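The basic idea is straightforward dead reckoning: integrate the yaw rate to track heading, rotate the measured accelerations into the world frame, and integrate twice for position. Here’s a toy planar version – real systems work in full 3D and spend most of their effort estimating sensor biases, which is where all the difficulty (and the price tag) lives:

```python
import math

def dead_reckon(state, accel, gyro, dt):
    """One planar dead-reckoning step from IMU readings (toy version).

    state: (x, y, heading, vx, vy) in the world frame.
    accel: (ax, ay) measured in the body frame.
    gyro:  yaw rate in rad/s.
    """
    x, y, th, vx, vy = state
    th += gyro * dt                                          # integrate yaw rate
    ax = accel[0] * math.cos(th) - accel[1] * math.sin(th)   # body -> world
    ay = accel[0] * math.sin(th) + accel[1] * math.cos(th)
    vx += ax * dt                                            # once for velocity
    vy += ay * dt
    return (x + vx * dt, y + vy * dt, th, vx, vy)            # again for position
```

The double integration is also why errors grow so fast: a tiny accelerometer bias turns into position drift that grows with the square of time, so even a high-end IMU can only bridge relatively short GPS outages.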

This post has become something of a beast. If you still can’t get enough of sensors, there are some interesting videos here and here where Virginia Tech and Ben Franklin discuss their sensor suites.

Urban Challenge Final

The final of the Urban Challenge is just about to begin. There’s a live video stream – the commentators aren’t experts, but the footage is excellent. TGDaily, who have provided some of the best coverage during the week, are also live blogging the final and hopefully will have some more sensible commentary.

Once the robots roll out the gate, they’re going to be totally on their own for up to six hours. I hear some of the teams were making code changes right through last night, so what happens today is anybody’s guess. For lots of people out in Victorville, it’s going to be a very tense few hours.

Urban Challenge Finalists Announced

11 teams have been selected for the final of the Urban Challenge. The teams are:

Tartan Racing
Stanford Racing Team
Team Oshkosh Truck
Team Cornell
Victor Tango
Ben Franklin Racing Team
Team UCF
Team AnnieWay
Intelligent Vehicle Systems

DARPA originally planned to have 20 teams in the final, but decided that none of the other competitors met the minimum safety standards. In the words of DARPA director Tony Tether – “It would be terrible for one bot to take out another”.

The final will be webcast live, starting at 7:30 a.m. PT (10:30 a.m. ET, 14:30 GMT). In the meantime, favourites Tartan Racing have some nice videos on their race blog.

Urban Challenge Under Way!

As I write this, 36 robotic cars are driving themselves around an abandoned air force base in California. Their task is essentially to pass a California driving test, without a driver. The competition is the DARPA Urban Challenge, successor to the 2005 Grand Challenge, and there is a $2 million prize at stake. Last time, the winning car succeeded in driving itself 132 miles through the Nevada desert. This time, the competitors will have the much harder task of dealing with urban driving in traffic. (Because robot drivers are currently a wee bit erratic, the traffic is provided by a crew of 50 professional stunt drivers in cars reinforced with roll cages.)

For robotics researchers, this is high-octane stuff, and personally I’ve been glued to it all week. Best coverage I’ve found is Wired’s Defence Blog, the video reports from TGDaily, and the photos from one of the Stanford team. Currently the event is going through the initial qualifying rounds. Twelve teams have been eliminated so far, including some well-funded and professional entries such as Georgia Tech. There has been at least one spectacular malfunction. Best performances so far have come from Stanford, Carnegie Mellon, Cornell and Virginia Tech, all of whom have already secured a place in the final.

I’ll be posting highlights here throughout the week. Go, go, robot races!