A Thousand Kilometers of Appearance-Only SLAM

I’m off to RSS 2009 in Seattle next week to present a new paper on FAB-MAP, our appearance-based navigation system. For the last year I’ve been hard at work on pushing the scale of the system. Our initial approach from 2007 could handle trajectories about 1km long. This year, we’re presenting a new system that we demonstrate doing real-time place recognition over a 1,000km trajectory. In terms of accuracy, the 1,000km result seems to be on the edge of what we can do; at around the 100km scale, however, performance is really rather good. Some video results below.

[Embedded YouTube video]

One of the hardest things to get right was simply gathering the 1,000km dataset. The physical world is unforgiving! Everything breaks. I’ll have a few posts about the trials of building the data collection system over the next few days.

Amazon Buys SnapTell

So the visual search story of the day is that Amazon has acquired SnapTell. This is a really natural fit – SnapTell have solid technology, and Amazon are one of the best use cases. Not too surprised to hear the deal has been done – SnapTell has been conspicuously quiet for several months, and word was that they either had to exit or secure another funding round before the end of the year. So congratulations are in order to everyone at SnapTell on securing what seems like an ideal exit.

The big question now is how this changes the playing field for other companies in the visual search space. I would assume Amazon will move SnapTell’s focus away from their enhanced print advertising service and concentrate on image recognition for books, CDs, DVDs, etc. (Up to now, Amazon has been doing this with human-powered image recognition, which was nuts.) While this makes perfect sense for Amazon, it’s going to mean more rather than fewer opportunities for companies still focused on the general visual search market.

So I guess this is an ideal point to mention the open secret that I’m currently co-founding Plink, a new visual search engine similar in capability to SnapTell. While our demo shows some familiar use cases, we’re working on taking the technology in some entirely new directions. Visual search is very young; there’s a whole lot still to do! Anyone interested in visual search, feel free to contact me.

Autonomous Marathon!

Congratulations to everyone at Willow Garage for reaching Milestone 2 in the development of the PR2 robot. 26.2 miles of autonomous indoor navigation, including opening eight doors and plugging in to nine power sockets. We’ve been watching the video in the lab with serious robot envy. Very cool!

[Embedded YouTube video]

Dinosaurs and Tail Risk

Writing in this morning’s FT, Nassim Nicholas Taleb proposes Ten principles for a Black Swan-proof world:

1. What is fragile should break early while it is still small. Nothing should ever become too big to fail. Evolution in economic life helps those with the maximum amount of hidden risks — and hence the most fragile — become the biggest.

Then we will see an economic life closer to our biological environment: smaller companies, richer ecology, no leverage.

A sensible plan, but unfortunately Mr. Taleb’s faith in biology is misplaced.

Why the Dinosaurs got so Large

19th-century palaeontologist Edward Drinker Cope noticed that animal lineages tend to get bigger over evolutionary time, starting out small and leaving ever bigger descendants. This process came to be known as Cope’s rule.

Getting bigger has evolutionary advantages, explains David Hone, an expert on Cope’s rule at the Institute of Vertebrate Paleontology and Paleoanthropology in Beijing, China. “You are harder to predate and it is easier for you to fight off competitors for food or for mates.” But eventually it catches up with you. “We also know that big animals are generally more vulnerable to extinction,” he says. Larger animals eat more and breed more slowly than smaller ones, so their problems are greater when times are tough and food is scarce. “Many of the very large mammals, such as Paraceratherium, had a short tenure in the fossil record, while smaller species often tend to be more persistent,” says mammal palaeobiologist Christine Janis of Brown University in Providence, Rhode Island. So on one hand natural selection encourages animals to grow larger, but on the other it eventually punishes them for doing so. This equilibrium between opposing forces has prevented most land animals from exceeding about 10 tonnes.

Dinosaurs had skewed incentives and took on too much tail risk! If even evolution falls into this trap, God help the bank regulators…

FAB-MAP in the News

Today’s edition of the New Scientist news feed includes an article about my PhD research. How nice! They called the article ‘Chaos filter stops robots getting lost’. This is kind of a bizarre title – ‘chaos filter’ seems to be a term of their own invention :). Still, they got things mostly right. I guess that’s journalism!

Whatever about the strange terminology, it’s great to see the research getting out there. It’s also nice to see the feedback from Robert Sim, who made a rather impressive vision-only robotic system with full autonomy a few years ago, still quite a rare accomplishment.

For anyone interested in the details of the system, have a look at my publications page. New Scientist’s description more or less resembles how our system works, but many of the specifics are a little wide of the mark. In particular, we’re not doing hierarchical clustering of visual words as the article describes – instead we learn a Bayesian network that captures the visual word co-occurrence statistics. This achieves a similar effect in that we implicitly learn about objects in the world, but with none of the hard decisions and awkward parameter tuning involved in clustering.
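To make the idea of learning co-occurrence structure a little more concrete, here is a minimal sketch. It assumes each image is reduced to a binary vector saying which visual words were seen, and it fits a tree-structured Bayesian network by the Chow-Liu method (a maximum spanning tree over pairwise mutual information). The choice of a Chow-Liu tree, the function names and the toy data are all assumptions made for illustration; see the publications page for what the real system does.

```python
import math
from itertools import combinations


def mutual_information(obs, i, j):
    """Empirical mutual information (in nats) between binary words i and j."""
    n = len(obs)
    joint = {}
    for row in obs:
        key = (row[i], row[j])
        joint[key] = joint.get(key, 0) + 1
    # Marginals for each word, derived from the joint counts.
    p_i = {v: sum(c for (a, _), c in joint.items() if a == v) / n for v in (0, 1)}
    p_j = {v: sum(c for (_, b), c in joint.items() if b == v) / n for v in (0, 1)}
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / (p_i[a] * p_j[b]))
    return mi


def chow_liu_edges(obs):
    """Edges of a maximum spanning tree over pairwise mutual information."""
    n_words = len(obs[0])
    scored = sorted(
        ((mutual_information(obs, i, j), i, j)
         for i, j in combinations(range(n_words), 2)),
        reverse=True,
    )
    parent = list(range(n_words))  # union-find structure for Kruskal's algorithm

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []
    for _, i, j in scored:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding this edge does not create a cycle
            parent[root_i] = root_j
            edges.append((i, j))
    return edges


# Toy data (hypothetical): words 0 and 1 always appear together, as do 2 and 3.
observations = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
tree = chow_liu_edges(observations)
```

In the toy data the strongly co-occurring pairs end up directly linked in the tree, which is the sense in which such a model implicitly groups words belonging to the same object without any explicit clustering step.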

The Really Big Picture

I was at a lunch talk today by Nick Bostrom, of Oxford’s Future of Humanity Institute. The institute has an unusual mandate to consider the really big picture: human extinction risks, truly disruptive technologies such as cognitive enhancement, life extension and brain emulation, and other issues too large for most people to take seriously. It was a pleasure to hear someone thinking clearly and precisely, in the manner of a good philosopher, about topics that are usually the preserve of crackpots. Prof Bostrom’s website is a treasure trove of papers. An atypical but perhaps robot-relevant example is the Whole Brain Emulation Roadmap.

Posts of the Year

As I make arrangements to close down things in the lab and prepare for a bit of turkey and ham, I thought I’d put up some of my favourite blog posts from the last year:

Finally, for a bit of fun, check out the Austrian Hexapod Dance Competition:

[Embedded YouTube video]

Progress at Willow Garage

Just came across this new video of the Willow Garage PR2 robot. They’re making rapid progress. When they reach their goal of distributing these platforms to research groups around the world, it will be a good day for robotics. One neat package that comes out of the box with many different near-state-of-the-art capabilities. Right now every research group is independently re-creating platforms from scratch, and it’s a huge obstacle to progress.
If you haven’t heard of Willow Garage, I have an overview here.

[Embedded YouTube video]

Update: Another new video, celebrating two successive days of autonomous runs.

[Embedded YouTube video]

Amazon Remembers

When it rains, it pours. Amazon is joining the visual search party, but with a twist.

Today Amazon released an iPhone app with a feature called “Amazon Remembers”. You take a picture of some product you’re interested in and the app uploads it to Amazon. You can then revisit your list later from a browser, e.g. to buy the item. The interesting bit is that Amazon attempts to generate a link to the product page based on your image. Examples here. This isn’t instant, and may take anywhere from five minutes to 24 hours.

Fascinatingly, the back end is apparently not computer vision based. It uses Mechanical Turk, where Amazon is paying people $0.10 per image to generate the links. See here. This is quite amazing to me. I have no idea if Mechanical Turk is deep enough to support this app if it truly becomes popular, but I suppose in that case Amazon could set up a dedicated call-centre type operation to handle the image recognition.

So, visual search companies now have a very direct measure of the value of a correct image search. Judging by the current set of images on Mechanical Turk, a fully automated solution is not possible. However, a hybrid system where some easy categories like books are recognised automatically and harder cases are farmed out to Mechanical Turk would clearly translate into significantly lower costs.
(Of course it’s possible Amazon are already doing this, though I did see several books in the Mechanical Turk requests.)
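As a rough back-of-the-envelope sketch of the hybrid idea: if some fraction of images is recognised automatically and the rest go to Mechanical Turk at the $0.10 quoted above, the expected human cost per image scales with the fraction left over. The function name and the 40% figure are hypothetical, and the sketch ignores the cost of running the automatic recogniser and of correcting its mistakes.

```python
TURK_COST_PER_IMAGE = 0.10  # dollars per image, the figure quoted in the post


def hybrid_cost_per_image(auto_fraction, turk_cost=TURK_COST_PER_IMAGE):
    """Expected human-labelling cost per image in a hybrid system.

    auto_fraction is the (assumed) share of images recognised automatically;
    everything else is farmed out at turk_cost per image.
    """
    if not 0.0 <= auto_fraction <= 1.0:
        raise ValueError("auto_fraction must be between 0 and 1")
    return (1.0 - auto_fraction) * turk_cost


# Hypothetical example: 40% of uploads are books or similar easy cases.
cost = hybrid_cost_per_image(0.4)
```

Under that assumed 40% split, the human cost drops from ten cents to about six cents per image.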

Nokia Point and Find

It seems I missed something fairly major in my round up of mobile visual search companies last week. Nokia have a serious project in the area called Point and Find. You can see a demo here. From MSearchGroove:

“Nokia is committed to owning the visual search space and has committed a staff of 30 to build the business and further develop the technology. The business area has the buy-in of Nokia senior execs and “quite large” funding from the company.”

The technology comes from an acquisition of a Valley startup called Pixto just over a year ago. Nokia’s service is apparently due for launch soon, initially recognising movie posters only.