Archive for the 'Misc' Category

Pointy

Two and a half years ago I left Google and set out to build a new kind of search engine. This may sound a little crazy, but all the best things are like that :-)

We’ve been avoiding the tech press and trying to build things quietly, but this week we launched our user-facing app. I’m really proud of what the team has built, so it’s exciting to finally be able to say a bit more about it.

The problem we’ve been working on is finding specific items locally. For example, a light bulb just broke and it’s a strange fitting: where’s the nearest place you can get a new one? Or you’re halfway through a recipe and realise you’re missing an ingredient – where do you get it?

Existing search engines do a really bad job of answering questions like this. The reason is that, in order to provide a good answer, you need to know what products are stocked in all the local shops. It turns out that nobody has this data – not even the shops themselves in many cases. It’s kind of strange to think that you can search the entire internet in a fraction of a second, but the contents of the shop around the corner remain a mystery unless you go there in person. But that’s the state of the world in 2015. At Pointy we’ve been working on solving that problem.

The core challenge is data collection – building a scalable way to index the contents of every local shop. Enter the Pointy box:

[Image: the Pointy box]

The Pointy box is a piece of hardware we designed, which sits inline between a shop’s barcode scanner and Point of Sale system. Whenever the shop scans something, we intercept the barcode information and transmit it back to our servers over a built-in cellular connection. From this data we can figure out what products the shop stocks, and get a pretty good estimate of stock levels. How it all works is illustrated on our retailer signup page.
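To make the idea of turning raw scans into a product list a bit more concrete, here is a minimal Python sketch. It is an illustration only, not our actual pipeline: the `ScanEvent` record, the `summarise_scans` function and the sample barcodes are all made up for the example, and it assumes nothing more than each scan arriving as a (shop, barcode, timestamp) record.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ScanEvent:
    """One barcode scan reported by a box (hypothetical record layout)."""
    shop_id: str
    barcode: str
    scanned_at: datetime


def summarise_scans(events):
    """Group scans by shop and barcode.

    Returns {shop_id: {barcode: {"count": int, "last_seen": datetime}}}.
    Assumption: a barcode scanned at least once is a product the shop stocks.
    """
    summary = defaultdict(dict)
    for event in events:
        entry = summary[event.shop_id].setdefault(
            event.barcode, {"count": 0, "last_seen": event.scanned_at}
        )
        entry["count"] += 1
        entry["last_seen"] = max(entry["last_seen"], event.scanned_at)
    return {shop: dict(products) for shop, products in summary.items()}


# Made-up sample data: two shops, two products.
events = [
    ScanEvent("shop-a", "5000000000012", datetime(2015, 12, 1, 9, 30)),
    ScanEvent("shop-a", "5000000000012", datetime(2015, 12, 1, 13, 5)),
    ScanEvent("shop-a", "5000000000029", datetime(2015, 12, 1, 13, 6)),
    ScanEvent("shop-b", "5000000000012", datetime(2015, 12, 2, 18, 45)),
]
catalogue = summarise_scans(events)
```

The scan counts and last-seen times are the kind of raw signal a stock-level estimate could start from; a real estimate would need a good deal more than this.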

Technically, the magic is in how we interface (or don’t interface) with the POS system. We pull the data directly off the wire between the barcode scanner and the POS. Since we get it at such a low level, we don’t have to worry about integrating with every piece of POS software in the world. There’s still a lot of work to do, but it makes the problem much more tractable. At this stage we can support basically anything we find in the wild [1], everything from ancient cash registers that look like they belong in a Western, right up to iPad-based systems.

[Image: Pointy box]

From the retailers’ point of view, it’s extremely simple. They just plug in the box and within a few minutes they have a nice website for their shop, which automatically lists everything they sell. They’re also part of the Pointy local search app. There’s no extra work and no configuration; it just fits in with their existing systems.

[Image: app overview]

Happily, retailers seem to love this. We started to roll it out widely in June this year, and by December roughly 1 in 8 of all shops in our launch city (Dublin, Ireland) are using the system.

[Image: map]

There’s a vast variety of shops now on the platform, basically the whole range of local shops: bike stores, pharmacies, hardware, convenience, pet shops, delis, supermarkets, wine stores, toy shops, book stores, garden centres, even horse supply shops. There’s a huge data challenge in identifying the right name and picture to go with a barcode, and that actually occupies a big chunk of our engineering team, but that’s a topic for another post.

[Image: app product]

Infrastructure

Our system is built on Google Cloud Platform, which has let us scale quickly without having to spend time on non-core problems. We use a mixture of services including App Engine, Compute Engine and Cloud Storage. Services like Task Queues, Cron and Logs give us some great tools for building reliable services with minimal effort.

Processing all of the point of sale transaction data at the level of cities or countries is no small task. We need to deal with everything in real time in order to rapidly detect out-of-stock events and keep the search results accurate. We use Cloud Datastore for logging all of our transaction data. With a little bit of initial design work, we have a system that should scale pretty much indefinitely without us having to worry about it. Server load is concentrated around busy lunchtime and evening shopping hours, so being able to dynamically scale with demand is also helpful.
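One simple way to think about out-of-stock detection is that a product which normally sells steadily and then goes quiet is suspicious. The Python sketch below shows that heuristic under stated assumptions; it is not the method we actually run. The function names, the `factor` threshold and the sample times are hypothetical, and a real system would also have to account for opening hours and slow-moving products.

```python
from datetime import datetime, timedelta
from statistics import median


def typical_gap(scan_times):
    """Median time between consecutive scans of one product.

    Returns None when there are fewer than two scans to compare.
    """
    ordered = sorted(scan_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps) if gaps else None


def possibly_out_of_stock(scan_times, now, factor=5.0):
    """Flag a product whose current silence is much longer than usual.

    `factor` is an arbitrary illustrative threshold, not a tuned value.
    """
    gap = typical_gap(scan_times)
    if gap is None or gap == timedelta(0):
        return False  # not enough history to judge
    return now - max(scan_times) > gap * factor


# Made-up history: one sale per hour through the morning.
scans = [datetime(2015, 12, 1, hour) for hour in (9, 10, 11, 12)]
quiet_two_hours = possibly_out_of_stock(scans, datetime(2015, 12, 1, 14))
quiet_seven_hours = possibly_out_of_stock(scans, datetime(2015, 12, 1, 19))
```

Here the product usually sells once an hour, so two quiet hours are not flagged but seven are.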

We use Compute Engine to manage our IoT device deployment, including over-the-air firmware updates and remote debugging. Our product search engine is also hosted on Compute Engine. We do a lot of batch processing of our transaction data logs to extract ranking signals, and process lots of external web data for additional product attributes and ranking signals. Everything is backed up nightly from Datastore to Google Cloud Storage for disaster recovery.

I built my last startup on AWS, so it was a little bit of a change to use Google Cloud this time around. However, it’s been a really great choice. It gives us a beautiful combination of scale and agility. We deploy to production often multiple times per day, which is extremely easy with the GCP tools. This lets us iterate rapidly, and focus on our product rather than system administration.  I suppose when you’re building a search engine, using Google’s infrastructure seems like an obvious choice :-)

What’s Next

It’s been a great experience so far, but we’re not close to the end. There’s still a long way to go to index every shop on the planet, after all. We’re getting there, and having fun along the way. If you’re interested, we’re always looking for good people.

Footnotes
  1. There are always a few exotic ones that aren’t worth the trouble, but for practical purposes it might as well be 100% coverage.

Startups

Imagine you’re a road engineer and you’re designing an access road for a new town. The town will soon be built in a previously uninhabited area. You’re managing the construction project, but unfortunately no one can tell you what the population of the town will be.

Taking your job seriously, you sit down to design the best road that you can build. You settle on constructing a seven lane highway with regular flyovers to minimize traffic. The road will be fully lit with a state-of-the-art LED lighting system. You add crash barriers and regularly spaced emergency telephones. After much consideration you decide to also include a rest area with parking and toilets. This involves designing a self-contained water and sewerage system, but it’s obviously worth it.

With three months to go until launch day, you discover problems with road drainage. After the panic subsides, the construction team agrees to work around the clock to refit a completely new system for surface water management. By a minor miracle, the work is completed on time.

Opening day finally arrives and the excitement is intense. Everyone agrees the finished product is an engineering marvel. The new town will have the best road in the world.

Unfortunately, it turns out that the town is a remote settlement with a population of 57. The road is mainly used by an old man and a donkey.

—————

The next year, you are again given a road construction project for another new town. Having learned your lesson, you build a modest single lane road. It’s well constructed but nothing special.

Opening day comes again, and it’s revealed that this time the “town” is in fact a major city with a population of 14 million. There are 50 mile tailbacks for six years before a larger road can be built. Your face appears on wanted posters throughout the nation, and you flee the country in disgrace.

—————

Twitter, I forgive you the Fail Whale. And I hope to always walk the middle *ahem* road.

Building a DIY Street View Car

A little blast from the past here. Several years ago I built something very like a Google Street View car to gather data for my PhD thesis. At the time I wrote up a blog post about the experience, as a guide for anyone else who might want to build such a thing. But I never quite finished it. Upgrading WordPress today, I came across this old post sitting in my drafts folder from years ago, and decided to rescue it. So here it is. The making of a DIY Street View car.

Continue reading ‘Building a DIY Street View Car’

The Universal Robotic Gripper

I just saw a video of a device that consists of nothing more than a rubber balloon, some coffee grounds and a pump. I’m pretty sure it’s going to change robotics forever. Have a look:

[Embedded YouTube video]

It’s a wonderful design. It’s cheap to make. You don’t need to position it precisely. You need only minimal knowledge of the object you’re picking up. Robotic grasping has always been too hard to be really practical in the wild. Now a whole class of objects just got relatively easy.

Clearly, the design has its limitations. It’s not going to allow for turning the pages of a book, making a cheese sandwich, tying a daisy chain, etc. But for relatively straightforward manipulation of rigid objects, it’s a beautiful solution. This one little idea could help start a whole industry.

The research was a collaboration between Chicago, Cornell and iRobot, with funding from DARPA. It made the cover of PNAS this month. The research page is here.

Fun with Robots

It’s no secret that I’m a huge fan of Willow Garage. So as they get ready to ship their first PR2 robots, here’s a gratuitous video of the pre-release testing:

[Embedded YouTube video]

This second video is a nice overview of what Willow Garage and their open source robotics program is all about:

[Embedded YouTube video]

Posts of the Year

As I make arrangements to close down things in the lab and prepare for a bit of turkey and ham, I thought I’d put up some of my favourite blog posts from the last year:

Finally, for a bit of fun, check out the Austrian Hexapod Dance Competition:

[Embedded YouTube video]

Silicon Valley Comes to Oxford

I’ll be at Silicon Valley Comes to Oxford all day today. This has been an excellent event in previous years, and there is a strong line-up again this year. Anyone interested in visual search technology, do come say hello.

Update: I heard some great talks today and met lots of interesting people during coffee. Chris Sacca’s pitch workshop was especially good. No bullshit. The most valuable thing was the perspective – all those bits of knowledge that are obvious from the inside but very hard to come by from the outside. And of course hearing Elon Musk was just fantastic.

For those people who were interested in our lab’s visual search engine, there’s an online demo here (scroll down to where it says Real-time Demo). The demo is actually of some older results from about a year ago by a colleague of mine. Things have gotten even better since then.

Computer Vision in the Elastic Compute Cloud

In a datacenter somewhere on the other side of the planet, a rack-mounted computer is busy hunting for patterns in photographs of Oxford.  It is doing this for 10 cents an hour, with more RAM and more horsepower than I can muster on my local machine. This delightful arrangement is made possible by Amazon’s Elastic Compute Cloud.

For the decreasing number of people who haven’t heard of EC2, it’s a pretty simple idea. Via a simple command line interface you can “create” a server running in Amazon’s datacenter. You pick a hardware configuration and OS image, send the request and voilà – about 30 seconds later you get back a response with the IP address of the machine, to which you now have root access and sole use.  You can customize the software environment to your heart’s content and then save the disk image for future use. Of course, now that you can create one instance you can create twenty. Cluster computing on tap.

This is an absolutely fantastic resource for research. I’ve been using it for about six months now, and have very little bad to say about it. Computer vision has an endless appetite for computation. Most groups, including our own, have their own computing cluster but demand for CPU cycles typically spikes around paper deadlines, so having the ability to instantly double or triple the size of your cluster is very nice indeed.

Amazon also have some hi-spec machines available. I recently ran into trouble where I needed about 10GB of RAM for a large learning job. Our cluster is 32-bit, so 4GB RAM is the limit. What might have been a serious headache was solved with a few hours and $10 on Amazon EC2.

The one limitation I’ve found is that disk access on EC2 is a shared resource, so bandwidth to disk tends to be about 10MB/s, as opposed to say 70MB/sec on a local SATA hard drive. Disk bandwidth tends to be a major factor in running time for very big out-of-core learning jobs. Happily, Amazon very recently released a new service called Elastic Block Store which offers dedicated disks, though the pricing is a little hard to figure out.
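To put those numbers in perspective, here is a quick back-of-the-envelope calculation in Python. The 100GB dataset size is an assumed figure chosen purely for illustration, and `pass_time_hours` is just a helper for this example; only the two bandwidth figures come from the paragraph above.

```python
def pass_time_hours(dataset_gb, bandwidth_mb_per_s):
    """Hours needed to stream the dataset from disk once (1 GB taken as 1000 MB)."""
    return dataset_gb * 1000 / bandwidth_mb_per_s / 3600


DATASET_GB = 100  # assumed dataset size, for illustration only

ec2_hours = pass_time_hours(DATASET_GB, 10)    # shared EC2 disk, about 10MB/s
local_hours = pass_time_hours(DATASET_GB, 70)  # local SATA drive, about 70MB/s

print(f"EC2: {ec2_hours:.1f} h per pass, local SATA: {local_hours:.1f} h per pass")
```

Whatever the dataset size, each pass over the data takes seven times longer at 10MB/s than at 70MB/s, which adds up quickly for a learning job that makes many passes.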

I should mention that for UK academics there is a free service called National Grid, though personally I’d rather work with Amazon.

Frankly, the possibilities opened up by EC2 just blow my mind. Every coder in a garage now potentially has access to Google-level computation. For tech startups this is a dream. More traditional companies are playing too. People have been talking about this idea for a long time, but it’s finally here, and it rocks!

Update: Amazon are keen to help their scientific users. Great!

“But I’m Not Lost!” – Adoption Challenges for Visual Search

I’m still rather excited about yesterday’s kooaba launch. I’ve been thinking about how long this technology will take to break into the mainstream, and it strikes me that getting people to adopt it is going to take some work.

When people first started using the internet, the idea of search engines didn’t need much promotion. People were very clearly lost, and needed some tool to find the interesting content. Adopting search engines was reactive, rather than active.

Visual search is not like that. If kooaba or others do succeed in building a tool that lets you snap a picture of any object or scene and get information, well, people may completely ignore it. They’re not lost – visual search is a useful extra, not a basic necessity. The technology may never reach usage levels seen by search engines. That said, it’s clearly very useful, and I can see it getting mass adoption. It’ll just need education and promotion. Shazam is a great example of a non-essential search engine that’s very useful and massively popular.

So, promotion, and lots of it. What’s the best way? Well, most of the different mobile visual search startups are currently running trial campaigns involving competitions and magazine ads (for example this SnapTell campaign). Revenue for the startups, plus free public education on how to use visual search. Not a bad deal, easy to see why all the companies are doing it. The only problem is that it may get the public thinking that visual search is only about cheap promotions, not useful for anything real. That would be terrible for long-term usage. I rather prefer kooaba’s demo based on movie posters – it reinforces a real use case, plus it’s got some potential for revenues too.

A Visual Search Engine for iPhone

Today kooaba released their iPhone client. It’s a visual search engine – you take a picture of something, and get search results. The YouTube clip below shows it in action.  Since this is the kind of thing I work on all day long, I’ve got a strong professional interest. Haven’t had a chance to actually try it yet, but I’ll post an update once I can nab a friend with an iPhone this afternoon to give it a test run.

[Embedded YouTube video]

At the moment it only recognises movie posters. Basically its current form is more of a technology demo than something really useful. Plans are to expand to recognise other things like books, DVDs, etc. I think there’s huge potential for this stuff. Snap a movie poster, see the trailer or get the soundtrack. Snap a book cover, see the reviews on Amazon. Snap an ad in a magazine, buy the product. Snap a restaurant, get reviews. Most of the real world becomes clickable. Everything is a link.

The technology is very scalable – the internals use an inverted index just like normal text search engines. In my own research I’m working with hundreds of thousands of images right now. It’s probably going to be possible to index a sizeable fraction of all the objects in the world – literally take a picture of anything and get search results. The technology is certainly fast enough, though how the recognition rate will hold up with such large databases is currently unknown.
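For anyone who hasn’t met an inverted index before, here is a toy Python sketch of the idea applied to images. I have no inside knowledge of kooaba’s implementation, so treat everything here as an assumption: each image is taken to be already reduced to a list of integer "visual word" IDs (quantised local features), and the function names, poster names and word IDs are invented for the example.

```python
from collections import Counter, defaultdict


def build_index(database):
    """Map each visual word ID to the set of image names that contain it."""
    index = defaultdict(set)
    for image_name, words in database.items():
        for word in words:
            index[word].add(image_name)
    return index


def search(index, query_words, top_k=3):
    """Rank images by how many distinct query words they share with the query."""
    votes = Counter()
    for word in set(query_words):
        # Only images that contain this word are touched, which is what
        # keeps lookups cheap as the database grows.
        for image_name in index.get(word, ()):
            votes[image_name] += 1
    return votes.most_common(top_k)


# Made-up database: three posters, each described by a few visual word IDs.
database = {
    "poster_a": [3, 17, 42, 99],
    "poster_b": [5, 17, 63],
    "poster_c": [8, 21, 77],
}
index = build_index(database)
results = search(index, [17, 42, 99, 500])
print(results)  # [('poster_a', 3), ('poster_b', 1)]
```

A real system would weight the words (rare ones count for more) and check that the matched features are geometrically consistent, but the lookup structure is the same as in text search.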

My only question is – where’s the buzz, and why has it taken them so long?

Update: I gave the app a spin today on a friend’s iPhone, and it basically works as advertised. It was rather slow though – maybe 5 seconds per search. I’m not sure if this was a network issue (though the iPhone had a WiFi connection), or maybe kooaba got more traffic today than they were expecting. The core algorithm is fast – easily less than 0.2 seconds (and even faster with the latest GPU-based feature detection).  I am sure the speed issue will be fixed soon. Recognition seemed fine, my friend’s first choice of movie was located no problem. A little internet sleuthing shows that they currently have 5363 movie posters in their database. Recognition shouldn’t be an issue until the database gets much larger.