Epiphenomenalism for Computer Scientists

It’s hard to work on robotics or machine learning and not occasionally think about consciousness.  However, it’s quite easy not to think about it properly! I recently concluded that everything I used to believe on this subject is wrong. So I wanted to write a quick post explaining why.

For a long time, I subscribed to a view on consciousness called “epiphenomenalism”. It just seemed obvious, even necessary. I suspect a lot of computer scientists may share this view. However, I recently had a chance to think a bit more carefully about it, and came upon problems which I now see as fatal. Below I explain briefly what epiphenomenalism is, why it is so appealing to computer scientists, and what convinced me it cannot be right. Everything here is old news in philosophy, but might be interesting for someone coming to the issue from a computer scientist perspective. More

Will the robots take our jobs?

This post is about robots and the economy, but takes some detours first. Bear with me.

Robert Gordon and the End of Growth

There has been a very interesting discussion going on recently, prompted by an article by economist Robert Gordon of Northwestern University. Gordon’s article (“Is US economic growth over?”) makes the case that long-term US economic growth on the scale of the last century was due to one-time events and has run its course, with future growth prospects being much lower. He attributes the growth of the past few centuries to three distinct industrial revolutions. The first, beginning 1750-1830, was due to steam power and railroads. The second, 1870-1900, was due to electrification, internal combustion engines, running water and petroleum. The third, beginning around 1960, was due to the computer and the internet. Gordon makes the case that the second industrial revolution, from 1870-1900, was by far the most important, and that computers and the internet have had far smaller impacts on GDP. Combined with demographic headwinds, he sees much lower rates of growth in the next century.

Martin Wolf summarizes the pessimist’s case succinctly:

Unlimited growth is a heroic assumption. For most of history, next to no measurable growth in output per person occurred. What growth did occur came from rising population. Then, in the middle of the 18th century, something began to stir. Output per head in the world’s most productive economies — the UK until around 1900 and the US, thereafter — began to accelerate. Growth in productivity reached a peak in the two and a half decades after World War II. Thereafter growth decelerated again, despite an upward blip between 1996 and 2004. In 2011 — according to the Conference Board’s database — US output per hour was a third lower than it would have been if the 1950-72 trend had continued (see charts). Prof Gordon goes further. He argues that productivity growth might continue to decelerate over the next century, reaching negligible levels.

Robots to the rescue?

What interests me most is the responses that Gordon’s article has received. His position is very interesting, but likely wrong in one massive aspect.


Dinosaurs and Tail Risk

Writing in this morning’s FT, Nassim Nicholas Taleb proposes Ten principles for a Black Swan-proof world:

1. What is fragile should break early while it is still small. Nothing should ever become too big to fail. Evolution in economic life helps those with the maximum amount of hidden risks — and hence the most fragile — become the biggest.

Then we will see an economic life closer to our biological environment: smaller companies, richer ecology, no leverage.

A sensible plan, but unfortunately Mr. Taleb’s faith in biology is misplaced.

Why the Dinosaurs got so Large

19th-century palaeontologist Edward Drinker Cope noticed that animal lineages tend to get bigger over evolutionary time, starting out small and leaving ever bigger descendants. This process came to be known as Cope’s rule.

Getting bigger has evolutionary advantages, explains David Hone, an
expert on Cope’s rule at the Institute of Vertebrate Paleontology and
Paleoanthropology in Beijing, China. “You are harder to predate and it
is easier for you to fight off competitors for food or for mates.” But
eventually it catches up with you. “We also know that big animals are
generally more vulnerable to extinction,” he says. Larger animals eat
more and breed more slowly than smaller ones, so their problems are
greater when times are tough and food is scarce. “Many of the very
large mammals, such as Paraceratherium, had a short tenure in the
fossil record, while smaller species often tend to be more
persistent,” says mammal palaeobiologist Christine Janis of Brown
University in Providence, Rhode Island. So on one hand natural
selection encourages animals to grow larger, but on the other it
eventually punishes them for doing so. This equilibrium between
opposing forces has prevented most land animals from exceeding about 10 tonnes.

Dinosaurs had skewed incentives and took on too much tail risk! If even evolution falls into this trap, God help the bank regulators…

Computer Vision in the Elastic Compute Cloud

In a datacenter somewhere on the other side of the planet, a rack-mounted computer is busy hunting for patterns in photographs of Oxford.  It is doing this for 10 cents an hour, with more RAM and more horsepower than I can muster on my local machine. This delightful arrangement is made possible by Amazon’s Elastic Compute Cloud.

For the decreasing number of people who haven’t heard of EC2, it’s a pretty simple idea. Via a simple command line interface you can “create” a server running in Amazon’s datacenter. You pick a hardware configuration and OS image, send the request and voilà – about 30 seconds later you get back a response with the IP address of the machine, to which you now have root access and sole use.  You can customize the software environment to your heart’s content and then save the disk image for future use. Of course, now that you can create one instance you can create twenty. Cluster computing on tap.

This is an absolutely fantastic resource for research. I’ve been using it for about six months now, and have very little bad to say about it. Computer vision has an endless appetite for computation. Most groups, including our own, have their own computing cluster but demand for CPU cycles typically spikes around paper deadlines, so having the ability to instantly double or triple the size of your cluster is very nice indeed.

Amazon also have some hi-spec machines available. I recently ran into trouble where I needed about 10GB of RAM for a large learning job. Our cluster is 32-bit, so 4GB RAM is the limit. What might have been a serious headache was solved with a few hours and $10 on Amazon EC2.

The one limitation I’ve found is that disk access on EC2 is a shared resource, so bandwidth to disk tends to be about 10MB/s, as opposed to say 70MB/sec on a local SATA hard drive. Disk bandwidth tends to be a major factor in running time for very big out-of-core learning jobs. Happily, Amazon very recently released a new service called Elastic Block Store which offers dedicated disks, though the pricing is a little hard to figure out.

I should mention that for UK academics there is a free service called National Grid, though personally I’d rather work with Amazon.

Frankly, the possibilities opened up by EC2 just blow my mind. Every coder in a garage now potentially has access to Google-level computation. For tech startups this is a dream. More traditional companies are playing too. People have been talking about this idea for a long time, but it’s finally here, and it rocks!

Update: Amazon are keen to help their scientific users. Great!

Big Data to the Rescue?

Peter Norvig of Google likes to say that for machine learning, you should “worry about the data before you worry about the algorithm”.

Rather than argue about whether this algorithm is better than that algorithm, all you have to do is get ten times more training data. And now all of a sudden, the worst algorithm … is performing better than the best algorithm on less training data.

It’s a rallying cry taken up by many, and there’s a lot of truth to it.  Peter’s talk here has some nice examples (beginning at 4:30). The maxim about more data holds over several orders of magnitude. For some examples of the power of big-data-simple-algorithm for computer vision, check out the work of Alyosha Efros’ group at CMU.  This is all pretty convincing evidence that scale helps. The data tide lifts all boats.

What I find more interesting, though, is the fact that we already seem to have reached the limits of where data scale alone can take us. For example, as discussed in the talk, Google’s statistical machine translation system incorporates a language model consisting of length 7 N-grams trained from a 10^12 word dataset. This is an astonishingly large amount of data. To put that in perspective, a human will hear less than 10^9 words in an entire lifetime. It’s pretty clear that there must be huge gains to be made on the algorithmic side of the equation, and indeed some graphs in the talk show that, for machine translation at least, the performance gain from adding more data has already started to level off. The news from the frontiers of the Netflix Prize is the same – the top teams report that the Netflix dataset is so big that adding more data from sources like IMDB makes no difference at all! (Though this is more an indictment of ontologies than big data.)

So, the future, like the past, will be about the algorithms. The sudden explosion of available data has given us a significant bump in performance, but has already begun to reach its limits. There’s still lots of easy progress to be made as the ability to handle massive data spreads beyond mega-players like Google to more average research groups, but fundamentally we know where the limits of the approach lie. The hard problems won’t be solved just by lots of data and nearest neighbour search. For researchers this is great news – still lots of fun to be had!

Spot the Sensor

Watching all the videos and photography coming out of the Urban Challenge, the thing I pay most attention to are the racks of sensors sitting on top of the robots. The sensor choices the teams have made are quite different to the previous 2005 Grand Challenge. There has been a big move to more sophisticated laser systems, and some notable absences in the use of cameras. Here’s a short sensor-spotter’s guide:


For almost all the competitors, the primary sensor is the time-of-flight lidar, often called simply a laser scanner. These are very popular in robotics, because they provide accurate distances to obstacles with higher robustness and less complexity than alternatives such as stereo vision. Some models to look out for:


SICK Lidar

Used by 26 of the 36 semi-finalists, these blue boxes are ubiquitous in robotics research labs around the world because they’re the cheapest decent lidar available and are more than accurate enough for most applications. Typically they are operated with a maximum range of 25m, with distances accurate to a few centimetres . They’re 2D scanners, so they only see a slice through the world. This is normally fine for dealing with obstacles like walls or trees which extend vertically from the ground, but can land you in trouble for overhanging obstacles that aren’t at the same height as the laser. In the previous Challenge, these were the primary laser sensors for many teams. This time around they seem to be mostly relegated to providing some extra sensing in blind-spots.
SICK scanners have a list price of around $6,000, but there is a low-price deal for Grand Challenge entries. Indeed, the SICK corporation has had so much business and publicity from the Grand Challenge, that this year they decided to enter a team of their own.


Velodyne Lidar

New kid on the block for the Urban Challenge, the Velodyne scanner is conspicuously popular this year. It’s used by 12 of the 36 semi-finalists, including most of the top teams. With a list price of $75,000, the Velodyne is quite a bit more pricey than the common SICK. However, instead of just containing a single laser, the Velodyne has a fan of 64 lasers, giving a genuine 3D picture of the surroundings.

There’s an interesting story behind the Velodyne sensor. Up until two years ago Velodyne was a company that made subwoofers. It’s founders decided to enter the 2005 Grand Challenge as a hobby project. Back then, the SICK scanner was about the best available, but it didn’t provide enough data, so many teams were loading up their vehicles with racks of SICKs. Team DAD instead produced a custom laser scanner that was a big improvement on what was available at the time. Their website illustrates the change quite nicely. For the Urban Challenge, they decided to concentrate on selling their new scanner to other teams instead of entering themselves. I’m sure this is exactly the kind of ecosystem of technology companies DARPA dreams about creating with these challenges.

I understand that the Velodyne data is a bit nosier than a typical SICK because of cross-talk between the lasers, but it’s obviously more than good enough to do the job. These sensors produce an absolute flood of data – more than a million points a second – and dealing with that is driving a lot of teams’ computing requirements.

Teams who couldn’t afford the hefty price tag of this sensor have improvised Velodyne-like scanners by putting SICKs on turntables or pan-tilt units, but the SICK wasn’t designed for applications like this, so the data is quite sparse and it’s tricky to synchronize the laser data with the pan-tilt position.


Riegl Lidar

Some of the more well-funded competitors are using these high-end lidar systems from Riegl. These are 2D scanners similar to the SICK, but have longer range and more sophisticated processing to deal with confusing multiple returns. However, they will set you back a hefty $28,000.

Ibeo is a subsidiary of SICK that makes sensors for the automotive market. They produce several models of laser scanner, such as the flying-saucer like attachments seen here on the front of team CarOLO. I’m not too familiar with these sensors, but I believe they are rotating laser fans – something like a scaled-down Velodyne.


Vision is less prevalent this year than I was expecting. As far as I can gather, none of the teams have gone in for a computer-vision based approach to recognising other cars. I suppose with a good laser sensor it’s mostly unnecessary, plus you have the advantage of being immune to illumination problems which can foil vision techniques. Many teams have cameras for detecting lane markings, but that appears to be the extent of it.
Some teams, such as Stanford’s Junior, are all-laser systems with no cameras at all. Given that vision was the core of the secret sauce that helped Stanford win the 2005 Grand Challenge, and their early press photos prominently showed a Ladybug2 camera, I was pretty surprised by this. The reason is revealed in this interview with Mike Montemerlo where he shows plenty of results using the Ladybug2, but explains that they had to abandon the sensor after their lead vision programmer left for a job with Google (who have some interest in Ladybugs). The final version of Junior uses laser reflectance information to find the road markings, and judging by the results so far, seems to be getting on just fine without vision.

Cameras come in all shapes and sizes, but a few to look out for:

PointGrey Bumblebee

Bumblebee Stereo Camera

Point Grey are a popular supplier of stereo vision systems, and you can see these cameras attached to a number of vehicles. Princeton’s Team Prowler has a system based entirely around these stereo cameras – a choice they made for budget reasons.


Point Grey Ladybug2

This is a spherical vision system composed of 6 tightly packed cameras, also produced by Point Grey. After Stanford abandoned their vision system, I don’t think any entries are using this camera – but there’s one sitting on my desk as I type this, so I’m including the picture anyway.


Though not very visible, several cars are sporting radar units. The MIT vehicle has 16! Radar is good for long range, out to hundreds of meters, but it’s noisy and has poor resolution. However, when you’re travelling fast and just want to know if there’s a major obstacle up ahead, it does the job. It’s already used in several commercial automotive safety systems.


GPS is obviously a core sensor for every entrant. Most vehicles have several redundant GPS units on their roofs, popular suppliers being companies like Trimble who sell rugged, high-accuracy units developed for applications in precision agriculture.


Though not visible on the outside, many of the entrants have inertial measurement units tucked inside. These little packages of gyroscopes and accelerometers help the vehicles keep track of position during GPS outages. High-end IMUs can be amazingly precise, but have a price tag to match.

This post has become something of a beast. If you still can’t get enough of sensors, there are some interesting videos here and here where Virgina Tech and Ben Franklin discuss their sensor suites.