Fact-checking the information exa-ggeration

Numbers: they can be beguiling things, especially when they tell a story we really want to hear.

The bigger the numbers the better, ideally so mind-bogglingly big that they totally overwhelm our critical faculties.

Best of all, take a series of numbers getting ever bigger: a dynamic that makes us feel as if something significant is happening before our eyes.

All of the above feature in this example from Google’s recent annual shareholders meeting:

[Chief executive officer Eric] Schmidt estimates… There are 800 exabytes of information in the world people can access on the Internet, he says, explaining that an exabyte is about 1 billion gigabytes. “Between the dawn of civilization and 2003, there were exactly five exabytes created,” he says. “We now create that every two days.”

You’ll find the precise quote about 24 seconds into this video from Google’s Investor Relations channel.
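Taking the quoted figures at face value, it is worth seeing what they imply when put side by side. The sketch below is only arithmetic on the numbers in the quote; the constant and function names are my own, and nothing in it checks whether the underlying estimates are sound.

```python
# All figures are taken from the quote above and are assumed, not verified.
GIGABYTES_PER_EXABYTE = 1_000_000_000  # "an exabyte is about 1 billion gigabytes"
ACCESSIBLE_EXABYTES = 800              # claimed information accessible on the Internet
CLAIMED_EXABYTES = 5                   # claimed output from "the dawn of civilization" to 2003
CLAIMED_PERIOD_DAYS = 2                # "We now create that every two days"


def implied_figures():
    """Return the rates that follow if the quoted numbers are accepted as given."""
    daily_rate = CLAIMED_EXABYTES / CLAIMED_PERIOD_DAYS
    return {
        "exabytes_per_day": daily_rate,
        "exabytes_per_year": daily_rate * 365,
        "days_to_reach_800_exabytes": ACCESSIBLE_EXABYTES / daily_rate,
        "gigabytes_in_5_exabytes": CLAIMED_EXABYTES * GIGABYTES_PER_EXABYTE,
    }


if __name__ == "__main__":
    for name, value in implied_figures().items():
        print(f"{name}: {value:,.1f}")
```

On these figures, the 800 exabytes said to be accessible online would amount to 320 days of output at the claimed rate, which gives a sense of how much weight the headline numbers are being asked to carry.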

The statistic prompted this reverie from the inestimable JP Rangaswami on his blog, Confused of Calcutta:

So, while I knew that the amount of information being produced was accelerating, and that too at an increasing rate, I didn’t really have an appreciation of the scale. Now I do, and I’m grateful to Eric Schmidt for that.

Now I’m sure there are many things for which we should be grateful to Eric Schmidt, but perpetuating this five exabyte claim is not one of them. I’ve tracked down the source and it’s not very convincing. This from Language Log, back in 2003:

The canard that “Five exabytes… is equivalent to all words ever spoken by humans since the dawn of time” was repeated in this 11/12/2003 NYT article by Verlyn Klinkenborg. It’s amazing how people pass this stuff around without checking it or thinking it through: Eskimo snow words all over again, though on a much smaller scale (so far).

For leaving aside the practical question of when we date “the dawn of civilisation,” what value judgements are implied in converting the “information” of a pre-digital world into bits and bytes?

How, for instance, do you evaluate a medieval manuscript? Its transcription into ASCII or Unicode may take up a fraction of the space of one laughing baby video, but I’m not sure the comparison is very meaningful.

And what of all the other artefacts created by our ancestors? The warp and weft of their handmade clothes made unique pixellated patterns, while our machine-produced chainstore garments would be easily de-duped prior to archiving.

It’s really exciting to live in the 21st century, but it is breathtakingly arrogant to portray our predecessors as information-poor. It feeds a narrative of technological determinism and “information overload” while blinding us to a much more enticing prospect: that people have been creating stuff since, erm, the dawn of civilisation.

As I suggested in a previous post, if we want to profit from the massive potential of new media, we’d do well to start with a little more humility and respect for the way people communicated and interacted quite happily for thousands of years without the help of mobile phones and computers.
