A team from Harvard has been studying Google Books’ record of human culture, spanning six centuries and seven languages.

It shows vocabularies expanding and grammar evolving. It contains stories about our adoption of technology, our quest for fame, and our battle for equality. And it hides the traces of tragedy, including traces of political suppression, records of past plagues, and a fading connection with our own history.

The part about our perceptions of time is quite interesting:

“’1951’ was rarely discussed until the years immediately preceding 1951. Its frequency soared in 1951, remained high for three years, and then underwent a rapid decay, dropping by half over the next fifteen years.” But the shape of these graphs is changing. The peak gets higher with every year and we are forgetting our past with greater speed. The half-life of ‘1880’ was 32 years, but that of ‘1973’ was a mere 10 years.
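To get a feel for what those half-lives mean, here is a minimal sketch in Python. It assumes that mentions of a year fall off as a simple exponential decay from the peak, which is my simplification and not something the article states (the quote above describes a plateau of a few years before the decline). The function names are made up for illustration; only the half-lives of 32 and 10 years come from the article.

```python
# Rough illustration only: assumes plain exponential decay from the peak,
# which is an assumption of this sketch, not a model given in the article.

def remaining_fraction(years_elapsed: float, half_life: float) -> float:
    """Fraction of peak frequency left after `years_elapsed` years."""
    return 0.5 ** (years_elapsed / half_life)


def annual_decline(half_life: float) -> float:
    """Fraction of mentions lost each year under the same assumption."""
    return 1.0 - 0.5 ** (1.0 / half_life)


# Half-lives quoted in the article, in years.
HALF_LIVES = {"1880": 32.0, "1973": 10.0}

for year, half_life in HALF_LIVES.items():
    left = remaining_fraction(20, half_life)
    yearly = annual_decline(half_life)
    print(f"'{year}': {left:.0%} of peak left after 20 years, "
          f"losing about {yearly:.1%} per year")
```

Under that assumption, twenty years on, ‘1973’ would be down to a quarter of its peak, while ‘1880’ would still be at well over half.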

The team found that new technology permeates through our culture with growing speed. By scanning the corpus for 154 inventions created between 1800 and 1960, from microwave ovens to electroencephalographs, they found that more recent ones took far less time to become widely discussed.

In fact, it’s full of gems:

They found that in the early 19th century, celebrities started rising to fame at the age of 43 and it took 8 years for their prominence in books to double. By the mid-20th century, they were starting at 29 and doubling in just over 3 years. However, while the spotlight upon them is more intense, their time in it is briefer. Celebrities tend to peak in fame at the ripe old age of 75 (remember, this is measured by their mentions in books). A century ago, it took 120 years after that for their fame to halve; now, it takes just 71.

Insight into politics and suppression:

The absence of words can be just as informative as their presence – they can represent the cultural fingerprints of censorship and suppression. There’s no shortage of examples. Tiananmen Square became massively more common in English books following 1989, but the frequency of the equivalent characters in Chinese texts remained stable. The names of the Hollywood Ten – a group of alleged Communist sympathisers – were mentioned far less often in English texts after 1947.

The names of artists, writers, political academics, historians and philosophers all became increasingly rare among German texts during the Third Reich, while the names of Nazi party members became six times more common (above middle). None of this is surprising in a historical context, but in the future, the corpus could help to identify victims of censorship in a rapid way, for current or recent events.

I wonder how you’d objectively tell from the data alone who was being suppressed and who was just falling from popularity?

All the above via The cultural genome in Discover Magazine.

You can play with this for yourself at Ngrams in Google Labs, which lets you enter words and see their frequency over time. Try Good vs Bad, or change, revolution, social, public, or duty and responsibility.

And yes, it’s not just our perception: there IS more written about rights than responsibilities. (Though also note that this is not necessarily a bad thing.)

We can also see when particular concepts were at the fore, so perhaps we can learn from history, e.g. from the active promotion of the concept of thrift:

