AOL Search Queries

Sorry for being such a hermit lately; I'm just concentrating on getting my thesis written.

I did take time out to play with AOL's search query database for a bit. It's probably old news to some of you, but this is the story:

A while back, the US Government demanded databases of search results from the major search engines (they wanted to trawl through the data to determine the prevalence of searches for pornography and such, in an attempt to pass some online child protection legislation). Google in particular refused to turn over results, citing trade secrets and privacy concerns. A lengthy legal battle ensued in which, in many people's eyes, Google fought the good fight.
Anyway, that's just context.

Early this month, AOL's research division publicly released AOL's entire search records for a 3-month period. Not a leak; a deliberate release. Just hours afterwards, AOL admitted they had screwed up and retracted the database, but not before people had made copies.

As a concession to privacy concerns, every username in the database is replaced by a random number, but the interesting thing is that the same number is used for all of a given user's searches, so you can see an (anonymous) person's entire search history. You can search the database by keyword, user ID, or the website that a search returned.
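For the curious, here's a rough Python sketch of why a per-user number doesn't hide much. The rows, IDs, queries, and site names below are all made up for illustration (they are not taken from the AOL data), and the function names `history_by_user` and `search` are my own. The point is just that a stable ID lets you regroup every query into one person's history, and filter by keyword, user ID, or returned website.

```python
from collections import defaultdict

# Hypothetical sample rows: (anonymous user ID, query, website returned).
# Everything here is invented for illustration; none of it is real AOL data.
ROWS = [
    (1001, "cheap flights", "example-travel.com"),
    (1002, "steak recipes", "example-food.com"),
    (1001, "hotel deals", "example-travel.com"),
    (1002, "cheese types", ""),
]


def history_by_user(rows):
    """Group queries by the stable anonymous ID, preserving row order."""
    histories = defaultdict(list)
    for user_id, query, _site in rows:
        histories[user_id].append(query)
    return dict(histories)


def search(rows, keyword=None, user_id=None, site=None):
    """Return rows matching every filter that was supplied."""
    matches = []
    for row in rows:
        row_user, row_query, row_site = row
        if keyword is not None and keyword.lower() not in row_query.lower():
            continue
        if user_id is not None and row_user != user_id:
            continue
        if site is not None and row_site != site:
            continue
        matches.append(row)
    return matches


if __name__ == "__main__":
    print(history_by_user(ROWS))
    print(search(ROWS, keyword="steak"))
```

If each search had been given a fresh random number instead, the grouping step would return nothing but single-query "histories", which is exactly the difference that makes this release so revealing.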

It's fascinating in a voyeuristic sort of way. Some of it is funny, some of it is sad, and some of it is just plain disturbing.
This example clearly shows the underestimated danger of steak and cheese to society.
Here are some funny examples courtesy of Something Awful.
And here's a pretty interesting writeup of the whole situation.

I think what makes the database fascinating is trying to reconstruct what's going through someone's mind as they enter these search queries. Bizarre stuff.

Brett said...

Wow, that's an interesting find, Nick. Cool in a somewhat scary and unsettling kind of way.

How's this relate to your thesis again? :P