Oh, Google. I know you mean well, but sometimes you just try too hard and come across like a real Chatty Cathy. Even worse, a Chatty Cathy Know-It-All.
I’ve got your back, though. The semantic web is to blame. In your quest to gain a human-like understanding of queries and content, you’ve relied a little too much on the web of information that has been created by the general Internet population. This works well in theory, but you underestimated one thing: Our undying thirst for #juicy #gossip!
At the heart of the matter is Google’s Knowledge Graph. You may know it as the little boxes that answer your questions right at the top of search results, before you have to click on a single link and sometimes before you even know what you’re asking. Google answers your queries and determines what you might be interested in by pulling data from a variety of reputable sites, such as Wikipedia, Freebase and even the CIA World Factbook.
But what happens when Google overreaches? What happens when they sacrifice a little integrity in order to bring you fuller results?
A little over a month ago, I had the baffling pleasure of seeing firsthand what happens when Google scrapes the bottom of the barrel. You see, being a Michigander, I’m far removed from the happenings of the Royal Family across the pond. For that reason, I could not for the life of me recall what Prince William named his son. You can probably imagine my confusion when I took to Google to find out:
Holy crap! What are they putting in that royal water? That kid is huge!
I admit that I panicked for a moment, thinking I was off my rocker. “I could have sworn they just had that kid!” So where was this picture from the future coming from? After some digging, I found that Google was pulling the image from this ridiculous article, courtesy of the Daily Mail.
That’s right, “one of the key breakthroughs behind the future of search” involves pulling information from a middle-market tabloid. To their credit, they have since started using a picture from Wikipedia of the Duchess holding a very real Prince George. While I found it a tad off-putting that Google would reference anything less than highly authoritative sources, I figured it was a rare occurrence, especially as the Knowledge Graph grows and accumulates more data.
Shortly after, I caught part of Ronan Farrow’s appearance on The Daily Show. As I saw him sitting there, I thought, “Boy, that guy sure does look small, even next to Jon Stewart.” So, I used Google in the way I like to imagine Larry and Sergey envisioned it being used back in ’96: to find out how tall celebrities are.
OK, so he’s not short. Something else struck me as strange, however, and this one is a little more subtle than Prince George’s futuristic face. Do you see it? Through their data collection and semantic web of information, Google has deduced that when people look up the height of someone, they are oftentimes curious about the height of their parents. However, in this case, they did not just pull in the height of Ronan’s parents – Woody Allen and Mia Farrow – they also included the height of Ol’ Blue Eyes himself, Frank Sinatra.
Frank Sinatra, for those successful at avoiding all gossip, has been rumored to be the biological father of Ronan Farrow. He is the third result in the “People also search for” portion of Ronan’s Knowledge Graph box, and the first related set of images when searching for Ronan on Google’s image search. These associations make sense, because people are frequently searching for the two of them as they read about these rumors, and they want to look for resemblances in images. However, referencing Sinatra’s height directly in an answer box seems rather bold and uncomfortably definitive.
This, of course, is a result of the semantic web in action. There are likely several webpages referencing both men’s heights and other similar attributes, and people commonly search for the heights of both. Therefore, Google has formed a correlation. Yet, as we [should] know, correlation does not guarantee fact. One would think that an “answer box” would only deal in facts.
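For the curious, here is a toy sketch of how that kind of correlation could come about. To be clear, this is not Google’s actual method, which I have no window into; the search sessions below are made up, and the function names `co_search_counts` and `related_to` are my own inventions for illustration. All it does is count how often two names turn up in the same session.

```python
from collections import Counter
from itertools import combinations

# Hypothetical search sessions: each set holds the people one user looked up.
# This is invented toy data, not real query logs.
sessions = [
    {"Ronan Farrow", "Frank Sinatra"},
    {"Ronan Farrow", "Mia Farrow", "Woody Allen"},
    {"Ronan Farrow", "Frank Sinatra", "Mia Farrow"},
    {"Frank Sinatra", "Dean Martin"},
]


def co_search_counts(sessions):
    """Count how many sessions each pair of people appears in together."""
    counts = Counter()
    for people in sessions:
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(people), 2):
            counts[(a, b)] += 1
    return counts


def related_to(person, counts):
    """Rank the people most often searched alongside `person`."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == person:
            related[b] += n
        elif b == person:
            related[a] += n
    return related.most_common()


counts = co_search_counts(sessions)
related = related_to("Ronan Farrow", counts)
print(related)
```

In this made-up data, Sinatra ties with Mia Farrow as the name most often searched alongside Ronan. Notice that nothing in the count knows, or cares, whether the connection is a birth certificate or a tabloid rumor.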
In their quest to “organize the world’s information and make it universally accessible and useful,” Google has naturally hit some speed bumps along the way. As more objects and facts are added to the Knowledge Graph, and further supported through the expansion of structured data, we can expect answers, results and sources to be more concrete.
At the end of the day, I just want to know with absolute certainty when Mila Kunis’ due date is!