On Tuesday Sundar Pichai, the current CEO of Google, testified before Congress on an array of issues including privacy, security, possible products targeted at China, the company’s treatment of “hate speech,” and complaints of anti-conservative bias in company products such as search and YouTube. In particular, one claim that Pichai made was that he “lead[s] this company without bias” and that search results, such as photos of President Trump appearing when searching for “idiot,” occur organically due to algorithms and not due to intentional meddling by Google engineers. Even though he admits that the majority of Google’s employees support Democrats, Pichai told Congress that “We built our products in a neutral way.”
You can view his testimony on C-SPAN with a transcript here: https://www.c-span.org/video/?455607-1/google-ceo-sundar-pichai-testifies-data-privacy-bias-concerns&start=3793
This claim, that Google search is unbiased at its core and that the results which seem to be biased against conservatives are caused by impartial algorithms instead of malicious employees, is deceptive. Google is able to get away with this claim due to the technicalities of machine learning and of the overall data pipeline and lifecycle that factor into how it generates search results.
Google incorporates machine learning into its search to deliver more relevant results and advertising. Google is one of the most advanced companies in the field of Artificial Intelligence and Machine Learning, to the point that the company changed its strategic focus from mobile-first to AI-first in 2017. Machine learning algorithms produce results based on the training data they were given (this is why Google is a top player in AI: it collects mountains of data and can therefore train its models to be highly accurate). However, one of the consequences of this training process is that the model becomes biased towards the data it was trained on (in the statistical sense, not the political sense; we’ll get to that).
Here is where the deception comes in: Google partners with 3rd party organizations, such as the Southern Poverty Law Center (though Pichai denied in his testimony that the SPLC had ever flagged a single video) and the Anti-Defamation League, to flag content they consider dangerous or hateful. By using biased entities like the ADL and SPLC to decide what content is considered “hateful,” the training data that is fed to the algorithms is biased in favor of a liberal perspective on what constitutes hate speech.
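To make the mechanism concrete, here is a toy sketch of how labels drive a model's output. It is not Google's code, which is not public; the word-count "model," the function names train and classify, and the placeholder topics "alpha" and "beta" are all made up for illustration. The same code is trained twice, once on each of two hypothetical labelers who disagree about which topic should be flagged.

```python
from collections import Counter

def train(examples):
    """Count how often each word appears under each label."""
    # "ok" comes first so that a tie in classify() resolves to "ok".
    counts = {"ok": Counter(), "flagged": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def classify(counts, text):
    """Return the label whose training examples share more words with text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)

# Two hypothetical labelers see the same posts but disagree on what to flag.
labeler_one = [
    ("topic alpha opinion", "flagged"),
    ("topic beta opinion", "ok"),
    ("weather report today", "ok"),
]
labeler_two = [
    ("topic alpha opinion", "ok"),
    ("topic beta opinion", "flagged"),
    ("weather report today", "ok"),
]

model_one = train(labeler_one)
model_two = train(labeler_two)

new_post = "new alpha post"
verdict_one = classify(model_one, new_post)
verdict_two = classify(model_two, new_post)
print(verdict_one, verdict_two)
```

Nothing in train or classify mentions either topic. The only difference between the two models is who supplied the labels, and that alone is enough to change the verdict on the same new post.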
Another way in which bias is introduced is the weight that mainstream media sources are given in search results. In an effort to fight “fake news,” Google weights “trusted” media outlets higher. This creates bias in the results because most mainstream outlets are biased against President Trump, while the sites that are attacked as fake news tend to be conservative, such as Breitbart and Infowars.
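A similar toy sketch shows how source weighting can reorder results. Again, this is an assumption-laden illustration and not Google's ranking formula: the rank function, the relevance scores, the 1.5 multiplier, and the two .example domains are all invented for the example.

```python
def rank(results, source_weight, default_weight=1.0):
    """Sort results by relevance multiplied by a per-source weight."""
    def score(result):
        weight = source_weight.get(result["source"], default_weight)
        return result["relevance"] * weight
    return sorted(results, key=score, reverse=True)

# Hypothetical results: the small outlet's story is slightly more relevant.
results = [
    {"title": "Story from small outlet", "source": "small-outlet.example", "relevance": 0.90},
    {"title": "Story from large outlet", "source": "large-outlet.example", "relevance": 0.80},
]

# With no source weights, the more relevant story ranks first.
unweighted = rank(results, {})

# Boosting a "trusted" source by a made-up factor of 1.5 flips the order.
weighted = rank(results, {"large-outlet.example": 1.5})

print([r["source"] for r in unweighted])
print([r["source"] for r in weighted])
```

The ranking function itself treats every source identically. The outcome depends entirely on which sources end up in the weight table, which is a decision made outside the algorithm.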
Furthermore, because this bias is statistical, it is not impossible for conservative results to sometimes appear, even though on average results at large are liberally biased. This bias occurs without any Google employee having to directly add it in, allowing the company to publicly claim that its products are “built in a neutral way” even though they still produce biased results in the end.
To summarize, here is the process that leads to Pichai’s claim being deceptive:
1. Google uses machine learning to improve its search
2. Machine learning algorithms are biased towards the data they are trained with
3. Google partners with 3rd parties like the ADL and SPLC to identify “hate speech”
4. Biased information from 3rd parties is incorporated into the ML algorithms that Google uses to weight and filter content
5. News stories from mainstream outlets are given higher weights, resulting in anti-conservative mainstream news sources appearing higher than smaller conservative outlets
6. Google’s products become biased against conservatives
7. Because this process does not involve a Google employee directly writing a feature that suppresses conservative viewpoints, the company is able to publicly claim that the search results are fair
Google is not the only tech company to use a process like this. Here is an infographic by @satoshiksutra showing how Twitter launders its political bias