The difference between ancient myths and modern ones is that the latter are peddled as true. Among these modern myths is the assumption that replacing people with machines running complicated algorithms will give unbiased outputs. We are often told that such systems are more objective and less prone to prejudice than human judgment. Many companies are turning to machine learning to review vast amounts of data: evaluating credit for loan applications, scanning legal contracts for errors, and combing through employee communications with customers to identify bad conduct.
Unintended consequences are the norm rather than the exception in human endeavours, and delegating decision-making to machines is no exception. In 2016, for example, an attempt by Microsoft to converse with millennials using a chatbot plugged into Twitter famously created a racist machine that switched from tweeting that “humans are super cool” to praising Hitler and spewing out misogynistic remarks. This was a system that learnt through interaction with users, so its biases grew out of the biases of the users driving that interaction. In essence, the users repeatedly tweeted offensive statements at the system and it used those statements as the raw material for later responses.
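To see the mechanism in miniature, here is a deliberately naive Python sketch (a toy illustration, not Microsoft's actual system) of a bot that treats every incoming message as raw material for its future replies; once hostile users flood it with offensive input, that input dominates what it says.

```python
# Toy illustration only (not Microsoft's actual system): a bot that naively
# treats every user message as training data for its future replies.
import random

class NaiveEchoBot:
    def __init__(self, seed_phrases):
        # The bot starts with benign, hand-picked phrases.
        self.corpus = list(seed_phrases)

    def listen(self, user_message):
        # No filtering or moderation: whatever users say is absorbed
        # straight into the pool the bot samples from later.
        self.corpus.append(user_message)

    def reply(self):
        # Replies are drawn from the pooled corpus, so a coordinated flood
        # of offensive tweets quickly dominates the output.
        return random.choice(self.corpus)

bot = NaiveEchoBot(["humans are super cool"])
for msg in ["<offensive statement>"] * 50:   # hostile users spamming the bot
    bot.listen(msg)
print(bot.reply())   # very likely to parrot the offensive input back
```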
In Homo Deus: A Brief History of Tomorrow, Yuval Noah Harari says that high-tech Silicon Valley gurus are creating a new universal narrative that legitimizes the authority of algorithms and Big Data. He calls this novel creed “Dataism”. According to Dataism, ‘Beethoven’s Fifth Symphony, a stock-exchange bubble and the flu virus are just three patterns of dataflow that can be analyzed using the same basic concepts and tools.’ Proponents of the Dataist worldview perceive the entire universe as a flow of data, with us already becoming tiny chips inside a giant system that nobody really understands. The nature of AI algorithms means that we won’t know how or why the system does what it does, or how it is so damn good at predicting and choosing things.
We may thus be gradually ceding control to algorithms which will make all the important decisions for us. But, as Cathy O’Neil says in Weapons of Math Destruction, an algorithm is just ‘an opinion formalized in code’. We tend to think of machines as somehow cold, calculating and unbiased. We believe that self-driving cars will have no preference in life-or-death decisions between the driver and a random pedestrian. We trust that smart systems performing credit assessments will ignore everything except the genuinely relevant metrics, such as income. And we believe that learning systems will always ultimately lead us to the truth because ‘unbiased’ algorithms drive them.
Using machine-learning technology to accomplish such tasks is attractive, but it doesn't eliminate human bias; it just camouflages it with technology. In an ideal world, intelligent systems and their algorithms would be objective. In practice, there are many ways in which machines can be taught to do something immoral, unethical, or just plain wrong. Machine bias, or algorithmic bias, is the effect of erroneous assumptions built into machine-learning processes. Models of human behaviour often rely on questionable proxies that reflect the biases of the modeler. The models are generally opaque, and businesses guard them as intellectual property defended by legions of lawyers.
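As a rough illustration of how a proxy can launder bias, consider the following sketch. It uses invented data and a hypothetical neighbourhood feature rather than any real credit model: the toy model never sees a protected attribute, yet it reproduces past discrimination because it learns from a proxy that correlates with it.

```python
# A minimal sketch with invented data: a toy model that never sees a
# protected attribute still reproduces past discrimination, because it
# learns from a neighbourhood-like proxy correlated with that attribute.
import random
random.seed(0)

def past_decision(zone):
    # Historical human decisions (our training labels) were systematically
    # harsher on applicants from one area.
    approve_rate = {"north": 0.8, "south": 0.3}[zone]
    return 1 if random.random() < approve_rate else 0

train = [(z, past_decision(z)) for z in ["north", "south"] * 500]

# "Training": learn the historical approval rate per zone, the only
# feature the model is given.
rates = {}
for zone in ("north", "south"):
    labels = [y for z, y in train if z == zone]
    rates[zone] = sum(labels) / len(labels)

def model(zone):
    # The learned model simply replays the historical disparity.
    return int(rates[zone] >= 0.5)

print(rates)                          # roughly {'north': 0.8, 'south': 0.3}
print(model("north"), model("south"))  # 1 0 -- bias laundered through a proxy
```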
These algorithms have large effects in the real world, often with unfortunate consequences, and some have begun to describe data science as the new astrology. Technocrats and managers make debatable value judgments that have their biases written all over them. Complicated mathematical models reframe subtle and subjective judgments (such as the worth of a worker, a service, an article, or a product) as hard numbers. One opaque model feeds into another and could affect your whole life. For example, a bad credit score could mean that you may not get a house, buy a vehicle or get a job. The techies then claim that it is objective “science” based on measurable data which is supposed to be accepted unchallenged. As long as the algorithms are secret, you will never know what kind of social sorting is taking place.
Can a computer program be racist? In ‘predictive policing’, historical data about crime is fed into an algorithm which then tells police where future crime is likely to occur. Such systems are in use in countries like the US and China. But predictive tools are only as good as the data they are fed. As the article ‘Predictive Policing Isn’t About the Future, It’s About the Past’ points out, these systems are based mostly or entirely on historical crime data held by the police, which is a record of how law enforcement responds to particular crimes rather than of the true rate of crime. Hence these data are contaminated by underlying biases about where to deploy police and what type of people commit crimes, thereby reinforcing those very biases. Forecasts are only as good as the data used to train them.
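The feedback loop is easy to demonstrate with a toy simulation. The figures below are invented and not drawn from any real deployment, but they show how a small initial gap in the records, combined with patrols allocated according to those records, never corrects itself.

```python
# A toy feedback-loop simulation with invented numbers: two districts have
# identical true crime rates, but patrols are allocated in proportion to
# *recorded* crime, so crime is mostly recorded where police already look.
import random
random.seed(1)

TRUE_RATE = 0.1                  # same underlying rate in both districts
PATROLS_PER_DAY = 100
recorded = {"A": 60, "B": 40}    # district A starts with slightly more records

for day in range(365):
    total = recorded["A"] + recorded["B"]
    for district in ("A", "B"):
        # More past records -> more patrols -> more chances to record crime.
        patrols = round(PATROLS_PER_DAY * recorded[district] / total)
        recorded[district] += sum(random.random() < TRUE_RATE
                                  for _ in range(patrols))

# The recorded gap between the districts keeps growing and never corrects,
# even though the underlying crime rates are identical.
print(recorded)
```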
A machine-learning algorithm is used by judges in over a dozen US states to make decisions on pre-trial conditions and, sometimes, in actual sentencing. A study found that the algorithm was roughly twice as likely to incorrectly flag black defendants as high risk of reoffending, and conversely roughly twice as likely to incorrectly flag white defendants as low risk of reoffending. This difference could (and did) result in tougher or more lenient sentences being handed out.
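The disparity in error rates is easier to see with a small worked example. The counts below are made up purely for illustration, in the spirit of the published analysis rather than taken from it; the point is that a single model can flag one group's non-reoffenders as high risk about twice as often, while letting the other group's reoffenders slip through as low risk about twice as often.

```python
# Made-up confusion-matrix counts (illustrative only, not the study's data):
# the same model can produce very different error rates for different groups.
def rates(tp, fp, fn, tn):
    fpr = fp / (fp + tn)   # non-reoffenders wrongly labelled high risk
    fnr = fn / (fn + tp)   # reoffenders wrongly labelled low risk
    return round(fpr, 2), round(fnr, 2)

# Hypothetical counts per group:
black_defendants = dict(tp=300, fp=220, fn=100, tn=280)
white_defendants = dict(tp=200, fp=110, fn=190, tn=400)

print("black:", rates(**black_defendants))  # ~(0.44, 0.25): twice the false alarms
print("white:", rates(**white_defendants))  # ~(0.22, 0.49): twice the missed reoffenders
```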
In an article titled ‘Beware the Big Errors of Big Data’, Nassim Nicholas Taleb warns that spurious correlations will multiply with the voluminous jump in data collection: the huge increase in the haystack makes it harder to find the needle. As Tim Harford says in ‘Big Data: Are We Making a Big Mistake?’, ‘Because found data sets are so messy, it can be hard to figure out what biases lurk inside them – and because they are so large, some analysts seem to have decided the sampling problem isn’t worth worrying about. It is.’
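Taleb's warning can be illustrated with a short simulation built from pure noise; the thresholds and sizes below are arbitrary choices, and the point is not any particular number but the trend as the number of variables grows.

```python
# Illustrative simulation of Taleb's point: with purely random data, the
# number of 'impressive-looking' correlations grows with the number of
# variables, even though nothing real connects them.
import random
import statistics

random.seed(0)

def pearson(xs, ys):
    # Plain Pearson correlation coefficient.
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

N_OBS = 50
for n_vars in (10, 50, 200):
    data = [[random.gauss(0, 1) for _ in range(N_OBS)] for _ in range(n_vars)]
    spurious = sum(1
                   for i in range(n_vars)
                   for j in range(i + 1, n_vars)
                   if abs(pearson(data[i], data[j])) > 0.3)
    print(n_vars, "variables ->", spurious, "pairs correlated above 0.3")
```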
The industries built on activities like search and credit hide their methods behind secret algorithms. The “privacy policies” of various firms are written to their advantage at the expense of the consumer. I have yet to come across anybody who reads them. People mechanically click “I agree” when confronted with “terms of service” agreements, since protesting against any clause in them won’t be of any use. The dice are heavily loaded against them. Many agreements now include a “unilateral modification” clause that lets companies change the terms later, with no notice to the persons affected. Frank Pasquale writes in The Black Box Society:
We cannot so easily assess how well the engines of reputation, search, and finance do their jobs. Trade secrecy, where it prevails, makes it practically impossible to test whether their judgments are valid, honest, or fair. The designation of a person as a bad employment prospect, or a website as irrelevant, or a loan as a bad risk may be motivated by illicit aims, but in most cases we’ll never be privy to the information needed to prove that.
What we do know is that those at the top of the heap will succeed further, thanks in large part to the reputation incurred by past success; those at the bottom are likely to endure cascading disadvantages. Despite the promises of freedom and self-determination held out by the lords of the information age, black box methods are just as likely to entrench a digital aristocracy as to empower experts.