RobotRepublic — Many of us have been arguing for a long time now that an understanding of artificial intelligence is going to have big implications for the future of our world.
Organizations such as the International Committee for Robot Arms Control are now arguing for limits on Lethal Autonomous Weapons Systems (LAWS), and there has, happily, been movement in the United Nations in that direction.
Others have been arguing that AI algorithms must be made accountable. In other words, they want users to be able to understand how these algorithms reach their conclusions. The EU’s General Data Protection Regulation will demand certain kinds of accountability and, given the international scale of many large companies, is likely to impact AI use in the commercial domain.
Despite this growing awareness of the importance of algorithmic accountability, governments are pushing recklessly ahead with inappropriate uses of emerging AI technologies.
Here’s a case in point: the US Department of Homeland Security’s (DHS) recent decision to use AI as part of a proposed “extreme vetting initiative” (EVI).
The EVI assumes it is possible to make “determinations via automation” about whether an individual will become a “positively contributing member of society” or is more likely to be a terrorist threat. Further, DHS has been talking about awarding a contract to deploy this technology as soon as September 2018, possibly sooner if the US President has his way.
On November 16, a letter was sent to the Acting Secretary of DHS, Hon. Elaine Duke, expressing grave concerns about the proposed plan.
Summarizing the objections, the authors of the letter stated “Simply put, no computational methods can provide reliable or objective assessments of the traits that [DHS] seeks to measure.” The letter was signed by 54 experts on AI, machine learning and related technologies. I’m proud to have been one of those signing.
To understand the argument being made, one needs to know a bit about how current machine-learning algorithms work. One major use of these programs is learning to predict outcomes from a collection of example data.
The goal is to identify which features of the data best predict the category an item will fall into.
If some of the examples in the data fall into one category and the rest fall into another, the algorithms can be trained to figure out which factors in the data correlate most strongly with each category.
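As a minimal sketch of this idea (with entirely invented data, not any real system), here is the two-step pattern: a “training” step that finds where one category ends and the other begins, and a “prediction” step that applies the learned rule to a new, unseen example:

```python
# Minimal sketch of learning to predict a category from labeled examples.
# Training finds a boundary between the two categories; prediction applies
# that boundary to new data. All numbers here are invented for illustration.

# Labeled training data: (feature value, category)
training = [(0.2, "A"), (0.3, "A"), (0.4, "A"),
            (0.7, "B"), (0.8, "B"), (0.9, "B")]

def train(data):
    """Learn a threshold: the midpoint between the two categories' averages."""
    a_vals = [x for x, c in data if c == "A"]
    b_vals = [x for x, c in data if c == "B"]
    return (sum(a_vals) / len(a_vals) + sum(b_vals) / len(b_vals)) / 2

def predict(threshold, x):
    """Classify a new example using the learned boundary."""
    return "A" if x < threshold else "B"

threshold = train(training)
print(predict(threshold, 0.35))  # a new example on the "A" side of the boundary
```

Real systems learn far more complex boundaries over many features at once, but the shape of the process, learning from labeled examples and then generalizing to new ones, is the same.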
For example, suppose we have a lot of data about patients coming to the emergency room in a hospital. Some of these patients are sent home and some are admitted. But some of those who are sent home end up coming back to the ER within a short period of time; these are known as “revisits.”
Reducing revisit rates, by appropriately admitting the patients instead of sending them home after the first visit, helps improve patient care and also saves the hospital considerable costs. In fact, reducing revisit rates is considered a critical part of improving health care in America.
In a number of different studies, machine-learning systems have been shown to be very useful in helping hospitals to determine how to reduce these rates. The hospitals’ electronic health records are used to algorithmically compare (i) the people who are admitted and stay, (ii) the people who are released and don’t come back, and (iii) the people who are the revisitors, returning to the ER within some set amount of time.
The hospital’s data may include many thousands of potential features for each patient (when they arrived, where they live, what time they were released, how old they are, what treatments they were given, and many more). AI systems have been shown to be able to identify much smaller sets of these features that can be used to help doctors make better decisions.
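A toy sketch of that narrowing-down step (the patients, feature names, and values below are all invented for illustration, not drawn from any real health record): score each candidate feature by how well it alone predicts the outcome, then keep only the top scorers.

```python
# Toy sketch of narrowing many candidate features down to a useful few:
# score every feature by how well it alone predicts the outcome, then keep
# the best. All patients and features here are invented for illustration.

patients = [
    # each patient: feature values plus the outcome we want to predict
    {"age_over_65": 1, "night_release": 1, "lives_alone": 0, "revisit": 1},
    {"age_over_65": 1, "night_release": 0, "lives_alone": 1, "revisit": 1},
    {"age_over_65": 0, "night_release": 1, "lives_alone": 0, "revisit": 0},
    {"age_over_65": 0, "night_release": 0, "lives_alone": 1, "revisit": 0},
]

features = ["age_over_65", "night_release", "lives_alone"]

def score(feature):
    """Fraction of patients whose feature value matches the outcome."""
    return sum(p[feature] == p["revisit"] for p in patients) / len(patients)

# Keep the two most predictive features for doctors to focus on.
top_features = sorted(features, key=score, reverse=True)[:2]
print(top_features)
```

Production systems use far more sophisticated scoring than this one-feature-at-a-time check, but the goal is the same: reduce thousands of candidate features to a small set that actually carries predictive signal.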
So what is wrong with using similar technology for the EVI?
To start with, even the best of these learning systems function at significantly less than one hundred percent accuracy. In the case of the hospital, while it would be nice if we could find all the patients who might come back, it is considered enough just to help doctors do better.
If the revisit rate were cut down, that would help many people, and that would be a great outcome, even if it wasn’t perfect. In the case of the EVI, however, people requesting asylum in the US may be in extreme danger if they aren’t admitted. What error rate is considered appropriate there?
But the problem is worse than even that.
The true challenge for these machine-learning algorithms is what is called “data skew”: the more lopsided the split between categories, the larger the error rate is likely to be. If one in a hundred patients is a revisitor, the system will do better than if that number is one in a thousand.
If it is one in a million, the system could degrade to essentially random guessing; there is just not enough data to go on. Consider what would happen if the hospital had had only one revisitor, and his name was Fred. Would we want the doctors to assume everyone named Fred should be admitted?
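The Fred scenario can be simulated in a few lines (with made-up numbers, not real hospital or DHS data). With one positive case in a million, a classifier that simply predicts “not a revisitor” for everyone scores almost perfect accuracy while never finding the one case that matters:

```python
import random

random.seed(0)

# Simulate a heavily skewed dataset: one positive ("revisitor") per million.
# These numbers are illustrative, not drawn from any real data.
n = 1_000_000
labels = [0] * n
labels[random.randrange(n)] = 1  # a single positive example: our "Fred"

# A classifier that always predicts the majority class ("not a revisitor").
predictions = [0] * n

accuracy = sum(p == y for p, y in zip(predictions, labels)) / n
positives_found = sum(p == 1 and y == 1
                      for p, y in zip(predictions, labels))

print(f"accuracy: {accuracy:.6f}")            # looks near-perfect...
print(f"positives found: {positives_found}")  # ...but catches no one at all
```

This is why raw accuracy is a meaningless yardstick under extreme skew: a model can look excellent on paper while doing nothing useful on the rare category it was built to find.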
The EVI would be looking at data in which very large numbers of people have been admitted to the US, with few (if any) examples of people who have actually committed terrorist acts once here. The situation would be similar to the patient named Fred: lots of data, very few instances of the thing being predicted. In technical terms, the skew would defeat the purpose, and the system would behave largely at random.
There are other technical issues outlined in the letter, but learning in the presence of skew is one of the key ones. As the letter explains, “on the scale of the American population and immigration rates, criminal acts are relatively rare, and terrorist acts are extremely rare.” That is, the chances of a learning algorithm performing well are very, very low. It is an inappropriate use of AI technology.
And it is likely to negatively impact many people without any significant benefit to public safety.
This is what caused the authors of the letter, and those of us who signed it, to conclude:
Data mining is a powerful tool. Appropriately harnessed, it can do great good for American industry, medicine, and society. And we recognize that the federal government must enforce immigration laws and maintain national security. But the approach set forth by [DHS] is neither appropriate nor feasible.
The EVI is ill-conceived and it should be abandoned.
There’s another reason for concern.
It is important to realize that in this case, a policy decision is being made by those who do not understand the details of the technology they are proposing to use.
AI isn’t some magic bullet. There’s no magic to it at all. That’s why it is critical to our society that we stay abreast of these technologies and what they might mean for our policies, laws and proposed safeguards. We need to make some serious and well-informed choices about AI deployment in the coming weeks and years …
For RobotRepublic, I’m Jim Hendler.