British Police Built a Sprawling Crime-Prediction Machine. Some Results Couldn’t Be Trusted

By Staff

The integration of artificial intelligence into public services—specifically within law enforcement—was promised as a leap toward efficiency and proactive safety, but the reality behind the curtain suggests a far more problematic landscape. When Avon and Somerset Police opened their archives to investigative scrutiny, what emerged was a fragmented mess of predictive tools that were arguably failing the very people they were meant to support. From predicting who might go missing to assessing the likelihood of criminal behavior, these models were designed to guide human intuition. However, as independent auditors at Eticas discovered, the algorithms were frequently “hallucinating” risks, flagging ordinary citizens with a level of inaccuracy that would be unacceptable in almost any other professional sector.

The most jarring discovery involves how poorly some of these early algorithmic efforts performed. One model tasked with identifying potential burglars operated for years with a precision rate of less than 10 percent. In practical terms, this meant that of all the people the computer flagged as high-risk, more than 90 percent were flagged wrongly. These aren’t just abstract data points; they represent real people being subjected to systemic suspicion based on the output of broken software. When auditors noted that the performance metrics for these models were shifting wildly, rising and falling without any clear rhyme or reason, they concluded that this wasn’t just a “glitch,” but a hallmark of a system that lacked basic governance and operational integrity.
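To make the precision figure concrete, here is a minimal sketch of the arithmetic. The counts are hypothetical, chosen only to land below the 10 percent mark described above; they are not taken from the audit, and the function name is ours.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of flagged people for whom the flag turned out to be correct."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no one was flagged, so precision is undefined")
    return true_positives / flagged


# Hypothetical counts (not from the audit): 100 people flagged, 9 correctly.
flagged_correctly = 9
flagged_wrongly = 91

p = precision(flagged_correctly, flagged_wrongly)
print(f"precision: {p:.0%}, wrongly flagged: {1 - p:.0%}")
```

Precision only describes the people who were flagged. It says nothing about how many actual offenders the model missed, which is a separate measure (recall).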

The official response from the Avon and Somerset Police adds another layer of concern: institutional opacity. While the force claims that many of these problematic models were never actually “deployed” into the field, they failed to account for why these systems were left running to generate audit data for years, if not to inform their decision-making. The police force’s defense—that these were merely automated processes and that experts were reviewing the output—rings hollow in the face of the evidence. When pressed on the existence of an ethics committee intended to safeguard against bias and harm, the force admitted that no meetings had actually taken place. Their reasoning? They claimed they hadn’t produced any models that required an ethical review, a statement that seems at odds with the flawed data they were actively generating.

This lack of internal oversight bleeds into the issue of algorithmic bias. In a move that feels performative at best, the police presented a “bias check app” as evidence of their commitment to fairness. The app simply compared average risk scores between white individuals and people of color, finding “no significant difference.” Independent auditors were quick to dismantle this, noting that comparing averages is a superficial exercise that hides the true, granular nature of discrimination. Failing to test for socioeconomic status, gender, or specific intersectional harms isn’t just a technical oversight; it’s a fundamental failure to protect vulnerable communities from being unfairly targeted by an automated, uncaring hand.
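The auditors' objection to comparing averages can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the two groups, the scores, the cut-off of 70, and the choice of false positive rate as the alternative check. It does not reproduce the force's app or the auditors' method.

```python
from statistics import mean

# Hypothetical records: (risk score from 0 to 100, did the predicted event occur?)
group_a = [(50, False), (50, False), (50, True), (50, True)]
group_b = [(90, False), (90, True), (10, False), (10, True)]

THRESHOLD = 70  # assumed cut-off above which a person is flagged


def mean_score(records):
    """Average risk score, the only thing an averages-only check looks at."""
    return mean(score for score, _ in records)


def false_positive_rate(records, threshold):
    """Among people for whom the event did NOT occur, the share flagged anyway."""
    negatives = [score for score, occurred in records if not occurred]
    if not negatives:
        raise ValueError("no negative cases, so the rate is undefined")
    wrongly_flagged = sum(1 for score in negatives if score >= threshold)
    return wrongly_flagged / len(negatives)


for name, records in (("A", group_a), ("B", group_b)):
    print(name, mean_score(records), false_positive_rate(records, THRESHOLD))
```

In this toy data both groups have the same average score, so an averages-only check would report no difference, yet only group B contains people who are wrongly flagged. A fuller audit would repeat checks like this across other attributes and their intersections.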

The deeper, more human danger here isn’t just that the computers are wrong; it’s the effect the technology has on the judgment of the people using it. Law enforcement officers, often overwhelmed and under pressure, may find it nearly impossible to ignore a “risk score” displayed on a screen. Even when an officer’s own intuition is better, there is a tendency, often called automation bias, to trust the machine, turning human judgment into a rubber stamp for cold, often incorrect, calculations. As former practitioners have noted, there was a genuine desire to improve outcomes, but the lack of capacity and rigor meant that these tools became a crutch that potentially clouded the real-world, boots-on-the-ground intelligence of human workers.

Ultimately, we are left with a cautionary tale about the limits of predictive technology in the public sphere. These systems are still in operation, being used by authorities to judge everything from a child’s educational trajectory to an adult’s likelihood of reoffending. When a system intended to assist human decision-making is only accurate one-third of the time, it ceases to be a tool for justice and becomes a source of administrative chaos. Until there is radical transparency, strict accountability, and a willingness to prioritize human nuance over high-speed processing, the use of AI in policing will remain a dangerous experiment that treats human lives as nothing more than experimental data points.
