Madan Thinks

Opinions, Context & Ideas from Me

Perspectives on AI-based Facial Recognition in Law Enforcement

Amit Shah, Union Home Minister of India, addressed Parliament early last month on the Delhi riots in his usual belligerent style. Slipped in between the rhetoric was a short comment alluding to the Govt’s deployment of automated Facial Recognition Technology (FR) to identify ~1,200 rioters. This comment piqued my interest, & I thought it highly consequential. Immediately, my thoughts went back to a chat I’d had with a close aide to a Sr. Minister in our state govt., where he talked of the recent purchase of 16,000 CCTV cameras whose potential worth in the field would only be realized once the National Crime Records Bureau implemented its proposed Automated Facial Recognition System (AFRS); its tender made for useful reading, though the details were basic.

While thinking thru the Home Minister’s pronouncement, myriad thoughts went thru my mind – would the Govt. in the future perform live surveillance via FR in public places? Will our yet-to-be-introduced data protection law provide for privacy protection? Does the Govt.’s roadmap include integrating its FR-reliant policing schemes with the Aadhaar Unique National ID scheme (esp. in light of claims of its security being compromised & the attendant embarrassments)?

While I am wholly in the dark about the Govt’s views on these most pertinent questions, I am sure the authorities would have been impressed by the Chinese police’s capture of an economic offender via FR in a stadium of 60,000 concertgoers!

Does Facial Recognition Technology Trump the Human Eye in Accuracy?

As an advocate & student of technology, I continue to marvel at the amazing progress in Computer Vision that has led to today’s sophisticated AI-driven FR –

  • The epoch-making infusion of neural networks into AI by Geoff Hinton’s team & the subsequent commoditization of related applications/products by Google, Amazon, Baidu, Tencent, Facebook, Alibaba, Microsoft et al.
  • Supercharged data-processing capabilities rendered by big data technology & distributed computing architectures/algorithms
  • The easy availability of vast, real-life data that has been exploited by the likes of Facebook & Baidu to build sophisticated Machine Learning models for AI-driven FR (we’ll expand on this later in the piece).

To understand the prowess of AI-driven FR, we need look no further than the results of the ImageNet challenge, the flagship global AI Computer Vision contest, where systems now outperform human vision at identifying images at astonishingly high rates.

The National Institute of Standards and Technology (NIST), a wing of the US Commerce Dept., is well known in technology circles for its Face Recognition Vendor Test (FRVT), which it has operated for well over a decade. One of its various tests focuses on FR products catering to the law enforcement domain. In 2010, the best technology in this space could identify someone in a collection of 1.6 million photos about 92% of the time. In the latest 2018 run of the test, the best technology accomplished this with an accuracy of 99.7% – a nearly 30-fold drop in failure rate. The winner of this latest edition was Microsoft, who celebrated the milestone in an articulate blog post.
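(A quick back-of-the-envelope check of that “nearly 30-fold” claim – the two accuracy figures are the ones quoted above; the arithmetic is mine:)

```python
# Back-of-the-envelope check of the "nearly 30-fold" claim, using only
# the two NIST accuracy figures quoted above.
accuracy_2010 = 0.92    # best law-enforcement FR in NIST's 2010 test
accuracy_2018 = 0.997   # best performer (Microsoft) in the 2018 run

failure_2010 = 1 - accuracy_2010   # 0.08  -> 8 misses per 100 searches
failure_2018 = 1 - accuracy_2018   # 0.003 -> 3 misses per 1,000 searches

print(f"Failure rate fell ~{failure_2010 / failure_2018:.0f}-fold")  # ~27-fold
```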

But FR’s accuracy & consistency in the field today are far more modest than in these controlled-environment tests. While the technology has been commoditized, making it cheaper, our sense of its effectiveness is shaped by the performance of a few top vendors. What the NIST & other results don’t publicize is the steep drop in quality of FR products from anyone other than those top vendors. I know for a fact that some Indian states have made forays into FR-based policing, but these have been baby steps at automated image matching, leveraging nascent home-grown AI startups. For instance, one state police’s facial recognition application couldn’t differentiate between a boy & a girl, & that dept.’s trial had a 2% success rate.

Kai-Fu Lee, in his beautifully crafted book AI Superpowers, explains that this gulf in quality between AI products from the very top vendors & everyone else is only going to widen. Hence my first concern is affordability – how many law enforcement agencies can afford these top-dollar FR products (cutting corners here would have dangerous consequences)? And would law enforcement agencies become dependent on these vendors, with the vendors’ influence over the agencies growing in turn?

Of False-Positives & False-Negatives

A second concern that must be taken on board is the false positives this technology can generate & their effects. Take the Welsh police’s experiment at the Champions League final in Cardiff in 2017 (a game I gleefully enjoyed thanks to the Madrid win). The police ran live surveillance on the attendees, some 67,000 of them, logging 173 accurate face matches while incorrectly flagging a whopping 2,297 people as suspicious – a false-positive rate of roughly 92%.
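For the record, here is the simple arithmetic behind that headline figure – strictly speaking it is the share of all alerts that were wrong, which the press rounded to ~92%:

```python
# Arithmetic behind the Cardiff headline figure, from the numbers above.
true_matches = 173    # alerts that turned out to be genuine
false_alarms = 2297   # innocent people incorrectly flagged

# Share of all alerts that were wrong (what the press called the
# "false-positive rate"; strictly, the false discovery rate).
share_wrong = false_alarms / (true_matches + false_alarms)
print(f"{share_wrong:.1%} of all alerts were false")  # ~93%, widely quoted as ~92%
```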

By false positives, I mean the police’s FR incorrectly identifying an innocent Sundar as the wanted Satya – while accepting the less damaging reverse case of false negatives, in which Satya goes unidentified even though he’s in the criminal database. False positives may be tolerable when FR & Computer Vision are used in other domains; in this field, a mistake leading to a criminal charge or even a mistaken arrest can have lifelong, irreversible consequences for the individual & his/her family. This bears emphasis esp. in the context of the Indian judicial & criminal justice systems, with significant issues such as lengthy under-trial detention, understaffed investigative agencies et al.
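To make the asymmetry concrete, here is a minimal sketch of the four possible outcomes, using the same hypothetical Sundar & Satya:

```python
# The four possible outcomes when an FR system compares a passer-by to a
# watchlist. Sundar & Satya are the hypothetical names from above.
outcomes = [
    ("Satya (in the database)", "flagged",     "true positive"),
    ("Sundar (innocent)",       "flagged",     "false positive - the damaging case"),
    ("Satya (in the database)", "not flagged", "false negative - less damaging"),
    ("Sundar (innocent)",       "not flagged", "true negative"),
]
for person, decision, verdict in outcomes:
    print(f"{person:<24} {decision:<11} -> {verdict}")
```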

Suresh Venkatasubramanian, a Computer Science expert at the University of Utah, says, “From a government’s point of view a dragnet that catches a lot of extra people from which they then filter out what they’re interested in might be considered as working & might not cost them too much. But from your point of view, if you’re caught up in one of these false-positive dragnets, that might not seem like it’s working. Normal error rates would suggest that you’re going to get a lot of hits if you just indiscriminately take a lot of people’s faces & run them against your database. Versus the other way around where you target your search for a specific person & try to match that person’s face to the crowd. There are subtleties in how a system is deployed versus how it was trained. We often see a failure mode for the use of algorithms in these systems where they’re trained for one thing, but they’re being used a slightly different way & that causes problems.”

Sundar, Satya, Woody Harrelson & Machine Learning

Suresh’s views dovetail nicely with my third concern – the data & methods used to train the FR’s Machine Learning models; this is central to the outcome or, putting it another way, to saving innocents from getting caught in the false-positive net. Let me explain. The Machine Learning embedded in FR systems does not rely on traditional methods, where software is built from a complex amalgam of hand-written logic that helps the system identify/match/classify. Machine Learning fundamentally changes the paradigm from a logic problem to a statistical problem. Instead of embedding logic into a system on how to recognize a photo of, say, Satya, you take a hundred examples of Satya’s photos (from various angles, lighting, looks…) & a hundred thousand examples of not-Satya photos, then use a statistical engine to train a model that can tell the difference with a certain probability. You then give the system a photo & it tells you whether it matched Satya or not-Satya & with what probability. Instead of you telling the system the logic, the system works out the logic from the data, the answers (‘this is Satya, that is not-Satya’) you give it & the statistical framework you establish to govern the training. This approach comes with two undeniable risks (a toy sketch follows the list below):

  1. The quality of the outcome you get is purely a function of the quality/variety/detail of the data with which you train the system – ‘garbage in, garbage out’. The system is just making a statistical comparison of the data sets fed to it in order to build its logic. So again – what is the quality of your data set? Does it have enough volume? How was it selected? Are there discrepancies or aspects in your data that have nothing to do with the people, carry no predictive utility & yet massively affect the outcome – see the AI use case of skin cancer detection?
  2. This takes me to my other concern – the probability of the match. Machine Learning doesn’t give binary answers; it provides answers by degrees of confidence – ‘maybe’, ‘probably’. Take, for instance, mugshots of thugs – I can safely conclude that pictures of Indian thugs will feature an ensemble of large gold chains & other jewellery on their person (going by the skin cancer case, our Machine Learning system risks becoming a gold-chain recognizer rather than a thug detector). If we train an FR system on mugshots of such criminals, then feed in a photo of a good Samaritan wearing a chain & check for matches (taking care to use a relatively low confidence threshold), the system is likely to render a high probability that this person is a thug! This is not me trying to sabotage an established AI methodology – a Gizmodo report last year suggested that police in Washington County tweaked their FR system, built on Amazon’s pioneering Rekognition platform, beyond Amazon’s recommendations, including lowering the confidence threshold for a match to below 99% in an effort to ‘cast a wider net’.
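To ground all of the above, here is a toy, self-contained sketch of the statistical approach I described – synthetic “face embeddings” stand in for real photos & scikit-learn’s logistic regression stands in for a production FR model; every name & number here is an illustrative assumption, not any vendor’s actual pipeline:

```python
# Toy sketch of the statistical (not rule-based) matching described above.
# Real FR systems use deep neural embeddings; random vectors stand in here
# so the example stays self-contained & runnable.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Pretend every photo has already been reduced to a 128-d "face embedding".
satya = rng.normal(loc=1.0, scale=1.0, size=(100, 128))          # 100 Satya photos
not_satya = rng.normal(loc=0.0, scale=1.0, size=(100_000, 128))  # everyone else

X = np.vstack([satya, not_satya])
y = np.array([1] * len(satya) + [0] * len(not_satya))  # the "answers" we supply

# The statistical engine works out the logic from the data & the answers.
model = LogisticRegression(max_iter=1000).fit(X, y)

# A new photo comes back with a probability, never a binary yes/no.
new_photo = rng.normal(loc=1.0, scale=1.0, size=(1, 128))
p_match = model.predict_proba(new_photo)[0, 1]

# The deployer picks the confidence threshold. Lowering it "casts a wider
# net" - & manufactures false positives, as in the Rekognition story.
for threshold in (0.99, 0.80, 0.50):
    print(f"threshold={threshold:.2f}  match={p_match >= threshold}  (p={p_match:.3f})")
```

Note that nothing about the “match” changes except the threshold – precisely the knob that was turned to ‘cast a wider net’.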

The perfect amalgam of the concerns I’ve talked about came together in a situation involving the famed NYPD. Georgetown University’s Law Center published a report stating that NYPD detectives were “getting creative” with FR, feeding sketches or celebrity photos they judged to look similar to the person of interest into their FR systems. In one case, photos of actor Woody Harrelson were fed into the system & the results were used to “apprehend a suspect”. This may be standard policing practice, but the astute reader of this piece will understand that such actions destroy the effectiveness of the Machine Learning model & produce dangerous false positives. The design & training of Machine Learning systems requires careful assessment & a long-term strategy, as explained in this Google paper on Human-Centric Machine Learning design.

A related aspect: top FR vendors have been remarkably nimble in managing their offerings (data sets, algorithms, capabilities) & have addressed feedback quickly, even going as far as turning off features. The lumbering law enforcement agencies must originate/adopt new processes to leverage this technology effectively. A specific area of focus must be inter-agency interlock issues, which have always been a bugbear of Govt. agencies even in the most developed of nations.

Prejudice Baked into Software

The fourth aspect is the element of bias that infiltrates the FR’s Machine Learning model. NIST, whose tests carry such significance & publicity, called out demographic bias as a major challenge in a report published a few months ago. Its conclusions were based on a study of 189 FR algorithms from 99 developers in this domain. Independent researchers outside Govt. organizations have also found that societal biases, such as racial prejudice, are reflected both in the data used to train facial recognition models & in the algorithms themselves. And one of my favourite data analytics magazines – FiveThirtyEight, which has answers to almost everything – is also flummoxed by this question of bias in FR.

Orwellian, Some Say…

Lastly, the question of privacy, consent & the legal coverage of these methods. Authorities who favour immediate deployment of this technology, esp. in an aggressive sense, point to successes in China. Kai-Fu Lee puts it rather bluntly in AI Superpowers: the Chinese effectively trade off data privacy for features & cheaper software. But the Chinese also have no choice when it comes to the state’s interference. A primary concern of mine is consent. We provide consent for face recognition-based features at airports (Singapore, Bangalore & HK), & I’ve heard of its use in US banks. But where is the consent when live surveillance & monitoring happens in public places to identify persons of interest in the moment? And this live surveillance is already happening:

  1. In the Chinese city of Guiyang, a city of 4.3 million, a BBC reporter was flagged for arrest within seven minutes of police adding his headshot to a facial recognition database.
  2. Police in the liberal cities of Chicago & Detroit have run live surveillance of their citizens since 2016, possibly in contravention of state & federal law.

The cities of San Francisco & Oakland have since banned the use of FR in policing, while cities such as Plano in Texas have embraced it.

While thinking thru this, I’m unsure whether the proposed Indian use of FR in policing will come into effect at an overarching national level (the Chinese model) or as a smorgasbord of regional surveillance models for lack of a common law (the US model – read this). I would hope the Indian Govt’s policies align more closely with the EU model (GDPR et al.), which places great emphasis on data protection. A friend of mine in the legal fraternity quoted a famous saying in his clan – “Better regulated tomorrow isn’t a promise for today”.

I admit to not having a strong base of knowledge from which to opine on the legal aspects. But speaking to friends in the legal fraternity & at the University where I teach, I have been told that Indian privacy & data protection law is next to non-existent, & that the National Data Protection Bill currently under review has been widely criticized for granting the state enormous power over its citizens’ data – an Orwellian state, as one legal expert remarked (though I admit the matter is sub judice & I wouldn’t jump to conclusions based on reporting).

As I conclude, I go back to discussions in the late 90s, when our Indian taxation processes were being digitized. There was anxiety aplenty amongst our elders – what would happen if these systems didn’t work? What if the database held bad data (a popular saying went that if the Govt got the spelling of your name wrong in your return, it was easier to change your name)?

Computer Vision built on the foundations of AI is a generational technology that drives us closer towards Artificial General Intelligence. I draw upon my own experience at Accenture, where we reimagined an entire client business process in the network construction space via the power of AI-driven Computer Vision & delivered tremendous outcomes.

We face the same fears with FR, but with magnified complexities, a self-reinforcing cycle of establishment voices & a domain that touches a multitude of dimensions…the very fabric of society. Just as with the introduction of databases & digitization to Indian taxation, the use of FR in policing is an absolute aid to the job – but one that must be exercised with proper controls, the installation of appropriate legal protections & due regard for the practical issues of this still-nascent technology.

“Don’t throw the baby out with the bathwater” – but then again…don’t rush in where “angels fear to tread”.

 

Related Reading

The following reads interested me on this subject

The Uyghur faces & the monitoring of them

Google on responsible AI practices

The secret history of Facial Recognition

Megvii, the Chinese FR AI giant’s IPO prospectus (I’ve only read serialized summaries of it myself, & they made for excellent reading)

The evolving situation of Facial Recognition data sets

Image Credit

  1. Getty Free Images

And thanks to my frequent intellectual collaborator, Sandeep Rajagopal, for his efforts in peer reviewing this piece.

Note – The opinions reflected in this piece are my own & do not in any way reflect the opinions of the firm I am employed with or any other entity.
