By James Flint

Three voice AI assistants inhabit my house like truculent ghosts, bored by haunting. My wife uses Siri, in its bronzed Aussie surf-dude incarnation (hmm), to update her shopping lists and tell her where her phone is. It amuses her that her phone assistant is sexier than I am and somewhat more dependable.  

My kids use Alexa to get them out of bed in the morning and play them music when they go to bed at night. It amuses them that they can obey Alexa while ignoring all instructions from their parents. 

And I use Google Assistant to turn the kitchen light on and off, which it usually manages on the fourth attempt, and occasionally to check the weather, though as with most things I usually find it quicker, easier and less hassle to just look at my phone. I do not find this amusing. 

No one uses Cortana. I have, though, recently acquired yet another voice assistant, this time in my car.  

We use these tools, even though they aren't really particularly helpful, because when they first came out it was fun and exciting. After decades of failed attempts at computer speech recognition and production, the new approach of natural language processing (NLP), i.e. training neural nets to do the job rather than relying on traditional techniques of codified symbol processing, had made science fiction real, and it was finally possible for us to talk to our machines.

And we use them despite my misgivings about the danger that having the representatives of three mega-corps eavesdropping on pretty much all of the activities in our household poses to our privacy. Indeed, I was so worried about this that in my last job, as CEO of a health-data compliant messaging app, I spent around six months trying to win a sizable grant to build an assistant into our product that would do the natural language processing on the phone to protect the data of its users, rather than send it back to a server in Someplace, Indiana, for casual abuse by whoever happened along. 

Unfortunately, we didn’t get the grant because we couldn’t convince the reviewers – health professionals all – that the data privacy issues involved were all that important. They were less worried about them, weirdly, than my wife and kids.  

But privacy was (and is) a big deal, so much so that Google and Apple themselves began to bow to public pressure and start device-side language processing as soon as they were able to develop chips powerful enough to handle the computational load (Google’s Tensor chip, in the Pixel 6, and Apple’s A11 with neural engine, in the iPhone X). 

This has been an improvement, but many problems remain. Not all devices have the architectural elegance of the best flagship phones; privacy terms and permissions are inherently difficult to present and collect on voice-first interfaces; third-party “skills” (the voice equivalent of apps) introduce a whole gamut of security and data protection vulnerabilities… the list goes on. 

Because nothing is ever forgotten on the Internet, thanks to my failed attempt to build a better AI assistant back in 2018 (and because I’m now a full-time privacy professional despite my house full of spyware) I was recently asked to contribute to a cross-disciplinary research project conducted by the Departments of Informatics, Digital Humanities and The Policy Institute at King's College London and the Department of Computing at Imperial College London in collaboration with non-academic partners including Microsoft, Humley and Mycroft. 

Over the last year or so SAIS, as it’s called (it stands for Secure AI assistantS), has been investigating these issues and is now presenting its findings in a series of blogs and podcasts. You can find these on the SAIS website, here: SAIS - Secure AI Assistants. Episode 2 of the podcast (in which, excitingly, I feature) has just come out. And the SAIS team will be speaking about the findings on a panel at London Tech Week on 13th June, if you fancy coming down.  

I’ll be there too and, in the run-up to the event, I’m going to be writing a bit about how the world of the AI Assistant has just been completely upended by the recent explosion of ChatGPT and its siblings onto the scene. What do these developments mean for privacy? What do they mean for voice assistants? And will I finally be able to turn on my kitchen light without having to repeat myself four times? Find out here, on the Securys blog, shortly after Easter. 

Image: generated using Dall-E.