I can see blue patches in the sky today.
It’s Friday, January 27, 2023 in Portland, Oregon and I am incredibly into listening to this Kylie Minogue cover of Bette Davis Eyes and you can’t stop me.
Over on the Orange Place, I saw a post from Austin Henley, who works on AI at Microsoft. Henley’s point is one that I broadly agree with: that natural language text/speech interfaces are lazy ones[1] when dealing with the potential of what large language models like ChatGPT and others can do.
Henley says the potential of LLMs is that they
could feed the relevant context to the model behind the scenes and use that to preemptively suggest what I should do next. The toolbar could adapt to my specific task. Dialog boxes wouldn’t have to be so static. I could point to a region of the screen and ask for an explanation. It could identify a misunderstanding before I do.
Which is all true! I think the thing many people are missing (and that Henley is eliding here) is that supplying that extra context to the application necessarily involves, uh, measuring and recording the context. In other words, more surveillance.
This is not a big deal in the abstract, but we don’t live in the abstract. We live in a world where the majority of models for investing in creating software at the scale needed for large language models involve, let’s say, questionable practices, not least of which is the whole practice of surveillance capitalism.
To be really useful, to offer you personalized recommendations and context-specific prompts, a model would probably want to know things like:
Because, the thing is, those are things that a human would need to know in order to give you good recommendations. And a human wouldn’t have the perfect recall that a computing system has the capacity, or at least the theoretical ability, for.
And not just purchase history: if you want good recommendations, recommendations that aren’t out of left field, then you also want explicit preference history.
It is so alluring thinking of all the possible data and contextual information you could suck up and ingest for a system in order to provide better and better suggestions. After all, at some point the map of you becomes indistinguishable from you. And is that okay?
Again, the missing part here includes the economic, political, regulatory and broader societal context. The pressure, right now, to Make Money From That Data is off the charts. The pressure to steal that data, maybe too! Our information security practices for dealing with that data securely, whether experimental, never mind in production, are surely primitive at best. And of course this data will be subject to subpoena.
Yes, you can design models that do on-device learning and restrict data to the device. But even then we’d have to come to grips with the tradeoff: if you want Alexa or Siri to be more useful, then it would have to be listening all the time. All of it. More data, right? Remember, if this data is on your phone, my understanding is that it’s up for grabs in the U.S., which includes all of those coastal territories subject to DHS jurisdiction.
So, I don’t know. What if you throw away precision? What if you forget over time, or what if the information becomes foggier over time? What if you smooth it all out?
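Here is a minimal sketch of what "foggier over time" could look like for one kind of context, a location reading. Everything in it is an assumption made for illustration: the `FOG_SCHEDULE` thresholds, the `fog_location` name, and the idea of using coordinate rounding as the "fog". It is not how any real assistant handles this.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Hypothetical policy: (maximum age of a reading, decimal places to keep).
# The precision figures are rough values for degrees of latitude.
FOG_SCHEDULE = [
    (timedelta(days=1), 4),    # roughly 11 m: "which building"
    (timedelta(days=30), 2),   # roughly 1 km: "which neighborhood"
    (timedelta(days=365), 0),  # roughly 111 km: "which region"
]


def fog_location(
    lat: float, lon: float, recorded_at: datetime, now: datetime
) -> Optional[Tuple[float, float]]:
    """Return a location that gets coarser as it ages, or None once forgotten."""
    age = now - recorded_at
    for max_age, places in FOG_SCHEDULE:
        if age <= max_age:
            # Rounding throws away precision; older readings keep fewer digits.
            return (round(lat, places), round(lon, places))
    # Older than every threshold in the schedule: forget it entirely.
    return None
```

One caveat: this only fogs the value when it is read. For the forgetting to mean anything, the stored copy would have to be overwritten with the coarser value too, otherwise the precise record is still sitting there waiting for a subpoena.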
The thing is, we don’t know, and the instinct and impulse will be to know more because these machines are built for processing and we just want them to process and analyze and synthesize and model more, more, more, more.
We aren’t ready, and it’s going to happen anyway. Siri recommendations already use information about how you use your device to provide more context-specific recommendations when you make a search; they already use time and location data to recommend actions.
I don’t think it’s hyperbole to say that your personal assistant needs the model of you: it needs, and can also record, your personal record.
Fictional examples of such assistants like Jarvis are modeled on butlers like Alfred Pennyworth. Pennyworth witnessed Mr. Wayne’s life from birth, and knows all of his secrets. Alfred’s okay though, because he’s English and discreet.
Are our assistants discreet? Do they leak? Would they leak? We have evidence that they do already. In fact, they don’t so much leak as actively snitch, if we’re talking about simple things like vision-enabled doorbells.
That’s the superpower of Alfred Pennyworth: he’s a panopticon we invite in, but a panopticon we like: English, and discreet.
That’s it! This one was a short, fast one, and I’m pretty happy with where it ended up and the closing para. I did not know it was going to end there.
(Kylie just segued into Carly Rae Jepsen’s I Really Like You and wow am I in a poppy mood right now)
How are you doing?
[1] Natural language is the lazy user interface, Austin Henley, austinhenley.com, 27 January 2023