s18e04: What Are We Doing Here, Exactly?; LLMs Are Playful, Actually
0.0 Context Setting
It’s the afternoon of Wednesday, March 27, 2024, in Terminal D at ATL airport, after helping out a new client with their 1.5-day workshop.
I now know that I do not like hotels that are designed like you’re on a cruise ship, i.e. your suite looks out into a giant atrium and, for some reason, the architecture has the side-effect of continuously fooling you into thinking there’s a pool right outside your door, thanks to a combination of a (hidden?) water feature and the acoustics of that atrium.
0.1 Events: Hallway Track, and Pulling the Cord
Hallway Track is still on hiatus.
Pulling the Cord, my plain-speaking guide to stopping traditional technology procurement, will have its next test in the week of 8 April.
I've had some great feedback that confirmed some of my hunches, and even better feedback that brought new things to my attention. More news about what's changing, and signups, soon.
1.0 Some Things That Caught My Attention
Two big, and different, things today.
1.1 What are we doing here exactly?
So. I have been roped in to help with some Big System Modernization work, and one reason why it’s super interesting this time is that it involves something like an industry association, not of manufacturers/vendors but of buyers, and those buyers more or less saying: “well, we’d like things to change a bit, and we’re going to try doing it together”.
It helps that this not-industry association has been given some money to go start figuring this out.
One of my favorite questions to ask is “what for?”, so much so that I occasionally make fun of it on whichever social network I have splintered my personality onto at that particular minute. “What for?” is in the same space as the “so that?” part of a user story, and the shitposting version of that question is also in the space of “but is it, tho?”
Asking “but what for?” is a way to pull back from early optimization and a premature focus on a particular strategy (e.g. modernization), to make sure that people actually, you know, know why they’re here to do whatever thing in the first place.
It’s a good time to remind people that “technology” is a “tool” for people to get things done, and that the things that get done are priorities set for people. It’s a tool for orienting people towards outcomes, not outputs, again. Modernization is an output, but to what end?
“But what for?” is also a way to get really sharp on the story you need to tell for whatever it is you’re trying to achieve, and to get support for whatever tactics you think you need to get there, because precisely zero percent of the time do you need to persuade nobody else to go do the thing. Literally not possible. At all. Fight me on it, I don’t care.
So I’ll talk through the whole “modernization” deal: there are all the reasons to trot out why you want to modernize a thing. It’s old, it’s brittle, it costs a lot, you can’t do the things that you want to with it, and every time you want to do a new thing with it, it’s super expensive and takes a long time.
Sure, those are totally reasons to modernize a thing, but they’re a bit too... well, myopic isn’t exactly the word. They are not reasons that other people who make decisions about what you’re allowed to do care about. They just don’t. You haven’t connected this modernization -- whatever it is -- to value, which again is a word I am not a fan of, because it’s a placeholder word. It’s like, sure, this isn’t a bad word, but it is also a word like content: it surely signifies something, but it’s so, so much more helpful for you to be more specific.
If you’re going to tell me that you’re delivering value then I know that’s a proxy statement, a placeholder for “I totally know the outcome we’re aiming for, and I can totally defend its value because I went out and did a bunch of user research and so on”, which is a thing that has happened at least some digits of a percentage of the time.
There was a great contribution at this workshop where one of the participants figured this out and was able to express it like this: what’s the problem we’re solving here? There’s an entire audience of people, decision makers, who are very much invested in solving certain problems, and they need something clear, simple, crisp that will sell this approach to them. And also because you want to know that you’re solving a real problem.
In this particular case, it was something along the lines of:
There will be another crisis. The systems we’re talking about did not weather the last crisis well. It was hard to adapt them, they fell over, they did not handle the complex and context-specific rules for that last crisis well, or those rules were hard to implement using the tools and systems we had. We need to solve the problem of adapting to the next crisis better, and there’s a window -- that’s closing -- right now of making sure we can do that.
Now, “modernization” is totally a strategy (or tactic?) you could choose to use to achieve that goal, or solve that problem. There are different ways you might go about it! You might want to unpack [sic] the problem in different ways, like illustrating that the current environment is dysfunctional because you can only go to, I don’t know, say three vendors or so to implement the Thing Addressing the Current Crisis.
I love it when people come out with these what-fors. It’s worth taking the time on them and making sure they’re clear to everyone.
One other thing that came out of this workshop that had me nearly jumping out of my chair in “I am writing this down and using it as an example everywhere now” mode was this system where they’d managed to extract themselves from an (expensive) enterprise document management system through this abbreviated process:
- they looked at what they were using of the enterprise document management system
- compared that to what the expansive enterprise document management system was capable of
- had a look at the complexity and cost of managing and using that enterprise document management system (i.e. that it was expensive)
- had a look at what they needed
then decided that, more or less, once they’d done some internal work, they could junk the thing and write their own service on top of whatever cloud provider’s blob storage.
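Purely for flavor, here’s a minimal sketch of the shape of “write your own service on top of blob storage”. None of this is their actual system -- the bucket name, key scheme, and helper functions are all hypothetical -- but the boto3 calls are real S3 API calls:

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "our-own-document-store"  # hypothetical bucket name


def put_document(doc_id: str, body: bytes, title: str) -> None:
    """Store a document as a plain S3 object with a little metadata."""
    s3.put_object(
        Bucket=BUCKET,
        Key=f"docs/{doc_id}",
        Body=body,
        Metadata={"title": title},  # S3 user metadata values must be strings
    )


def get_document(doc_id: str) -> bytes:
    """Fetch a document's bytes back out."""
    return s3.get_object(Bucket=BUCKET, Key=f"docs/{doc_id}")["Body"].read()


def list_documents(prefix: str = "docs/") -> list[str]:
    """List stored document keys (you'd paginate in real life)."""
    resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=prefix)
    return [obj["Key"] for obj in resp.get("Contents", [])]
```

That’s obviously nowhere near feature parity with an enterprise document management system, which is exactly the point: they’d worked out which boxes they actually needed checked.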
They did this, from a technical point of view, partly because it simplified matters. And it was cheaper. But I love this example because it is clearly in a sense easier to go buy (as part of a vendor’s solution, say) the thing that has the enterprise document management solution because of course duh it checks all the boxes. But you are also buying all the other boxes that exist that aren’t checked.
This isn’t new -- it’s the regular “ha, bet you only use n percent of y product” observation, and sure, there are costs and benefits to going your own way. But this was such a wonderful example.
Anyway, I digress. I don’t want to come across as saying that asking “but why, tho” makes everything easier. Ideally it helps make what you’re doing clearer (and easier to explain!) and helps set direction, but you still have to do a bunch of work picking and setting the right “but why, tho”, because you are going to get A. Lot. of answers for the But Why. A sort of Setting Goals is Hard, let’s go Enterprise Solution Shopping.
1.2 LLMs are Playful, Actually
Oh jeez where to begin with this one-slash-what a doozy.
Two things coming together in my mind here. Here’s the first:
So. Harper Reed has this writeup on his blog [1] about doing hilarious stuff with a homegrown smart building system, which is to say the combination of:
a) a space that had a whole bunch of Home Assistant sensor and output integrations (like, cameras, doors, switches, sensors, and speakers); and
b) an OpenAI account
Harper’s system takes JSON structured data emitted by his Home Assistant system (i.e. things like It is This Temperature at This Sensor, in a way other computers can totally understand and do things with), feeds it into OpenAI’s GPT-whatever model, which has enough parameters that it can Do Stuff With That JSON and write prose about it, because it’s been spammed with enough words written by not necessarily infinite monkeys, but certainly a lot of people not that far, evolutionarily, from monkeys.
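If you want a feel for the plumbing, here’s a minimal sketch -- not Harper’s actual code. The sensor payload below is shaped like a Home Assistant state object but the values are made up, and the model name is a stand-in for his “GPT-whatever”; the openai client call is the real chat completions API:

```python
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical event, shaped like a Home Assistant state object
sensor_event = {
    "entity_id": "sensor.studio_temperature",
    "state": "21.5",
    "attributes": {"unit_of_measurement": "°C", "friendly_name": "Studio Temperature"},
}

response = client.chat.completions.create(
    model="gpt-4",  # stand-in for whichever model Harper actually uses
    messages=[
        {"role": "system", "content": "Narrate building sensor events as short, wry prose."},
        {"role": "user", "content": json.dumps(sensor_event)},
    ],
)
print(response.choices[0].message.content)
```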
It is simple to say that OpenAI can generate a lot of text programmatically, I mean duh, that’s why we call this stuff generative AI. But!
Harper’s system takes the output of things like open vision models -- structured data like there’s a person with a gender (male), an age (30s), clothing (a white shirt), hair (a beard), and a height (tall), and so on -- and then spams that through OpenAI’s LLM to get prose like this:
I managed to detect a man interacting with modern technology. Let’s hope his browsing doesn’t lead him to discover how inconsequential we all are in the grand scheme of the universe
and
Looks like our male model in business casual traded standing for sitting. Riveting change. Now he’s “focused” at his desk with his laptop. Work must go on, I guess.
which I will say is objectively funny. It helps, I think, that you know this stuff is being generated by an LLM; that makes it even funnier, because I know the entire rickety structure it’s built on:
- a vision model that’s good enough, but that makes guesses or predictions that can also be totally wrong and carry inherent problems and biases
- the ability to “prompt engineer” and give an LLM examples of how, well, silly you want it to be
To be clear, I don’t see these as examples of text that’s “good enough” to be, what, good novels? Storytelling? But they’re certainly something and they certainly have a personality. Here’s a key part of Harper’s prompt:
Remember to use plain english. Have a playful personality. Use emojis. Be a bit like Hunter S Thompson.
Like, on the one hand, this is better than Shitty Elon Musk’s attempts to make his Grok AI bot “just like Douglas Adams”. I don’t know if this is because Harper’s example specifically stimulates my sarcastic British humor neural structures. It’s like there’s a mini shitty Charlie Brooker in Slack negging everything it can “see”.
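Mechanically, the trick is just gluing the vision model’s structured guesses to a personality prompt like that one. A hedged sketch -- the detection dict and model choice are invented for illustration, and only the quoted personality line is from Harper’s writeup:

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical vision-model output, matching the attributes described above
detection = {
    "gender": "male",
    "age": "30s",
    "clothing": "white shirt",
    "hair": "beard",
    "height": "tall",
    "activity": "sitting at a laptop",
}

# The system prompt carries the personality; this line is quoted from Harper's post
personality = (
    "Remember to use plain english. Have a playful personality. "
    "Use emojis. Be a bit like Hunter S Thompson."
)

response = client.chat.completions.create(
    model="gpt-4",  # model choice is a guess
    messages=[
        {"role": "system", "content": personality},
        {"role": "user", "content": f"Describe this scene: {detection}"},
    ],
)
print(response.choices[0].message.content)
```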
Second thing. Matt Webb has been Irritating Matt Webb and gone and done a thing (no, not even the AI-assisted Galactic Compass he’s developed as an iOS app), and that thing is his Poem/1, the rhyming physical clock [2].
It’s interesting that Matt first wrote about this pretty much one entire year ago [3].
Matt’s irritating because what he did was “just” [4]:
- prompt ChatGPT to make a poem about the time
- realize that there’s a qualitative difference when that poem is embodied as an object you look at
Here’s some of the poems:
As the clock strikes one thirty-four, / Embrace this moment, treasure it more.
Five fifty-three, time aglow, / Sun sets, moon's shadow starts to grow.
The clock strikes one-thirty-eight, / Afternoon sun shines bright with fate.
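The “just” is doing a lot of work, but the core loop is small enough to sketch. To be clear, this isn’t Matt’s actual firmware -- the prompt wording and model choice here are guesses -- it’s just the shape of the thing:

```python
from datetime import datetime

from openai import OpenAI

client = OpenAI()


def poem_for_now() -> str:
    # e.g. "1:34 PM"; lstrip drops the leading zero portably
    now = datetime.now().strftime("%I:%M %p").lstrip("0")
    response = client.chat.completions.create(
        model="gpt-4",  # model choice is a guess
        messages=[
            {
                "role": "system",
                "content": "Write a single rhyming couplet that mentions the given time.",
            },
            {"role": "user", "content": f"The time is {now}."},
        ],
    )
    return response.choices[0].message.content


print(poem_for_now())
```

Run that once a minute and put the output on a screen you glance at, and you more or less have the qualitative difference Matt’s talking about: the embodiment is doing the work, not the text generation.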
I... don’t think this is taking jobs away from poets. I don’t think Harper’s thing is taking jobs away from, I don’t know, a thriving ecosystem of on-demand improv writers who can quickly bang out a line about who’s standing in your waiting room.
I do know that there’s a whole bunch of issues -- issues it’s going to sound like I’m trivializing -- around the source of the data for these models, the data that allows this text to be generated in this way, i.e. “by magic”.
I mean, the clock is funny! At least, I think it’s funny! I feel weird about this, because I am going to use words like:
- playful
- whimsy
- joy
- silly
- light-hearted
which are honestly giving me some sort of net-related ptsd flashbacks, because we did all this already. The space described by those words in an LLM’s vector space got mashed and mangled into Surprise and Delight a long time ago, a phrase now forever associated in my mind with Share and Enjoy.
But again, I’m going to go all Genuine People Personalities on this. The last time I brought this up was in s16e18, back in October last year, with Boston Dynamics’ demo of a Talking Spot the Robot, coupled with personality prompts and startlingly good text-to-speech generation [5].
This stuff is fun! I mean yes, Ignore the Ethics (sigh), but this definitely feels like New Material People Are Experimenting With, a qualitative difference in how software works. I’m not talking about chat interfaces; I’m not talking about how terrible narrative interfaces are for stuff like feature discovery, or even the fact that now you literally need to know the spell for what you want the computer to do (I mean, what the fuck); I’m talking about an entire new level of textual expressiveness that was impossible to do before.
Well, it wasn’t impossible to do before. Here’s how you would’ve done it before:
- You would’ve taken the situations in which you wanted expressive text
- You would’ve hooked your thing up to something like mturk or another piecework deal (which, hello! Luddism! Piecework!)
- Like, you would’ve paid UCB grads hardly even pennies to quickly respond to your requests to “tell me something funny about this picture”
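For the avoidance of doubt about how grim that pipeline would’ve been, here’s a hedged sketch of the piecework version. The title, reward, and question text are all invented; boto3’s MTurk create_hit is a real call, though the QuestionForm XML is elided:

```python
import boto3

mturk = boto3.client("mturk", region_name="us-east-1")

# A QuestionForm XML document would go here; schema elided for brevity
question_xml = "<QuestionForm>...</QuestionForm>"

hit = mturk.create_hit(
    Title="Tell me something funny about this picture",  # hypothetical task
    Description="One snarky sentence about the attached scene. Be quick.",
    Reward="0.05",  # hardly even pennies, as noted
    MaxAssignments=1,
    LifetimeInSeconds=300,  # the request goes stale fast
    AssignmentDurationInSeconds=120,
    Question=question_xml,
)
print(hit["HIT"]["HITId"])
```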
I mean, that’s how you’d do it. I am relatively sure at least one startup tried to do this exact thing. This entire thing is also part of the (humanist?) premise of Neal Stephenson’s Lesser-Cited Manual For The Future, The Diamond Age, in which Stephenson pits an algorithmic teacher against a human teacher, a struggling actor who gets to mother an abstract child through some sort of remote method invocation.
I mean, come on!
You couldn’t actually do it that way!
(But why couldn’t you do it that way, Dan?)
Well, Other Dan, one of the reasons why is that capitalism as practiced now is shitty, and to do it that way you’d be exploiting people’s creative output, not valuing their expertise, and making use of abstractions to distance yourself from the labour that’s critical to your thing actually being a thing. Like, you wouldn’t do it that way because a) you’d have to pay people fairly (which... what would that be?), so b) you’d likely end up with something too expensive that too few people would pay for, maybe because c) you’d been outcompeted by others who were Doing It More Cheaply.
While I’m typing this I am somewhat distressed to have to link it to Shit Wonka Experience, the Glasgow poster child that got headlined into “AI Experience Makes Kids Cry”, which I understand is crack for editors and publishers.
Shit Wonka Experience was in some sense giving a bunch of actors a shitty prompt (i.e. not much of a script) and telling them to just fucking figure it out while promising the punters the moon.
The thing about Shit Wonka Experience is that it’s... not a new phenomenon? Sure, generative AI was used to create the imagery, and the difference there is the availability and scale and, I don’t know, the aesthetic? It was concept art (and not even good concept art, tbh) being passed off as The Thing, and this has been a problem forever!
I am genuinely interested in what it looks like when Things Have Personalities. I think we’re responsible when we make clear that these aren’t Real Things With Personalities, though. In Harper’s example? Demo? Real thing? It’s not like anyone is saying that his office/studio is conscious, that there’s a Thing with a Name churning this out. I think the thing here is that Harper hasn’t imbued a thing with a Name; rather, a Thing has a Personality.
It is not that Samantha, the Office, has a personality; it is just that the office has a particular expressive style. You didn’t need to take an office and pretend it’s a conscious being to trick us; it’s funny enough that this thing has an attitude.
What’s also exciting about this is that it’s another stage where people are dicking about, and the dicking about is not venture-funded stuff. The venture-funded stuff is boring. The venture-funded equivalent of this is Generating Textual Reports to Satisfy Compliance or Whatever from your Surveillance Footage, which, sure, fine, whatever, but what makes it FUN is when the reports read like a disaffected video store employee from the 90s wrote them, as if your target market for this is intended to be a fatal blow to a gen-xer.
So. Not every piece of software needs to have a personality in its expression. Dear god no, and now I’m worried that just saying this out loud is going to enable Personality Expressions for a ton of software and startups because They Have To and please, please, no. But it’s an option now, and unfortunately, it’s an option that has come about thanks to a position on training and intellectual property that’s yet another example of Grabbing Shit And Doing Stuff Before Anyone Else Can Relax, And Then Pointedly Saying Well, You Like This And You’re Not Going To Take It Away Now, Are You?
So. Textual, personality-based expressiveness at scale [sic] and available to anyone.
Wait, you know what, as I’m writing this I do know some writers directly affected by this shit, and it’s videogame dialogue writers. Ubisoft have already been public about creating internal tools to relieve the burden of writing a bunch of filler NPC text and... I’m sympathetic to that? You want writers to do high value stuff, sure. I mean, just don’t use this as an excuse to lay people off because you’ve got to satisfy the market gods. But ha ha you totally wouldn’t do that, would you, publicly listed videogame company.
(it is also interesting that videogames -- electronic entertainment -- are the thing that is using or needing lots of creative textual content at scale, and that people do pay for that!)
Wait, also there was all that AI Dungeon stuff a long, long time ago, back when GPT-2 and then GPT-3 first came out. That was a thing. But now the models are better. But AI Dungeon and other videogame examples (I think Nvidia’s CEO was on the record the other day saying share-price-boosting stuff like “videogames are totally going to be all generative AI within a couple of years!”) are wide domains, whereas the thing I’m excited about here is... I don’t know, microcopy?
At which point I have to admit that yes, “people who write microcopy” and “content designers who also do microcopy” are... a class of people with a job.
(For new subscribers, this is one of the best recent examples of what it’s like to be forced to ride along with me as I write out loud)
But! Harper couldn’t have made his talking office by hiring a bunch of writers, right? Not in that realtime way? No, and the reason he could make it at all is that he could do it cheaply. And yes, I’m in before people remind me that Carrot Weather [6] exists.
I do think there’s a thing here, and its shape is getting clearer.
OK, that’s it for today. I’ve written about 28,000 words’ worth of notes for that workshop over the past 48 hours, so another ~3,400 for this is probably overdoing it.
How are you doing? I am tired!
Best,
Dan
PS. Huh! This is also episode 600! Surely I get to go into syndication now.
How you can support Things That Caught My Attention
Things That Caught My Attention is a free newsletter, and if you like it and find it useful, please consider becoming a paid supporter.
Let my boss pay!
Do you have an expense account or a training/research materials budget? Let your boss pay: $25/month or $270/year, $35/month or $380/year, or $50/month or $500/year.
Paid supporters get a free copy of Things That Caught My Attention, Volume 1, collecting the best essays from the first 50 episodes, and free subscribers get a 20% discount.
1. Our Office Avatar pt 1: The office is talking shit again, Harper Reed, Harper Reed’s Blog, 26 March 2024 (archive.is)
2. Poem/1: AI rhyming clock by Matt Webb, That Matt Webb, Kickstarter, 29 February 2024 (archive.is)
3. My new job is AI sommelier and I detect the bouquet of progress, That Matt Webb, Interconnected.org, 3 March 2023 (archive.is)
4. I would describe this just as “a load-bearing just”
5. s16e18: Generative People Personalities; Content Distribution Networks (Taylor’s Version), me, this newsletter, 27 October 2023 (archive.is)