Episode Twenty: Literal Tone Of Voice, Presence, Networks, Applescript

by danhon

1. Literal Tone Of Voice

If you’re British and use Siri, then you’ve probably heard the plummy tones of, well, I can’t describe it in any way other than British Siri Wanker. So in terms of designing a service, here’s a whole (not necessarily new, but deep domain) area that is suddenly becoming more relevant. Time was, you’d kind of laugh about Brian Eno designing the aural identity for Microsoft Windows, but, here are the super dumb things that I put to you:

a) For a certain group of people, hearing British Siri makes them want to punch British Siri in the face, no matter how useful he may be;
b) Samantha, in Spike Jonze’s Her, is voiced by Scarlett Johansson;
c) JARVIS in the Marvel Cinematic Universe is voiced by Paul Bettany being an English Butler;
d) The Computer in Star Trek: The Next Generation is voiced by Majel Barrett in somewhat haughty BDSM mode; and
e) Sigourney Weaver in Galaxy Quest.

The super arbitrary, base-level insight thing that I’m going to pull out of this is: hey, voice casting matters! And for all the strides that have been made in terms of attention being paid to copywriting (and the inevitable wrong lessons being learned, or being followed blindly, cargo-cult fashion from standouts in the field like Innocent and Flickr) and (again) GDS in maintaining that Writing Is Design, if we’re talking about conversational interfaces (ha) then we’re also paying attention to a whole bunch of things like prosody and tone and pauses and, and, and. The assertion that British people are highly attuned to things like accent and dialect is a bit of a side-show. We all are; it’s just a matter of degree and sensitivity.

Orthogonal to all of this is the realisation that Overly Friendly Nicey-Nice copy is but one axis on the continuum of how you can write, and that if there were, for example, an obsequiousness setting[1], that might be fun, or at least interesting, to try out.

This is to say that one of the reasons why Theodore fell in love with Samantha is that whether or not she actually had one, she simulated having a personality pretty well. Oh, and she was an actual human being who could read meaning into lines.

It’s easy for us to pay attention to the uncanny valley of computer generated imagery. That doesn’t mean there isn’t a similar uncanny valley of synthesised voice.

(An aside: Chinese Room style artificial intelligence, at least in the delivery of the intelligence, sounds more likely to me in the short term, where zero-hour workers pretty much read out and relay what Siri-the-system says to say to you, delivering the intonation almost by accident.)

[1] via Tom Coates: https://twitter.com/tomcoates/status/435657750312267776 in yet another example of British whimsy

2. Presence

There’s a long history of presence on the internet: from commands like finger on multi-user UNIX systems to the peripheral-vision and green indicators of ICQ and AIM, to the green light on Facebook chat and experiments like BERG’s Availabot[1].

But all of these things seem to be kludges and shims for a non-persistently connected world. Again, running with my Star Trek as stand-in for the future theme, the presumption is that everyone is always available, online and connected. That someone wouldn’t be connected is what’s surprising: Commander Riker tries to get hold of Data and his combadge chirps in an unhappy way, CUT TO everyone on the Bridge looking perplexed, like Data’s gone AWOL. Which, essentially, he has, because in the 24th Century on a star ship, if you’re not on the network there’s something not right with you. Riker would try a few more times, ask the Computer to locate Mr. Data (“Mr. Data is not aboard the Enterprise”) and then cue creepy music and CUT TO TITLES.

This is where Snapchat, designed, as it were, for that golden of gold audiences, the Teenagers, makes an interesting choice (if I’m not reading too much into it): it doesn’t have, from what I can make out, any presence indicators. The presumption is that of course you’re online and of course your recipient(s) are, they’ll just get around to reading it when they do. If anything, this is one of the shifts that I think legitimately shows the difference between so-called digital natives and immigrants. To natives, connectivity just is. A lot of us in industry grew up with the sound of Hayes V.34 modems pinging away in the background or remember BBSes, and those of us outside the US remember paying per minute for connectivity and longing for that particularly foreign concept of “free local calls”. To have (admitting that this applies to a certain privileged class, but one that is growing) mostly unfettered, mostly untethered, persistent net access is a New Thing.

[1] http://berglondon.com/projects/availabot/

3. Networks

You can go on and on about what it’s like to live with persistent networking and connectivity. I mean: isn’t it interesting that (and I hate to keep coming back to it, but it’s a handy point of cultural reference) when someone loses network connectivity on the Enterprise, the assumption is that they’re not actually *on* the Enterprise rather than that the network has gone down? Because when the network’s down, you notice: everything stops working. There’s not really spotty connectivity until the plot demands it – which is fair enough, considering we’re examining a 90s fictional TV universe.

This thing about connectivity is interesting with respect to the different types of network that the assumption of persistent connectivity supports. SMS didn’t really care about presence, and arguably, was it a big missing feature that people were crying out for? It turned out that typical user behaviour (and here I really *am* just reckoning) was that people would have their phone with them all the time and the people who had a mobile phone and “just turned it on to make calls” – I’m looking at you, Dad – were the ones who felt weird. The *point* of the phone was persistent connectivity.

Marc Andreessen had a conversation on Twitter the other day on the difference between a pseudo-closed network like Secret, where there are built-in design constraints that act against unimpeded network growth, and open networks like Facebook or Twitter. Aside from appearing, at least, to look at the whole landscape in terms of a zero-sum VC investment (i.e. open always wins over closed, so only pay attention to the winners, without acknowledging that there might still be profitable and viable markets for the non-winners), what feels to me to be somewhat obvious is that persistent connectivity supports multiple types of network graph.

It feels like we’re crossing over some sort of threshold where there’s *enough* connectivity for *enough* people that interesting behaviours are starting to emerge. *When* you have the type of connectivity that enables teens to send around four figures’ worth of messages per day, there’s value in the global lookup type of network that everyone’s on (Facebook) as well as ephemeral, close-group networking (Snapchat) as well as pseudo-anonymous (Secret).

Where it all falls down, of course, is making money and producing sustainable services.

(Obvious disclaimer: Andreessen is a stupendously successful investor, and I am just a guy. But anyway.)

4. Missing Applescript, IFTTT

At the same time, all of this connectivity feels like an explosion of complexity: a set of non-interoperable walled gardens with no way of providing a certain level of user control. Which makes me wonder about the possibility of implementing a layer of scripting like Applescript that at least in some way was easy to understand (though, arguably, not easy *enough* to understand and write) or something like If This Then That, which effectively presents a sort of extra user interface on top of the entire internet.

A lot of these apps and services are effective black boxes that, I suspect, people who do Learn To Code will get frustrated by. A computer can, in principle, do the things I need it to: so why can’t it? Why wouldn’t Facebook let me do this thing? Which is one of the reasons why knowing how all this stuff works gets frustrating and complicated when explaining it to people who don’t understand: well yes, in theory, we *can* do this thing, but we’d need to connect this bit to that bit and I guess Twitter *could* support animated gifs but it’s not a question of code, it’s a question of them just not wanting to and so it turns out that humans are messy creatures and the big lesson everyone learns is that Computers Only Ever Do What We Tell Them To.

So here’s a question: what’s the most impressive thing someone’s made with something like Scratch?

That’s all for today. Oh, apart from this Easter egg, courtesy of Paul Mison: