s09e04: How can I help you?
0.0 Context setting
It is Saturday 6 February, 2021. It has been a nice day in Portland. We went to the library and picked up some new books, returned some old ones. The library doesn’t do late fees anymore, which is nice (and equitable!), so books just kind of get auto-renewed, which means needing to more explicitly teach the kids that someone is probably waiting for this book.
I had a difficult week. Some days it’s hard to concentrate. Who am I kidding? Most days. Some days it’s just really a morning, if that, of productive work. I discovered pair-working (two people working on the same document, talking about it!) over Zoom really helped, to the extent that we’d virtually high-five after, I don’t know, remembering a word. So it’s difficult. But hopefully, things will get better.
One big, long thing this week and then a treat. Next episode the plan is for more shorter, smaller things that caught my attention.
Ready? Let’s go.
1.0 Something that caught my attention
I realized a bundle of pieces caught my attention over the past few weeks and I think there’s a common theme among them: customer service.
Let’s start with some of the pieces, in no particular order:
Dustin Curtis’s Apple iCloud, App Store and Apple ID accounts stopped working due to a long chain of events [Dustin’s writeup on his blog]. The symptoms manifested as being unable to download app updates on his Mac, then being unable to use the Music app on his Mac, and then the symptoms spread to his phone. Then Calendar stopped syncing. (iMessage and Photos kept working, though).
Cyd Harrell’s recent tweet about customer service: “one thing people aren't talking about in all of these analyses is the dismantling of customer service in favor of self-service (& especially self-service on the web) in the 90s-early 2000s. that offered gains in some ways, but we see the losses here.”
Thinking about what “government scale” might mean, when we talk about services that work “at scale” and serve large numbers of people; that government scale should perhaps convey breadth of users and a requirement of universal service delivery.
Andrew Spinks, the lead developer of Terraria, a videogame, said that he would stop development of the game for Google Stadia (Google’s half… dead? videogame streaming platform) after his entire Google account was banned, with no clear or obvious recourse for reinstatement.
Perennial favorite Philip K. Dick’s Ubik, where Joe Chip isn’t allowed to leave his apartment without paying his front door.
My riff on that from 2018, Everything was connected, and I was fucked.
Every single time I moan about managing the collision between “adult” online accounts and “child” online accounts, and the n-dimensional grey area where “parent” or “family” accounts work as some sort of Lovecraftian combination where nothing works quite how it’s supposed to and causes nightmares and screaming.
Okay, now we’ve got that out of the way, here’s what caught my attention:
What’s bringing the above items together for me is that they feel like they all show the combination of identity, permissions and policies crossing organizational seams. It’s worst when each of these elements is opaque and not visible or understandable to the final end user.
When this particular… trifecta-plus-one (a quadrifecta?) of issues comes together, the end result is something that’s at a minimum deeply unsatisfying (something is broken and isn’t working the way it should), or at worst could be life-threatening. Any failure of identity, or permissions or policies not working can result in a bad outcome, but when those start to combine and cross organizational seams, it feels like that’s when situations start to get, I don’t know, exponentially worse.
People who’ve been following along for a while may instantly see the parallels to government services and for that I am both sorry and not sorry.
Organizational seams
Apple
I said part of this is because of organizational seams. Here’s an example from Apple:
Apple has a whole tangle of accounts that include, variously, an iTunes Store account (that you might use to buy apps with for your phone or your Mac), an Apple ID (which may or may not be the same as an iCloud account), and an iCloud account for… iCloud-related services like iMessage (blue text messages), iCloud Photos (storing your photos in the cloud) and document storage, including things like “where did that iMovie project get saved? It’s where?!”.
This gets complicated enough because you can have, say, an iCloud account (e.g. my one) and another iCloud account (e.g. my partner’s account) which are used for defining an identity for e.g. using iMessage, and that until a few years ago, a device that used my iCloud account would use a different iTunes Store account with which to make and authenticate purchases, and my partner’s device using their iCloud account would also use the same iTunes Store account with which to make and authenticate purchases. And then something called iCloud Family came in or Family Sharing where a bunch of iCloud accounts could band together in some number defined by corporate policy and use a single defined Payment Account (which would also be an iTunes Store account, or might also be one of the iCloud Accounts) for all purchases.
Trust me, this would be just as complicated to understand if I drew a diagram as it is when I try to describe it in text.
But, I think, the reason why this is complicated is because each of these things was, broadly, intended to do one thing and then more things were added to them (iCloud accounts were MobileMe accounts were dotMac accounts which were paid-for web hosting, email and calendaring, that sort of thing; iTunes Store accounts were Things You Used To Buy Things From Apple’s iTunes Store, But Then It Turns Out There Were Lots Of Other Things To Buy, Too; Apple IDs were… things that you could have for, say, support requests?) but in any event, it started to get confusing because the one way you could (or would) identify these services is by their email address.
(Do not get me started on Amazon, where you can have the same email address (e.g. mine) work as an account on Amazon.com and also an account on Amazon.co.uk, but they are actually different accounts sometimes but not really?) How does Amazon know? Does it? Who can tell? This is all, again, opaque.
But anyway. This all gets more complicated because this is the world and entropy increases and everything will get more complicated until it is too complicated.
Here’s a Google example:
You can have lots of different kinds of Google account — a personal Google account, a G Suite (or whatever it’s called now) account, a child Google account, an education Google account and so on. These seams often break down when someone gets the (admittedly smart) idea that families want some sort of organization and control over child accounts, so now you’ve got more permissions issues. The wonderful sweet spot is, say, a child account that’s administered by an educational institution but you want that child account to be subject to a family/child permissions policy. Never mind the whole “I have too many Google accounts, I end up just sharing everything amongst all of them anyway” issue.
What’s in common
Just assume for the sake of argument that the same issues exist with Microsoft accounts (they do). And also assume for the sake of argument the particular hell that exists where you have a Microsoft account that is also an Xbox profile that is being used on an iOS Minecraft client, where the user is also a child Apple account set up with Family Sharing.
Look, I think I’ve written about Conway’s law [Wikipedia] before, but I generally don’t like it because of how it’s expressed:
Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.
Melvin E. Conway
Which is to say that I think the expression of the law is too abstract and that people prefer concrete examples. Here’s my example: all the above are fucked up because they exhibit what makes sense to individual product groups and teams (hey, it’s the Apple Account team) doing, hopefully, what makes sense and is rational (ha ha, rational in what context, hmm?) and then before you know it, you’re trying to Dr. Frankenstein’s monster all of these individual, ideally, perfectly reasonable and well-functioning products or functionality into some sort of superset, or something that collaborates.
But they weren’t really supposed to collaborate. Why should we expect collaboration now? These requirements for collaboration, in the context of products and features that have a history, weren’t known ahead of time.
Which leads us to seams.
In retrospect, seams is just a fancy way of saying that feeling of being bounced around from department to department when you have an issue, and nobody is able to help you. Which is to say: your problem exists in the space between departments, in some sort of purgatory where features go before they’re perfected and, to some minimal degree, actually work. Likely the people who encountered your problem know, roughly, what the problem is, but it’s kind of someone else’s problem more than it’s their particular problem.
And this is assuming perfect information transfer across seams or departments, which is asking for a lot, right?
Anyway: please don’t read this assuming I’m going to end with some sort of flourish and an answer to how you deal with seams well because, well, there’s always going to be seams. One first instinct might be to set up a properly empowered Group That Has Responsibility For The Thing We Just Realized Is Falling Between The Gaps and guess what! Surprise, that thing and that team has boundaries. This problem will never go away. I don’t know, maybe it’s an opportunity?
What’s next? Scale, of course. Because when you have one problem, why not multiply it a hundred million times to, as we say now, fuck around and find out.
Operating At Scale
The seams are one part, but the other is the Operating At Scale part, which is broadly speaking the bit where the messy, fiddly part of dealing with other humans that doesn’t make any money, i.e. customer support, gets short shrift and is a cost center, where ideally you want to spend as little money as possible because a cost center is just another polite word for ugh if we have to but we really don’t want to. I am led to believe that there are some organizations that do not treat customer support as a cost center, and I suppose they must exist!
How do you provide customer support for millions and millions of people, though? I mean, say you had, I don’t know, a hundred million users in a particular country. That sounds pretty government scale right? Is there a way to be good at customer service when you’re operating at that scale?
Long-term readers may suspect that my particular take on this is going to be along the lines of haha no have you looked at the support forums for Google and Facebook? I mean, is that that much better than having an entire Reddit dedicated to figuring out how to work unemployment insurance? [Washington Post].
Because Here’s The Thing: Statista says around March 2019, there were around 202.5 million iPhones in use (“installed base”) in the United States. Let’s say 0.1% of people encounter a problem approaching the severity of what Dustin Curtis experienced above. That’s… over 200,000 people? Go down another order of magnitude, and it’s still 20,000.
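To check the back-of-the-envelope math above, here is a minimal sketch. The installed base is the Statista figure quoted in the text; the rates are the illustrative guesses from the paragraph, not measured failure rates, and the function name is made up for this example.

```python
# Back-of-the-envelope check of the numbers above.
# INSTALLED_BASE is the Statista estimate quoted in the text (US, around March 2019).
INSTALLED_BASE = 202_500_000


def affected_users(installed_base: int, rate_per_100k: int) -> int:
    """People affected if `rate_per_100k` out of every 100,000 users hit the problem."""
    return installed_base * rate_per_100k // 100_000


# 0.1% is 100 people per 100,000; one order of magnitude lower is 10 per 100,000.
at_point_one_percent = affected_users(INSTALLED_BASE, 100)
at_one_hundredth_percent = affected_users(INSTALLED_BASE, 10)

print(at_point_one_percent)      # 202500
print(at_one_hundredth_percent)  # 20250
```

Integer arithmetic keeps the result exact, which is why the rate is expressed per 100,000 rather than as a floating-point percentage.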
Dustin ultimately had his “issue” resolved (oh how we love the dry, sanitized language for something that I can imagine made Dustin pretty goddamn anxious and worried), and it feels like Apple might even have been better than most at this kind of problem because… you can kind of call someone up at Apple? Even if they can’t help you, there’s still someone you can call.
Can you call someone about your Gmail account if you get locked out? I mean, I don’t think so?
So now we come to scale and self service and I’ll quote Cyd Harrell’s tweet again:
“one thing people aren't talking about in all of these analyses is the dismantling of customer service in favor of self-service (& especially self-service on the web) in the 90s-early 2000s. that offered gains in some ways, but we see the losses here.” [tweet]
I agree: you don’t get to hundreds of millions of users — or, sure, billions — without the innovation of self-service. Having to call someone up to change your password doesn’t scale, as it were, and that example is one of a class that government, in general, could do better.
But then there are all the other problems, and I’d say (off the top of my head) that there are two classes of seam problems that come about from dealing with hundreds of thousands or millions of users:
The first class is problems that you don’t know about because they exist in the seams.
I feel like this first class is easier to deal with when you pay attention to customer service and you have a good-enough discipline of being focussed on understanding what your users need and whether you’re meeting that need.
Another way of saying this is: if you have dashboards and metrics, I imagine they’re most likely firstly focussed on making sure you’re meeting targets on bureaucratic process, so, you know, maybe turn them upside down and check that those processes actually result in the outcome your user expects. (Let us leave aside the instances in which you are actively and purposefully attempting to fuck over your users and customers.) So, first listen, track the journey, and then find what’s getting lost, then start paying attention to that. Easy. (Well, easier said than done. But it’s not impossible, it’s just hard. And if NASA can land a 1,025 kilogram nuclear-powered rover on Mars using Webex during a pandemic, then, you know, maybe try harder?)
If you’ve found something you can automate, then go nuts. Automate the hell out of it.
The second class is problems that you know exist in the seams and now haha you have a seam problem, do you know how to fix it?
Because my theory is seam problems are cross-responsibility problems, I suspect that they’re the fiddly ones that involve humans being humans and doing things that are entirely unexpected (but shouldn’t be unexpected! This is what humans do! The entire fuck around and find out attitude. Many times we don’t even know we’re fucking around, we’re just… humaning!) and are unanticipated.
So they don’t fit in the system. Let’s ignore software now and just say: they do not fit in your policy. They do not fit in your bureaucratic pseudocode.
An aside and a realization!
Policy (i.e. a set of rules and instructions about a goal/outcome and statements about conditions that must be satisfied, etc.) is, if you squint at it, code.
Therefore… a bureaucracy is pseudocode? If you wrote down what a bureaucracy does, but before you put it in software, then, duh, that’s pseudocode.
They are, for example, what happens when someone whose last name is ‘True’ gets locked out of their iCloud account for 6 months and counting because, really, who could’ve imagined something like that (reader, we could have imagined it. [See falsehoods programmers believe about names]).
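I don’t know what actually went wrong in that case, so treat this as a purely hypothetical sketch of how a name like ‘True’ could trip up a system. Both functions below are invented for illustration: one is the kind of “helpful” parser that guesses types from strings, the other just treats a name as text.

```python
from typing import Union


def naive_coerce(value: str) -> Union[bool, str]:
    """Hypothetical 'helpful' parser that guesses a type from a string."""
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        # A last name of "True" silently stops being a name at this point.
        return lowered == "true"
    return value


def safe_last_name(value: str) -> str:
    """Treat a name as text, always, whatever it happens to look like."""
    return value.strip()


broken = naive_coerce("True")   # the boolean True, not a name
fine = safe_last_name("True")   # still the string "True"
```

Anything downstream that expects a string (validation, a database column, a comparison against the name on file) now gets a boolean, and the person with that name is the one who finds out.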
If you really want to resolve the exists-in-a-seam problem, then it feels like you need whatever sort of customer support agent who is empowered to do su {fix problem} and then, hopefully, document it and I don’t know, submit a request for that to be patched in your spaghetti code and systems.
But that rarely happens. (It kind of happens in organizations that are super customer focussed and you know they are when they say things like “what is your objective for this support call” and you say “{do the thing that makes this right}” and they go tapity-tapity-tap, perhaps also saying “please hold”, and you hold, and then they do more tapity-tapity and you hold again and then about half an hour later magic has happened and the thing that needed to be made right is made right. Which is escalation to some sort of superuser who has the power to wave a magic wand and Alter The Business Processes To Their Whim, Like Some Sort Of Sorcerer Supreme. Because after all, what is customer service, but perseverance?)
My point here is that this needs humans. There is no chatbot, there is no self-service, there is no expert system that will fix this kind of problem, and this kind of problem increasingly happens when you work at scale, and if you really want to fix it, then you kind of want empowered, intelligent, problem-solving entities who can also ask questions about what someone needs and for most of those things, it sounds like the cheapest way of doing that is to hire a human. I said most of those things apart from empowered, because that’s a policy decision. That’s having the escalation framework in the first place, and in the worst case, that’s someone with the sufficient power to su root and go in there and, I’m afraid, fuck it, let’s fix it on production.
But. When you have made the decision to outsource, when you have made the decision that customer service is something that loses money, I’m assuming that you’re not in the kind of frame of mind where you actually want to solve these difficult problems. After all, you’re operating at scale and well, don’t the user needs of the many outweigh the user needs of the few? I mean, sure. Until you’re one of the few. And you’re locked out of your account. And you can’t get the health insurance you’re owed because of a policy conflict that can’t be resolved because… reasons? ¯\_(ツ)_/¯
We know the proactive kind of customer service exists because in some places it, or something like it, is called concierge service. I experienced this one year when I contributed to destroying the environment and did enough flying around to hit Diamond status on Delta, which meant there was a special phone number I could call and I would have conversations like this:
Delta agent: Hello, how can I help you?
Me: I would like to {outcome}. Here are my {complicated criteria}. Can you do that?
Delta agent: Sure! Tapity-tapity-tap. OK, here’s {route}, which will cost {amount}.
Me: OK, what if I want to use miles for part of that?
Delta agent: Sure, we can do that!
Me: OK, let’s do it.
Then, later:
Delta agent: Hello, how can I help you?
Me: I have experienced {disaster}.
Delta agent: Oh no! Let’s see what we can do.
Me: Great. I need to achieve {outcome} but I couldn’t figure out how to do that online.
Delta agent: I’d be delighted to do that for you.
I mean, how awesome is that? That is awesome.
Of course, this is also what’s called a premium enterprise support contract with an SLA and you either explicitly pay for it or you are spending what is likely a ridiculous amount of money in terms of profit margin and the company would like to retent you as a customer. (I know the word retent is not a real word and the word is retain. I just like making fun of the concept of customer retention in this way.)
All of which is to say this: policy is policy. When it is written down, it is an arbitrary set of decisions and conditions designed to achieve in the majority of cases a desired outcome. But this policy is decided. It may be subject to constraints (I mean, it is subject to constraints), but nonetheless it is malleable because it is a human construct in a human world. So if you want to operate at scale then you get to decide what you do for the things that fall outside policy. Do you want to deal with them? Or do they live in purgatory?
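If you squint at policy as code, that choice can be sketched in a few lines. This is a toy under stated assumptions, not anyone’s real system: the `Case` fields, the single rule, and the decision strings are all made up for illustration. The only point is that what happens to the unmatched case is a decision someone makes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Case:
    account_type: str  # hypothetical, e.g. "adult" or "child_school_managed"
    request: str       # hypothetical, e.g. "reset_password"


# A rule returns a decision if it applies to the case, or None if it does not.
Rule = Callable[[Case], Optional[str]]


def password_reset_rule(case: Case) -> Optional[str]:
    """The policy as written: adults can reset their own passwords."""
    if case.request == "reset_password" and case.account_type == "adult":
        return "self_service"
    return None


def apply_policy(case: Case, rules: List[Rule], escalate_unmatched: bool) -> str:
    """Return the first matching decision; the fallback is itself a policy choice."""
    for rule in rules:
        decision = rule(case)
        if decision is not None:
            return decision
    # Nothing in the written policy covers this case: it has fallen into a seam.
    return "escalate_to_human" if escalate_unmatched else "purgatory"


seam_case = Case(account_type="child_school_managed", request="reset_password")
dealt_with = apply_policy(seam_case, [password_reset_rule], escalate_unmatched=True)
abandoned = apply_policy(seam_case, [password_reset_rule], escalate_unmatched=False)
```

The same case lands with a person or in purgatory depending on one flag, and somebody gets to decide which way that flag is set.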
I want to lastly tie this to government scale (and set a note that I’ll come back to technology companies starting to realize that they have in effect some attributes of sovereign powers in some contexts), in that governments don’t get to choose who they serve. I mean, in effect they do: this is called racism and sexism and discrimination, but the principle of the thing is that the social contract says that government shouldn’t discriminate and that it is supposed to serve all people equally. Google, Apple, Microsoft, that startup that wants to disrupt, I don’t know, bread delivery, these all claim to want to serve everyone and it can be good business for them to eventually serve, or strive to serve, everyone, but in effect they do not, and they have had to make decisions about who to serve, which is also entirely within their right. This gap between wait, you want to serve everyone equally and hang on, I’m not being served equally, some sort of gap between a promise or, I don’t know, brand value and actual experience is what ends up being encoded in, and a result of, policy.
Tl;dr: have actual humans do actual support and let them actually solve problems, which requires your organization to be set up to do so. P.S. you may have inherited a bunch of code that makes this super hard, but hey, what’re you going to do? Things change.
2.0 As A Treat
… here are some tips for your data strategy.
I have excerpted my favorite ones here:
I am absolutely not sorry at all, not in any way.
Nothing more this episode. The plan for next episode is a bunch of smaller, shorter things that caught my attention.
My gosh, it has been a week. You may probably also have had a week, in the sense that a week has passed. Mine was pretty hard! How about you? As ever, I love hearing from you and actually prefer when you just reply by email rather than public comment because this is a newsletter and also it’s great when you just say “hi” because who doesn’t like that? I like that. I will totally say “hi” back.
Best,
Dan