s09e30: CAPTCHAs, Cryptography and Personhood
0.0 Context setting
It’s Thursday May 19, 2021 and an overcast day in Portland, Oregon.
I spent most of yesterday doing two related things: collating the best bits of the first 50 episodes of this newsletter into one Scrivener project, and then collecting all of Snow Crashing, my Snow Crash commentary and criticism, into another Scrivener project. It was the kind of mostly repetitive work that’s easy for me to zone out to: opening up tabs 10 at a time, copying and pasting, giving them the right names and so on.
The first raw text pass of the best bits of the first 50 episodes amounts to around 35,000 words, and Snow Crashing, through to the end of chapter 18, is around 43,000 words so far. That’s… a lot. Without introductions, sad-face-emoji.
I’m still figuring out how to package these up better and do a good job of it, and whether a low-key Kickstarter project might be the best way to go about it.
Today's episode originally included a bunch of things that caught my attention but instead turned into a long piece about CAPTCHAs, Cryptography and Personhood.
Let's go:
1.0 Some things that caught my attention
CAPTCHAs, Cryptography and Personhood
Via Vice’s Motherboard1, Cloudflare invented something called Cryptographic Attestation of Personhood2.
First, CAPTCHAs
Cloudflare invented this because CAPTCHAs exist, which is the name for the thing you have to do to “prove you’re a human”. They are called CAPTCHAs because they are a “Completely Automated Public Turing Test to tell Computers and Humans Apart”. We need to do this because, ultimately, advertising and attention and the need for growth became the reason why we can’t have nice things.
Okay, short version: these days, I think, we need CAPTCHAs when something on the internet needs to be behind a gate that only a proper human being is supposed to access, rather than a computer program. One big reason for this is something like creating an account that could be used to send spam, because if you could automate account creation, you could send a lot of spam. The reason why people might want to send a lot of spam is that humans are terrible and we exist in a society that does not provide for everyone. The advertising thing is because a computer program can click on lots of ads, which is a pretty shitty existence if you’re a computer program I suppose, and a program clicking on ads is not a real person looking at them, which an advertiser would not want to pay for. So there’s this sort of arms race where people grow more and more arms to prove how human they are. Maybe this explains spiders and crabs?
Anyway. We don’t live in a great society. Yet. Ish. Kind of. And maybe doesn’t exist on purpose. Anyway, that’s beside the point.
Cloudflare is annoyed about this because CAPTCHAs are annoying, not least because they are discriminatory. Anyway, you can read the Wikipedia article about them.
Cloudflare is mostly annoyed because CAPTCHAs take lots of time that could be spent doing other things like, I don’t know, organizing for a future society where people’s needs are met because “humanity wastes about 500 years per day on CAPTCHAs” and if you put it that way, it sounds like a pretty good Twilight Zone episode where people are just stuck in front of glowing screens spotting bicycles or chimneys or crosswalks. Forever. Because capitalism or whatever.
ANYWAY.
Mathematically a person. Ish.
Cloudflare’s solution to this is that you have a hardware key, which is like the best kind of two-factor authentication because of encryption and mathematics, and really, it’s pretty good if you ignore the existence of humans and their propensity to do human things. I mean, the maths holds up. Instead of doing a CAPTCHA, Cloudflare says, you would just plug in your hardware key if it is not already plugged in and then touch it and then ta-da, and this is the important bit for me at least, you have done a cryptographic attestation of personhood.
This is a weird name! The weird name comes from the history, which I wrote above, which is the bit of CAPTCHA that stands for “Turing test to tell computers and humans apart”, which again only exists because in general there are circumstances where we want a real human to do something (i.e. deliberately, and more crucially, slowly) as opposed to a computer doing something (i.e. yes, technically and if you want to be annoying about it, on the instructions of a human, but also crucially very quickly and lots and lots)3.
The “very quickly and lots and lots” is basically the most important distinction here, which kind of holds up unless you get into other edge cases like organizing lots of phones in front of a few redundant arrays of inexpensive humans (don’t say it out loud) and paying them the lowest prevailing local wage to press buttons. Which is a thing that happens. To, say, give five stars to things4. See: capitalism, above.
A Test of User Presence
The main issue I have, and to be fair, it’s merely a semantic one, is that the phrase cryptographic attestation of personhood doesn’t actually attest personhood, i.e. doesn’t actually declare that there is a person.
Cloudflare’s cryptographic attestation of personhood means a user is allowed access “upon verification of the user presence test”5, which is in the W3C spec of WebAuthn. What’s a test of user presence? It’s
a simple form of authorization gesture and technical process where a user interacts with an authenticator by (typically) simply touching it (other modalities may also exist), yielding a Boolean result.5
Which is very careful to say nothing about personhood.
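To make that Boolean concrete: in the WebAuthn spec, the authenticator hands back a blob called authenticator data, and the user presence result is a single bit (UP, bit 0) in its flags byte. Here's a minimal sketch in Python of reading it, assuming you already have the raw bytes. The function name, the dictionary keys and the sample bytes are mine, made up for illustration, and this skips everything a real relying party has to do, like verifying the signature.

```python
import struct

# Flag bits in the WebAuthn authenticator data "flags" byte.
FLAG_USER_PRESENT = 0x01   # bit 0 (UP): the test of user presence passed
FLAG_USER_VERIFIED = 0x04  # bit 2 (UV): the user was verified (PIN, biometric)


def parse_authenticator_data(auth_data: bytes) -> dict:
    """Read the fixed 37-byte header of WebAuthn authenticator data.

    Layout: 32-byte RP ID hash, 1 flags byte, 4-byte big-endian counter.
    Hypothetical helper for illustration; it does not verify any signature.
    """
    if len(auth_data) < 37:
        raise ValueError("authenticator data must be at least 37 bytes")
    flags = auth_data[32]
    (sign_count,) = struct.unpack(">I", auth_data[33:37])
    return {
        "rp_id_hash": auth_data[:32],
        "user_present": bool(flags & FLAG_USER_PRESENT),
        "user_verified": bool(flags & FLAG_USER_VERIFIED),
        "sign_count": sign_count,
    }


# Made-up sample: a zeroed RP ID hash, only the UP bit set, counter of 7.
sample = bytes(32) + bytes([FLAG_USER_PRESENT]) + struct.pack(">I", 7)
parsed = parse_authenticator_data(sample)
print(parsed["user_present"], parsed["user_verified"], parsed["sign_count"])
```

That's the whole thing: one bit that says something touched the authenticator. There is no bit for "person".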
Don't cross the streams
The W3C spec describes a test of user presence. This is important! It doesn't say anything about personhood. This is, I don't know what it's properly called, but let's call it something like semantic drift. It is like when you use a word that has a specific meaning in one domain, and then import it into another domain and have it mean something else. To paraphrase:
Egon: Don't cross the streams.
Peter: Why?
Egon: It would be bad.
Peter: I'm fuzzy on the whole good/bad thing. What do you mean, "bad"?
Egon: Try to imagine all life as you know it stopping instantaneously and every molecule in your body exploding at the speed of light.
Raymond: Total protonic reversal.
Peter: That's bad. Okay. Alright, important safety tip, thanks Egon.
To paraphrase, you don't want total semantic reversal.
One example of my so-called semantic drift is the usage of the word "trust" to mean lots of different things, which I briefly wrote about in the context of cryptocurrencies6.
"Trust" in conversations about cryptocurrencies is a single word that can mean "trust in people and institutions", "trust in technical systems being free of errors", "trust in technical systems being secure" and so on. These are all different things!
And so personhood has a very specific meaning. Look, here is the definition of a legal person from Cornell Law School which is not very helpful because, well, Western law has used the term "person" and made it more specific, when it has a generally accepted common meaning. What I mean is that "personhood" is... a difficult and controversial term! Even Wikipedia agrees with me!
Defining personhood is a controversial topic in philosophy and law and is closely tied with legal and political concepts of citizenship, equality, and liberty.7
You say "Personhood" and at least a few people are going to wonder if you're also referencing something like Great Ape Personhood or Environmental Personhood.
So, first I bristle at the encroachment of the very specific term personhood in a proposed spec because these things creep and it is unlike software engineering to be loose with terminology. This is, after all, the discipline that has certain opinions about the Richard Curtis, Hugh Grant-starring flop Gnu/Linux, Actually8.
This bothers me because I am afraid of a dominant force in society (i.e. certain technology companies) having the power to make certain things true and certain definitions if not true, then accepted purely through mass force of application.
Fine, so what would you do about it?
The term CAPTCHA comes about because of the theory of Turing's test. The test has a whole section on Wikipedia covering its weaknesses.
The actual problem here is that, as reminded by Tom Insam3, there's a scarce resource and we want to be able to control access to it, especially in terms of the rate of access, because we have ruined sand to make it look like it can think by turning it into patterns that can do mathematics really quickly. Proving to a reasonable degree that you are a human is a way to do that, and a good-enough shortcut, but it is as much to do with proving to a reasonable degree that you are not a piece of mathematical sand.
Ideally, we don't start wading into implying that a computer program can tell whether you're a human or not because down that road lies our species' worst atrocities.
I think it's important to realize that the method might not change. In fact, I don't think it would, appreciably. Because as I'm re-reading what I wrote above, I think a key point is reasonably.
Proving to a reasonable degree that you're a human means doing something that ranges from difficult (i.e. slower than "most" humans) to currently impossible. The use of to a reasonable degree means it has the potential to be a shifting standard, which in practice it has been: witness the arms race of increasingly complicated CAPTCHAs.
Look, Cloudflare understands this. They have a whole article in their learning center about CAPTCHAs9 which in its introduction says both:
- A CAPTCHA test is designed to determine if an online user is really a human and not a bot; and
- Although CAPTCHAs are designed to block automated bots, CAPTCHAs are themselves automated. [my emphasis].
Which I think supports my position that the real point of CAPTCHAs is to stop automated bots and not prove that you're really a human.
So a potential way to restate the problem is that we want to reasonably control the rate of access to a resource and that the ideal goal is for that rate to more-or-less match the profile of a human.
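If you squint, that restatement is just rate limiting. One common way to do it is a token bucket: you're allowed a small burst, and after that you can only go as fast as the bucket refills. Here's a minimal sketch in Python. To be clear, this is my illustration and not what Cloudflare does; the class names, the fake clock and the numbers (one request every two seconds, bursts of three) are all made up.

```python
import time


class TokenBucket:
    """Allow short bursts, but cap the long-run rate of requests."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second   # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.clock = clock
        self.tokens = capacity        # start with a full bucket
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class FakeClock:
    """Hypothetical clock so the example runs without sleeping."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
bucket = TokenBucket(rate_per_second=0.5, capacity=3, clock=clock)

burst = [bucket.allow() for _ in range(5)]  # five requests at the same instant
clock.now += 4.0                            # pretend four seconds go by
later = [bucket.allow() for _ in range(3)]
print(burst, later)
```

The first three requests get through, the next two don't, and four seconds later there's room for two more. Note that nothing in there asks whether you're a human. It only asks how fast you're going.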
I bring this up because internet-friend Bertrand Fan has already explored the next level of frustration, which is that he works in tech and he uses a YubiKey, the exact kind of hardware key Cloudflare proposes to use in its Cryptographic Attestation of Personhood. Fan's frustration was that the button on the YubiKey was far away, so he wanted to use a computer to press it. This, clearly, defeats the point, because as soon as you can do this, you can press them pretty quickly, and also press lots of them. So Fan of course (being a human and exhibiting one of our finest qualities of fucking around and finding out) has created a way to press a YubiKey using a computer and helpfully supplied the full instructions to do so, from the source code to the hardware to the 3D printing STL files.
That wasn't what you'd actually do about it, that was more about why you think something should be done about it
Right. So we're back to names. I think we should -- for society's sake -- come up with a better name for what we want CAPTCHAs to do, which is slow down access to things to a human level and not keep up with the adjacent but not true fiction which is determining whether you are a human being.
Wait, hang on this is about something else, isn't it?
Yeah, so this is actually about human-scale latency, which is kind of exciting to me because it is one of the underlying factors that societies are trying to figure out right now with respect to the internet and software.
Let me explain what I mean by human-scale latency: it's the idea that speed is potentially harmful and that maybe we don't want everything to be fast.
I mean, we get this on an innate level because as young children we learn quite quickly that even though we don't know the exact formula of force = mass × acceleration, we do understand that fast things hit harder and hurt more.
Pure speed/velocity in nature is... not great? Viruses that reproduce too quickly burn out their hosts. Unconstrained reproduction is also not great? I know this is getting into the area of folk psychology and evolutionary psychology which I am well aware spans the gamut from unsubstantiated bullshit to difficult-to-verify theory, but, you know, waiting is good sometimes? It is certainly something I am attempting to teach to my children.
Human-scale latency is a way of describing one of the factors involved in algorithmic amplification and spread of hate.
For the last couple of decades we've had the emergence and now dominance, I guess, of high-frequency trading, which covers a multitude of activities and behaviors. One of them is, bluntly, doing things very quickly, which includes things like using microwave links and fiber optics to make sure you're the first to get information, the first to act on it, and you have the ability to do that all incredibly quickly, which can let you trade at very high volumes.
Generally, some people are worried about whether this is bad or not, and their worrying has led to putting in place things like circuit breakers (hey, let's just all take a time out) and speed bumps (please trade carefully, humans live here). Why have exchanges put these in place? Partly because in the U.S. people (for certain values of people and thus government and regulatory bodies) have decided that it's important to "ensure that our capital markets remain fair, deep, and liquid" (and also, satisfyingly, use the Oxford comma)12.
As societies, we have speed limits because (at least some of us) understand the difference between a car hitting a child at 15 miles per hour versus 30 or 40 miles per hour, never mind an adult.
But computing and software have also shown us (or are in the process of showing us increasingly clearly) that being able to do something quickly also means it's easier to do that same thing in large quantities or, as we say, at scale.
As groups of people -- societies -- we have a raft of techniques to encourage and discourage behavior. These can include technical measures and, well, societally-based measures like civil and criminal law and, ultimately, use of force. This is alongside growing up -- both individually and as a species -- with environmental and physical constraints. Heat, for example, is a big physical constraint, which from my point of view is good because otherwise guns could fire really, really quickly10, and guns are pretty much used for killing people. See: atrocities, above.
Some of these technical measures are quite interesting! They include ECU speed limiters in cars11 and in perhaps one of my favorite examples, a speed bump implemented by the Investors Exchange to slow down access to their market:
IEX creates a 350-microsecond delay by running all external communications through a coil of fiber-optic cable.
SEC Staff Report on Algorithmic Trading in U.S. Capital Markets12
which takes advantage of a physical limit at least until we invent superluminal communication.
I mean, hang on: the speed of light is about 300,000 kilometers/second in a vacuum, but fiber-optic cable isn't a vacuum. M2 Optics sell fiber, and helpfully have a table of the distance light travels in different types of fiber based on its refractive index13. Anyway, long story short, I reckon a 350-microsecond delay would need a coil of about... 70 kilometers. Which is a lot! Light goes really fast! Michael Lewis's 2014 New York Times article, The Wolf Hunters of Wall Street14 (also Lewis's book Flash Boys), has a photograph of the ~61 kilometer/38 mile coil of fiber.
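For anyone who wants to check my reckoning, here's the arithmetic as a small Python sketch. The one assumption doing the work is the refractive index: I've used 1.468, which is in the range usually quoted for single-mode fiber, but it's a number I picked and not something from the SEC report or the Lewis article. The function names are made up too.

```python
C_VACUUM_KM_PER_S = 299_792.458   # speed of light in a vacuum
ASSUMED_REFRACTIVE_INDEX = 1.468  # assumption: typical single-mode fiber


def fiber_length_km(delay_seconds, refractive_index=ASSUMED_REFRACTIVE_INDEX):
    """Length of fiber needed for light to take delay_seconds to cross it."""
    speed_in_fiber = C_VACUUM_KM_PER_S / refractive_index
    return delay_seconds * speed_in_fiber


def fiber_delay_microseconds(length_km, refractive_index=ASSUMED_REFRACTIVE_INDEX):
    """One-way delay, in microseconds, through length_km of fiber."""
    speed_in_fiber = C_VACUUM_KM_PER_S / refractive_index
    return length_km / speed_in_fiber * 1e6


length = fiber_length_km(350e-6)         # coil needed for 350 microseconds
delay_61 = fiber_delay_microseconds(61)  # delay from a 61 km coil
print(f"{length:.1f} km of fiber for a 350 microsecond delay")
print(f"{delay_61:.0f} microseconds through 61 km of fiber")
```

Under that assumption the coil comes out a little over 70 kilometers, and a 61 kilometer coil on its own accounts for a bit under 300 microseconds, so the numbers are at least in the same neighborhood.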
I digress.
Let me bring this back.
CAPTCHAs, Cryptography and Personhood
- We're primarily paying attention to websites, which people use.
- Bots are software, programmed by humans.
- Because bots are software, they can do things both quickly and in large numbers.
- In general, doing things quickly and in large numbers is something bots can do that people cannot do.
- There are some things that people don't want to happen. These things include interfering with someone's website (e.g. stopping it from working), stealing information (e.g. recipes, news, stories), fraud (e.g. tricking an advertiser into paying for an advert), sending spam, and pretending to be someone else (e.g. stealing someone's identity). Let's call this malicious behavior.
- CAPTCHAs were an early way to stop malicious behavior from bots. They did this by preventing access until a particular problem had been solved, one that was easy for a human, but hard for software.
- In this way, telling the difference between a human and a bot was a proxy for spotting the behavior.
- Importantly, people can do all these malicious behaviors. It's just anywhere between hard and impossible for humans to do them as quickly or in as large numbers as bots, which are software.
- Cloudflare says the amount of time people spend dealing with CAPTCHAs is a big number. It's not great to waste people's time.
- Cloudflare says one way to tell a person from a bot is to use a security key, which more or less relies on a person pressing a button. They call this invention Cryptographic Attestation of Personhood.
- Cryptography -- mathematics -- makes some parts of pressing the button secure in that it is difficult or impossible to pretend a button has been pressed.
- Needing to press a button also makes it harder to behave maliciously because there are only so many buttons you can buy. This is a physical limit.
- People have found ways around some physical limits (like pressing a button), by using computers and engineering to program pressing the button.
- This means that the name of what Cloudflare has called its invention is false or misleading, because it does not prove (attest) there was a person directly involved.
- There is a related problem, which is that "personhood" has a specific meaning and is not related to the actual problem.
- The actual problem is that software allows malicious behavior to happen quickly and in large numbers.
- Ultimately, it is hard to tell if something is malicious. This is because malice is tied with intention.
- In western common law, the problem of intention comes up in criminal law in a concept called mens rea15.
- Intention is a problem because (right now) we can't really know what someone was thinking or intending to do. We can only look at evidence.
- Common law criminal law divides the intent (the mens rea) from the action (the actus reus). Blame the history of common law in Roman law.
- The actions of malicious bot behavior include both direct speed and scale and also the demonstrated capability for direct speed and scale.
What I am grasping toward is the direct question and challenge. We don't necessarily want to prove if you're a person. We want to be reasonably sure your actions are not malicious, or that you are not going to act maliciously, which could take the form of showing you are not likely to be a maliciously behaving bot.
In other words:
Using Hardware Keys To Help Show You're Not A Malicious Bot
I think what you call things is important.
Phew. That was... unexpected.
Anyway. How are you doing? I love getting notes, even when or if they are just saying "hi", like the equivalent of a nod or shared eye contact across a street, and I do my best to reply.
Best,
Dan
1. ‘Cryptographic Attestation of Personhood’ Could End CAPTCHAs Forever, Motherboard, 17 May 2021.
2. Humanity wastes about 500 years per day on CAPTCHAs. It’s time to end this madness, Cloudflare, 13 May 2021.
3. With thanks to Tom Insam for this reminder, in that it "attests control of a scarce / expensive resource and therefore allows the other side to bottleneck requests to the server. Being human is a useful example of such a scarce resource but it’s not actually required.", Twitter.
4. This disturbing image of a Chinese worker with close to 100 iPhones reveals how App Store rankings can be manipulated, Business Insider, 11 February 2015.
5. Test of User Presence in W3C draft Web Authentication: An API for accessing Public Key Credentials, Level 3 Editor’s Draft, 5 May 2021.
6. Regarding the paper Trust in blockchain-based systems, commented on in s09e18: Cursed Takes; Multiplayer Figma and Low Ping Bastards; Digital Isn’t Better Than Paper, 21 April 2021.
7. Personhood, Wikipedia.
8. Linux and the GNU System, by Richard Stallman.
9. How CAPTCHAs work | What does CAPTCHA mean?, Cloudflare Learning Center.
10. Rate of fire: Technical Limitations, Wikipedia.
11. Speed limiter, Wikipedia.
12. Staff Report on Algorithmic Trading in U.S. Capital Markets As Required by Section 502 of the Economic Growth, Regulatory Relief, and Consumer Protection Act of 2018 (PDF), 5 August 2020.
13. Calculating Optical Fiber Latency, M2 Optics.
14. The Wolf Hunters of Wall Street, New York Times, Michael Lewis, 31 March 2014.