It’s Wednesday 23rd October, on a Southwest 737 heading back to Portland, Oregon from Sacramento, California. You can’t stop me from listening to Carly Rae Jepsen while I write this.
This Quanta piece reports that a team from the University of Tübingen, Germany, found that deep learning vision systems perform object recognition primarily via texture rather than shape [open abstract, paper]. Seems like a pretty smart experiment; the team built on the observation that adding slight noise to an image results in an inaccurate (and, to a human, incomprehensible) mislabeling of the object.
As an aside, there’s something irritating about the way Quanta publish their articles on the web which is that the dateline is right at the bottom. This is not a new article, it was published on July 1, 2019. You can guess at the publish date by the slug in the URL (https://www.quantamagazine.org/where-we-see-shapes-ai-sees-textures-20190701/), but I find it somewhat user-hostile that the author and dateline aren’t up top. But, I note that Quanta’s site does have that thing where there’s an article progress bar up top, a sort of horizontal orange bar that fills from left to right as you scroll down the page. Which… OK? I would prefer to know when the article was published, thanks.
I feel like I raised this issue before when I wrote about tireless five year olds in episode 10.
I mean, I notice that when my children were babies, we’d show them picture books, the kind of high-contrast stuff. These picture books would use both illustrations of objects and concepts as well as photographs of the actual objects, I dunno, to help developing brains produce some sort of gestalt. These are the salient features of a cat, so you can recognise a cat or a kitty as a representation. It feels like deep learning systems that are trained solely on photographs of cats and not also representations of cats are thus going to potentially learn in ways that we would not intuit, mainly because it’s hard for us to imagine something utterly alien like that. Why wouldn’t a thing that we’ve designed to learn based on our knowledge of biological neural networks learn the way we think we learn, at a high level?
Is it just me, or does this not feel obvious? Or is it because I happened to be interested in child development and read a whole bunch of Steven Pinker books when I was younger? Or is it because the field of machine learning is just super young and the people who work in it, on average, tend not to have children? Or aren’t as involved in childcare?
When people talk about the need for diversity in technology, about why we need diversity in technology, this feels like one of the reasons: a wider range of points of view makes it more likely that someone recognizes "we're only training on photographs of real things" as a risk, or as having implications for the technologies being created.
This also brings up a related thought for me: you might think that training deep learning systems on photographs of real things in the world makes a lot of sense if you’re also the kind of person who believes that general artificial intelligence requires embodiment, whether in a physical or virtual environment. (I mean, I feel like the embodiment requirement is something like being able to exist in a high-bandwidth, high-degree of feedback environment, so an environment significantly more complex than contemporary attempts at experimenting with A.I. in dynamic, feedback driven systems like video games. But, that’s a start!)
But there’s a potential hole which is that the physical environment humans exist in is not only one where there are Real Cats, but also a physical environment that contains human artifacts, where Symbolic Cats and Representational Cats exist alongside Real Cats. Do we need those to be recognized, too? If we want A.I. to exist alongside us, and if A.I. is marketed as a general intelligence and treated as such by its users, then how do we make sure that we either fulfil those expectations or, better, signal that this is a Different Thing? There’s a lack of precision when we say “this system can recognize and label x” when what we actually mean is “this system can recognize and label x, but only when x is of the type y.”
This started out by me seeing mention in the PDX edition of Eater that a Portland, Oregon area Outback Steakhouse was going to use AI to “manage its front of house service”. The Eater story was picked up from a Wired article and also covered in the local paper the Willamette Week on October 21. By the next day, WW had received an email statement saying that the chain had “halted the testing of Presto Vision at the chain’s Portland location.”
I mean, I was all set to have an excuse to do something horrible to my body all in the interests of finding out more about this Presto Vision product [press release]. The press release is worth a read, I think, because it shows how industries are talking about (and treating) technologies like machine learning and computer vision.
The press release covers all the usual stuff in terms of why a restaurant owner or manager might want to use such a system. You might want to, uh, get more information about “host availability” and “individual wait times” and “customer bounce rates”. It’s telling that the press release compares this new analytical data to e-commerce websites in a quote from the founder/CEO:
E-commerce websites have always had detailed analytics on how customers navigate their sites, but restaurants never have had access to this information in their physical stores. With this product, restaurants can now have access to critical insights on how their stores actually work. This helps them provide better service, operate more efficiently, and reduce overhead.
Of course, one of the other reasons why websites have detailed analytics is to enable digital advertising and… well, that’s not going so great right now. I don’t know the restaurant running business, so I don’t know if “bounce rate” is a term that makes sense there, but I do notice that one term from one area of business is encroaching on another area.
The Wired article also prompted a bunch of thoughts. In it, Suri, the Presto CEO, said that Presto Vision is “not that different from a Fitbit or something like that. It’s basically the same, we would just present the metrics to the managers after the shift.”
So, I suppose, it is a bit like a Fitbit in that it’s something that continuously and persistently monitors your activity, but it is, in ways we should be clear about, unlike a Fitbit because it is being used by one party without the consent of the others (the employees and customers) and, well, let’s just emphasize the whole “present the metrics to the managers after the shift” part. I mean, I can choose to share my activity data with my doctor or my insurance company (I mean, I have the choice. I don’t know how much longer we may have that choice in America).
I suppose it is OK, in a sort of bare-minimum lip-service way, that Presto Vision “doesn’t identify individual diners” but again the telling point in Wired’s writeup is that the system “doesn’t currently employ technology like facial recognition”. I wrote about this in s07e03, about Linksys upgrading their mesh routers through an additional paid service that provides motion detection through your house. Part of the issue here is that there’s nothing stopping Presto from arbitrarily upgrading their system to support post-hoc individual recognition. How do you opt out? I mean, you could just not go to Outback Steakhouse and I have to concede that not going to Outback Steakhouse has been pretty much my default position until this very moment, and it will probably continue to be my default position. Again, I wonder whether we’ll end up with a well-intentioned but badly-legislated and badly implemented outcome like giant:
THIS ESTABLISHMENT USES TECHNOLOGY KNOWN TO THE STATE OF CALIFORNIA TO ADVERSELY AFFECT YOUR PRIVACY.
signs on the front of every single chain establishment. Which you know, might not be a bad thing all things considered.
The thing is, the allure is going to be there. One day you’re happily installing a Presto Vision system because you don’t trust your workers and you want to save money by having an impersonal manager use potentially inaccurate metrics to manage staff (sorry, did I say staff, I meant zero-hours contract workers with no benefits) at arm’s length. The next thing you know, Presto are telling you that you can participate in a revenue share by allowing third parties to use location and presence data of identified individuals, because such data is useful to ad tracking companies! And you know, why not? It’s not like a restaurant is a nice, easily profitable business, so why not take the extra money?
And then before you know it, the whole thing doesn’t collapse at all after there’s a security breach and random people get to find out that you’re the kind of person who regularly goes to Outback Steakhouse because someone has been just like Equifax and used the username/password admin/admin on a database with PII.
FOR STARTERS, your writer shouts, the NBC News story about HP RoboCop, the robot patrolling a park in Huntington Park, California, opens with someone trying to alert the police to a fight by pushing the robot’s emergency alert button. Now, this is a reasonable and understandable course of action because the robot has POLICE written on it and has its own Twitter account, @hprobocop. Now, some of you may be able to guess that pressing the emergency alert button did absolutely nothing at all because the button on the robot (which, again, has POLICE written on it and its own official Twitter account that links to the City and its Police Department) is not in fact connected, in any way, to the Police Department. You may be able to guess this because honestly what else do we expect the tech industry, as an amorphous body, to be capable of in terms of head-slappingly horrific lapses of judgment and understanding.
Now, the situation I’m describing is one where a robot is being used to patrol a park (and has been used since June this year), and, let me point out again, has the word POLICE written on it and the aforementioned Twitter account. It also still has an accessible emergency alert button on it but, “we’re not advertising those features” says the Chief of Police.
I don’t understand the disconnect here. I mean, I get that you’re not advertising the feature and that, I don’t know, you’re not using Twitter to tell people that they can press the emergency alert button on the five-foot robot patrolling the park with the word POLICE on it and an official Twitter account. I get that. But… the button is still there? It is still possible to press it? I do not know what people are supposed to think when they see the word POLICE and a button whose label, according to NBC News, consists of two words: “EMERGENCY” and “ALERT”. Perhaps, alongside the word that says POLICE, there might be some space to say “THIS IS A TEST” and “DOES NOT ACTUALLY CONTACT THE POLICE”.
Because it turns out the robot is a deterrent. It is not a “police” robot. It is the equivalent of that social science study where if there are eyes watching you, you will not do bad things, which, speaking as the father of two young children, is demonstrably untrue. I can be staring them in the goddamn eyes telling them not to do bad things and they will do them anyway, and they will still do them if there are googly eyes in the room after I have left it.
The NBC team do a good job of pointing out that HP RoboCop is “little more than a glorified security camera on wheels” that costs $60,000 to $70,000 a year to lease, about as much as “a Huntington Park police officer with a basic assignment.”
Again, there’s “no signage describing what it does or why it is there”. The Chief of Police again says that this is “because the department does not want to falsely advertise the robot” and I need to point out again that this is a robot that says POLICE on it and also has its own Twitter account, and what is Twitter these days, really, other than a way to advertise things and perhaps to foment civil war and usher in nuclear armageddon? I mean… aren’t you already falsely advertising the robot?
It gets better (by which I mean: it gets worse) because in the NBC article, Knightscope’s executive vice president and chief client officer supports the idea that “the fact that the public is unaware of all the robot’s capabilities is integral to Knightscope robots’ mission to be a physical deterrent to crime”.
So… it’s not a Police robot, then? It’s… an alarm? It’s a sticker on wheels that says THIS AREA IS PATROLLED AND UNDER SURVEILLANCE?
I will just put the thought out there that deterring crime is maybe not the only job of a police force (gosh, the things we could talk about in terms of the job of a police force in America these days), and it is a bit weird, one might think, that the general public would think that a robot with the word POLICE on it would be able, in some way, to render assistance to a person in need.
Hey, do you remember that time the US Marine Corps commissioned a study about whether they could have low-Earth-orbit insertion dropships for rapid deployment? [pdf]
Frank Lantz (Drop7, Universal Paperclips) continues to be a goddamn genius because his new game Hey Robot (via many people), a sort of Taboo but with Alexa/Google, is, well, genius because of how it will get people to interact with and understand voice assistants as well as being fun.
Remember that time people used AI to search possibility space and design chairs? This was an entirely valid use of AI because although people love to design chairs and will probably never stop doing so, all those chairs are really, like, human-y. So TU Delft researchers did a similar thing and designed a “new material by using Artificial Intelligence only” and if you are a space-cadet, the thing they designed the material for is super fun because it’s about packing giant solar sails in a tiny package. As far as I can tell, there is no mention of designing a chair in the article, so if you’re allergic to that kind of thing you should be cool.
Via @foone, a GAO report [pdf] from February 1992, after a failure of the Patriot Missile Defense System led to the death of 28 Americans in Dhahran, Saudi Arabia, during Operation Desert Storm. The Patriot systems were only supposed to be used for short periods of time by mobile units, so when they were used at static locations like barracks, assumptions about their runtime turned lethal: the system’s 24-bit fixed-point approximation of a tenth of a second accumulated error against real clock time, shifting the expected range gate for an incoming projectile. The fix, “reboot the missile system”, is one familiar to people who know about planes (“reboot the plane”) but again, wasn’t because of something like using an unsigned 8 bit integer and overflowing. No, that…
… describes why trains in Switzerland may not have 256 axles. That particular tweet caught my attention not because of Yet Another Overflow breaking out into the Real World (of course they break out into the real world, all software has an effect in the real world) but because of the comment that this might be an example of a problem with code effecting a policy change and, well, I am sure there are many, many, many examples of such a thing happening. (It’s worth reading the thread too, because it turns out “this wasn’t fixed in code [because] the affected signal boxes are based on relays”.)
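Both failures are easy to sketch numerically. Caveat: the constants below (23 fractional bits for the Patriot’s stored tenth-of-a-second, a rough Scud closing speed, an 8-bit axle counter) are my assumptions drawn from the GAO report and common retellings, not either system’s actual code:

```python
# Patriot drift: 0.1 has an infinite binary expansion, so chopping it
# into a 24-bit register (23 fractional bits) loses ~9.5e-8 s per tick.
BITS = 23
stored_tenth = int(0.1 * 2**BITS) / 2**BITS   # chopped, not rounded
error_per_tick = 0.1 - stored_tenth           # ≈ 9.54e-8 seconds

ticks = 100 * 60 * 60 * 10                    # 100 hours of 0.1 s ticks
drift = error_per_tick * ticks                # ≈ 0.34 s after 100 hours
print(f"clock drift after 100 hours: {drift:.2f} s")

# Axle counter: an unsigned 8-bit register counts modulo 256, so a
# 256-axle train counts as zero axles and the track section reads clear.
def axles_counted(n_axles: int) -> int:
    return n_axles % 256                      # 8-bit wraparound

print(axles_counted(255), axles_counted(256)) # 255, then 0
```

A third of a second of drift at a Scud’s roughly 1,700 m/s closing speed shifts the range gate by several hundred meters, which is how the system ends up looking in the wrong place entirely.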
All of which is to say software is hard and do we monkeys really know what we’re doing, and that’s even before we put the word POLICE on something and then get surprised or defensive when people act like a thing is supposed to do what the label says.
Anyway, I am sure there was more but I am very tired and I also have many other things to do!
I hope you’re all well, and please, if you feel like it, send a note, even if it is just to say hi. It helps us remember that we’re actually human beings and not sophisticated simulations. (Or does it?)