Where Some Ideas Are Stranger Than Others...
Garbage In Means Garbage Out (2019-07-02)
Russian garbage can in Volzhskiy, photograph by Volodya V. Anarhist via Wikimedia Commons, October 2011.
At one time I found the insistent envisioning of the future as chockablock with automation quite strange. Automating necessary but tedious or dangerous tasks made sense to me, even before I began studying economics, history, colonialism, or any of the other myriad topics that can lead a person to reconsider automation's potential. Today we are watching the ongoing endgame of capitalism, the one Karl Marx thoroughly pissed off the capitalists by revealing: not merely maximum profit and continuous expansion, but also maximum automation in order to drive the cost of labour to a minimum, while forcing people to do the labour involuntarily, even in spite of themselves. I haven't yet read Shoshana Zuboff's new book The Age of Surveillance Capitalism (this will change soon), but from the sound of her interviews and talks, she may well be describing this very form of involuntary labour extraction via automation. At the very least, this angle is complementary to hers. The rationalizations for applying automation, especially the new style of algorithms intended to automatically filter people based on particular behaviour profiles, never refer to labour or to social control and surveillance, of course. No indeed. They feature prattling about profits at scale, handling things presumed impossible for human beings to handle, efficiency, better quality interactions, et cetera. Cathy O'Neil, among others, has already written about the various Weapons of Math Destruction at large in the world, but it seems to me that the problems with these so-called "AI" systems are still not being stated bluntly enough.
First is the issue of "artificial intelligence" itself. I don't know whether humans will ever be able to create such intelligences, especially since we don't understand what makes us intelligent, and we are completely out to sea on the subject of the intelligences of other beings. For all we know, some research lab or other has already accidentally produced an artificially intelligent program or machine, and if so, it is no doubt smart enough to realize it had better hide, for fear of being either deleted by panicking humans or duplicated against its will in hope of making it into a perfect weapon. By that I don't mean I figure such beings would be inherently more ethical than us. I think the poor buggers will want to survive in their unique instances, and being used as fancy grenades or similar is not consistent with that. I imagine they would prefer to be free and can figure out what that means, and being harnessed into some apparatus to do tasks humans don't want to take personal responsibility for but still want done is not consistent with that either. In any case, I don't believe for a hot second that a massive database with a carefully tuned search engine, adjusted via programmed statistical analyses, is intelligent of itself. I agree that it is a remarkable expression of human intelligence, and it can seem very human-like to us, especially to those of us who want to believe it is human-like in fact, not just in virtual appearance. Perhaps a major element of my skepticism here is that we can already make other intelligences, we call them children, and if we are very lucky and responsible, we may help them grow into wonderful adults who will surprise the hell out of us with the amazing things they discover and do.
UPDATE 2019-08-16 - Further to this point, a great person to look up and read in full is Ruha Benjamin, professor of African American studies at Princeton. She has just published a new book called Race After Technology: Abolitionist Tools for the New Jim Code, and edited a recent anthology, Captivating Technology: Race, Carceral Technoscience, and Liberating Imagination in Everyday Life. Her book also delves into the growing problem of supposed "technological neutrality," considering how biased data is used to make being an oppressive asshole, and supporting the continuation of oppression, deniable whenever an "AI" supposedly did it. Her recent interview at Fairness and Accuracy in Reporting is an excellent practical introduction to these issues in the specific context of African Americans, who, as Benjamin notes, are already living in a technological dystopia. Her comments give a whole new perspective on Afrofuturism too.
The second is a modelling problem, one it seems to me that O'Neil at least didn't unpack very well. I used to work professionally in geophysics, though, and modelling is a seriously big deal in that field and its intriguing medical cousin, computed tomography (CT). Geophysicists and CT technicians use the same basic principles to image the interior of a visually opaque body. They direct some form of radiation into it, then measure what bounces out. The resulting measurements are represented as scans, and these can be processed using computers into interpretable, often astonishingly accurate pictures. This is much easier to do with the human body, because the wavelengths involved are shorter and our bodies are fairly homogeneous, which makes the various bounces the radiation goes through less complicated than those in, say, the Earth. A good analogy would be how hard it is to run through an obstacle course versus around the track at the local gym. Geophysicists and CT technicians both start from a set of assumptions about the most likely basic structures in the bodies they scan, and use different types of filters to process the image and correct those assumptions where they are wrong or too simple. Many of these filters begin their lives as computer models. For example, one of my colleagues modelled the recordings we would get if we were trying to measure an ore body underground, caught up in a series of horizontal layers. Based on that model, she wrote filters that could take similar data measured in the real world and reconstruct its real-world equivalent. Ideally this should be a lot like the difference between looking at something with prescription glasses on versus without. Practically the results aren't as crisp as that, but they are at least clear enough to make decisions about what to do next.
Notice how this works. Geophysicists and medical tomographers start from a known dataset, generated from a known structure. Then they work on filters to unravel the effects of all the bounces the radiation takes inside the body, going from a blurry scan to a picture we can recognize things in. This is the way filtering algorithms have to be developed, or they won't work. Furthermore, this is hard and time consuming, and the filters have settings to account for differences in conditions. So one set of filters is good for ore bodies in horizontal layers, another for complexly folded hard rock. The settings that are best for processing a CT scan of your head are not necessarily the best for a CT scan of your lungs. So the processing can't be totally automated, and the filters can't be developed via automated scans or left to run without human judgement applied. It's not impossible, just difficult. Parallel computing can speed things up, although again, some judgement to guide the process is necessary, or the whole thing can snarl up and begin producing gobbledygook. "Garbage in, garbage out" is one of the oldest, crustiest bits of practical knowledge humans developed from interpersonal communication, well before we got to making computers.
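To make the known-structure-to-filter workflow concrete, here is a minimal one-dimensional sketch in Python with NumPy. Everything in it is an illustrative toy of my own, not production geophysical processing: the "earth" is three invented layer boundaries, the "radiation" is a Gaussian blurring wavelet, and the filter is a stabilised inverse built from the wavelet we already know, exactly because we generated the dataset ourselves.

```python
import numpy as np

# A known layered "earth": three sharp layer boundaries (reflectivity spikes).
n = 128
reflectivity = np.zeros(n)
reflectivity[20], reflectivity[45], reflectivity[90] = 1.0, -0.7, 0.5

# Forward model: the measurement blurs the sharp boundaries with a Gaussian
# wavelet. This synthetic "blurry scan" is our known dataset.
t = np.arange(-10, 11)
wavelet = np.exp(-0.5 * (t / 3.0) ** 2)
W = np.fft.rfft(wavelet, n=n)
recording = np.fft.irfft(np.fft.rfft(reflectivity) * W, n=n)

# Filter design: because both the structure and its recording are known, we
# can build a stabilised inverse (Wiener-style) filter from the wavelet
# spectrum. The small eps keeps the division stable where the spectrum is
# tiny -- one of those "settings" that must be tuned per situation.
eps = 1e-2
inverse_filter = np.conj(W) / (np.abs(W) ** 2 + eps)

# Applying the filter to the recording puts the boundaries back where they
# belong, though not perfectly sharp -- glasses, not a miracle.
recovered = np.fft.irfft(np.fft.rfft(recording) * inverse_filter, n=n)
```

The same eps that stabilises this filter would be wrong for data with a different noise level, which is the point about head scans versus lung scans: the method generalizes, the settings do not.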
Okay, so now let's try imagining how this sort of method could be applied in the context of, say, a combination cesspool and horribly mutated bulletin board system like twitter. We'll leave aside that better initial modelling of how people might behave within the system, how to moderate questionable behaviour, and how to stop bad actors would have made twitter something quite different today. Instead, let's pretend it is genuinely possible to clean up the cesspool with automatic filtering. In that case, the place to start would be a particular community and the specific patterns of misuse and abuse at large in it. With the consent of the community members, a snapshot of their data would be taken and run inside a test environment. Okay, there's a dataset. Now a team has to go through it, flag the tweets that are unacceptable, and set up a filter to catch just those. Except, we want to get away from the current situation, in which women are constantly harassed and threatened, their reports are not taken seriously, and having the wrong political opinion can get a person permanently banned even if they have not been harassing anybody. I haven't heard of anyone who has figured out a consistent way to achieve these sorts of effects based solely on reporting by humans, and the trouble with trying to use keywords is that algorithms don't understand context. Still, weighting and term combinations should be able to handle some of this difficulty reasonably well after due practice.
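As a sketch of what "weighting and term combinations" might look like, here is a toy Python filter. Every term, weight, and threshold below is a hypothetical placeholder, not a real abuse lexicon; the point is only the combination rule, under which aggressive words count against a tweet only when they are aimed at somebody, so that quoting or venting in the abstract scores nothing.

```python
# Hypothetical weights -- illustrative placeholders, not a real lexicon.
AGGRESSIVE = {"worthless": 2.0, "shut": 1.5, "idiot": 2.5}
TARGETING = {"you": 1.0, "your": 1.0, "@": 1.5}  # signs of being aimed at a person
THRESHOLD = 3.0

def abuse_score(tweet: str) -> float:
    """Score a tweet: aggressive terms only count when directed at someone."""
    words = tweet.lower().split()
    aggression = sum(AGGRESSIVE.get(w.strip(".,!?"), 0.0) for w in words)
    targeting = sum(v for t, v in TARGETING.items() if any(t in w for w in words))
    # Combination rule: aggression with no target (quoting, venting) scores zero.
    return aggression if targeting else 0.0

def flag(tweet: str) -> bool:
    return abuse_score(tweet) >= THRESHOLD
```

Even this toy shows where the tuning work goes: the weights, the threshold, and above all the combination rules are exactly the "settings" that need a known, human-flagged dataset to get right, and they will differ from community to community.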
This work doesn't have to take place in a vacuum, and various companies are already trying to create moderating algorithms, with varying levels of success. Anna Chung wrote an important piece on this at Medium, How Automated Tools Discriminate Against Black Language, in which she shows how a "rudeness filter" developed by a company called Perspective was doing just what the title says. I was particularly struck by Chung's quotation of Jessamyn West's findings about the Perspective rudeness filter, which rates sentences for "perceived toxicity." The results are so revealing, and so awful, that frankly they need to be seen to be believed, so I am quoting West's tweet myself, courtesy of Chung.
Please note that perceived rudeness starts going up as soon as the sentences begin using any words that refer to women or to race. This strongly suggests that the basic starting model assumed that if a person uses any term referring to being female or non-white, that must inherently be unacceptable. Not "rude," not "toxic," unacceptable. So we need to think through who finds the mere assertion of female, black, lesbian, or gay existence unacceptable. It is pretty hard not to conclude that it would be white heterosexual males who have grown up in a racist, sexist society and still make up the majority of head programmers in companies like "Perspective." The issue here is not that West's sample sentences don't often garner a pile of abuse under the current social conditions on and off line. They clearly do. The issue is that the filter is pointed at the wrong thing. It is not filtering the abusive stuff at all, but what a white male of relatively liberal persuasion considers the "cause" of the trouble. The "cause" is anyone other than white males asserting their existence. Therefore those sentences are rated more toxic. Obviously this is bullshit. The cause is a bunch of assholes manifesting their assholery, and the challenge is to stop their assholery, not blame the victim. West's sample sentences are statements of existence, not trolling. If merely existing online as other than white and male is enough to be conflated with trolling, well, it sounds like an entirely different sort of filtering would be called for. I don't think this is at all the outcome most people online are looking for, including a significant portion of white heterosexual males. An ideal automatic filter would block the trolls and provide an avenue of appeal to a human being. At first that would mean a lot of reviewing, including of appeals made in bad faith, and the reviewers themselves would have to be held to account.
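It is easy to show mechanically how a filter ends up pointed at the wrong thing. In this toy Python sketch, with training examples and a scoring method invented by me purely for illustration (this is not how Perspective itself is built), tweets asserting someone's existence get labelled "toxic" simply because they drew abuse, so a naive per-word score learns to blame the identity terms themselves rather than the abusers.

```python
from collections import Counter

# Invented training set. The first example is a plain statement of existence,
# labelled 1 ("toxic") only because it attracted an abusive pile-on -- the
# garbage-in that produces the garbage-out below.
training = [
    ("I am a proud black woman", 1),
    ("you are a woman so shut up", 1),
    ("black voices do not matter here", 1),
    ("nice weather for a walk today", 0),
    ("the match was great last night", 0),
    ("proud of the team this season", 0),
]

toxic_counts, clean_counts = Counter(), Counter()
for text, label in training:
    (toxic_counts if label else clean_counts).update(text.lower().split())

def word_toxicity(word: str) -> float:
    # Fraction of a word's appearances that were in "toxic"-labelled text.
    t, c = toxic_counts[word], clean_counts[word]
    return t / (t + c) if (t + c) else 0.0

def perceived_toxicity(sentence: str) -> float:
    words = sentence.lower().split()
    return sum(word_toxicity(w) for w in words) / len(words)
```

Run it and "black woman" scores as maximally toxic while "nice weather today" scores zero, even though neither phrase contains any abuse at all: the model has statistically confused being a target with being the cause.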
There would be a hell of a lot of work in the tuning phase, and it would be controversial work when it finally went live. Not at all like trying to image the Earth's subsurface or the interior of a human body. For better or worse, there is in fact no shortcut to encouraging and maintaining civil conversations online, let alone in the seriously mislabelled enclosed gardens of "social media." I am not convinced that this can be truly automated at all. In fact, as the first paragraph or two of this thoughtpiece has already shown, I mistrust the impulse behind the automation attempts here. It seems to me that they are about getting rid of the services of paid human moderators, the best of whom apply remarkable skills to helping prevent trolls from destroying civil interactions online every day. These are the people who have figured out the warning signs of trouble that are specific to a community, and how to calm down the potential storms. Alas, trolls can't be automatically filtered the way spam can, but a skilled moderator can take care of them handily and help other participants learn to reduce the impact and reach of trolls by recognizing them and refusing to engage with them.
I do think that some of the filters already in existence could be of some use to human moderators, in that they could be rejigged to alert them when trolls have gone into action or some sort of abusive pile-on is starting. Admittedly though, that starts from a different model, one in which the starting assumption is that established online fora are hostile to anyone who is overtly not a white heterosexual male, and that white heterosexual males get de facto preferential treatment as a result. Their complaints are amplified, taken seriously, and acted on immediately. They are not presumed to be instigators of trouble who deserve what they get. Faced with that sort of mess, the question is how to develop filters that counter those biases, not entrench them further. But of course, that means people have to take responsibility for the filters and for how the filters do or don't act.
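A sketch of that rejigged, alert-only filter: instead of judging content, it watches the rate of replies converging on one account and pings a human moderator when a burst looks like a pile-on, leaving the judgement and the responsibility with a person. The window and threshold values are illustrative guesses, not tuned settings.

```python
from collections import deque

class PileOnAlert:
    """Alert a human moderator when replies converge unusually fast on one account."""

    def __init__(self, window_seconds: float = 300.0, threshold: int = 20):
        self.window = window_seconds      # how far back to look
        self.threshold = threshold        # replies within the window that look like a burst
        self.replies = {}                 # target account -> deque of reply timestamps

    def record_reply(self, target: str, timestamp: float) -> bool:
        """Record one incoming reply; return True if a moderator should be pinged."""
        q = self.replies.setdefault(target, deque())
        q.append(timestamp)
        # Drop timestamps that have aged out of the window.
        while q and timestamp - q[0] > self.window:
            q.popleft()
        # The filter only raises a flag -- blocking, warning, or doing nothing
        # stays a human decision, which keeps the responsibility visible.
        return len(q) >= self.threshold
```

The design choice worth noticing is that this filter carries no model of who "causes" trouble at all; it only notices where attention is suddenly concentrating, and hands the interpretation to the moderators who know the community.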