DRAFT: How social media algorithms embed and magnify racism, sexism, and misogynoir

DRAFT! Work in progress!

Please do not forward broadly yet.

Feedback welcome! There's a comment section at the bottom, experimenting with Commento, or you can get in touch with me via email or social networks.

Facebook, Twitter, YouTube, Reddit and other social networks use algorithms to decide what news to show you; which of your friends' posts or tweets you see; what gets automatically blocked for hate speech and which accounts get suspended; what ads you see and how much advertisers pay for them; what groups and events they recommend; which "trending topics" to highlight; what to show if you do a search; and much much more.

Virtually all these algorithms are racist, sexist, and misogynoir.

And when you combine those algorithms with processes, employee demographics and power relations, it all works to reinforce and magnify the kinds of oppression we see throughout society. And not just for social networks, either; books like Safiya Noble's Algorithms of Oppression, Ruha Benjamin's The New Jim Code, Frank Pasquale's The Black Box Society, and Cathy O'Neil's Weapons of Math Destruction paint a picture of how pervasive these problems are.

Of course, algorithms are far from the only source of racism, sexism, and misogynoir in systems like Facebook, Twitter, and Reddit. Becca Lewis' All of YouTube, Not Just the Algorithm, is a Far-Right Propaganda Machine and Siva Vaidhyanathan's Making Sense of the Facebook Menace do a great job of setting the algorithmic issues in a broader system context. Still, algorithms are an important factor.

Looking at all the different ways algorithms can be racist, sexist, and misogynoir, and all the different ways algorithms are used in these systems ... there's a lot to cover. So this post certainly isn't an exhaustive catalog.

Instead, the bulk of this post will be a quick (partial) tour of different ways algorithms can embed these (and other) forms of systemic oppression, a complement to Jessie Daniels' 2018 The Algorithmic Rise of the Alt-Right. At the end, I'll wrap up with some thoughts on what we can do in response.

Intentionally racist, sexist, and misogynoir algorithms

People often talk about algorithmic oppression as an "unintended consequence". Sometimes, though, it's an explicit choice. Automated moderation on Facebook is a great example of this. Facebook's policy is that "white people are racist" is considered hate speech ... but "Black women are <whatever>" is not. So that's what the algorithms implement.

Facebook's support of illegally discriminatory housing and employment ad targeting is another example where the racism was intentional. After ProPublica first reported this in 2016, Facebook claimed that they'd stop ... but they lied. Even after Facebook's 2019 settlement, they're still doing it.

A variation on this is when companies have techniques that could improve the situation but choose not to deploy them. Why Won’t Twitter Treat White Supremacy Like ISIS? Because It Would Mean Banning Some Republican Politicians Too (from 2019), Facebook Struggles to Balance Civility and Growth, and Facebook Executives Shut Down Efforts to Make the Site Less Divisive describe various examples.

And as Kara Swisher says about unconscious bias: after the tenth time it's been pointed out to you, just how unconscious – or unintended – is it? Algorithms of Oppression is a really good exploration of search biases, and there has been a ton of research since then ... and yet, when I did a Twitter search for "impeachment" while I was writing this post, nine of the top ten tweets were from white people (with CNN's Ana Cabrera as the only exception).

"Rich get richer" effects

Twitter's search algorithm factors in how much engagement (likes, retweets, replies) a tweet gets. Unsurprisingly, the more followers you have, the more engagement your tweet's likely to get. White people tend to have more followers than people of color, so it's mostly white people's posts that get shown – which makes it more likely that they'll get liked, retweeted, followed. The rich get richer!

There are a lot of reasons for the disparity in followers. Twitter's more likely to feature white people as recommended users. White people are more likely to have money to promote posts or buy followers. White people are more likely to be able to get their companies to pay to promote their posts. White people are more likely to be verified. Etc etc etc.

Think of follower count as a form of power on social networks; disparities in it tend to mirror or magnify other societal power disparities.

Similarly, guys tend to have more followers than women and non-binary people, and the disparities are greatest at the intersections. The net result: the algorithm loooooves white guys and especially penalizes Black women.

"Rich get richer" effects also come up in Twitter's algorithmic timeline, Facebook's timeline and most-shared posts, which redditors' posts and which subreddits make it to the front page, many recommendation systems ... and lots of other places.
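To make the feedback loop concrete, here's a minimal sketch in Python. It's a toy model, not Twitter's actual ranking: the account names, the starting follower counts, and the rule that each new user only sees the top_k most-followed accounts are all assumptions made up for illustration.

```python
import random


def simulate_ranked_feed(start_followers, rounds=1000, top_k=2, seed=42):
    """Return follower counts after `rounds` of ranked exposure.

    Each round, one hypothetical new user sees only the `top_k` accounts
    with the most followers and follows one of them at random.
    """
    rng = random.Random(seed)
    followers = dict(start_followers)
    for _ in range(rounds):
        # Ranking by follower count is a crude stand-in for
        # engagement-weighted ranking (an assumption of this sketch).
        ranked = sorted(followers, key=followers.get, reverse=True)
        shown = ranked[:top_k]
        followers[rng.choice(shown)] += 1
    return followers


def share_of_top_two(followers):
    """Fraction of all followers held by accounts "a" and "b"."""
    return (followers["a"] + followers["b"]) / sum(followers.values())


# Four hypothetical accounts with only a modest head start for "a" and "b".
start = {"a": 120, "b": 110, "c": 100, "d": 90}
end = simulate_ranked_feed(start)

print("start share of a+b:", round(share_of_top_two(start), 2))
print("end share of a+b:  ", round(share_of_top_two(end), 2))
print("final counts:", end)
```

In this toy setup the two accounts that start slightly ahead go from about 55% of all followers to about 87%, while the other two never gain a single follower. Nothing about the accounts differs except where they started.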

Homophily

Social networks, online and off, tend to be homophilic: people are more likely to connect to, engage with, and amplify people who are similar to them.

On Twitter, white people are generally more likely to follow, like, and retweet other white people. Guys are generally more likely to follow, like, and retweet other guys. The net result: another way the search algorithm loooooves white guys and especially penalizes Black women.

For most people, homophily also accentuates racial and gender biases in their Facebook feed. If you're friends with a lot of white people, and they're mostly liking and sharing stuff from white people ... what kind of racial balance are you seeing? [That's not just a rhetorical question; it can be quite interesting to track this and figure out how to improve your balance.]

And again, this pattern comes up all over the place. Susan C. Herring's Women and children last: the discursive construction of Weblogs (from 2004), Shelley Powers' Guys don’t link (2005), my own Guys talking to guys who talk about guys (2009), Ana-Andreea Stoica et al.’s Algorithmic glass ceiling in social networks: the effects of social recommendations on network diversity (2018), and Nikki Usher et al.’s Twitter Makes It Worse: Political Journalists, Gendered Echo Chambers, and the Amplification of Gender Bias (2018) look at some of the many different examples.
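Here's a rough back-of-the-envelope version of that question, as a small Python sketch. Everything in it is an assumption for illustration: a population split into one majority and one minority group, a single "homophily" number between 0 and 1, and the simplification that the same number governs both who you're friends with and whose posts your friends reshare. Real networks are much messier.

```python
def same_group_prob(group_share, homophily):
    """Chance that a friendship or reshare stays within the actor's group.

    With probability `homophily` the actor picks their own group on
    purpose; otherwise they pick uniformly from the whole population.
    """
    return homophily + (1 - homophily) * group_share


def minority_share_of_feed(majority_share, homophily):
    """Expected fraction of a majority-group user's feed that originates
    from the minority group, in this simplified two-step model."""
    minority_share = 1 - majority_share
    friend_is_majority = same_group_prob(majority_share, homophily)
    friend_is_minority = 1 - friend_is_majority
    # Majority friends reshare minority posts only when they cross groups.
    majority_friend_crosses = 1 - same_group_prob(majority_share, homophily)
    # Minority friends mostly reshare posts from their own group.
    minority_friend_stays = same_group_prob(minority_share, homophily)
    return (friend_is_majority * majority_friend_crosses
            + friend_is_minority * minority_friend_stays)


# Hypothetical population: 60% majority group, 40% minority group.
for h in (0.0, 0.25, 0.5, 0.75):
    share = minority_share_of_feed(majority_share=0.6, homophily=h)
    print(f"homophily={h:.2f}: minority share of feed = {share:.3f}")
```

With no homophily the feed matches the population (40% in this made-up example). As homophily goes up, the minority share of a majority-group user's feed shrinks, because the bias is applied twice: once in who your friends are, and again in what they pass along.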

Biases embedded in the data

Machine learning algorithms are typically "trained" on a dataset, and then applied to real life situations. If the training dataset is biased, then the algorithm will also be biased.

Shirin Ghaffary's The algorithms that detect hate speech online are biased against black people discusses one good example of this in a social network context: biases in the data lead to Google's "Perspective" API misclassifying some African-American English constructs as toxic.

Especially since Joy Buolamwini and Timnit Gebru's 2018 Gender Shades project, there's also been a lot of attention to how biases in image datasets lead to facial recognition systems being much less accurate on Black people than white people. As well as leading to people getting mistakenly arrested, these image technologies are also used by social networks for lower-stakes decisions – for example what kind of text to automatically generate for images and videos to support people using assistive technologies.
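A toy example can show how this happens. The Python sketch below is not how Perspective or any real moderation system works; the tiny "classifier", the placeholder tokens (including dialect_marker, which stands in for any harmless dialect feature), and the made-up training posts are all assumptions for illustration. The only thing that changes between the two runs is whether annotators wrongly labeled harmless posts containing the dialect feature as toxic.

```python
from collections import defaultdict


def make_examples(biased_labels):
    """Made-up training set of (tokens, label); label 1 means 'toxic'.

    When `biased_labels` is True, two harmless posts containing the
    placeholder token "dialect_marker" are labeled toxic anyway.
    """
    mislabel = 1 if biased_labels else 0
    return [
        (["hello", "friend"], 0),
        (["hello", "you"], 0),
        (["friend", "you"], 0),
        (["insult", "you"], 1),
        (["dialect_marker", "insult"], 1),
        (["dialect_marker", "hello"], mislabel),
        (["dialect_marker", "friend"], mislabel),
    ]


def train(examples):
    """For each token, learn the fraction of posts containing it
    that were labeled toxic."""
    toxic = defaultdict(int)
    total = defaultdict(int)
    for tokens, label in examples:
        for tok in set(tokens):
            total[tok] += 1
            toxic[tok] += label
    return {tok: toxic[tok] / total[tok] for tok in total}


def score(model, tokens, default=0.5):
    """Average the learned per-token toxicity rates for a post."""
    rates = [model.get(tok, default) for tok in tokens]
    return sum(rates) / len(rates)


biased_model = train(make_examples(biased_labels=True))
fair_model = train(make_examples(biased_labels=False))

harmless_post = ["dialect_marker", "hello", "friend"]
for name, model in (("biased labels", biased_model), ("fair labels", fair_model)):
    s = score(model, harmless_post)
    print(f"{name}: score={s:.2f}, flagged={s > 0.5}")
```

Same code, same harmless post: trained on the biased labels, the sketch flags it as toxic; trained on the corrected labels, it doesn't.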

There have been a couple of very high-profile Twitter episodes this year featuring well-known AI experts arguing that this shouldn't be considered algorithmic bias, and instead blaming the data. To me that sounds like rationalization and an attempt to avoid taking responsibility. In any case, though, whether you blame the data for reflecting society's bias, or the algorithm for failing to take into account the obvious fact that the data reflects society's bias, the end result's the same.

A first step towards improving the situation is to be explicit about the biases in datasets. Timnit Gebru et al.'s Datasheets for Datasets discusses this in some detail. Why yes, this is the same Timnit Gebru who was recently fired by Google for her work on ethical AI. It's almost like large tech companies don't really care about making their algorithms less biased!

Algorithmic manipulation

"Threat modeling" is a structured approach to looking at security threats. Most social network companies are not very good at it, and so it's often easy for attackers to manipulate their algorithms to their own advantage.

White supremacists exploit this, for example by hijacking YouTube's video recommendation algorithm and using it as a recruitment tool. Disinformation networks also exploit this, for example the fake news sites that manipulated Facebook's "trending topics" algorithm in 2016. And looking at Facebook's top 10 lists, where sites like Breitbart are consistently near the top of the list, there's still a lot of that going on.

Still, it's long past time for social networks to be looking at these kinds of issues.

What can we do in response?

At an individual level, once you understand these kinds of biases, you can change your behavior to help reduce their impact somewhat. True, these are systemic problems, and individual responses can only do so much.

Still, straightforward approaches like following more Black activists, media, experts (especially Black women and non-binary people), liking and resharing their posts and comments, and cutting down on how much you like and reshare posts from white people (especially white guys) can make a big difference in the news you personally consume – and have some influence over what your friends see.

Getting your friends and family to do the same can multiply the impact. If you're part of a non-profit or activism group, get your colleagues and membership involved as well.

For broader change, it's vital to press tech companies to do better – for example by supporting tech worker organizers like the new Alphabet Workers Union as well as groups like MediaJustice, Color of Change, UltraViolet, and the Real Facebook Oversight Board who have taken the lead here. Tech companies have a lot of leverage here, but need encouragement to use it.

For example, algorithms often can be adjusted or overridden to get better results. In the ramp-up to the November election, Facebook introduced more friction into their algorithms to cut down how quickly certain kinds of posts could spread. Changes Facebook made to their news prioritization algorithms temporarily let mainstream news sites dominate the top 10 ... for a day or two, anyhow, before white supremacist sites took back over. Facebook to start policing anti-Black hate speech more aggressively than anti-White comments, from early December, and Facebook Executives Shut Down Efforts to Make the Site Less Divisive discuss several other algorithmic changes. Sometimes the tools are blunter: disabling functionality, banning accounts.

Of course all of these interfere with tech companies' business models, at least to some extent, so they typically prefer not to take these kinds of steps – in fact, as Sydette Harry points out, they'll even claim it's impossible to make straightforward changes. And if they really want to take this seriously, there's a lot more they need to do as well: auditing and fixing existing algorithms, dropping harmful functionality, threat modeling opportunities for abuse, and so on.

Call me a cynic, but they're not likely to do it unless pushed. So, keep pushing.

And from an activism and organizing perspective, it's important to acknowledge these issues and incorporate them into planning. Despite all the biases, Black Lives Matter protestors have done a great job at leveraging social media for organizing. Sarah J. Jackson, Moya Bailey and Brooke Foucault Welles' #HashtagActivism and Sarah Florini's Beyond Hashtags are two good starting points here.