I’ve used apps that thought I was a monkey but recognized white people just fine, I’ve seen racist chat bots go on about how much they love Hitler and hate black people, and I’ve seen AI-generated photos that produced racist depictions. What I’m wondering is: is it the developers making the AI racist, or is it the code itself that is racist? I don’t believe that any machine has reached sentience, obviously, but I have no doubt in my mind that all the AI I have experienced is racist. All I ask is why?
Easy one, the code itself is racist bc it was developed by white supremacists who never checked their own racism. The algorithm is just a big dataset that they programmed with certain tagging, but the internet itself is bad and the people who are working on this are usually highly paid white people, so you’re working off a racist/ableist etc. dataset and tagging system. So TikTok tags “ugly” people as something they don’t want to go viral in case of “bullying,” but that of course includes visibly disabled people. And there’s the Twitter issue where it still favors cropping to anything but a black person. YouTube has also made it so queer issues are silenced: videos with even the word “lesbian” get demonetized, even if it’s a lesbian-identified person talking about their positive experiences and there are no sexual mentions.
This also leads to insularity: Twitter can see that people who talk in a certain way only follow other people who talk like that. That means white people literally won’t even see black people recommended to them, and so on.
People in here have done a good job of covering racism in the formal training set. But AI also tends to be fine-tuned on examples the devs have immediately handy: they point it at themselves and their coworkers and double-check whatever they get out of it. That’s why the Google traffic algorithm is like 70% accurate worldwide but 99% accurate on two highways in the Bay Area. And it ends up reflecting hiring biases in these companies, where they’re mostly white, mostly male, and especially not black. So you end up with a black product manager I met who worked on voice recognition for the Xbox, who used his “white guy voice” at work, and who the machine couldn’t understand when he spoke in his natural dialect. Or an Indian friend who worked on the tracking software in the Amazon Go store, and it would glitch out when looking at him because the camera wasn’t calibrated properly for skin as dark as his (and he knows this, but the execs they have to make demos for are white, so it doesn’t get prioritized).
Something that blew my mind when I learned it is that machine-learning algorithms produce programs that nobody really understands beyond a conceptual level. Like if a regular computer program is doing something unexpected, the creators can scrub through the code, find the cause, and fix it - but if your chat bot starts spamming antisemitic and racist phrases, often the only thing you can do is roll it back to a version that didn’t say those things (which of course does nothing to prevent it from re-learning them).
machine-learning algorithms produce programs that nobody really understands beyond a conceptual level
wait what
The current big thing in machine learning is neural networks, which are vaguely based on how neurons interact (but each node in a neural net is much more rudimentary than an actual neuron). Basically these get trained with data and adjustment algorithms that try to make their outputs look more like the known correct answers for the known data sets, and often they’re further trained by having people look at the outputs and say whether they were right or not.
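For anyone who wants to see the “adjust it until the outputs look like the known answers” loop in code, here is a minimal sketch in plain Python. It assumes a single fake neuron and a made-up toy data set (logical OR). The names `train_neuron`, `predict`, and `OR_EXAMPLES` are invented for illustration, and real networks stack huge numbers of these nodes in layers.

```python
import math
import random


def sigmoid(x):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, inputs):
    """One rudimentary 'neuron': a weighted sum pushed through a squashing function."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(examples, steps=5000, learning_rate=0.5, seed=0):
    """Nudge the weights so outputs move toward the known correct answers."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(len(examples[0][0]))]
    bias = 0.0
    for _ in range(steps):
        inputs, target = rng.choice(examples)
        error = predict(weights, bias, inputs) - target  # how wrong was it?
        # Adjustment step: shift each weight against the error it contributed to.
        weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
        bias -= learning_rate * error
    return weights, bias


# Hypothetical toy data: (inputs, known correct answer) pairs for logical OR.
OR_EXAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, bias = train_neuron(OR_EXAMPLES)
for inputs, target in OR_EXAMPLES:
    print(inputs, target, round(predict(weights, bias, inputs), 3))
```

Note that nothing in the loop says what OR means. The numbers in `weights` just drift until the outputs match the examples, so whatever is in the examples is what gets learned.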
Like, imagine a dog: the dog can be taught to do certain things based on certain stimuli by rewarding it with food when it does what you want, and that training can be anything from teaching it tricks, to teaching it useful behavior like herding or guiding someone, to turning it into an erratic weapon that will maul someone at the slightest incitement. You control the teaching process, but the actual internal mechanisms and what’s been learned are entirely inscrutable because they’re all encoded deep into a bafflingly complex web of nodes that we barely understand beyond “they work because they work, and there’s little electrical bits and some chemicals, it’s all real fiddly but it mostly does what it should.”
That’s what modern AI research is, just teaching really stupid fake dogs that live in computers to do useful tricks, which can nonetheless be very impressive since 100% of the fake brain is dedicated to a specific task instead of having to worry about stuff like thermal regulation, breathing, balance, what smells are, etc.
A machine learning model consists of taking all your inputs, applying several pages’ worth of adding them together and multiplying by arbitrary constants, and getting a number out. Machine learning itself consists of methods for rejecting piles of arbitrary constants until you get one that outputs results similar to your training data. Nobody knows what’s going on inside, because it’s a pile of very arbitrary math, chosen automatically because it mostly does the right thing.
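To make the “pile of arbitrary constants” idea concrete, here is a rough sketch, not how any real product is built. It assumes a tiny model with three constants and made-up training data, and it uses plain random nudging to pick the next pile to try (real training usually uses gradients for that, but the keep-or-reject idea is the same). All names here (`model`, `loss`, `search_constants`, `TRAINING_DATA`) are hypothetical.

```python
import random


def model(constants, inputs):
    """Multiply each input by a constant, add them up, add one more constant."""
    *weights, offset = constants
    return sum(w * x for w, x in zip(weights, inputs)) + offset


def loss(constants, data):
    """Average squared gap between the model's outputs and the training answers."""
    return sum((model(constants, x) - y) ** 2 for x, y in data) / len(data)


def search_constants(data, steps=5000, step_size=0.1, seed=0):
    """Try slightly different piles of constants; reject any that do worse."""
    rng = random.Random(seed)
    best = [0.0] * (len(data[0][0]) + 1)
    best_loss = loss(best, data)
    for _ in range(steps):
        candidate = [c + rng.gauss(0, step_size) for c in best]
        candidate_loss = loss(candidate, data)
        if candidate_loss < best_loss:  # keep it only if it matches the data better
            best, best_loss = candidate, candidate_loss
    return best, best_loss


# Made-up training data: the "right answers" happen to follow 2*x1 - 3*x2 + 1,
# but the search is never told that. It only sees the examples.
TRAINING_DATA = [((x1, x2), 2 * x1 - 3 * x2 + 1)
                 for x1 in (-1, 0, 1) for x2 in (-1, 0, 1)]

START_LOSS = loss([0.0, 0.0, 0.0], TRAINING_DATA)
constants, final_loss = search_constants(TRAINING_DATA)
print("constants:", [round(c, 2) for c in constants])
print("loss before:", round(START_LOSS, 3), "after:", round(final_loss, 4))
```

With three constants you can still read off what it learned. Scale that up to millions or billions of constants and you get the “nobody knows what’s going on inside” situation.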
(There are other branches of machine learning that are more insightful, understandable, and explicable. But they’re also more limited, and not the stuff that the last several years of ML hype has been about.)