We need to have a real conversation about engaging with AI tools in the long term. If you simply avoid using these tools, your voice and your perspective don't get represented in the training data. If we leave these systems talking only to the people we disagree with, then all we're going to get is a trained model we disagree with. Is that really what we want? I hate to throw out the term "garbage in, garbage out," but you need to understand that this is exactly the scenario we're living through right now.
Specifically, I want to address the subject of representation. To illustrate by example, let's step back to another moment in history when limited representation set in motion the structure of an entire entertainment medium: videogames.
Let's set the groundwork. Historically, video game stores were staffed by young male game players, review magazines were staffed by young male game players, and marketing pitched video games to boys. Very early games were rudimentary affairs where you might drive tanks, race cars, or shoot guns, often extensions of the kinds of play marketed to young boys (not that girls don't play in this space, but the Sears catalog shows a boy with a joystick and a girl with a doll). As games improved they leaned sci-fi or fantasy (D&D-influenced for sure). When the SNES/Genesis era of consoles hit the mainstream, the dueling edgy marketing campaigns of "Play it Loud" and "Blast Processing" focused even harder on the young male audience. From that audience the foundations of gaming media were built, and when the PlayStation era brought the next explosion of console gaming into homes, magazines staffed largely by fans aspiring to be writers became the norm for the media, and for the most part that media did not mature with the market's growth.
At the peak of this new market we had an incidental echo chamber of twenty-something males writing for, and staffing retail operations catering to, other males from their teens on up. These people became the voice of the industry, and theirs was the voice the retailers heard. There were no women in this chamber, and the rare efforts to produce content for a female audience were typically shunned or ridiculed by the magazines and retail staff, all but assuring their failure.
We ask why women aren't represented in videogames. It's because they weren't at the table when we, as a society, decided what videogames were. We ultimately left half the gaming population feeling ignored and objectified. Do we have to do this again?
Does this sound familiar? We laugh when one of these systems, trained predominantly on one corner of the internet, behaves wildly, but you need to understand that this is exactly what we'll end up with if that corner is all the training data contains. Look at the historical examples of bad results from bad data we've already encountered:
The Amazon recruiting AI incident from 2018, where the model learned a bias from historical hiring data that reflected decades of underrepresentation of women in tech, and reportedly penalized resumes containing the word "women's." The model didn't invent the discrimination. It faithfully reproduced the discrimination already present in the data it learned from, treating the statistically male profile in its dataset as the template for its recruiting decisions.
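The mechanism is worth seeing concretely. Below is a toy sketch, not Amazon's actual system, of how a scorer trained on biased historical outcomes learns to penalize a proxy feature. The resume tokens and hiring labels are entirely made up; the point is that "womens" ends up with a negative weight purely because past rejections co-occurred with it.

```python
import math
from collections import Counter

# Hypothetical historical data: (tokens appearing in a resume, was_hired).
# The history is skewed: "womens" co-occurs only with past rejections.
history = [
    (["python", "lead"], 1),
    (["python", "testing"], 1),
    (["java", "lead"], 1),
    (["python", "womens", "lead"], 0),
    (["java", "womens", "testing"], 0),
    (["testing"], 0),
]

hired, rejected = Counter(), Counter()
for tokens, label in history:
    for t in tokens:
        (hired if label else rejected)[t] += 1

def token_weight(t, smooth=1.0):
    # Laplace-smoothed log-odds of being hired given the token.
    return math.log((hired[t] + smooth) / (rejected[t] + smooth))

# The model "discovers" a penalty nobody explicitly programmed.
print(token_weight("womens"))  # negative: learned from the skewed history
print(token_weight("lead"))    # positive
```

Nothing in this code mentions gender; the penalty emerges from the data alone, which is exactly why these systems reproduce rather than invent discrimination.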
The COMPAS system (reported on by ProPublica in 2016) predicted recidivism risk in ways that systematically flagged Black defendants as higher risk than white defendants with similar criminal histories. It was being used by judges in sentencing and parole decisions in multiple states (and still is). The insidious part is the feedback loop it builds: the communities it flags receive harsher outcomes and more scrutiny, which generates more recorded contact with the justice system, which the data then treats as confirmation that those communities were high risk all along.
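That feedback loop can be sketched in a few lines. This is a toy simulation (not COMPAS itself, and the numbers are invented): two groups with the same true underlying rate, but one starts with more recorded history, and attention is allocated proportionally to past records.

```python
# Two groups with IDENTICAL true rates; group B starts with more records.
true_rate = 0.10                  # same underlying behavior for both
records = {"A": 100, "B": 130}    # biased starting data

for _ in range(5):
    total = sum(records.values())
    for group in records:
        # Scrutiny is allocated in proportion to past records...
        attention = records[group] / total
        # ...and more scrutiny means more of the same behavior gets recorded.
        records[group] += int(1000 * attention * true_rate)

print(records)  # the initial gap has widened, "confirming" the bias
```

After five rounds the absolute gap between the groups has grown well past the starting gap of 30, even though nothing about their actual behavior differs. The flagged group is punished for having been flagged.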
Or take something nearly everyone of working age has dealt with: the evolving issue with ATS platforms (Applicant Tracking Systems, the resume-screening software now standing between most job seekers and a human set of eyes). While these historically haven't been LLM-based, companies are now using models to screen resumes and, in some cases, conduct agent-based interviews. ATS systems illustrate the same problem at industrial scale. Much like the Amazon incident, many of these are industry-wide platforms built on biased historical data; they don't invent discrimination, they systematize it. In one case I read about, an algorithm had quietly learned that the strongest predictors of job success were being named Jared and having played high school lacrosse. A recruiter applying filters like these would be fired, and the company would get sued, but the algorithm just calls it a pattern and the company treats it as an unaccountable black box. If they get called on it, "the computer did it, not our fault" becomes a way to dodge accountability while keeping the status quo (and yes, I've read opinions holding that some companies do this consciously, precisely in order to filter without accountability).
This problem is real. It's happening all around you. If you have strong convictions about not using these tools, I respect that decision, I really do. But I need you to also understand the stakes of not participating in the conversation, of not adding your voice to the model. Choose to be invisible now and you'll remain invisible.