First They Came For The Data

1.2

Welcome back or welcome to arachne 🕸️! This week we’ve got some ramblings on data privacy. Always something to talk about there. Plus some shopping suggestions and why you needed to create a password for this newsletter. Thanks for reading.


🌟 Feature

I cannot quantify for you how much information there is. I mean this in both literal and metaphorical senses.

I cannot tell you how much data there is on this planet. Like the actual amount of gigabytes of encoded information, there is no useful way to put a number on this. New data is born faster than people are. One source I found suggests it could be in the 120 zettabyte range…or 120 trillion gigabytes. Your phone probably has, say, 128 GB in it. That’s 937.5 billion phones.
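If you want to check my math, here is a quick sketch in Python. It assumes decimal units (1 zettabyte = 1 trillion gigabytes) and uses the 120 zettabyte and 128 GB figures from above; the function name is just something I made up for illustration.

```python
# Decimal (SI) units are assumed: 1 ZB = 10**21 bytes and 1 GB = 10**9 bytes,
# so one zettabyte is 10**12 (one trillion) gigabytes.
GB_PER_ZB = 10**12


def phones_needed(total_zettabytes: float, phone_capacity_gb: float) -> float:
    """Return how many phones of a given capacity would hold the data."""
    total_gb = total_zettabytes * GB_PER_ZB
    return total_gb / phone_capacity_gb


if __name__ == "__main__":
    phones = phones_needed(120, 128)
    print(f"{phones / 1e9:.1f} billion phones")  # prints "937.5 billion phones"
```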

I cannot even come up with a meaningful analogy. The amount of gigabytes of information on the planet is somewhere in the realm of the number of red blood cells in six adult human beings. But somehow that doesn’t even seem like a lot. Until you remember that there are around 5 million red blood cells in one microliter of blood.

A large share of that information is unlikely to be available to you at this moment. It sits behind paywalls, it lights up the pixels in military machines, it amasses in troves of corporate emails. Another large share of that information is data you will likely never come in contact with. The depths of YouTube, an awful Netflix show, years and years of audio books that don’t interest you. And then of course there’s the information you will come in contact with, like a Netflix show you like. You’ll stream it and that data will travel along miles of wires to your Roku and turn into colors on your TV.

But then there’s this other pile. What’s in it? Well, here are some examples:

email addresses, phone numbers, names, social security numbers, routing numbers, home addresses, health insurance ID numbers, mothers’ maiden names, credit card numbers, dates of birth, zip codes, shoe sizes

Mountains and mountains of data. All of it willingly offered at checkout, in Google forms, on login pages.

For around 15 years, the internet has been funded by these sorts of identity markers and the online behaviors of individual users that coincide with them. I mean this literally; the internet as we know it today would not exist without the collection of your data, the synthesis of that data, and the recommendation of advertising that comes with it. Entire sites like Facebook, Pinterest, YouTube, Google, Buzzfeed, and on and on and on are advertising companies fronted by other operations.

For a moment, let’s follow the journey of an individual mobile internet user. For the sake of this hypothetical, we are going to assume that they have never used the internet before. They are creating an account for Boogle, a service that provides an internet browser, the world’s most powerful search engine, document collaboration software, email, cloud storage, video entertainment, calendar and task management, video conferencing, a map app, a weather app, and photo library management. Damn, that feels like a lot, maybe someone should break up Boogle or its competitor Tapple. The user opens up the Boogle internet browser and makes a Boogle account:

Name: Testy McTesterson
Email: testy@bmail.com

So they’ve got an account. They see an offer for new users! They can get 1 TB of Boogle Drive storage for a year for free! They click on the offer. The click gets stored and tracked. They set up cloud storage, adding a credit card, and link their photo library to Boogle Photos. Boogle Photos automatically begins looking at Testy’s photos, identifying pets, people, locations, and text. A screenshot in their photos reminds Testy to look up lawn chairs in a new tab. Their search gets stored and tracked.

Their browser asks for Testy’s location. This will allow automatic news and weather generation. Well, that’s convenient, Testy thinks. They hit allow.

So let’s audit some information Testy has given Boogle so far:

Name, Email, computer IP address, interest in cloud storage, purchase of cloud storage, credit card information, 1057 images, search for lawn chairs, exact location.

Testy offers up all of this information, freely and without reservation, and Boogle provides a host of useful technologies that will help Testy be more ✨productive✨.

In the coming weeks, Testy will leave a trail of cookies. Every move they make on the Boogle browser, every email that they send, receive, or open, every video they watch on BooTube, will be collected, stored, analyzed, and sold. Sold to local businesses advertising in Testy’s town, sold to lawn chair companies, sold to Procter & Gamble, Unilever, Pizza Hut, etc.

In the coming months and years, billions of people every single day will feed Boogle more and more information about interests, political affiliations, gender expression, exact locations, credit card charges…

There are two primary antagonists that make this situation bad.

  1. Corporate power. There are many hypothetical reasons why one single corporate entity knowing an enormous amount about you seems bad, but there are real world ones as well. The data sourced by individual companies has led to the disruption of democratic elections around the globe (Facebook), the de-competition of industries from paintball guns to paperbacks (Amazon), and class action lawsuits against the unlawful collection of biometric data (Google).

  2. Independent bad actors. “Hackers,” scammers, fraudsters, and bottom feeders of that ilk. Imagine, for example, you spread out your most valuable stuff between three different homes hundreds of miles apart. If someone wanted to steal it all they’d have to take on a massive coordinated attack, one that addresses the unique locks and security systems on each home. Now imagine all of your stuff is under one roof in one house. We’ll call that house Target ↗️. Bad actors only have to crack that one place for all of your credit card info, your home address, your name, email, phone number, etc.

I want to note here how hackers typically work. There is not, in all likelihood, some random person in a hoodie specifically seeking you out, tracking your keystrokes, following you around the internet, and trying to make purchases on your credit card you won’t notice. Your data sits in massive troves with other people’s information. Despite being safeguarded with high budgets and encryption, it is not invulnerable. Some attacks in the past have been the result of data breaches that give bad actors millions of personal data points. For example, hackers have used enormous data sets of user passwords not necessarily to find your password, but to find the most commonly used passwords. They can run programs that try the highest likelihood passwords on millions of emails and only gain access in a small percentage of them, but that small percentage can yield a lot in ransomware attacks and other bullshit.
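To make that concrete from the defensive side, here is a small Python sketch of why a popular password is a liability. The leaked list is made up for illustration, and the function names are my own; a real check would run against a much larger breach corpus.

```python
from collections import Counter


def most_common_passwords(leaked_passwords: list[str], top_n: int) -> list[str]:
    """Return the top_n most frequently seen passwords in a leaked data set."""
    counts = Counter(leaked_passwords)
    return [password for password, _ in counts.most_common(top_n)]


def is_high_risk(password: str, leaked_passwords: list[str], top_n: int = 2) -> bool:
    """A password is high risk if it is one of the most commonly used ones,
    because those are the first guesses tried against millions of accounts."""
    return password in most_common_passwords(leaked_passwords, top_n)


if __name__ == "__main__":
    # Hypothetical sample standing in for a breach dump.
    sample_leak = ["123456", "password", "123456", "qwerty",
                   "password", "123456", "hunter2"]
    print(most_common_passwords(sample_leak, 2))      # ['123456', 'password']
    print(is_high_risk("password", sample_leak))      # True
    print(is_high_risk("correct horse", sample_leak)) # False
```

The takeaway is that nobody has to target you personally. If your password shows up near the top of a list like this, it gets tried against your account along with everyone else's.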

But why should you care? Isn’t this just the cost of entry to doing anything on the internet? It has become the cost of entry, but it certainly does not have to be.

One reason I am sounding an alarm here about offering up so much of your information to these companies is the reactionary, fascistic behavior of the Republican Party since 2020. Despite insisting they love small government, they have used incredibly powerful tools of the state to create open hostility and violence against queer people, people of color, and women.

Since the overturning of Roe v. Wade, multiple states have signed into law policies essentially eliminating access to abortion. These laws incriminate people seeking out an abortion and doctors who perform them. In some cases, they provide benefits to people who snitch on those people. In a world in which the state and citizens are emboldened to report the private behavior of patients and doctors, the internet becomes a fertile ground for criminalization. Corporate institutions, despite how much they insist that they will not bend the knee to the state, could be forced to give up data. And since our country does not have robust data privacy laws on the books, the nature of the search and seizure of that information could get murky. Independent bad actors could go hunting for data markers of people seeking abortions and their doctors: gender, age, income, and so on. Some apps help users track menstrual cycles, wellbeing, or diet without guarantees of end-to-end encryption. In a world that fast tracked the Patriot Act, it doesn’t actually seem that far fetched to future-scope the invasive possibilities of a theo-fascist American government.

As drag and non-conforming gender expression continue to be criminalized and antagonized, online behavior will come under scrutiny. As you read this, Texas is attempting to put into effect a bill for “child safety” online. It specifically refers to “grooming” as a prohibited behavior that companies like Facebook or Google would need to eradicate from their platforms. As you might know, Republicans have long used the language of grooming as a dog whistle for homophobia and transphobia. Under laws like this, states can not only decree the terms of what is “allowed” on these platforms, but also open the doors for targeted harassment, threats, and violence against queer and gender non-conforming people. In environments like Twitter where private or even deleted data can seem to find its way in front of an audience, this is incredibly dangerous.

There is always money in hate. The consolidation of your information under these pliable corporations can be, and has been, used to stand up tyrants, increase the wealth gap, and enact state, corporate, and individual violence against the oppressed.

I worry that I am not making my point adequately. I’ll just say that. I feel like I have both under- and over-exaggerated the problem. Dabbling in real world examples and extending them to hypotheticals can often be seen as hyperbolic catastrophizing. But I am reminded of the long history of tools fascists will use to accumulate power and single out groups. First they will come for trans and gender non-conforming people’s data, and you might not speak out because you don’t have that data. But then they will come for your data, and there will be no one left to speak for you.

These institutions have a financial incentive to own your existence. If your existence becomes criminalized, they will have a financial incentive to criminalize you too.

What can you do about it? A couple things.

  1. Keep your data incredibly close to your chest. For me, that means going into my Google settings to ensure that my browser, email, and maps data are wiped automatically every three months. This way, I feel more comfortable clicking the “Sign in with Google” button on various sites. I can be more assured that third parties have limited access to information about me. Consider using more private browsers like Safari or DuckDuckGo.

  2. Avoid clicking “Sign in with Google,” or any of the other sign in integrations. I am not very good at this. The convenience is just too much. But this gets at the importance of siloing your information across different servers. Of the available “sign in with” options, Apple likely has the best, most private one.

  3. Get a robust password keeper. I keep multiple password libraries, which can often have discrepancies when I update a password on one and not another. If you heed #2, you’ll need this.

  4. Tread cautiously. It is better to assume your actions and information are being tracked than not. If you are a Chrome user, every move you make is being collected in some way, even if you are only navigating incognito or on a VPN. Safari is slightly less invasive, but different individual sites may track you. Modern browsers and encryption are incredibly good at rooting out sketchy sites, but they often fail to illustrate the sketchiness of the companies that manage them.

  5. Question everything. I often ask myself “Is the value I am providing this company by giving them my information returned to me in a useful way?” If the answer is no, I provide them very little or nothing at all.

  6. Enable two-factor authentication on your accounts whenever possible.

  7. Regularly review and update your privacy settings on social media and other online accounts to limit the amount of personal information that is shared.

  8. Be mindful of the apps you download and the permissions you grant them on your device. Here are some permissions to be aware of:

    1. Location access

    2. Camera access

    3. Microphone access

    4. Contacts access

    5. Calendar access

    6. Email access

    7. Local device access (Bluetooth, etc.)

    8. Health data access

    9. Facial/fingerprint identification access
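For the curious, here is roughly where the six-digit codes behind two-factor authentication (item 6 above) come from. This is a minimal Python sketch of the time-based one-time password scheme described in RFC 6238, which many authenticator apps use. The function names and the sample secret are mine for illustration, and a real service should lean on a vetted library rather than this.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Counter-based one-time password (RFC 4226), using HMAC-SHA1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window.
    offset = digest[-1] & 0x0F
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10**digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password: HOTP where the counter is the number
    of step-second windows since the Unix epoch."""
    return hotp(secret, unix_time // step, digits)


if __name__ == "__main__":
    # Sample secret taken from the RFC test vectors, not a real account secret.
    shared_secret = b"12345678901234567890"
    print(totp(shared_secret, 59))            # 287082
    print(totp(shared_secret, 59, digits=8))  # 94287082
```

Your phone and the service both hold the same secret and both know what time it is, so they can compute the same short-lived code without sending the secret over the wire. A stolen password alone is not enough to get in.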

What can we do about it? We must advocate for good faith data privacy legislation and corporate policy. Illinois is currently the nation’s gold standard in digital privacy legislation ↗️, creating restrictions on the ways companies can retrieve and use your data. More states should adopt policy like this.

We should work towards reconstructing the economic model of the internet, away from advertising and toward direct payment for goods and services. As is said, if you are getting something for free, you are the product. Services that reward this sort of peer-to-peer or direct payment include Patreon, Squarespace, Substack, and OnlyFans.

We should decentralize the internet and break up the large tech firms. Currently, there is a significant movement toward decentralization. We have Elon Musk to thank for that. Social sites like Mastodon and Bluesky work on a different protocol than is typical, spreading out data management over many server nodes. But that’s a story for another 🌟 Feature.

I am certain that I will often return to this subject. I will likely make corrections and updates to this argument in the future, but for now I just wanted to get this conversation started. I haven’t even brought up the ways this data is used for recommendation algorithms, and that’s a whole debacle.


📚 Reading list

This week I want to highlight some recent talk about data.

This week on the Vergecast, the challenges of protecting kids on the internet. You can skip the Xbox part, lol:

Listen on Apple Podcasts.

An article about why digital ads seem so bad:

Why Are You Seeing So Many Bad Digital Ads Now? by Tiffany Hsu at the New York Times


⚡️ Lightning

  • Remember that AI is very good at sounding confident. Engage with chat bots how you might a new intern. They require clear prompting, a little hand holding, and verification.

  • If you’re looking for new toys, I typically check a couple places.

  • There’s a big fight going on in Microsoft’s bid to purchase Activision ↗️. It’s messy and silly, but I am opposed to the merger. Tech consolidation is typically bad for creative laborers.

  • Based on continued reports of negative experiences in the animation ↗️ and visual effects ↗️ industries, I would not be surprised if a large scale unionization effort is formed for these kinds of workers. It’s difficult ground; much of this labor is already outsourced to Canadian, New Zealand, and South Korean firms, but something has gotta give.

  • My favorite app this week: NYT Games.

  • I really want to play with a 15 inch MacBook Air.


📕 Glossary

  • PII

    • Personally identifiable information (PII) is any information that can be used to identify a specific individual. Examples of PII include a person's name, Social Security number, date of birth, home address, phone number, email address, and biometric data, such as fingerprints or facial recognition data.

  • cookies

    • Cookies are small bits of data stored in your browser that allow the browser/sites to recognize you across the internet. They are used to customize experiences, remember you are logged in to certain sites, and target advertising.

  • keystrokes

    • Keystrokes refer to the physical act of pressing a key on a keyboard or other input device, particularly in the context of typing or entering data into a computer. In the context of data privacy, keystrokes can refer to the information that is captured and stored by software or devices as a user types, including passwords and other sensitive data.

  • ransomware

    • Ransomware is a type of malicious software that threatens to publish the victim's data or block access to it unless a ransom is paid. While some types of ransomware simply lock the victim's computer screen, others encrypt the files on the system, making them unusable until the ransom is paid and the files are decrypted. It is a form of cyber extortion that can be used to target both individuals and organizations.

  • end-to-end encryption

    • A process by which information is encoded so that only the sender and the recipient of the data can read or edit it. Applications that use E2EE include Signal, FaceTime, and Facebook Messenger.

  • biometric data

    • Fingerprints, retinal scans, facial recognition. This is all considered biometric data.
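To put the cookies entry above in more concrete terms, here is a small Python sketch of what a site sends to your browser and what your browser sends back. The cookie names and values are made up for illustration; it only uses Python's standard `http.cookies` module.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name: str, value: str, max_age_seconds: int) -> str:
    """Build the value of a Set-Cookie header a site might send to a browser."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age_seconds  # how long the browser keeps it
    cookie[name]["secure"] = True              # only send over HTTPS
    cookie[name]["httponly"] = True            # hide from page scripts
    return cookie[name].OutputString()


def parse_cookie_header(header: str) -> dict[str, str]:
    """Parse the Cookie header a browser sends back on later visits."""
    cookie = SimpleCookie()
    cookie.load(header)
    return {name: morsel.value for name, morsel in cookie.items()}


if __name__ == "__main__":
    # Hypothetical visitor ID kept for 90 days.
    print(build_set_cookie("visitor_id", "abc123", 90 * 24 * 60 * 60))
    print(parse_cookie_header("visitor_id=abc123; theme=dark"))
```

The same little ID coming back on every visit is what lets a site remember that you are logged in, and it is also what lets an advertiser recognize you again.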


☎️ Answers

Got a question relevant to today’s topic:

Why do I need to create an account for your newsletter?

This is a totally fair question. Fwiw, I should point out that I don’t see any passwords or information beyond your email.

This newsletter is currently behind a free subscriber wall in my Squarespace site. As part of the security protocol Squarespace imposes on member areas, they require validation beyond just an email address. I imagine that they do this for creators with a more robust subscriber area in order to build trust with a creator’s users. In those more robust member areas, money is being exchanged, credit card information is being added, and other PII comes into play. The basic protocol in that situation is to at least password protect the account.

This brings up a good point. Never share credit card, bank, or insurance information in a non-logged in state. Companies are quick to remind you of this: they will never solicit this information in informal ways.

One quick way to tell if a site meets certain data security requirements is to look at the URL bar. Most browsers will show a lock icon like this 🔒 but you can go one step further and take a look at the random stuff at the beginning of the URL. If it says “https://” the S is for secure. No S, no secure.
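If you ever want to run that same check in code, here is a tiny Python sketch. The function name and the example URLs are mine. Keep in mind this only tells you the connection is encrypted, not that the company on the other end is trustworthy.

```python
from urllib.parse import urlparse


def uses_https(url: str) -> bool:
    """Return True only if the URL explicitly starts with the https scheme."""
    return urlparse(url).scheme == "https"


if __name__ == "__main__":
    print(uses_https("https://example.com/login"))  # True
    print(uses_https("http://example.com/login"))   # False
```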


That’s it for this week! Thanks for reading and see you in your inbox next Sunday. Much love, Alex.
