Whelp, they got all our data, now what? - A guide, well a lecture first, then a guide.

ronsoak ・ 18 min read

Reminder: All views expressed here are my own and do not represent my employer, Xero.


Remember how we laughed at previous generations for thinking smoking was healthy, that sunburn was harmless, and that knowing more than one language made you dumb?

Well chuckle a bit quieter as it won't be long before people are laughing at us. "You did WHAT with your data!!!!!!!!?"
"You DIDN'T read the terms of service???"
"You LET them track you???"

For we are, and I won't mince words here, the data stupid generation.

We have given away so much of our data to companies who made billions off of it, with absolutely nothing in return. We have sacrificed privacy, safety and in some circumstances our own genetic blueprints, all because we were so oblivious as to what was happening.

And while I will concede that we didn't know what was happening until it was too late, much like Edna with her 1963 Lincoln Continental and 6 pack a day habit, it's something we should be trying to rectify as soon as possible.

So how bad is it?

To quote HIBP:

Our data is leaked, sold, redistributed and abused to our detriment and beyond our control

Let's first talk about the data we knowingly gave away.

So think about every social media account, every newsletter, every loyalty program, and every other account you've set up in the past ten years.

Every tweet, Facebook post, email, picture, video and GIF you put online, you willingly made the property of the company hosting it, to do whatever they wanted with it. You agreed to that in the really really really long Terms of Service you scrolled to the bottom of and accepted.

Now we can discuss how intentionally problematic these sorts of legal documents are another day, but as a surprise to no one, a clear majority of people do not read them, and of those who do, fewer actually understand them.

But the devil is in the details.

It's in Google's Terms and Conditions that they read all of your emails, not just to cater adverts back to you, but to aid internal development (ML anyone?) and other stuff (profit from information sharing).

It was Apple's Privacy Policy that told you that a human may listen to what you say to Siri to "better improve her recognition of pronunciation."

Heck, if you read all the way to the bottom of the iTunes terms and conditions, you'll find that you've agreed to NOT use iTunes in the creation of Nuclear Weapons. Something I think most of us will be fine with.

And to be fair we all knew, to some extent, that the stuff we put up on Facebook was used by them for their own purposes. What I think many people didn't realize was how long we had been doing it for. We were all so accustomed to websites coming and going that it didn't seem to need close scrutiny, but next thing you knew we had willingly fed Twitter five years' worth of our likes, dislikes, political opinions, passing fancies, and humor.

We also didn't realize that while you may only put certain information online, the decade you spent on Facebook allowed them to piece together the missing info like a jigsaw.

What pains me now is that we are filling these websites with information about our kids, who don't get a chance to opt out. By the time they understand the problem, you may have thrown away any chance of them ever having privacy.

The things we willingly do have massive ripple effects. Signing up to a loyalty scheme at a shoe store to receive $10 off your next order makes that shoe company a hell of a lot more than $10. People don't give money away for free.

Your photos are scanned, your videos deconstructed, your emails read, and if you really need a wake up call, these companies are looking at your nudes, Barry. Which actually happened at Snapchat!!!!!!!

Let's talk about the stuff we unknowingly gave away

I talked earlier about Facebook filling in the puzzle of the stuff you didn't post online.

They all do a lot of that, all the time. You see, we leave a lot of footprints all over the world wide web.

So you log in at home, right? You do that every day for a month, and through your ISP they know what country, region and city you are in. Good thing ISPs don't tell them your exact location, right? Well, your ISP won't, but your GPS-enabled phone has just told them your exact longitude and latitude. Your phone has also told them the SSID of your wifi, so when the next person (your flatmate) logs in on that wifi network, Facebook knows what house they are in. One person can betray the information of everyone in the house by being the final piece of the puzzle.
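To make the flatmate problem concrete, here is a minimal Python sketch. It is not how Facebook actually does it (nobody outside the company knows the real pipeline); the `ssid_locations` store, the `report_login` function and the coordinates are all made up for illustration. The point is only that once one device ties a wifi name to a GPS fix, everyone else on that network inherits the location.

```python
from typing import Optional

Coordinates = tuple[float, float]

# Hypothetical store kept by the platform: wifi network name -> coordinates.
ssid_locations: dict[str, Coordinates] = {}


def report_login(user: str, ssid: str, gps: Optional[Coordinates]) -> Optional[Coordinates]:
    """Return the location the platform can attach to this login."""
    if gps is not None:
        # A device with GPS switched on teaches the platform where this network is.
        ssid_locations[ssid] = gps
        return gps
    # No GPS shared: fall back to whatever somebody else already revealed.
    return ssid_locations.get(ssid)


# You log in from your phone with GPS enabled (made-up coordinates).
report_login("you", "FlatWifi", (-41.29, 174.78))

# Your flatmate logs in from a laptop that shares no GPS data at all.
flatmate_location = report_login("flatmate", "FlatWifi", None)
print(flatmate_location)
```

The flatmate never shared a location, yet the lookup returns yours.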

Same at work: Facebook notices that there's this second location you regularly log in from five days a week, and your Benedict Arnold of a phone has told them the exact coordinates of your workplace as well.

And it's not just your home location they are tracking. More accurately, your phone is giving them a map of where you go: what supermarket you go to, what shops you visit, what shops you don't visit, your bar, your gym.

They then use this, not just in advertising, but to enhance their profile of you, and to learn more about how the people in your city operate en masse.

They know when certain streets are busiest, what a store's opening hours are and when it's busiest, and what sort of people are in a city, based purely on mass data collection. And while some of that stuff is useful, they never asked, they tricked us, and now we are suffering the cost of that (more on exactly how we are suffering later).

You don't even have to be logged in. Google can sometimes tell by your search habits that it's you, just give them enough time. In fact, going incognito doesn't hide you from Google. At first it will just look like a new person in your house is looking for Busty Mature Women on the ole Porn Hub, but give them enough time (especially seeing as the activity is happening on the same device every night at 10:55pm....BARRY) and Google knows what ticks your box, even though you used 'Private Browsing.'

Let's talk about the stuff we didn't know they did

Facial Recognition Training

Before I purged my Facebook I downloaded a copy of everything on there. I had uploaded over 2,000 photos to the blue box in the sky.

While a good chunk would have been memes, enough of them contained pictures of people, let's say 300 have humans in them. Then consider that Facebook has over 2.5 billion active users and we all now know how Facebook trained its Facial Recognition algorithm.

And while advancing the field of data science is a cool thing, Facebook has proven themselves willing to sell their services to whoever can pay, regardless of the ethical implications.

Enter stage left, the Chinese Government, who use facial recognition to track every citizen's movements and actions as part of some personality score, and we know exactly who is most likely to purchase that data. It's all fun and games until they use your data against you for unethical reasons. We fucked up.

Cookie Stealing

Did you know that Facebook is tracking the other stuff you do on other websites?
Without Facebook even being open they can track you, through all the embedded 'like' or 'share to Facebook' buttons found on other websites. If you can see a link to Facebook on a website, old Zuckerberg is watching you.
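If you want a feel for why an embedded button is enough, here is a small Python simulation. It assumes the usual third-party embed behaviour: the button is loaded from the social network's own servers, so your browser sends along that network's cookie plus the address of the page you are on. The `serve_like_button` function, the cookie value and the example sites are hypothetical, and this is a sketch of the idea rather than anyone's real tracking code.

```python
from collections import defaultdict

# Hypothetical log kept by the social network: cookie id -> pages it was seen on.
visits_by_cookie: dict[str, list[str]] = defaultdict(list)


def serve_like_button(cookie_id: str, referer: str) -> None:
    """Simulate the network's server answering a request for its button.

    The browser attaches the network's cookie and the address of the page
    that embedded the button, so both arrive without a single click.
    """
    visits_by_cookie[cookie_id].append(referer)


# The same browser loads three unrelated sites that all embed the button.
for page in [
    "https://news.example/politics",
    "https://shoes.example/sale",
    "https://health.example/back-pain",
]:
    serve_like_button("cookie-abc123", page)

print(visits_by_cookie["cookie-abc123"])
```

Three sites that have nothing to do with each other end up in one browsing history, filed under one cookie.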

Stealing your data via the companies they own

These big companies also steal your data by the simple fact that they own other companies.

Those loyalty schemes are often owned by bigger companies, which is why 5 minutes after you sign up for your $5 voucher, your inbox is being flooded.

Back onto hating on the big wigs.

Did you know Facebook owns Instagram, WhatsApp, Oculus and 82 other companies?

Google on the other hand is actually owned by a company called Alphabet, which owns everything Google, Nest, Sidewalk, is about to purchase Fitbit (yes all your health data will go to them), and nearly 300 other companies. At one stage Google bought more than one company a week.

Your data is bought and sold over and over again

Unbeknownst to most of us, the selling and buying of our data is what has made all of these companies filthy rich.

There are Data Enrichment companies who will sit as the middle man and buy your data off of Google, smash it together with what they have from Twitter and sell it to Facebook. One of those companies last year left one of their servers unprotected and the personal information of 1.2 billion people was leaked, which means that company was buying and selling data on over a billion people. You know how unlikely it is that your data wasn't in that leak?

Live Manipulation of pricing / products

We are all aware of surge pricing right? Ubers are more expensive based on demand.

But what if the Uber was more expensive because they knew you could afford it?

That's exactly what online retailers like Amazon do. Because they have all of our demographics, interests, habits and buying history, Amazon can show a higher price to someone they think can afford it, while only offering a sale price to someone they need to work hard to convert.
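As a toy illustration of that idea, here is a Python sketch of a personalised pricing rule. To be loud about it: this is not Amazon's algorithm or anyone else's. The `personalised_price` function, the profile fields and the 10% and 15% adjustments are all invented to show the shape of the thing.

```python
def personalised_price(base_price: float, profile: dict) -> float:
    """Toy rule: charge more to shoppers judged able to pay,
    and discount for shoppers judged hard to convert."""
    price = base_price
    if profile.get("disposable_income") == "high":
        price *= 1.10  # assumed 10% markup, purely illustrative
    if profile.get("likely_to_buy") is False:
        price *= 0.85  # assumed 15% discount to win the sale
    return round(price, 2)


well_off_regular = {"disposable_income": "high", "likely_to_buy": True}
reluctant_browser = {"disposable_income": "low", "likely_to_buy": False}

print(personalised_price(100.0, well_off_regular))
print(personalised_price(100.0, reluctant_browser))
```

Same product, same day, two different prices, and neither shopper can see what the other was offered.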

Same with Banks, the information they have harvested on you, as well as your financial history will be used to make judgements on what to lend you. A practice some insurance companies have started to deploy.

Just imagine if your health insurance went up because they saw that you had eaten at McDonald's???? Is that a world you want to live in?

Using your data to influence your actions

And of course the one that has been all over the news. If Facebook can use what they know about you to dynamically change what sort of content you can and can't see, how do you know that they aren't hiding stuff from you in order to influence your decision making?

We already suspect (know) that Russia influenced the previous American Election via Social media. What if your local government wants to roll out legislation to curb the amount of data Facebook can steal off of you? Can we trust them to not tweak the algorithm in their favor? Showing you more content that puts up arguments against the legislation? All signs point to no.

Let's talk about the breaches

If it's not bad enough that all these websites are harvesting your data, they keep fucking losing it.

According to Have I Been Pwned, the Internet's leading website on compromised user data, over 400 websites have lost user data in breaches, adding up to a whopping NINE POINT FIVE BILLION USER ACCOUNTS.

I will repeat again what I said earlier, the odds of your personal data not having been breached are infinitesimal right now. Your data is out there in an uncontrolled environment, what data exactly is still unknown, it could just be your NeoPets account (yes they got hacked too), but hackers don't need a lot of information to get more from you. Hackers steal and buy your data to enable them to do something called Social Engineering.

Given enough data they can either engineer what your password might be, or use an old password to steal some accounts off you, thus learning more about you, or even ring up technical support call centers and pretend to be you to get access to your account.

Why do they do it?

Primarily it was for Marketing and Feature Usage tracking. Over time it evolved into what it is now.

Marketing

With over a billion users on the internet it can be hard for someone wanting to advertise a product to reach the right audience. A maker of boutique headphones aimed at audiophiles could show an ad to a million people and not get a single sale. To the untrained it's hard to find out who should see your advert and who shouldn't. This is where Facebook / Twitter / Google excel.

Because they have information on you, they break you down into what is known as segments. Segments become groupings that they offer to these people wanting to advertise.

As a 30 year old white male, living in an English speaking, developed western country, who has an interest in technology, my segments probably look something like this:

  • Male
  • Disposable Income
  • Likely to buy tech
  • English speaking

In fact my Facebook download had me as 'Starting Adult Life' which goes to show what years I actively used Facebook.

The platform holder then charges the headphone maker to show their adverts only to the segments that match the headphone maker's most likely customer base.

No one claims that this is a 100% hit rate, which makes this the perfect scam. Facebook might only increase the success rate of the headphone maker's adverts by 5%, especially when the advert might just be a bad advert or the product poorly priced, but to the headphone maker it was better than blind chance, and so they pay for it.
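Here is a rough Python sketch of how segments and targeting fit together. The segment names come from my list above, but the rules that assign them (the `assign_segments` function, the profile fields, the 10,000 threshold) are assumptions I've made up for the example. The real rules are not public.

```python
def assign_segments(profile: dict) -> set[str]:
    """Turn a raw profile into marketing segments using made-up rules."""
    segments = set()
    if profile.get("gender") == "male":
        segments.add("Male")
    if profile.get("income", 0) - profile.get("expenses", 0) > 10_000:
        segments.add("Disposable Income")
    if "technology" in profile.get("interests", []):
        segments.add("Likely to buy tech")
    if profile.get("language") == "en":
        segments.add("English speaking")
    return segments


def audience_for(target: set[str], users: dict[str, dict]) -> list[str]:
    """Return the users who fall into every segment the advertiser asked for."""
    return [name for name, profile in users.items() if target <= assign_segments(profile)]


users = {
    "me": {"gender": "male", "income": 70_000, "expenses": 50_000,
           "interests": ["technology", "gaming"], "language": "en"},
    "edna": {"gender": "female", "income": 30_000, "expenses": 29_000,
             "interests": ["cars"], "language": "en"},
}

# The headphone maker pays to reach only people in both of these segments.
headphone_target = {"Disposable Income", "Likely to buy tech"}
audience = audience_for(headphone_target, users)
print(audience)
```

The advertiser never sees the profiles, only the size of the audience they are buying.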

Feature Tracking

A core part of development in tech is knowing what to develop next. The collection of this data has been, for the longest time, the backbone of development. If a development team wants to know what bug they should fix first, they turn to feature usage data. If they want to know what kind of person is using X product to figure out how to entice Y person to use it: feature usage data. If they want to know if a new feature is being used: again, feature usage data. It's how these companies decide on what to develop. Now this isn't a get out of jail free card. They really only need anonymous data, and a lot more could be done in this space to make sure devs are only seeing the data needed to do their job.
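A minimal Python sketch of what "only seeing the data needed" could look like: count feature usage and throw the identifiers away before anyone on the dev team sees the result. The event format and the `feature_usage_counts` function are hypothetical. Be aware this is the simplest possible version, and very small counts can still point at individuals, so it is a starting point rather than proper anonymisation.

```python
from collections import Counter


def feature_usage_counts(events: list[dict]) -> Counter:
    """Count how often each feature was used, keeping no user identifiers."""
    return Counter(event["feature"] for event in events)


# Hypothetical raw events as they might arrive from an app.
raw_events = [
    {"user_email": "debbie@example.com", "feature": "export_pdf"},
    {"user_email": "barry@example.com", "feature": "export_pdf"},
    {"user_email": "debbie@example.com", "feature": "dark_mode"},
]

usage = feature_usage_counts(raw_events)
print(usage.most_common())
```

The team still learns which feature matters most, without learning who pressed the button.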

What can we do?

We need to be preemptive to save our future data, and we need to clean up the past X years of accounts we have created. It's no good setting up good passwords from here on out if your data is going to get leaked by an account you don't use any more. This is as much an exercise in security as it is in good data protection practice.

Check how badly you've been hacked.

Go to Have I Been Pwned and enter every email you can remember ever using (even work emails). This will give you an indication of how badly compromised you are.

Close old accounts

As a piggy back off of the above, close as many old accounts as you can remember and don't use any more.

Pretend you're covered by GDPR

The GDPR is a regulation passed by the EU that, among other things, gives Europeans the right to be forgotten. For example, before GDPR if you closed your account with Facebook, Facebook didn't delete your data, they kept it and kept using it. Now with the GDPR in place, if requested by someone covered by it, Facebook legally has to delete or scramble the user's data so they can't be identified any more, or risk a fine of up to 4% of their revenue.

Fun fact, most tech companies haven't even implemented GDPR properly as their products were never built to do this sort of thing, not to mention GDPR covers EU citizens wherever they live. Because no one has any way of proving whether someone who lives in Australia is or isn't an EU citizen, they won't challenge you. So once you've closed your account, ask them to remove your data under the GDPR.

Set up a spam email address

For any account you want to keep open, or anything in the future you want to sign up to, have a personal email address for you and a spam email account for everything else (I have what's left of my Facebook tied to a spam email). Give it a generic name like 732643_8324732_824623@gmail.com and give it a good password (nothing similar to anything else you use please).
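If you struggle to come up with a suitably meaningless name and password, you can let a computer do it. Below is a small Python sketch using the standard library's `secrets` module. The function names, the lengths and the `example.com` domain are my own placeholders, so swap in whichever email provider you actually use.

```python
import secrets
import string


def random_alias(domain: str = "example.com", length: int = 20) -> str:
    """Build an email address whose name says nothing about you."""
    alphabet = string.ascii_lowercase + string.digits
    local_part = "".join(secrets.choice(alphabet) for _ in range(length))
    return f"{local_part}@{domain}"


def random_password(length: int = 24) -> str:
    """Build a password unrelated to anything else you use."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


alias = random_alias()
password = random_password()
print(alias)
```

Store the password in a password manager rather than printing it anywhere.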

Mask your email

Some email services will now offer you a masked email. Apple now does this, so you can be debbie_taylor@icloud.com, but Apple can offer you 876545678765_dfdsfu@icloud.com to use with everyone else on the internet. You still get the emails, but it means people can't extract personal information from your email address, and it prevents hackers from matching your email addresses across the internet.

Opt out

Get into the habit of scrolling to the bottom of every spam email and hitting 'unsubscribe' to get yourself removed from those email lists. Not only are they sending you advertising, they also include trackers in the rich content of the advert to see how you interact with it.
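One common kind of tracker is the "tracking pixel": a tiny, invisible image whose only job is to phone home when you open the email. Here is a rough Python sketch that looks for them using the standard library's `html.parser`. The 1x1 size check is a heuristic I'm assuming for the example (real trackers can hide in other ways), and the sample email and URLs are made up.

```python
from html.parser import HTMLParser


class PixelFinder(HTMLParser):
    """Collect the sources of images that are at most 1x1 pixel in size."""

    def __init__(self) -> None:
        super().__init__()
        self.suspects: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        tiny = ("0", "1")
        if attributes.get("width") in tiny and attributes.get("height") in tiny:
            self.suspects.append(attributes.get("src") or "")


def find_tracking_pixels(html: str) -> list[str]:
    finder = PixelFinder()
    finder.feed(html)
    return finder.suspects


sample_email = """
<p>Big sale this weekend!</p>
<img src="https://shop.example/banner.png" width="600" height="200">
<img src="https://track.example/open?id=12345" width="1" height="1">
"""

pixels = find_tracking_pixels(sample_email)
print(pixels)
```

The banner is left alone and the invisible 1x1 image is flagged. Many email clients also let you block remote images outright, which deals with these without any code.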

Reset your advertising ID

Did you know that every software platform you use tracks you with an advertising ID, which helps the marketing people find you faster? I bet most people don't know that.

It can be found on your iPhone, your Android, your Mac and your Windows PC. You can reset it as often as you want, and you can even change some settings to make it harder for them to track you. I have BOTH changed my settings so it's harder for advertisers to track me and set a reminder in my phone to reset my advertising ID every month.


Read what apps can access

Many apps on the iOS and Android app stores access more than you would think, and while both Apple and Android are getting better at showing you what those apps access and giving you the ability to dictate what they can and can't do, it's still worth checking from time to time. The key things I look for are which apps want to know my location, which apps have access to my contacts, and which apps can use my camera.

A good starter guide is here.

Prevent Apps from tracking you and choose what data they can send back to the mother ship

In the same vein as the above, start by making sure only apps that need to use your location are using it and turn it off for everything else. For example my voice memo app has location turned on.

You can also allow or deny 'usage analytics' from being sent back to the device owner. This is standard across many devices and software, and it is pitched as being 'for your benefit', as they can detect bugs and improve the software to your liking. However, in this day and age my preference is to not believe any 'it's for your benefit' rhetoric when I know for a fact that data will either be sold or leaked.

A handy guide for iOS is here.

This advice also applies to the software you use. A lot of apps, web browsers, computer programs, and other internet connected devices in your home will also most likely have an 'opt out of sharing usage analytics' option.

Use privacy focused search engines.

So everything you search in Google can be used against you, not just by Google, but also by the law.

Enter Duck Duck Go, the privacy focused search engine. It doesn't track you and encrypts your search activity so even they can't see what you are doing. They have a door mat outside their office that says COME BACK WITH A WARRANT.

I've been using Duck Duck Go for over a year now and it's been a great decision, as my ads get less and less targeted. It also helps me with my attempt to divorce Google.

Though a word of warning, your searching will get a tad harder. Because Google is always tracking you, they can make searching the internet easy. While they can't predict what you will search, they can make finding the right result easier as they know what to exclude, especially with vague search terms.

When I started using Duck Duck Go I did notice that stuff I used to google using vague terms didn't get the results I wanted, but over time, both I and Duck Duck Go have improved. For example, I live in Wellington, New Zealand. When I google a restaurant, Google will know to show me the one in my city. Because Duck Duck Go doesn't know where I am, I will often get restaurants of the same name in other countries.

Trust me it's worth it.

Use privacy focused Web Browsers / stop using Chrome.

Stop using Google Chrome is the number one answer here. They have monopolized their way into the number one position and with that monopoly they are scraping monumental amounts of data. What makes it worse is that Chrome's code base (Chromium) is also the base for many other web browsers like Microsoft Edge and Opera, and is even the underlying code base for non web browsers like Slack and VSCode. What this means is the Chrome based tracking stuff can be tracking you even if you aren't using Chrome, and it's up to the developers to turn that off.

Well there are two options here.

Brave: Brave browser is a Chrome-based browser, however it actively blocks everything and has been built with that purpose in mind. It's turned off all of Google's back doors and even provides you with a granular, per website, tool called Shield that allows you to block what you want. What shocks me is that YouTube stops working when you block all their trackers. Bit naughty.

Firefox: Firefox is the only non-Chrome-based browser left (other than Safari). Not only is it stopping Chrome from being a true monopoly, it also has a privacy focus similar to Brave. It actively stops trackers and gives you control over what you allow certain websites to do. Firefox were the first people to provide a tool that stopped Facebook from looking at the other tabs you've got open.

Use end-to-end encryption services where possible

Use end-end encryption services where possible

When a company tells you your data is encrypted they often mean it's encrypted from everyone else. This is alright at protecting your data from being intercepted, but should a hacker get access to the back end, like the NINE POINT FIVE BILLION TIMES IT'S HAPPENED BEFORE, the encryption means nothing. It also means a legal agency like the government can get access if they have a warrant. Cough America Cough.

The phrase you want to be on the look out for is 'end to end encryption'. Messaging provider Telegram offers this on their secret chats function, meaning that if you use it, not even Telegram, or the government with a warrant, can have a look.

Do your research though. WhatsApp is encrypted end-to-end, however its parent company Facebook does still have other ways of seeing that data, primarily by seeing it before it gets encrypted. So be warned.

Change your password / use password managers / don't be dumb

Come on, you know the drill.

  • Don't use the same password for everything
  • Don't cycle passwords
  • Don't make them easy, or write them down.
  • Turn on two factor authentication
  • Use a password manager like LastPass
  • Change them often
  • Keep an eye on Have I Been Pwned, and every time you get pwned change all your passwords.
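Have I Been Pwned also runs a Pwned Passwords service that lets you check whether a password has shown up in a breach without sending the password itself. You hash it with SHA-1, send only the first five characters of the hash to `https://api.pwnedpasswords.com/range/`, and compare the rest locally against the list that comes back. The Python below is a minimal sketch of that flow using only the standard library. The function names are mine, and I haven't battle-tested the network call against the live service, so treat it as a starting point and check the official API docs.

```python
import hashlib
import urllib.request


def split_hash(password: str) -> tuple[str, str]:
    """Return the 5 character prefix that gets sent and the suffix that stays local."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_response(suffix: str, body: str) -> int:
    """Look for our suffix in a response made of 'SUFFIX:COUNT' lines."""
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def times_pwned(password: str) -> int:
    """Ask the range endpoint how many times this password appears in breaches."""
    prefix, suffix = split_hash(password)
    url = f"https://api.pwnedpasswords.com/range/{prefix}"
    with urllib.request.urlopen(url, timeout=10) as response:
        return count_in_response(suffix, response.read().decode("utf-8"))
```

Calling `times_pwned("password")` should come back with a very large number. Anything above zero means that password is already in the hands of the people you are trying to keep out.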

Don't use genetic testing companies

In one of the most chilling revelations of a breach, MyHeritage revealed that they had been hacked. It's not just passwords that are out there in the wild for the customers but their genetic information. A future where we could use a genetic imprint in lieu of a password has potentially already been ruined by this sort of thing. I would urge you to resist using any of these services until security gets a bit better.

Read the Ts & Cs

Easier said than done right. You can use this website, Terms of Service: Didn't Read, to get terms and conditions boiled down for easier consumption.

Download your data.

In recent years, most websites give you the ability to download all of your data that they hold on you. I recommend you do this for any major platform you use. When I downloaded all my Facebook data I could even see what guesses they had made about who I was for marketing purposes.

Actively delete old data

In the past year I deleted everything off of Facebook (after downloading it) and deleted all my old tweets. This doesn't remove them from the platform holder's data, but it prevents other sites, like Data Enrichment companies, from scraping them and selling them on.

For Twitter there are some third parties who will bulk delete tweets for you.
For Facebook I used a Chrome Extension that deleted posts for you by simulating the mouse clicks over and over again. That took a few days.





Posted on Mar 9 by:

ronsoak

@ronsoak

Data Analysis Team Lead at Xero in Wellington NZ. All views expressed here are my own.
