Cover Photo by Franki Chamaki on Unsplash
Differential privacy is simple at its core. Let’s say I have a data set about people that I want to publish while keeping each person's data private. That’s where differential privacy comes in. It adds random noise to your data so that overall facts about the data set, such as counts and averages, stay close to the originals, while any individual's data stays private. With more complex algorithms, you can tune how close the results stay to the originals.
Okay, let’s get to the details of how we are going to alter this data. We want to start simple, so we’ll code the Random Response mechanism (usually called randomized response). This is the simplest mechanism for differential privacy. For each value, it flips a coin. If it is heads, it keeps the true value. If it is tails, it flips again and returns true for heads and false for tails. As I said before, this is a very simple algorithm, so it only works with binary values: zeros and ones, or true and false.
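To see what the two coin flips do to a single answer, here is a small sketch that lists all four equally likely flip outcomes and counts how often a 1 is reported. The function names reported_value and prob_reports_one are made up for this illustration and are not part of the code built later.

```python
from fractions import Fraction
from itertools import product


def reported_value(true_value, first_flip, second_flip):
    # Heads on the first flip: keep the true value.
    if first_flip == "heads":
        return true_value
    # Tails on the first flip: the second flip alone decides the answer.
    return 1 if second_flip == "heads" else 0


def prob_reports_one(true_value):
    # Enumerate the four equally likely outcomes of two fair coin flips.
    outcomes = list(product(["heads", "tails"], repeat=2))
    hits = sum(reported_value(true_value, f, s) == 1 for f, s in outcomes)
    return Fraction(hits, len(outcomes))


print(prob_reports_one(1))  # 3/4
print(prob_reports_one(0))  # 1/4
```

So a true 1 is reported as 1 three times out of four, and a true 0 is still reported as 1 one time out of four. Nobody looking at a single reported value can be sure what the real one was.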
Our first step is to get a data set for our algorithm to run on. If you are going to use real data, you’ll need to convert it to an array. You can also use this sample data to test if your algorithm is working:
You could also run a for loop that generates random 0s and 1s and appends them to an array.
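A minimal sketch of that loop, assuming NumPy is installed. The sample size of 20 and the fixed seed are arbitrary choices made for this example.

```python
from numpy import random

random.seed(0)  # assumption: fixed seed so the sample is reproducible

data = []
for _ in range(20):  # 20 is an arbitrary sample size
    # randint(2) returns 0 or 1; int() stores a plain Python integer
    data.append(int(random.randint(2)))

print(data)
```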
Now that we have our data, let's create a function called rand_response, which is what we'll call to privatize our data. It takes one parameter, data, for the data we created earlier.
Now, we write a for loop that goes through the array by index, so that we can overwrite each value in place:
for i in range(len(data)):
Then we set a variable, b, to a random number and use if statements to check whether it is 0 or 1. We’ll need NumPy for this, so install it via pip if you haven’t already:
pip install numpy
and now import it into our code:
from numpy import random
We imported NumPy's random module, which handles the random number generation.
Now let’s generate the number:
b = random.randint(2)
This generates a random integer, either 0 or 1.
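Next, we add the if statements that check the flip and change the value accordingly. If b is 0 (heads), we leave data[i], the current element, as it is. If b is 1 (tails), we flip again and overwrite it with the result:
if b == 1:
    b1 = random.randint(2)
    if b1 == 0:
        data[i] = 0
    if b1 == 1:
        data[i] = 1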
And now we’ve finished! You should have something like this:
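Here is one way the pieces fit together, as a sketch rather than the only possible layout. The sample list at the bottom is made-up data for trying the function out.

```python
from numpy import random


def rand_response(data):
    """Privatize a list of 0/1 values in place using two coin flips."""
    for i in range(len(data)):
        b = random.randint(2)  # first flip: 0 = heads, 1 = tails
        if b == 1:
            # Tails: ignore the true value and report a second flip.
            b1 = random.randint(2)
            if b1 == 0:
                data[i] = 0
            if b1 == 1:
                data[i] = 1
        # Heads (b == 0): data[i] keeps its true value.


data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]  # made-up sample values
rand_response(data)
print(data)
```

The function changes the list in place and returns nothing, so keep a copy of the original first if you want to compare the two.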
When you run rand_response(data) and then print data, you should see that some of the values have changed. Now, this isn’t very private, but it is a first step. Now that you know how differential privacy works, try other mechanisms like the Exponential Mechanism and the Laplace Mechanism. If you want to use these algorithms, you can use my Python package, DiffPriv.
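One last check on the claim that overall facts survive the noise. Because both coins are fair, a privatized value is 1 with probability 1/4 + p/2, where p is the true fraction of 1s. That means p can be estimated as 2 × (observed fraction − 1/4). A minimal sketch, where estimate_true_fraction is a hypothetical helper name and the list is made-up privatized data:

```python
def estimate_true_fraction(noisy):
    """Estimate the true fraction of 1s from randomized-response output."""
    observed = sum(noisy) / len(noisy)
    # Invert observed = 1/4 + p/2 to solve for p.
    estimate = 2 * (observed - 0.25)
    # Noise can push the estimate outside [0, 1], so clamp it.
    return min(1.0, max(0.0, estimate))


noisy = [1, 0, 1, 1, 0, 0, 1, 0]  # made-up privatized values
print(estimate_true_fraction(noisy))  # 0.5
```

With only a handful of values the estimate is rough. It gets closer to the true fraction as the data set grows.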