K for moesif

Posted on • Originally published at moesif.com

Secure Proxy for HIPAA-Compliant API Analytics

In HealthTech apps, you often deal with private or health-related data. This requires compliance with regulations such as HIPAA in the United States. These regulations require you to handle sensitive data in a well-defined manner: only specific people may read it, and every access should be logged for later auditing.

To be compliant with HIPAA, technical and administrative safeguards must be implemented both within your company and in your app. The technical safeguards often lead to more complicated software architectures, so it’s a good idea to check that the extra engineering work for HIPAA compliance is actually necessary before embarking on it. You might fall under one of the four cases where it’s safe not to comply with HIPAA.

In a companion article, we explained how to build HIPAA-compliant APIs. In this article, we’ll dig deeper into how you can use a secure proxy to keep your health data secure and your app HIPAA compliant. Full instructions on how to use Moesif's patented secure proxy are given in our docs section.

Storing Health Data Without HIPAA?

If your customer encrypts data before sending it to you, by definition, you don’t know the contents of that data. This leads to plausible deniability. The data you’re storing and processing on your side is worthless to a potential attacker without the encryption key.

This can be done with a secure proxy.

A secure proxy is an API server between you and your customer — but in your customer’s data center. Data leaving your customer’s data center is encrypted and then sent to your API in your own data center.

For encryption, the proxy uses a key that only your customer knows. If someone steals the encrypted data from you, they can't read it without the encryption key from your customer.

How to Build a Secure Proxy?

The first step is to build a server that acts as your own API but can be deployed by your customer on-premise. The clients for your API will then communicate with that proxy instead of sending the requests directly to your API.

Next, you encrypt all client data to the right level before sending it to your actual API. So your proxy needs to work with an encryption key supplied by your customer, one that you never have access to.
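One way the proxy might receive that key is from its own environment at startup, so the customer configures it and it never appears in your code or infrastructure. The sketch below is an assumption for illustration and not part of the example repository: the variable name `PROXY_ENCRYPTION_KEY` and the helper `loadCustomerKey` are hypothetical.

```javascript
// Hypothetical helper: reads a 32-byte key, hex-encoded, from the environment
// of the proxy process. The customer sets the variable on their own host.
function loadCustomerKey(env = process.env) {
  const hex = env.PROXY_ENCRYPTION_KEY; // assumed variable name
  if (typeof hex !== "string" || !/^[0-9a-f]{64}$/i.test(hex)) {
    throw new Error("PROXY_ENCRYPTION_KEY must be 64 hex characters (32 bytes)");
  }
  return Buffer.from(hex, "hex");
}
```

Failing at startup when the key is missing or malformed keeps the proxy from ever forwarding unencrypted or weakly encrypted data by accident.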

The right level of encryption depends on your API:

  • For a straightforward key-value-store API that just stores everything in the request body, it’s good enough to encrypt the whole body before sending it.
  • For anything more complicated, your proxy needs to handle encryption in a more involved manner.

Take Moesif API Analytics. Moesif’s API receives events and counts them. If this API can receive multiple events per request, you would have to encrypt every event name in the request independently and not the whole request body at once — otherwise, the API wouldn't understand the request. But with the event names encrypted, Moesif doesn't know what the event was. Was it health-related? Was it from an online game?
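To make the difference concrete, here is a minimal sketch of field-level encryption for a batched request. The request shape (`events`, `name`, `timestamp`) and the function `encryptEventNames` are assumptions for illustration, not Moesif's actual API, and `encrypt` stands for whatever function the proxy uses with the customer's key.

```javascript
// Encrypts only the event names in a batch and leaves the structure intact,
// so the receiving API can still parse the request and count events.
function encryptEventNames(body, encrypt) {
  return {
    ...body,
    events: body.events.map((event) => ({
      ...event,
      name: encrypt(event.name),
    })),
  };
}

// Stand-in for real encryption, used ONLY to make the example runnable.
// Hex encoding is reversible by anyone and provides no confidentiality.
const fakeEncrypt = (text) => Buffer.from(text, "utf8").toString("hex");

const batch = {
  events: [
    { name: "medication-taken", timestamp: 1 },
    { name: "medication-taken", timestamp: 2 },
  ],
};

// The names are replaced; timestamps and the overall shape are unchanged.
console.log(encryptEventNames(batch, fakeEncrypt));
```

Encrypting the whole body instead would hide the `events` array itself, and the API could no longer tell how many events the request contains.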

Finally, you need to decrypt the data when your customer wants to read it. So the proxy needs a way to show the plaintext data to your customer.

Secure Proxy Example

I’ve created a GitHub repository to illustrate a secure proxy implementation. It contains three parts: an analytics API server, a proxy API server, and a client.

Disclaimer: This is just a fundamental example in which I used a crypto library I didn’t actually audit. If you build your own system, you’ll have to do that part of the research independently!

With that out of the way, let’s look at the three parts of the system.

The Analytics API

I extracted the essential parts of the analytics API from the repository. The API is built with Node.js and the Express framework.

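import express from "express";

const api = express();
api.use(express.json()); // parse JSON request bodies into `body`

// In-memory store of every received event name.
const eventStore = [];

// Record one event per request.
api.post("/event", ({ body }, response) => {
  eventStore.push(body.event);
  response.status(201).json("201 - Created");
});

// Count how often each event name was received.
api.get("/results", (request, response) => {
  const results = eventStore.reduce(
    (counts, event) => {
      if (!counts[event]) counts[event] = 0;
      counts[event]++;
      return counts;
    },
    {}
  );
  response.status(200).json({ results });
});

api.listen(9000);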

There are two endpoints: one receives an event and stores it in an array, and the other calculates how often each event was received and sends those counts back to the client. It’s elementary stuff: data is received, some processing is done, and results are sent to a client.

The Analytics API Client

Next, let’s look at how the client actually uses the analytics API. Again, I extracted the important parts from the repository.

import axios from "axios";

// Send a single event name to the analytics API.
async function logEvent(event) {
  return axios.post(
    "http://localhost:9000/event",
    { event }
  );
}

// Fetch the per-event counts.
async function getResults() {
  const response = await axios.get(
    "http://localhost:9000/results"
  );
  return response.data.results;
}

// Top-level await requires running this file as an ES module.
await logEvent("plaintext-event-a");
await logEvent("plaintext-event-a");
await logEvent("plaintext-event-a");
await logEvent("plaintext-event-b");
await logEvent("plaintext-event-b");

const results = await getResults();

The client sends JSON objects with event strings to the analytics API, which listens on port 9000. At the end, it fetches the results.

Running the Example

If we run this example with logging added, we see the following.

[Client   ] Sending event: plaintext-event-a
[Analytics] Received: plaintext-event-a
[Client   ] Sending event: plaintext-event-a
[Analytics] Received: plaintext-event-a
[Client   ] Sending event: plaintext-event-a
[Analytics] Received: plaintext-event-a
[Client   ] Sending event: plaintext-event-b
[Analytics] Received: plaintext-event-b
[Client   ] Sending event: plaintext-event-b
[Analytics] Received: plaintext-event-b
[Analytics] Sending results:  {
  'plaintext-event-a': 3,
  'plaintext-event-b': 2
}
[Client   ] Results:  {
  'plaintext-event-a': 3,
  'plaintext-event-b': 2
}

The client sends its events to the analytics API, which saves them in its event store. In the end, the client fetches the results, a JSON object that contains the count for each event.

We see that the analytics API gets the plaintext names of our events. The analytics API would have to be HIPAA compliant if the events contained health-related data. The events could track how often a person took a specific medication, for example.

The Secure Proxy API

Let’s see how the secure proxy for this system would look.

Note: In the example project, all systems run on localhost. For the secure proxy to be valid, it must run on-premise in your customer’s data center. And while your own company doesn’t have to be HIPAA compliant, your customer does. But if they’re handling health-related data to start with, they probably already are.

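// Excerpt: `api` is an Express app with JSON body parsing enabled, and
// `cryptoStore` is the example's helper that holds the customer's key.
api.post("/event", async ({ body }, response) => {
  // Encrypt the event name before it leaves the customer's network.
  const event = cryptoStore.encrypt(body.event);
  const res = await axios.post(
    "http://localhost:9000/event",
    { event }
  );
  // Relay the analytics API's status code and JSON body unchanged.
  response.status(res.status).json(res.data);
});

api.get("/results", async (request, response) => {
  const {
    data: { results: encryptedResults },
    status,
  } = await axios.get(
    "http://localhost:9000/results"
  );

  // Map each encrypted event name back to plaintext, keeping its count.
  const results = Object.keys(encryptedResults).reduce(
    (decrypted, encryptedEvent) => {
      const event = cryptoStore.decrypt(encryptedEvent);
      decrypted[event] = encryptedResults[encryptedEvent];
      return decrypted;
    },
    {}
  );

  response
    .status(status)
    .json({ results });
});

api.listen(9999);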

The secure proxy exposes the same two endpoints as the analytics API, and to the client they behave just like the analytics API’s endpoints. The only difference is that the proxy encrypts event names before relaying them to the analytics API and decrypts them in the results it returns.

This way, the secure proxy API behaves as a drop-in replacement for the analytics API. The client isn’t aware of its implementation and can simply send events to the proxy and fetch the results as it did before.
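Because the proxy mirrors the analytics API's routes, switching a client over can be as small as changing the base URL it talks to. The following is a sketch under assumptions: the environment variable `ANALYTICS_BASE_URL` and the helper `endpoint` are hypothetical and not taken from the example repository.

```javascript
// Hypothetical setting: point the client at the proxy (port 9999 in the
// example) instead of the analytics API (port 9000). Nothing else changes.
const ANALYTICS_BASE_URL =
  process.env.ANALYTICS_BASE_URL ?? "http://localhost:9999";

// Builds the full URL for a route such as "/event" or "/results".
function endpoint(path, baseUrl = ANALYTICS_BASE_URL) {
  return new URL(path, baseUrl).toString();
}
```

The client's request code would then call `endpoint("/event")` and `endpoint("/results")` rather than hard-coding a host and port.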

The level of detail I chose for encryption here was the names of the events. The secure proxy passes along HTTP status codes and sums unchanged, and it keeps the JSON structures as it receives them from either the analytics API or the client.

Running the Example

If we run the example with logging added and the proxy in between the client and the analytics API, we see the following output:

[Client   ] Sending event: secret-event-a
[Proxy    ] Received event: secret-event-a
[Analytics] Received event: 23c06d46723cd...
[Client   ] Sending event: secret-event-a
[Proxy    ] Received event: secret-event-a
[Analytics] Received event: 23c06d46723cd...
[Client   ] Sending event: secret-event-a
[Proxy    ] Received event: secret-event-a
[Analytics] Received event: 23c06d46723cd...
[Client   ] Sending event: secret-event-b
[Proxy    ] Received event: secret-event-b
[Analytics] Received event: 604c04290177e...
[Client   ] Sending event: secret-event-b
[Proxy    ] Received event: secret-event-b
[Analytics] Received event: 604c04290177e...
[Analytics] Sending results:  {
  '23c06d46723cd...': 3,
  '604c04290177e...': 2
}
[Client   ] Results:  {
  'secret-event-a': 3,
  'secret-event-b': 2
}

In the logs, we see that the secure proxy acts as a middleman. It receives the events with their plaintext names, but the analytics API receives only an encrypted string. The analytics API doesn’t know if the event name contains a game score or medication intake info.

In the example of API analytics, as with Moesif, the API doesn’t need to know what kind of events it tracks. The only requirement is that all events of the same type have the same name. That's enough to correlate them and calculate the result.
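For that correlation to work, encryption has to be deterministic: the same event name must always produce the same ciphertext, as the repeated `23c06d46723cd...` values in the log show. The sketch below is one possible way to get that property with Node's built-in `crypto` module, written as a standalone CommonJS script. It is an assumption, not the example's actual `cryptoStore`: the factory `createCryptoStore` is hypothetical, and the scheme (AES-256-CBC with an IV derived from the plaintext) is unauthenticated and unaudited. Deterministic encryption also reveals which events are equal and how often they occur, which is exactly what the analytics API needs but is still information an attacker could use.

```javascript
const crypto = require("node:crypto");

// Hypothetical stand-in for the example's cryptoStore. NOT audited.
function createCryptoStore(masterKey) {
  // Derive separate 32-byte keys for encryption and for IV derivation.
  const encKey = crypto.createHmac("sha256", masterKey).update("enc").digest();
  const ivKey = crypto.createHmac("sha256", masterKey).update("iv").digest();

  return {
    encrypt(plaintext) {
      // The IV is derived from the plaintext, so equal inputs yield equal outputs.
      const iv = crypto
        .createHmac("sha256", ivKey)
        .update(plaintext, "utf8")
        .digest()
        .subarray(0, 16);
      const cipher = crypto.createCipheriv("aes-256-cbc", encKey, iv);
      const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
      // Prepend the IV so decrypt() can recover it.
      return Buffer.concat([iv, body]).toString("hex");
    },

    decrypt(hex) {
      const data = Buffer.from(hex, "hex");
      const iv = data.subarray(0, 16);
      const decipher = crypto.createDecipheriv("aes-256-cbc", encKey, iv);
      const body = Buffer.concat([decipher.update(data.subarray(16)), decipher.final()]);
      return body.toString("utf8");
    },
  };
}

// In practice the key comes from the customer; a random one is used here.
const store = createCryptoStore(crypto.randomBytes(32));
const first = store.encrypt("secret-event-a");
const second = store.encrypt("secret-event-a");
console.log(first === second); // true: equal names map to equal ciphertexts
console.log(store.decrypt(first)); // "secret-event-a"
```

With a randomized scheme (a fresh IV per call), every event would look unique to the analytics API and the counts would all be 1.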

Conclusion

If your API doesn’t have to identify the type of data it works with, a secure proxy is a handy solution to circumvent HIPAA compliance.

Your customers bring their own encryption keys and deploy the secure proxy in their own data centers. HIPAA-related requirements are none of your concern anymore since you now have plausible deniability; you never knew what kind of data you were processing since everything was encrypted before it left the customer’s networks. And you never had access to the encryption key.


This article was originally written for the Moesif Blog.
