Paweł Kowalski for platformOS

How platformOS uses Mixpanel while protecting users' privacy

We at platformOS value our visitors' privacy, time, and money. Because we live by our values, we dig into the technologies we use to make sure they reflect them. Today we will look at how we use Mixpanel without hurting our website's performance or sacrificing our visitors' privacy.

We use Mixpanel to learn where our visitors get confused and where they leave our pages.

Why Mixpanel

Mixpanel is an industry standard for learning what visitors are doing on your site.
Unlike Google Analytics, it does not track everything, only what you tell it to. At least we hoped that would be the case, but the documentation and training materials led us to believe that it actually gathers more data than we needed or were comfortable giving.

Having said that, Mixpanel has a very comprehensive set of tools for digging into your visitors' journey through your website. This is very important for an onboarding process, because you can focus on exactly the step that is the bottleneck and avoid wasting time optimizing the wrong part.

How Mixpanel is usually implemented

Like most third-party services, Mixpanel is usually implemented on the frontend, using JavaScript. This has a lot of consequences, whether you load the script from your own CDN or from Mixpanel's CDN.

  • It has to load: your page becomes bigger (by around 80 KB)
  • It has to be parsed and executed: your page becomes slower, especially on slower (mobile) devices
  • You don't really know what it does (or tracks) unless you inspect the Mixpanel JavaScript client's codebase before installation and after every update
  • According to the documentation, GDPR compliance is opt-in, so it might create additional friction and work
  • Ad blockers block the Mixpanel CDN by default, so there will be gaps in the data. How big? Some sources report gaps of 50%, which is big enough that you cannot rely on client-side measurements.

All of those disadvantages go directly against our values. But Mixpanel is a good enough product to be worth the research, and our great engineers found a way to use it without any JavaScript.

How we use Mixpanel

Sending HTTP requests in platformOS is very simple, and we decided to leverage that in our implementation.

First, we defined an API call that performs the network request:

---
name: mixpanel_create_event
to: https://api.mixpanel.com/track
format: http
request_type: POST
callback: >
  {%- assign response_data = response.body | to_hash -%}
  {% if response_data.error %}
    {%- log response_data.error, type: 'modules/monitoring/mixpanel_create_event' -%}
  {% endif %}
request_headers: >
  {% if data %}
    {
      "Content-Type": "application/x-www-form-urlencoded"
    }
  {% endif %}
---
data={{ data }}

Everything between the --- lines is the definition of the API call; everything below them is the body of the request that will be sent. The callback is executed after the API call has been sent. In our case, it logs the error if the Mixpanel server returned one.
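
For readers who do not work with platformOS, the request described by this definition can be sketched outside the platform. The Python below is only an illustration under stated assumptions, not the platformOS implementation: it assumes the body is a single form field named data that carries the event as JSON, as in the template above. The helper name build_track_request and the placeholder token are hypothetical, and the sketch URL-encodes the JSON, which the template does not show. The request is built but never sent.

```python
import json
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

# Same URL as the `to:` field of the API call definition.
MIXPANEL_TRACK_URL = "https://api.mixpanel.com/track"


def build_track_request(event: dict) -> Request:
    """Build (but do not send) a POST that mirrors the API call definition.

    Hypothetical helper for illustration; not part of platformOS or Mixpanel.
    """
    # One form field, `data`, holding the event as JSON,
    # mirroring the `data={{ data }}` line of the template.
    body = urlencode({"data": json.dumps(event)}).encode("utf-8")
    return Request(
        MIXPANEL_TRACK_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_track_request(
    {"event": "example_event", "properties": {"token": "YOUR_PROJECT_TOKEN"}}
)

# Decode the body again to show what the server would receive.
decoded = json.loads(parse_qs(request.data.decode("utf-8"))["data"][0])
print(request.get_method(), request.full_url)
print(decoded["event"])
```

Passing the returned object to urllib.request.urlopen would actually send it; that step is left out so the sketch can run without network access.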

With the API call defined, we need to execute it whenever necessary, with the proper data. Here we created another abstraction so that we can pass different events from different sources:

{% liquid
  graphql instance = 'modules/monitoring/instance' | dig: 'instance'
  unless data
    assign data = '{}' | parse_json
  endunless
  hash_assign data['distinct_id'] = instance.id
  hash_assign data['instance_id'] = instance.id
  hash_assign data['token'] = "377404cb3e579051250ca9a2b129ea7b"
%}
{% parse_json mixpanel_data %}
{
  "event": "{{ event_name }}",
  "properties": {{ data }}
}
{% endparse_json %}

{% liquid
  graphql r = 'modules/monitoring/api_call', data: mixpanel_data, template: 'modules/monitoring/mixpanel_create_event'
  log r, type: 'monitoring/migration/track_first_deploy'
  return r
%}

To every event we attach some basic data:

  • the public token (required by Mixpanel),
  • the instance ID, so we know which website sent the event.

The partial also takes two variables that are passed down from different parts of the system: event_name, which is just a name that makes events easier to filter in the Mixpanel dashboard, and data, a JSON object that can contain anything we want, including user data such as ID, email, or browser data (extracted from the user agent).
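
The shape of the resulting event may be easier to see outside Liquid. The following Python is a rough sketch of what the partial assembles, not the code platformOS runs: build_event is a hypothetical name, the instance ID and token are placeholders, and unlike the partial it copies data rather than modifying it in place.

```python
import json
from typing import Optional


def build_event(event_name: str, data: Optional[dict], instance_id: str, token: str) -> dict:
    """Mirror the partial: add the shared properties, then wrap them in an event."""
    # Start from the caller's data, or an empty object when none was passed
    # (the partial does the same with its `unless data` default).
    properties = dict(data or {})
    properties["distinct_id"] = instance_id
    properties["instance_id"] = instance_id
    properties["token"] = token
    return {"event": event_name, "properties": properties}


event = build_event(
    "marketplace_install",
    {"marketplace": "products"},
    instance_id="123",            # placeholder; the partial reads it via GraphQL
    token="YOUR_PROJECT_TOKEN",   # placeholder for the public project token
)
print(json.dumps(event, indent=2))
```

Whatever the caller puts in data ends up next to the token and instance ID under properties, which is why the caller controls exactly which user details leave the server.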

Now we can call this piece of code (a partial) anywhere in our application, passing event_name and some data. The simplest example is our marketplace_install event, which is sent every time someone installs our Marketplace Template for the first time.

The event is named marketplace_install to group all those events in one bucket. Our template has different variants (community site, e-commerce), so the data object identifies the variant and tells us what type of marketplace our partners and clients are installing.

{% parse_json data %}
  {
    "marketplace": "products"
  }
{% endparse_json %}

{% liquid
  if context.location.host != 'getmarketplace-qa.staging.gapps.platformos.com'
    function res = 'modules/monitoring/commands/track_event', event_name: 'marketplace_install', data: data
  endif
%} 

Benefits

Using Mixpanel server-side has some consequences:

  • There is no performance penalty at all on the frontend
  • There is no performance penalty on the backend either, because the code can run asynchronously, so visitors browsing the website do not have to wait until the data has been sent to Mixpanel
  • The server only knows as much as the browser tells it, plus whatever a logged-in person provided, if anything
  • Mixpanel does not run any code in the browser or on our servers, so in the event of a hypothetical hack, our clients are safe. You might think this is far-fetched and a non-factor, but history teaches us that a hack or leak is not a matter of "if" but "when", no matter how good a company's security is. That's one of the reasons we try to minimize the data we keep and the data we give away.
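
The point about asynchronous sending can be illustrated with a small, generic sketch. It does not show how platformOS schedules background work; it only demonstrates the idea that a response can be returned while the tracking call is still in flight. The names handle_request and slow_sender, the thread pool, and the placeholder page are all assumptions made for the example.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Tuple

executor = ThreadPoolExecutor(max_workers=2)


def handle_request(send_event: Callable[[dict], None], event: dict) -> Tuple[str, Future]:
    """Queue the tracking call and return the page without waiting for it."""
    future = executor.submit(send_event, event)  # runs on a worker thread
    return "<html>page</html>", future


# Stand-in sender that blocks until released, simulating a slow network call.
release = threading.Event()
sent = []


def slow_sender(event: dict) -> None:
    release.wait()
    sent.append(event)


page, pending = handle_request(slow_sender, {"event": "page_view"})
print(pending.done())      # checked while the sender is still blocked
release.set()              # let the simulated network call finish
pending.result(timeout=5)  # wait for the worker before inspecting the result
print(sent)
```

The page is available as soon as the work is queued, so the time Mixpanel takes to answer never shows up in what the visitor waits for.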

Tracking across multiple sites

When you have multiple sites like we do (Documentation, Marketing page, Partner Portal), you might want to track a visitor's journey across all of them. We had this thought too, but it would mean one thing: identifying individual users.

Long story short, it all comes down to fingerprinting, which is standard practice for most big outlets, ad networks, etc., but is unacceptable to us. Not only does it send a lot of JavaScript down the wire that visitors do not want, but visitors would probably answer "NO" if you asked them whether they want to be fingerprinted, no matter what.

Instead of tracking across multiple sites, we decided to restructure our onboarding flow a little to minimize jumping between different domains and websites. It is more visually consistent and more performant for the visitor as well.

Resources

[Video] What is mixpanel by Danny Lambert

Mixpanel Documentation

platformOS Documentation

platformOS Marketing page

platformOS product template GitHub repository

Read more

If you are interested in more performance-oriented content, follow me and I promise to deliver original, or at least effective, methods of improving your website.
