Whilst Segment makes it easy to quickly start sending events from your app, there are plenty of non-obvious details. In this post, I’m going to share some of the tips and tricks we learnt on our journey towards rock-solid metrics in Mixpanel and Google Analytics.
If you’ve been following along with the series so far, you’ll know I’m using Segment to send data from my mobile app and back end to both Mixpanel and Google Analytics. The first post was an introduction to the topic in general terms. The second introduced the technologies.
In this third instalment, it’s time to get really stuck into some of the nuts and bolts. It’s true that it’s very easy to get started with Segment. Add the SDK, start firing some events, maybe add screen load tracking to a navigation component and, um, done?
Not quite. There are a few things which are either non-obvious or downright confusing. Our app (check it out!) launched in October and, in the time since, we’ve had our share of behavioural data head-scratchers. I’ve raided my notepad and hope that sharing these practical pointers will be helpful to someone.
Mixpanel is really fussy about how users’ identities are communicated to it. Fussy to the extent that Segment has a special section in their documentation to try and help developers get it right.
It’s very useful – but quite tricky – to keep track of users as they transition from anonymous prospect to registered user. Mixpanel (and, by extension, Segment) have a method called Alias() which needs to be called exactly once, immediately after a user registers and immediately prior to calling Identify() with the newly registered user’s information.
It doesn’t sound too tricky if you read the documentation carefully, but it’s still possible to come unstuck. You see, events sent to Segment come with no guarantee of arriving in order. This means that the quick succession of Alias() and Identify() can sometimes (about 5% of the time in our app at one point, according to my observations) arrive reversed. Mixpanel doesn’t handle this well – it creates duplicate records. This is obviously not ideal.
One symptom of this issue is the presence of duplicate users – e.g. email addresses associated with more than one “distinct ID”. I’ve adapted one of Mixpanel’s JQL scripts to show which email addresses are affected. You can grab it here if you like. (And if you’ve not encountered JQL yet, stay tuned for the next post in this series!) If you see duplicate users, it’s possible they were created as a result of your Alias() and Identify() events arriving in the wrong order.
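If you would rather run the same kind of check on exported data than in JQL, the idea can be sketched locally. This is not the JQL script mentioned above, just a rough TypeScript equivalent of the grouping it performs. The `ProfileRow` shape and its field names are assumptions made for illustration, not Mixpanel's export format.

```typescript
// Hypothetical shape of an exported user profile row; field names are assumptions.
interface ProfileRow {
  distinctId: string;
  email: string;
}

// Groups rows by normalised email and returns only the emails that are
// associated with more than one distinct ID.
function findDuplicateEmails(rows: ProfileRow[]): Map<string, string[]> {
  const idsByEmail = new Map<string, Set<string>>();
  for (const row of rows) {
    const email = row.email.trim().toLowerCase();
    const ids = idsByEmail.get(email) ?? new Set<string>();
    ids.add(row.distinctId);
    idsByEmail.set(email, ids);
  }

  const duplicates = new Map<string, string[]>();
  idsByEmail.forEach((ids, email) => {
    if (ids.size > 1) {
      duplicates.set(email, Array.from(ids).sort());
    }
  });
  return duplicates;
}
```

Any email that appears in the result is worth a closer look, although a duplicate on its own does not prove that out-of-order events were the cause.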
Thankfully, this can be mostly mitigated by calling Flush() in between the two calls. By telling the Segment SDK to clear its buffer, we force it to send the contents immediately.
If we do this immediately after calling Alias(), but before calling Identify(), we maximise the chances of the events arriving in the correct order.
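The ordering described above might look something like the following. Treat it as a minimal sketch rather than the real SDK: `AnalyticsClient`, `trackRegistration` and `RecordingClient` are hypothetical names, and the method signatures are assumptions, since the Segment SDKs differ from platform to platform.

```typescript
// Hypothetical minimal client interface. Real Segment SDKs differ per
// platform, so map these names onto whatever your SDK actually exposes.
interface AnalyticsClient {
  alias(newUserId: string): void;
  identify(userId: string, traits: { [key: string]: string }): void;
  flush(): void;
}

// Call exactly once, straight after a successful registration.
function trackRegistration(
  client: AnalyticsClient,
  userId: string,
  traits: { [key: string]: string },
): void {
  client.alias(userId);            // link the anonymous ID to the new user ID
  client.flush();                  // send the alias before queuing anything else
  client.identify(userId, traits); // only now attach the user's details
}

// Recording stub, used here only to show the resulting call order.
class RecordingClient implements AnalyticsClient {
  calls: string[] = [];
  alias(newUserId: string): void { this.calls.push("alias:" + newUserId); }
  identify(userId: string): void { this.calls.push("identify:" + userId); }
  flush(): void { this.calls.push("flush"); }
}
```

If your SDK's flush call is asynchronous, you would presumably want to wait for it to complete before calling identify, otherwise the two events can still end up in the same batch.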
NB: Since Google expressly forbids you from sending it identifiable user information, Segment automatically blocks these events from reaching GA.
Another consequence of Segment events arriving out-of-order is that your analytics will sometimes make it look like a small number of your users are flowing through unusual (or even impossible) paths in your app.
In Google Analytics the effects of this are very apparent when you look at the “Behaviour Flows” report. If you’re affected, you’ll notice that some users seem to deviate from the routes taken by the majority.
In our app, there are a couple of one-way page sequences and we were seeing a small proportion of our users going against the tide. Some of them we rationalised away as users hitting the back button, but others just shouldn’t have been possible.
As with the identity management problem outlined above, the solution seems to be to sprinkle Flush() calls strategically through your code.
The reason I wanted to highlight this symptom, however, is that it was hard to detect in Mixpanel. We aren’t on the enterprise plan (yet), so can’t use the “Flows” feature. Instead, we were modelling key interactions as “Funnels”. These are reports on the traffic through a sequence of events that you can specify. They are strictly unidirectional, so if events fire out of sequence, those users will appear to drop out of your funnel.
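To see why a swapped pair of events looks like a drop-off, here is a deliberately simplified model of a strictly ordered funnel. It is not how Mixpanel computes its reports, only an illustration of the principle, and the event names are made up.

```typescript
// Counts how many funnel steps a user completed, strictly in order.
// An event only counts if it matches the next expected step, so an event
// that is recorded before its predecessor is simply ignored.
function funnelStepsCompleted(steps: string[], events: string[]): number {
  let next = 0;
  for (const event of events) {
    if (next < steps.length && event === steps[next]) {
      next += 1;
    }
  }
  return next;
}

// Hypothetical three-step funnel.
const checkoutSteps = ["Basket Viewed", "Payment Entered", "Order Confirmed"];

// Events recorded in the order they really happened: all 3 steps count.
const completedInOrder = funnelStepsCompleted(checkoutSteps, [
  "Basket Viewed",
  "Payment Entered",
  "Order Confirmed",
]);

// Same user journey, but the last two events were recorded the wrong way
// round: only 2 steps count, so the user appears to abandon the funnel.
const completedSwapped = funnelStepsCompleted(checkoutSteps, [
  "Basket Viewed",
  "Order Confirmed",
  "Payment Entered",
]);
```

The user in the second case finished the journey, yet the funnel reports them as dropping out before the final step, which is exactly the under-reporting described above.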
To my mind, issues like this are a great reason to have Segment pointing at more than one destination. We probably wouldn’t have caught this without looking at Google Analytics. Since the result is under-reporting the completion rates of crucial event sequences, we’re pleased to have understood and resolved this.
I’m not sure if this has always been a thing, but Google Analytics are heavily pushing people towards using Firebase as an intermediary. When you try to create a new property and select “mobile app” as the type, you are directed to Firebase instead of being provided with the usual UA-XXXXXXXX-1 tracking code. This confused the heck out of me!
Anyway, it seemed, for a time, as though I was going to have to install the Firebase SDK in our app in order to connect GA. I’ve not got any beef with Firebase, but it seemed like an unnecessary step. Firebase comes with a lot of features that we have no plans to use, so why would I want to plop their big fat SDK in my app? And why would I want to pay them $25 a month for the privilege? I’m sure it’s great, but it’s not for us.
For a while, after that, I was pursuing the option of configuring Firebase as a destination in Segment. Unfortunately, it became clear that this was only going to be possible for the iOS version of our app. Not cool.
Eventually, I found a workaround. If you create a normal (i.e. website) “property”, it’s possible to add a mobile app “view”. At this point, you have a tracking code, which means you can now connect Segment to it. Once you’ve got your new view, you can delete the default website view. Voila!
Despite Segment’s core promise, you will find yourself writing or changing code to support new destinations. This is inevitable because each destination has its own set of quirks and will be fussy in different ways about different things. Mitigating this somewhat is the fact that (if you’re anything like us) you’ll probably be adding your own changes anyway as you continually refine what events you track.
I’ve already mentioned that Mixpanel is fussy when it comes to identity management. If you think you’ll ever want to send data to Mixpanel, I recommend trying to meet their spec from the outset.
Google Analytics also has a funny requirement: if you want page view events to be accepted (which you probably do), you’ll need to ensure that you send the app’s fully qualified name. I think that this is automatic for native apps, but must be done manually for React Native.
As a general rule of thumb, it’s worth giving the docs a pretty thorough read before you start coding.
If you’re dipping your toes into these services, you’ll almost certainly start on a free tier. These have limits and (in my experience) don’t always make it clear when you exceed them. I can kind of see the reasoning behind this – they don’t want to spam users who aren’t yet committed.
But your data will stop coming through at some point. And when it does, you’ll kick yourself. It’ll likely be at that moment of inflexion, when your app finally gains traction and you’re excitedly watching the numbers climb. It’ll happen at this time not because of Finagle’s Law, but simply because your most interesting time is also going to be when you’ll chew through your allowance the most rapidly.
I’m not going to recommend a tech fix for this one. You just have to be a bit organised. Here are some tips:
- Make a note of what the limits are
- Decide as early as possible whether you’ll eventually want to move to a paid tier
- Regularly (weekly or even daily) re-estimate when you’re projected to hit the quota
- Get authorisation to spend early and know who has the credit card
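The re-estimation step is simple arithmetic, and a back-of-the-envelope version might look like this. The function name and the example figures are invented for illustration; plug in the limits from your own plans.

```typescript
// Estimates how many days remain before a quota is exhausted, assuming
// usage continues at the average daily rate seen so far in this period.
function daysUntilQuotaHit(
  quota: number,
  usedSoFar: number,
  daysElapsed: number,
): number {
  if (daysElapsed <= 0 || usedSoFar <= 0) {
    return Infinity; // no usage rate to extrapolate from yet
  }
  if (usedSoFar >= quota) {
    return 0; // already over the limit
  }
  const dailyRate = usedSoFar / daysElapsed;
  return (quota - usedSoFar) / dailyRate;
}

// Made-up figures: 40,000 of 100,000 events used after 10 days,
// i.e. 4,000 a day with 60,000 remaining.
const daysLeft = daysUntilQuotaHit(100000, 40000, 10);
```

A straight-line projection like this is optimistic for a growing app, because the daily rate is itself climbing. That is the reason for re-running the numbers weekly or even daily rather than once.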
Don’t slip up and miss out on patches of lovely data. Don’t let them hold your metrics hostage. Make a plan, be prepared, and keep on top of it.
The topics in this post were really the impetus for this whole series. It was whilst I was reviewing my notes that I realised they might be of interest to others trying to do what we’ve been doing. Then I got thinking about the other stuff I’ve learnt, which maybe didn’t make its way into a notebook and, well, here we are.
I really hope that these tips are useful to someone out there. The tools are all excellent, but getting the most out of them can require some fiddly nuance. I wish I’d been better prepared when I started out.
In the next post, I’ll be exploring some of the more advanced techniques you can apply to your data within Google Analytics and Mixpanel. Until then, have fun playing with your behavioural data!
Do you have any tips for working with Segment, Mixpanel, or Google Analytics? Please share them in the comments!