How (not) to track product usage

#privacy #browser #vivaldi #ethics

When it comes to tracking how users engage with products, there are two extremes.

At one end is the idea that your activities should be entirely private, so companies won’t actively monitor how you use a product.

At the other extreme is the approach of monitoring each and every action that you take within a product.

In between? It’s a grey area.

As someone who works for a software company, I can see the appeal of knowing what product features users are actually using, or which version of a feature is more easily understood. But aside from the chance that data can be misinterpreted, there’s also the problem of how to gather usage information while respecting the privacy of users.

A basic check – mostly harmless?

Most companies will want to know how many users their system or product has – for statistical purposes or for financial purposes when making agreements with partners, for example.

For a server-side product such as a webmail service, counting accounts (perhaps checking if they’re active) is usually sufficient.

However, for a product that users download and install, counting users is more difficult. You could simply count downloads and hope it matches the number of installs. But this paints a false picture, since users may install software via third-party channels or via corporate software distribution, or – indeed – they may never run it again after installation.

For truer results, it may be necessary to count users by having the installations notify the vendor when they are installed or running. This requires some kind of identification to prevent reinstallations from being counted as new users and to make sure that repeatedly running the application does not increase the user count. To identify a single installation, an identification token must be kept in the user’s profile, to differentiate them from other users. This identification token is then sent as part of the notification to the vendor.
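To make that mechanism concrete, here is a minimal Python sketch of naive token-based counting. Everything in it is hypothetical: the `install_token` key, the `load_or_create_token` helper and the `VendorCounter` class are illustrative names, not any particular vendor’s implementation, and a real client would send the token over the network rather than call a method directly.

```python
import uuid


def load_or_create_token(profile: dict) -> str:
    """Return the token stored in the user's profile, creating it on first run.

    `profile` stands in for whatever persistent profile storage the
    application uses (hypothetical; a plain dict keeps the sketch runnable).
    """
    if "install_token" not in profile:
        profile["install_token"] = uuid.uuid4().hex
    return profile["install_token"]


class VendorCounter:
    """Hypothetical vendor-side counter that deduplicates notifications by token."""

    def __init__(self) -> None:
        self.seen_tokens: set[str] = set()

    def notify(self, token: str) -> None:
        # Adding an already-known token to a set changes nothing,
        # so repeated runs do not inflate the count.
        self.seen_tokens.add(token)

    @property
    def user_count(self) -> int:
        return len(self.seen_tokens)


# Two installations: the first one is started three times, the second once.
profile_a: dict = {}
profile_b: dict = {}
counter = VendorCounter()
for _ in range(3):
    counter.notify(load_or_create_token(profile_a))
counter.notify(load_or_create_token(profile_b))

print(counter.user_count)  # prints 2
```

Repeated runs, and a reinstallation that keeps the existing profile, reuse the same token, so each installation is counted only once.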

All this makes it possible to tie an installation to a user. The vendor can see how often you run the product, whether you change your IP address, whether you travel to other countries, whether you run the software daily or weekly.

This is all highly useful for a vendor who wants to know how their software is used and where their users are from, but it immediately invades your privacy and that of any user who just wants to use the software, not share their personal business.

When you agree to send back installation statistics, do you fully understand the privacy implications? Do you understand that just by allowing yourself to be counted, you also make it possible for a company to see other details? Because this privacy intrusion didn’t sit well with us, we developed a system for counting users that maintains your privacy.

How feature tracking starts

It is typically in a company’s interest to know where to spend their development resources. New features take time to produce and maintain. Is a recent feature being used? Should it be relocated to help more users find it? Does adding one feature cause other features to be used less? Is one language version performing better than others?

Tracking whether a feature is being used simply sends a ping – a minimal message saying “the feature was used”. This could be anonymised, or it could be tied to the user identifier. Either way, the server that receives the message gets to see that a user from that IP address was using that feature.
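A small Python sketch shows why anonymising the message alone does not hide the sender. The `FeaturePing` structure, the `receive` function and the feature name are assumptions made for illustration, and the IP address comes from a range reserved for documentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeaturePing:
    """Hypothetical ping payload: which feature was used, and optionally by whom."""
    feature: str
    user_token: Optional[str] = None  # None means the ping is "anonymised"


def receive(ping: FeaturePing, remote_ip: str) -> dict:
    """What a hypothetical server is able to record for a single ping.

    The remote IP address is a property of the network connection,
    not of the payload, so it is available either way.
    """
    return {
        "feature": ping.feature,
        "user_token": ping.user_token,
        "ip": remote_ip,
    }


anonymous = receive(FeaturePing("reader_mode"), "203.0.113.7")
identified = receive(FeaturePing("reader_mode", user_token="abc123"), "203.0.113.7")

print(anonymous)
print(identified)
```

In both cases the server can record the feature and the IP address; the only difference is whether the user token comes along as well.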

Progress is not always progress

Feature tracking can quickly become a go-to approach for driving development. Now that developers can see whether you use a feature, they may want to see exactly how you’re using it.

This can be done with laboratory-style testing, using focus groups. However, that can be expensive and does not always represent how the product will be used in the real world. So, some companies turn to feature tracking.

It can get highly detailed, timing how quickly you get through a certain section, checking which buttons you press, checking how you move your mouse, or whether you use a touch screen or keyboard to navigate.

As product development teams’ appetite for data grows, companies might update the privacy policy and the end-user license agreement to allow more feature tracking to take place. And many users help them by blindly clicking “Accept” without realising what they are agreeing to.

As the huge amounts of tracking and profiling information roll in, it is collected in a user-profile database. This is normally anonymised to a degree; it is not stored with the user account tied to it, but each profile is the digital representation of a real person. It may be possible to tie that profile to that person, depending on whether the system links it to a specific user account. If the data were to be exposed, someone with access to other behavioural data may be able to tie that profile to the actual person.

Being a trustworthy company

This collected data is a valuable asset that can be sold to other companies, or advertising agencies, as “big data”. To some companies, this becomes a major source of revenue, while to others, user privacy remains paramount.

However, the reality is that corporate cultures can change over time. One day, even innocently collected feature-usage data can suddenly be seen as a financial gold mine. Frameworks built to improve products now become a privacy-invading money spinner, violating the trust of the users who agreed to tracking to help improve the product they’re using.

As a company grows, it can become more difficult to maintain the line between acceptable feature tracking and unacceptable user-behaviour monitoring. Perhaps the staff that adhered to the original spirit are no longer working on the product. Newer staff may not realise the bounds that they are overstepping. They may not feel that it is wrong to see how quickly a user moves their mouse towards a button, or to check whether that speed correlates with the user having first selected the high contrast mode – essentially leaking information that the user is likely to have a physical disability.

More companies should just say “no”

This is one of the reasons that Vivaldi outright refuses to collect such statistics. It is easy to prevent data collection from escalating to the point of privacy invasion, and ensure that the data can never be leaked or compromised, if it is never collected in the first place. In our experience, it is also much easier to gain and retain your trust.

Even in cases where server-side services collect minimal information for debugging purposes, such as HTTP access logs, companies can and should remove this data as soon as it is no longer needed. This prevents it from becoming a statistical data store ripe for data mining, should the corporate culture shift. Companies should also clearly document the purpose of this collection in their privacy policies, so you, the user, are informed and reassured that nothing will be retained for future use.
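As a rough illustration of that kind of clean-up, the Python sketch below drops log entries older than a fixed retention window. The seven-day window, the `prune_logs` name, the entry format and the sample dates are all assumptions made for the example; in practice, access logs are usually rotated and deleted by the server or a log-rotation tool.

```python
from datetime import datetime, timedelta, timezone

# Assumed retention window; the right value depends on how long
# the logs are actually needed for debugging.
RETENTION = timedelta(days=7)


def prune_logs(entries: list[dict], now: datetime) -> list[dict]:
    """Keep only the entries recorded within the retention window."""
    cutoff = now - RETENTION
    return [entry for entry in entries if entry["time"] >= cutoff]


# Made-up sample data: one old entry and one recent entry.
now = datetime(2024, 1, 15, tzinfo=timezone.utc)
logs = [
    {"time": datetime(2024, 1, 1, tzinfo=timezone.utc), "line": "GET /old"},
    {"time": datetime(2024, 1, 14, tzinfo=timezone.utc), "line": "GET /recent"},
]

kept = prune_logs(logs, now)
print([entry["line"] for entry in kept])  # prints ['GET /recent']
```

Running the clean-up on a schedule, rather than on demand, means old data disappears even when nobody remembers to delete it.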

Data privacy regulation and ethics

Feature tracking may not sound all that threatening, but in many cases, the resulting behavioural profiles can reveal personality traits and, potentially, even medical conditions.

Unless you have specifically signed up for a behavioural-profiling study, you may not realise just how much information is being gathered about the way you use a product.

Regulations may not go far enough to protect you from anonymised data collection. With legal systems slow to respond to the ever-evolving privacy-risk landscape, most nations don’t have sufficient protection for user data in place. GDPR has only recently become established within the EU, and other countries are still working on their equivalents.

Even if we assume that the vendor will always be trustworthy, storing user data must be done in such a way that in the event of a compromised server, it will not fall into untrusted hands.

Listening to users

With all this feature tracking going on, it can become all too easy for companies to rely on statistics to drive development, instead of doing the most important thing: listening to users. Users are the lifeblood of the industry. They are people, real human beings, with desires for the product that won’t show up in a statistic. Perhaps you wanted to use a feature, but couldn’t find it. Making it easier to find would be the right way forward, but instead – looking at the low usage statistics – the unused feature gets pulled, potentially damaging goodwill – and good word of mouth.

Even though direct feedback from users can often be negative – people are much quicker to complain about a problem than to offer praise for a positive experience – we believe it is important to engage, to listen to you and all our users. Apart from being a channel for potential product improvements, we see this as a vital part of community building.

In a privacy policy, which would you find more reassuring: “We do not collect usage statistics,” or “We collect statistics for the following 10 purposes, and we promise not to misuse the data, but we reserve the right to update this privacy policy in the future.”?

Vivaldi views you as a person, rather than a statistic. We prefer to interact with you in a welcoming community, rather than fixate on numbers. If more companies followed Vivaldi’s path of connecting with and listening to users rather than tracking them, privacy would be better respected and protected – and products and services would improve their user experiences overall.

To submit or vote for a feature you’d like to see in the desktop browser, Android browser or our services, head over to the Feature Request Forum.

What’s your take on feature tracking? Something you take for granted? Something you hate? Something you haven’t really thought much about? Have your say in the comments! 👇