How Does Open Source Track Adopters?


I'm a big believer in open source but I'm curious about something:

how do projects track adoption?

For example, how does Elasticsearch know who is downloading or cloning their software and what they're using it for?

Is it something passive, like an ADOPTERS.md file in a GitHub repo, or something active, like a license requiring acknowledgement of use? Or does it not exist at all?

 

It's kind of a million-dollar question, or would be if we weren't doing all this for free in the first place! There are things you can do that give you a better picture of where and how your project is being used, such as:

  • checking referrers to your project page, documentation, etc (GitHub repositories have a "Traffic" item under "Insights", for example)
  • for library code, searching for projects that depend on yours
  • seeing who participates in the community, where issues and contributions are coming from (many people use employee email addresses or talk about where they work)
  • offering consulting services and seeing who takes you up on them

but they are of varying utility depending on the size of the project, the environment, and other factors. For example, some languages lack package managers, and some projects are end-user focused; dependency graphs are unavailable in the former case and useless in the latter.
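As a rough illustration of the "searching for projects that depend on yours" item, here is a minimal Python sketch. It assumes you have already collected the text of some public `requirements.txt` files by whatever means you like; the repo names and the package name `my-lib` are made up for the example, and it only understands simple requirement lines.

```python
import re
from typing import Dict, List, Optional


def normalize(name: str) -> str:
    """Normalize a package name so that My.Lib, my_lib and my-lib compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(line: str) -> Optional[str]:
    """Return the normalized package name on one requirements line, or None."""
    line = line.split("#", 1)[0].strip()  # drop trailing comments
    if not line or line.startswith("-"):  # skip blanks and pip options like "-r other.txt"
        return None
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return normalize(match.group(0)) if match else None


def find_dependents(manifests: Dict[str, str], package: str) -> List[str]:
    """Return the sorted names of repos whose manifest lists `package`."""
    target = normalize(package)
    return sorted(
        repo
        for repo, text in manifests.items()
        if any(requirement_name(line) == target for line in text.splitlines())
    )


# Hypothetical data: repo name -> contents of its requirements.txt
manifests = {
    "acme/web": "requests>=2.0\nmy_lib==1.2  # pinned\n",
    "acme/cli": "click\n",
    "bob/tool": "# my-lib is not used\nMy.Lib\n",
}
dependents = find_dependents(manifests, "my-lib")
```

As noted above, this only ever sees public manifests, and it is useless for end-user tools that nobody lists as a dependency.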

 

I like the idea of trying to stitch those sources together somehow because they all definitely provide some sort of insight into project activity/usage.

 

Here are some sources of data that I use:

The main problem is that you will only be able to see the public ones. You will have no idea about who is using it in private projects.

 

Not bad and yeah, that is the problem. Getting insight into how it's being used privately would be pretty valuable.

 

I mean...there's kind of a reason people have private repos.

Trying to sniff out information about how private repositories use your project is a breach of trust and over-stepping your boundaries.
If you really want information, be upfront about it and ask your users; if they want to give you the information, they will.

That's true, I'm just curious as to how we could even identify users if all they do is download and use the software.

One thing that you could do is implant a sort of "fingerprint" in your software and then actively go looking for it.

For example, if you're distributing a JS library, you could put a UUID as a string constant somewhere in your code, or a series of functions that are called but have no real purpose other than to form a recognizable pattern in your bundled code.

Then you could write a crawler that looks for your fingerprint in the resources that webapps pull. If you have a match, you also have the domain, i.e. the identity of the user.
Change the fingerprint for every new version you release, and you'll also have metrics on adoption rate and which user uses what version.
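To make the idea concrete, here is a minimal Python sketch of just the fingerprint part: one stable UUID per release, plus a check for which fingerprint shows up in a piece of bundled text. The namespace string, the version numbers, and the function names are all hypothetical, and the crawler itself is left out.

```python
import uuid
from typing import List, Optional

# Hypothetical namespace for the library; any fixed UUID would do,
# as long as it never changes between releases.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example-lib.invalid")


def fingerprint(version: str) -> str:
    """Derive a stable, version-specific UUID string to embed as a constant."""
    return str(uuid.uuid5(NAMESPACE, version))


def detect_version(bundle_text: str, known_versions: List[str]) -> Optional[str]:
    """Return the first known version whose fingerprint appears in the text."""
    for version in known_versions:
        if fingerprint(version) in bundle_text:
            return version
    return None


# Simulate a minified bundle that happens to ship release 1.1.0.
bundle = 'var a=1;var _fp="' + fingerprint("1.1.0") + '";'
found = detect_version(bundle, ["1.0.0", "1.1.0"])
```

Using `uuid5` rather than a random UUID means the fingerprint for a given version can always be recomputed instead of being stored somewhere.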

This would be super shady behavior though

I agree; if there were a way to have the user opt-in to the tracking that would be the open way of doing it.

 

This is something I've been wondering about myself. As the developer of Orchid, I'm very interested to know how many people are using it, and I currently estimate usage from several sources. Each one can be useful, but each also has its own drawbacks and may be skewed in its own way (this is purely speculation; I really don't have any hard evidence for any of it). My main metrics are:

  • Google Analytics pageviews on docs site. Specifically, I'm interested in which pages people are viewing, because that shows me people are actually looking into how to use it rather than just viewing the landing page and leaving. However, these numbers may be skewed from web crawlers.
  • Bintray has statistics on which versions of Orchid's JARs are being downloaded over time. I'm sure these numbers get inflated because of CI, which may not be caching the downloaded jars between runs, and would pull them fresh each build.
  • GitHub stars and forks are nice but are far from representative of project usage. I keep realizing that I've been using certain projects for years, and that they are indispensable tools in my toolbelt, without ever having given them a star.
  • Probably the most useful metric to me right now is activity on GitHub. People opening issues for bug reports, suggestions, general usage questions, etc. all show me that they are actively using the project in some manner. I also like to search for projects which include the Orchid dependency, although this only shows me the public projects.

That said, I've been wondering if it would be right/ethical to add analytics tracking to the Orchid codebase, so I could identify the specific features people are using. Obviously, I would need to ask users to enable tracking, but I just don't know if this is a "normal" thing in OSS projects, or how well the mere fact that it is there would be received. I would love any input on this!
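For what it's worth, opt-in tracking could look roughly like the Python sketch below: nothing is recorded unless the user explicitly sets an environment variable. This is not how Orchid works; the variable name `EXAMPLE_TELEMETRY`, the event fields, and the function names are all assumptions made for the example, and actually sending the event anywhere is left out.

```python
import json
import os
from typing import Mapping, Optional

# Hypothetical opt-in switch; telemetry stays off unless the user sets it.
OPT_IN_VAR = "EXAMPLE_TELEMETRY"


def telemetry_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """True only if the user has explicitly opted in."""
    env = os.environ if env is None else env
    return env.get(OPT_IN_VAR, "").strip().lower() in {"1", "true", "yes"}


def build_event(feature: str, version: str,
                env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a JSON payload for one feature use, or None when not opted in."""
    if not telemetry_enabled(env):
        return None
    # Only coarse, non-identifying fields: which feature, which release.
    return json.dumps({"feature": feature, "version": version}, sort_keys=True)


# Off by default, on only after an explicit opt-in.
silent = build_event("search", "1.0", env={})
opted_in = build_event("search", "1.0", env={OPT_IN_VAR: "1"})
```

Defaulting to off and keeping the payload small are the two design choices that matter here; the rest is plumbing.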

 

Well put and a great example! And I agree, it would be interesting to know if, and to what extent, the tracking is accepted within the open source community.

 

code.google.com used to tell you how many times your packages had been downloaded. Before being automatically migrated to GitHub, my barebonesmvc/php had been downloaded over 7K times at my last recollection, but now on GitHub, as github.com/dexygen/jackrabbitmvc, it hasn't been starred even once. And stars are what a lot of people are "measuring" by these days. GitHub has a lot of catching up to do with what code.google.com was in this regard. Maybe Microsoft can pull its head out of the previous maintainers' ass(es) on this one.

 

My team and I just open sourced a project today and we faced the same question. It's not sufficient to track downloads or even how many people clone; what you really want to know is actual usage (i.e. did they actually use the software or follow a guide/example in the docs?). If you write a guide and reward users for completing a step, you'll at least get some data, and that's what we intend to do.

 

Very cool! So basically providing a way to track how users are learning to use the project?


John Forstmeier
Build well