Learning in public: Lessons from open source

#opensource #android #gradle #java

A little more than two years ago, in October 2019, I began work on my first significant open source project, the Dependency Analysis Gradle Plugin. I had just left a job where I had done relatively little coding, was taking a month off, and wanted to get back into a building mode and learn some new things. I decided to explore the domain of unused-dependency detection. The nearest competitor I was aware of was the Gradle Lint Plugin from the Netflix Nebula collection. However, as that plugin has never supported Android projects, that meant I had an exploitable niche—if only I could exploit it.

In truth, I had very little idea of what I was doing. While I had significant experience with Gradle scripting, I knew hardly anything about what was involved in writing complex plugins. Consider the tools I would ultimately need to learn to build something useful (581 ⭐s and counting!): bytecode analysis with ASM, source code parsing with ANTLR, graph theory and analysis (thanks, Algorithms, 4th Edition!), just for starters. I knew these things were possible, but I had never attempted to use any of those tools or theories in practice, let alone integrate them into a coherent whole.

I did have a few things going for me, however. For one, I knew, or at least believed, that what I wanted to do was possible. So, when frustrations inevitably hit—and how could they not, I was dealing with Gradle's Configuration API 😭—I was able to hold off despair and keep pushing. I also already knew the Android ecosystem fairly well by then, from the perspective of a developer: I knew that any analysis would have to be variant-aware and I knew, roughly, how to go about doing that. And I had also built up a community of people who knew me and were willing to help when I had questions—experts on both the Gradle and Android sides. I cannot overstate the benefits of being part of learning communities. Even though I'm the sole maintainer on my project, I truly stand on the shoulders of giants (see also the changelog for a partial list of contributors).

I'm writing this post now because, more than two years after starting that project, and over 80 releases,¹ I'm actually inching² close to that coveted 1.0.

It turns out that it's quite possible to build a useful tool without being an expert in that tool's domain, and thereby gain expertise. When I started work on this project, I had in mind that I would create a proof of concept that would give developers advice: it would tell them which of their declared dependencies were unused and could be removed; which transitive dependencies they were using without having declared them; and which dependencies were declared incorrectly (api vs implementation, for example). And then I pushed ruthlessly towards that goal for, well, two years. The current state of the plugin is essentially that collection of ad hoc proof-of-concept algorithms necessary to achieve that very specific goal. And if I may, it does that quite well, given the inherent complexities. But it has proven challenging to add new features that seem like they should be slam dunks, easy extensions of the core concept. Consider, for example, issue 16: "Infer if an Android project could be a plain Java/Kotlin project." The information needed to do that is available and the inference is in principle quite straightforward, but with the current implementation, the algorithms that would be used are smeared across a number of distinct Gradle tasks, and coalescing them into a single boolean (Android/Not-Android) is not trivial. In the new algorithm I've been working on, whose code is already committed to trunk (but hidden behind a feature flag), this information is readily available and I am confident that feature will be implemented sooner rather than later.

The difference between the current and new implementations is precisely that I now understand the domain and have used that understanding to build a rich model.

The model is a producer-consumer model, with a sprinkling of graph theory. On the producer side are all of a project's dependencies. Each dependency is uniquely identified by its Maven coordinates (e.g., com.company:lib:1.0) and is modeled as a set of capabilities (or features). The most common such "capability," provided by nearly every dependency, is ClassCapability; this is the set of classes bundled in the jar, available to be imported and used by the consumer. Less common is NativeLibCapability, which is the set of .so files that might be bundled in an Android library artifact (.aar file); I consider this a runtime capability. There are many others.
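To make the producer side concrete, here is a minimal Java sketch of a dependency modeled as coordinates plus a set of capabilities. Only the names ClassCapability and NativeLibCapability come from the description above; the Capability interface, the Dependency record, their fields, and the sample class name are assumptions made for illustration and are not the plugin's actual code (which may look quite different). It assumes Java 16 or newer for records.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class CapabilitySketch {
    /** Something a dependency provides to its consumers (hypothetical type). */
    interface Capability {}

    /** The classes bundled in the jar, available to be imported by the consumer. */
    record ClassCapability(Set<String> classNames) implements Capability {}

    /** The .so files that might be bundled in an Android library artifact (.aar). */
    record NativeLibCapability(Set<String> soFiles) implements Capability {}

    /** A dependency: Maven coordinates plus whatever capabilities it provides. */
    record Dependency(String coordinates, List<Capability> capabilities) {
        /** Returns the first capability of the requested type, if present. */
        <T extends Capability> Optional<T> capability(Class<T> type) {
            return capabilities.stream()
                .filter(type::isInstance)
                .map(type::cast)
                .findFirst();
        }
    }

    /** Hypothetical example data; the bundled class name is made up. */
    static Dependency sampleDependency() {
        return new Dependency(
            "com.company:lib:1.0",
            List.of(new ClassCapability(Set.of("com.company.lib.Widget"))));
    }

    public static void main(String[] args) {
        Dependency dep = sampleDependency();
        // A plain jar provides classes but no native libraries.
        System.out.println(dep.capability(ClassCapability.class).isPresent());     // true
        System.out.println(dep.capability(NativeLibCapability.class).isPresent()); // false
    }
}
```

Looking capabilities up by type keeps the consumer side from caring how many kinds of capability exist; adding a new kind means adding a new record, not touching every caller.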

On the consumer side, the project-under-analysis is modeled as a set of variant-specific views.³ The ProjectVariant class contains a reference to a representation of all the project's source (both bytecode and Android res for Android projects), along with the classpath and the set of annotation processors that might be present. With this information, we can compute what I call the usages of each dependency in the project. The usage might be "none" (unused), "api" (exposed as part of the ABI), "implementation," "compileOnly," "runtimeOnly," or "annotationProcessor." (I am also contemplating support for "compileOnlyApi," but as only java-library projects support that at time of writing, it's not a high priority for now.) There is also foundational support for java-platforms and test fixtures, but I don't expect those to be available in the initial release of the new model.
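As a rough illustration of how a usage could be computed from one variant-specific view, the Java sketch below distinguishes just three of the usages named above ("none," "api," and "implementation"). Everything in it is an assumption for illustration: the VariantView record stands in for the much richer ProjectVariant, its fields are invented, and the set-intersection rule is a simplification rather than the plugin's real algorithm, which works from bytecode and more.

```java
import java.util.Collections;
import java.util.Set;

public class UsageSketch {
    /** A subset of the usages described in the post. */
    enum Usage { NONE, API, IMPLEMENTATION }

    /**
     * Hypothetical, simplified view of one variant (e.g. "debug" or "main"):
     * the classes exposed in its ABI and all the classes its code references.
     */
    record VariantView(String name, Set<String> abiClasses, Set<String> referencedClasses) {}

    /** Classifies a dependency by comparing the classes it provides with what the view uses. */
    static Usage usageOf(VariantView view, Set<String> providedClasses) {
        // Any provided class that leaks into the ABI makes the dependency "api".
        if (!Collections.disjoint(view.abiClasses(), providedClasses)) {
            return Usage.API;
        }
        // Otherwise, any reference at all makes it "implementation".
        if (!Collections.disjoint(view.referencedClasses(), providedClasses)) {
            return Usage.IMPLEMENTATION;
        }
        return Usage.NONE;
    }

    public static void main(String[] args) {
        VariantView debug = new VariantView(
            "debug",
            Set.of("com.company.lib.Model"),
            Set.of("com.company.lib.Model", "com.other.Util"));

        System.out.println(usageOf(debug, Set.of("com.company.lib.Model"))); // API
        System.out.println(usageOf(debug, Set.of("com.other.Util")));        // IMPLEMENTATION
        System.out.println(usageOf(debug, Set.of("com.unused.Thing")));      // NONE
    }
}
```

Because the function takes a single view, running it once per variant naturally yields variant-specific answers: the same dependency can be "implementation" in one view and "none" in another.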

Given the actual usages of each dependency, associated with a specific project view, and knowing how each dependency is declared in a build script, the plugin can then calculate the transformations (i.e., the concrete advice) that users can follow in order to have correct dependency declarations.
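The final step, turning a declaration and an actual usage into advice, can be sketched as a small pure function. This Java example is only an illustration under stated assumptions: the Advice record, the adviceFor function, the use of null to mean "not declared" or "unused," and the wording of the messages are all hypothetical and not taken from the plugin.

```java
import java.util.Objects;
import java.util.Optional;

public class AdviceSketch {
    /**
     * Hypothetical advice: move a dependency from one configuration to another.
     * A null "from" means it was not declared; a null "to" means it is unused.
     */
    record Advice(String coordinates, String fromConfiguration, String toConfiguration) {
        String describe() {
            if (fromConfiguration == null) {
                return "Add " + coordinates + " as " + toConfiguration;
            }
            if (toConfiguration == null) {
                return "Remove " + coordinates + " (declared as " + fromConfiguration + ")";
            }
            return "Change " + coordinates + " from " + fromConfiguration + " to " + toConfiguration;
        }
    }

    /** Returns advice only when the declaration and the actual usage disagree. */
    static Optional<Advice> adviceFor(String coordinates, String declared, String actual) {
        if (Objects.equals(declared, actual)) {
            return Optional.empty(); // already declared correctly (or absent and unused)
        }
        return Optional.of(new Advice(coordinates, declared, actual));
    }

    public static void main(String[] args) {
        String lib = "com.company:lib:1.0";
        // Declared incorrectly: implementation, but exposed as part of the ABI.
        adviceFor(lib, "implementation", "api").ifPresent(a -> System.out.println(a.describe()));
        // Declared but unused.
        adviceFor(lib, "implementation", null).ifPresent(a -> System.out.println(a.describe()));
        // Used transitively without being declared.
        adviceFor(lib, null, "implementation").ifPresent(a -> System.out.println(a.describe()));
    }
}
```

The three calls in main correspond to the three kinds of advice mentioned earlier in the post: incorrectly declared dependencies, unused dependencies, and undeclared transitive dependencies.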

What I like is that each step is discrete and well-modeled, with a single clear responsibility.

It almost happened that none of this happened

Fairly early in the life of the project, an engineer at a tech unicorn contacted me and expressed interest in collaborating. We had a fruitful conversation, and worked together for a bit before I started to sour on the experience. He asked me why, and I explained that I didn't want to work on features that would primarily benefit a billion-dollar company, for free. Working on an open source project for no pay is one thing, but doing it in order to help out a large company with practically infinite resources was something else entirely. He replied that he completely agreed, and that's the day we became friends. We found a path forward that kept the project, and my interest in it, alive, and the rest is… well, see above.

Other lessons

There are two other important things I've learned about OSS in the past two years. First, always thank people for filing issues, because this means they're actually using your project and that's a gift. Second, never apologize for, or even acknowledge, a delay in responding to their issue, because they're probably not paying you and voluntary maintenance of open-source software is not a life obligation. I do it because I enjoy learning and building and being a resource in my various communities, not to yoke myself to the whims of strangers.

What's next?

I want to complete the transition from the original to the new and improved implementation, and this means finishing off the few remaining missing features, deprecating old ones I no longer want to support, and polishing. I’ll be doing all of this in public, as I have been, and will continue to welcome feedback (but don't expect a quick response). If you want to try the new model, just pass -Dv=2, and be sure to let me know how it goes via the issue tracker.

Endnotes

¹ 80 releases in two years: the benefits of making releasing very easy.
² What do metric people say, "centimetering"?
³ Android variant (debug, release) or JVM source set (main, test).