Welcome back to this series everyone! The past few days have been incredible: skott just reached 130 stars on GitHub, 100k total downloads, and lately around 12k weekly downloads since I open-sourced it. It's very far from being mainstream, but it's a good start, isn't it? Anyway, let me put my personal satisfaction aside and talk about what you came for!
For the newcomers to the series, let me do a quick skott introduction to put things in context.
skott is an all-in-one devtool to automatically analyze, search, and visualize dependencies from JavaScript, TypeScript (JSX/TSX) and Node.js (ES6, CommonJS). Basically, what skott does is deeply traverse a file directory, parse all supported files (only JS(X)/TS(X) for now), resolve all module imports, build the whole graph from these, and then expose a graph API on top of it.
Having that graph generated, you can then use it in many ways, such as visualizing it using a web application. Here is an overview of the next version of the skott web application I'm currently revamping:
If you're familiar with static analysis tools (if not, feel free to check my previous blog post on that subject), you might already be aware of some of the challenges we can face. In the context of skott, here is a list of some of these challenges:
- extracting workspace information such as manifest files (package.json), lockfiles (package-lock.json, yarn.lock, pnpm-lock.yaml), and configuration files (tsconfig.json, .eslintrc.json)
- traversing the file system, using ignore patterns, discarding ".gitignore-d" files
- parsing files using multiple parsers for each supported language (JS/TS) and managing to cover all the language variants that can be enabled or not (e.g. JSX/TSX for JavaScript/TypeScript)
- walking each previously emitted AST to collect all the module declarations from each file, for both CommonJS and ESM
- resolving all the module declarations to their real file system location, in order for us to incrementally resolve the entire graph
- building the graph by using all the resolved modules
That's quite a lot to do, even though this is a simplified view of the process.
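To make these steps a bit more concrete, here is a rough sketch of how such a pipeline could be modeled. All the names below are hypothetical and only serve as an illustration; the actual skott internals are organized differently:

interface FileExplorer {
  // traverses the directory and yields every supported file path
  collectFiles(rootDir: string): string[];
}

interface ModuleWalker {
  // parses one file and collects its raw module declarations (import/require specifiers)
  extractModuleDeclarations(fileContent: string): string[];
}

interface ModuleResolver {
  // maps a declaration such as "./feature" to its real file system location
  resolveToFile(declaration: string, fromFile: string): string;
}

interface DependencyGraph {
  addVertex(file: string): void;
  addEdge(from: string, to: string): void;
}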
The question now is: how do you manage that complexity in a way that keeps you fast, confident, and safe enough to add/remove/update features, fix bugs, and carry out huge refactorings? Basically, how do you enable true software Agility?
Testing
The first thing that comes to mind is tests. Overall, writing good tests brings safety and helps build confidence. Nevertheless, tests are double-edged: depending on how you write them and essentially what is being tested, i.e. the system under test (SUT), and how it is tested, they can create false confidence leading to unexpected bugs, and sneakily introduce poor design choices such as coupling code to implementation details, leading to painful and unsafe refactorings. As software engineers, this is something we should avoid at all costs.
In my previous blog post, Don't target 100% coverage, I quickly demonstrated how tests written in a certain way can produce unexpected behaviors and a false sense of security.
Before continuing this post, I highly recommend reading ☝️ that blog post if you're not familiar with the different ways of bringing tests to the table I'm referring to, namely Test-First, Test-Last, Test-Driven Development, Mutation Testing, and Code Coverage.
Test-First, Test-Last and Test-Driven Development
Let's see the differences between these three.
Test-First
The process of writing the test before writing any line of code of the targeted feature. This aims to lay out a precise plan of what should be implemented, but it misses the most important part: how you get to the implementation, as Test-First does not provide the incremental feedback loop from which you can benefit a lot.
Doing Test-First can be useful and generally helps increase the coverage of specifications with tests, but it somewhat ignores the fact that new constraints, hints, and ways of doing things will be discovered along the way. Unfortunately, Test-First does not let tests drive the implementation, and it favors writing big chunks of code at once, because you go from 0 lines of code to X lines of code, where X is the number of lines needed for the test to pass. Among these chunks, some might be useless or take too much time to introduce given the overall complexity that you have to deal with at once.
Test-Last
Probably the most popular way of testing, it consists of writing all the tests after all the lines of the feature code have been written. This aims to bring safety around the already produced code, but does it really help with that? Not really. One of the drawbacks of Test-Last is that it often converts a desire to gain confidence (by adding tests and increasing the code coverage percentage) into a false, sneaky feeling of safety.
Remember Goodhart's law: "When a measure becomes a target, it ceases to be a good measure."
Consequently, a lot of unnecessary and noisy tests get added along the way, as they try to cover parts of the code that don't really need it, either because they could have been covered by higher-level tests oriented toward system state expectations, or because they test useless or repeated things already covered by other tests. How could you know whether a test is useless or covers a behavior already covered by another test? You add a test, it passes, but is it thanks to the code you just added, or was it already the case before? 🧐
Because Test-Last introduces tests at the very last step of the feature life cycle, it comes with a lot of pain points: it only then reveals design smells, implementation details that were introduced, or code that is not easily testable. This is a waste of time, because you could have figured it out earlier. It also most likely favors the introduction of mocks (⚠️ I'm not referring to test doubles in general, but specifically to the Mock, which is one type of test double) that should be considered a smell in most cases, as they introduce a structural coupling between tests and code (asserting that X will call Y with abc parameters), which highly reduces flexibility and the ability to refactor.
In that case, tests become very fragile as they depend on implementation details, and at the next function change they break, making you resent the tests themselves.
"Tests prevent me from refactoring as they break all the time when changing the code, I'm losing flexibility and it's not a convenient way of working" 😡
Is there a better way of achieving efficiency and safety while keeping the code highly flexible, easily refactorable, and abstracted from implementation details that we don't want to depend on feature-wise?
Test-Driven Development (TDD)
Disclaimer: my desire is not to advocate TDD at all costs; Test-Driven Development is not a silver bullet. The whole purpose is just to offer an overview of a highly misunderstood and underestimated discipline. It's then your choice to try it, blame it, or blame me. But believe me, once you start being decent with TDD, you never look back.
Test-Driven Development is a more evolved version of Test-First; it's essentially a software development discipline that drives you to find the quickest and best path for writing each line of code targeting a domain specification, through fine-grained decomposed steps (so-called baby steps), dealing with complexity incrementally.
Test-Driven Development relies on automated tests, which are its best companion to date.
By using tests, TDD will drive the writing of the code required to turn a failing test ❌ GREEN ✅, in very short feedback loop cycles.
If there is no prerequisite, that is no failing test, why would I add lines of code? Nothing in my system justifies that need yet. Maybe the behavior is already implemented? The first step is to make sure that we have a failing test, which is our first checkpoint to reach.
Once we have that, we can start writing the code required to make the test pass. We are now sure that we did something useful, namely adding lines of code justified by a specification that turned from red (failing) to green (passing).
Just after that comes the third step of the TDD process, the refactoring phase, during which you're free to refactor and produce the best code possible while keeping the test green. That's why Test-Driven Development fundamentally helps with designing software: it allows you to infinitely and safely refactor the produced code (at any point in time) while ensuring that the expected behavior is still intact, which is one of its main benefits.
Test-Driven Development requires good Software Engineering skills to be effective
TDD helps with designing software because it creates a perpetual confrontation with the design choices to be made at a given time, depending on the current system requirements. Nevertheless, TDD requires a good set of design and refactoring skills upfront for the discipline to be effective and increase productivity; otherwise, you might end up blocked at some point in the process, or taking the wrong decisions (or even worse, no decisions at all).
Test-Driven Development is all about the feedback loop, and so is software engineering in general!
Let's put the TDD aside for a little bit and let's talk about some of the most common feedback loops we're all working with on a daily basis:
Integrated Development Environment (IDE): all about the feedback loop! How fast do you want to know that the code is correctly written in the host language? The red underlines, warnings, auto-formatting, popups, and whatever alerts go through the IDE are all part of the feedback loop; they help you know whether you are heading down the right path or not.
Compilers: all about the feedback loop! Same here, the compiler provides you with a way to know whether the written code is semantically correct, to some extent. Statically typed languages such as TypeScript are a great way of improving the feedback loop by adding type-safety, plus a great developer experience with a real-time integrated Language Server Protocol (LSP); a tiny illustration follows right after these examples.
CI/CD Pipelines: all about the feedback loop, where you continuously want to know whether the product could be successfully deployed in production, passing all the verification steps. If that's not the case, you want to know it quickly to fix the problem right away.
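Coming back to the compiler example, here is a trivial, made-up illustration of that compile-time feedback:

function buildEdge(from: string, to: string): [string, string] {
  return [from, to];
}

// buildEdge("index.js", 42);
// ^ Rejected before the code ever runs:
//   "Argument of type 'number' is not assignable to parameter of type 'string'."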
All of these examples of feedback loops tend to leverage the fail fast principle, whose goal is to quickly identify failures rather than letting them persist, be discovered later in the development process, or worse, be discovered by customers. The feedback loop is the foundation of continuous improvement.
Test-Driven Development is all about having the shortest feedback loop on the written code, asserting whether the system produces the expected outcome when adding, removing, or updating code. Thanks to automated tests, the loop is quick enough in most cases, as long as you respect the Fast nature of the F.I.R.S.T principles!
One downside of TDD is that it can become counter-productive if not correctly applied; IMHO, it is better not to practice TDD at all than to practice it the wrong way. Also, the learning curve is steep, as it requires strong technical skills, mental shifts, and a different way of approaching software development.
Putting that into small practical examples
Ok, so now that we have talked about Test-First, Test-Last and Test-Driven Development, let's come back to our initial question:
How do you manage that complexity in a way that you're fast, confident and safe enough to add/remove/update features, fix bugs, continuously improve the code using refactoring?
For now, the only approach I have put into practice that works for all these points is Test-Driven Development. Of course, you might be able to do the same without it, but at what cost, with what level of confidence, and how fast? Without introducing any regression? I have been there too; now I could never go back to working without it.
Let's take a highly simplified version of a skott feature.
Given two JavaScript modules with one being a dependency of the other one, both should be analyzed and a graph should contain two vertices with one edge representing that relationship:
index.js
import { add } from "./feature.js";
// Rest of the code consuming the function, we don't care about that
feature.js
export function add(a, b) {
  return a + b;
}
We want to:
- read both files,
- find the module declarations and see that index.js imports the feature.js module,
- then build a graph of two vertices, index.js and feature.js, with a directed edge from index.js to feature.js representing the import dependency; we say that index.js is "adjacent to" feature.js.
Consequently the expected outcome of our system is a graph shaped that way:
{
  "index.js": {
    "adjacentTo": ["feature.js"]
  },
  "feature.js": {
    "adjacentTo": []
  }
}
This is the expected final shape, the outcome that we want skott to produce.
Test-First
Practicing Test-First here would have us start by writing the test first, with the whole expectation that we have of our system.
describe("Graph construction", () => {
describe("When having two JavaScript modules", () => {
describe("When the first module imports the second", () => {
test("Should produce a graph with two nodes and one edge from the index module to the imported module", () => {
createFile("feature.js").withContent(
`
export function add(a, b) {
return a + b;
}
`
);
createFile("index.js").withContent(
`
import { add } from './feature.js';
`
);
const graph = new GraphResolver();
expect(graph.resolve()).toEqual({
"index.js": {
adjacentTo: ["feature.js"],
},
"feature.js": {
adjacentTo: [],
},
});
});
});
});
});
The test itself includes the three main components: Arrange/Act/Assert. Running that test will make it fail for sure, as we don't have any code written yet. But now, how do you go from that test failing to passing?
Little reminder of all the steps that we must go through to make the test pass:
- file traversal
- file parsing
- module extraction
- module resolution
- graph construction
That's a long way to go, with many steps and components involved, so we might spend quite a bit of time on it. Unfortunately, the test won't be useful during that whole time; it will only be there to assert the final result once we have already produced all the required code.
Test-Last
Nothing happens there, Test-Last does not want us to write any test yet! Even though sometimes I hear it whispering something in my head:
Test-Driven Development
Ah, finally, something that will help us achieve the desired behavior. As already said, Test-Driven Development wants us to take an incremental approach instead and let the code flow and grow in complexity through various steps. By design, it wants us to adopt a baby-step approach where we just add the minimum amount of code to make the test pass.
In the context of TDD, here is the first test that crosses my mind:
describe("Graph construction", () => {
describe("When not having any modules", () => {
test("Should produce an empty graph", () => {
const graph = new GraphResolver();
expect(graph.resolve()).toEqual({});
});
});
});
First things first, we focus on the shape of the contract we want to expose; it constrains the thinking around that, one problem at a time.
One benefit of TDD is that the test becomes the first client of the code itself, and you can let the design emerge progressively.
Here is the quickest way of making the test pass:
class GraphResolver {
  resolve() {
    return {};
  }
}
Really easily: we just have to create a raw class with a method returning a hardcoded empty object.
Then comes the question: what should the next test be? Don't forget that the end goal is to be able to traverse files and build module dependencies between each of them.
TDD's efficiency is enabled by the choice of tests, there is no magic. The developer is responsible for thinking about and finding the right path for the test order, and each missed step, or step that is too big, is a missed opportunity to fully benefit from TDD. But don't worry: if you end up in that situation and think of an intermediate step that can be valuable, you can still shift down a gear.
So what test comes next? After validating the case where we don't have any module, what about introducing the case where we finally have one module to analyze?
describe("When having one JavaScript module", () => {
test("Should produce a graph with the module", () => {
/**
* ARRANGE
*
* How can we create a file system context including that file specifically
* for this test?
*/
const graph = new GraphResolver();
expect(graph.resolve()).toEqual({
"index.js": {},
});
});
});
Even if this appears simple, it grows in complexity as we introduce the notion of a file in the Arrange part of the test. Our tool will be based on the file system, meaning that it will need to read from it at some point. Does that mean that the test itself should read from the file system? Not at all.
But why not use the real filesystem right away, using the Node.js API for instance?
Introducing the real file system is something we want to avoid in unit tests: we want them to be fast, isolated, and repeatable across executions, and a real filesystem layer does not check any of these criteria. We want a fully manageable and specialized version of the Node.js filesystem module. Do we need to fake everything within it? Not at all, we just need to cover the subset we need for now.
As you can see, that small test brings a whole new level of thinking around the new feature. You could think that it's something we could avoid if we were not writing any test, and that it complicates our task for not much. But in fact, it forces us to think about building testable code over which we have full control, and that already brings us to designing solutions.
What we want first is to create a minimalistic and in-memory file system that can fake a real one.
Let's go back to our second test:
test("Should produce a graph with the module", () => {
const fileSystem = new FakeFileSystem();
fileSystem.createFile("index.js").empty();
});
Now that we have that in-memory context created, we want it to be usable within the Graph Resolver context; in other words, we want it to be injected into the Graph Resolver.
This naturally leads us to Dependency Injection (DI).
Dependency Injection
Dependency injection is a pattern for decoupling the usage of dependencies from their actual creation process. In other words, it is the process of providing a service's dependencies from the outside world: the service itself doesn't know how to create its dependencies.
The dependency there (the FileSystem module) is created outside of the GraphResolver context and can be injected right away:
class FakeFileSystem {
  // ...
}

const fileSystem = new FakeFileSystem();
fileSystem.createFile("index.js").empty();

class GraphResolver {
  constructor(private readonly fileSystem: FakeFileSystem) {}
}

// Dependency Injection
const graphResolver = new GraphResolver(fileSystem);
Now that we have dependency injection in place, we can fake the required behavior of the file system module:
class FakeFileSystem {
  fs = {};

  createFile(name) {
    return {
      empty: () => {
        this.fs[name] = "";
      },
    };
  }

  readFiles() {
    return Object.keys(this.fs);
  }
}
This is a simple fake filesystem that emulates the operations we need for now, that is createFile and readFiles. Nothing much! We don't need to cover everything, only what's needed in the context of the test.
To make the test pass, we can now consume that newly available implementation in the GraphResolver service.
class GraphResolver {
  graph = {};

  constructor(private readonly fileSystem: FakeFileSystem) {}

  resolve() {
    for (const filename of this.fileSystem.readFiles()) {
      this.graph[filename] = {};
    }

    return this.graph;
  }
}
And now the test passes, matching our expectations. Note how focused we are on the current desired behavior: we produced only the minimum amount of code we needed. For instance, the readFiles method only yields the filenames, as the test only requires us to have a record including that name, even though we know for sure that later on we'll also need the file content.
Also, we're fully in-memory and isolated from the real filesystem; the dependency only produces the outcome we need, and we don't need to fake the whole filesystem API of Node.js or whatever runtime we are using!
Note that in this case I'm voluntarily skipping some of the steps in between, as I know for sure that we'll want to read all files from the directory at some point. Instead of just introducing a readFile method (as we only have one file), I go straight to what would have been the next step (readFiles). As a rule of thumb, it's generally good practice to follow a sequence of Transformations to produce the minimal code that makes the test pass in short loops. If you're interested in knowing more about that, you can read the Transformation Priority Premise from Robert C. Martin.
After the green step comes the third phase of TDD: refactoring.
Refactoring is about changing the system design without altering the system behavior, that is, without changing, adding, or removing parts of it.
Luckily, we can use that phase to improve a tiny thing that you might have noticed in the Dependency Injection system we currently have.
Dependency Injection can be used in a way that decouples from concrete implementations of the dependencies. In the case above, GraphResolver is directly coupled to a FakeFileSystem instance, leaving no room for flexibility and offering no way of injecting anything else that would match the same contract. More importantly, it leaks implementation details and things that should not be part of the contract. In our case, FakeFileSystem exposes a createFile method that exists just for the purpose of the test but has no value in the context of the GraphResolver, so GraphResolver should not know about that method, only about what it cares about, that is the readFiles method.
Consequently, we can improve that by introducing a generic interface so that GraphResolver now depends on an abstraction and not on a concrete implementation. This brings us to the Dependency Inversion Principle (the D from the SOLID principles). That way, skott could have a runtime-agnostic way of traversing file systems.
interface FileSystem {
  readFiles(): string[];
}

class FakeFileSystem implements FileSystem {
  // unchanged
}

class GraphResolver {
  constructor(private readonly fileSystem: FileSystem) {}
  //                                        ^ this changes
}
All tests are still passing because we just played a bit more with interfaces, letting the static compiler narrow the correct types into the GraphResolver.
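To see what that abstraction buys us, here is a possible production-side implementation of the same contract backed by the real Node.js filesystem. This is only a sketch with hypothetical naming; skott's actual implementation is more involved:

import { readdirSync } from "node:fs";

class NodeFileSystem implements FileSystem {
  readFiles(): string[] {
    // Shallow listing of JavaScript files, just enough to honor the contract for now.
    return readdirSync(process.cwd()).filter((fileName) => fileName.endsWith(".js"));
  }
}

// The resolver doesn't care which implementation it receives.
const productionResolver = new GraphResolver(new NodeFileSystem());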
No more refactoring? Let's get closer to the objective.
Now we want to start introducing the concept of dependencies between the JavaScript modules. As already said, dependencies between JS modules can be modeled using a Directed Graph, in which files would be represented as vertices and relationships between files would be represented as directed edges between vertices.
Using the Graph terminology, a vertex A that depends on another vertex B is said to be "adjacent to B".
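In TypeScript terms, the graph we're building can be described with a tiny type matching the JSON shape shown earlier (just a sketch for clarity):

type ModuleVertex = {
  // the modules this file imports, i.e. the vertices it points to
  adjacentTo: string[];
};

type ModuleGraph = Record<string, ModuleVertex>;

// "index.js" depends on "feature.js"
const example: ModuleGraph = {
  "index.js": { adjacentTo: ["feature.js"] },
  "feature.js": { adjacentTo: [] },
};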
Once again, we introduce one concept at a time. This time, we don't really need to add a new test, we can just update the last one:
 expect(graph.resolve()).toEqual({
-  "index.js": {},
+  "index.js": {
+    adjacentTo: [],
+  },
 });
This fails, now we make it pass:
 resolve() {
   for (const filename of this.fileSystem.readFiles()) {
-    this.graph[filename] = {};
+    this.graph[filename] = {
+      adjacentTo: [],
+    };
   }

   return this.graph;
 }
What is the next step? Adding a second module on its own would not move us forward. We need to find a test that drives us toward the concept of an import, which will create an edge between two modules.
Let's add a test that will force us to start dealing with dependencies between modules:
describe("When having two modules with one dependency", () => {
test("Should produce a graph with both modules and a dependency between the two", () => {
const fileSystem = new FileSystem();
const fileWithModuleImport = `import "./feature.js";`
fileSystem.createFile("index.js").content(fileWithModuleImport);
fileSystem.createFile("feature.js").empty();
const graph = new GraphResolver(fileSystem);
expect(graph.resolve()).toEqual({
"index.js": {
adjacentTo: ["feature.js"],
},
"feature.js": {
adjacentTo: [],
},
});
});
});
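Before making this pass, note that the fake (and the FileSystem contract) now needs to carry file contents as well: the test above uses a content() builder, and the resolver below iterates over name/content pairs. Here is one possible way of extending the fake; the exact shape is a design choice, this is only a sketch:

interface FileSystem {
  readFiles(): [fileName: string, fileContent: string][];
}

class FakeFileSystem implements FileSystem {
  private fs: Record<string, string> = {};

  createFile(name: string) {
    return {
      empty: () => {
        this.fs[name] = "";
      },
      content: (fileContent: string) => {
        this.fs[name] = fileContent;
      },
    };
  }

  readFiles(): [string, string][] {
    // Now yields [name, content] pairs instead of just the names.
    return Object.entries(this.fs);
  }
}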
Still with the idea of writing the easiest possible code to make the test pass, we can add a little if statement in the resolve method of the GraphResolver. In the Transformation Priority Premise ordering, this sits around the middle of the transformations we should apply (unconditional->if), splitting the execution path to match expectations:
 resolve() {
   for (const [fileName, fileContent] of this.fileSystem.readFiles()) {
+    if (fileContent.includes("import")) {
+      const moduleName = fileContent.split("./")[1].split("'")[0];
+
+      this.graph[fileName] = {
+        adjacentTo: [moduleName],
+      };
+
+      continue;
+    }

     this.graph[fileName] = {
       adjacentTo: [],
     };
   }

   return this.graph;
 }
Test is passing ✅
Note how ugly that "split" is, actually it might be the ugliest statement I have even written but we don't care, it fits our needs: it turns the test to green.
Remember, after this step you're free to refactor as much as you want, so no stress, we'll be able to do better just after.
Let's try to slightly improve the code during the refactoring step:
 resolve() {
   for (const [fileName, fileContent] of this.fileSystem.readFiles()) {
     if (fileContent.includes("import")) {
-      const moduleName = fileContent.split("./")[1].split("'")[0];
+      const moduleImportParser = /import '(.*)';/g;
+      const [, moduleImport] = moduleImportParser.exec(fileContent);
+      // path comes from the Node.js "path" module
+      const moduleName = path.basename(moduleImport);

       this.graph[fileName] = {
         adjacentTo: [moduleName],
       };

       continue;
     }

     this.graph[fileName] = {
       adjacentTo: [],
     };
   }

   return this.graph;
 }
The way parsing is done in the two previous steps is an implementation detail and part of our own business logic; it should remain hidden, allowing heavy refactoring without breaking the API surface. As long as the resolve method returns the expected graph, we are fine, because that's what matters the most.
While progressively adding cases where you want to reliably parse all kinds of module imports (and there are a bunch, including fun edge cases), you will realize that using a RegExp does not scale well and becomes too complex to manage. But that's fine: nothing prevents you from introducing an ECMAScript-compliant parser, such as meriyah, acorn, or swc, that does the job for you! The best part is that this would still remain hidden inside the internals of your use case, allowing infinite refactoring and improvements.
I'm not going to go into the implementation of the parser itself (you can check the skott source code to see a complete example), but hopefully you can start to imagine the next steps of this use case. I also hope that you were able to see the advantages that Test-Driven Development brings in that little context.
Wrap up
By dealing with a few use cases, we were able to see how Test-Driven Development incrementally drives the development of a feature.
Not only does this help us face constraints very early in the process thanks to the feedback loop, it also naturally forces us to introduce a way to make the code easily testable, fast, isolated and repeatable, via Dependency Injection. It also reduces the level of complexity we have to deal with at each new step and test added, while letting us go way faster and safer.
Moreover, we can also lean heavily on the refactoring phases to make the code cleaner (even though clean code and great design are prerequisites for efficient refactoring) while being sure that the system behavior still matches our expectations.
Personal take:
One thing I know for sure is that I would not feel confident with skott's features if I hadn't brought in 130+ unit tests through Test-Driven Development. Features can be added without any fear, and refactoring can be done with ease and confidence.
Nevertheless, this does not mean that skott cannot produce bugs; it simply means that for the use cases where skott is expected to work, it will most likely work as expected. In other words, a bug will most of the time be due to a missing system behavior description or edge case. And that's totally fine, because to fix it, you only have to add the test case reproducing the missing behavior, and let the TDD flow!
Link to skott's GitHub repository: https://github.com/antoine-coulon/skott
See you at the next episode 👋🏻