The problem
Have you ever waited for what seems like an eternity for dependencies to download after hitting that yarn (or npm install, for you masochists out there) in a typical Node project? Of course you have!! Who are we kidding? :)
But why does it happen? 🤔 Well, one of the reasons is obviously that package managers like npm have several inefficiencies in how they download node modules. That could still be solved by using a better package manager like pnpm. But another reason is that JS dependencies are usually larger than their counterparts in other languages.
Why is it that JS dependencies are usually larger than the corresponding dependencies in languages such as Go, Rust and Java?
The reason
The reason for this is simple: dependencies in languages such as Golang are source code + binaries, while dependencies in JS are JS files + CSS assets + HTML assets and other stuff. A JS file is usually larger than a binary executable, and thus a typical JS dependency is usually larger than its counterparts. But the question arises: why are JS dependencies not binaries? To answer that, we first need to understand how code gets compiled in JS and in a language like Go.
In a language like Go, your entire project is compiled into a binary executable by the Go compiler. This binary executable can then be shared with anyone else, who can execute it on their machine. When you install a Golang dependency, it does install the source code too, but thanks to Go's rich standard library, a Golang package usually has fewer external dependencies, and even those dependencies are statically linked into the package's binary. For this reason, even after installing the source code, Golang packages are small.
But what is the output of a JS project? It's usually a single JS file called bundle.js or index.js, which contains the entire JS source code of the project in a highly condensed, minified form, probably bundled with some CSS and HTML assets. This is usually achieved using tools like Webpack.
Now, to execute the JS in this minified bundle file, we use a JS engine like V8, which is found in Chromium and Node.js, and it executes the JS using a technique called Just-In-Time (JIT) compilation.
JIT compilation is a technique in which the source code is compiled at the time of execution, i.e. at run time, instead of ahead of time as we see in other languages. It means that the engine parses the code, then compiles and executes it on the fly while the program runs. Most modern JS engines optimise this process further with techniques such as tiered compilation, where frequently executed code is recompiled into more optimised machine code.
Now, JIT is not a technique used only in JS. Even Java uses it to execute the generated bytecode on the machine, but the problem is that JS does not have any standard, distributable intermediate form like Java's bytecode. Hence, JS engines need the JS source code directly to do the JIT compilation.
Since JS engines compile JS source code right at the time of execution, and the only thing they can compile is source code written in JS, the dependencies also need to be written in JS instead of being binary executables. If they were executables, the JS engines would not be able to handle them because, as we just read, JS engines need the JS source code directly, and most modern engines are built to handle JS files and little else.
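As a rough illustration of the point above, here is a minimal sketch, assuming Node.js and its built-in vm module. The dependency source and the file name hypothetical-dep.js are made up for the example; the only idea it shows is that the engine receives plain source text and compiles it at run time.

```javascript
// Sketch only: the "dependency" below is a hypothetical stand-in, not a real package.
const vm = require('node:vm');

// What a published JS dependency boils down to: plain source text.
const dependencySource = `
  function add(a, b) { return a + b; }
  add(2, 3);
`;

// The engine parses and compiles the text only now, while the program is running.
const script = new vm.Script(dependencySource, { filename: 'hypothetical-dep.js' });

// Running the compiled script returns the value of its last statement.
const result = script.runInNewContext();

console.log(result); // 5
```

Nothing here is a binary: the only thing handed to the engine is a string of JS, which is why packages end up being shipped as source files.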
Thus, it is for this reason that the modules in node_modules need to be shipped as JS files, which increases their size and makes them bulky.
Extra Buzz
It is still possible to execute binaries written in other languages such as C/C++ by linking them dynamically as native addon modules in Node.js using tools like node-gyp. But execution in those cases is done by the operating system, not the JS engine.
The JS ecosystem is developing fast ⚡️, so there is a possibility that one day we might be able to solve this bulky modules issue, but till then we can do our part as developers to contribute to the ecosystem as much as we can!!
Happy Engineering!!
Top comments (30)
Although it is true other languages are compiled and JS is not, that's not the major reason behind the heaviness of node_modules.
Python, for example, is not a compiled programming language, but it comes with a mature standard library that solves most of your daily programming needs, so you need fewer third-party dependencies. That said, if you choose to install a Python library, you're getting the library's source code, not a binary.
Unlike JS and Python, Golang is a compiled language, but when you install a dependency, you also get the source code of that dependency.
If you run ls $GOPATH/pkg/mod/github.com, you can see the source files of all the Go dependencies you're using in your projects: not the binaries, but the source code.
I think the main reason for the heaviness of node_modules is twofold.
JavaScript's lack of a standard library has forced the community to depend solely on third-party solutions. Node.js has a standard library, but it's not as mature as what you get from, say, a language like Python.
I think the JS community has generally adopted a culture of preferring someone else's code to their own (I'm sometimes guilty). For example, the is-string package [npmjs.com/package/is-string] has 20+ million weekly downloads. It is a package of less than 30 lines of code (counting whitespace), and I'm almost certain that most people downloading it could have just written return typeof value === 'string'. And then we have is-string-and-not-blank with 90k+ weekly downloads. If you ask me, I'd say that's ridiculous.
You are correct, and if you read the blog, I did mention that the reason Golang projects are small even after downloading the source code is that Go has a rich standard library, so most Go dependencies have fewer external dependencies, and your own project will have fewer too.
Another thing you might have missed is that in languages like Go and Java, the dependency manager is very good at not re-downloading a package that is already there. This is one of the faults of npm: a monorepo with multiple submodules might have the same dependency installed separately in each submodule instead of only once. In a language like Go, dependencies are managed centrally, which saves storage.
Also, I slightly disagree on the Python part. I have definitely seen many Python projects that use multiple dependencies, and Python dependencies are usually about the same size as Node dependencies.
@faizbshah I must have scrolled past the paragraph where you talked about go dependencies.
Great post
And I'm excited about what new runtimes like bun and deno are bringing to the game.
Same here!! I'm very excited about the future of Bun. It's probably the most significant thing to have happened in the ecosystem in the past 3-4 years.
totally. i bet there's tons of redundant dependencies in even the simplest tree
Exactly. The redundancy is a big part of why people have started to dislike npm as a package manager in recent times.
If you want to avoid node_modules, use esbuild or some other similar tool to only grab what you need.
Additionally, abandon TypeScript, as type definition modules and the TypeScript facade itself are pretty chonky. TypeScript may add value for the developer, but it adds zero value to the end-user of the program.
There are a number of compilers for JavaScript out there, and they all have their trade-offs.
This is one of those problems where it feels like developers complain about the unintended consequences of their own convenience, but it is possible to write great JavaScript apps without massive bundles and dependency payloads.
Fewer bugs are a benefit to the user. Developer productivity and speed of delivery for features are benefits to the user. TS has multiple benefits for the end user. Things that make developers more efficient and more accurate are pretty much de facto good for end users unless the tools have tradeoffs that negatively impact users to a greater degree. Arguments about the impact on bundle size and time to interactive can certainly be had, but the assertion that TS has no benefit to users is just false.
Naw, nothing you said requires TypeScript to solve it. There are a number of approaches and languages which produce the same outcomes you're describing, and when you combine that with the reality that TypeScript is a tedious layer on top of JavaScript, and that the majority of its ecosystem is still JavaScript, I stand by my correct assertion.
I do wish that the TypeScript zealotry would stop; it's not constructive.
I think you are wrong here. TypeScript is the best thing to have been created for the JS world in the last 10 years. TypeScript allows developers to improve their efficiency and their confidence in the codebase and the solution they are working on. The codebase becomes "understandable", even over time, because of the rich type system which JS lacks. You are right about only one thing here: everything that can be done in TS can be done in JS. But in a large codebase, pure JS becomes a pain in the butt and in most cases takes a lot more time and mental health to maintain. That is always a point in TS's favour: the benefits far outweigh the possible drawback in performance (with a correct setup and product architecture, and some development conventions, the performance footprint can be very small).
I'm really excited for the enthusiasm you're bringing to the table, but please be careful not to confuse your enthusiasm and preferences for objective fact. I don't think JavaScript is lacking a type system because I've been programming long enough to know that a type system (especially one that isn't present at runtime) isn't necessary.
Some people prefer a type system, and some don't. I have my own preferences, and as long as we all remember that our preferences are only that, preferences, we can all be friends.
pnpm links to a shared copy of a library instead of copying it into every node_modules folder, so if you use the same dependency multiple times it will save you disk space.
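To show roughly what "a link instead of a copy" means on disk, here is a minimal sketch using only Node's built-in fs, os and path modules. The file names store-index.js and project-index.js are hypothetical, and this is not pnpm's actual implementation; it only shows that a hard link is a second name for the same file data.

```javascript
// Sketch only: imitates the idea of linking from a shared store, not pnpm itself.
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Work inside a throwaway temp directory.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'hardlink-sketch-'));
const storeFile = path.join(root, 'store-index.js');     // hypothetical "store" copy
const projectFile = path.join(root, 'project-index.js'); // hypothetical project copy

fs.writeFileSync(storeFile, 'module.exports = 42;\n');

// Create a hard link: a second directory entry pointing at the same data.
fs.linkSync(storeFile, projectFile);

// Both names resolve to the same inode, so the content is stored only once.
const sameInode = fs.statSync(storeFile).ino === fs.statSync(projectFile).ino;
const linkCount = fs.statSync(storeFile).nlink;

console.log(sameInode, linkCount);

// Clean up the temp directory.
fs.rmSync(root, { recursive: true, force: true });
```

Because both paths point at the same data, a second project using the same file costs almost no extra disk space.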
Keep your package.json as simple as you can. In my experience, Vite + pnpm is the best for speed and storage too.
It is also the npm module creator's responsibility to keep the module's dependencies at a minimal level.
For example, my really minimalist npm module, react-state-factory, still uses a peer dependency to avoid React version collisions.
Even if someone just uses the useStateFactory hook, they do not need to install redux-saga and use-saga-reducer.
So I think node_modules also looks heavy if you check how many different React versions are included (after a few React modules are used).
Yes, I mentioned pnpm in my blog post as well as in one of my comments. pnpm is far better than what npm has to offer!!
The next question then would be: if any real project requires a JS compiler of some sort, why not use a real compiler instead? If we have to use a compiler either way, why not output a binary?
For any real project you need to use babel/TS/swc and a module system, so you could just use a real compiler instead.
And that's where WebAssembly comes in. The size of the node_modules folder is just one of the many problems that wasm will solve. Stop using JavaScript.
To answer your question, I think the reason is simple: no one in the ecosystem is too interested in taking up that job. It took years before somebody actually became frustrated enough to create a runtime environment better than Node (talking about Bun). One reason the ecosystem doesn't want to deal with binaries is that it would introduce the dreaded problem of platform-specific machine code dependencies. And unless there's a dedicated team behind that goal, no one wants to deal with that issue even from a distance. Using JS as source code, intentionally or unintentionally, helps keep JS platform independent.
And as a fan of WASM, I agree with you xDD
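For anyone curious what a binary that a JS engine can run looks like, here is a minimal sketch, assuming Node.js and its built-in WebAssembly global. The bytes are a hand-assembled, hypothetical module that exports a single add function; they do not come from any real package or toolchain.

```javascript
// Sketch only: a tiny hand-written WebAssembly module, not output from a real compiler.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic "\0asm" + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                     // one function using type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,       // export function 0 as "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body: local 0 + local 1
]);

// The engine validates and compiles the binary, then instantiates it.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule);

// The exported function is callable from ordinary JS.
const sum = wasmInstance.exports.add(2, 3);

console.log(sum); // 5
```

The same bytes run on any platform with a WebAssembly-capable engine, which sidesteps the platform-specific machine code problem mentioned above.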
There is a lot of great stuff written in JS that you can't write in TS, or that feels wrong in TS. Personally, I use WASM as an accelerator, JS for libraries that are used across multiple projects (some up to 7 years old), and TS for new apps. To make all of that fit together, you need some sort of glue.
Node Modules is the Universe
It was originally created by Zaphod Beeblebrox in 1988, but since there was no WWW yet, he programmed it to eventually truly answer The Question.
So Node Modules has to contain every F&@^ Byte all programmers in the world can come up with
Some say Node Modules is the ultimate practical joke.
There are too many dependencies behind the dependencies that you are downloading. If these dependencies were updated to use modern JavaScript, you might find that JavaScript has built-in features that remove a lot of the extra code that is being downloaded as dependent packages. It is the issue of libraries not using the full capabilities of a language as that language evolves, and this is true of Java, C#, etc. It would be a lot of work to rewrite them to use those features, and that is why old libraries are continually used. Python and Go have large standard libraries, which is why it is less of an issue in these languages. This is technical debt, and it is everywhere in software development.
You can take a look at pnpm.io
I've got really interesting results in installation times and storage.
Yes, I mentioned pnpm in my blog too. It's amazing!!
I recommend trying bun.sh as a package manager. It does not solve the disk space problem, but it speeds up the download process.
Yes, Bun has been a breath of fresh air in the JS ecosystem!!
Great! Learned something new today.
That's great to hear!! :D
Dependencies in Go and Rust are source files, compiled with your application, and Python or Ruby dependencies are source files just like Node.js ones. Java dependencies are compiled, and they are very heavy.
Thanks for the feedback!! I added some extra stuff to explain it too. About Python you are correct, so I removed it. Actually, I didn't even notice that I had put it there in the first place, but yeah, the problem of large dependencies is not limited to JS. About Java, the total dependency size is usually smaller than Node's because Maven and Gradle are better at managing duplicate dependencies in the dependency tree, and they don't even need to download the source code. Duplication is now finally starting to get handled by nice package managers such as pnpm.