Hello,
In this third article of the Architecture series, we will discuss why we chose Node.js (JavaScript) to build a hybrid monitoring solution.
When I was at the idea stage, thinking about what the main features would be, I wanted the product to include the following:
- A highly modular agent (with only one true abstraction).
- Built-in tools and support for developers and integrators.
- A reactive solution with hot reload capabilities.
- Flexible and maintainable code.
- Modularity has to be first-class (read the first article of the series).
- "Works as designed" will not be part of our answers.
Check the governance chapter below to get a bigger picture of the different forces at play.
I was not very confident about doing it in C or C++, and I'm still not today. Most developers are unaware of their limitations and are driven by the belief that they can carry the weight of the required rigor...
In the end you wind up with a piece of software that is way ahead of you (even at a high level, this will happen to us too). In monitoring, everything takes decades to take shape and evolve, so you have to deal with this day to day.
Most of the low-level solutions I know are in catastrophic situations where developers can no longer even maintain the software... With turnover and the lack of skills, performance regressions appear (often combined with a sloppy and all-too-fragile foundation).
But that doesn't mean that these languages shouldn't be an integral part of the stack (I'll come back to this later in the article).
Node.js and JavaScript
Many of my ideas have surely been influenced by the Node.js and JavaScript ecosystem. I've always been a big fan of the level of accessibility and simplicity of building or using a module.
On the other hand, the V8 engine is a really powerful virtual machine with the ability to optimize branches of code at runtime. The event-loop pattern (provided by libuv) is also a very good fit for the monitoring world, because there are not many CPU-intensive tasks.
Along with a lot of other things:
- A rich ecosystem with millions of packages for those who want to build addons.
- JavaScript is among the most popular and accessible languages.
- Setting up a complete test suite is not that complicated.
- Having the same language across the board.
- An ABI-stable native API (N-API).
- We can move to the cloud at any time with a design cost close to zero.
And modern JavaScript is far from being "slow" as many people think. Obviously, we don't have memory management as refined as in C, Go, or Rust.
Today our agent runs with a 20MB memory footprint (mainly used by the V8 stack), and we are far under 1% of CPU usage.
We have already expressed ourselves on the subject, but our objective is to keep performance very close to competitors such as Netdata.
I often hear a lot of complaints about the quality of the ecosystem... and many seem to take this as proof that it's impossible to build something without a black hole of dependencies.
We have carefully thought out and architected our solution, and to date we have no indirect dependencies in our agent (which doesn't mean we're having fun reinventing the wheel).
It's just that there are many very high-quality packages out there that many people overlook (nobody takes the time to do serious research and analysis... and they dare talk about quality and security).
On the other hand, many people simply hate JavaScript and are not open-minded enough to accept that it can produce anything of quality.
Bindings
As I previously indicated... choosing JavaScript does not mean at all that you can avoid dealing with languages like C/C++.
SlimIO is not one of those solutions that run bash scripts on your system. All our metrics are retrieved through very low-level interfaces (as low as possible) and exposed through a binding package.
This ensures optimal execution times as well as the lowest possible resource consumption on the target system.
In our experience so far, most interfaces return in 10–30µs (a time that includes the conversion to JavaScript).
I think that in the long term we will work more and more with bindings written in Rust. However, there is still a lot of work to be done to make this possible (and we clearly don't have the necessary traction at the moment).
I very much appreciate the Rust ecosystem, which for me is one of the only ones that matches the mentality and the idea we are trying to push and build.
Future implementations?
The core of the product (the entity responsible for loading and managing addons and the communication between them) is written in JavaScript. In my opinion, it would be very interesting to explore rewriting it in C++ or Rust one day.
There are quite a few sensitive topics, like isolation, where having access to some low-level V8 APIs would be advantageous (and the same goes for libuv).
This even lets us imagine that it would be possible to develop addons in C++ and Rust.
However, it would change a lot of things, especially the implementation of communications. Having an overly idealistic vision seems dangerous to me... it is quite possible that such a choice would lead to a regression in overall performance.
We need contributors to build a prototype.
Not everything is rosy
Choosing Node.js for an on-premises product is good... but we still had to verify, through several proofs of concept, that the idea was viable.
I personally built two prototypes and did several months of research to make sure we wouldn't run into any critical problems down the road. However, this does not mean that we have no constraints, weaknesses, or problems.
I like to be honest about the weaknesses of our solution, because for me that is the first step toward moving forward and seriously exploring ways to solve them (and maybe even pushing JavaScript further).
So I guess we can go with this list:
- JavaScript is not statically compiled, so we have to embed (bundle) the Node.js executable with the core.
- JavaScript lacks a native way of cancelling asynchronous tasks properly.
- There are some isolation issues when addons run in the same process (not critical as long as the developer avoids big mistakes).
We are working to fix these issues in the long term with Realms and Node.js workers, in order to achieve a fault-tolerant agent (or, as written above, with a new core implementation).
- V8 requires a high amount of memory to optimize slow interpreted code into low-level machine code (via the CodeStubAssembler).
- V8 and the SQLite binding cost a lot in product size (99% of the total size).
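On the cancellation point: recent Node.js versions ship `AbortController`, which gives a conventional (if still manual) way to cancel asynchronous work. A sketch, with `fakeCollect` as an illustrative stand-in for a long-running task:

```javascript
"use strict";

// A long-running task that honours an AbortSignal: the timer is the
// resource to release; the rejection surfaces the cancellation.
function fakeCollect(signal) {
    return new Promise((resolve, reject) => {
        const timer = setTimeout(() => resolve("metrics"), 10000);
        signal.addEventListener("abort", () => {
            clearTimeout(timer);
            reject(new Error("aborted"));
        }, { once: true });
    });
}

async function main() {
    const controller = new AbortController();
    setTimeout(() => controller.abort(), 50); // cancel after 50ms
    try {
        await fakeCollect(controller.signal);
    } catch (err) {
        console.log(err.message); // "aborted"
    }
}
main();
```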
We could simply sum it up as paying the price for software that runs on a just-in-time compiler. Technically speaking, this is the key detail that sets us apart from our competitors (for better or worse, depending on how you look at it).
⚠️ Note, however, that the agent is delivered as an executable and that only the addons' code is readable.
Ironically, some of our weaknesses are offset by some of the ecosystem's strengths, like all the tools that let us tree-shake and eliminate dead code in addons (which counterbalances the weight cost a bit).
Conclusion
Here is the path that led us to JavaScript and Node.js (even if I bet that C++ and Rust will surely be a big part of our product's history).
The solution does not aspire to be the most robust or the fastest. It aspires to be a unified, higher-level foundation for framing and guiding overall IT monitoring needs (infrastructure, APM, logs...) even in sensitive contexts.
It must be clearly understood that, in the long term, nothing stops us from answering critical needs through extensions written in Rust, Haskell, or whatever.
This is part of an ideology that is obviously our own.
I will come back to some of these points in detail in the next articles of the series (like exploring the subject of reactivity, or what I mean by one true abstraction for the agent).
Thank you for taking the time to read.
Best Regards,
Thomas