DEV Community

nepeckman

How do we improve security in the npm ecosystem?

For those who haven't seen this trending elsewhere, a popular npm library executed malicious code on victims' computers. To summarize the thread (though it is worth a read), the maintainer of the library gave control to an unknown individual who claimed to want to maintain it. This individual added a dependency designed to execute malicious code, and people are still trying to figure out what the payload does. While a lot of people are playing the blame game, I'm interested in discussing what practical steps can be taken to limit this attack vector. Should we establish a more rigorous process for giving up control of an npm module? Is our only hope better audit tools? I'm interested in any idea that addresses this security concern.

Top comments (19)

dmfay
Dian Fay

Somebody in the thread suggested treating ownership changes as a major version bump. That seems worth exploring, although my gut feeling is that the only way it'd really work well is to make it part of the semver standard and have the registry automatically bump and republish on any addition to the collaborator/publisher list.
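
A minimal sketch of that rule, assuming plain MAJOR.MINOR.PATCH versions, majors of 1 or higher, and a registry hook that fires when the publisher list changes. `next_version` and `satisfies_caret` are hypothetical names for illustration, not npm APIs.

```python
from typing import Iterable, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def next_version(current: str, old_publishers: Iterable[str],
                 new_publishers: Iterable[str]) -> str:
    """Hypothetical registry policy: any added publisher forces a major bump."""
    major, _minor, _patch = parse(current)
    added = set(new_publishers) - set(old_publishers)
    if added:
        return f"{major + 1}.0.0"
    return current


def satisfies_caret(base: str, candidate: str) -> bool:
    """Approximate a caret range (^base) for majors >= 1:
    same major version, and not older than the base."""
    b, c = parse(base), parse(candidate)
    return c[0] == b[0] and c >= b


# A new publisher joins, so the republished version jumps to 2.0.0 ...
bumped = next_version("1.4.2", ["alice"], ["alice", "mallory"])
# ... and consumers on ^1.4.2 no longer pick it up automatically.
still_auto_installed = satisfies_caret("1.4.2", bumped)
```

The point of the bump is that anyone depending on a caret range stays on the old major until they opt in, which at least forces a deliberate upgrade.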

rhymes

Yeah but then a smart attacker would just release an innocuous major version and then slip in the malware in the next minor one.

xowap
Rémy 🤖

Also it's obvious that people need a way to get paid for working on open-source. Free lucrative startup idea below:

Let's create a service à la Spotify:

  • You pay a fixed amount every month (you choose how much)
  • An IDE/editor/git/... plugin analyzes your code and finds out which packages you use the most
  • Each of those packages can subscribe to the service to receive donations
  • At the end of the month, you can review the money split and when you validate it all the maintainers get paid
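
To make the split concrete, here is a rough sketch under some assumptions: usage is a simple count per package (however the plugin measures it), only packages that signed up get paid, and money is handled in whole cents. `split_subscription` is a hypothetical name, not part of any existing service.

```python
from typing import Dict, Set


def split_subscription(amount_cents: int, usage: Dict[str, int],
                       subscribed: Set[str]) -> Dict[str, int]:
    """Split a monthly amount across subscribed packages, proportional to usage."""
    # Only packages that joined the service and are actually used are eligible.
    eligible = {pkg: n for pkg, n in usage.items() if pkg in subscribed and n > 0}
    total = sum(eligible.values())
    if total == 0:
        return {}

    # Integer division keeps every payout in whole cents.
    shares = {pkg: amount_cents * n // total for pkg, n in eligible.items()}

    # Rounding down can leave a few cents over; give one each to the
    # most-used packages (ties broken by name) so the total adds up.
    leftover = amount_cents - sum(shares.values())
    ranked = sorted(eligible, key=lambda pkg: (-eligible[pkg], pkg))
    for pkg in ranked[:leftover]:
        shares[pkg] += 1
    return shares


# Example: 10.00 a month, three packages in use, two of them signed up.
payout = split_subscription(1000, {"a": 3, "b": 1, "c": 5}, {"a", "b"})
```

The review step at the end of the month would then just show this mapping to the subscriber before anything is paid out.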

herrfugbaum
Pascal

Sounds a bit like tidelift.com/ 🤔
I've seen it being used by chalk, vue and babel.

First, someone purchases the Tidelift Subscription. Then, we scan the subscriber’s open source stack for packages and dependencies. We split up the subscription fee and use it to pay the exact packages they use.

Source

xowap
Rémy 🤖

Well let's do more of that :)

herrfugbaum
Pascal

Yeah this could lead to a better maintained kind-of stdlib for js.
But it might also lead to an even more fragmented ecosystem, where it is most lucrative to publish a lot of one-liners and hope that a big project will use them somewhere in its dependency graph.

It would also leave other kinds of packages more or less unpaid. Take, for example, a CLI app. No one will depend on it, even though it could have millions of downloads. On the other hand, that's a different kind of problem, as it wouldn't have such an impact on the general ecosystem and could be covered by donations, one-time payments, or something like that.

xowap
Rémy 🤖

The revenue split is certainly a tricky question, however at this point it seems obvious that:

  1. Open-source maintainers need a way to get paid
  2. The only thing that seriously dented piracy is Netflix/Spotify/Steam

When it's easier to buy it people tend to do so. I definitely think it's worth working around that idea.

aghost7
Jonathan Boudreau

The problem I see is that we're not dealing with people, we're dealing with organizations. It's a bit odd, but I don't think a company would decide to pay for such a service.

geoff
Geoff Davis

Also Back Your Stack does the same but offers more piecemeal contributions.

jep
Jim • Edited

Not entirely related, but I wanted to mention that Python suffers from the same security issues. Most people don't realize that although PyPI, the Python Package Index, is the primary source for most popular modules, a module's presence on PyPI offers no security assurances at all, and any module hosted there can contain malicious code.

I wonder if methods used to filter malicious JavaScript XSS attacks against web-forms could be similarly employed by a plugin, or other means, to look for specific JS functions or strings, like eval(), before a package could be imported or used. It might be a good starter project for someone who has been considering learning how to write plugins for VSCode, Atom, etc.
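
As a starting point, a scanner along those lines could be as small as the sketch below. The pattern list is my own assumption about what is worth flagging, and `scan_source` is a hypothetical helper; a plain text match like this is easy to evade with obfuscation, so treat hits as prompts for a manual look rather than as a verdict.

```python
import re
from typing import List, Tuple

# Illustrative patterns only; a real tool would need a far richer rule set.
SUSPICIOUS_PATTERNS = {
    "eval call": re.compile(r"\beval\s*\("),
    "Function constructor": re.compile(r"\bnew\s+Function\s*\("),
    "child_process import": re.compile(
        r"""require\(\s*['"]child_process['"]\s*\)"""
    ),
}


def scan_source(source: str) -> List[Tuple[int, str]]:
    """Return (line number, label) for each suspicious pattern found in JS source."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for label, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, label))
    return findings


sample = "\n".join([
    "const x = 1;",
    "eval(payload);",
    "const cp = require('child_process');",
])
report = scan_source(sample)
```

An editor plugin could run something like this over `node_modules` after install and list the findings per package.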

drbearhands
DrBearhands

The possibility of malicious npm packages was pointed out almost a year ago.

The root of the problem cannot be fixed in imperative paradigms. By definition, if you can tell the effects of any piece of code through static analysis, it's declarative (no side-effects), not imperative. And if you can't identify the effects automatically, you're going to have to read and understand the code all by yourself, which greatly reduces the time saved by using someone else's code.

There's a reason functional programmers keep raving about (safe) composition.

Admittedly, you could have an automated system that analyzes a subset of the code/syntax automatically, rather like the safe language extension in Haskell, which might work for a lot of cases.

tiguchi
Thomas Werner

Not really a solution to this problem, but more of a damage control thought... Is there a way to run dependencies of dependencies in some kind of a restricted sandbox environment, where HTTP requests and access to DOM are intercepted, and only whitelisted dependencies get that kind of access? Is it possible to create a scope with fake window and document objects for those dependencies, from which they cannot break out?

If it's possible I guess webpack or whatever is bundling the JS would have to take care of that sandboxing?

herrfugbaum
Pascal
  • Author verification (npm)
    • This could be the solution already, if the verification process can't be tampered with.
  • Automatic major version bumps on maintainer change (npm)
  • Transparent minification (npm/github)
    • This could also be self-made, something like a .travis.yml that makes clear how the package needs to be built.
    • After building the package there should be a file hash that can be compared to the one npm pack produces.
  • Code signing (npm)
  • Mandatory two-factor-authentication (npm/github)
  • Locking package names after deletion / Allow only scoped packages (npm)
    • To prevent stuff like this.
  • Code should be sandboxed by default, no access to network and file system without asking for permission (node.js)
    • Examples: Android apps or Chrome extensions
    • This is afaik a major design feature in deno (See point 4 in the readme), the new TypeScript runtime by Ryan Dahl, the inventor of node.js
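
For the file-hash point in the list above, a rough sketch of the comparison step. It assumes the build is reproducible (same input, byte-identical tarball) and that the published hash is available as a `sha512-<base64>` string, the layout npm lockfiles use for their `integrity` field. `sri_sha512` and `matches_published` are hypothetical helper names.

```python
import base64
import hashlib


def sri_sha512(data: bytes) -> str:
    """Hash a tarball's bytes into an 'sha512-<base64>' integrity string."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def matches_published(local_tarball: bytes, published_integrity: str) -> bool:
    """True if the locally built tarball hashes to the published integrity value."""
    return sri_sha512(local_tarball) == published_integrity


# Stand-in bytes; in practice these would be read from the tarball built
# from the repository and compared with the value from the registry.
built = b"contents of the tarball built from source"
published = sri_sha512(built)

same = matches_published(built, published)
tampered = matches_published(built + b"// extra payload", published)
```

If the two values differ, what was published is not what the repository builds, which is exactly the gap that transparent minification is meant to close.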

hoelzro
Rob Hoelz

I think there are a lot of facets to this, but I think it would be interesting if there were a crowdsourced auditing/vetting system associated with package ecosystems. Auditing is yet another way people can contribute to open source, and it's something a lot of us are doing anyway. Here's how I imagine it would work:

  • New package X is published to npm/pypi/etc - version 1.0.0
  • Prospective auditors would see X/1.0.0 show up as having 0 (or few) audits, and they check out the code, inspect it for flaws, etc
  • If the code passes inspection, each auditor would sign off on the version - not just the package + version, but the code checksum or signature that npm/pip/etc could verify upon install.
  • npm/pip/etc would be configured to only install modules that have passed N audits, or have been audited by certain trusted individuals, web-of-trust style
  • A few months later, package X is updated - version 1.0.1. X/1.0.1 re-enters the queue, and is open for audits - the cycle begins anew
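
The install-time check in the steps above could look roughly like this. The `Audit` record, the `may_install` function, and the default threshold are all assumptions made for illustration; nothing like this exists in npm or pip today as far as I know.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Audit:
    """One auditor's sign-off on an exact package version and code checksum."""
    auditor: str
    package: str
    version: str
    checksum: str


def may_install(package: str, version: str, checksum: str,
                audits: Iterable[Audit], min_audits: int = 2,
                trusted: FrozenSet[str] = frozenset()) -> bool:
    """Allow install if enough distinct auditors, or any trusted auditor,
    signed off on this exact package, version, and checksum."""
    auditors = {
        a.auditor for a in audits
        if (a.package, a.version, a.checksum) == (package, version, checksum)
    }
    return len(auditors) >= min_audits or bool(auditors & trusted)


audits = [
    Audit("alice", "x", "1.0.0", "abc123"),
    Audit("bob", "x", "1.0.0", "abc123"),
    Audit("carol", "x", "1.0.1", "def456"),
]

ok_old = may_install("x", "1.0.0", "abc123", audits)
ok_new = may_install("x", "1.0.1", "def456", audits)
ok_new_trusted = may_install("x", "1.0.1", "def456", audits,
                             trusted=frozenset({"carol"}))
```

Because the sign-off covers the checksum and not just the version number, republishing different code under an already-audited version would fail the check.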

Individuals or organizations could stand up their own installations of this auditing management tool, so that a user could make sure they sign off on code entering their own system, or a company could make sure new or updated dependencies are audited and use this tool to manage that. Who knows - maybe you could attach a bounty to audit a module (who knows how the auditor would prove they did the work, though)!

Obviously the audit is only as good as your trust in your auditors. Maybe it would just push the problem back until things like voting ring attacks pop up around this auditing infrastructure, but it would be a start.

ericherman
Eric Herman

I can imagine an automated audit that might be useful: one that ensures the minified version of a package matches the normal version. Setting up an easy way for anyone to verify this sounds straightforward, although perhaps not easy.

I see a direct parallel between trusting binaries and trusting minified source. For those who are not familiar with Ken Thompson's paper "Reflections on Trusting Trust", I highly recommend reading it:

Reflections on Trusting Trust
Ken Thompson
Communications of the ACM, Vol. 27, No. 8, August 1984
dl.acm.org/citation.cfm?id=358210

xowap
Rémy 🤖

I don't have any numbers to back this up but I believe that an issue in the JavaScript ecosystem is the lack of a decent standard library.

Because of that you need an external library for pretty much everything you do (left-pad anyone?) and that's adding a lot of potentially compromised dependencies in your tree.

If we could unify this JS ecosystem with a large standard library that can also be progressively polyfilled into older browsers this would be a great way to reduce the attack surface.

webmutation
DDd • Edited

I think there is already a company doing that... a Node company, I forgot the name, and I think they were bought by someone.

nodesource.com/ these guys.

ben
Ben Halpern

jep
Jim

I definitely think digitally signed packages would be a good way to go. I think having a central repository for packages may eventually cause issues because of questions like who actually owns the hosted data, who can rightfully access it, and whether charging for access to the packages violates their licenses. Maybe just a service that does digital signing and authentication, so each module can be checked before being loaded, or that raises some sort of warning message should the check of the digital signature fail.