The title says it all. In general, is it better to have multiple small files linked together, or one large file?
At least on the web, I know that you should have one or very few CSS stylesheets, to better control the style across different pages and all. As for HTML, it's pretty logically one per page. But what about JS?
And most importantly, the main point of my question is, what about other environments and other languages? Does this depend on the language or the goal? If so, what are some examples or cases where one method is better than the other?
Regardless of which is better, is it defined just by convention, or is it because of performance?
BTW, this is my first post, so I'm sorry if I missed something, and please let me know!
The idea of "files" is, for the most part, an abstraction developed to help people make sense of the contents of a disk. It's a lot easier to envision chunks of storage as documents with names and extensions and so forth than it is to try to work with byte offsets and lengths. Think of trying to find your way: street signs and buildings and landmarks help you orient yourself by dividing and delimiting space, but if you're lost in the desert, one sand dune looks much like the next.
It is perfectly possible to write programs in a single file in many languages, but tools that allow multiple source files to be "linked" into the finished product date back half a century for a reason: it makes an enormous difference to our ability to comprehend how individual parts of the system work and interact with each other. The ways in which programs may be divided into source files vary wildly depending on language, purpose, convention, and taste. But those are all essentially human factors. It doesn't matter to the computer.
Extending on this, I'd say that my reasoning is that you should split files by logical group, and ensure that they stay on topic. It's quite easy when you look at the difference between a CSS or JS file. However, each of those can be split by topic / purpose. You'll then find that there's opportunity for reuse rather than duplication :-)
I do like that you mention that the notion of "many files" is mostly a human construct, and I agree fully :-D
AgentDenton is right in the sense that files in a program should be logically grouped.
To extend on Dian's awesome answer, the number of files (or how you split the files in a program) depends on what programming language you use and what the purpose of your program is. Generally speaking, in a production-ready application (or a service, etc.), the program should be split into multiple files that are grouped together based on the behaviour they provide.
For example, if you are building an application that reads files into memory and then puts them in a database, it makes sense to split the steps into files that are grouped accordingly (e.g., you can have the file load step, the decoding step and the persistence step).
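To make the load / decode / persist split concrete, here is a minimal sketch in Python. The file names in the comments (`loader.py`, `decoder.py`, `persistence.py`, `main.py`), the JSON input format, and the `people` table are all assumptions made up for illustration; everything lives in one block here only so it can be run as-is.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


# --- loader.py (hypothetical): only knows how to read bytes from disk ---
def load_file(path: Path) -> bytes:
    return path.read_bytes()


# --- decoder.py (hypothetical): only knows how to turn bytes into records ---
def decode(raw: bytes) -> list[dict]:
    return json.loads(raw.decode("utf-8"))


# --- persistence.py (hypothetical): only knows how to store records ---
def persist(conn: sqlite3.Connection, records: list[dict]) -> int:
    conn.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO people (name, age) VALUES (:name, :age)", records
    )
    conn.commit()
    return len(records)


# --- main.py (hypothetical): wires the three steps together ---
def run_pipeline(path: Path, conn: sqlite3.Connection) -> int:
    return persist(conn, decode(load_file(path)))


def demo() -> int:
    """Write a small sample file, run the pipeline, return the row count."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "people.json"
        source.write_text(
            '[{"name": "Ada", "age": 36}, {"name": "Linus", "age": 54}]'
        )
        conn = sqlite3.connect(":memory:")
        try:
            return run_pipeline(source, conn)
        finally:
            conn.close()
```

Each step can then be read, tested, or swapped out (say, a CSV decoder instead of JSON) without touching the other two.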
It is quite important to get a feel for splitting up your program into multiple files so that (and quoting Dian here) it can "help you orient by dividing and delimiting". This also applies to functions, packages (or modules) etc. In object-oriented programming (think Java, C++) one good rule is the "single responsibility principle" which states that "every function, class (in our case file) or module should have responsibility over a single part of the functionality provided by the software". This is basically saying that in most cases (certainly not all or not in all programming languages) your files should do one thing only and do it well. Try using this rule the next time you code a small application and see if it helps your code become more readable and clean.
Enlightening! I guess it does depend greatly on the people writing, using, and maintaining the code.
Thank you for your answer!
I'd like to share this amazing talk by Evan Czaplicki, the creator of Elm: The life of a file.
In Haskell and Elm, I like to think of files (i.e. modules) as small libraries that offer good APIs. Then, other modules build upon them to offer higher-level APIs, and so on. I try to apply this idea in other languages when possible.
1/ Do not confuse the files that are used to write your code with the files that you deploy. You can perfectly well work with many small files that are packaged into one single file when deploying.
2/ As far as version control and collaboration are concerned, many small files is certainly the way to go. Merges will be far easier.
I prefer small to mid-sized files.
I think that files should act like bookmarks for your code that help you and other developers understand what code does what, and easily find a specific group of code when something needs to be changed.
My advice is to use one file per module or class, or when it comes to languages that don't use modules and classes (like CSS), to try to pick a specific 'theme' for what the code in that file does.
Large files in and of themselves aren't necessarily a problem, but I find they often indicate a class or module that's trying to do too much, and thus breaking the single responsibility principle.
Actually, version control encourages us to have plenty of small files rather than one big file. Working on different files across a team helps avoid merge conflicts!
That's only true for the final build, not for the development environment.
It is much better to have small files with good names that target only small sections or modules. It's easier to browse, easier to understand, and even easier to manage. Then, of course, they should be "compiled" into a single file, because that's an HTTP/1 optimization.
In fact, modern JS frameworks also encourage this approach, like Components in React.
I don't know about other languages, though.
It is a lot easier to find what you need in a nested directory structure of many files, rather than a single file or even a flat directory of files with no structure. Use the filesystem to your advantage!
I have a general guideline of trying to keep files, of any kind, under 200-300 lines each. Regardless of whether it is CSS, JS, Java, Kotlin, raw HTML, or anything else, you will do well to find a framework/toolkit/workflow that will compile multiple files together so you can break them apart logically. If a file is getting larger than just a couple hundred lines, it is probably doing too much and should be broken up, so that each file can focus on doing a single thing well.
Wow, so many answers! Thank you everybody for your valuable input.
It's great to see different opinions from different backgrounds, but I see that generally the agreement is that having more small files tends to be better for the human side of building and maintaining programs. As the computer doesn't really "care" about how many files there are, it's a matter of keeping all the developers, present and future, in mind, when deciding how to organize code. Thanks!
Splitting your JS into multiple files is more readable for me. And I don't think it's a bad thing, because even if you load several scripts in an HTML file, for example, the browser will run classic (non-module) scripts one after another in the same global scope, much as if they were one giant script.
For front-end assets, it's straightforward these days to break things down logically into smaller files to work with them more easily, and then to concatenate them during the build process. So you can have many CSS/SCSS and JS files, but only serve one of each in production. That way you can have the best of both worlds.
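The "many files in development, one file in production" idea can be sketched as a tiny concatenation step. This is only an illustration in Python, with made-up file names (`reset.css`, `buttons.css`, `site.css`); real build tools typically also minify, resolve imports, and handle source maps, none of which is attempted here.

```python
import tempfile
from pathlib import Path


def bundle(sources: list[Path], output: Path) -> None:
    """Concatenate source files, in the given order, into one output file."""
    parts = []
    for src in sources:
        # A comment marker makes it easier to trace bundled code back to its file.
        parts.append(f"/* {src.name} */\n{src.read_text()}")
    output.write_text("\n".join(parts))


def demo() -> str:
    """Create two small stylesheets, bundle them, and return the result."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "reset.css").write_text("body { margin: 0; }\n")
        (root / "buttons.css").write_text(".btn { padding: 4px; }\n")
        out = root / "site.css"
        bundle([root / "reset.css", root / "buttons.css"], out)
        return out.read_text()
```

Order matters for CSS and for classic scripts, which is why `bundle` takes an explicit list rather than globbing a directory.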
It's nearly always easier to understand lots of small files for server-side code too, and it's not likely to affect performance.
Basically, if you need (almost) all data from a file, keep it in one file. If you need just a small subset at a time and you can explicitly request a specific subset, keep subsets in separate files.
Let's say you have to create an address book for a small company. It contains a few offices and you will want to show all of them at once. It makes sense to keep the addresses in one file. But what if you were to create an address book of all companies in your country?
You would probably need to find a way to group addresses and keep them in separate files. Otherwise, it would be difficult to send so much data to a client or even open it on a server.
One solution would be to group addresses alphabetically. One file would contain companies starting with 'a', another starting with 'b' and so on. Then, your address book could allow a user to pick the first letter and reduce the amount of data loaded unnecessarily. You could further reduce file size by grouping addresses by the first two letters: 'aa', 'ab', 'ac'... Then, you could easily create a search on your site that would work when a user enters the first two letters.
This is pretty much how databases work with indexes and partitioning.
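The alphabetical grouping described above can be sketched in a few lines of Python. The function names (`partition_key`, `partition`, `lookup`) and the sample company names are hypothetical, and in-memory lists stand in for the separate files; this is a toy model of the idea, not how any particular database implements indexes.

```python
from collections import defaultdict


def partition_key(name: str, width: int = 1) -> str:
    """Return the lowercase prefix that decides which 'file' a company lands in."""
    return name.strip().lower()[:width]


def partition(companies: list[str], width: int = 1) -> dict[str, list[str]]:
    """Group company names into buckets keyed by their first `width` letters."""
    buckets: dict[str, list[str]] = defaultdict(list)
    for name in companies:
        buckets[partition_key(name, width)].append(name)
    return dict(buckets)


def lookup(buckets: dict[str, list[str]], query: str, width: int = 1) -> list[str]:
    """Scan only the one bucket matching the query's prefix.

    Assumes the query has at least `width` characters, as in the
    "user enters the first two letters" scenario.
    """
    candidates = buckets.get(partition_key(query, width), [])
    q = query.strip().lower()
    return [c for c in candidates if c.lower().startswith(q)]


companies = ["Acme", "Abacus", "Bolt", "Apex"]
by_one_letter = partition(companies, width=1)
by_two_letters = partition(companies, width=2)
```

With `width=2`, a search for "ab" touches only the bucket holding "Abacus" instead of every address in the book.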
Sometimes, even if you need to send all the data to a user, you might want to keep it in separate files. Back in the days of floppy disks, a game would consist of 19 archive files (rar/zip), each under 1.44 MB. This is still a relevant approach in the days of the internet, as you can deliver "just enough" data to a user more quickly and load subsequent packages while the user is enjoying the data.
When developing I'd say the best practice is to have a good balance between the number of files and their size. If you have too many very small files or too few huge files your productivity will suffer anyway.
When delivering to clients instead you do the best thing for the client platform/interpreter (browser, java vm, os, etc).
For browsers specifically, you usually want to minimize the number of requests needed to get all the resources so that the page can be rendered quickly.
For me, 1 file = 2-3 Components/Widgets. I think it relates to Separation of Concerns.
This can really depend on the sizes and what you are doing.
In situations like web development, you sometimes need CSS files for all pages and other times for just one page. It is good to have them split up, as it reduces the file size most of the time. The downside is the additional call to the server for the extra file when you need it. However, the main CSS file will typically be cached, in which case you end up fetching only one file.
However, there are times when too many files have the opposite effect. A good example would be icons. Let's say you load 40 images for icons. These images may be only 2 KB each, but the header data for each request will be around 100 bytes, so header data ends up being about 5% of each request. You could reduce this by using one large image and using CSS to split it up, similar to a sprite sheet. (Yes, there are problems with scaling, blah blah blah.)
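The arithmetic behind the icon example can be written out as a short Python sketch. The numbers come from the comment above; treating "2 KB" as 2000 bytes is an assumption made for round figures, and the sketch assumes the combined sprite image is no larger than the 40 icons put together.

```python
def header_overhead(num_requests: int, header_bytes: int) -> int:
    """Total bytes spent on headers across all requests."""
    return num_requests * header_bytes


ICONS = 40
ICON_BYTES = 2000    # "2 KB" taken as 2000 bytes for round numbers (assumption)
HEADER_BYTES = 100   # per-request header size used in the comment above

# 40 separate requests versus one request for a single sprite image.
separate = header_overhead(ICONS, HEADER_BYTES)
sprite = header_overhead(1, HEADER_BYTES)

# Header size relative to one icon's payload, and the bytes saved overall.
overhead_ratio = HEADER_BYTES / ICON_BYTES
saved = separate - sprite
```

Here `separate` is 4000 bytes, `sprite` is 100 bytes, and `overhead_ratio` is 0.05, which matches the 5% figure. Real request headers are often larger than 100 bytes, so the actual saving may differ.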
You have to focus on what slows you down. For web development, it is the network first and then processing. Depending on the browser, maybe it has to move memory around or whatever it may be, but the internet speed is the biggest factor. A local program, by comparison, is reading from a local drive. While this still takes time (from the computer's point of view), it is almost nothing compared to sending packets thousands of miles. In most cases, even a slow hard drive spinning up is still faster.
From a local program's perspective, it gets far more complicated. You could have files on different drives or in different locations. However, let's just run through an easy example.
Let's say you have a running program that loads a settings file containing some settings and maybe history.
Typically you read, parse, update. You are only doing this action 1 time.
However, you could split this action up into 3, and it would create a similar issue to the web dev version above. Instead of requests, you're moving data in and out of memory. While it is faster than the web, it still takes more time than 1 large file. More processing, more time.
But things really change when you start talking about multithreading. If you can do these 3 things at the same time, it could be faster than just 1 large file: while it is still more processing, you are splitting it up among multiple cores.
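As a rough illustration of loading three files at the same time, here is a sketch using Python's standard `concurrent.futures` thread pool. The three file names (`settings`, `history`, `cache`) and the JSON format are assumptions for the example. Note that in standard CPython, threads mainly help while waiting on I/O; spreading CPU-heavy parsing across cores would usually need processes instead.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_json(path: Path) -> dict:
    """Read and parse one JSON file."""
    return json.loads(path.read_text())


def load_all(paths: list[Path]) -> list[dict]:
    """Load every file concurrently, one worker thread per file."""
    # map() preserves input order, so results line up with `paths`.
    with ThreadPoolExecutor(max_workers=max(1, len(paths))) as pool:
        return list(pool.map(load_json, paths))


def demo() -> list[dict]:
    """Write three small hypothetical data files and load them in parallel."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        paths = []
        for name in ["settings", "history", "cache"]:
            path = root / f"{name}.json"
            path.write_text(json.dumps({"kind": name}))
            paths.append(path)
        return load_all(paths)
```

For files this small, the thread setup cost likely outweighs any gain; the split only pays off when each load is slow enough to be worth overlapping.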
There can be many bottlenecks with this, some things like graphics or devices can't take multiple calls at the same time, so that could also slow you down.
However, with multiple files, just like on the web, you can update, reload, or change one of them without having to change all the files. This is great when you have specific files that change often, like history, while settings might never change, or only very rarely.
There is so much to this... I tried to summarize it the best I could, but it can depend on so much. This is just assuming you're fetching data files. It becomes a whole other thing when you're talking about programs that use other programs or scripts.
Many large files 😬