Execute commands for multiple files using fish

Daniel Rotter. Originally published at danielrotter.at

Recently I downloaded some submissions from a course I am currently giving. Luckily the tool being used allows downloading an archive with all submissions at once, but unfortunately the structure of the archive is not very handy. It contained a folder for every student, and each of these folders contained another zip archive with the actual submission of the given student.

In addition to that, every single archive contained a JavaScript project, in each of which I had to execute npm install in order to test the submission afterwards. So I could either go into each folder, unzip the archive and install the npm packages for each and every one of them, or I could take some time and figure out how to unzip all archives and install the required packages with two small command line scripts.

Of course I chose the latter!

First of all I had to find a glob that matches all the files I want to execute the command (unzip in my case) for. The ls command is great for testing this:

$ ls */*.zip

ls is usually used to list the content of entire directories, but it can also list individual files passed to it. This is really useful in combination with the wildcards the fish shell provides. fish tries to match the */*.zip expression to as many files as possible, whereby each * can be a string of any length not containing /, which is the folder delimiter on Linux. That means it will find any arbitrarily named file ending in .zip that is located directly in a sub folder of the current directory. So the above command returned file names like erika_mustermann/submission.zip and john_doe/assignment.zip for me, which indicated that I was on the right track. I always start tasks like this that way, because I want to make sure that I am not accidentally executing the commands for the wrong files.
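To see the single-level matching in isolation, here is a small sketch that builds a throwaway directory tree and runs the same glob against it. All directory and file names are made up for the demo, and it assumes mktemp is available on the system.

```fish
# Build a throwaway directory tree (every name here is a placeholder).
set demo (mktemp -d)
mkdir -p $demo/erika_mustermann $demo/john_doe/nested
touch $demo/erika_mustermann/submission.zip
touch $demo/john_doe/assignment.zip
touch $demo/john_doe/nested/too_deep.zip
touch $demo/top_level.zip

cd $demo
# Each * stops at a /, so only the two archives sitting exactly one
# folder deep are listed; top_level.zip and nested/too_deep.zip are not.
ls */*.zip
cd -

# Clean up the demo tree again.
rm -r $demo
```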

Next step: Figure out how to use the for loop in fish. I still preferred to play it safe, so I decided to only output the names of the files using the echo command within the for loop.

$ for file in */*.zip;
    echo $file;
end

Basically the for loop in fish is a simple for .. in construct. The name after the for is used as the variable name (mind that it is defined without a prefix, but a $ is required when accessing it), and the part between in and ; is the list that is being looped over, which in our case is every file that matches the glob. Other than some differences in whitespace, the output should be pretty much the same as that of the previous ls command.
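The same construct works with any list of words, not just a glob. A minimal sketch with two placeholder names shows the variable being declared without a prefix and read with a $:

```fish
# Loop over a literal list instead of a glob; the names are placeholders.
for name in erika_mustermann john_doe
    # "name" is declared without a prefix above, but read as $name here.
    echo "student: $name"
end
# prints:
# student: erika_mustermann
# student: john_doe
```

Written on a single line, the line breaks simply become semicolons: for name in erika_mustermann john_doe; echo "student: $name"; end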

So I knew that the for loop was receiving the correct values, and I could start writing the actual command I wanted to execute for all of these files. I ended up with the following:

$ for file in */*.zip;
    unzip $file -d (dirname $file);
end

The unzip command takes the name of the archive file you want to extract. The only problem is that it will extract the archive’s content into the current directory instead of right next to the archive. Fortunately the unzip command comes with a -d option that lets you specify the directory you want the archive to be extracted to. However, this option expects a directory, not a file. All I’ve got so far is the file name of the archive, but thankfully dirname exists as well. That’s a command that removes everything after the last / in the given string, including the / itself. So it is a great utility to get the parent directory of another file or directory. I use that command in combination with fish’s command substitution, which allows me to execute another command and use its output as a parameter of an outer command. That is done by putting the inner command in parentheses.
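Both pieces can be tried on their own without touching any archive, because dirname only works on the string it is given. A small sketch, using one of the example paths from above:

```fish
# dirname strips the last path component, including the / before it.
dirname erika_mustermann/submission.zip
# prints: erika_mustermann

# Command substitution: the output of the inner command in parentheses
# becomes an argument of the outer command.
echo "would extract into:" (dirname erika_mustermann/submission.zip)
# prints: would extract into: erika_mustermann
```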

The above command unarchived all zip files right where they are located. But some of the students added another root folder in their archive, and others didn’t. So the structure was not always the same, but I still wanted to run npm install for all of them at once. Therefore I decided to use another for loop with a different glob. The ** wildcard is similar to the * wildcard, but with one important difference: It also matches /, meaning that it matches an arbitrary number of sub folders, not just a single one. This way it does not matter if another folder has been added as root in the archives.
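The difference between the two wildcards is easy to check on a throwaway tree. In this sketch the folder names flat, wrapped and project are made up to mimic one submission without and one with an extra root folder, and mktemp is assumed to be available.

```fish
# Two fake submissions: one without and one with an extra root folder.
set demo (mktemp -d)
mkdir -p $demo/flat $demo/wrapped/project
touch $demo/flat/package.json
touch $demo/wrapped/project/package.json

cd $demo
# A single * covers exactly one folder level: only flat/package.json.
for file in */package.json
    echo "single star: $file"
end
# ** also crosses /, so both files are found.
for file in **/package.json
    echo "double star: $file"
end
cd -

# Clean up the demo tree again.
rm -r $demo
```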

In addition to that I also make use of the fact that multiple commands can be used within a for loop if they are delimited by a ;.

$ for file in **/package.json;
    cd (dirname $file);
    npm install;
    cd -;
end

This loop finds all package.json files in any sub directory of the current directory and changes to the directory containing each file, again using the dirname command with the $file variable. Afterwards npm install is executed to install all the dependencies, and once that is finished it goes back to the previous working directory (that’s what the - stands for).
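In the same play-it-safe spirit as the echo loop earlier, a dry run can show where npm install would be executed before anything is installed. The sketch below is my own addition and not part of the original workflow: it also skips paths containing node_modules, since a second run of the glob would otherwise pick up the package.json files of the dependencies installed by the first run.

```fish
for file in **/package.json
    # Skip manifests of already installed dependencies (matters on a re-run).
    if string match -q '*node_modules/*' $file
        continue
    end
    # Only print the target directory instead of changing into it.
    echo "would run npm install in:" (dirname $file)
end
```

Once the printed directories look right, the echo line can be swapped for the cd, npm install and cd - lines from the loop above.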

For some reason I was always a bit hesitant to use loops on the command line. Somehow the command line always felt to me like something into which I enter a single command and get a result in return, which is a pretty simple protocol. Using stuff like pipes still felt ok to me, but a loop might need to be broken into multiple lines quite quickly, which does not feel natural to me. But loops are probably a much bigger time saver than other features of shells, so checking them out is highly recommended!
