Diego Carrasco Gubernatis

Originally published at diegocarrasco.com

How to find files using regex (regular expressions) on the GNU/Linux and macOS command line and do something with them

I needed to find all the files that matched a regular expression. In this case, all the files that had either 300x200 or 400x220 in their names and were either png or jpg files.

You can do that pretty easily in Bash using find, but the more file types and conditions you add, the longer the command gets.

For example, the following would be used to find all png and jpg files in the /tmp folder:

find /tmp -name '*.png' -or -name '*.jpg'
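To see how quickly this grows, here is a minimal sketch of the task from the introduction (both sizes, both extensions) written with -name tests only. The scratch folder and the file names are made up for the demo, and -o is the POSIX spelling of -or.

```shell
# Scratch folder with hypothetical sample files, so nothing real is touched.
demo_dir=$(mktemp -d)
touch "$demo_dir/photo-300x200.jpg" "$demo_dir/logo-400x220.png" \
      "$demo_dir/photo.jpg" "$demo_dir/notes-300x200.txt"

# One -name test per size/extension combination, grouped with \( \)
# so that -type f applies to every alternative.
name_matches=$(find "$demo_dir" -type f \( \
    -name '*300x200.jpg' -o -name '*300x200.png' -o \
    -name '*400x220.jpg' -o -name '*400x220.png' \) | sort)

# Lists only the two image files that carry one of the sizes.
echo "$name_matches"

# Clean up the scratch folder.
rm -r "$demo_dir"
```

Every extra size or extension adds more -name tests, which is the growth that a regular expression avoids.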

Enter regular expressions :)

They are really useful magic, but really hard to learn.

Important: the find command on macOS and on GNU/Linux (Ubuntu, Debian, etc.) is slightly different, and not all syntax works on both systems.

What is a regular expression?

A regular expression is a special text string for describing a search pattern. You can think of regular expressions as wildcards on steroids. You are probably familiar with wildcard notations such as *.txt to find all text files in a file manager. The regex equivalent is .*\.txt. -[regexbuddy]

Find all the jpg and png files.

On Mac
find -E . -regex '.*\.(jpg|png)'

On Linux
find ./ -regextype posix-extended -regex '.*\.(jpg|png)$'

Find all the jpg and png files which have 300x200 or 400x220 in their filenames.

On Mac
find -E . -regex '.*(300x200|400x220)\.(jpg|png)'

On Linux
find ./ -regextype posix-extended -regex '.*(300x200|400x220)\.(jpg|png)$'

Those commands will find all the files named like something-300x200.jpg or somethingelse400x220.png.

So now you see a pattern. You can use the same regular expression (the part between the single quotes) on macOS and Linux; only the find options around it change. The $ at the end of the Linux examples is optional, because -regex always has to match the whole path.
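Since only the options differ, a small wrapper can hide the difference. This is a sketch, not something find provides: find_regex is a hypothetical helper name, and it assumes BSD find on macOS and GNU find everywhere else.

```shell
# find_regex DIR REGEX: hypothetical helper, not part of find itself.
# Picks the extended-regex syntax based on the operating system.
find_regex() {
    case "$(uname -s)" in
        # macOS ships BSD find, which enables extended regex with -E.
        Darwin) find -E "$1" -regex "$2" ;;
        # Anything else is assumed to have GNU find.
        *)      find "$1" -regextype posix-extended -regex "$2" ;;
    esac
}

# Same expression on both systems.
find_regex . '.*(300x200|400x220)\.(jpg|png)'
```

Other BSD systems also use -E, so the case statement would need another branch there.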

Here are some useful regular expressions you may want to know.

Find all the png and jpg files with SAM somewhere in the filename

'.*(SAM).*\.(jpg|png)'
This would find, among others, the following files:

[Image: Files found by previous expression]

On Mac
find -E . -regex '.*(SAM).*\.(jpg|png)'
On Linux
find ./ -regextype posix-extended -regex '.*(SAM).*\.(jpg|png)$'

Now let’s learn xargs

xargs is a command on Unix and most Unix-like operating systems used to build and execute commands from standard input. It converts input from standard input into arguments to a command. -[Wikipedia]
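A tiny sketch of what that means in practice, using echo as a harmless stand-in command (the words one, two and three are just sample input):

```shell
# xargs reads the three lines from standard input and runs: echo one two three
printf 'one\ntwo\nthree\n' | xargs echo
# prints: one two three

# With -n 1, xargs runs the command once per item instead,
# so each word is printed on its own line.
printf 'one\ntwo\nthree\n' | xargs -n 1 echo
```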

It’s great to be able to find all those files, but you should be able to do something with them once you have found them, right?
That’s where xargs comes in. Let’s say you would like to remove all the jpg and png files that have SAM in the filename.
On Mac
find -E . -regex '.*(SAM).*\.(jpg|png)' | xargs rm
On Linux
find ./ -regextype posix-extended -regex '.*(SAM).*\.(jpg|png)$' | xargs rm

That was easy, right? And it’s really handy to be able to delete some files from all folders, for example all the automatically generated thumbnails in WordPress and other systems. Let’s say you have to clean a folder with several sub-folders and only leave the original files. On a Mac you can run the following command to remove all the files with a resolution in their name (for example something-AAAxBBB.jpg). Add or remove resolutions as needed.

find -E . -regex '.*(1024x576|278x380|506x380|1024x576|508x380|1024x681|550x366|550x309|350x220|380x380|768x768|285x380|768x1024|1024x1024|213x380|576x1024|550x367|253x380|681x1024|1024x768)\.(jpg|png)' | xargs rm
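Plain xargs splits its input on whitespace, so a file called my photo-1024x576.jpg would be handed to rm as two separate names. A more careful variant could look like the sketch below. It is not the command above: it previews the matches first, passes the names null-delimited (-print0 and xargs -0, available in both GNU and BSD tools), and, as an assumption, matches any -WIDTHxHEIGHT suffix instead of listing each resolution. That generic pattern would also delete an original that happens to be named this way. The scratch folder, the file names and the find_thumbs helper are made up for the demo.

```shell
# Scratch folder with hypothetical file names, so the demo deletes nothing real.
demo_dir=$(mktemp -d)
touch "$demo_dir/my photo-1024x576.jpg" "$demo_dir/my photo.jpg" \
      "$demo_dir/logo-350x220.png"

# Assumption: every thumbnail ends in -<digits>x<digits> before the extension.
pattern='.*-[0-9]+x[0-9]+\.(jpg|png)'

# find_thumbs ACTION: hypothetical helper that picks the right syntax
# (assumes BSD find on macOS and GNU find elsewhere).
find_thumbs() {
    if [ "$(uname -s)" = "Darwin" ]; then
        find -E "$demo_dir" -type f -regex "$pattern" "$@"
    else
        find "$demo_dir" -regextype posix-extended -type f -regex "$pattern" "$@"
    fi
}

# 1. Preview: print what would be deleted, without deleting anything.
find_thumbs -print

# 2. Delete: names are separated by a null byte, so spaces are safe.
find_thumbs -print0 | xargs -0 rm

# Only the original file is left.
remaining=$(ls "$demo_dir")
echo "$remaining"

# Clean up the scratch folder.
rm -r "$demo_dir"
```

Running the preview step on its own before the delete step is a cheap way to catch a pattern that matches too much.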
