sed Expression
echo "$var" | sed 's/\(.*\)\/\(.*\)\.\(.*\)$/\2\n\3\n\1/'
The regex used is:
\(.*\)\/\(.*\)\.\(.*\)$
Let's analyze it step by step, using this example path:
/User/talha/content/images/README.example.md
Extract Directory From Path
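The first part of the regex is: \(.*\)\/
- \( opens a capture group. In sed's default (basic) regex syntax, parentheses only group when they are written with a backslash.
- . means any character
- * means any number of times
- Combined, .* means any sequence of characters
- \) closes the capture group
- \/ matches a literal /. The backslash is needed because / is also the delimiter of the s command.
The \( and \) pair is used for capturing the match: sed matches the string and remembers whatever matched between them.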
You reference the first captured match using \1, the second using \2, the third using \3, and so on. In this example, .* sits between \( and \), therefore sed captures it.
Here, .* starts matching at the beginning of the string. Because \(.*\) is followed by \/, sed matches characters from the start up to a / character.
The sed regex matcher is greedy, meaning it selects the longest possible match.
In our example path, sed does not stop matching at the / after /User. Instead, it keeps matching up to the last / in the string. Hence, out of
/User/talha/content/images/README.example.md
the first part matches /User/talha/content/images/ (including the trailing /).
Because .* is enclosed in \( and \), sed captures what it matched. It can be referenced using \1, since it is the first capture in the expression.
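To see the greedy first part on its own, here is a minimal sketch that keeps only the first capture. The variable names path and dir are made up for illustration, and the trailing .* is only there to consume (and so drop) the rest of the line.

```shell
# Example path from the article; "path" and "dir" are illustrative names.
path="/User/talha/content/images/README.example.md"

# Keep only the first capture: everything before the last "/".
# The trailing .* swallows the rest of the line so it is not printed.
dir=$(printf '%s\n' "$path" | sed 's/\(.*\)\/.*/\1/')

printf '%s\n' "$dir"
# prints /User/talha/content/images
```

If the matcher were not greedy, the capture would stop at the first / instead of the last one.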
Extract Filename From Path
The second part of the regex is: \(.*\)
- \( opens a capture group
- . means any character
- * means any number of times
- Combined, .* means any sequence of characters
- \) closes the capture group
So .* matches any sequence of characters, and the surrounding \( and \) capture it. This is the second capture, hence it is referenced using \2 in the replacement.
But where does sed start this match? Right where the first part ended, just after the last / in the path, at the start of README.example.md
Where does the match end? Good question. It depends on the third part of the regex.
Extract File Extension From Path
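The third part of the regex is: \.\(.*\)$
- \. matches a literal . character. Without the backslash, . would mean any character.
- \( opens a capture group
- .* means any sequence of characters
- \) closes the capture group
- $ anchors the match at the end of the string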
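In effect, sed finds the last . in the string, matches every character between that . and the end of the string, and captures those characters. It is the last . rather than the first because the second part's .* is greedy and takes as much of the filename as it can.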
This part of the regex matches the trailing .md of:
/User/talha/content/images/README.example.md
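The second and third parts can also be tried in isolation. The sketch below is not the article's full expression; it uses two shorter, hypothetical variations that each keep a single capture, with name and ext as illustrative variable names.

```shell
path="/User/talha/content/images/README.example.md"

# Filename only: the text between the last "/" and the last ".".
name=$(printf '%s\n' "$path" | sed 's/.*\/\(.*\)\..*$/\1/')

# Extension only: the text after the last ".".
ext=$(printf '%s\n' "$path" | sed 's/.*\.\(.*\)$/\1/')

printf '%s\n' "$name"   # prints README.example
printf '%s\n' "$ext"    # prints md
```

Note that the filename keeps its inner dot (README.example), because the greedy .* pushes the split to the last . in the string.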
What Is Matched
When all these parts are combined, each part matches the following piece of /User/talha/content/images/README.example.md
- First part: /User/talha/content/images/
- Second part: README.example
- Third part: .md
What Is Captured
Notice that in the first part, \/ comes after the closing \). In the third part, \. comes before the opening \(. Because they sit outside the capture groups, they are matched but not captured.
To understand the difference, compare the captured result with the matched result above.
- \1 captures /User/talha/content/images
- \2 captures README.example
- \3 captures md
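One way to check what ends up in each capture is to print the three groups wrapped in square brackets. This is only an illustrative sketch: the brackets are plain literal text in the replacement, and captures is a made-up variable name.

```shell
path="/User/talha/content/images/README.example.md"

# Wrap each capture in [ ] to make its exact boundaries visible.
captures=$(printf '%s\n' "$path" | sed 's/\(.*\)\/\(.*\)\.\(.*\)$/[\1] [\2] [\3]/')

printf '%s\n' "$captures"
# prints [/User/talha/content/images] [README.example] [md]
```

The / before README and the . before md are missing from the output, since they were matched outside the capture groups.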
Replace Pattern
Let's focus on the replace pattern of the sed expression: \2\n\3\n\1
- \2 prints the second captured group, which is the filename
- \n prints a new line
- \3 prints the third captured group, which is the file extension
- \n prints a new line
- \1 prints the first captured group, which is the directory
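The order and separators in the replace pattern are up to you. As a sketch of one possible variation, the example below joins the groups with | instead of \n and then splits the line into shell variables. The names fields, name, ext, and dir are illustrative, and the approach assumes the path itself contains no | character.

```shell
path="/User/talha/content/images/README.example.md"

# Same capture groups, joined with "|" so the result stays on one line.
# Assumption: the path does not contain a "|" character.
fields=$(printf '%s\n' "$path" | sed 's/\(.*\)\/\(.*\)\.\(.*\)$/\2|\3|\1/')

# Split the single line back into three variables.
IFS='|' read -r name ext dir <<EOF
$fields
EOF

printf 'name=%s ext=%s dir=%s\n' "$name" "$ext" "$dir"
# prints name=README.example ext=md dir=/User/talha/content/images
```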
Example Output
Let's run our example through the expression:
$ echo "/User/talha/content/images/README.example.md" | sed 's/\(.*\)\/\(.*\)\.\(.*\)$/\2\n\3\n\1/'
README.example
md
/User/talha/content/images
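It is also worth seeing what happens when the pattern does not match. In the sketch below, the path (a made-up example) has no . at all, so the s command finds nothing to replace and sed prints the line as it was.

```shell
# Hypothetical path with no "." anywhere, so the regex cannot match.
no_ext="/User/talha/content/Makefile"

unchanged=$(printf '%s\n' "$no_ext" | sed 's/\(.*\)\/\(.*\)\.\(.*\)$/\2\n\3\n\1/')

printf '%s\n' "$unchanged"
# prints /User/talha/content/Makefile
```

So a single line of output that equals the input is a sign that the path did not have the directory/filename.extension shape the expression expects.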
sed for macOS Users
The sed version that comes with macOS does not support \n in the replacement pattern. One fix is to install gnu-sed:
brew install gnu-sed
Then replace sed with gsed in the command.
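If installing another sed is not an option, an alternative sketch is to put a real line break in the replacement, preceded by a backslash. POSIX specifies this form, so it should behave the same in both the macOS sed and GNU sed. The variable name result is illustrative.

```shell
path="/User/talha/content/images/README.example.md"

# A backslash followed by an actual line break inserts a newline in the
# replacement, without relying on the "\n" escape.
result=$(printf '%s\n' "$path" | sed 's/\(.*\)\/\(.*\)\.\(.*\)$/\2\
\3\
\1/')

printf '%s\n' "$result"
```

The single quotes matter here: inside them the shell passes the backslash and the line break to sed untouched.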
Cover Image Attribution: Casey Horner