Do you need a piece of code to extract the title, description or frontmatter from markdown dynamically, or are you just curious to know how it is done?
This tutorial shows you how to do it efficiently and step by step.
Just give me the code:
const extractMarkdownMetadata = (markdown) => {
  const charactersBetweenGroupedHyphens = /^---([\s\S]*?)---/;
  const metadataMatched = markdown.match(charactersBetweenGroupedHyphens);
  // match() returns null when the markdown has no frontmatter
  const metadata = metadataMatched ? metadataMatched[1] : "";
  if (!metadata) {
    return {};
  }
  const metadataLines = metadata.split("\n");
  const metadataObject = metadataLines.reduce((accumulator, line) => {
    const [key, ...value] = line.split(":").map((part) => part.trim());
    if (key) {
      accumulator[key] = value[1] ? value.join(":") : value.join("");
    }
    return accumulator;
  }, {});
  return metadataObject;
};
Now, let's explain everything step by step.
Step 1: Declare a function named "extractMarkdownMetadata"
const extractMarkdownMetadata = markdown => {
// The rest of the code will go here
};
extractMarkdownMetadata takes a markdown string as an argument. Let's assume the markdown we want to use is:
`---
title: how to get things done
description: this is greate
tags: money, income, coding
conver_image: ayobami.jpg
---
This is the main body of the article.`
Step 2: Write a regex that matches anything within --- and ---
const charactersBetweenGroupedHyphens = /^---([\s\S]*?)---/;
Clearly, you get the purpose of the regex above but do you understand what each of its units does? Let me explain:
/: indicates we start writing a regex
^: anchors the match to the beginning of the string
---: matches three hyphens
\s : matches a whitespace character (space, tab, newline and more)
\S : matches a non-whitespace character (letters, numbers and symbols)
[\s\S]: matches any character at all, whitespace or not, including newlines
*: matches the preceding element zero or more times; in this case, it operates on [\s\S]
?: on its own, matches the preceding element zero or one time, but placed right after * it makes the quantifier lazy, so "*?" matches as few characters as possible
([\s\S]*?): the parentheses form a capturing group that remembers the text matched inside them
---: matches three ending hyphens
/: indicates the end of the regex
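To see why the lazy "*?" matters, here is a small sketch that compares it with a greedy "*". The sample string and the variable names are made up for this illustration.

```javascript
// Hypothetical sample: the body contains its own "---" line after the frontmatter.
const sample = "---\ntitle: hello\n---\nIntro\n---\nMore text";

const lazy = /^---([\s\S]*?)---/;  // stops at the first closing ---
const greedy = /^---([\s\S]*)---/; // runs on to the last --- in the string

const lazyCapture = sample.match(lazy)[1];
const greedyCapture = sample.match(greedy)[1];

console.log(JSON.stringify(lazyCapture));   // "\ntitle: hello\n"
console.log(JSON.stringify(greedyCapture)); // "\ntitle: hello\n---\nIntro\n"
```

With the greedy version, part of the article body would end up inside the metadata, which is why the lazy form is the safer choice here.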
Step 3: Extract the frontmatter or metadata from a string
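const metadataMatched = markdown.match(charactersBetweenGroupedHyphens);
// match() returns null when there is no frontmatter, so guard before reading [1]
const metadata = metadataMatched ? metadataMatched[1] : "";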
Don't forget, markdown is the string passed to the function as an argument, and now we extract the metadata from it. If we console.log metadata, we should get a string that looks like the one below (the real value also starts and ends with a newline):
"title: how to get things done
description: this is greate
tags: money, income, coding
conver_image: ayobami.jpg"
You might ask: why do we access metadataMatched with index 1 and not 0 or any other number?
It is because the first element of the array (index 0) is the whole match, including the opening --- and the closing ---, while the capturing group picks out the text between ( and ) as the second element of metadataMatched. So, we use 1 to access it.
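A quick sketch makes the difference between the two indexes visible. The sample markdown and the variable name matched are assumptions for this illustration.

```javascript
// Assumed sample input, for illustration only.
const markdown = "---\ntitle: demo\n---\nBody text";
const matched = markdown.match(/^---([\s\S]*?)---/);

console.log(JSON.stringify(matched[0])); // "---\ntitle: demo\n---" (the whole match)
console.log(JSON.stringify(matched[1])); // "\ntitle: demo\n" (the captured group only)
console.log(matched.length);             // 2
```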
Step 4: Split the string of metadata into lines of an array
if (!metadata) {
return {};
}
const metadataLines = metadata.split("\n");
We return an empty object if metadata is falsy; otherwise, we split the metadata string into an array of lines.
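It helps to know that match returns null when the markdown has no frontmatter at all, so the match result itself also needs a guard before [1] is read. Here is a minimal sketch; getRawMetadata is a hypothetical helper name, not part of the function in this tutorial.

```javascript
// Hypothetical helper, used here only to illustrate the guard.
const getRawMetadata = (markdown) => {
  const matched = markdown.match(/^---([\s\S]*?)---/);
  // match() returns null when the string does not start with a --- block.
  return matched ? matched[1] : "";
};

const withoutFrontmatter = getRawMetadata("Just a body, no frontmatter.");
const withFrontmatter = getRawMetadata("---\ntitle: demo\n---\nBody");

console.log(JSON.stringify(withoutFrontmatter)); // ""
console.log(JSON.stringify(withFrontmatter));    // "\ntitle: demo\n"
```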
Step 5: Convert the lines into an object
After we split the metadata into an array of lines, metadataLines should look like below (ignoring the empty strings that the leading and trailing newlines produce at both ends):
[
"title: how to get things done",
"description: this is greate",
"tags: money, income, coding",
"conver_image: ayobami.jpg"
]
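If you log the real array, you will also see an empty string at each end, because the captured text starts and ends with a newline. The sketch below uses a shortened, assumed metadata string to show this; the nonEmptyLines variable is only there to show one way to drop those entries.

```javascript
// Shortened sample of the captured metadata, including its surrounding newlines.
const metadata = "\ntitle: how to get things done\ntags: money, income, coding\n";
const metadataLines = metadata.split("\n");

console.log(metadataLines);
// [ '', 'title: how to get things done', 'tags: money, income, coding', '' ]

// Optional: remove blank lines up front instead of skipping them later.
const nonEmptyLines = metadataLines.filter((line) => line.trim() !== "");
console.log(nonEmptyLines.length); // 2
```

The function in this tutorial does not need the filter, since an empty line produces an empty key and the if (key) check skips it.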
Now, let's convert everything into an object.
// Use reduce to accumulate the metadata into an object
// (named metadataObject because metadata is already declared above)
const metadataObject = metadataLines.reduce((accumulator, line) => {
  // Split the line into a key and the remaining value parts
  const [key, ...value] = line.split(":").map(part => part.trim());
  if (key) {
    accumulator[key] = value[1] ? value.join(":") : value.join("");
  }
  return accumulator;
}, {});
Let's look at what the reduce callback does, piece by piece.
const [key, ...value] = line.split(":").map(part => part.trim());
This part splits each line by a colon (:). You should have realized that value is an array because of the rest operator (...). We do it that way in case a colon is also used as part of the value in a key-value pair, like "title: How to get things done: the best ways".
In this case, only the string before the first colon is considered to be the key, while the remainder is considered to be the value.
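Here is that destructuring on its own, using the example line from the paragraph above, so you can see what ends up in key and value:

```javascript
const line = "title: How to get things done: the best ways";

// The first part becomes key; the rest operator collects everything else.
const [key, ...value] = line.split(":").map((part) => part.trim());

console.log(key);   // title
console.log(value); // [ 'How to get things done', 'the best ways' ]
```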
if(key) {
accumulator[key] = value[1] ? value.join(":") : value.join("");
}
return accumulator;
Then, we turn the key and value into a key-value pair and put it into accumulator. Remember, accumulator is the first argument of the reduce callback function.
value[1] ? value.join(":") : value.join("")
We check whether value has more than one element. If it does, the elements are joined with ":"; if it has only one element, we turn it into a string. (Calling value.join(":") on a single-element array gives the same string, so the check is optional.)
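One detail to be aware of: because each part is trimmed before it is joined again, a value that contains a colon comes back without the space that followed the colon. If that matters to you, one alternative is to cut the line only once, at the first colon. This is my own suggestion rather than the approach used in this tutorial, and the names firstColon and sliced are made up for the sketch.

```javascript
const line = "title: How to get things done: the best ways";

// The approach from this tutorial: split on every colon, trim, then rejoin.
const [, ...value] = line.split(":").map((part) => part.trim());
const joined = value[1] ? value.join(":") : value.join("");

// Alternative sketch: cut the line once, at the first colon.
const firstColon = line.indexOf(":");
const sliced = line.slice(firstColon + 1).trim();

console.log(joined); // How to get things done:the best ways
console.log(sliced); // How to get things done: the best ways
```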
If you look at the complete function, you should see something that looks like below:
return accumulator;
}, {});
We pass an empty object as the initial value of the accumulator because, without one, reduce would use the first element of the array (a string) as the accumulator.
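The difference is easy to see in a small sketch. The lines array and both variable names are assumptions for this illustration.

```javascript
const lines = ["a: 1", "b: 2"];

// With {} as the initial value, the accumulator starts as an empty object.
const withInitial = lines.reduce((accumulator, line) => {
  const [key, value] = line.split(":").map((part) => part.trim());
  accumulator[key] = value;
  return accumulator;
}, {});

// Without an initial value, the first element ("a: 1") becomes the accumulator
// and the callback only runs for the remaining elements.
const withoutInitial = lines.reduce((accumulator, line) => accumulator + " | " + line);

console.log(withInitial);    // { a: '1', b: '2' }
console.log(withoutInitial); // a: 1 | b: 2
```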
Now, the final result should look like:
{
  title: "how to get things done",
  description: "this is greate",
  tags: "money, income, coding",
  conver_image: "ayobami.jpg"
}
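Notice that every value comes back as a string. If you need the tags as a list, a small follow-up step can split them. The parseTags helper below is hypothetical and assumes the tags are separated by commas, as in the sample above.

```javascript
// Hypothetical helper: assumes tags are separated by commas.
const parseTags = (tags = "") =>
  tags
    .split(",")
    .map((tag) => tag.trim())
    .filter(Boolean); // drop empty entries such as a trailing comma

const metadata = { tags: "money, income, coding" };
const tagList = parseTags(metadata.tags);

console.log(tagList); // [ 'money', 'income', 'coding' ]
```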
Finally, we can now extract frontmatter or metadata from any markdown with JavaScript. What are you waiting for? Go and use whatever you have learnt now.
See you soon!
One more thing
Do you want to solve any business problems with content or programming? Let's talk. Feel free to reach me on Twitter at Ayobami Ogundiran