All the responses here are right, and I would definitely stress @mortoray's point:
There's a general rule for asset transformation: always keep the original.
Since this is clearly something we put a lot of thought into, I'll talk about our approach.
A user submits a field we call body_markdown, and that is saved to the DB. Before it's written, we take that text and turn it into a field called processed_html. I really feel like this should be done at write time: for every write of a document like this, there are likely to be several orders of magnitude more reads.
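To make the write-time idea concrete, here is a minimal sketch, not our actual code. The field names body_markdown and processed_html come from above; the Article class, the in-memory DB dict, and the toy render function are all made up for illustration, and render only stands in for a real markdown engine.

```python
import html
from dataclasses import dataclass

# Hypothetical in-memory "database"; a real app would use its ORM or SQL layer.
DB = {}


@dataclass
class Article:
    id: int
    body_markdown: str        # the original text, always kept
    processed_html: str = ""  # derived from body_markdown at write time


def render(markdown_text: str) -> str:
    """Toy stand-in for a markdown engine: escapes text and wraps paragraphs."""
    paragraphs = [p.strip() for p in markdown_text.split("\n\n") if p.strip()]
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)


def save(article: Article) -> None:
    """Render once per write so that reads never pay the conversion cost."""
    article.processed_html = render(article.body_markdown)
    DB[article.id] = article


def read_html(article_id: int) -> str:
    """Reads simply return the stored HTML; nothing is rendered here."""
    return DB[article_id].processed_html
```

Because the raw markdown is stored next to the HTML, the HTML can be regenerated for every row whenever the rendering rules change.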
We also do more than run a markdown engine. We take steps to move any images to our hosted service, for performance and security, and we apply other security-related restrictions. The list of things that happen between markdown and HTML is always growing.
I like to think of the HTML as a function of the markdown. Every time the markdown changes, run the function to output HTML. Within the main function is a series of other functions that each handle one subtask and return the latest state of the HTML. I think the functional approach makes the process easy to modify and reason about.
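Roughly, the "HTML as a function of the markdown" idea could look like the sketch below. This is an illustration under assumptions, not our real pipeline: the step names, the IMAGE_HOST URL, and the regex-based checks are hypothetical, and render_markdown is a toy that only understands paragraphs and images. A real setup would use a proper markdown engine and HTML sanitizer.

```python
import html
import re
from functools import reduce

IMAGE_HOST = "https://images.example.com/fetch/"  # hypothetical hosted image service


def render_markdown(text):
    """Toy stand-in for a markdown engine: paragraphs and ![alt](url) images only."""
    def paragraph(p):
        escaped = html.escape(p)
        with_images = re.sub(
            r"!\[([^\]]*)\]\(([^)\s]+)\)", r'<img alt="\1" src="\2">', escaped
        )
        return f"<p>{with_images}</p>"
    return "\n".join(paragraph(p.strip()) for p in text.split("\n\n") if p.strip())


def drop_unsafe_images(html_text):
    """Placeholder security step: remove images whose src is not http(s)."""
    return re.sub(r'<img alt="[^"]*" src="(?!https?://)[^"]*">', "", html_text)


def rehost_images(html_text):
    """Point each image src at the hosted service, skipping ones already there."""
    def swap(match):
        url = match.group(1)
        if url.startswith(IMAGE_HOST):
            return match.group(0)
        return f'src="{IMAGE_HOST}{url}"'
    return re.sub(r'src="([^"]+)"', swap, html_text)


# Each step takes HTML and returns HTML, so growing the list is a one-line change.
HTML_STEPS = [drop_unsafe_images, rehost_images]


def markdown_to_html(body_markdown):
    """Run the markdown engine, then thread the HTML through every step in order."""
    return reduce(
        lambda html_text, step: step(html_text),
        HTML_STEPS,
        render_markdown(body_markdown),
    )
```

Since every step has the same shape (HTML in, HTML out), each one can be tested on its own, and the order of the steps is visible in one place.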
Nice, thanks for sharing how y'all do it, @ben! I thought @mortoray's point to 'always keep the original' was on point. Ended up saving the raw markdown.