A thousand PDF generation solutions, but which one to choose?
Recently, I was faced with a problem that many developers fear and avoid: generating a PDF document.
There are lots of libraries that handle this; the problem is knowing which one best fits your project. For my specific use case, I had to generate printable documents from HTML in a Symfony application.
Let's see why WeasyPrint fits this job perfectly, and what you need to know before using it on your own.
What about Snappy and wkhtmltopdf, the recommendation of Symfony?
When I first looked for solutions to generate a PDF document in a Symfony application, I was amazed to find that a SymfonyCast was made about this. It features the PHP library Snappy, which is essentially a wrapper around wkhtmltopdf.
Unfortunately, wkhtmltopdf has many problems, some of which are listed on its own website:
- Its CSS engine does not support flexbox or grid, only WebKit extensions, which are very limited.
- The last update of its engine (QtWebKit4) was in 2015, and it was dropped from most community repositories, including Alpine Linux. It has several critical security issues that will never be patched, mostly related to remote code execution when rendering untrusted HTML.
- When choosing a library for a project, it is important to consider its maintenance status. wkhtmltopdf stopped being maintained in 2020, and there are no plans to continue development. Therefore, most issues with Snappy will never be fixed.
As a consequence, I highly discourage you from using it as it is no longer a viable solution for generating PDF documents.
I was disappointed to learn that it was unusable, but fortunately, many of its users found a drop-in replacement and fell in love with it: WeasyPrint.
What is WeasyPrint?
WeasyPrint is a solution for creating PDF documents from HTML. It has a lot of features regarding pagination which are easy to use and predictable. You can find the list of all supported features on their website, as well as examples of complex PDFs such as reports, invoices, books, and more.
It uses its own visual rendering engine for HTML and CSS, implemented in Python, which aims to support web standards for printing. However, it does not support JavaScript execution.
Because it is available as a command-line tool, it can be used from other languages, which is exactly what the Symfony bundle presented in this article does!
You can easily do pagination
WeasyPrint uses its own CSS layout engine designed for pagination, supporting parts of the CSS specifications written by the W3C. Most of the flex layout is implemented, and there is also a ton of supported features for paginated documents:
- Support for the @page at-rule to specify the document size, orientation, and margins
- Support for page breaks with the CSS property page-break-before
- Support for footnotes
- Support for page numbers and page selectors (you can even code a table of contents!)
- Support for fetching fonts with the usual syntax <link href="https://www.example.com/font" rel="stylesheet" />
Unfortunately, this approach also has drawbacks. Since WeasyPrint is not a browser engine, some CSS features were missing at the time of writing:
- The grid layout (sob)
- The gap property
- CSS filters, including shadows
- CSS for SVG; therefore, using the <img /> tag is better suited than including your SVG with the inline <svg /> tag, to avoid the sanitizing step
A test suite is available for verifying the correct implementation of CSS guidelines.
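As a rough illustration of the pagination features listed above, here is a minimal sketch of a print stylesheet embedded in a template. The file name, class names, and margin values are assumptions chosen for the example, and the footnote rule relies on `float: footnote`, which may require a recent WeasyPrint version.

```html
<!-- report.html.twig (hypothetical template) -->
<html>
  <head>
    <meta charset="UTF-8" />
    <style>
      /* Page size, orientation, and margins */
      @page {
        size: A4 portrait;
        margin: 2cm;
        /* Page number printed in the bottom margin of every page */
        @bottom-center {
          content: "Page " counter(page) " of " counter(pages);
        }
      }
      /* Page selector: hide the page number on the first page */
      @page :first {
        @bottom-center { content: none; }
      }
      /* Start every chapter on a new page */
      .chapter { page-break-before: always; }
      /* Table of contents: append the page number of the link target */
      .toc a::after { content: " (p. " target-counter(attr(href), page) ")"; }
      /* Move the element to the footnote area of the current page */
      .footnote { float: footnote; }
    </style>
  </head>
  <body>
    <ul class="toc">
      <li><a href="#intro">Introduction</a></li>
    </ul>
    <h1 id="intro" class="chapter">Introduction</h1>
    <p>Some text<span class="footnote">A footnote.</span>.</p>
  </body>
</html>
```

Keeping these rules in the template itself means the same file can still be previewed in a browser, although the page-specific rules only take full effect once the document is rendered to PDF.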
You can generate high-quality documents
WeasyPrint supports many PDF features that make it a great tool for generating high-quality documents:
- generation of PDF/A documents, the ISO-standardized version of PDF designed for long-term archiving, which preserves the original formatting across different devices
- bookmarks generated by heading elements (<h1> to <h6>)
- hyperlinks, internal to the document such as a table of contents, or external such as a website link
- vector images and text, meaning that you can zoom in without compression artifacts; this is especially useful for generating and including bar codes and QR codes on your PDF
- basic support for PDF forms (at the moment only text inputs, text areas, and checkboxes)
Furthermore, WeasyPrint can find any font installed on your system with the help of fontconfig, which is useful to avoid fetching fonts from external stylesheets whenever you generate a PDF. You can check what fonts are installed on your system with fc-list.
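For instance, a template could reference a locally installed font by its family name instead of linking to an external stylesheet. This is only a sketch: it assumes the DejaVu fonts are installed on the machine running WeasyPrint, so check the exact family name in the output of fc-list first.

```html
<!-- hypothetical template fragment -->
<style>
  /* "DejaVu Sans" must match a family name known to fontconfig;
     sans-serif is the fallback if that family is not installed */
  body { font-family: "DejaVu Sans", sans-serif; }
</style>
```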
Generate your first PDF with WeasyPrint
Rendering HTML with Twig
The first step before generating the PDF is writing the HTML. To generate the HTML string, we will use the Twig template engine, which is the default one in Symfony. It comes with tons of features such as inheritance, blocks, filters, functions, and more.
You can install Twig with the following bundle provided by Symfony.
composer require symfony/twig-bundle
The bundle comes with a configuration file located in config/packages/twig.yaml, providing a default path for your templates. You can read more about the template engine here.
Create a template in the templates directory of your Symfony application:
<!-- base.html.twig -->
<html>
  <head>
    <meta charset="UTF-8" />
    <title>Hello world</title>
    <style>
      .title {
        background-color: #dcb03f;
        padding: 10px;
        text-align: center;
      }
    </style>
  </head>
  <body>
    <h1 class="title">Hello world!</h1>
  </body>
</html>
The HTML can now be generated from this template using the \Twig\Environment->render method. To do this, write a simple controller that returns the HTML:
// src/Controller/PdfController.php
namespace App\Controller;

use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
use Symfony\Component\HttpFoundation\Response;
use Symfony\Component\Routing\Annotation\Route;
use Twig\Environment;

class PdfController extends AbstractController
{
    public function __construct(
        private readonly Environment $twig,
    ) {
    }

    #[Route('/my-pdf-controller', name: 'my-pdf-controller')]
    public function pdfController(): Response
    {
        // Render the Twig template to an HTML string
        $html = $this->twig->render('base.html.twig');

        return new Response($html);
    }
}
You can now test the endpoint by visiting /my-pdf-controller in your browser. You should see your HTML rendered!
Never trust user HTML!
As with any other tool, you should avoid using untrusted HTML, CSS, and images, because WeasyPrint has access to local files and can make network requests. Always sanitize user input! In Twig, this can be done with the escape filter. You can find a list of other possible attacks, with advice on how to avoid them, in the WeasyPrint documentation.
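As a small sketch of what escaping looks like in a template, assuming two hypothetical variables `customer_name` and `note` that come from user input:

```html
{# invoice.html.twig (hypothetical template) #}
{# The escape filter (alias: e) turns characters such as < and > into HTML entities #}
<h1>{{ customer_name|escape }}</h1>
<p>{{ note|e }}</p>

{# Avoid the raw filter on user input, as it disables escaping: {{ note|raw }} #}
```

Symfony's default Twig configuration already auto-escapes variables in HTML templates, and the explicit filter does not escape them a second time. Note that escaping only protects text content: a user-provided URL placed in a src or href attribute would still need to be validated before rendering.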
Performance is not WeasyPrint's strength: heavy HTML files will increase generation time. You should always compress images before embedding them, as they are not compressed by default. Generating a 50-page PDF may take up to a minute in extreme cases, although the multi-page documents generated on my project take less than 2 seconds.
Generating PDF from your HTML
You will need to install the weasyprint binary separately from the Symfony bundle, as WeasyPrint does not have a native PHP implementation.
This step may depend on your distribution and/or environment; see the installation page for reference. If you use Docker with an Alpine distribution, you can install it by adding the following lines:
# Dockerfile (Alpine distribution)
RUN apk add --no-cache \
        weasyprint \
        # used to find and configure fonts
        fontconfig \
        # used to render TrueType fonts
        freetype \
        # used as a default font
        ttf-dejavu \
    ;
Do not forget to install a default font on your system! It serves as a fallback; without one, the binary may crash when attempting to render text. On Alpine Linux, you can install the ttf-dejavu package to avoid this issue.
You can then install the bundle that allows running WeasyPrint from your Symfony application:
composer require pontedilana/weasyprint-bundle
Inject the newly installed service in your controller, and modify the response to return a PDF instead of HTML:
// src/Controller/PdfController.php
namespace App\Controller;

use Pontedilana\PhpWeasyPrint\Pdf;
use Pontedilana\WeasyprintBundle\WeasyPrint\Response\PdfResponse;
use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
use Symfony\Component\HttpFoundation\Response;
use Symfony\Component\HttpFoundation\ResponseHeaderBag;
use Symfony\Component\Routing\Annotation\Route;
use Twig\Environment;

class PdfController extends AbstractController
{
    public function __construct(
        private readonly Environment $twig,
        private readonly Pdf $weasyPrint,
    ) {
    }

    #[Route('/pdf', name: 'pdf')]
    public function pdf(): Response
    {
        // Render the Twig template to an HTML string
        $html = $this->twig->render('base.html.twig');
        // Convert the HTML string to PDF content with the weasyprint binary
        $pdfContent = $this->weasyPrint->getOutputFromHtml($html);

        return new PdfResponse(
            content: $pdfContent,
            fileName: 'file.pdf',
            contentType: 'application/pdf',
            contentDisposition: ResponseHeaderBag::DISPOSITION_INLINE,
            // or download the file instead of displaying it in the browser with
            // contentDisposition: ResponseHeaderBag::DISPOSITION_ATTACHMENT,
            status: 200,
            headers: []
        );
    }
}
You are now ready to generate a basic PDF in your application!
Going further: what about the other PDF generation libraries?
Working on my project with WeasyPrint was a breeze: it is easy to set up, its repository is well maintained, it satisfies all the needs I had for pagination and it allowed me to generate high-quality PDF documents from re-usable Twig templates.
However, WeasyPrint is not the only solution available, and there might be other solutions that suit your use case better:
- If WeasyPrint does not fit your needs, you can find a comparison of other HTML to PDF libraries here
- If you also want to convert Markdown or LibreOffice formats, the self-hosted API Gotenberg is worth checking out
- If you want to convert an existing page, need to use JavaScript, or want your PDF rendered by a browser, take a look at this article about Puppeteer
Top comments (1)
Hi.
I remember coming across WeasyPrint when looking for an alternative to wkhtmltopdf but had to ditch it because of the lack of JavaScript support.
wkhtmltopdf is a bit of a hassle to setup/debug but once your eyes have bled enough then it's a viable solution. :)
I might give Puppeteer a go at some point.