Evgeny Ivanov

Posted on Aug 28, 2023 • Originally published at e10v.me

rico: rich content to HTML as easy as Doc(df, plot)


What is rico?

rico is a Python package for creating HTML documents from rich content: dataframes, plots, images, Markdown, etc. It provides a high-level, easy-to-use API with reasonable defaults, as well as low-level access for finer control.

Why rico?

One might wonder: why do I need another package if I can already create HTML documents with Jupyter and nbconvert? If you work with data in Jupyter notebooks, you probably don't need rico. But if you work with data in Python scripts, rico can be helpful.

Here is my use case. I'm working with data in Python scripts and organizing them into data pipelines using Prefect. Sometimes I want to create a report as the last step of a pipeline. With Jupyter I would have the following routine:

Create a task to save all the dataframes I need for my report in Parquet files or in a database.
Create a Jupyter notebook to load the dataframes and visualize all the tables and plots I need.
Create a task to convert the notebook to HTML using nbconvert.

Creating an extra task and an extra notebook is not what bothers me the most. What bothers me is that I have to think about how to pass data to the notebook. Sure, it's not a hard job, but it's not a job I want to spend my time on.

With rico, I simply create a task to add all objects I want to visualize to a document and save it to an HTML file.

How to use rico?

Installation

pip install rico

rico has no dependencies other than the Python standard library.

For Markdown support, install markdown-it-py or Python Markdown, or set your own Markdown renderer using rico.set_config.
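
To check which of the two Markdown packages is available in an environment, a small helper like the one below can be used. This is only a sketch: pick_markdown_renderer is a hypothetical name, not part of rico, and it relies only on the public entry points of markdown-it-py (MarkdownIt().render) and Python Markdown (markdown.markdown). How a custom renderer is registered through rico.set_config is not shown here; see rico's docstrings for the exact option.

```python
from typing import Callable, Optional


def pick_markdown_renderer() -> Optional[Callable[[str], str]]:
    """Return a Markdown-to-HTML callable from whichever package is installed.

    Hypothetical helper for illustration; it is not part of rico.
    """
    try:
        from markdown_it import MarkdownIt  # provided by markdown-it-py

        return MarkdownIt().render
    except ImportError:
        pass
    try:
        import markdown  # provided by Python Markdown

        return markdown.markdown
    except ImportError:
        return None


renderer = pick_markdown_renderer()
if renderer is None:
    print("No Markdown package found: install markdown-it-py or Markdown.")
else:
    print(renderer("## Dataframe"))
```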

Basic usage

rico provides both declarative and imperative style interfaces.

Declarative style:

import pandas as pd
import rico

df = pd.DataFrame(
    {
        "x": [2, 7, 4, 1, 2, 6, 8, 4, 7],
        "y": [1, 9, 2, 8, 3, 7, 4, 6, 5],
    },
    index=pd.Index(list("AAABBBCCC")),
)
plot = df.plot.scatter(x="x", y="y")

doc = rico.Doc("Hello, World!", df, plot, title="My doc")

The result is an HTML document containing the text, the dataframe and the plot, in that order.

Imperative style:

doc = rico.Doc(title="My doc")
doc.append("Hello, World!")
doc.append(df, plot)

The result will look the same.

Serialize the document to HTML using str(doc):

with open("doc.html", "w", encoding="utf-8") as f:
    f.write(str(doc))

Also, you can call doc.serialize(indent=True) to indent the HTML element tree. For brevity, I will omit the serialization and saving to file in the following examples.
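
In a pipeline, the saving step can be wrapped in a small function. The sketch below assumes only what is stated above, namely that str(doc) returns the HTML. save_html and FakeDoc are hypothetical names: FakeDoc stands in for rico.Doc so that the example runs without rico installed.

```python
import tempfile
from pathlib import Path


def save_html(doc: object, path: str) -> Path:
    """Write str(doc) to path as UTF-8, creating parent directories if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(str(doc), encoding="utf-8")
    return target


class FakeDoc:
    """Stand-in for rico.Doc (an assumption made so the sketch is self-contained)."""

    def __str__(self) -> str:
        return "<!doctype html><html><body><p>Hello, World!</p></body></html>"


# Write into a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_html(FakeDoc(), f"{tmp}/reports/doc.html")
    print(saved.read_text(encoding="utf-8"))
```

With a real document, the call would be save_html(doc, "doc.html").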

Content types

rico automatically recognizes the following content types:

rico content classes (subclasses of rico.ContentBase).
Matplotlib Pyplot plots.
Dataframes, Seaborn plots, Altair charts and other types with IPython rich representation methods. Basically, rico supports the same types as Jupyter.
Text.
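
"IPython rich representation methods" are specially named methods such as _repr_html_ or _repr_svg_ that an object can define to describe how it should be displayed. For example, pandas dataframes define _repr_html_. The sketch below illustrates the protocol itself, not rico's actual detection logic: find_rich_repr, RICH_REPR_METHODS and Temperature are hypothetical names, and the order of preference is an assumption.

```python
from typing import Optional, Tuple

# Method names from IPython's rich display protocol.
# The order of preference here is illustrative only.
RICH_REPR_METHODS = (
    "_repr_html_",
    "_repr_svg_",
    "_repr_png_",
    "_repr_jpeg_",
    "_repr_markdown_",
)


def find_rich_repr(obj: object) -> Optional[Tuple[str, object]]:
    """Return (method name, output) for the first rich repr the object provides."""
    for name in RICH_REPR_METHODS:
        method = getattr(obj, name, None)
        if callable(method):
            result = method()
            if result is not None:
                return name, result
    return None


class Temperature:
    """Toy class that opts into rich display by defining _repr_html_."""

    def __init__(self, celsius: float) -> None:
        self.celsius = celsius

    def _repr_html_(self) -> str:
        return f"<b>{self.celsius:.1f}</b> &deg;C"


print(find_rich_repr(Temperature(21.5)))
print(find_rich_repr("plain text"))  # str defines none of these methods
```

Any class that defines one of these methods can be displayed by tools that follow the protocol, which is why the same objects work in Jupyter.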

Use specific classes for plots and text to change the default behavior:

doc = rico.Doc(
    rico.Text("Hello, World!", mono=True),  # mono defaults to False.
    df,
    rico.Plot(plot, format="png", bbox_inches="tight"),  # format defaults to "svg".
    title="My doc",
)

The following code gives the same result as the code above:

doc = rico.Doc(title="My doc")
doc.append_text("Hello, World!", mono=True)
doc.append(df)
doc.append_plot(plot, format="png", bbox_inches="tight")

Use specific classes and methods for other content types:

Images: Image or Doc.append_image.
Code: Code or Doc.append_code.
Markdown: Markdown or Doc.append_markdown.
HTML tag: Tag or Doc.append_tag.
Raw HTML: HTML or Doc.append_html.

For example:

doc = rico.Doc(
    rico.Markdown("## Dataframe"),
    df,
    rico.Tag("h2", "Plot"),  # An alternative way to add a header.
    plot,
    rico.HTML("<h2>Code</h2>"),  # Another way to add a header.
    rico.Code("print('Hello, World!')"),
    title="My doc",
)

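The result is a document with three headers ("Dataframe", "Plot" and "Code"), each followed by the corresponding content.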

The following code gives the same result as the code above:

doc = rico.Doc(title="My doc")
doc.append_markdown("## Dataframe")
doc.append(df)
doc.append_tag("h2", "Plot")
doc.append(plot)
doc.append_html("<h2>Code</h2>")
doc.append_code("print('Hello, World!')")

Layout

rico relies on Bootstrap styles. The resulting documents are responsive and mobile-friendly. Use Bootstrap classes to control document layout. For example:

import altair as alt

doc = rico.Doc(
    rico.Tag("h2", "Dataframes"),
    rico.Div(
        rico.Obj(rico.Tag("h3", "A"), df.loc["A", :], class_="col"),
        rico.Obj(rico.Tag("h3", "B"), df.loc["B", :], class_="col"),
        rico.Obj(rico.Tag("h3", "C"), df.loc["C", :], class_="col"),
        class_="row row-cols-auto",
    ),
    rico.Tag("h2", "Plots"),
    rico.Div(
        rico.Obj(
            rico.Tag("h3", "A"),
            alt.Chart(df.loc["A", :]).mark_point().encode(x="x", y="y"),
            class_="col",
        ),
        rico.Obj(
            rico.Tag("h3", "B"),
            alt.Chart(df.loc["B", :]).mark_point().encode(x="x", y="y"),
            class_="col",
        ),
        rico.Obj(
            rico.Tag("h3", "C"),
            alt.Chart(df.loc["C", :]).mark_point().encode(x="x", y="y"),
            class_="col",
        ),
        class_="row row-cols-auto",
    ),
    title="Grid system",
)

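The result is a row of three dataframes (groups A, B and C) followed by a row of three charts for the same groups.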

How it works:

rico.Div represents an HTML container (<div>). You can add objects to it, just like with rico.Doc.
Each content element is also wrapped in a <div> container.
rico.Obj is a magic class that automatically determines the content type, just like rico.Doc does.
You can specify an HTML class attribute of a <div> container for rico.Div and for any content element such as rico.Obj.
The "row", "row-cols-auto" and "col" classes are part of the Bootstrap Grid System. Read more in the Bootstrap documentation.
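
To make the nesting concrete, here is a simplified sketch of the kind of HTML structure described above, built with the standard library's xml.etree.ElementTree. It is not rico's actual output: grid_row is a hypothetical helper, the cell contents are placeholder strings, and the extra <div> that wraps each content element is left out.

```python
import xml.etree.ElementTree as ET


def grid_row(cells: dict[str, str], row_class: str = "row row-cols-auto") -> ET.Element:
    """Build <div class="row ..."> holding one <div class="col"> per cell."""
    row = ET.Element("div", {"class": row_class})
    for heading, body in cells.items():
        col = ET.SubElement(row, "div", {"class": "col"})
        ET.SubElement(col, "h3").text = heading
        ET.SubElement(col, "p").text = body  # placeholder for a table or chart
    return row


row = grid_row({"A": "table for A", "B": "table for B", "C": "table for C"})
ET.indent(row)  # pretty-print; requires Python 3.9+
print(ET.tostring(row, encoding="unicode", method="html"))
```

Bootstrap lays out the "col" elements side by side inside the "row" element and wraps them onto new lines on narrow screens, which is what makes the documents responsive.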

The following code gives the same result as the code above:

doc = rico.Doc(title="Grid system")

doc.append_tag("h2", "Dataframes")
div1 = rico.Div(class_="row row-cols-auto")
doc.append(div1)
for name, data in df.groupby(df.index):
    div1.append(rico.Tag("h3", name), data, class_="col")

doc.append_tag("h2", "Plots")
div2 = rico.Div(class_="row row-cols-auto")
doc.append(div2)
for name, data in df.groupby(df.index):
    div2.append(
        rico.Tag("h3", name),
        alt.Chart(data).mark_point().encode(x="x", y="y"),
        class_="col",
    )

More information

Read the more detailed user guide.
Take a look at the self-explanatory examples with resulting HTML documents.
Install rico, create your own documents.
Check the docstrings for details.

Feel free to create an issue to report bugs or submit suggestions.
