Saurav Jain

for Crawlee

Posted on Feb 27 • Originally published at crawlee.dev

Launching Crawlee Blog: Your Node.js resource hub for web scraping and automation.

#webscraping #automation #node #crawling

Hey, crawling masters!

I’m Saurav, Developer Community Manager at Apify, and I’m thrilled to announce that we’re launching the Crawlee blog today 🎉

We launched Crawlee, the successor to our Apify SDK, in August 2022 to make the best web scraping and automation library for Node.js developers who like to write code in JavaScript or TypeScript.

Since then, our dev community has grown rapidly. I’m proud to tell you that we have over 11,500 stars on GitHub, over 6,000 community members on our Discord, and over 125,000 monthly downloads on npm. We’re now the most popular web scraping and automation library for Node.js developers 👏

Changes in Crawlee since the launch

Crawlee has evolved steadily since launch, with each release adding features for web scraping and automation:

v3.1 added an error tracker for analyzing and summarizing failed requests.
The v3.3 update brought an exclude option to the enqueueLinks helper and integrated status messages. This improved usability on the Apify platform with automatic summary updates in the console UI.
v3.4 introduced the linkedom crawler, offering a new parsing option.
The v3.5 update optimized link enqueuing for efficiency.
v3.6 launched experimental support for a new request queue API, enabling parallel execution and improved scalability for multiple scrapers working concurrently.

All of this marked significant strides in making web scraping more efficient and robust.
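
To make the v3.3 change more concrete, here is a minimal sketch of what an exclude option on a link-enqueuing helper does conceptually. It does not use Crawlee and is not Crawlee's implementation: `filterLinks`, `LinkFilterOptions`, and the example URLs are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper, NOT part of Crawlee: keeps a link only if it matches
// at least one include pattern and matches no exclude pattern.
interface LinkFilterOptions {
    include: RegExp[];
    exclude?: RegExp[];
}

function filterLinks(links: string[], { include, exclude = [] }: LinkFilterOptions): string[] {
    return links.filter((url) => {
        const included = include.some((pattern) => pattern.test(url));
        const excluded = exclude.some((pattern) => pattern.test(url));
        return included && !excluded;
    });
}

// Links a crawler might discover on a page (made-up URLs).
const discovered: string[] = [
    'https://example.com/docs/intro',
    'https://example.com/docs/api',
    'https://example.com/blog/launch',
    'https://other.example.org/docs',
];

// Stay on example.com, but skip everything under /blog/.
const toEnqueue = filterLinks(discovered, {
    include: [/^https:\/\/example\.com\//],
    exclude: [/\/blog\//],
});

console.log(toEnqueue);
```

In Crawlee itself, the corresponding step is passing an `exclude` array to the `enqueueLinks` helper inside a request handler. Check the Crawlee documentation for the exact pattern types it accepts.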

Future of Crawlee!

The Crawlee team is actively developing an adaptive crawling feature to revolutionize how Crawlee interacts with and navigates through websites.

We just launched v3.8 with experimental support for the new adaptive crawler type.

Support us on GitHub.

Before I tell you about our upcoming plans for Crawlee Blog, I recommend you check out Crawlee if you haven’t already.

We are open-source, and our source code is available on GitHub. If you like Crawlee, please don’t forget to give us a ⭐ there.

Crawlee Blog and upcoming plans!

Our goal is to make the best web scraping and automation library for Node.js developers. The first step toward that goal is to reach out to the broader developer community through our content.

The Crawlee blog aims to be the best informational hub for Node.js developers interested in web scraping and automation.

What to expect:

How-to tutorials on making web crawlers, scrapers, and automation applications using Crawlee.
Thought leadership content on web crawling.
Crawlee feature updates and changes.
Community content collaboration.

We’ll be posting content monthly for our dev community, so stay tuned!

If you have ideas on specific content topics and want to give us input, please join our Discord community and tag me with your ideas.

Also, we encourage collaboration with the community, so if you have some interesting pieces of content related to Crawlee, let us know in Discord, and we’ll feature them on our blog. 😀

In the meantime, you might want to check out this article on Crawlee data storage types on the Apify Blog.
