DEV Community

Caleb Hearth
Caleb Hearth

Posted on • Originally published at calebhearth.com on

Send a From Header When You Crawl

Sending a From header is part of building a polite crawler, along with respecting Robots.txt and sending a unique User-Agent. The From header simply contains an email address that can be used by the site’s owner to reach out if your bot is creating any issues for them.

RFC 9110 which describes HTTP Semantics says that a From header SHOULD be sent for robotic user-agents. MDN says that the From header “must” be sent in that circumstance, but doesn’t have a citation for a spec that defines that.

I’ve been sending From headers for a while now, and I’m even including it as a requirement in a Swift type-driven API client I’ve been using across a few projects. I’m probably sending it in more places than I need to or that it would be expected in, but as I use scraping and API requests to automate my /now page, it seems prudent in case that causes issues for the services I’m using.

Setting a From header is easy. Its syntax is From: <email>. If you’re building any kind of scraper.

Read More

Top comments (0)