Pierre

Posted on • Originally published at daolf.com

New season, new project: I need you 👉 🙏

A quick update

I haven't written much lately; I have been busy with my first company, co-founded with a partner: PricingBot, a price monitoring service for e-commerce.

Things went well: we brought in some cash, and the whole thing now runs smoothly on autopilot.

But lately, my partner and I decided to build something new: something for developers.

One API to rule them all

During our short careers, we have done a lot of scraping, and we kept running into the same problems:

  • proxy handling
  • managing a large fleet of headless Chrome instances
  • CAPTCHAs
  • JS rendering
  • and more

We know all of this can take a lot of time. We wanted to build a tool that lets people easily scrape web pages, on the go, without having to take care of any of it.

This is why we've decided to build ScrapingNinja 🕶 🕷 🕸, the easiest scraping API on the web.

I'd love to have your input on the idea, so if you could answer these three questions in the comments, I'd be forever grateful.

Getting feedback when you have an idea can be tough and we really care about the comments we receive from the community.

Here is what I'd like to know:

  1. What is your biggest pain point when you scrape (proxies / JS rendering / latency / CAPTCHAs / ...)?
  2. Would you use such an API?
  3. How many people do you know that could use it?

If you answer those, thank you very, very much 🙏.

Continuing my blog

If you liked last month's post about the best time to post on dev.to, stay tuned: I plan to publish one about tag analysis early next week.

I'll also try to share as much as possible about this new project with the community if there is interest in it.

Thank you for reading

Usually, I blog about more technical stuff.

PS: We built the landing page with Landen, and I must say it is awesome for someone like me who is very bad with design/fonts/colors. I can't recommend it enough to people who need to quickly put their ideas in front of someone else's eyes.

Top comments (7)

Savas Vedova

I like the tool (PricingBot). I will actually propose it to my wife, who works in the e-commerce field.

Regarding the API, I once used Puppeteer to scrape some content and it was pretty straightforward. I would definitely use the data, however. For instance, if I were scraping several sites to find the median price of a car, I would rather use the API, hoping it already has some data I could rely on. Companies would even pay a lot of money for this, IMO.

Keep up the good work!

Pierre

Thank you very much!

May I ask what you mean by “if they already have some data”?

Savas Vedova

Sure! What I mean is that the API could be written in a way that it learns from the scraped data, and after a while people could use it just to query data. Imagine I scrape the average market price for a car model, specifying several websites to look at. The next time, another user might just use that data and won't have to rewrite the same scraper.

I am saying this because when I used Puppeteer it was very easy to use; I didn't see the need for another tool. The hard part (in terms of effort) was writing the scraper and saving the data into the database. It just takes time.

Pierre

Oh, I see, it makes sense now.

However, ScrapingNinja only returns raw HTML, not formatted data; maybe that is not clear enough on the landing page.

What you are describing is a product that takes a URL as input and outputs formatted data, so you don't have to configure XPath or CSS selectors at all.
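To illustrate that split (an API like ScrapingNinja hands you raw HTML; extracting structured fields from it is still your job), here is a minimal Python sketch using only the standard library's `html.parser`. The sample page and the `price` class name are made up for illustration; in practice the HTML string would come back from the scraping API call.

```python
from html.parser import HTMLParser

# In a real workflow this string would be the raw HTML returned
# by the scraping API; hard-coded here for illustration.
SAMPLE_HTML = """
<html><body>
  <div class="product"><span class="price">19.99</span></div>
  <div class="product"><span class="price">24.50</span></div>
</body></html>
"""

class PriceExtractor(HTMLParser):
    """Collects the text of every element whose class is 'price'."""
    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get("class") == "price":
            self._in_price = True

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(float(data.strip()))
            self._in_price = False

extractor = PriceExtractor()
extractor.feed(SAMPLE_HTML)
print(extractor.prices)  # the two sample prices as floats
```

The extraction logic (which tags and classes to look at) stays with the user, which is exactly the part a raw-HTML API deliberately leaves out.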

Thank you very much for your time and feedback.

Vuild

Looks good, signed up.

Good luck on the launch Pierre.

Pierre

Thank you very much!

Vuild

What is your biggest pain point when you scrape (proxies / JS rendering / latency / CAPTCHAs / ...)?

General reliability.

Would you use such an API?
With a smile. No specific plans yet. :)

How many people do you know that could use it?
The SEO community, archiving; the independent web is coming back, so I think you can find a market.