Scrapfly for Scrapfly

Posted on • Originally published at scrapfly.io on

How to Use Yelp API to Extract Business and Review Data

As one of the most popular business directories in the US, Yelp contains valuable company details, including addresses, emails, phone numbers, and reviews. But what's the most efficient way to extract this data?

In this guide, we'll take an extensive look into Yelp API, its key features, pricing, and limitations. Furthermore, we'll discuss potential alternatives. Let's dig in!

Legal Disclaimer and Precautions

This tutorial covers popular web scraping techniques for educational purposes. Interacting with public servers requires diligence and respect. Here's a good summary of what not to do:

  • Do not scrape at rates that could damage the website.
  • Do not scrape data that's not available publicly.
  • Do not store PII of EU citizens who are protected by GDPR.
  • Do not repurpose entire public datasets, which can be illegal in some countries.

Scrapfly does not offer legal advice, but these are good general rules to follow in web scraping. For more, you should consult a lawyer.

Yelp API Overview

The Yelp API is a product provided by Yelp for developers to automate certain actions or extract specific business data. Yelp APIs come with different endpoints, covering a wide range of workflows and resources.

In terms of data extraction, the Yelp API offers the following popular features:

  • Search for businesses based on terms, location, categories, or phone numbers.
  • Search for business service offerings.
  • Search for businesses with food delivery services.
  • Retrieve businesses' engagement metrics and review data.
  • Retrieve a business's details based on its ID or alias.

Note that the Yelp API capabilities aren't limited to the above features. Refer to the official Yelp API documentation for more.

Yelp Fusion API

Yelp includes a huge catalog of business details and millions of corresponding reviews across different sectors and industries. This makes the queries required to find specific businesses quite complex. Therefore, Yelp has introduced the Fusion API.

The Fusion API provides an easier business search experience to find the best matching results powered by an AI chat interface. It answers prompted queries and categorizes them into business questions or search results. Then, it retrieves the relevant results with related information, reviews, photos, and more.

For further details on Yelp Fusion API, refer to the official introductory tutorial.

Why Use Yelp API?

Yelp has information and review data for millions of businesses across various sectors and industries. Extracting this data using the Yelp API enables different use cases, including:

  • Market Research

    Navigating business services on Yelp allows business owners to evaluate their offerings based on the current market trend, which supports decision-making and helps businesses remain competitive.

  • Lead Generation

    Yelp is considered one of the largest business directories in the US, and the Yelp API enables easy retrieval of contact information. Details like names, addresses, phone numbers, and emails make building outreach and marketing campaigns easier.

  • Sentiment Analysis

    Using third-party tools and software has made it much easier to utilize LLMs for RAG applications and sentiment analysis models. Hence, extracting data from Yelp reviews is an excellent way to train these language models for context-aware applications.

For further details, have a look at our introduction to web scraping use cases.

How to Use Yelp API?

In the following sections, we'll explore using the Yelp API for data extraction. We'll cover each related resource endpoint and the core parameters.

Search for Businesses

One of the most popular Yelp API endpoints is used to search for businesses. The generic search endpoint retrieves basic business data based on the provided search query. Below is the search API endpoint schema:

curl --request GET \
     --url 'https://api.yelp.com/v3/businesses/search?sort_by=best_match&limit=20' \
     --header 'Authorization: Bearer YOUR_YELP_API_KEY' \
     --header 'accept: application/json'

The business search endpoint accepts different URL parameters to refine and narrow down the retrieved results. Below are the most common query parameters:

| Parameter | Type | Description |
|---|---|---|
| term | string | Search term to use |
| sort_by | string | Sorting algorithm to use |
| categories | []string | Categories to filter search results by |
| location | string | Geographic area to filter search results by |
| latitude | number | Latitude of the location to search from |
| longitude | number | Longitude of the location to search from |
| price | []integer | Pricing levels to filter the search results with |
| attributes | []string | Business attributes to filter by |
| limit | integer | Number of results to retrieve |
| offset | integer | Pagination cursor to start from |

Below is an example of the business details retrieved:

{
  "businesses": [
    {
      "alias": "golden-boy-pizza-hamburg",
      "categories": [
        {
          "alias": "pizza",
          "title": "Pizza"
        },
        {
          "alias": "food",
          "title": "Food"
        }
      ],
      "coordinates": {
        "latitude": 41.7873382568359,
        "longitude": -123.051551818848
      },
      "display_phone": "(415) 982-9738",
      "distance": 4992.437696561071,
      "id": "QPOI0dYeAl3U8iPM_IYWnA",
      "image_url": "https://yelp-photos.yelpcorp.com/bphoto/b0mx7p6x9Z1ivb8yzaU3dg/o.jpg",
      "is_closed": true,
      "location": {
        "address1": "James",
        "address2": "Street",
        "address3": "68M",
        "city": "Los Angeles",
        "country": "US",
        "display_address": ["James", "Street", "68M", "Los Angeles, CA 22399"],
        "state": "CA",
        "zip_code": "22399"
      },
      "name": "Golden Boy Pizza",
      "phone": "+14159829738",
      "price": "$",
      "rating": 4,
      "review_count": 903,
      "transactions": ["restaurant_reservation"],
      "url": "https://www.yelp.com/biz/golden-boy-pizza-hamburg?adjust_creative=XsIsNkqpLmHqfJ51zfRn3A&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=XsIsNkqpLmHqfJ51zfRn3A",
      "business_hours": {
        "open": [
          {
            "is_overnight": false,
            "start": 15,
            "end": 130,
            "day": 0
          },
          {
            "is_overnight": false,
            "start": 630,
            "end": 1730,
            "day": 1
          },
          {
            "is_overnight": false,
            "start": 45,
            "end": 500,
            "day": 2
          }
        ],
        "hours_type": "REGULAR",
        "is_open_now": false
      }
    }
  ],
  "region": {
    "center": {
      "latitude": 37.76089938976322,
      "longitude": -122.43644714355469
    }
  },
  "total": 6800
}

Each request to the Yelp API for business search retrieves the matching businesses, with a maximum of 240 entities in each call. However, the results don't include business reviews.
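The search call and its query parameters can be sketched in Python as follows. This is a minimal example that only builds the request without sending it. It uses the standard library rather than any official client, covers only the scalar parameters from the table, and assumes the key is sent with a `Bearer` prefix. The function name `build_search_request` and the placeholder key are illustrative.

```python
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_ENDPOINT = "https://api.yelp.com/v3/businesses/search"


def build_search_request(
    api_key: str,
    term: str,
    location: str,
    sort_by: str = "best_match",
    limit: int = 20,
    offset: int = 0,
) -> Request:
    """Build (but do not send) a GET request for the business search endpoint."""
    params = {
        "term": term,
        "location": location,
        "sort_by": sort_by,
        "limit": limit,
        "offset": offset,
    }
    url = f"{SEARCH_ENDPOINT}?{urlencode(params)}"
    headers = {
        # assumption: the API key is passed as a Bearer token
        "Authorization": f"Bearer {api_key}",
        "accept": "application/json",
    }
    return Request(url, headers=headers, method="GET")


request = build_search_request("YOUR_YELP_API_KEY", term="pizza", location="San Francisco, CA")
print(request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Increasing `offset` by `limit` on each call is how the following pages would be requested.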

Retrieve Business Data

The business details endpoint retrieves the detailed business content. It follows the below schema:

curl --request GET \
     --url https://api.yelp.com/v3/businesses/business_id_or_alias \
     --header 'Authorization: Bearer YOUR_YELP_API_KEY' \
     --header 'accept: application/json'

To retrieve a specific business's details, this API endpoint accepts either the business ID or its alias as the business_id_or_alias path parameter.

As for the endpoint query parameters, they are limited to the below properties:

| Parameter | Type | Description |
|---|---|---|
| locale | string | Locale code in the language and country code format |
| device_platform | string | The platform to use for the mobile_link property |

Here's a sample output of details retrieved by the business API:

{
  "alias": "golden-boy-pizza-hamburg",
  "categories": [
    {
      "alias": "pizza",
      "title": "Pizza"
    },
    {
      "alias": "food",
      "title": "Food"
    }
  ],
  "coordinates": {
    "latitude": 41.7873382568359,
    "longitude": -123.051551818848
  },
  "display_phone": "(415) 982-9738",
  "distance": 4992.437696561071,
  "id": "QPOI0dYeAl3U8iPM_IYWnA",
  "image_url": "https://yelp-photos.yelpcorp.com/bphoto/b0mx7p6x9Z1ivb8yzaU3dg/o.jpg",
  "is_claimed": false,
  "is_closed": true,
  "date_opened": "",
  "date_closed": "",
  "location": {
    "address1": "James",
    "address2": "Street",
    "address3": "68M",
    "city": "Los Angeles",
    "country": "US",
    "display_address": [
      "James",
      "Street",
      "68M",
      "Los Angeles, CA 22399"
    ],
    "state": "CA",
    "zip_code": "22399"
  },
  "name": "Golden Boy Pizza",
  "phone": "+14159829738",
  "photos": [
    "https://s3-media2.fl.yelpcdn.com/bphoto/CPc91bGzKBe95aM5edjhhQ/o.jpg",
    "https://s3-media4.fl.yelpcdn.com/bphoto/FmXn6cYO1Mm03UNO5cbOqw/o.jpg",
    "https://s3-media4.fl.yelpcdn.com/bphoto/HZVDyYaghwPl2kVbvHuHjA/o.jpg"
  ],
  "photo_details": [
    {
      "photo_id": "CPc91bGzKBe95aM5edjhhQ",
      "url": "https://s3-media2.fl.yelpcdn.com/bphoto/CPc91bGzKBe95aM5edjhhQ/o.jpg",
      "caption": "Meat",
      "width": 710,
      "height": 47,
      "is_user_submitted": false,
      "user_id": null,
      "label": "food"
    },
    {
      "photo_id": "FmXn6cYO1Mm03UNO5cbOqw",
      "url": "https://s3-media4.fl.yelpcdn.com/bphoto/FmXn6cYO1Mm03UNO5cbOqw/o.jpg",
      "caption": "Dessert",
      "width": 585,
      "height": 78,
      "is_user_submitted": false,
      "user_id": null,
      "label": "food"
    },
    {
      "photo_id": "HZVDyYaghwPl2kVbvHuHjA",
      "url": "https://s3-media4.fl.yelpcdn.com/bphoto/HZVDyYaghwPl2kVbvHuHjA/o.jpg",
      "caption": "Dessert_2",
      "width": 710,
      "height": 53,
      "is_user_submitted": false,
      "user_id": null,
      "label": "food"
    }
  ],
  "photo_count": 50,
  "price": "$",
  "rating": 4,
  "review_count": 903,
  "hours": {
    "open": [
      {
        "is_overnight": false,
        "start": 15,
        "end": 130,
        "day": 0
      },
      {
        "is_overnight": false,
        "start": 630,
        "end": 1730,
        "day": 1
      },
      {
        "is_overnight": false,
        "start": 45,
        "end": 500,
        "day": 2
      }
    ],
    "hours_type": "REGULAR",
    "is_open_now": false
  },
  "special_hours": [
    {
      "date": "2019-02-07",
      "end": "2000",
      "is_closed": null,
      "is_overnight": false,
      "start": "1600"
    }
  ],
  "transactions": [
    "restaurant_reservation"
  ],
  "url": "https://www.yelp.com/biz/golden-boy-pizza-hamburg?adjust_creative=XsIsNkqpLmHqfJ51zfRn3A&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=XsIsNkqpLmHqfJ51zfRn3A",
  "attributes": {
    "business_temp_closed": 1657868400,
    "outdoor_seating": false,
    "liked_by_vegans": false,
    "liked_by_vegetarians": true,
    "hot_and_new": "2022-12-10"
  },
  "messaging": {
    "url": "https://www.yelp.com/raq/AA5cAADa-F9f5DPqZ-PADA?adjust_creative=5374ujususZtKiSNEg7uhg&utm_campaign=yelp_api_v3&utm_medium=api_v3_graphql&utm_source=5374upadasZtCvMLBg7uhg#popup%3Araq",
    "use_case_text": "Request a Quote",
    "response_rate": 1,
    "response_time": 791,
    "is_enabled": true
  },
  "yelp_menu_url": "https://www.yelp.com/menu/golden-boy-pizza-hamburg",
  "rapc": {
    "is_enabled": true,
    "is_eligible": true
  }
}

Although this Yelp API endpoint returns the full business details, review data isn't included.
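As a rough illustration of working with this endpoint, the sketch below builds the details URL from an ID or alias and reduces a response to a few headline fields. The helper names `business_details_url` and `summarize_business` are hypothetical, and the sample dictionary is a trimmed copy of the response shown above.

```python
from typing import Any, Dict, Optional
from urllib.parse import quote, urlencode


def business_details_url(business_id_or_alias: str, locale: Optional[str] = None) -> str:
    """Build the business details URL, optionally with a locale query parameter."""
    path = quote(business_id_or_alias, safe="")
    url = f"https://api.yelp.com/v3/businesses/{path}"
    if locale:
        url += "?" + urlencode({"locale": locale})
    return url


def summarize_business(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Pick a few headline fields out of a business details response."""
    location = payload.get("location") or {}
    return {
        "name": payload.get("name"),
        "rating": payload.get("rating"),
        "review_count": payload.get("review_count"),
        "address": ", ".join(location.get("display_address", [])),
        "is_closed": payload.get("is_closed"),
    }


# trimmed copy of the sample response above
sample = {
    "alias": "golden-boy-pizza-hamburg",
    "name": "Golden Boy Pizza",
    "rating": 4,
    "review_count": 903,
    "is_closed": True,
    "location": {"display_address": ["James", "Street", "68M", "Los Angeles, CA 22399"]},
}

print(business_details_url(sample["alias"], locale="en_US"))
print(summarize_business(sample))
```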

Retrieve Service Offerings

We have explored the endpoints responsible for retrieving business details. However, the services provided are retrieved through a dedicated endpoint for service offerings.

Below is the service offerings API endpoint schema:

curl --request GET \
     --url https://api.yelp.com/v3/businesses/business_id_or_alias/service_offerings \
     --header 'Authorization: Bearer YOUR_YELP_API_KEY' \
     --header 'accept: application/json'

The above Yelp API endpoint requires a business_id_or_alias path parameter to identify the related business.

The related query parameters are used to define localization settings:

| Parameter | Type | Description |
|---|---|---|
| locale | string | Locale code in the language and country code format |

Here's an example of what the service offering results look like:

{
  "active": [
    "bathtub_shower_installation",
    "drain_repair",
    "emergency_services",
    "garbage_disposal_repair",
    "gas_line_services",
    "offers_electric_water_heater_installation"
  ],
  "eligible": [
    "backflow_services",
    "bathtub_shower_installation",
    "bathtub_shower_repair",
    "drain_installation",
    "drain_repair",
    "emergency_services"
  ]
}
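One simple way to use this response is to compare the two lists. The sketch below assumes that "active" holds the offerings a business currently lists and "eligible" holds those it could list, which is an interpretation of the field names rather than something the response states. The helper name `inactive_eligible_offerings` is hypothetical.

```python
from typing import Dict, List


def inactive_eligible_offerings(payload: Dict[str, List[str]]) -> List[str]:
    """Return offerings that appear under "eligible" but not under "active"."""
    active = set(payload.get("active", []))
    eligible = set(payload.get("eligible", []))
    return sorted(eligible - active)


# copy of the sample response above
sample = {
    "active": [
        "bathtub_shower_installation",
        "drain_repair",
        "emergency_services",
        "garbage_disposal_repair",
        "gas_line_services",
        "offers_electric_water_heater_installation",
    ],
    "eligible": [
        "backflow_services",
        "bathtub_shower_installation",
        "bathtub_shower_repair",
        "drain_installation",
        "drain_repair",
        "emergency_services",
    ],
}

print(inactive_eligible_offerings(sample))
```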

Retrieve Business Reviews

The business review data is among the most frequently requested information on Yelp. For this, Yelp provides a dedicated review API endpoint with the below schema:

curl --request GET \
     --url 'https://api.yelp.com/v3/businesses/business_id_or_alias/reviews?limit=20&sort_by=yelp_sort' \
     --header 'Authorization: Bearer YOUR_YELP_API_KEY' \
     --header 'accept: application/json'

Similar to the previous business-related API endpoints, passing the business ID or its alias as a business_id_or_alias path parameter is required to identify the business entity for retrieving reviews.

The Yelp review API provides a few query parameters for locality settings and pagination:

| Parameter | Type | Description |
|---|---|---|
| locale | string | Locale code in the language and country code format |
| offset | integer | Pagination cursor to start from |
| limit | integer | Number of results to retrieve |
| sort_by | string | Sorting algorithm to use |

The review snippet results include details about the review text, rating, and user details:

{
  "possible_languages": [
    "en"
  ],
  "reviews": [
    {
      "id": "xAG4O7l-t1ubbwVAlPnDKg",
      "url": "https://www.yelp.com/biz/la-palma-mexicatessen-san-francisco?hrid=hp8hAJ-AnlpqxCCu7kyCWA&adjust_creative=0sidDfoTIHle5vvHEBvF0w&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_reviews&utm_source=0sidDfoTIHle5vvHEBvF0w",
      "text": "Went back again to this place since the last time i visited the bay area 5 months ago, and nothing has changed. Still the sketchy Mission, Still the cashier...",
      "rating": 5,
      "time_created": "2016-08-29 00:41:13",
      "user": {
        "id": "W8UK02IDdRS2GL_66fuq6w",
        "profile_url": "https://www.yelp.com/user_details?userid=W8UK02IDdRS2GL_66fuq6w",
        "image_url": "https://s3-media3.fl.yelpcdn.com/photo/iwoAD12zkONZxJ94ChAaMg/o.jpg",
        "name": "Ella A."
      }
    },
    ....
  ],
  "total": 3
}

The above example response includes complete review data for each snippet. However, each review API call limits the number of reviews retrieved to only three.
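To show how such a response might be consumed, here is a small sketch that flattens each review and computes an average rating. The `time_created` format string is inferred from the sample value above, and the helper name `summarize_reviews` is hypothetical.

```python
from datetime import datetime
from statistics import mean
from typing import Any, Dict


def summarize_reviews(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten review snippets and compute the average rating."""
    rows = [
        {
            "author": review["user"]["name"],
            "rating": review["rating"],
            # format inferred from the sample value "2016-08-29 00:41:13"
            "created": datetime.strptime(review["time_created"], "%Y-%m-%d %H:%M:%S"),
            "text": review["text"],
        }
        for review in payload.get("reviews", [])
    ]
    average = mean(row["rating"] for row in rows) if rows else None
    return {"count": len(rows), "average_rating": average, "rows": rows}


# trimmed copy of the sample response above
sample = {
    "reviews": [
        {
            "id": "xAG4O7l-t1ubbwVAlPnDKg",
            "text": "Went back again to this place since the last time i visited the bay area 5 months ago, and nothing has changed. Still the sketchy Mission, Still the cashier...",
            "rating": 5,
            "time_created": "2016-08-29 00:41:13",
            "user": {"id": "W8UK02IDdRS2GL_66fuq6w", "name": "Ella A."},
        }
    ]
}

summary = summarize_reviews(sample)
print(summary["count"], summary["average_rating"])
```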

Yelp API Evaluation

So far, we've explored the technical aspects of the Yelp API. However, a common question arises: is using the Yelp API suitable for extracting business and review data? For this, we must explore two crucial factors: pricing and limitations.

Pricing

Yelp offers different subscription tiers, each varying in pricing and the data fields that can be retrieved. Let's consider the minimum plans required to retrieve both basic business and review data:

  • Business data: Starter plan, with pricing of $7.99 /1,000 API calls
  • Review data: Plus plan, with pricing of $9.99 /1,000 API calls

To better explore this, let's estimate the cost of extracting 10,000 business and review records using the Yelp API.

| Data | Plan | Pricing per 1,000 calls | Max results per API call | Cost per 10,000 results |
|---|---|---|---|---|
| Business | Starter | $7.99 | 1 | $79.90 |
| Reviews | Plus | $9.99 | 3 | $33.31 (3,334 calls) |

Above is a rough estimation of the Yelp API costs for extracting 10,000 business and review entities. However, the subscription plans considered for this cost estimation only cover very basic attributes. The full data attributes are only available under the enterprise plan, which comes at a much higher cost of $14.99 per 1,000 API calls.
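The arithmetic behind this kind of estimate can be sketched as below. The prices and per-call result counts are the ones quoted in this section. Rounding the number of calls up to a whole call is an assumption of this sketch, and `estimate_cost` is a hypothetical helper name.

```python
import math


def estimate_cost(results: int, results_per_call: int, price_per_1000_calls: float) -> float:
    """Estimate the cost in USD of retrieving `results` entities."""
    # assumption: a partial call still counts as a full call
    calls = math.ceil(results / results_per_call)
    return round(calls * price_per_1000_calls / 1000, 2)


business_cost = estimate_cost(10_000, results_per_call=1, price_per_1000_calls=7.99)
review_cost = estimate_cost(10_000, results_per_call=3, price_per_1000_calls=9.99)
print(f"business: ${business_cost:.2f}, reviews: ${review_cost:.2f}")
```

With these inputs, the business estimate uses 10,000 calls and the review estimate uses 3,334 calls.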

For further details on Yelp API pricing, refer to the official documentation.

Limitations

So far, we have explored the available Yelp APIs for business and review data extraction, including their specifications and result schemas. However, we have identified the limitations below:

  • Subscription plans are expensive and allow for a limited number of API calls.
  • Full data attributes are only accessible through the enterprise plan, which is even more expensive.
  • Gathering complete business information requires requesting data from multiple API endpoints.

Yelp API Alternatives: Web Scraping

An alternative to using Yelp API for data extraction is using web scraping. This approach enables extracting data from Yelp's public web pages. Instead of requesting the API endpoints for direct JSON data retrieval, we can parse the HTML or replicate background API calls to extract what we are looking for!

For example, let's replicate and parse Yelp's public search pages as an alternative to the paid search API endpoint:

import json
import asyncio

from parsel import Selector
from typing import List, Dict
from urllib.parse import urlencode
from httpx import AsyncClient, Response

# initialize an async httpx client
client = AsyncClient(
    # enable http2
    http2=True,
    # add basic browser like headers to prevent getting blocked
    headers={
        "Accept-Language": "en-US,en;q=0.9",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
        "Accept-Encoding": "gzip, deflate, br",
        "Cookie": "intl_splash=false"
    },
    follow_redirects=True
)

def parse_search(response: Response) -> Dict:
    """parse listing data from the hidden JSON data in the search page"""
    assert response.status_code == 200, "Request is blocked, use ScrapFly to bypass Yelp's blocking"
    search_data = []
    # default in case the page doesn't include a results count
    total_results = None
    selector = Selector(text=response.text)
    script = selector.xpath("//script[@data-id='react-root-props']/text()").get()
    data = json.loads(script.split("react_root_props = ")[-1].rsplit(";", 1)[0])
    for item in data["legacyProps"]["searchAppProps"]["searchPageProps"]["mainContentComponentsListProps"]:
        # filter search data cards
        if "bizId" in item:
            search_data.append(item)
        # filter the max results count
        elif "totalResults" in (item.get("props") or {}):
            total_results = item["props"]["totalResults"]
    return {"search_data": search_data, "total_results": total_results}

async def scrape_search(keyword: str, location: str):
    """scrape single page of yelp search"""

    def make_search_url(offset):
        base_url = "https://www.yelp.com/search?"
        params = {"find_desc": keyword, "find_loc": location, "start": offset}
        return base_url + urlencode(params)
        # final url example:
        # https://www.yelp.com/search?find_desc=plumbers&find_loc=Seattle%2C+WA&start=1

    first_page = await client.get(make_search_url(1))
    data = parse_search(first_page)
    return data


async def run():
    search_data = await scrape_search(
        keyword="plumbers", location="Seattle, WA"
    )
    with open("data.json", "w", encoding="utf-8") as f:
        f.write(json.dumps(search_data, indent=2, ensure_ascii=False))

if __name__ == "__main__":
    asyncio.run(run())

Here, we request Yelp search pages given a search term and location. Then, we extract all business results from the page's hidden web data. Here is what the results look like:

{
  "total_results": 240,
  "search_data": [
    {
      "bizId": "_Wv9uLrzQ1dZ6fgMYjgygg",
      "searchResultBusiness": {
        "ranking": null,
        "isAd": true,
        "renderAdInfo": false,
        "name": "Rooter-Man",
        "alternateNames": [],
        "businessUrl": "/adredir?ad_business_id=_Wv9uLrzQ1dZ6fgMYjgygg&campaign_id=jpQhVJGCG8ILi71Et0XDdQ&click_origin=search_results&placement=vertical_0&placement_slot=1&redirect_url=https%3A%2F%2Fwww.yelp.com%2Fbiz%2Frooter-man-orting%3Foverride_cta%3DGet%2Bpricing%2B%2526%2Bavailability&request_id=9d07f67caab0b9cc&signature=919646537897431750a9ec41b1240262d0e8596eed098a3d2776e9c9afe1898f&slot=0",
        "categories": [
          {
            "title": "Plumbing",
            "url": "/search?find_desc=Plumbing&find_loc=Seattle%2C+WA"
          }
        ],
        "priceRange": "",
        "rating": 0.0,
        "isClickableReview": false,
        "reviewCount": 0,
        "formattedAddress": "",
        "neighborhoods": [],
        "phone": "",
        "serviceArea": null,
        "parentBusiness": null,
        "servicePricing": null,
        "bizSiteUrl": "https://biz.yelp.com",
        "serviceOfferings": [],
        "businessAttributes": {
          "licenses": [
            {
              "license_number": "ROOTE**792MT",
              "license_expiration_date": "2025-08-12",
              "license_verification_url": "https://secure.lni.wa.gov/verify/Detail.aspx?UBI=602584774&LIC=ROOTE**792MT&SAW=",
              "license_verification_status": "verified",
              "license_verification_date": "2023-11-17",
              "license_issuing_authority": "WA DLI ",
              "license_type": "Journey Level",
              "license_source": "biz_owner",
              "licensee": null
            }
          ]
        },
        "alias": "rooter-man-orting",
        "website": {
          "href": "/adredir?ad_business_id=_Wv9uLrzQ1dZ6fgMYjgygg&campaign_id=jpQhVJGCG8ILi71Et0XDdQ&click_origin=search_results_visit_website&placement=vertical_0&placement_slot=1&redirect_url=https%3A%2F%2Fwww.yelp.com%2Fbiz_redir%3Fcachebuster%3D1701335261%26s%3D853c7f42baedaddb12d3a47cbf0c7c30e7bb3cf5d0408740f1e1ee56ec69c2a7%26src_bizid%3D_Wv9uLrzQ1dZ6fgMYjgygg%26url%3Dhttp%253A%252F%252Fwww.rooterman.com%26website_link_type%3Dwebsite&request_id=9d07f67caab0b9cc&signature=8842df091a5b79be57b4bf6122644039b1f6c07fac0c2fdb235c8ff076ce2520&slot=0",
          "rel": "noopener nofollow"
        },
        "city": "Orting"
      },
      "scrollablePhotos": {
        "isScrollable": false,
        "photoList": [
          {
            "src": "https://s3-media0.fl.yelpcdn.com/bphoto/-31eN7ypNCIJCHRO0Xjf3g/ls.jpg",
            "srcset": "https://s3-media0.fl.yelpcdn.com/bphoto/-31eN7ypNCIJCHRO0Xjf3g/258s.jpg 1.03x,https://s3-media0.fl.yelpcdn.com/bphoto/-31eN7ypNCIJCHRO0Xjf3g/300s.jpg 1.20x,https://s3-media0.fl.yelpcdn.com/bphoto/-31eN7ypNCIJCHRO0Xjf3g/348s.jpg 1.39x"
          }
        ],
        "photoHref": "/adredir?ad_business_id=_Wv9uLrzQ1dZ6fgMYjgygg&campaign_id=jpQhVJGCG8ILi71Et0XDdQ&click_origin=search_results&placement=vertical_0&placement_slot=1&redirect_url=https%3A%2F%2Fwww.yelp.com%2Fbiz%2Frooter-man-orting%3Foverride_cta%3DGet%2Bpricing%2B%2526%2Bavailability&request_id=9d07f67caab0b9cc&signature=919646537897431750a9ec41b1240262d0e8596eed098a3d2776e9c9afe1898f&slot=0",
        "allPhotosHref": "/biz_photos/_Wv9uLrzQ1dZ6fgMYjgygg",
        "isResponsive": false
      },
      "childrenBusinessInfo": null,
      "searchResultBusinessPortfolioProjects": null,
      "searchResultBusinessHighlights": {
        "bizSiteUrl": "https://biz.yelp.com/business_highlights?utm_source=disclaimer_www_searchresults",
        "businessHighlights": [
          {
            "bizPageIconName": "",
            "group": {},
            "bizPageIconV2Name": "40x40_locally_owned_v2",
            "iconName": "18x18_locally_owned",
            "id": "LOCALLY_OWNED_OPERATED",
            "title": "Locally owned & operated"
          },
          {
            "bizPageIconName": "",
            "group": {},
            "bizPageIconV2Name": "40x40_family_owned_v2",
            "iconName": "18x18_family_owned",
            "id": "FAMILY_OWNED_OPERATED",
            "title": "Family-owned & operated"
          },
          {
            "bizPageIconName": "",
            "group": {},
            "bizPageIconV2Name": "40x40_workmanship_guaranteed_v2",
            "iconName": "18x18_workmanship_guaranteed",
            "id": "WORKMANSHIP_GUARANTEED",
            "title": "Workmanship guaranteed"
          },
          {
            "bizPageIconName": "",
            "group": {},
            "bizPageIconV2Name": "40x40_years_in_business_v2",
            "iconName": "18x18_years_in_business",
            "id": "YEARS_IN_BUSINESS",
            "title": "20 years in business"
          },
          {
            "bizPageIconName": "",
            "group": {},
            "bizPageIconV2Name": "40x40_veteran_owned_v2",
            "iconName": "18x18_veteran_owned",
            "id": "VETERAN_OWNED_OPERATED",
            "title": "Veteran-owned & operated"
          },
          {
            "bizPageIconName": "",
            "group": {},
            "bizPageIconV2Name": "40x40_free_estimates_v2",
            "iconName": "18x18_free_estimates",
            "id": "FREE_ESTIMATES",
            "title": "Free estimates"
          }
        ],
        "numGemsAllowed": 2
      },
      "tags": [],
      "serviceOfferings": [],
      "snippet": {
        "readMoreText": "more",
        "readMoreUrl": "/adredir?ad_business_id=_Wv9uLrzQ1dZ6fgMYjgygg&campaign_id=jpQhVJGCG8ILi71Et0XDdQ&click_origin=read_more&placement=vertical_0&placement_slot=1&redirect_url=https%3A%2F%2Fwww.yelp.com%2Fbiz%2Frooter-man-orting%3Foverride_cta%3DGet%2Bpricing%2B%2526%2Bavailability&request_id=9d07f67caab0b9cc&signature=919646537897431750a9ec41b1240262d0e8596eed098a3d2776e9c9afe1898f&slot=0",
        "text": "Give us a call for a free consultatiom.",
        "thumbnail": {
          "src": "https://s3-media0.fl.yelpcdn.com/bphoto/A0e8SoYZthSqITMDTjB0sA/30s.jpg",
          "srcset": "https://s3-media0.fl.yelpcdn.com/bphoto/A0e8SoYZthSqITMDTjB0sA/ss.jpg 1.33x,https://s3-media0.fl.yelpcdn.com/bphoto/A0e8SoYZthSqITMDTjB0sA/60s.jpg 2.00x,https://s3-media0.fl.yelpcdn.com/bphoto/A0e8SoYZthSqITMDTjB0sA/90s.jpg 3.00x"
        },
        "id": "",
        "type": "specialty"
      },
      "searchActions": [],
      "markerKey": "ad_business:below_organic:U5LNtOZST6_9gpNAbqw8Lg",
      "searchResultLayoutType": "scrollablePhotos",
      "verifiedLicenseInfo": {
        "licenses": [
          {
            "licensee": null,
            "licenseNumber": "ROOTE**792MT",
            "issuedBy": "WA DLI ",
            "trade": "Journey Level",
            "verifiedDate": "2023-11-17",
            "expiryDate": "2025-08-12"
          }
        ],
        "bizSiteUrl": "https://biz.yelp.com/verified_license?utm_source=legal_disclaimer_www"
      },
      "verifiedLicenseLayout": "BadgeAndTextBelowBizName",
      "yelpGuaranteedInfo": {
        "yelp_guaranteed_status": false,
        "yg_info_modal_url": "https://www.yelp.com/yelp-guaranteed"
      },
      "adLoggingInfo": {
        "placement": "vertical_0",
        "slot": 0,
        "placementSlot": 1,
        "opportunityId": "9d07f67caab0b9cc",
        "adCampaignId": "jpQhVJGCG8ILi71Et0XDdQ",
        "flow": "search",
        "isShowcaseAd": false
      },
      "offerCampaignDetails": null
    },
    ....
  ]
}

From the sample output above, we can see that we have retrieved full business data from search pages directly in JSON. Some of the returned attributes are only available through the Yelp API's enterprise plan!
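Since each scraped result is deeply nested, a flattening step is often useful before storing the data. The sketch below reduces each card to a compact record using field names taken from the sample output above. The helper name `flatten_search_results` is hypothetical, and the business page URL is built from the alias following the `yelp.com/biz/` pattern seen in the earlier API responses.

```python
from typing import Any, Dict, List


def flatten_search_results(search_data: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Reduce scraped search cards to compact business records."""
    rows = []
    for item in search_data:
        biz = item.get("searchResultBusiness") or {}
        alias = biz.get("alias")
        rows.append(
            {
                "id": item.get("bizId"),
                "name": biz.get("name"),
                "rating": biz.get("rating"),
                "review_count": biz.get("reviewCount"),
                "is_ad": biz.get("isAd", False),
                "url": f"https://www.yelp.com/biz/{alias}" if alias else None,
            }
        )
    return rows


# trimmed copy of the first result in the sample output above
sample = [
    {
        "bizId": "_Wv9uLrzQ1dZ6fgMYjgygg",
        "searchResultBusiness": {
            "isAd": True,
            "name": "Rooter-Man",
            "rating": 0.0,
            "reviewCount": 0,
            "alias": "rooter-man-orting",
        },
    }
]

rows = flatten_search_results(sample)
print(rows[0])
```

Filtering on `is_ad` afterwards is one way to separate sponsored cards from organic results.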

For further tips and tricks on Yelp web scraping, refer to our dedicated guide.

Scrape Yelp at Scale With ScrapFly

We have seen that web scraping is a much better alternative to using Yelp API. However, there's a catch: web scraping blocking.

Yelp can identify our web scraping requests as automated, and then either require us to solve CAPTCHA challenges or block us entirely:

Yelp scraping blocking

ScrapFly provides web scraping, screenshot, and extraction APIs for data collection at scale.

Using ScrapFly to bypass Yelp scraping blocking is fairly straightforward. All we have to do is replace our HTTP client with the ScrapFly client:

# standard web scraping code
import httpx
from parsel import Selector

response = httpx.get("some yelp.com URL")
selector = Selector(response.text)

# in ScrapFly becomes this 👇
from scrapfly import ScrapeConfig, ScrapflyClient

# replaces your HTTP client (httpx in this case)
scrapfly = ScrapflyClient(key="Your ScrapFly API key")

response = scrapfly.scrape(ScrapeConfig(
    url="website URL",
    asp=True, # enable the anti scraping protection to bypass blocking
    country="US", # set the proxy location to a specific country
    render_js=True # enable rendering JavaScript (like headless browsers) to scrape dynamic content if needed
))

# use the built in Parsel selector
selector = response.selector
# access the HTML content
html = response.scrape_result['content']

Refer to our official Yelp scraper on GitHub for ready-to-use data extraction scripts for various datasets.

FAQ

To wrap up this guide, let's look at some of the frequently asked questions about using Yelp API for data extraction.

Is Yelp API free?

No, Yelp API is provided under paid subscription tiers. Each tier supports specific data resources.

Are there alternatives to the Yelp API?

Yes, web scraping Yelp is a competitive alternative to the Yelp API. This approach extracts Yelp data using HTML parsing or by replicating background requests. Refer to our Yelp scraping guide for more.

How to get a Yelp API key?

Yelp requires API tokens to authorize its endpoints. To get your Yelp API key, subscribe to any of the available plans.
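Once a key is issued, it is usually better kept out of source code. The sketch below reads it from an environment variable and builds the request headers. The variable name `YELP_API_KEY` and the helper `yelp_auth_headers` are assumptions made for illustration, as is the `Bearer` prefix on the Authorization header.

```python
import os
from typing import Dict


def yelp_auth_headers(env_var: str = "YELP_API_KEY") -> Dict[str, str]:
    """Build request headers from an API key stored in an environment variable."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Yelp API key")
    return {
        # assumption: the API key is passed as a Bearer token
        "Authorization": f"Bearer {api_key}",
        "accept": "application/json",
    }


# usage, after exporting YELP_API_KEY in the shell:
# headers = yelp_auth_headers()
```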

Summary

In this guide, we took an in-depth look at the Yelp API. We started by exploring the available API endpoints, their schemas, parameters, and outputs.

We have seen that using Yelp API for data extraction has limitations, including expensive subscription plans and limited data attributes supported.

As an alternative, we have explored ScrapFly web scraping API. It provides antibot bypass capabilities, enabling Yelp data extraction at scale.
