API Scraping - E-Commerce Data - Excel & PostgreSQL Using Python

PAckages FOR This Process is

pip install jupyter
pip install requests
pip install pandas
pip install sqlalchemy

After Installation given Packages
We USe THis E-Commerce Site for data

https://www.currys.co.uk/

Now load this site and after That
USe this cmd On Chrome ANd inspect Given page
ctrl+shift+s
And Click on Network
And after it Look For File start with name navigation
and copy as a CURL By Right clicking on it

now go to google and search for

https://www.onlinedevtools.in/curl

and paste the given file and convert It in Python
and AFter that Open Jupyter import Given Libraries We installed earlier

import requests
import pandas as pd
import sqlalchemy

after that Paste the Given Code that we get from devtools site
Now, Create a json object and output data using this code

result_json  =  response.json()
result_json.keys()
listing =  result_json['resultsets']['default']['results']

After all We Put everything togather That the KEys WE HAve By runnung Upper Code
`product_title = []
brand= []
price= []
review_score= []
review_count= []
description= []

for result in listing:
product_title.append(result['title'])
brand.append(result['brand'])
price.append(result['price'])
review_score.append(result['reevoo_score'])
review_count.append(result['reevoo_count'])
description.append(result['short_description'])
speakers_df = pd.DataFrame({'product_title':product_title,'brand':brand,'price':price,'review_score':review_score,
'review_count':review_count,'description':description})
speakers_df
`
after running all these cell you get Data from only One Page

So What we have t do if we need data from multiple pages

for that you have to make for loop
suppose i want data from 20 pages so we can us eloop that look like this

for i in range(1,21):
print (i)

Here the Complete Code for Multiple Pages
`

headers = {
'authority': 'api.cdn.dcg-search.com',
'sec-ch-ua': '^\^',
'sec-ch-ua-mobile': '?0',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
'sec-ch-ua-platform': '^\^Windows^\^',
'accept': '/',
'origin': 'https://www.currys.co.uk',
'sec-fetch-site': 'cross-site',
'sec-fetch-mode': 'cors',
'sec-fetch-dest': 'empty',
'referer': 'https://www.currys.co.uk/',
'accept-language': 'en-US,en;q=0.9',
}
product_title = []
brand= []
price= []
review_score= []
review_count= []
description= []

for i in range(1,21):
params = (
('do', 'json'),
('m_results_per_page', '20'),
('page', str(i)),
('q', 'speaker^%%5E20docks'),
('sort', 'relevance'),
('sp_cs', 'UTF-8'),
)

response = requests.get('https://api.cdn.dcg-search.com/currys/navigation', headers=headers, params=params)

result_json  =  response.json()
listing =  result_json['resultsets']['default']['results']

for result in listing:
    #Product title
    try:
     product_title.append(result['title'])
    except:
        product_title.append('')

    #brand
    try:
        brand.append(result['brand'])
    except:
        brand.append('')
    try:

        price.append(float (result['price']))
    except:
        price.append('')
    try:
        review_score.append(float (result['reevoo_score']))
    except:
        review_score.append('')

    try:

        review_count.append(int (result['reevoo_count']))
    except:
        review_count.append('')
    try:

        description.append(result['short_description'])
    except:
        description.append('')

speakers_multiple_df = pd.DataFrame({'product_title':product_title,'brand':brand,'price':price,'review_score':review_score,
'review_count':review_count,'description':description})
speakers_multiple_df

after all we have to save data in excel file so for that we use this logic

speakers_multiple_df.to_excel('speakers_multiple.xlsx',index = False)

ANd also if we want to Save data and use in PostgreSQL
for that we use this logic For Multiple Pages

engine = sqlalchemy.create_engine('postgresql://postgres:"yourPassword"@localhost:5432')
#hAnii :(
speakers_multiple_df.to_sql('e-commerce',engine, index=False)

for getting this Data In PostgreSQL You HAve To Install this In Your Machine

That's It For Now...............

DEV Community

API Scraping - E-Commerce Data - Excel & PostgreSQL Using Python

Top comments (0)

Read next

Frontend Challenge - December Edition

Winter Solstice

Enhancing User Experience with React-Joyride: How I made User onboarding seamless

Context Dump: Simplifying AI File Preparation