I was debugging Bing Image Search to help implement our new Bing Reverse Image Search API. Initially, I used Ctrl+Shift+F in the browser dev tools but didn't find the request. Then I figured out how to filter network requests in the browser dev tools, examined the response, and made a draft data adapter.
Algorithms to reverse engineer a JSON API on the SPA
I've used two ways to reverse engineer the JSON API behind Bing Image Search: mitmproxy and browser developer tools. I explain the devtools process first because it's used more often.
- Ctrl+F in the Network tab of browser dev tools.
- Go to the Preview tab of the JSON response.
- Expand JS object recursively (my Brave Browser doesn't search in the collapsed JSON 😕)
- Ctrl+F the target string
- Copy property path
- Navigate up and down in JS object (with arrow keys) to learn its structure and create an adapter.
- Copy as cURL and transform the response with jq to check my assumption.
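The "expand, search, copy property path" steps above can also be scripted once the response is saved locally. Below is a minimal Python sketch, not the method used in this post: `find_paths` is a hypothetical helper and the sample payload is made up for illustration (the real Bing response has a different shape). It prints jq-style paths, so a result can be pasted into a jq filter to check the assumption.

```python
import json


def find_paths(node, needle, path=""):
    """Yield a jq-style path for every string value that contains `needle`."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_paths(value, needle, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_paths(value, needle, f"{path}[{index}]")
    elif isinstance(node, str) and needle in node:
        yield path


# Hypothetical payload, used only to demonstrate the search.
sample = json.loads("""
{"results": [
  {"title": "SerpBear logo", "host": {"name": "example.com"}},
  {"title": "Unrelated image", "host": {"name": "example.org"}}
]}
""")

for found in find_paths(sample, "SerpBear"):
    print(found)  # .results[0].title
```

The sketch assumes object keys are plain identifiers; keys with spaces or dots would need quoting before the path is usable in jq.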
Ctrl+Shift+F in the browser dev tools no longer searches across all responses, so I proxied the browser's network connections via mitmproxy and then filtered response bodies with the ~bs filter expression.
- Start mitmproxy with a view filter
$ mitmproxy --view-filter '~bs "Freshsales"'
- Start a Chromium-based browser with the target URL and the following flags and parameters
- Proxy requests via --proxy-server='http://127.0.0.1:8080'
- Use incognito mode (1) with a temporary user profile (2), ignoring insecure connections (3) and certificate errors (4): --temp-profile -incognito --user-data-dir="`mktemp -d`" --no-first-run --ignore-certificate-errors --allow-insecure-localhost. (I ignore certificate errors in a temporary browser profile to avoid installing mitmproxy's certificates system-wide.)
$ brave-browser 'https://www.bing.com/images/search?view=detailV2&insightstoken=bcid_RLKVsIV2BwkFXg*ccid_spWwhXYH&form=SBIHMP&iss=SBIUPLOADGET&sbisrc=ImgPicker&idpbck=1&sbifsz=927+x+524+%c2%b7+25.15+kB+%c2%b7+png&sbifnm=serpapi-serpbear.png&thw=927&thh=524&ptime=223&dlen=34344&expw=798&exph=451&selectedindex=0&id=-1051855017&ccid=spWwhXYH&vt=2&sim=11' --proxy-server='http://127.0.0.1:8080' --temp-profile -incognito --user-data-dir="`mktemp -d`" --no-first-run --ignore-certificate-errors --allow-insecure-localhost
- mitmproxy will display the matched requests
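If you'd rather log matches than browse them interactively, the same idea can be sketched as a small mitmproxy script. This is an assumption-laden sketch, not what I ran: the file name `find_request.py` and the `matched_urls` list are hypothetical, and the plain regex search only roughly mirrors what the ~bs filter does. It relies on mitmproxy calling a module-level `response(flow)` hook for each completed response.

```python
import re

# Target string from the view filter above; replace it with your own.
TARGET = re.compile(rb"Freshsales")

# Hypothetical accumulator so matches can be inspected after the session.
matched_urls = []


def response(flow):
    """Hook that mitmproxy calls for every completed HTTP response."""
    body = flow.response.content or b""  # content can be None for empty bodies
    if TARGET.search(body):
        matched_urls.append(flow.request.pretty_url)
        print(flow.request.pretty_url)
```

Saved as `find_request.py`, it could be loaded with `mitmdump -s find_request.py` while the browser is proxied as described above.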
mitmproxy can be used to find the HTTP request with the needed data in addition to browser dev tools. At some point, I'll explore Wireshark to reverse engineer websites for web scraping and share the learnings with you.
If you have anything to share, any questions, suggestions, or something that isn't working correctly, feel free to reach out via Twitter at @ilyazub_, or @serp_api, or Mastodon at @iz.