DEV Community

Midhun


Simple web scraper in JS that reads all the links on a page into JSON

I needed a list of all the links on a webpage for a task I was working on. Here is the snippet of code I used. Let's discuss how to improve it.

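// Collect the text and URL of every <a> element on the page.
var tag = document.querySelectorAll("a");
var myarray = [];
for (var i = 0; i < tag.length; i++) {
    var nametext = tag[i].textContent;
    // Collapse runs of whitespace so each link text fits on one line.
    var cleantext = nametext.replace(/\s+/g, ' ').trim();
    var cleanlink = tag[i].href; // absolute URL as resolved by the browser
    myarray.push([cleantext, cleanlink]);
}

// Convert the [text, link] pairs to objects and show them as JSON in a new window.
function generateJson() {
    var hrefArray = [];
    for (var i = 0; i < myarray.length; i++) {
        // n = link text, m = link URL
        hrefArray.push({ n: myarray[i][0], m: myarray[i][1] });
    }
    // Pass an empty URL so a blank window opens instead of navigating to a page.
    var win = window.open("", "_blank");
    if (!win) {
        // Pop-up was blocked: fall back to printing the JSON in the console.
        console.log(JSON.stringify(hrefArray));
        return;
    }
    // Write into a <pre> via textContent so link text containing "<" is not parsed as HTML.
    win.document.write("<pre></pre>");
    win.document.querySelector("pre").textContent = JSON.stringify(hrefArray, null, 2);
    win.document.close();
}
generateJson();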
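
To make the shape of the output concrete, here is a minimal sketch that applies the same cleanup to made-up data. The helper name `toLinkRecords` and the sample anchors are assumptions for illustration only. On a real page you would pass `document.querySelectorAll("a")` instead of the sample array.

```javascript
// Hypothetical helper: turn anchor-like objects into { n, m } records,
// using the same whitespace cleanup as the snippet above.
function toLinkRecords(anchors) {
    return Array.from(anchors, function (a) {
        return {
            n: a.textContent.replace(/\s+/g, " ").trim(), // link text
            m: a.href                                      // link URL
        };
    });
}

// Made-up sample data standing in for document.querySelectorAll("a").
var sampleAnchors = [
    { textContent: "  Home \n page ", href: "https://example.com/" },
    { textContent: "About", href: "https://example.com/about" }
];

var records = toLinkRecords(sampleAnchors);
console.log(JSON.stringify(records));
// prints [{"n":"Home page","m":"https://example.com/"},{"n":"About","m":"https://example.com/about"}]
```

Each link becomes one object, with `n` holding the cleaned link text and `m` holding the URL.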


Steps

  1. Open the website whose links you want to collect in your browser.
  2. Open the Console tab in the browser's developer tools (Inspect element).
  3. Paste the code above and press Enter. The JSON output will open in a new window.
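
If you would rather save the result as a file than read it in a new window, one possible approach is to wrap the JSON in a `Blob` and click a temporary link that has the `download` attribute. This is only a sketch: the helper names `toJsonBlob` and `downloadJson` and the file name `links.json` are assumptions, not part of the original snippet.

```javascript
// Hypothetical helper: wrap any JSON-serializable value in a Blob.
function toJsonBlob(data) {
    return new Blob([JSON.stringify(data, null, 2)], { type: "application/json" });
}

// Hypothetical helper: trigger a browser download of `data` as `filename`.
function downloadJson(data, filename) {
    var url = URL.createObjectURL(toJsonBlob(data));
    var link = document.createElement("a");
    link.href = url;
    link.download = filename; // suggests the file name to the browser
    document.body.appendChild(link);
    link.click();
    link.remove();
    // Release the object URL once the click has been handled.
    setTimeout(function () { URL.revokeObjectURL(url); }, 0);
}

// Only runs where a DOM exists (for example, the browser console).
if (typeof document !== "undefined") {
    var links = Array.from(document.querySelectorAll("a"), function (a) {
        return { n: a.textContent.replace(/\s+/g, " ").trim(), m: a.href };
    });
    downloadJson(links, "links.json");
}
```

Pasted into the console, this should prompt the browser to save `links.json`. Some browsers may ask for confirmation or block the download, depending on their settings.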

Screenshots

  1. How to Run

Image description

  2. Result

Image description

Please let me know your thoughts after reading.

Discussion (1)

Midhun Author

How to download the JSON?