How to build an Instagram photo downloader using PHP

#instagram #php #webscraping #webdev

Instagram has become one of the most popular social media platforms in the world, and it currently has more than 1 billion active users. Millions of photos and videos are uploaded to it every day. There is one major problem, though: Instagram does not let its users download photos from the platform.

I believe that if you want to use or save an Instagram photo, you should download it rather than take a screenshot. Screenshots have poor resolution, so you cannot use them in professional projects. And remember: if you use someone else's Instagram photo, you must get their consent first, because they hold the copyright to their photos.

When a photo is uploaded to Instagram's CDN, it is compressed in the background, since Instagram cannot afford to fill its servers with heavy media files. A screenshot of an already compressed photo loses even more quality, so it makes little sense.

In this post, I am going to explain how you can scrape a photo or video from a public Instagram account using PHP. You could use Python for this as well, but I prefer PHP. Instagram photo downloaders like Instaneek are built with this code. Instaneek is hosted on a dedicated server on Amazon Web Services, but you can also run this software on shared hosting.

First, we are going to search for the meta tag whose property is "og:image", and then read the content of that tag. The syntax of the code is easy to follow; you only need a basic understanding of PHP.
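To make the idea concrete before the full code, here is a minimal sketch of reading "og:image" with DOMDocument. The sample HTML and the example.com URL are made up for illustration; a real page from Instagram will contain many more tags.

```php
<?php
// Hypothetical sample page; the real HTML comes from Instagram.
$html = '<html><head>'
      . '<title>Sample post</title>'
      . '<meta property="og:type" content="article">'
      . '<meta property="og:image" content="https://example.com/photo.jpg">'
      . '</head><body></body></html>';

$doc = new DOMDocument();
// "@" hides the warnings that imperfect real-world HTML often triggers.
@$doc->loadHTML($html);

// Walk over every <meta> tag and stop at the first og:image.
$imageUrl = null;
foreach ($doc->getElementsByTagName('meta') as $meta) {
    if ($meta->getAttribute('property') === 'og:image') {
        $imageUrl = $meta->getAttribute('content');
        break;
    }
}

echo $imageUrl, PHP_EOL; // https://example.com/photo.jpg
```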

Here, we define a function that returns the root domain of a given URL.

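function get_domain($url)
{
    // Split the URL into its components (scheme, host, path, ...).
    $pieces = parse_url($url);
    if ($pieces === false) {
        return false;
    }
    // URLs without a scheme have no 'host' key, so fall back to 'path'.
    $domain = isset($pieces['host']) ? $pieces['host'] : (isset($pieces['path']) ? $pieces['path'] : '');
    // Keep only the root domain, e.g. "www.instagram.com" becomes "instagram.com".
    if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
        return $regs['domain'];
    }
    return false;
}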
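The fallback from 'host' to 'path' in this function is easier to follow once you see what parse_url() returns with and without a scheme. The short sketch below uses a made-up shortcode (abc123) purely for illustration.

```php
<?php
// With a scheme, parse_url() fills in the "host" key.
$withScheme = parse_url('https://www.instagram.com/p/abc123/');
echo $withScheme['host'], PHP_EOL;        // www.instagram.com

// Without a scheme, everything lands in "path" and "host" is not set.
$withoutScheme = parse_url('www.instagram.com/p/abc123/');
var_dump(isset($withoutScheme['host']));  // bool(false)
echo $withoutScheme['path'], PHP_EOL;     // www.instagram.com/p/abc123/
```

Because the regular expression in get_domain is anchored to the end of the string, a scheme-less URL that still carries a post path will most likely not match, so it is safer to pass full URLs that start with https://.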

Next comes cURL, which does the main work in PHP web scraping: it fetches the HTML of the page.

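function file_get_contents_curl($url)
{
    $ch = curl_init();

    // Leave the response headers out of the returned data.
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // Return the response as a string instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    // Follow HTTP redirects.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

    // $data is false if the request fails.
    $data = curl_exec($ch);
    curl_close($ch);

    return $data;
}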
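The function above returns whatever cURL gives back, including false on failure. As a rough sketch, a variant with timeouts and a basic status check could look like the following. The name fetch_html, the timeout values, and the user agent string are my own assumptions, not part of the original downloader.

```php
<?php
// Hypothetical variant of the fetch function with basic failure handling.
function fetch_html($url, $timeoutSeconds = 10)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // Give up if connecting or the whole transfer takes too long.
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    // Assumed user agent string; some sites reject requests without one.
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; PhotoDownloader/1.0)');

    $data = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // Treat transport errors and non-200 responses as "no page".
    if ($data === false || $status !== 200) {
        return null;
    }
    return $data;
}
```

Returning null for every failure keeps the calling code simple: it only has to check for null before handing the HTML to the parser.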

Here is the full function, which gets the media type and then the media link.

function checkinstaurl($urlhere, $redirectpath)
{
    // Remove surrounding white space. The URL is not HTML-escaped here,
    // because htmlspecialchars() would turn "&" into "&amp;" in query strings.
    $urlhere = trim($urlhere);

    // Reject anything that is not an Instagram URL with a valid post path.
    // checkurlpath() and redirecterror() are helpers you need to define yourself.
    if (get_domain($urlhere) !== "instagram.com" || !checkurlpath($urlhere)) {
        redirecterror($redirectpath);
        return null;
    }

    // Fetch the page; stop if the request failed or returned nothing.
    $html = file_get_contents_curl($urlhere);
    if (!$html) {
        redirecterror($redirectpath);
        return null;
    }

    // Parse the HTML; "@" silences warnings about malformed markup.
    $doc = new DOMDocument();
    @$doc->loadHTML($html);

    $metas = $doc->getElementsByTagName('meta');
    $mediatype = 'photo';
    $imageurl = null;
    $videourl = null;

    // Collect the Open Graph values first, so the order of the tags does not matter.
    for ($i = 0; $i < $metas->length; $i++) {
        $meta = $metas->item($i);
        $property = $meta->getAttribute('property');
        $content = $meta->getAttribute('content');

        if ($property == 'og:type' && $content == 'video') {
            $mediatype = 'video';
        } elseif ($property == 'og:video') {
            $videourl = $content;
        } elseif ($property == 'og:image') {
            $imageurl = $content;
        }
    }

    // 'descriptionc' holds the media link: the video for video posts, else the image.
    $out = array();
    $out['mediatype'] = $mediatype;
    $out['descriptionc'] = ($mediatype == 'video') ? $videourl : $imageurl;

    return $out;
}
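The function calls two helpers, checkurlpath and redirecterror, that are not shown in this post. What follows is only a guess at what they might look like, not the code the downloader actually uses. It assumes that single posts live under /p/, /reel/ or /tv/ followed by a shortcode, and that $redirectpath is a URL the visitor should be sent back to.

```php
<?php
// Hypothetical helper: accept only paths that look like a single post,
// e.g. /p/<shortcode>/, /reel/<shortcode>/ or /tv/<shortcode>/.
function checkurlpath($url)
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return false;
    }
    return preg_match('#^/(p|reel|tv)/[A-Za-z0-9_-]+/?$#', $path) === 1;
}

// Hypothetical helper: send the visitor to the given page and stop the script.
function redirecterror($redirectpath)
{
    header('Location: ' . $redirectpath);
    exit;
}

var_dump(checkurlpath('https://www.instagram.com/p/abc123/'));   // bool(true)
var_dump(checkurlpath('https://www.instagram.com/someuser/'));   // bool(false)
```

Adjust both helpers to your own pages; for example, you may want redirecterror to pass an error message along to the page it redirects to.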

This method also works on video posts. First you need to get the post type, and then the server can look for the .mp4 file in the page source. Remember, though, that this works best for posts that contain a single piece of media, such as one photo or one video. Only one file is saved at a time; this method cannot download every item of a multi-photo post at once.
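The video case can be sketched on a small sample page. The HTML and the example.com URLs below are invented for illustration, and the tags are deliberately placed with og:video before og:type to show that collecting all values first makes their order irrelevant.

```php
<?php
// Hypothetical sample of a video post's head section.
$html = '<html><head>'
      . '<meta property="og:video" content="https://example.com/clip.mp4">'
      . '<meta property="og:type" content="video">'
      . '<meta property="og:image" content="https://example.com/thumb.jpg">'
      . '</head><body></body></html>';

$doc = new DOMDocument();
@$doc->loadHTML($html);

// Map each Open Graph property to its content value.
$og = array();
foreach ($doc->getElementsByTagName('meta') as $meta) {
    $property = $meta->getAttribute('property');
    if (strpos($property, 'og:') === 0) {
        $og[$property] = $meta->getAttribute('content');
    }
}

// Step 1: decide the post type. Step 2: pick the matching media link.
$isVideo = isset($og['og:type']) && $og['og:type'] === 'video' && isset($og['og:video']);
$mediaUrl = $isVideo ? $og['og:video'] : (isset($og['og:image']) ? $og['og:image'] : null);

echo ($isVideo ? 'video' : 'photo'), ': ', $mediaUrl, PHP_EOL; // video: https://example.com/clip.mp4
```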

You should know that web scraping is legal in some countries and illegal in others. In the end, it all depends on the purpose of the scraping.

I disagree with web developers who believe that Python is more effective than PHP when it comes to web scraping. So if you are a novice to web scraping, I suggest you go with PHP.

Top comments (2)

Insta Zoom • Jan 27 '22 • Edited

It is very similar to instazoom.mobi/

Soyel Ghosh • Aug 22 '23

You mentioned functions like checkurlpath, redirecterror, etc. These functions and pages need to be defined and implemented based on your specific requirements.

please give me the code...