Ayoub Ali

Hunting Broken Social Media Links with Go

Let's write a simple Go program that crawls a website and finds broken social media links that can be hijacked. A broken social link may let an attacker register the abandoned handle and conduct phishing attacks, which can lead to data breaches and other security incidents.
Finder currently supports Twitter, Facebook, Instagram, and TikTok, with more platforms planned for the future.

Step - 1 Project Setup

First, we create a folder named:

finder

Then, inside that folder, we run:

go mod init github.com/ayoubzulfiqar/finder

I am using my GitHub path here; yours will depend on your own module path preference.

Dependencies

github.com/fatih/color v1.15.0
github.com/gammazero/workerpool v1.1.3
github.com/gocolly/colly v1.2.0
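You can fetch each dependency at the listed version with go get:

go get github.com/fatih/color@v1.15.0
go get github.com/gammazero/workerpool@v1.1.3
go get github.com/gocolly/colly@v1.2.0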

Step - 2 User Agent

This user agent string lets the crawler mimic requests made by a real browser, which makes it less likely to be blocked while crawling websites. We build one for Chrome and leave a Firefox variant commented out:

package internals

const (
    chromeMajorVersion  = "109"
    firefoxMajorVersion = "109"
)

// var userAgent string = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:" + firefoxMajorVersion + ") Gecko/20100101 Firefox/" + firefoxMajorVersion

// Or for Chrome:
var UserAgent string = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/" + chromeMajorVersion + ".0 Safari/537.36"

Crawler

This function uses the colly package to crawl a website and collect links that point at specific social media domains. It stays within the starting domain and path, skips binary and document files via a deny list, and never revisits a link it has already seen.


package internals

import (
    "fmt"
    "net/url"
    "strings"
    "time"

    "github.com/gocolly/colly"
)

func Visitor(visitURL string, maxDepth int) []string {
    socialDomains := []string{"twitter.com", "instagram.com", "facebook.com", "twitch.tv", "tiktok.com"}
    // each result is stored as "pageWhereFound|socialLink"
    var socialLinks []string
    var visitedLinks []string
    // file extensions we never want to crawl
    denyList := []string{".js", ".jpg", ".jpeg", ".png", ".gif", ".bmp", ".svg", ".mp4", ".webm", ".mp3", ".csv", ".ogg", ".wav", ".flac", ".aac", ".wma", ".wmv", ".avi", ".mpg", ".mpeg", ".mov", ".mkv", ".zip", ".rar", ".7z", ".tar", ".iso", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx", ".pdf", ".txt", ".rtf", ".odt", ".ods", ".odp", ".odg", ".odf", ".odb", ".odc", ".odm"}

    c := colly.NewCollector()
    c.UserAgent = UserAgent
    c.SetRequestTimeout(5 * time.Second)
    c.MaxDepth = maxDepth
    c.AllowURLRevisit = false
    u, err := url.Parse(visitURL)
    if err != nil {
        // nothing we can crawl from an unparsable start URL
        return socialLinks
    }
    domain := u.Host
    path := u.Path
    // inspect every anchor tag on each page we visit
    c.OnHTML("a[href]", func(e *colly.HTMLElement) {
        link := e.Request.AbsoluteURL(e.Attr("href"))
        u2, err := url.Parse(link)
        if err != nil {
            return // skip malformed links instead of crashing the crawl
        }
        linkDomain := u2.Host
        // record links that point at a social network, tagged with the page they were found on
        for _, domain := range socialDomains {
            if strings.Contains(linkDomain, domain) {
                socialLinks = append(socialLinks, e.Request.URL.String()+"|"+link)
            }
        }
        // only follow links that stay on the target site
        if strings.Contains(linkDomain, domain) {
            visitFlag := true
            for _, extension := range denyList {
                if strings.Contains(strings.ToLower(link), extension) {
                    visitFlag = false
                }
            }
            for _, value := range visitedLinks {
                if strings.EqualFold(link, value) {
                    visitFlag = false
                }
            }

            // stay under the starting path
            if !strings.HasPrefix(u2.Path, path) {
                visitFlag = false
            }
            // if every filter passed, remember the link and crawl into it
            if visitFlag {
                visitedLinks = append(visitedLinks, link)
                // colly returns an error for revisits and depth limits; safe to ignore here
                _ = e.Request.Visit(link)
            }
        }

    })

    // start the crawl; a failure here (timeout, DNS, non-2xx) should not kill the whole run
    if err := c.Visit(visitURL); err != nil {
        fmt.Println("visit failed:", visitURL, err)
    }
    return socialLinks
}
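To try the crawler on its own, here is a minimal sketch; it assumes the code above lives in the internals package, as the command-line code later in this post suggests:

package main

import (
    "fmt"

    "github.com/ayoubzulfiqar/finder/internals"
)

func main() {
    // crawl example.com two levels deep and print each
    // "pageWhereFound|socialLink" pair the crawler collected
    links := internals.Visitor("https://example.com", 2)
    for _, link := range links {
        fmt.Println(link)
    }
}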

Check for Possible Takeover

This function takes a slice of social media links as input and checks whether each linked account might be available for takeover, using a different condition per platform.

  1. For Facebook links, it constructs a temporary link and sends an HTTP request to check if the page returns a "404 Not Found" response. If it does, it prints a message indicating a "Possible Takeover."

  2. For TikTok links, it sends an HTTP request with the provided UserAgent and checks if the response status code is 404. If it is, it also prints a "Possible Takeover" message.

  3. For Instagram links, it constructs a temporary link through the picuki.com mirror and sends an HTTP request to check for a 404 response.

  4. For Twitter links, it constructs a temporary link and sends an HTTP request using the nitter.net domain as a proxy. If the response status code is 404, it prints a "Possible Takeover" message.

The function focuses on identifying social media links that may be vulnerable to takeover, with a dedicated condition for each platform.
However, the effectiveness of such checks varies, so handle false positives and negatives appropriately in a production environment.

package internals

import (
    "crypto/tls"
    "io"
    "net/http"
    "net/url"
    "strings"

    "github.com/fatih/color"
)

func CheckTakeOver(socialLinks []string) {
    var alreadyChecked []string
    for _, value := range socialLinks {
        foundLink := strings.Split(value, "|")[0]
        socialLink := strings.Split(value, "|")[1]
        if StringInSlice(socialLink, alreadyChecked) {
            continue
        }
        alreadyChecked = append(alreadyChecked, socialLink)
        if len(socialLink) > 60 || strings.Contains(socialLink, "intent/tweet") || strings.Contains(socialLink, "twitter.com/share") || strings.Contains(socialLink, "twitter.com/privacy") || strings.Contains(socialLink, "facebook.com/home") || strings.Contains(socialLink, "instagram.com/p/") {
            continue
        }
        u, err := url.Parse(socialLink)
        if err != nil {
            continue
        }
        domain := u.Host
        // skip TLS certificate verification so an odd certificate doesn't abort a check
        tr := &http.Transport{
            TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
        }
        if strings.Contains(domain, "facebook.com") {
            // strip any subdomain so the link starts at facebook.com
            if strings.Count(socialLink, ".") > 1 {
                socialLink = "https://" + strings.Split(socialLink, ".")[1] + "." + strings.Split(socialLink, ".")[2]
            }
            socialLink = strings.Replace(socialLink, "www.", "", -1)
            // the tr-tr locale mirror serves a literal "404 Not Found" body for missing pages
            tempLink := strings.Replace(socialLink, "facebook.com", "tr-tr.facebook.com", -1)
            resp, err := http.Get(tempLink)
            if err != nil {
                continue
            }
            body, err := io.ReadAll(resp.Body)
            resp.Body.Close() // close now; a defer inside this loop would pile up until the function returns
            if err != nil {
                continue
            }
            if strings.Contains(string(body), "404 Not Found") {
                color.Green("Possible Takeover: " + socialLink + " at " + foundLink)

            }

        }
        if strings.Contains(domain, "tiktok.com") {
            if strings.Count(strings.Replace(socialLink, "www.", "", -1), ".") > 1 {
                continue
            }
            client := &http.Client{Transport: tr}

            req, err := http.NewRequest("GET", socialLink, nil)
            if err != nil {
                continue
            }

            req.Header.Set("User-Agent", UserAgent)

            resp, err := client.Do(req)
            if err != nil {
                continue
            }
            resp.Body.Close() // close now rather than defer inside the loop

            if resp.StatusCode == 404 {
                color.Green("Possible Takeover: " + socialLink + " at " + foundLink)
            }
        }
        if strings.Contains(domain, "instagram.com") {

            if strings.Count(strings.Replace(socialLink, "www.", "", -1), ".") > 1 {
                continue
            }
            if !strings.Contains(socialLink, "instagram.com/") {
                continue
            }
            tempLink := "https://www.picuki.com/profile/" + strings.Split(socialLink, "instagram.com/")[1]
            client := &http.Client{Transport: tr}
            req, err := http.NewRequest("GET", tempLink, nil)
            if err != nil {
                continue
            }

            req.Header.Set("User-Agent", UserAgent)

            resp, err := client.Do(req)
            if err != nil {
                continue
            }
            resp.Body.Close() // close now rather than defer inside the loop

            if resp.StatusCode == 404 {
                color.Green("Possible Takeover: " + socialLink + " at " + foundLink)
            }
        }
        if strings.Contains(domain, "twitter.com") {
            if strings.Count(strings.Replace(socialLink, "www.", "", -1), ".") > 1 {
                continue
            }
            u, err := url.Parse(socialLink)
            if err != nil {
                continue
            }
            userName := u.Path
            // nitter.net mirrors Twitter profiles and answers 404 for missing accounts
            tempLink := "https://nitter.net" + userName
            client := &http.Client{}
            req, err := http.NewRequest("GET", tempLink, nil)
            if err != nil {
                continue
            }

            req.Header.Set("User-Agent", UserAgent)

            resp, err := client.Do(req)
            if err != nil {
                continue
            }
            resp.Body.Close() // close now rather than defer inside the loop

            if resp.StatusCode == 404 {
                color.Green("Possible Takeover: " + socialLink + " at " + foundLink)
            }
        }
    }
}
StringInSlice is a small helper used above to skip links we have already checked:
func StringInSlice(a string, list []string) bool {
    for _, b := range list {
        if b == a {
            return true
        }
    }
    return false
}
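All four platform checks boil down to the same idea: request a URL (sometimes through a mirror) and treat a 404 as a takeover candidate. Here is a stripped-down sketch of that core probe; the function name and test URL are illustrative, not part of Finder:

package main

import (
    "fmt"
    "net/http"
)

// probe404 reports whether a URL answers with HTTP 404, the signal
// the checker interprets as "this account may be free to register".
func probe404(link, userAgent string) (bool, error) {
    req, err := http.NewRequest("GET", link, nil)
    if err != nil {
        return false, err
    }
    req.Header.Set("User-Agent", userAgent)
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return false, err
    }
    defer resp.Body.Close()
    return resp.StatusCode == http.StatusNotFound, nil
}

func main() {
    gone, err := probe404("https://nitter.net/some_missing_handle", "Mozilla/5.0")
    if err != nil {
        fmt.Println("request failed:", err)
        return
    }
    fmt.Println("possible takeover:", gone)
}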

Command Line and Run

"Action" function responsible for processing a single URL. Within this function, it initiates a web crawl using the "internals.Visitor" function, which collects social media links. Subsequently, the "internals.CheckTakeOver" function is called to assess the possibility of a takeover. The code also includes progress tracking and displays the number of remaining URLs to process.

The core functionality is in the Run function. It starts by parsing command-line flags: the path of a file containing URLs to be checked and the number of worker goroutines. The application reads URLs from the specified file, splits them into a slice, and counts how many URLs need to be processed.

To execute tasks concurrently, the code uses a worker pool created with the workerpool package, sized by the -w flag. A task is submitted for each URL, the pool manages their execution, and once every task has been submitted the program waits for completion with wp.StopWait().
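The workerpool API is small; this minimal standalone sketch shows the Submit/StopWait pattern in isolation (the task bodies are placeholders):

package main

import (
    "fmt"

    "github.com/gammazero/workerpool"
)

func main() {
    wp := workerpool.New(2) // at most two tasks run at once

    for _, name := range []string{"a", "b", "c"} {
        name := name // pin the loop variable for the closure
        wp.Submit(func() {
            fmt.Println("processing", name)
        })
    }

    wp.StopWait() // blocks until every submitted task has finished
}

With that pattern in mind, here is the full cmd package: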

package cmd

import (
    "flag"
    "fmt"
    "os"
    "strings"
    "sync/atomic"

    "github.com/ayoubzulfiqar/finder/internals"
    "github.com/fatih/color"
    "github.com/gammazero/workerpool"
)

// queue is decremented from multiple goroutines, so it must be updated atomically
var queue int64

func Action(url string) {
    sl := internals.Visitor(url, 10)
    internals.CheckTakeOver(internals.RemoveDuplicateStrings(sl))
    color.Magenta("Finished Checking: " + url)
    remaining := atomic.AddInt64(&queue, -1)
    fmt.Println("Remaining URLs:", remaining)
}

func Run() {
    internals.LOGO()
    urlFile := flag.String("f", "", "Path of the URL file")
    numWorker := flag.Int("w", 5, "Number of workers.")
    flag.Parse()
    if *urlFile == "" {
        fmt.Println("Please specify all arguments!")
        flag.PrintDefaults()
        os.Exit(1)
    }
    file, err := os.ReadFile(*urlFile)
    if err != nil {
        fmt.Println(err)
        return
    }
    var urls []string
    for _, line := range strings.Split(string(file), "\n") {
        line = strings.TrimSpace(line) // tolerate \r\n line endings and blank lines
        if line != "" {
            urls = append(urls, line)
        }
    }
    queue = int64(len(urls))
    fmt.Println("Total URLs:", queue)
    wp := workerpool.New(*numWorker)

    for _, url := range urls {
        url := url // pin the loop variable for the closure
        wp.Submit(func() {
            fmt.Println("Checking:", url)
            Action(url)
        })

    }
    wp.StopWait()

    color.Cyan("Scan Completed")
}
The RemoveDuplicateStrings helper lives in the internals package and strips duplicate links before they are checked:
func RemoveDuplicateStrings(strSlice []string) []string {
    allKeys := make(map[string]bool)
    list := make([]string, 0, len(strSlice))

    for _, item := range strSlice {
        // only keep the first occurrence of each string
        if _, exists := allKeys[item]; !exists {
            allKeys[item] = true
            list = append(list, item)
        }
    }
    return list
}
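It preserves first-seen order, as a quick standalone check (with hypothetical values) shows:

package main

import (
    "fmt"

    "github.com/ayoubzulfiqar/finder/internals"
)

func main() {
    in := []string{"a", "b", "a", "c", "b"}
    fmt.Println(internals.RemoveDuplicateStrings(in)) // prints [a b c]
}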

Usage

Finder requires two parameters:

  1. -f : Path of a text file that contains URLs, one per line (e.g. -f D:\url.txt).
  2. -w : Number of workers to run (e.g. -w 10). The default value is 5; increase or decrease it based on what your system can handle.

url.txt should contain every URL you want scanned, one per line.
You can create a local text file and pass its path to the program.
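For example, a url.txt might look like this (hypothetical targets):

https://example.com
https://blog.example.org

Then run: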

go run . -f url.txt
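To raise the worker count at the same time:

go run . -f url.txt -w 10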

Currently, Finder supports Twitter, Facebook, Instagram, and TikTok without any API keys.
Support for other social media platforms will be added in the future.

Full Code

The full source is available at github.com/ayoubzulfiqar/finder.
