Whenever you want to look for something on the Internet, you use Google. The giant search engine indexes almost everything on the web. It has made billions of web pages accessible for people to find. And so, by using it, you would have a greater chance of finding whatever you are searching for.
However, within the large sea of indexed web content and public data, pieces of sensitive information can sometimes end up in search results. And frequently, this happens without their owners realising it.
A malicious hacker, by performing a technique called Google Dorking (or Google Hacking), can get their hands on this supposedly hidden content.
If you are not familiar with Google Dorks, don’t worry. In this post, I will explain what they are and how you can use them. I will also provide you with examples of how hackers employ them to access sensitive content. And finally, I will share with you a few best practices for protecting yourself against them.
Before we go any further, I would like to remind you that accessing information without authorization is illegal in many jurisdictions. The primary focus of this article is to help you identify and remove any leaked information you might have. It also aims to assist you through the reconnaissance phase of your pen-testing projects. I do not encourage any malicious use.
Google Dorks are search queries specially crafted by hackers to retrieve sensitive information that is not readily available to the average user. The technique of searching using these search strings is called Google Dorking, or Google Hacking.
The Google search box can act similarly to a command-line or an interpreter when provided with the right queries. In other words, there are certain keywords, and operators, that have special meaning to Google.
Users can employ these operators to help them find relevant results to their search queries in a short amount of time.
On the other hand, hackers can also take advantage of these operators to retrieve files containing passwords, lists of emails, log files, and much more.
The following example is a Google dork query that returns log files containing passwords with email addresses:
filetype:log intext:password intext:(@gmail.com | @yahoo.com | @hotmail.com)
By the end of this article, you will be able to write similar queries.
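Since a dork is just a search string, it can be turned into an ordinary search URL. The snippet below is a minimal sketch of that idea, assuming the usual `https://www.google.com/search?q=` URL form; the function name `dork_to_url` is made up for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def dork_to_url(query: str) -> str:
    """Return a search URL whose q parameter carries the given query.

    The base URL is an assumption; adjust it if you use another endpoint.
    """
    return "https://www.google.com/search?" + urlencode({"q": query})


url = dork_to_url("filetype:log intext:password")
print(url)

# Decoding the URL again gives back the original query string.
print(parse_qs(urlparse(url).query)["q"][0])
```

Opening the printed URL in a browser is equivalent to typing the query into the search box. Only run such queries against content you are authorized to assess.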
Operators are the building blocks of Google dorks. Therefore, we will address them here first before we can write full dork queries.
Here is a list of the most common operators that you need to know:
If you use the operator OR (or |) between two keywords or more, then the search results will return pages that contain matches to at least one of the keywords.
google OR bing OR duckduckgo
Using the operator AND between two keywords or more forces the search engine to return results relevant to all provided keywords.
Samsung AND Apple
Enclosing the search terms in double-quotes (“search string”) returns only webpages that contain an exact match of the string.
For example, suppose you search for the following:
"Google Dorks Explained"
Only pages that contain that exact string are returned. Pages that contain “Explained Google Dorks” or “Google Hacking using dorks explained” will not be matched.
“site:” limits the search to the specified website.
For example, a query such as site:wikipedia.org Linux returns only web pages from Wikipedia that are relevant to the keyword Linux.
If you use the operator ‘-’ directly in front of a keyword, then this keyword is excluded from the results.
If we apply this operator to the Wikipedia example for site:, as in -site:wikipedia.org Linux, then we get the opposite result: the query excludes the Wikipedia site from the results.
The asterisk operator ‘*’ is used as a wildcard and can match any word or group of words. This operator can be very useful when combined with the double quotes operator.
"username * password"
This example returns pages that contain the word username, followed by a group of words, which are then followed by the word password.
The real power of Google operators comes from combining them to form complex queries. In such cases, brackets are necessary to determine which operator has the highest priority.
If you remember some basics from your math class, then you won’t have a problem understanding the following example:
("google dorks" OR "google dorking" OR "google hacking") AND (explained OR tutorial OR guide)
If you want Google to show only pages containing the search terms in their URL, then you can use the operator “inurl:”.
For instance, the query inurl:admin returns any page that contains the word admin in its URL.
Although this query on its own might return millions of pages, most of which are irrelevant, you can narrow down the results by adding more operators. For instance, if you limit the search to your own website, you can check whether you have an exposed admin folder that you should worry about.
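Narrowing a broad query by combining operators can be sketched in a few lines of Python. This is only an illustration: the helper names `any_of` and `scoped`, the domain example.com, and the extra inurl:login term are placeholders, not part of the original examples.

```python
def any_of(*terms: str) -> str:
    """Join alternatives with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def scoped(domain: str, *parts: str) -> str:
    """Prefix the query parts with a site: restriction."""
    return " ".join([f"site:{domain}", *parts])


# example.com stands in for a website you own or are authorized to test.
query = scoped("example.com", any_of("inurl:admin", "inurl:login"))
print(query)  # site:example.com (inurl:admin OR inurl:login)
```

Wrapping the OR group in parentheses keeps the site: restriction applied to every alternative instead of only the first one.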
“intext:” returns pages containing the search term in their content.
“intitle:” returns pages that contain the terms of the search in their title, not their content.
When using the command “filetype:”, you force Google to return only files that have a certain extension.
In the example below, Google will return only PDF files that contain the words “Budget report”:
"Budget report" filetype:pdf
Google stores a copy of almost every page it visits. These copies can sometimes come in handy, especially if the original web page is no longer available or is too slow to respond.
If you want to search in Google’s cache for a previous version of a page, you can use the command “cache:” followed by the page’s URL.
Now that you know how dangerous Google dorks can be, you’re probably wondering how you can protect yourself, or your website, against them.
First of all, you should put yourself in the position of an attacker and try using Google dorks against yourself. If you find something in the search results that shouldn’t be there, then you can fix this problem by following these good practices:
You can create a file called “robots.txt” in the root directory of your website, and specify to search engine robots which directories or files they should not crawl.
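To check that a robots.txt file blocks what you expect, you can parse it with Python's standard `urllib.robotparser` module. The rules and paths below (/admin/, /logs/) are made-up examples, and the helper `is_blocked` is hypothetical; treat this as a sketch rather than a complete policy.

```python
from urllib.robotparser import RobotFileParser

# Example rules only; replace them with the contents of your own robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /logs/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_blocked(path: str, agent: str = "Googlebot") -> bool:
    """Return True if the rules disallow the given path for the agent."""
    return not parser.can_fetch(agent, path)


print(is_blocked("/admin/login.php"))  # True
print(is_blocked("/blog/post.html"))   # False
```

Keep in mind that robots.txt is a request to well-behaved crawlers and is itself publicly readable, so it should not be the only protection for sensitive paths.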
For sensitive pages, you should include a robots meta tag with the values noindex and nofollow in the head section of your HTML code.
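One way to verify that a page actually carries these values is to parse its HTML and read the robots meta tag. The sketch below uses Python's standard `html.parser`; the class name `RobotsMetaParser`, the function `robots_directives`, and the sample page are assumptions made for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Sample page for illustration only.
PAGE = (
    '<html><head><meta name="robots" content="noindex, nofollow">'
    "</head><body>Private report</body></html>"
)
print(robots_directives(PAGE) >= {"noindex", "nofollow"})  # True
```

Running a check like this over your sensitive pages can reveal ones where the tag is missing or misspelled.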
You should always password-protect your directories.
Never store a password in plaintext. Instead, use salted hashes.
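As a rough illustration of salted hashing, the sketch below uses PBKDF2 from Python's standard `hashlib`. The choice of PBKDF2-HMAC-SHA256, the iteration count, and the function names are assumptions; a dedicated password-hashing library may be a better fit for production.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; tune it for your hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for the password, generating a random salt if needed."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```

Only the salt and the digest are stored. If a file containing them is ever indexed, the original passwords are not directly readable.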
SiteDigger is a tool that you can use to help you find vulnerabilities and sensitive data on your site that are exposed through Google results.
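The advice to try dorks against yourself can also be kept as a small checklist. The sketch below builds site-scoped queries from operators covered in this post; the domain example.com, the labels, and the function name `audit_queries` are placeholders chosen for illustration.

```python
# Checks built from the operators discussed in this post; extend as needed.
CHECKS = {
    "exposed admin pages": "inurl:admin",
    "log files mentioning passwords": "filetype:log intext:password",
    "credentials in page text": '"username * password"',
    "indexed PDF documents": "filetype:pdf",
}


def audit_queries(domain: str) -> dict:
    """Return one site-scoped query per check for the given domain."""
    return {label: f"site:{domain} {dork}" for label, dork in CHECKS.items()}


# example.com stands in for a domain you own or are authorized to test.
for label, query in audit_queries("example.com").items():
    print(f"{label}: {query}")
```

Each printed query can be pasted into the search box. Anything unexpected in the results is a candidate for the fixes listed above.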
Even if you do not have a web server connected to the Internet, you might not be as safe from Google Dorking as you think.
Your personal information might still be readily accessible through Google Search.
I invite you to apply what we’ve learned in this post to identify if you have any leaked personal information. And if you find any, you should notify the proper entity so that they can take the necessary steps to remediate that.
Thanks for reading. If you want more tutorials like this, leave a comment below!