One of our developers shared an intriguing link he discovered last week, one that exposed several supposedly “secret” resources from various websites.
Someone on our team then proposed writing a blog post about this type of data leakage. The issue has come up in prior blog posts, as it is a popular way for security researchers to obtain sensitive private information about a website.
What is a Google Hack?
Google hacking, also known as Google dorking, is a technique used by an attacker to obtain information by utilising advanced Google search capabilities. Google hacking search queries can be used to find security vulnerabilities in web applications, gather information for arbitrary or specific targets, locate error messages containing sensitive data, and identify files holding credentials and other sensitive data.
An attacker's advanced search string may target a vulnerable version of a web application or a specific file extension (.pwd, .sql, etc.) to further restrict the search. The search can either be limited to pages on a particular website, or it can look for certain information across all websites, returning a list of sites that contain it.
For example, the search query intitle:"index of" filetype:sql returns a list of SQL files (filetype:sql) indexed by Google for websites where directory listing is enabled (intitle:"index of").
Logical operators and symbols in Google Search
Google search logical operators such as AND, NOT, and OR (which must be written in uppercase), as well as symbols such as ~, -, and *, can be exploited by attackers. The table below contains further information on these operators.
| Operator | Description |
|---|---|
| AND or + | Used to include keywords. All the keywords need to be found. |
| NOT or - | Used to exclude keywords. Results that contain the excluded keyword are left out. |
| OR or \| | Used to include keywords where either one keyword or another is matched. |
| Tilde (~) | Used to include synonyms and similar words. |
| Double quote (") | Used to include exact matches. |
| Period (.) | Used to include single-character wildcards. |
| Asterisk (*) | Used to include single-word wildcards. |
| Parentheses (()) | Used to group queries. |
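To make the table concrete, here is a minimal Python sketch that composes a query from exact phrases, OR groups, and exclusions. The helper names (`exact`, `any_of`, `exclude`) and the sample keywords are made up for illustration; they are not part of any Google API.

```python
def exact(phrase: str) -> str:
    """Wrap a phrase in double quotes so it is matched exactly."""
    return f'"{phrase}"'


def any_of(*terms: str) -> str:
    """Group alternatives with OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


def exclude(term: str) -> str:
    """Prefix a term with a minus sign to exclude it from the results."""
    return f"-{term}"


# Hypothetical keywords, chosen only to show how the pieces combine.
query = " ".join([
    exact("database error"),
    any_of("mysql", "postgresql"),
    exclude("tutorial"),
])
print(query)  # "database error" (mysql OR postgresql) -tutorial
```

Keeping each operator in its own small function makes it harder to forget the quotes or parentheses when a query grows.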
Advanced search operators
The advanced Google operators allow the user to further refine search results. The syntax for advanced operators is operator:search_term.
The syntax comprises three components: the operator, the colon (:), and the desired search term. Double quotes (") can be used to include search terms that contain spaces.
Google search recognises this pattern and filters the results accordingly. For instance, with the previously described query intitle:"index of" filetype:sql, Google searches for the string index of in the title of a page (this is the default title used by Apache HTTP Server for directory listings) and restricts the results to SQL files that it has indexed.
The table below lists advanced operators that can be used to locate websites with vulnerabilities. For further search operators, please see Google's Advanced Search page.
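The operator, colon, and search term pattern is easy to assemble programmatically. The following is a small sketch, not an official client: `build_dork` is a hypothetical helper, and the search URL is built with the standard library only, without sending any request.

```python
from urllib.parse import quote_plus


def build_dork(parts):
    """Join (operator, term) pairs into one query string.

    Terms that contain spaces are wrapped in double quotes.
    """
    tokens = []
    for operator, term in parts:
        if " " in term:
            term = f'"{term}"'
        tokens.append(f"{operator}:{term}")
    return " ".join(tokens)


query = build_dork([("intitle", "index of"), ("filetype", "sql")])
print(query)  # intitle:"index of" filetype:sql

# URL-encode the query so it can be pasted into a browser address bar.
search_url = "https://www.google.com/search?q=" + quote_plus(query)
print(search_url)
```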
| Operator | Description |
|---|---|
| site: | Limit the search query to a specific domain or website. |
| filetype: | Limit the search to text found in a specific file type. |
| link: | Search for pages that link to the requested URL. |
| cache: | Search and display a version of a web page as it was shown when Google crawled it. |
| intitle: | Search for a string within the title of a page. |
| inurl: | Search for a string within a URL. |
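Site owners can combine these operators to check what Google has indexed for their own domain. The sketch below only generates query strings; the function name `audit_queries`, the domain `example.com`, and the default extension list are assumptions chosen for illustration.

```python
def audit_queries(domain, extensions=("sql", "pwd")):
    """Return queries scoped to one domain via the site: operator.

    The first query looks for directory listings; the remaining ones
    look for indexed files with the given extensions.
    """
    queries = [f'site:{domain} intitle:"index of"']
    queries += [f"site:{domain} filetype:{ext}" for ext in extensions]
    return queries


for q in audit_queries("example.com"):
    print(q)
# site:example.com intitle:"index of"
# site:example.com filetype:sql
# site:example.com filetype:pwd
```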
Preventing Google Hacking Attacks
Google Hacking is merely a reconnaissance technique used by attackers to identify potential vulnerabilities and misconfigurations. Testing websites and online apps for vulnerabilities and misconfigurations and then correcting them eliminates the danger of enumeration and prevents exploitation.
Obviously, manual testing of vulnerabilities that can be discovered through a Google search is tedious and time-consuming. On the other hand, a thorough automated web vulnerability scanner shines at this type of activity.
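As a rough idea of what one such automated check might look like, the sketch below flags HTML pages whose title starts with "Index of", the default Apache directory listing title mentioned earlier. It is a simplified illustration rather than how any particular scanner works, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def looks_like_directory_listing(html):
    """Return True if the page title starts with 'Index of'."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip().lower().startswith("index of")


sample = "<html><head><title>Index of /backup</title></head><body></body></html>"
print(looks_like_directory_listing(sample))  # True
```

A real check would fetch each URL of the site first; the fetching step is left out here so the example stays self-contained.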
One example is a Google Hacking query that locates exposed PHPinfo files, that is, pages that print the output of PHP's phpinfo() function.
Such files should ideally be removed. If they are really necessary, you should restrict access to them, for instance by using HTTP Authentication.
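To verify that the restriction is in place, a response from the PHPinfo URL can be classified along these lines. This is a sketch under stated assumptions: the function name `phpinfo_exposure` is hypothetical, and treating the string "phpinfo()" in the body as the marker of an exposed page is a heuristic, not a guaranteed signature.

```python
def phpinfo_exposure(status_code, headers, body):
    """Classify a response fetched from a suspected PHPinfo URL.

    Returns "protected" when the server asks for HTTP Authentication,
    "exposed" when the page is served and contains the assumed marker,
    and "unknown" otherwise.
    """
    header_names = {name.lower() for name in headers}
    if status_code == 401 and "www-authenticate" in header_names:
        return "protected"
    # Assumption: an exposed page mentions "phpinfo()" somewhere in its HTML.
    if status_code == 200 and "phpinfo()" in body:
        return "exposed"
    return "unknown"


print(phpinfo_exposure(401, {"WWW-Authenticate": 'Basic realm="admin"'}, ""))
# protected
print(phpinfo_exposure(200, {}, "<title>phpinfo()</title>"))
# exposed
```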