While writing several blog posts and documentation, I have often used
example.com to stand in for any
domain name. One of the Internet
standards (RFC 2606), established by Internet engineers in 1999, set aside example.com (as well as example.org and example.net) for documentation purposes. So if you were to click on a link to http://www.example.com in
my post, you wouldn’t see an actual web page. Click on this link to see for yourself.
I’d like to demonstrate a fun little
trick you can use to amaze your
The page you see when you go to http://www.example.com looks
completely indexable by the search engines. There’s not a lot of content, but you would think that the engines would have indexed the content
exactly as your browser shows it to
you. It turns out, however, that there is a robots.txt file that blocks all spiders from all content inside www.example.com. (If you ever forget how to create a basic
robots.txt file, you can use this one
as a guide.) Alright, now for the punch line. Let’s see what the search engines really have indexed for
http://www.example.com. Go to
www.google.com and type “site:example.com” (without the
quotes). What do you see? If you see only one result, click on the link:
repeat the search with the omitted
I see 10,400 results now. There are
pages like example.com/blah/ and
The Google search results page does not have links to the cached version for any of these results, unfortunately, so we can’t see exactly what Google has indexed from these pages, but
we can go to the pages ourselves.
Well, I tried that, and every page I went to replied with “Not Found.” It’s logical to conclude that those pages never existed, but notice also that some of the results were crawled by Google within the past few hours.
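If you would rather not click through the results one by one, the same check can be sketched in Python. This is only an illustrative sketch under a few assumptions: the helper name fetch_status and its opener parameter are made up for this example, and the URL in the comment is simply the sample path from the results above.

```python
import urllib.error
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for url (hypothetical helper).

    urlopen raises HTTPError for 4xx and 5xx replies, so in that case
    the status code is read from the exception instead.
    """
    try:
        with opener(url) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


# Example call (needs network access):
# fetch_status("http://example.com/blah/")
```

The opener parameter is there only so the function can be exercised without a network connection; in normal use you would leave it at its default.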
You can try this search on other
search engines too.
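To make trying several engines less tedious, here is a small sketch that builds the “site:” search URL for each one. The endpoint URLs and the function name site_query_url are assumptions for illustration; engines can change their URL formats, so treat this as a starting point rather than a reference.

```python
from urllib.parse import urlencode

# Search endpoints assumed for illustration; verify each engine's
# current URL format before relying on them.
SEARCH_ENDPOINTS = {
    "google": "https://www.google.com/search",
    "bing": "https://www.bing.com/search",
}


def site_query_url(engine, domain):
    """Build a search URL for the query site:<domain> on the given engine."""
    query = urlencode({"q": f"site:{domain}"})
    return f"{SEARCH_ENDPOINTS[engine]}?{query}"


for name in SEARCH_ENDPOINTS:
    print(site_query_url(name, "example.com"))
```

Paste any of the printed URLs into your browser to run the search yourself.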
My feeling on this strange
phenomenon is that it could be Google’s own testing, other people testing, or someone somehow tricking Google into adding these pages to its index. It may be limited to certain data centers as well.
Whatever is causing this, I’m sure
Google knows about it but doesn’t
feel the need to do anything about it. This phenomenon may also get you thinking about how search engines are supposed to work.
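As a footnote on the robots.txt file mentioned earlier: a file that blocks all spiders from all content usually takes the two-line form shown below. This is a sketch of that common form, not a copy of the actual file on www.example.com, and the helper name is_allowed and the user agent string are assumptions for illustration. Python's standard urllib.robotparser module can confirm how a well-behaved spider would read it.

```python
from urllib.robotparser import RobotFileParser

# A basic robots.txt that blocks every spider from every path
# (assumed common form, not the actual file served by example.com).
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# An empty Disallow line means nothing is blocked.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(BLOCK_ALL, "Googlebot", "http://www.example.com/"))
print(is_allowed(ALLOW_ALL, "Googlebot", "http://www.example.com/"))
```

Keep in mind that robots.txt only tells spiders what not to crawl. A blocked URL can still show up in an index if the engine learns about it some other way, which fits the odd results above.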
Wednesday, September 19, 2012