Here’s to a Great 2008!
Posted December 31st, 2007 by admin

HAPPY NEW YEAR!!
Blurbs on SEO, Technology, Business, Internet, Marketing, Trends, Tips, Tricks and more
In a recent post we discussed the Robots Exclusion Protocol (REP) file /robots.txt and how it is used to tell search engines and other bots what NOT to include while crawling our pages. /robots.txt lets us block any page, directory or folder we don’t want appearing in search results. For /robots.txt to work, it must be placed in the top-level directory of our web server.
What happens if we’re not hosting our own site and we don’t have access to the root directory of our domain? This is where the REP META tag comes in: we can place it manually in our HTML files to control the indexing, link following, caching and snippets of our pages. The META tag is especially useful when we can edit the HTML of individual pages but cannot edit /robots.txt. If we don’t keep sensitive files on the site and have no problem with search engines indexing all our pages, we can do without a /robots.txt file altogether. The REP META tag covers much the same ground, and in some ways it is more flexible, because it lets us decide page by page how search engines should index our content.
In structure, it is the same as other meta tags we can find when we open our template and view the HTML code. This is our meta tag for our description:
<meta name="description" content="A Blog of Blogs That Follow Directory for Do Follow Bloggers. Link Submit site." />
For the robots meta tag, it will be something like:
<meta name="robots" content="noindex,nofollow" />
where you’re instructing the robots to NOT index the page and NOT follow any links on the page.
This is how you can use the meta tag to suit your need.
Here are some actual examples of the robots META tag based on the above usage:
<meta name="robots" content="noindex,nofollow" />
OR you can also use this to give the same instruction of no indexing and no following of links:
<meta name="robots" content="none" />
<meta name="robots" content="index,nofollow" />
<meta name="robots" content="index,nofollow,noarchive" />
<meta name="robots" content="noindex,follow" />
We place the meta tag inside the <head> section near the top of our HTML code, either just below the opening <head> tag or just before the closing </head> tag.
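To double-check what a page’s robots META tag actually says, here is a minimal sketch in Python that reads the tag back out of the HTML. The class name RobotsMetaParser, the helper robots_directives and the sample page are assumptions made up for illustration; only the standard html.parser module is used.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Tag and attribute names arrive lowercased; values keep their case.
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.append(directive)


def robots_directives(html):
    """Return the list of robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page, used only for illustration.
SAMPLE_PAGE = """
<html>
<head>
<title>My Page</title>
<meta name="robots" content="noindex,nofollow" />
</head>
<body><p>Hello</p></body>
</html>
"""

print(robots_directives(SAMPLE_PAGE))
```

Because the comparison is case-insensitive, an uppercase tag such as <META NAME="ROBOTS" CONTENT="INDEX, NOFOLLOW"> is read the same way.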
We reiterate as found on robotstxt.org site that:
“There are two important considerations when using the robots <META> tag:
Don’t confuse this NOFOLLOW with the rel="nofollow" link attribute.”
The rel="nofollow" attribute is a Google invention that is also supported by Yahoo and MSN. It is link specific and has more to do with the ranking of a page than with simply following or not following links for indexing. It is somewhat confusing because the two have almost the same end result. The difference is that rel="nofollow", once placed on a specific link, means that link will not get a ‘vote’ in the popularity ranking of the page it points to. The search engine will disregard the link and will not follow it to index that page.
That is why we support removing the rel="nofollow" attribute from our comments page: we value the readers who leave a comment on our posts. We want to make sure that Google, Yahoo and MSN find their links just as important as the site they visited.
The rel="nofollow" attribute and the robots meta tag do have their uses. Say we decide to write a story on “Evil Sites” and therefore cannot help but include some sites that we find evil. What we would do is put a robots meta tag on the entire post:
<html>
<head>
<title>Evil Sites</title>
<META NAME="ROBOTS" CONTENT="INDEX, NOFOLLOW">
</head>
which means that the search engines can index the post but are instructed to disregard all links found in it.
More specifically, we could include a rel="nofollow" attribute in the HTML code for the specific link we don’t want to be associated with:
<a href="http://www.evilsite.com/" rel="nofollow">Evil Site</a>
OR
<a rel="nofollow" href="http://www.evilsite.com/">Evil Site</a>
The last thing we want is to tell Google, Yahoo and MSN that we approve of this site and give it a vote, or that we want to be associated with the Evil Site!! In this instance, we find the rel="nofollow" attribute helpful and useful.
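To check which links on a page carry the attribute, a short Python sketch like the one below can list every link together with a flag for rel="nofollow". The names LinkRelParser and find_links, the sample HTML and the www.example.com link are hypothetical and used only for illustration.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Record each <a href> with a flag saying whether it is nofollow."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel may hold several space-separated values, e.g. "external nofollow".
        rel_values = (attributes.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_values))


def find_links(html):
    """Return (href, is_nofollow) for every link in an HTML string."""
    parser = LinkRelParser()
    parser.feed(html)
    return parser.links


# Hypothetical post body, used only for illustration.
SAMPLE_POST = (
    '<p><a rel="nofollow" href="http://www.evilsite.com/">Evil Site</a> and '
    '<a href="http://www.example.com/">a site we vouch for</a></p>'
)

for href, is_nofollow in find_links(SAMPLE_POST):
    print(href, "nofollow" if is_nofollow else "followed")
```

The same helper could be pointed at a comments template to see whether reader links are still marked nofollow.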
Nonetheless, do we really want to treat our readers the same way by keeping the rel="nofollow" attribute on our comments page? That would be telling them: thanks for commenting, but we don’t want to be associated with you and you have no value to us… We don’t think so.
Why not give Do Follow a chance then and become a Do Follow supporter yourself? Please find our helpful topics on Do follow:
How Does rel=”nofollow” Impact My Blog?
© 2007
Related Post: Be a Do Follow Blogger, Win A Grey’s Anatomy 2008 Calendar!
Congratulations to the author of Simone’s Butterfly and Eam of The Postcard Collector for winning last week’s contest! We’ve already notified them and we’re awaiting their responses.
It is still on for two more weeks! We’re telling you guys, this is an easy win! We have more calendars to give away than bloggers joining! Again, the rules are pretty easy to accomplish.
Simply become a DO FOLLOW BLOGGER! Remove the rel=nofollow tag on your template and put a “U Comment, I Follow” badge on your site. Submit your site to Blogs That Follow and then let us know that you did by leaving a comment here. And you’re done.
The calendars are just waiting to be sent!

Wherever you may be or however you celebrate this holiday, we wish you good cheer!
Every day hundreds of web robots and search engine crawlers set out to accomplish a huge task: visiting billions of pages on the internet. Some are Google’s bots indexing our pages and the rest of the web; others are bad robots called spam bots, hunting down every email address they can find to steal it.
For the most part, we love it when Google pays us a visit to index our content! Knowing what Google and others are picking up, however, means taking an extra step to direct them only to the content we want indexed. Sometimes there are areas of our site, like a temp folder, that we don’t want others to see. To save bandwidth, we may want to keep images, stylesheets or other files from being indexed too. For confidential files, like a database of names and addresses of contacts, it is of course best to take them offline or move them onto another machine rather than risk spreading them on the net.
This is where the Robots Exclusion Protocol (REP) comes in. Think of it as a sign at our office that says “restricted area”: access is for employees only, and the sign is meant to keep unwanted visitors away. /robots.txt works just like that. There is another form of REP, placed in a META tag, that works the same way. We will discuss it in our next post; we’ll talk about /robots.txt first.
/robots.txt is a simple text file. It’s not an HTML file, just a basic text file that can do wonders! It tells robots which pages we would NOT want them to visit. They are not required to obey it, but good robots and crawlers are generally courteous enough to comply. It is important to note, nonetheless, that as in the restricted-area comparison above, it’s just a sign on an unlocked door. It doesn’t mean that an unwanted visitor can’t get in when he wants to! Bad robots like spam bots and malware bots may still go through the door to look for loopholes in your security and for those email addresses, but the good bots will abide by the sign and will not barge in uninvited.
As mentioned earlier, it is risky to place sensitive files on your site and hope that robots.txt will keep them from being indexed and appearing in search results. /robots.txt is also public: anyone can open it and see exactly which sections you don’t want robots to visit, so you don’t want a name like /mybankaccounts listed in it. The file only tells visitors that robots are asked not to view the mybankaccounts folder; anyone who knows a way into it can still get in!
The concept of robots.txt is this: a robot wants to visit the page http://www.myownsite.com/welcome.html. Before it does anything, it first looks for http://www.myownsite.com/robots.txt to find out which pages it may or may not index. If it can’t find the file, it will go ahead and index everything on the site.
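The lookup described above can be sketched in a few lines of Python: whatever page the robot is after, it asks for /robots.txt at the root of the same host. The function name robots_txt_url is an assumption made up for this sketch; the URL is the example from the paragraph above.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt address a robot would check for this page."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.myownsite.com/welcome.html"))
# prints http://www.myownsite.com/robots.txt
```

A page buried in sub-folders gives the same answer, which is why the file only works from the root directory.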
This is the basic structure of a robots.txt file, where * (asterisk) means ALL robots and / (slash) means that no page should be indexed. As a file it means: ALL robots are NOT allowed to index any of the pages. We don’t want that, but it shows the basic components: a User-agent line naming the robot, then a Disallow: line with a / followed by the path to block.
User-agent: *
Disallow: /
On the other hand, the file below allows all robots to index all pages. This is the default for any website unless we manually create a robots.txt listing the files we don’t want indexed.
User-agent: *
Disallow:
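To see how a robot that honors the protocol would read the two files above, here is a minimal sketch using Python’s standard urllib.robotparser module. The helper name can_fetch, the robot name ExampleBot and the page URL are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two files shown above, as strings.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def can_fetch(robots_txt, user_agent, url):
    """Parse robots.txt text and ask whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


PAGE = "http://www.myownsite.com/welcome.html"  # hypothetical page

print(can_fetch(BLOCK_ALL, "ExampleBot", PAGE))  # prints False
print(can_fetch(ALLOW_ALL, "ExampleBot", PAGE))  # prints True
```

An empty Disallow: value blocks nothing, so the second file has the same effect as having no rules at all.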
If you don’t have much on your site yet, it is best to use the file above, create an empty robots.txt file, or simply not create one at all. /robots.txt is for those who have files in their directories that they don’t want indexed and don’t want to see in search results.
There’s really no point in having folders like images or cgi-bin indexed, and blocking them saves bandwidth. The file below allows all robots to index your pages except the ones listed on the Disallow lines.
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /privatestuff
Disallow: /temp/
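A quick way to try a file like this before uploading it is to feed it to Python’s standard urllib.robotparser module and ask about a few paths. This is only a sketch: the helper name allowed, the robot name ExampleBot and the sample paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /privatestuff
Disallow: /temp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path):
    """Ask whether a hypothetical robot may fetch the given path."""
    return parser.can_fetch("ExampleBot", "http://www.myownsite.com" + path)


for path in ["/welcome.html", "/images/logo.png",
             "/privatestuff/notes.txt", "/privatestuff-old.html"]:
    print(path, "allowed" if allowed(path) else "blocked")
```

Rules are matched as prefixes, so the line without a trailing slash, Disallow: /privatestuff, also blocks a page such as /privatestuff-old.html in this parser. Adding the trailing slash limits the rule to the folder.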
If you want to be specific and only allow Google to crawl your site, you may do so with this:
User-agent: Googlebot
Disallow: /cgi-bin/
Disallow: /privatedir/

User-agent: *
Disallow: /
It means you’re allowing Google to index your pages except the cgi-bin and privatedir folders. The second group is what shuts out every other robot; without it, robots that match no User-agent line would be free to index everything. Note however that with this file you’re not allowing MSN, Yahoo or Alexa to index your site. So you might want to reconsider doing so.
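To check which robots a file like this actually restricts, here is a sketch using Python’s standard urllib.robotparser module. It compares a file that holds only the Googlebot group with one that also has a catch-all group. Under this parser, a robot that matches no group is left unrestricted. The helper name can_fetch and the robot name msnbot are used only for illustration.

```python
from urllib.robotparser import RobotFileParser

GOOGLEBOT_GROUP_ONLY = """\
User-agent: Googlebot
Disallow: /cgi-bin/
Disallow: /privatedir/
"""

# Same rules plus a catch-all group that blocks every other robot.
GOOGLEBOT_PLUS_CATCH_ALL = GOOGLEBOT_GROUP_ONLY + """
User-agent: *
Disallow: /
"""


def can_fetch(robots_txt, user_agent, path):
    """Parse robots.txt text and ask whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, "http://www.myownsite.com" + path)


for label, text in [("Googlebot group only", GOOGLEBOT_GROUP_ONLY),
                    ("with catch-all group", GOOGLEBOT_PLUS_CATCH_ALL)]:
    print(label)
    print("  Googlebot /welcome.html:", can_fetch(text, "Googlebot", "/welcome.html"))
    print("  Googlebot /cgi-bin/x:   ", can_fetch(text, "Googlebot", "/cgi-bin/x"))
    print("  msnbot    /welcome.html:", can_fetch(text, "msnbot", "/welcome.html"))
```

In both files Googlebot is kept out of cgi-bin and privatedir only. The other robot is blocked only when the catch-all group is present.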
The samples above should be used as is. Be careful with spelling, missing colons and placement: for example, don’t write Disalow instead of Disallow or User Agent instead of User-agent. Also, the filename is robots.txt, not Robots.Txt.
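Slips like these are easy to catch with a tiny checker. The Python sketch below flags lines with no colon and lines whose field name is not one of the two used in this post. The names KNOWN_FIELDS and lint_robots_txt are made up for illustration, and the field list is deliberately limited to User-agent and Disallow, so other fields that some robots understand would be flagged too.

```python
# Only the two fields covered in this post (an assumption of this sketch).
KNOWN_FIELDS = {"user-agent", "disallow"}


def lint_robots_txt(text):
    """Return a list of (line_number, problem) for suspicious lines."""
    problems = []
    for number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.split("#", 1)[0].strip()  # ignore comments
        if not line:
            continue  # blank lines separate groups and are fine
        if ":" not in line:
            problems.append((number, "missing colon"))
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS:
            problems.append((number, f"unknown field '{field}'"))
    return problems


# A deliberately broken file with the mistakes mentioned above.
BROKEN = """\
User Agent: *
Disalow: /temp/
Disallow /images/
Disallow: /cgi-bin/
"""

for number, problem in lint_robots_txt(BROKEN):
    print(f"line {number}: {problem}")
```

The first three lines of the broken sample are reported and the last one passes.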
Placement of the robots.txt file is very important, since wrong placement means the robots and search crawlers won’t be able to find it and will most likely index ALL your pages. They don’t have all day to look for a robots.txt file among our files. The only place to put it is the root directory of the site, not in folders or sub-directories. To check your robots.txt file, just type this in the address bar of your browser: http://myownsite.com/robots.txt
We shared only the basic information on robots.txt. To learn more, visit robotstxt.org. That site will also bring you to the Robots Database, a list of over 300 identified robots wandering the internet every day.
To create and validate your robots.txt file, Clockwatchers can help! Motoricerca is also a robots.txt checker.
Congratulations to Allen of Silkenhut’s World and Jenny of The So Called Me for winning last week’s contest! We’ve already notified them and we’re awaiting their responses.
It is still on for three more weeks! Again, the rules are pretty easy to accomplish.
Simply become a DO FOLLOW BLOGGER! Remove the rel=nofollow tag on your template and put a “U Comment, I Follow” badge on your site. Submit your site to Blogs That Follow and then let us know that you did by leaving a comment here. And you’re done.
We will contact the winner by email. Winner must be willing to give us a valid address (anywhere in the world) where we can send the actual calendar! (We value your privacy and we will never sell or submit your address to any entity. Once the contest is over, we will destroy the address records.)
Curious how the winners were picked? We wrote all the names of those who submitted their Do Follow sites to us and we did a drawing, picked two names and there you have it! Non winners from last week will still get a chance of winning as we will still include their names for next week’s drawing.
The list isn’t very long yet! We know of the coming Holiday frenzy, in fact it already has started for us; nonetheless, don’t miss out on this chance for an easy win! We look forward to hearing from you…
Related Posts: What is Do Follow? How does rel=nofollow impact my blog?
We can’t stress enough the importance of removing the rel=nofollow tag in blogs. We are very surprised that, although the Do Follow movement took off earlier this year, we’re still seeing a huge number of blogs with the rel=nofollow tag. Most bloggers do not have any idea how it works or what value removing it adds to their blogs. That’s the very reason we came up with our own site to promote just that! We’re firm believers in Do Follow and we have to keep it alive and going!
Simply become a DO FOLLOW BLOGGER! Remove the rel=nofollow tag on your template and put a “U Comment, I Follow” badge on your site. Submit your site to Blogs That Follow and then let us know that you did by leaving a comment here. And you’re done.
The contest will run for 4 weeks. Each week, we will pick out 2 lucky bloggers who will win a 2008 Grey’s Anatomy mini calendar! Depending on the responses we get, we may increase the number of winners when there are many entries for a particular week. The more Do Follow Bloggers, the more calendars we give out.
December 11 – 18 WEEK ONE
December 18 – 25 WEEK TWO
December 25 – January 1 WEEK THREE
January 1 – 8 WEEK FOUR
We will contact the winner by email. Winner must be willing to give us a valid address (anywhere in the world) where we can send the actual calendar! (We value your privacy and we will never sell or submit your address to any entity. Once the contest is over, we will destroy the address records.)
Grey’s Anatomy Mini Calendar: Already a hit in 2005, Grey’s Anatomy took on a challenging and coveted Thursday night time slot for the 2006 and 2007 season. The result was a new king in primetime. Entertainment Weekly calls it “TV’s hottest show”. Great on your computer desk! (measures 6.8 x 6.7 x 0.1 inches)
While rel=nofollow was meant to be anti-spam, it treats every link as spam! Why would we punish legitimate comments by keeping them from being indexed by search engines? Legit comments deserve the link love…
If you don’t remove the rel=nofollow tag from your blog code, every link that someone leaves in a comment on your site stays there but will not be crawled by Google or any other search engine. It will be ignored, and so will not help in ranking the importance of that page. If you wrote a very helpful article, got traffic and received comments on it, yet did not remove the rel=nofollow, Google will simply ignore those comments, which could have helped increase the ranking of that particular page.
Be a Do Follow Blogger. Get a badge and show it proudly on your blog site! Submit your URL to Blogs That Follow, a link directory exclusive to Do Follow Bloggers. Help us and we’ll help you! Eventually, the link directory aims to be your reference for checking which blogs support Do Follow Bloggers. For at the end of the day, your time is precious. Reading and commenting on blogs that have No Follow turned on will not amount to anything from the search engine’s perspective…
Spread the word. Make a stand. “U comment, I Follow…”