<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:thr="http://purl.org/syndication/thread/1.0">
    <title>Comments for Restricting Google on my terms</title>
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms" />
    <link rel="self" type="application/atom+xml" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms" />
    <id>tag:bradchoate.com,2007://4-</id>
    <updated>2006-03-20T21:38:57Z</updated>
    <subtitle>The man, the legend.</subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type -en-trunk--20070910</generator>
 

<entry>
    <id>tag:bradchoate.com,2004://4.1933-comment:2144</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#c2144" />
    <title>Comment from Pete Prodoehl on 2004-07-02</title>
    <author>
        <name>Pete Prodoehl</name>
        <uri>http://rasterweb.net/raster/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://rasterweb.net/raster/">
        <![CDATA[Are you worried about getting banned from Google? I've read of people abusing this sort of thing to get higher rankings, and that some search engines will occasionally request a page with a non-bot type UA to see if they get the same content...]]>
    </content>
    <published>2004-07-02T18:50:18Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-comment:2145</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#c2145" />
    <title>Comment from Andy Baio on 2004-07-02</title>
    <author>
        <name>Andy Baio</name>
        <uri>http://waxy.org/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://waxy.org/">
        <![CDATA[You might want to read Google's FAQ on <a href="http://www.google.com/webmasters/faq.html#cloaking" rel="nofollow">cloaking</a>. Your version is benign, but if it's determined automatically, you might be in trouble.]]>
    </content>
    <published>2004-07-02T19:26:20Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-comment:2146</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#c2146" />
    <title>Comment from Brad Choate on 2004-07-02</title>
    <author>
        <name>Brad Choate</name>
        <uri>http://bradchoate.com/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://bradchoate.com/">
        <![CDATA[Good point-- no, I hadn't seen that. Although, I would argue that the spirit of that restriction is to prevent the kind of abuse I alluded to at the end of this post. If anything, I am purifying the content that is indexed. If Google offered a way to supply "hints" within the page to indicate what should really be indexed, I wouldn't have to resort to this. Sadly, they do not.

<p>I've been using this technique for more than a year, I believe-- so far, I haven't been contacted by Google about this practice. Nor have I been banned for doing it. But if this post draws their attention to the issue of indexing "cruft" within a weblog site, hopefully it will produce a Google-sanctioned solution that we can all use.</p>

<p>Who knows, maybe <a href="http://evhead.com/">Ev</a> could bring this problem to their attention...</p>]]>
    </content>
    <published>2004-07-02T19:47:21Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-comment:2147</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#c2147" />
    <title>Comment from Peter Winnberg on 2004-07-02</title>
    <author>
        <name>Peter Winnberg</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[First of all, Google will not index the Google ad because it is included using JavaScript, right?

<p>If you don't want Google to index your comments and trackbacks together with the content of a page, isn't the best solution to not put them there? Instead you could have a link at the bottom of each story to an index of comments and trackbacks for that page.</p>]]>
    </content>
    <published>2004-07-02T20:09:14Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-comment:2148</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#c2148" />
    <title>Comment from Brad Choate on 2004-07-02</title>
    <author>
        <name>Brad Choate</name>
        <uri>http://bradchoate.com/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://bradchoate.com/">
        <![CDATA[Peter-- duh, you're right. The ads won't be indexed, but they are displayed in the cache. I guess that's OK.

<p>But as to putting the comments/trackbacks on a different page? No, I'd rather not. Besides, there's more than just comments and trackbacks that I want to exclude. I used to get a lot of hits from searches for "blah photos" that matched pages all over my site. Even though the page mentions "blah", there were no photos of it/them. It was the site navigation link for "Photos" on the right that was causing these false positive search results. So-- should I put my site navigation on a separate page??? Perhaps I should start using frames or something?</p>

<p>Over my dead &lt;body/&gt;.</p>]]>
    </content>
    <published>2004-07-02T20:22:23Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-comment:2150</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#c2150" />
    <title>Comment from Mark J on 2004-07-05</title>
    <author>
        <name>Mark J</name>
        <uri>http://www.txfx.net/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.txfx.net/">
        <![CDATA[Good article. I get some funky search queries that lead people to my site.

<p>I doubt that the "cloak detection" process is completely automated, and I'm sure that if they manually reviewed your content to see what you were hiding, they'd understand.  Heck, you could even leave an HTML comment for them: </p>

<p>I don't think it's a problem as long as you are just hiding cruft, and not introducing new content or doing something malicious.</p>

<p>On a different note, those using Mozilla Firefox can change their user agent on the fly. Sometimes I like to surf the web "as googlebot." The results can be interesting.</p>]]>
    </content>
    <published>2004-07-05T07:47:59Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-comment:2154</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#c2154" />
    <title>Comment from Ryan on 2004-07-07</title>
    <author>
        <name>Ryan</name>
        <uri>http://www.laze.net/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.laze.net/">
        <![CDATA[Awesome script, Brad. Might I make a small suggestion? Perhaps a clear notice at the top of the page that only shows up in the Google cache and not on your regular site (something along the lines of "This is a cached version of this page with all navigation and cruft removed. If you want to visit the full and most recent version of this page, go to...").

<p>I'd imagine this would be pretty easy to do since you already have the spider-detecting function.</p>]]>
    </content>
    <published>2004-07-07T23:03:51Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-comment:2159</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#c2159" />
    <title>Comment from Nixon on 2004-07-10</title>
    <author>
        <name>Nixon</name>
        <uri>http://www.popdizzy.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.popdizzy.com">
        <![CDATA[Does restricting Google's Mediapartners-Google put you in violation of the Ad-Words T&Cs?]]>
    </content>
    <published>2004-07-10T12:51:30Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-comment:2214</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#c2214" />
    <title>Comment from erica on 2004-08-14</title>
    <author>
        <name>erica</name>
        <uri>http://digitalrainstorm.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://digitalrainstorm.com">
        <![CDATA[I used to get a lot of hits searching for “blah photos” matching for pages all over my site. Even though the page mentions “blah”, there were no photos of it/them.

<p>And unfortunately it's problems like that that cause Google to serve up a lot of results that are nowhere near what a person is searching for and thus waste their time.</p>]]>
    </content>
    <published>2004-08-15T00:55:30Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-comment:2243</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#c2243" />
    <title>Comment from Brice on 2004-09-09</title>
    <author>
        <name>Brice</name>
        <uri>http://www.cmswire.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.cmswire.com">
        <![CDATA[I too asked Google for a way to add hints for the bot. No reply. It doesn't seem to exist for general use.

<p>You're definitely cloaking here, though it's good to see that they haven't penalized you for it.</p>

<p>Perhaps there is a little human judgement still applied. That would be welcome info.</p>

<p>Thanks for sharing the script.</p>]]>
    </content>
    <published>2004-09-09T11:58:10Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-comment:22225</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#c22225" />
    <title>Comment from Marcos on 2007-02-06</title>
    <author>
        <name>Marcos</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[Great script, and good discussion.]]>
    </content>
    <published>2007-02-06T19:15:39Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-comment:22289</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#c22289" />
    <title>Comment from chris charatain on 2007-06-06</title>
    <author>
        <name>chris charatain</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[new song >story of the hood ]]>
    </content>
    <published>2007-06-07T04:03:04Z</published>
</entry>


<entry>
    <id>tag:bradchoate.com,2004://4.1933-ping:1832</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#p1832" />
    <title>Restricting Google on our own terms</title>
    <author>
        <name>leuschke.org links</name>
        <uri>http://www.leuschke.org/quick/archives/2004_07.html#003905</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.leuschke.org/quick/archives/2004_07.html#003905">
        things like this make me want to learn more PHP
    </content>
    <published>2004-07-04T03:01:05Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-ping:1848</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#p1848" />
    <title>Google Searches</title>
    <author>
        <name>Mama Write&apos;s Sideblog</name>
        <uri>http://www.mamawrite.com/sideblog/003721.html</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.mamawrite.com/sideblog/003721.html">
        Guiding how Google searches a website...
    </content>
    <published>2004-08-13T23:19:32Z</published>
</entry>

<entry>
    <id>tag:bradchoate.com,2004://4.1933-ping:1855</id>
    <thr:in-reply-to ref="tag:bradchoate.com,2004://4.1933" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms"/>
 
    <link rel="alternate" type="text/html" href="http://bradchoate.com/weblog/2004/07/02/restricting-google-on-my-terms#p1855" />
    <title>Fair and Balanced</title>
    <author>
        <name>News Goat</name>
        <uri>http://www.newsgoat.com/2004/08/19/1411/index.html</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.newsgoat.com/2004/08/19/1411/index.html">
         I really like reading services like Google News and Topix, which generate pages by pulling news from thousands of news sites. The variety of sources is nice, and the format makes it easy to scan the headlines. Topix even...
    </content>
    <published>2004-08-19T23:11:18Z</published>
</entry>

</feed>