204 Wordpress blogs with the Top Commentators plugin installed

Anybody who’s spent any amount of time building backlinks to their blog or site will know about the Top Commentators plugin for Wordpress.

In a nutshell, the Top Commentators plugin is a sidebar widget which shows the most active participants with a link to their site.

The kicker with this plugin is that by default, the links are dofollowed (although some bloggers do add a nofollow attribute to the links), as opposed to the nofollow which is automatically attributed to comment links in the default wordpress install.
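To make the dofollow/nofollow difference concrete, here's a minimal sketch of how a sidebar link might be built either way. This is NOT the plugin's actual code; the function name commentator_link and its parameters are made up purely for illustration.

```php
<?php
// Hypothetical helper, not taken from the Top Commentators plugin.
// Builds a commentator link, optionally with rel="nofollow".
function commentator_link(string $name, string $url, bool $nofollow = false): string
{
    // Without the rel attribute the link is "dofollowed" and passes pagerank.
    $rel = $nofollow ? ' rel="nofollow"' : '';

    return '<a href="' . htmlspecialchars($url, ENT_QUOTES) . '"' . $rel . '>'
        . htmlspecialchars($name, ENT_QUOTES) . '</a>';
}

$followed   = commentator_link('DaPimp', 'http://example.com');
$nofollowed = commentator_link('DaPimp', 'http://example.com', true);

echo $followed, "\n";   // <a href="http://example.com">DaPimp</a>
echo $nofollowed, "\n"; // <a href="http://example.com" rel="nofollow">DaPimp</a>
```

The only difference between the two is the rel attribute, which is why it's worth viewing the source of a blog's sidebar before you assume its links are followed.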

I thought you all might like a list of blogs with the Top Commentators installed, so I sent my trained web monkeys off on the mission to find me at least two hundred blogs with the plugin installed.

Monkeys, being the lazy bastards that they are, only found 204 before they staggered back to my blog, beer in one hand, ciggie in the other, wheezing incoherently about being chased off from most blogs because they didn’t have the right user agent.

So without further ado, a list of 204 wordpress blogs with the Top Commentators plugin installed, sorted by pagerank:

6 - http://www.floppingaces.net
6 - http://guntotingliberal.com
5 - http://www.jimseven.com
5 - http://www.giselejaquenod.com.ar
5 - http://www.entrepreneurs-journey.com
5 - http://www.ducttapemarketing.com
5 - http://www.didigetthingsdone.com
5 - http://www.bloggeries.com
5 - http://www.blogbloke.com
4 - http://www.wpthemesplugin.com
4 - http://www.stuffbysarah.net
4 - http://www.homewiththekids.com
4 - http://www.codenamerevolution.com
4 - http://www.catswhocode.com
4 - http://www.balkhis.com
4 - http://www.anewmorning.com
4 - http://starvoi.com
4 - http://selfmademinds.com
4 - http://optempo.com
4 - http://momgrind.com
4 - http://andrewferguson.net
3 - http://www.wpthemedesigner.com
3 - http://www.thegreatestinternetsalesman.com
3 - http://www.skamid.com
3 - http://www.marketingtechblog.com
3 - http://www.managingcommunities.com
3 - http://www.leochiang.com
3 - http://www.jonlee.ca
3 - http://www.jayceooi.com
3 - http://www.gensantos.com
3 - http://www.connectedinternet.co.uk
3 - http://www.cityteacher.net
3 - http://twentyfourcarat.net
3 - http://synchronity4change.wordpress.com
3 - http://martinwaiss.com
3 - http://linkersblog.com
3 - http://extremeezine.com
3 - http://blog.nestlepoell.net
3 - http://3gweek.net
2 - http://www.yankeeinnewworld.com
2 - http://www.vivsin.com
2 - http://www.stuckz.com
2 - http://www.myfew.net
2 - http://www.momunplugged.com
2 - http://www.mark-mcwilliams.com
2 - http://www.louisyagera.com
2 - http://www.kopionline.com
2 - http://www.gospelmusicbites.com
2 - http://www.debohobo.com
2 - http://www.chrissimpson.info
2 - http://www.brownbaron.com
2 - http://www.blogjer.com
2 - http://tutorialvine.com
2 - http://trucubed.com
2 - http://slyvisions.com
2 - http://mohdnajwan.putrabytes.com
2 - http://gajemaster.com
2 - http://entity11.com
2 - http://conceptpop.com
2 - http://buttonmashing.com
1 - http://www.work-from-home-job.com
1 - http://www.sahm-one.com
1 - http://www.malaysia-kini.com
1 - http://www.dasdurcheinander.de
1 - http://personalblogger.net
1 - http://marketsecrets.biz
0 - http://zonet.in
0 - http://yeepage.com
0 - http://www.zachishere.com
0 - http://www.xerraireart.com
0 - http://www.writingwhitepapers.com
0 - http://www.vikasrikhye.com
0 - http://www.tyleringramphotos.com
0 - http://www.travelfeeder.com
0 - http://www.toxic-web.co.uk
0 - http://www.thorschrock.com
0 - http://www.thechurchgeek.com
0 - http://www.teknobites.com
0 - http://www.techsnack.net
0 - http://www.strategyonline.co.za
0 - http://www.smartblogtips.com
0 - http://www.shankrila.com
0 - http://www.selfpreneurs.com
0 - http://www.remarkableblogging.com
0 - http://www.randombatch.com
0 - http://www.pureblogging.com
0 - http://www.programmiamo.it
0 - http://www.pjlighthouse.com
0 - http://www.peterleehc.com
0 - http://www.onlineopportunity.org
0 - http://www.niessuh.com
0 - http://www.nadlique.com
0 - http://www.montysmegamarketing.com
0 - http://www.moneyrumour.com
0 - http://www.momof3girls.net
0 - http://www.miss604.com
0 - http://www.madmouse.com
0 - http://www.luisescobarblog.com
0 - http://www.kongtechnology.com
0 - http://www.k-director.com
0 - http://www.jtpratt.com
0 - http://www.jamieharrop.com
0 - http://www.ite130.com
0 - http://www.internetmoneytechniques.com
0 - http://www.intelliot.com
0 - http://www.inspiritblog.com
0 - http://www.insightwriter.com
0 - http://www.indocontest.com
0 - http://www.hippowebsolutions.com
0 - http://www.greensahm.com
0 - http://www.gadgetguy.de
0 - http://www.freetipsandwits.com
0 - http://www.freedom-uplink.net
0 - http://www.find-freebies.org
0 - http://www.fiddyp.co.uk
0 - http://www.ezrasf.com
0 - http://www.ebooktechie.com
0 - http://www.darknet.org.uk
0 - http://www.cubiccapacity.com
0 - http://www.clipclip.org
0 - http://www.charlesheflin.com
0 - http://www.cashblog-n.com
0 - http://www.breaktheillusion.com
0 - http://www.bloggreens.com
0 - http://www.bloggingtips.com
0 - http://www.bloggingmix.com
0 - http://www.benh.org
0 - http://www.anatheist.net
0 - http://www.allinfodir.com
0 - http://www.agentstealth.com
0 - http://www.134u.com
0 - http://wordprezzie.com
0 - http://windows7news.com
0 - http://webvalley2008.fbk.eu
0 - http://weblogtoolscollection.com
0 - http://vishnu.gmurali.com
0 - http://themonetizedblogger.com
0 - http://synchelp.com
0 - http://suncoastscribe.com
0 - http://strongbodies.net
0 - http://stixblog.com
0 - http://simplycats.beetle-blog.com
0 - http://science.mikelopez.info
0 - http://rockersworld.com
0 - http://pinoymaritime.com
0 - http://pantsinacan.com
0 - http://outstandingblogger.com
0 - http://nextbigtrends.com
0 - http://netbizsimplified.com
0 - http://nerdyparenting.com
0 - http://mrjavo.com
0 - http://moneybites.com
0 - http://melarky.com
0 - http://lianaandmason.com
0 - http://laowaichinese.net
0 - http://karensopinion.com
0 - http://justintadlock.com
0 - http://indianidolshow.com
0 - http://howtoboatblog.com
0 - http://getrichtalks.com
0 - http://forums.wpthemes.info
0 - http://forexfrat.com
0 - http://florchakh.com
0 - http://divapromotions.com.au
0 - http://blogs-secrets.blogspot.com
0 - http://bloggervenue.com
0 - http://bloggeroftheweb.com
0 - http://blogdesignstudio.com
0 - http://blog.mixterr.com
0 - http://blog.automobilebestbuys.com
0 - http://blazingminds.co.uk
0 - http://bestincentiveaffiliates.com
0 - http://babymonster.net
0 - http://assessmyblog.blogspot.com
0 - http://aravindjose.com
0 - http://ajaydsouza.com
0 - http://ahkong.net
0 - http://20somethingfinance.com
0 - http://zhxhome.net
0 - http://yieldtopedestrian.com
0 - http://www.winningexback.com
0 - http://www.shivaranjan.com
0 - http://www.realworldreally.com
0 - http://www.psyc3d.com
0 - http://www.melovillareal.com
0 - http://www.livelearninvest.com
0 - http://www.jamesatracy.com
0 - http://www.harvestofdailylife.com
0 - http://www.dimla.net78.net
0 - http://www.bloghonour.com
0 - http://spblogger.com
0 - http://sensetosave.com
0 - http://prayersforblowouts.com
0 - http://onemansgoal.com
0 - http://mybusinessmusings.com
0 - http://mattnutts.com
0 - http://logicmakesnosense.com
0 - http://littlepeanut.info
0 - http://julianasworld.com
0 - http://julianaslair.com
0 - http://flimjo.com
0 - http://azwan.my
0 - http://azoogleadstips.com
0 - http://azlan.anilezfa.com


Wordpress comment spam footprints, and how to get rid of them.

My post the other day about how you should spam my blog had a couple of interesting reactions, so today I thought I’d take a look at how a wordpress comment spammer actually spams your blog, because if we know HOW he spams you, we can look at some ways of stopping him.

Here’s the process that a spammer follows in order to drop his crap on your blog:

  1. He finds your blog
  2. He analyses your blog
  3. He posts to your blog

Let’s look at each of these steps, and work out how we can stop him:

How Spammers Find Your Blog:

The spammer finds your blog by using what’s called a footprint. A footprint is some common text that appears on most of the pages he wants to find.

In our case, the most obvious footprint a spammer is going to find on a wordpress blog is the Powered by Wordpress link in the footer.

Our spammer is using a tool like the PHP Google Blogsearch URL Scraper. Using this, and a bunch of keywords, he’s able to scrape thousands of blog URLs an hour.

Let’s take a look at the part of the URL Scraper which tells Google Blogsearch what to look for:

http://blogsearch.google.com/blogsearch_feeds?hl=en&q=%22keyword%22+%22powered+by+wordpress%22&ie=utf-8&num=100&start=0&output=rss

See the %22powered+by+wordpress%22 bit? That’s part of his search. He knows that Wordpress blogs, by default, have a link back to wordpress.org in the footer, with the text “Powered By Wordpress”.

So the first thing we’re going to want to do is to remove that footprint: Go to your Wordpress dashboard, and click the Design tab at the top left of the window. Then click the Theme Editor tab, then look over to the right of the window and click the link to edit the Footer (footer.php).

In this file, you’re looking for the following line:
proudly powered by <a href="http://wordpress.org/">WordPress</a>

You want to either change or delete this line. You can change it to anything you like, just make sure you don’t have the text Powered by Wordpress anywhere on your page.
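If you want to double check that a template is clean after editing it, something like the sketch below would do. The function name find_footprints and the phrase list are my own assumptions for illustration; in real use you'd feed it the contents of your own footer.php or a saved copy of a rendered page.

```php
<?php
// Hypothetical helper: report which footprint phrases still show up
// in a chunk of template or page markup.
function find_footprints(string $html, array $phrases): array
{
    // Strip the tags first, so a phrase that is split by a link
    // (powered by <a>WordPress</a>) is still caught.
    $text = preg_replace('/\s+/', ' ', strip_tags($html));

    $found = [];
    foreach ($phrases as $phrase) {
        // Case-insensitive, because search engines don't care about case either.
        if (stripos($text, $phrase) !== false) {
            $found[] = $phrase;
        }
    }
    return $found;
}

$footer = 'proudly powered by <a href="http://wordpress.org/">WordPress</a>';
$hits = find_footprints($footer, ['powered by wordpress', 'Name (required)']);

print_r($hits); // only 'powered by wordpress' is reported for this footer
```

An empty array means none of the phrases you listed are left in that markup.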

Now our average blog spammer is no idiot, actually, he’s a pretty smart cookie, and he knows that the search used above, on its own, may actually return Wordpress blog pages that have comments turned off. Now there’s no point him using up his bandwidth trying to post comments to pages which don’t even have a comment form, so he might also, while he’s doing the above search, try to work out whether there’s actually a comment form on the page.

So now he’s looking for a footprint which tells him that there’s a comment form available to him, so let’s take a look at a standard wordpress comment form, and see if there’s any kind of footprint:   

[Image: Wordpress Comment Form Footprints]

As you can see, there are two main footprints the spammer will be looking for (generally he’ll be looking for one or the other), so now his Google Blogsearch search string might look something like this:

http://blogsearch.google.com/blogsearch_feeds?hl=en&q=%22keyword%22+%22powered+by+wordpress%22+%22Name+(required)%22&ie=utf-8&num=100&start=0&output=rss

So if we want to deter the particularly clever spammer, we need to change these footprints too. Now we want to go back to where we edited our footer file previously, and click the Comments link (comments.php), then look for the two lines we’ve highlighted in the image above, and change them (after making a backup of this file, of course!).

First, look for this line:

<label for="author">Name <?php if ($req) echo "(required)"; ?></label>

You can safely change either the label text (Name) or the echoed text ((required)), maybe to something like this:

<label for="author">Name (or nickname) <?php if ($req) echo "(mandatory)"; ?></label>

Now go on to look for this line in comments.php:

<label for="email">Email Address<?php if ($req) echo "(required)"; ?></label>

And change as you’d like.

OK, so now we’ve addressed the issue of the spammer actually finding your blog. If you make these changes, although it won’t get you off the list of spammers who are already hitting your blog, it should stop most spammers from actually finding you in the first place.

Now let’s look at how a spammer analyses your blog once he has actually found it. 

Once an average spammer has found your blog post using the scraper mentioned above, he is going to try to post a comment using cURL to your wp-comments-post.php file, assuming you have one. All he’s going to do is parse the URL of the post back to the root, add /wp-comments-post.php to the end of it, and try to post. See the example below:

Link spammer finds on blogsearch: http://pimpmypagerank.com/2008/11/google-blogsearch-url-scraper/

Parsed domain using parse_url($url, PHP_URL_HOST); (plus the http:// scheme): http://pimpmypagerank.com

Add wp-comments-post.php to the end of it: http://pimpmypagerank.com/wp-comments-post.php

And that’s where he tries to post his comment. But there are a couple of things that can throw a spanner in the works for him. Firstly, what if your blog sits in a subdirectory, such as http://yourdomain.com/blog?
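Here's a rough sketch of that naive approach, just to show why a subdirectory install trips it up. The function name naive_post_url and the example.com URL are assumptions of mine, not code from any real tool.

```php
<?php
// Hypothetical helper mirroring the naive approach described above:
// keep only the scheme and host, then tack /wp-comments-post.php on the end.
function naive_post_url(string $postUrl): string
{
    $scheme = parse_url($postUrl, PHP_URL_SCHEME);
    $host   = parse_url($postUrl, PHP_URL_HOST); // host only, any path is thrown away

    return $scheme . '://' . $host . '/wp-comments-post.php';
}

// Blog installed in the domain's root: the guess is right.
$rootGuess = naive_post_url('http://pimpmypagerank.com/2008/11/google-blogsearch-url-scraper/');

// Blog installed in /blog: the guess lands outside the Wordpress install.
$subdirGuess = naive_post_url('http://example.com/blog/2008/11/some-post/');

echo $rootGuess, "\n";   // http://pimpmypagerank.com/wp-comments-post.php
echo $subdirGuess, "\n"; // http://example.com/wp-comments-post.php
```

The second guess is missing the /blog part, so anything posted there never reaches the comment handler.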

Well, using the above method, he’s screwed, but this is where the clever spammer comes into his own. He’s actually going to download (automatically, of course, as part of his script) your page, and do some further analysis.

So let’s say the spammer wants to download the page he found using the scraper, he might use something like the cURL code below:

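//The page we want to download
$url = "http://the-url-of-the-page-we-are-downloading.com";

$ch = curl_init();

$userAgent = 'I iz in ur blog, stealing ur pagerank';

//Send our user agent string with the request
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);

//Tell cURL which page to fetch
curl_setopt($ch, CURLOPT_URL, $url);

//Return the page as a string instead of printing it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

$data = curl_exec($ch);

curl_close($ch);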

So now our clever spammer has the HTML of our page sitting in a variable called $data, and he’s free to do whatever he wants with it.

The first thing he’s probably going to want to do is find the root folder of your blog. Although the above parse_url($url, PHP_URL_HOST); works for the 90% odd of blogs which are installed in the domain’s root, while he’s got the page anyway, he might as well make sure where it is.

Looking through the HTML of your page, he wants to find any links which point somewhere else on the blog, links which appear in every page (so we know they’re not editorial links in posts, but actually links which are hard-coded into the code).

Oh look, here’s one:

<link rel="pingback" href="http://www.domain.com/subdirectory/xmlrpc.php" />

 

It’s our pingback address. It appears in every page, is hard coded, and links relative to the root of the wordpress install.

So now with a simple bit of regex:

 

('/link rel="pingback" href="(.+?)xmlrpc\.php"/');

 

Our spammer knows the root of our wordpress install, so knows what to tack his wp-comments-post.php on the end of. In this example, our post URL would be:

 

http://www.domain.com/subdirectory/wp-comments-post.php 
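To see how much your own page gives away, you can run the same kind of pattern over a saved copy of your HTML. This is a sketch under my own assumptions: the function name blog_root_from_pingback is made up, and the pattern only handles the exact attribute order shown in the pingback line above.

```php
<?php
// Hypothetical helper: pull the install root out of the pingback link,
// or return null when the page doesn't have one in this exact form.
function blog_root_from_pingback(string $html): ?string
{
    if (preg_match('/<link rel="pingback" href="(.+?)xmlrpc\.php"/', $html, $m)) {
        return $m[1]; // everything before xmlrpc.php, trailing slash included
    }
    return null;
}

$html = '<head><link rel="pingback" href="http://www.domain.com/subdirectory/xmlrpc.php" /></head>';
$root = blog_root_from_pingback($html);

echo $root . 'wp-comments-post.php', "\n"; // http://www.domain.com/subdirectory/wp-comments-post.php
```

If this returns null for your page, that one particular footprint isn't there to be found.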

 

Ah! But you can just change the name of your wp-comments-post.php, I hear you say. Well, you can, but if our spammer’s got this far, that’s probably not going to stop him. He’s just going to find the name you changed it to anyway. Let’s say you changed it to “whoop-de-doo.php”, you’re still going to have the following footprint somewhere in your code:

 

<form action="http://www.domain.com/subdirectory/whoop-de-doo.php" method="post" id="comment-form">

 

…which our spammer will regex out using something like:

 

('/<form action="(.+?)" method="post" id="comment-form">/');

 

to give him our comment post URL:

http://www.domain.com/subdirectory/whoop-de-doo.php

which he can post to just as easily as if he were posting to wp-comments-post.php

So what can we do here to stop him? Well, we can mix up the attributes of the form, maybe instead of

<form action="http://www.domain.com/subdirectory/whoop-de-doo.php" method="post" id="comment-form">

we might try something like

<form method="post" id="comment-form" action="http://www.domain.com/subdirectory/whoop-de-doo.php">

While this won’t stop the cleverest of spammers, it will stop the ones which stop at one footprint (the really smart ones will have a regex for each variation on the above.)
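You can check this for yourself with a couple of lines. The sketch below runs the single-footprint pattern from earlier against both versions of the form tag; the function name footprint_matches is made up for the example.

```php
<?php
// Hypothetical helper: does the single-footprint pattern find the form tag?
function footprint_matches(string $formTag): bool
{
    $pattern = '/<form action="(.+?)" method="post" id="comment-form">/';

    return preg_match($pattern, $formTag) === 1;
}

$original = '<form action="http://www.domain.com/subdirectory/whoop-de-doo.php" method="post" id="comment-form">';
$shuffled = '<form method="post" id="comment-form" action="http://www.domain.com/subdirectory/whoop-de-doo.php">';

var_dump(footprint_matches($original)); // bool(true)
var_dump(footprint_matches($shuffled)); // bool(false)
```

Same form, same behaviour in the browser, but the lazy pattern comes up empty on the shuffled version.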

So now we’ve covered how a spammer finds your blog, and how he analyses it for the information he needs to post to your blog. Lastly, he needs to actually post to your blog.

Unfortunately, once he’s got this far, there’s not a lot you can do that a piece of software such as Akismet or Spam Karma won’t do anyway, but hopefully, if you follow the steps outlined above, the dirty filthy spammers won’t find you, or be able to work out how to post to your blog anyway.

I’m sure there are other things we can do, do you know any? Any suggestions?

I’m sure someone cleverer than me could even roll this stuff up into a plugin fairly easily….


How to spam my Wordpress blog PROPERLY

Here’s a message to you wordpress spammers - if you’re going to spam my blog, at least show a little ingenuity.

If you want to drop your shit here, try to be a little more creative than just searching technorati for tags, then auto-posting some mindless drivel such as

PHP is a good programming language. Most people like dynamic websites which they can interact with, which PHP allows you to do.

I mean really, people, show some bloody creativity.

How about something like

I found your post while searching for simple tutorials on how to create a do while statement in PHP. Although your post didn’t answer my question, you’ve got some interesting posts here, I’ve subscribed to your feed.

See, now you’ve actually gone a little further than some generic bullshit that’s going to hit the trash as soon as I open up my comment moderation control panel.

You’ve addressed a specific issue, and you’ve complimented me. I like you (or more to the point, your bot) already.

Another thing you might do is to ask me a question - you’d be surprised how well bloggers such as myself respond to being asked questions. We’ll probably even answer them, although you’ll never see the answer.

You could phrase your question like this:

I quite like your theme, did you do the header graphic yourself, or did you pay someone to do it? If it was professionally done, do you mind me asking who did it and how much it cost?

See, you’re interacting with me without even knowing it!

So come on spammers, get off your lazy asses and put some thought into your automated comments, I think you’ll find that your “stick rate” will be much, much higher.


PHP Google Blogsearch URL Scraper

Sometimes you just need a crapload of URLs from Wordpress blogs. It’s nobody’s business why you need them. If you need them, you need them.

Enter DaPimp’s Google Blogsearch URL scraper.

In a nutshell, this script grabs the first 1,000 results of Wordpress blogs for a given keyword, and spits them out in a nice list for you.

**You’ll need PHP5, and a server with cURL enabled for this script to work**

Instructions for use:

  1. You can either download the script here (change the file extension to .php), or just copy and paste the code below (wordpress buggers up the quote marks in code, so you’ll probably need to go and replace them manually, so just download the script, it’s much easier)
  2. Open the script in a text editor, and change the $keyword variable at the top to the keyword you want to search for
  3. Save the script and upload it to your server
  4. Navigate to the script in your browser, and wait, you’ll get your list

============ Start PHP Script ================

<?php

//give the script a keyword to search for
$keyword = "ipod touch";
$keyword = str_replace(" ", "+", $keyword);

//start a counter so we can number our results
$num = 0;

//set a start for our paging of Google Blogsearch (we're going to be getting 10 pages X 100 results)
$start = 0;

do {

//Create the feed URL we're going to get from Google Blogsearch
$feed = 'http://blogsearch.google.com/blogsearch_feeds?hl=en&q=%22' .$keyword. '%22+%22powered+by+wordpress%22&ie=utf-8&num=100&start=' .$start. '&output=rss';

//We're using cURL to actually go fetch the page from Google Blogsearch
$ch = curl_init($feed);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$page = curl_exec($ch);
curl_close($ch);

//Stop if the fetch failed, an empty page would make SimpleXMLElement throw
if ($page === false || $page === '') {
break;
}

//Loop through the feed, and suck out the URLs
$xml = new SimpleXMLElement($page);

foreach ($xml->channel->item as $item) {

//Add 1 to our counter, so our list has numbers next to the URLs
$num = $num + 1;

$link = $item->link;

//Print our shit to the page
echo $num. ' - <a href="' .$link. '">' .$link. '</a><br>';

}

//Have a rest so we don't get banned for hitting Google too hard and fast
sleep(30);

//Add 100 to the start, so we can fetch the next 100 results
$start = $start + 100;

}

//Keep doing this shit until we get to page 10 of the Google results
while ($start < 1000);

?>

============ End PHP Script ================
