Published in Brighton, UK

Clagnut

Blacklisting comment spam

Everyone’s talking about it. Everyone’s getting it. The evil that is comment spam: blog comments and/or URLs which link to off-topic and usually questionable sites, posted with the sole purpose of improving a Google ranking.

I’ve been hit a few times recently, once by a huge HTML comment containing masses of links, similar to email spam. More common manifestations are short, innocent-looking comments posted with a dodgy URL, and rogue entries on my referrers page.

Well, it’s time to nip it in the bud. To start off, I’ve time-limited how often someone can post a comment – this should prevent the kind of robot attack that results in dozens of spam comments on the same post.
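A minimal sketch of that kind of time limit, in Python. The details are assumptions rather than what runs on this site: posters are keyed by IP address, the minimum gap is 60 seconds, and the `CommentThrottle` name is made up for illustration.

```python
import time


class CommentThrottle:
    """Refuse a comment if the same poster commented too recently."""

    def __init__(self, min_interval_seconds=60.0, clock=time.monotonic):
        # 60 seconds is an assumed default, not a value from the post.
        self.min_interval = min_interval_seconds
        self.clock = clock
        # Poster key (assumed here to be an IP address) -> time of the
        # last comment that was accepted from that poster.
        self.last_post = {}

    def allow(self, poster):
        """Return True and record the time if the poster may comment now."""
        now = self.clock()
        previous = self.last_post.get(poster)
        if previous is not None and now - previous < self.min_interval:
            return False  # too soon after the last accepted comment
        self.last_post[poster] = now
        return True


throttle = CommentThrottle(min_interval_seconds=60.0)
if throttle.allow("203.0.113.7"):
    print("comment accepted")
else:
    print("slow down")
```

A robot firing off dozens of comments in a row gets the first one through and the rest refused. A real version would also need to forget old entries so the dictionary does not grow forever.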

Following Simon Willison’s lead, I’ve also implemented a blacklist technique. Any comment or referral I judge to be spam will be deleted, and the offending domains will be blacklisted. Any future comments that contain links to those domains will be refused and the poster’s IP address logged. My blacklist is available at blacklist.txt. You are welcome to grab a copy of that file, no more than once every 24 hours, and use it as part of your own comment spam prevention system. As part of a growing decentralised web of trust, other good folks have also been posting their blacklists:

If you start using a similar system, drop me a line and I will use your blacklist as well. Please don’t merge other people’s blacklists into your own public list. If I find non-evil URLs in someone’s blacklist, I will unsubscribe from it, so all your hard work may be undone by someone else’s carelessness or maliciousness.
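Here is a rough sketch, in Python, of how the blacklist check described above could work. It is not the code running here. The file format (one domain per line, with `#` starting a comment) is an assumption, and every function name is made up for illustration. Each subscribed list is kept under its own source name rather than merged, so a careless list can be dropped without touching the others. Downloading the files (no more than once every 24 hours) is left out.

```python
import re
from urllib.parse import urlparse

# Finds http(s) links in free text; a deliberately simple pattern.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def parse_blacklist(text):
    """Parse a blacklist file (assumed format: one domain per line, '#' comments)."""
    domains = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip().lower()
        if line:
            domains.add(line)
    return domains


def linked_domains(comment_text, poster_url=""):
    """Collect hostnames from the poster's URL field and links in the comment body."""
    urls = URL_PATTERN.findall(comment_text)
    if poster_url:
        # People often type a bare domain into the URL field.
        urls.append(poster_url if "://" in poster_url else "http://" + poster_url)
    hosts = set()
    for url in urls:
        try:
            # Drop sentence punctuation that trails the link.
            host = urlparse(url.rstrip(".,;:!?)")).hostname
        except ValueError:
            continue  # malformed URL; skip it in this sketch
        if host:
            hosts.add(host.lower())
    return hosts


def is_blacklisted(host, blacklist):
    """True if host is a blacklisted domain or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in blacklist)


def check_comment(comment_text, poster_url, lists_by_source):
    """Return (source, host) pairs that matched; an empty list means accept."""
    hits = []
    for host in sorted(linked_domains(comment_text, poster_url)):
        for source, blacklist in sorted(lists_by_source.items()):
            if is_blacklisted(host, blacklist):
                hits.append((source, host))
    return hits


# Hypothetical lists; the domains are placeholders.
lists_by_source = {
    "mine": parse_blacklist("# my own list\nspam-pills.example\n"),
    "friend": parse_blacklist("casino.example\n"),
}
hits = check_comment("Great post! http://www.spam-pills.example/buy", "", lists_by_source)
if hits:
    print("refused:", hits)  # this is also where the poster's IP would be logged
```

Recording which list produced each hit makes it easier to spot a subscribed list that has started catching non-evil URLs.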

20 October 2003

§ Blogging

7 comments

Comments

  1.

    Have you seen James Seng’s Bayesian Filter for Comment Spam yet? Might be just the thing. http://james.seng.cc/archives/000152.html

    Jemal
    20 Oct 2003
    12:44 GMT
  2.

    Come to jessey.net and get a free debt enlargement!

    Only joking. It seems incredible to me that these creatures that do this kind of thing can justify their behavior. It is all a numbers game to them, of course, but it destroys the Internet and what it stands for. Bastards.

    I don’t even bother having comments on my blog. I’d rather not deal with it.

    Simon Jessey
    20 Oct 2003
    17:43 GMT
  3.

    And what do you do with a blacklist once you have one?

    Surely, you forgot to mention MT-Blacklist: http://www.jayallen.org/projects/mt-blacklist ...

    Jay Allen
    25 Oct 2003
    10:58 GMT
  4.

    Here is a lesson that I will give free of charge, but you may not want to blacklist it, so that others can view it and hopefully learn from it. I am spamming this blog page. For one thing, blog spammers don’t want this getting back to them in any way that will get them in trouble, so this is what I have done:
    fake email addy
    fake proxy setting
    a website with no contact info

    So if you can find a way to filter those out, you’ve got something. But it’s going to happen all the time unless you take out the “website” submission field so peeps can’t leave their mark on your site.

    jazz man
    15 Jan 2004
    22:36 GMT
  5.

    Another thing I noticed is that when people blacklist spammers from their site, they put the blacklisted domain right on their home page. That is so stupid, because they didn’t blacklist it: all they did was move that link to the first page, where everyone is going to see it before anything else. If you’re going to blacklist, just delete the whole thing. Makes sense, right? We don’t hang people from bridges at the entry ports to our cities anymore. That was years and years ago, for all the uneducated people out there.

    jazz man
    15 Jan 2004
    22:46 GMT
  6.

    What do u want? U’ll receive spam %(

    Battleship Potemkin
    2 Aug 2004
    20:54 GMT
  7.

    I think one solution is to not allow HTML tags at all, and no URLs!

    Mister Bär
    30 Oct 2004
    14:09 GMT

Comments are now closed on this post. If you have more to say please contact me directly.