Comment spam is the bane of bloggers everywhere. Usually, it comes canned and composed of questionable parts — an unfortunate reality, but not impossible to combat.
First of all, there is a vital difference between spam and a real comment. Spam attempts to sell you something, or to get you to click on a link without adding anything to the conversation. It’s like the guy you meet at a party, who keeps trying to sell you insurance while you try to make conversation with that lovely person you’ve just met. Online, one can have a lot of trouble separating legitimate comments from the ones trying to game the system, and some of the spammers are pretty clever.
Spam comments turn up everywhere; it’s a fact of life online. One recent high-profile example comes from a security news website, The H Security:
In a press release, Symantec announced its new web site for the football World Cup in South Africa, 2010NetThreats, and that announcement turned out to be a bad idea. Under almost every security tip published, there are comments from spammers with links for purses, T-shirts, metal parts, hotels, sport shoes, and other dubious sales offers. Distributed via comment spam, the links appear to all lead to more or less harmless online shops, but it would be easy for spammers to put in links leading to servers infected with malware.
Since then, Symantec has removed all comments and disabled the commenting function entirely. It’s a “scorched-earth” approach, but it is possibly the best way to deal with the staggering amount of spam the company’s site had accrued. According to the article quoted above, the incident happened because Symantec had skipped some basic precautions.
A significant part of my job as a professional blogger is moderating comment streams on a wide variety of websites and blogs. This usually means continuing the conversations that have been started and eliminating spam. Over the past six years, I’ve dealt with enough spam to choke a regiment. Today, I’d like to share some tips on identifying and beating it:
- Install a good spam-catcher plugin. If you are using WordPress, I advise installing the Akismet plugin. It’s not 100% perfect, but it is still the most effective spam blocker I’ve used. I deploy it on all WordPress blogs I build. (For TypePad users, there is AntiSpam, but, not having used it myself, I can’t comment firsthand on its effectiveness.)
- Moderate first-time commenters. After the first comment has been approved, the first-timer’s future comments can go live without moderation. While this has the advantage of saving time, it has its drawbacks. A spammer’s first comment is often a compliment, but, once it’s cleared, the spamming begins (see “generically effusive comments,” below).
- Use word verification/CAPTCHA. You’ve seen them: those fuzzy, distorted words you need to enter into a form field. They turn up on Facebook, on blogs, all over the place. Basically, it’s a simple Turing test to ensure that you’re human and not a spambot. The downside is that many people will not leave a comment if they have to fill out another form field.
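For readers who run their own blog software, the first-time-commenter rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `FirstCommentModeration` and its methods are hypothetical and do not correspond to WordPress, TypePad, or any plugin's actual API.

```python
class FirstCommentModeration:
    """Hold a commenter's first comment for review; publish later ones directly.

    Hypothetical sketch, not the API of any real blogging platform.
    """

    def __init__(self):
        self.trusted = set()   # commenters with at least one approved comment
        self.pending = []      # (commenter, body) pairs awaiting review
        self.published = []    # (commenter, body) pairs visible on the blog

    def submit(self, commenter, body):
        """Publish immediately for trusted commenters, otherwise queue for review."""
        if commenter in self.trusted:
            self.published.append((commenter, body))
            return "published"
        self.pending.append((commenter, body))
        return "pending"

    def approve(self, commenter):
        """Publish everything pending from this commenter and trust them from now on."""
        self.trusted.add(commenter)
        still_pending = []
        for item in self.pending:
            if item[0] == commenter:
                self.published.append(item)
            else:
                still_pending.append(item)
        self.pending = still_pending

    def ban(self, commenter):
        """Revoke trust and remove the commenter's comments, pending or published."""
        self.trusted.discard(commenter)
        self.pending = [c for c in self.pending if c[0] != commenter]
        self.published = [c for c in self.published if c[0] != commenter]
```

The `ban` method is there because of the drawback already mentioned: a spammer whose flattering first comment was approved can post freely afterward, so trust has to be revocable.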
Things to watch out for, because they often turn out to be spam:
- Extremely long or short comments. You may occasionally run across very long or very short comments that are legitimate, but most of the time those will turn out to be spam. Often, that very long comment is merely a string of links, of the penis-enlargement variety. Super short comments (like, “That’s great!”) are also suspicious, so give them a solid once-over when you see them.
- Gibberish. Every blogger has received nonsensical comments, whether they seem to come from someone who is struggling to communicate in English or are totally unintelligible. Unless the comment includes some direct reference to the blog post, I usually delete it. One of my favorite examples was a recent comment on my personal blog that just said, “Viagra monkey spank.” If it does not make sense, it is almost certainly spam.
- Generically effusive comments. “Nice site!” and other statements that offer non-specific praise are especially insidious. Most often, this amorphous praise is meant to get you to approve the comment. Since most people who moderate comments only moderate first posts, the spammers are then all set to pitch “natural male enhancement” and similar products. I get email notification of all comments, so I periodically catch and ban these.
- Long, unusual or absent URL. Is the link extremely long? Does it look like an affiliate link? That is comment spam. No link? That usually denotes a variation of the strategy used by the generically effusive comments described above. Once the spammers get one cleared, the barrage of garbage begins.
- Same comment, different IP. If you see the same comment come into your queue multiple times, each time from a different IP address, you very likely have a spammer on your hands, one who is switching IPs to avoid detection. Find him and ditch him.
- Keywords in the name field. Do you ever get a comment under a name like “Discount Prices on Used Floorboards”? You have just discovered someone trying to get a link back to his or her website by gaming your comment stream. Watch for search terms used in lieu of names.
- Symbol soup. Is the comment an unintelligible mess composed of odd symbols and strange fonts? If so, the “delete” key is your friend.
- Same comment on several blogs. If you think a comment looks a little too generic but aren’t sure, here is a great trick. Copy the first line or two and paste it into the search field on Google. If you get a ton of matching results, it is safe to say that you’ve discovered another spammer.
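If you are comfortable with a little scripting, several of the warning signs above can be turned into a rough pre-screening check. The Python sketch below is an illustration, not a replacement for Akismet or for human judgment. Everything in it is an assumption made for the example: the `Comment` fields, the word lists, and the thresholds (3 words, 300 words, 3 links, 100 characters, 40% symbols).

```python
import re
from collections import defaultdict
from dataclasses import dataclass

URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)
# Illustrative lists only; a real blog would tune these to its own spam.
GENERIC_PRAISE = {"nice site", "great post", "thats great"}
COMMERCIAL_WORDS = {"discount", "cheap", "prices", "buy", "sale"}


@dataclass
class Comment:
    author: str
    body: str
    ip: str
    url: str = ""


def normalize(text):
    """Lowercase, drop punctuation, and collapse spaces so near-identical text matches."""
    cleaned = re.sub(r"[^a-z0-9\s]+", "", text.lower())
    return " ".join(cleaned.split())


def spam_signals(comment, seen_ips_by_body):
    """Return the names of the warning signs this comment trips (thresholds are guesses)."""
    signals = []
    words = comment.body.split()
    if len(words) <= 3 or len(words) >= 300:
        signals.append("extreme length")
    if normalize(comment.body) in GENERIC_PRAISE:
        signals.append("generic praise")
    if len(URL_PATTERN.findall(comment.body)) >= 3:
        signals.append("many links")
    if len(comment.url) > 100:
        signals.append("long url")
    if set(normalize(comment.author).split()) & COMMERCIAL_WORDS:
        signals.append("keywords in name")
    visible = [ch for ch in comment.body if not ch.isspace()]
    if visible and sum(not ch.isalnum() for ch in visible) / len(visible) > 0.4:
        signals.append("symbol soup")
    key = normalize(comment.body)
    if key:
        known_ips = seen_ips_by_body[key]
        if known_ips and comment.ip not in known_ips:
            signals.append("same comment, different IP")
        known_ips.add(comment.ip)
    return signals


# Maps normalized comment text to the set of IPs that have posted it.
seen = defaultdict(set)
first = Comment("Discount Prices on Used Floorboards", "Nice site!", "1.2.3.4")
print(spam_signals(first, seen))
```

A comment that trips one signal is not necessarily spam, since short or link-heavy legitimate comments do exist. The sensible use of a check like this is to decide which comments get a closer look, not to delete anything automatically.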
So, there you go! Keep these things in mind, and it should be easy to keep your blog’s comments from looking like the menu at this well-known restaurant…
Source: “Symantec scores own goal: its World Cup web site is full of spam comments – Update,” The H Security, 07/09/10
Image by arnold | inuyaki, used under its Creative Commons license.