Battling Comment Spam — Moderating Comments

In our past two posts we’ve looked at what comment spam is and how we can minimize it. Now comes the hard part: individually looking at the comments that make it past our defenses and deciding what to do with them.

Detecting Spam — Moderation Queue

If you’ve got Akismet or some other spam filter set up then you will have some potential spam comments in your moderation queue to deal with.  Most of the blatant spam comments you’ll recognize and you can just skip over them.  Once we’ve moderated the tricky ones then we’ll just delete everything left in the queue.

However, recognizing Sneaky spam takes a bit of learning. If you have only one blog then you’ll be less likely to spot the work of spambots, because you won’t see them hit several of your blogs with the exact same comment. But over time you’ll see patterns that help you to recognize that a comment is likely spam.

Name

First of all, check out who it is coming from.  Is the name a keyword or a real name (or at least a nickname like LoneWolf)?  If there isn’t a real name then that is a flag — not necessarily proof mind you.

URL and Email

Next, look at the URL that they entered and the email address.  Do they match?  Do they make sense?  If not, there’s another flag.  Keep in mind that many users set up throw away email addresses to reduce email spam so you may still have a legitimate comment even though the email looks strange.
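
The “do they match?” check can be made a little more concrete by comparing the domain of the email address with the domain of the URL. What follows is only a minimal Python sketch: the function name `domains_match` and the short list of free mail providers are illustrative assumptions, and a mismatch is a flag to weigh, not proof.

```python
from urllib.parse import urlparse

# Hypothetical, deliberately short list of free mail providers; extend as needed.
FREE_MAIL = {"gmail.com", "yahoo.com", "hotmail.com", "outlook.com"}

def _base(host):
    """Lowercase a host name and drop a leading 'www.'."""
    host = host.lower().strip()
    return host[4:] if host.startswith("www.") else host

def domains_match(email, url):
    """Return True, False, or None (cannot tell) for an email/URL pair."""
    if "@" not in email or not url:
        return None
    mail_domain = _base(email.rsplit("@", 1)[1])
    if mail_domain in FREE_MAIL:
        return None  # a free mail address says nothing about the site
    if "://" not in url:
        url = "http://" + url  # urlparse needs a scheme to find the host
    site_domain = _base(urlparse(url).hostname or "")
    if not site_domain:
        return None
    # Match the same domain, or a subdomain of the email's domain.
    return mail_domain == site_domain or site_domain.endswith("." + mail_domain)
```

A result of `None` means the pair tells you nothing either way, which is the common case for commenters using free or throwaway addresses.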

Avatar

Another flag that indicates possible spam is the lack of a Gravatar.  Spammers rarely have them but keep in mind that the lack of one does not prove that the comment is spam.  It is just another clue.
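
For anyone who wants to check for a Gravatar outside the WordPress admin screen, the image address can be derived from the email address. This is a minimal Python sketch, assuming Gravatar's MD5-based URL scheme and its `d=404` parameter; the function name `gravatar_url` is made up for illustration.

```python
import hashlib

def gravatar_url(email):
    """Build the Gravatar image URL for an email address.

    The d=404 parameter asks Gravatar to answer with HTTP 404 instead of a
    default image when no avatar is registered for the address.
    """
    normalized = email.strip().lower()  # Gravatar hashes the trimmed, lowercased address
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    return "https://www.gravatar.com/avatar/" + digest + "?d=404"
```

Requesting that address and getting a 404 back would suggest the commenter has no Gravatar. As noted above, that is a clue and nothing more.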

Content

Ultimately, you have to look at the content of the comment itself.  Does it relate to the post?  Does it add value to the conversation?  Many spam comments are very generic and usually complimentary (although I’ve seen those that issue a generic challenge).  You’ll see things like “Great post!” or “You write very well.  Are you a professional?” While it is possible that these are legitimate comments (they usually aren’t), they don’t really add to the conversation.  They do feel good though — if they’re from a real person who wrote them sincerely.

Spammers are becoming more creative, and have taken to using quotations from blogs and/or comments to create the comments that they send.  They also have comments that are related to keywords and target blogs that mention them.  This makes it a little more tricky to catch the spam.

One tool that I use is Google.  If I have a comment that I’m not sure about, I’ll cut and paste it into Google search with double quotes around it to look for exact matches.  You’ll be surprised to see the exact same comment appear in dozens or even hundreds of search results.

But even then, the spammers are getting smarter.  Just yesterday I got the following comment on one of my blogs.

I can’t understand how to add your blog to my rss reader. some recomendations are appreciated I really want to see your articles.

It seemed like a reasonable request for help, but I checked it in Google just to be sure. No matches! Well, let’s help this person out. I sent them an email with a link to an RSS tutorial. Guess what! No such email address existed. So I did another search, this time with only the first sentence. Bingo! Ding! Ding! Ding! We have a winner! Dozens of matches, each with a slightly different wording of the second sentence.
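
The two searches described above (the whole comment in double quotes, then one sentence at a time) can be prepared with a few lines of Python. This is a rough sketch and not a tool used for this post: the helper names are invented, and the sentence splitter is deliberately naive.

```python
import re
from urllib.parse import quote_plus

def split_sentences(text):
    """Very rough splitter: break after ., ! or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def exact_search_url(phrase):
    """Return a Google search URL for the phrase wrapped in double quotes."""
    return "https://www.google.com/search?q=" + quote_plus('"' + phrase + '"')

comment = ("I can't understand how to add your blog to my rss reader. "
           "some recomendations are appreciated I really want to see your articles.")

# Search the whole comment first, then each sentence on its own.
queries = [exact_search_url(comment)]
queries += [exact_search_url(s) for s in split_sentences(comment)]
```

Opening each address in `queries` in a browser repeats the manual check. If the full comment finds nothing but a single sentence finds plenty, the comment was probably assembled from a template.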

Ultimately, you are going to have to decide whether a comment is useful for the conversation on your blog.  You may sometimes block a legitimate user’s comment, but that is rare and they should have made a better comment in the first place.

Spammer Databases

There are many people out there who are dedicated to battling spam in many forms. One group that I’ve found helpful is Stop Forum Spam. This group has set up a database of known spammers that you can check to see if the name, email address or IP address matches a known spammer.

This group was formed to deal with people who sign up to forums in order to spam them and I discovered them when looking at the signups for my Drupal based site (Master It). I decided to test the spam comment that I mentioned above and found that the IP address was a match in their database. So, this may be a good resource for bloggers to use as well. They do have an API for checking and reporting spammers, so I can imagine a plugin at some point.
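
Until such a plugin exists, a lookup could be scripted by hand. The Python sketch below is an assumption-heavy illustration: the endpoint, the `f=json` parameter, and the `appears` field reflect my understanding of the Stop Forum Spam API and should be checked against their current documentation, and the function names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; confirm against the Stop Forum Spam API documentation.
API_URL = "https://api.stopforumspam.org/api"

def build_lookup_url(ip=None, email=None, username=None):
    """Build a lookup URL from whichever details of the commenter are known."""
    params = {"f": "json"}  # assumed switch for a JSON response
    if ip:
        params["ip"] = ip
    if email:
        params["email"] = email
    if username:
        params["username"] = username
    return API_URL + "?" + urlencode(params)

def listed_fields(response_text):
    """Return which of ip/email/username the response reports as listed."""
    data = json.loads(response_text)
    hits = []
    for field in ("ip", "email", "username"):
        entry = data.get(field)
        # Assumed shape: each field is an object with an "appears" flag.
        if isinstance(entry, dict) and entry.get("appears"):
            hits.append(field)
    return hits
```

Fetching the built address (for example with `urllib.request.urlopen`) and passing the response body to `listed_fields` would show which details matched. The network call is left out here so the sketch stays runnable offline.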

Detecting Spam — The Rest

Now that you’ve dealt with the moderation queue, you still need to look at the comments that made it through the filters and plugins (unless you’re moderating everything, and you’re not doing that, are you?). But this should be fun. This is where you’re seeing actual conversations. There should be very little spam that made it this far, if any.

So read the comments, respond and enjoy.  This is a big part of why we blog in the first place.

Reporting Spam

There is one last thing to consider before we leave this topic.  Tools like Akismet and Stop Forum Spam will only work if we all report the spam we receive.  They use existing spam to be able to detect future spam, so make sure that you report the spam and spammers using the tools that you set up for your blog.  We’ll all have less spam to deal with in the long run.

Conclusion

So that leaves us with just one thing left … your comments!  What do you use to help battle comment spam?  What is the most creative spam that you’ve ever seen on your blog?  Share your thoughts below.


Update!

While this was intended to be the last post in this series, an ironic twist presented itself and I couldn’t pass up the opportunity for another post.  See Battling Comment Spam — A Real Life Example.

This post was part of a series originally posted at my blog Ramblings. I feel that this series is a good fit for LMA so I’ve reposted them here.


Female Warrior 2 image by EdwinP at stock.xchng

Battling Comment Spam – Dealing With It

Combating Spam

In the previous post we looked at what blog comment spam is.  We defined three different types and looked at three different sources.  But now we want to know, “How do we fight back?”  Comments are a valuable part of blogging and the social web.  They are vital for building community.  But it takes time to moderate comments.  What ways can help us handle the load?

Shutdown

Some bloggers have given up the fight.  They shut comments off completely and the blog becomes a one-way rant rather than a conversation.  I find this very frustrating.   It cuts out an important part of the blog experience and doesn’t help the community.  There are no conversations, no backlinks, no accountability.  This is appropriate for a corporate information site, but not for a blog.

Laissez Faire

Others have given up the fight by going in the opposite direction.  They just auto approve everything.  They get lots of spammy comments (and I’m sure that the spammers share this information) and let them sit in amongst the true conversation — kind of like weeds in a garden.  This works well as long as you don’t mind the type of spam coming in.  Some of it is actually quite creative and even fits into your blog.

But what happens when you start seeing the pornography comments?  Or the “600 link” comments?  What happens when your visitors see them?

Middle Ground

Most of us have to live somewhere in the middle ground between these two extremes.  But how do we handle it without going crazy?  Well there are lots of techniques that are in use right now and you need to find a combination that works well for you.

I use mostly WordPress blogs, so this will lean more towards WP but most of these techniques should be available in other blog software.

Blog Settings

WordPress Discussion Settings (Part 1)

WordPress Comment Settings

WordPress allows you to set up some basic comment moderation.  There are several settings:

Discussion

  1. Require commenters sign up for your blog.  I know that always turns me away from commenting on someone’s blog — I don’t have time to register and remember another userid/password.  Check out Are You Chasing Your Blog Audience Away? for a well written post on this subject.  Bottom line, don’t do this unless you’re building a forum.
  2. All comments are put into the moderation queue.  This is the kind of work we’re trying to avoid, so let’s see what else there is.
  3. Allow users who’ve already had approved comments on your blog to post without moderation.  This will cut down on the amount of work required if you have a lot of repeat commenters.  But keep in mind that spammers know this and will often put in 1 or 2 good comments to get past this and then start spamming.
  4. Allow all comments.  Believe it or not, this is the route that I use on my blogs although I have some plugins that help identify spam.

Filters

WordPress Discussion Settings (Part 2)

WordPress Comment Filter Settings

There is also a section of comment filters that is applied to every incoming comment regardless of the settings described above.  This allows you to set up general filters that look for certain keywords or multiple links.

I’ve left these alone as the plugins that I use will do a better job of catching these types of spam comments.
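
To give a feel for what this kind of filter does, here is a small Python sketch of the idea: hold a comment when it has too many links or contains a listed word. It is not WordPress's own code, and the function name, the default threshold of two links, and the word list are all assumptions made for illustration.

```python
import re

def should_hold(comment_text, max_links=2, held_words=()):
    """Return True if the comment should wait in the moderation queue."""
    # Count anything that looks like the start of a web link.
    link_count = len(re.findall(r"https?://", comment_text, flags=re.IGNORECASE))
    if link_count >= max_links:
        return True
    # Case-insensitive substring match against the held-word list.
    lowered = comment_text.lower()
    return any(word.lower() in lowered for word in held_words)
```

For example, `should_hold("cheap PILLS here", held_words=["pills"])` holds the comment, while a comment with a single link and no listed words goes straight through. Substring matching is blunt, so a real word list needs care to avoid catching innocent comments.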

Avatars

WordPress Discussion Settings (Part 3)

WordPress Avatar Settings

Finally, there is the avatar.  If you’re not familiar with this concept, I’d suggest that you check out Gravatar, the de facto standard for avatar handling on the web at this point.  It allows users to have a profile picture that follows them around the web.  Set it up once and it is there for any site that allows them to be used.  This functionality is built into WordPress and most other CMS and blog systems.

The advantage to having Gravatar enabled on your site is that spammers rarely have one.  They are based on email addresses and spambots use throwaway addresses.  This will be a big help when moderating the comments that get into the queue (or even those that get through).  Keep in mind that the absence of a Gravatar is not a spam indicator by itself — many legitimate users don’t use them yet leave thoughtful and useful comments.

Plugins

Now that we’ve done what we can do with WordPress out of the box, we can now start to tinker.  If you go to the Install Plugins page and enter the keyword spam you’ll be presented with a list of plugins that deal with spam related issues.  The current list shows 19 entries.  Some of them are older plugins that are no longer supported (or needed).  There is even one that lets you turn off the colour coding for spam entries so they don’t clash with the admin theme colours.

But of the rest, there are 2 major classes of plugins — those that try to prevent or slow down spammers and those that try to determine which comments are spam after the fact.

Prevention

These plugins use different techniques to ensure that the comment is coming from a live person rather than a spambot.  They’re usually pretty effective and use techniques such as CAPTCHAs or mathematical questions that are hard (but not impossible) for a spambot to crack.

They work pretty well at keeping out most of the spam, but they may also keep out a lot of legitimate comments.  I know that I hate them and I doubt that I’m alone.  They make for an extra step to leave a comment.  And no matter how politely they are presented, the implication is that you don’t trust me.  For this reason alone, I don’t plan on using this type of plugin to combat spam.

Detection

Detection is the other route.  These plugins will scan comments that come in, looking for various characteristics that indicate spam.  The best of them use databases to compare comments against.  Over time they become more accurate.  They will detect potential spam comments and either delete them or put them into the moderation queue for you to check.

I like this route.  It allows most legitimate comments to come through without any intervention or extra steps on the commenter’s part.  The comments show up immediately.  And any questionable comments will end up in the moderator’s queue where you get to decide.

My favourite plugin for spam detection is Akismet, which comes built in to WordPress.  You’ll need to get a free API key to allow the plugin access to the database, but that’s all.  The API key works for multiple sites and there are Akismet plugins for other CMS products (for example, I have a Drupal site with Akismet enabled).

The Future

What does the future hold?  Well, if the past is any indication, spam will continue to be a problem for bloggers.  As long as it gives them a benefit (i.e. traffic and/or backlinks) that outweighs the costs they will continue to find ways to put comments in our blogs.  Hopefully platforms like WordPress will be able to introduce tools to reduce spambots.  I’m hoping to see a mod that will use nonces to bounce the bots.  I don’t know if it would work 100% and there are some other issues with it.  But it may be one way to make things harder for them.
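
To make the nonce idea concrete, here is a minimal Python sketch of one possible approach, not a description of any existing WordPress feature. The page embeds a signed, time-stamped token in the comment form, and the server rejects submissions that arrive without a valid one. The secret, the one-hour lifetime, and the function names are all assumptions.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-long-random-secret"  # hypothetical value
MAX_AGE = 3600  # seconds a token stays valid (assumed)

def issue_nonce(post_id, now=None):
    """Create a token to embed as a hidden field when the form is rendered."""
    issued = int(time.time() if now is None else now)
    message = f"{post_id}:{issued}".encode("utf-8")
    signature = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"{issued}:{signature}"

def verify_nonce(post_id, token, now=None):
    """Accept the comment only if the token is genuine and not too old."""
    current = int(time.time() if now is None else now)
    try:
        issued_text, signature = token.split(":", 1)
        issued = int(issued_text)
    except (AttributeError, ValueError):
        return False  # missing or malformed token
    message = f"{post_id}:{issued}".encode("utf-8")
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    # Constant-time comparison of the two signatures.
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        return False
    return 0 <= current - issued <= MAX_AGE
```

A bot that posts straight to the comment URL without loading the page has no token and is turned away. A bot that fetches the page first can still pick one up, which is one reason this would not work 100%. The token here is time-limited and not strictly single-use.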

In the meantime, we have to continue to be vigilant in our fight against spam.  We need to look at our strategy to keep spam out of our blogs while encouraging good communities.  It isn’t an easy task, but I believe that it is worth it.

The Next Phase — Moderating

We will have some comments in our moderator queue that the filters and plugins weren’t sure about.  There may be some comments that went live when they shouldn’t have.  And worst of all, there may be some false positives that were flagged as spam.  In the next article we will discuss how to handle this.



Female Warrior 3 image by EdwinP at stock.xchng

Battling Comment Spam – What Is It?

Anyone who has a blog knows that comments are magnets for spam. Many bloggers have struggled with ways to deal with spam and I can imagine it becomes harder as your blog becomes more popular, not easier. But let’s take a closer look at these comments.

Comment Spam Types

There are several different types of comment spam. Some of it is easy to identify, but spammers are becoming more creative.

1. Blatant Spam — This is obvious spam. It usually has nothing to do with the topic of the post (unless there is a lucky coincidence). It will usually have a couple of links to the websites that the spammer wants to promote. There are even times where the comment is not even in the same language as your post (or even the same character set).

2. Link-o-Rama Spam — This is probably a sub-type of the Blatant spam. But you’ll find that these comments are very long and consist mostly of keyword/link combinations. What is often amazing about these comments is the variety of links.

3. Sneaky Spam — Here we get to the type of spam that is more troublesome. These comments will often be vague (things like ‘Nice post.’ or ‘You write good blogs.’) and flattering (although I’ve seen some that tell me I’m wrong about what I wrote). These are the kind of comments that when we first see them, we think “They love me. They really love me!”. However, over time we begin to realize that these comments are just backlink attempts.

The spammer can get even more sneaky. Rather than just sending generic comments to thousands of blog posts, they scan for keywords and submit comments that fit. It becomes obvious when you have comments about Volkswagen Golf on your golf blog, but these can often be hard to detect.

Some spammers are actually using quotes from the post or other comments to sneak their way in. You do have to give the spammers credit for creativity.

Comment Spam Techniques

Most spam in your blog comes from three different routes.

Good Comments

There are certain types of comments that a blogger is looking to encourage on their blog. Peter Davies at Interactive Blogger has a great article describing good blog commenting techniques. Check it out to learn more about how to create good comments.

1. Other Bloggers — These comments are usually the Sneaky Spam types. They come from a blogger who is trying to build backlinks to their site by commenting on as many blogs as possible. However, these comments don’t add anything to the conversation and they often make you wonder whether the poster has even read the article.

2. Outsourced Backlinkers: You can hire people in third world countries who, for a fee, will spend hours commenting on blogs in your name, or at least with your URL. These comments are often Sneaky Spam comments but can be Blatant Spam as well.

3. Spambots: The most insidious spam comments come from bots. These bots simply call the appropriate URL to submit a comment without even going to your blog page. I know that this happens because my blogs regularly get more comments than visits. And the comments are often found on posts or pages that Google Analytics shows have received 0 visits.

This is where the Link-o-Rama Spam comes from (no one is going to type in that many keywords and links), but a lot of the Blatant and Sneaky types are submitted this way too. On Cookie Crumbles I have some cartoon posts that have received comments like “You write really well …”. There is no way a live person would put that comment there (I hope 8=)

Spambots are getting more clever and will often use keyword searches to determine what comment to put on your blog. You may even find that the comments contain quotes from your article or other comments.
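
The comments-versus-visits observation can be turned into a quick check. The Python sketch below is illustrative only: the function name and the numbers are made up, and it assumes you have exported per-post comment counts and visit counts from your blog and your analytics tool.

```python
def suspicious_posts(comment_counts, visit_counts):
    """Return (slug, comments, visits) for posts with more comments than visits."""
    flagged = []
    for slug, comments in comment_counts.items():
        visits = visit_counts.get(slug, 0)  # missing from analytics counts as 0
        if comments > visits:
            flagged.append((slug, comments, visits))
    return sorted(flagged)

# Made-up numbers for illustration only.
comments = {"cartoon-1": 5, "golf-tips": 2, "about": 1}
visits = {"cartoon-1": 0, "golf-tips": 40}
```

Here `suspicious_posts(comments, visits)` flags `about` and `cartoon-1`. Analytics tools can undercount real visitors, so treat the result as a pointer to where bots may be posting and not as a verdict.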

Combating Spam

How do we fight back? Comments are a valuable part of blogging and the social web. They are vital for building community. But it takes time to moderate comments. What ways can help us handle the load?

Well, that’s what the next two articles are all about.

Battling Comment Spam — Dealing With It

In the meantime, tell us all what bothers you most about spam.



Female Warrior 1 image by EdwinP at stock.xchng