Spam Filters Gone Wild: This Is True

Waaay back in the dark ages of the Web (somewhere between 1994 and 1997) I discovered a weekly email newsletter called “This Is True.” It collected strange-but-true news stories from around the world, summarizing each in a short paragraph with a witty one-liner at the end. I subscribed to the free edition, and later to the full version, which had about twice as many stories. I even picked up a few of the books collecting past stories (at a con, I think, but I can’t remember which con).

Eventually I got too busy to read them, and the back-issues piled up unread, and I decided to let my subscription lapse. But earlier this year, I decided to re-up with the shorter, free version, and it’s still as good as ever.

This week’s issue included a disappointing story: even though they practice — in fact, probably helped originate — responsible list management, Yahoo is blocking them as spammers. Why? Because people are signing up for the list, then deciding they don’t want it anymore, and instead of unsubscribing, hitting the “Report as Spam” button. Yahoo has apparently taken those spam reports at face value, and blocked everyone’s copy of the newsletter.

Clearly, some people are unclear on what “spam” means. It’s not just “mail I don’t want.” It’s mass mail I don’t want and didn’t ask for.”

That, and I’m sure some people don’t realize that their reports are being used to train everyone’s filters. I remember a co-worker explaining a few years ago that he’d trained Gmail to send the SourceForge newsletters (or something similar) straight into his spam folder. I commented that they might be using that data to train their sitewide filters, and he said something like, “I hope not.”

Using user feedback to train sitewide or network-wide (such as Cloudmark, or Akismet) filters is a powerful technique. Some people will catch the leading edge of a spam attack, and that data can be used to protect others as the attack continues. Some will check their mail sooner, and that data can be used to re-filter messages that have been received, but not yet viewed.

Unfortunately, it also can give a lot of power to people who are either unclear on the criteria being used or have an axe to grind, unless you include measures to (a) contain the impact or (b) keep track of each reporter’s reliability. I know Cloudmark factors in the reporter’s reputation, for instance. And I suspect that AOL does, at least in some cases, limit measures such as blocking to specific recipients, but I can’t be certain.

Anyway, to summarize:

  • Use the Report Spam button responsibly.  If you actually subscribed to it, it isn’t spam unless they refuse to remove you from the list.
  • Check out This is True.  You may laugh, you may groan, you may think, or you may get pissed off at the world — or all of the above.  It’s certainly worth a look.

(I really should have finished writing this yesterday, before someone submitted the original story to Slashdot. Posting about it to get the word out seems kind of redundant now. Heck, now that I think about it, I should have submitted the original to Slashdot. Oh, well.

in View Kelson Vibber's profile on LinkedIn

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.