Splogs: Antisplog.net system
Hatem from Antisplog.net has left a comment on my post "Google, Blogger and splogs", asking for my opinion about his site. Antisplog.net is an online service, launched a few days ago, that lets you check whether a given URL is likely to be a splog.
As explained here, to use it, you simply send the query:
- http://www.antisplog.net/check/the_url_to_check
Antisplog.net will return:
- 1: if the blog is detected as a splog
- 0: if it is not
- 3: if the URL cannot be opened (DNS error, 404 error, etc.)
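The query format and the three return codes above can be sketched as a small Python client. This is only a sketch under assumptions: the URL to check is appended verbatim (the post does not say whether it must be percent-encoded), the response body is assumed to be the bare digit, and the names `build_check_url`, `interpret` and `check` are hypothetical helpers, not part of the service.

```python
from urllib.request import urlopen

CHECK_PREFIX = "http://www.antisplog.net/check/"

# Meaning of each return code, as listed above.
VERDICTS = {
    "1": "splog",
    "0": "not a splog",
    "3": "unreachable",
}


def build_check_url(target):
    """Append the URL to check to the service prefix, verbatim (assumption)."""
    return CHECK_PREFIX + target


def interpret(body):
    """Map the raw response body to a verdict; anything else is 'unknown'."""
    return VERDICTS.get(body.strip(), "unknown")


def check(target, timeout=10):
    """Query the service and return a verdict string (needs network access)."""
    with urlopen(build_check_url(target), timeout=timeout) as response:
        return interpret(response.read().decode("ascii", errors="replace"))
```

Calling `check("http://example.blogspot.com")` would send the query and return one of the verdict strings. Keeping `interpret` separate from the network call makes the code mapping easy to test offline.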
Correct | URLs
---|---
Normal | 17
Spam | 22
Total correct | 39 (92%)

Wrong | URLs
---|---
Normal (false positives) | 2
Spam (false negatives) | 1
Total wrong | 3 (8%)
A success rate above 90% is quite impressive for such a young system, especially since, as I noted before, some of these splogs are quite difficult to tell apart from normal blogs, even for the human eye. Congratulations, then. I'll be following how the system develops with great interest.
If I can give one piece of advice for the future, it would be to decrease the false positive rate (i.e. normal blogs reported as spam). At the moment, this rate is 2/19, i.e. ca. 10% (although of course a precise assessment is difficult with such a small number of URLs). It seems to me quite dangerous to report legitimate blogs as spam, and I would be happier if this rate fell well below 1%, even if the price to pay is letting more splogs through the net.
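The arithmetic behind these rates can be written out from the four counts in the tables. The function name `rates` and its parameter names are hypothetical, chosen only for this illustration.

```python
def rates(normal_ok, spam_ok, false_positives, false_negatives):
    """Return accuracy, false positive rate and false negative rate."""
    total = normal_ok + spam_ok + false_positives + false_negatives
    normal_total = normal_ok + false_positives   # all legitimate blogs tested
    spam_total = spam_ok + false_negatives       # all splogs tested
    return {
        "accuracy": (normal_ok + spam_ok) / total,
        "false_positive_rate": false_positives / normal_total,
        "false_negative_rate": false_negatives / spam_total,
    }


# Counts taken from the two tables in this post.
result = rates(normal_ok=17, spam_ok=22, false_positives=2, false_negatives=1)
for name, value in result.items():
    print(f"{name}: {value:.1%}")
```

The false positive rate is computed over the 19 legitimate blogs only (2/19), which is why it is higher than the overall error rate. The false negative rate (1/23) is not quoted in the post; it is derived here from the same counts.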
Of course, spammers monitor all this (see here for instance), and I am pretty sure that they will soon come up with splog-generating software that produces human-looking text which will be extremely difficult to tell apart from real human writing by automatic means.
Anyway, congratulations again, Hatem, and good luck with your system!
1 comment:
Some spammers are already creating splogs with human-written text. They just steal text from other sites (Wikipedia being an obvious choice).
But even with genuine human-written text, splogs still have characteristics that normal blogs do not share. Such splogs are much harder for a human to detect unless you recognize that the text is stolen, but hopefully AntiSplog.net can identify most of them based on their other spammy characteristics.