* Overview [1]
* Wiki as Email [2]
* Headers [3]
* Content [4]
* Filtering [5]
* Bayesian Filtering [6]
* SpamAssassin [7]
-------------------------
OVERVIEW
It's rather simple. Rewrite each edit as an email, and use existing
spam tools to classify the edit. Bayesian filters should work
fantastically well for this application, though I can't think of any
good reasons why more traditional filters such as SpamAssassin [8]
won't work.
-------------------------
WIKI AS EMAIL
HEADERS
Most email headers serve no real purpose in this application, but
well-formed headers will help the filters work correctly.
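As a rough sketch of what "well-formed headers" could look like, the
Python below wraps one edit in a standard email message. The function
name edit_to_email, the header mapping (author as From, page name and
edit summary as Subject) and the wiki.invalid domain are all
assumptions made for illustration; nothing here is a settled design.

```python
from email.headerregistry import Address
from email.message import EmailMessage
from email.utils import formatdate, make_msgid


def edit_to_email(author, page, summary, body, timestamp=None):
    """Wrap one wiki edit in a well-formed email message (hypothetical mapping)."""
    msg = EmailMessage()
    # Assumption: the wiki user name (or IP address) becomes the display name
    # of a placeholder sender address.
    msg["From"] = Address(display_name=author, username="wiki-edit",
                          domain="wiki.invalid")
    msg["To"] = Address(username="wiki", domain="wiki.invalid")
    # Assumption: page name plus edit summary stands in for the subject line.
    msg["Subject"] = f"[{page}] {summary}"
    # timestamp is seconds since the epoch; None means "now".
    msg["Date"] = formatdate(timestamp, localtime=False)
    msg["Message-ID"] = make_msgid(domain="wiki.invalid")
    msg.set_content(body)
    return msg
```

Calling str() on the result, or msg.as_bytes(), gives text that can be
handed to a filter that expects a message on standard input.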
CONTENT
The content is a little tricky. Do we simply supply the raw wiki
text, or do we render it into HTML? Which content do we include:
everything, or just the diff? Initially I think that the diff text in
raw form should be enough; rendering into HTML is probably a good idea
at a later date.
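One way to get "the diff text in raw form" is to keep only the lines
an edit added or changed. The sketch below does that with Python's
difflib; treating only the added lines as the message body is an
assumption (removed lines are ignored), and added_text is a
hypothetical helper name.

```python
import difflib


def added_text(old, new):
    """Return the raw wiki lines that an edit added or changed."""
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" are the opcodes that introduce new lines.
        if tag in ("insert", "replace"):
            added.extend(new_lines[j1:j2])
    return "\n".join(added)
```

The returned string can be used directly as the body of the email that
represents the edit.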
-------------------------
FILTERING
BAYESIAN FILTERING
The regular benefits of Bayesian filtering over other methods should
apply equally well to a wiki as to email. As with any Bayesian
filtering, the system needs to be trained, and so the training
interface will probably be the most cumbersome component of our
anti-wiki-spam coding.
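To make the training requirement concrete, here is a toy naive Bayes
classifier in Python. It is a minimal sketch of the general technique,
not the algorithm of any particular existing spam tool; the class
name, the tokeniser and the add-one smoothing are assumptions chosen
for brevity.

```python
import math
import re
from collections import Counter


class NaiveBayesFilter:
    """Toy two-class naive Bayes classifier over word tokens."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Record one edit that a human marked as 'spam' or 'ham'."""
        self.counts[label].update(self.tokens(text))
        self.docs[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | text) from the training data seen so far."""
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"]))
        total_docs = self.docs["spam"] + self.docs["ham"]
        words = self.tokens(text)
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed class prior, so an untrained class never gives log(0).
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            total = sum(self.counts[label].values())
            for tok in words:
                # Add-one smoothing for tokens unseen in this class.
                score += math.log(
                    (self.counts[label][tok] + 1) / (total + vocab + 1)
                )
            log_score[label] = score
        diff = log_score["ham"] - log_score["spam"]
        if diff > 700:  # avoid overflow in math.exp
            return 0.0
        return 1.0 / (1.0 + math.exp(diff))
```

The train method is the part a training interface would call each time
someone marks an edit as spam or as legitimate.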
SPAMASSASSIN
SpamAssassin's default rules would need to be tweaked using a
custom config file, as various tests (e.g. MIME_HTML_ONLY) are useless
in this context.
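In a SpamAssassin config file, giving a rule a score of 0 disables it.
The small Python helper below generates such lines; disable_rules is a
hypothetical name, and the only rule listed is the one mentioned above.
Which other rules should be switched off is left open.

```python
def disable_rules(rule_names):
    """Build SpamAssassin config lines that turn the named rules off."""
    lines = ["# Rules that are meaningless for wiki edits"]
    # A score of 0 tells SpamAssassin not to run the rule.
    lines += [f"score {name} 0" for name in rule_names]
    return "\n".join(lines) + "\n"


config_text = disable_rules(["MIME_HTML_ONLY"])
```

The resulting text would be saved as the custom config file and passed
to SpamAssassin alongside its default rules.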
Links:
------
[1] http://melbournewireless.org.au/#overview
[2] http://melbournewireless.org.au/#wiki_as_email
[3] http://melbournewireless.org.au/#headers
[4] http://melbournewireless.org.au/#content
[5] http://melbournewireless.org.au/#filtering
[6] http://melbournewireless.org.au/#bayesian_filtering
[7] http://melbournewireless.org.au/#_spamassassin
[8] http://www.spamassassin.org/