Willow for Internet content filtering
Description
I am currently testing this software for content filtering. The nice thing about Willow is that it uses a Bayesian filter (a technique well known from spam filtering). In short: when a page is requested, it determines on the fly whether the page is “good” or “bad”. To decide, it uses a training set of good and bad pages (the set included in the package seems to work well). No need for hand-maintained white/black lists! Personally, I couldn’t find any porn website that slipped through the filter.
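To make the idea of a Bayesian good/bad decision more concrete, here is a minimal sketch of a two-class naive Bayes classifier. It is not Willow's actual code: the class name TinyBayesFilter, the tokenizer, and the add-one smoothing are assumptions chosen for illustration, and it is written for a modern Python 3 rather than the Python 2.4 that Willow targets.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyBayesFilter:
    """Toy naive Bayes classifier (hypothetical, not Willow's implementation)."""

    def __init__(self):
        # Word frequencies and page counts per class, filled in by train().
        self.word_counts = {"good": Counter(), "bad": Counter()}
        self.page_counts = {"good": 0, "bad": 0}

    def train(self, label, text):
        """Add one example page to the 'good' or 'bad' training set."""
        self.word_counts[label].update(tokenize(text))
        self.page_counts[label] += 1

    def score(self, label, text):
        """Return log P(label) + sum of log P(word | label), add-one smoothed.

        Both classes must have at least one training page.
        """
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab = len(set(self.word_counts["good"]) | set(self.word_counts["bad"]))
        logp = math.log(self.page_counts[label] / sum(self.page_counts.values()))
        for word in tokenize(text):
            logp += math.log((counts[word] + 1) / (total + vocab))
        return logp

    def classify(self, text):
        """Return 'bad' if the bad class scores higher, otherwise 'good'."""
        if self.score("bad", text) > self.score("good", text):
            return "bad"
        return "good"
```

After training with a handful of good and bad pages, classify() labels a new page by whichever class makes its words more likely. A real filter would need a much larger training set and a better tokenizer.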
Notes
- It depends on a multiverse package, python2.4-profiles.
- The web interface works by using the user configured in willow.conf and the password hardcoded on line 1607 of willow.py.
- The documentation is a bit scarce. You’ll need your Linux experience to set it up.
- The page that appears when you visit a blocked site is hardcoded in several Python scripts. Not a big deal, but not really convenient.
Future
Ogra was considering it for inclusion in Edubuntu, but the multiverse dependency is obviously a problem. If it is included, there will probably be a GTK interface instead of the current web interface.
Installation
Just get it from the website and untar it. In the configuration example, it was untarred into /opt/willow.
Configuration
Example willow.conf:
syspath = ['/opt/willow']
port = 8080
filters = ['urlfilter','domainfilter','contentfilter']
domainfilterpath = '/opt/willow/filters/domain'
urlfilterpath = '/opt/willow/filters/url'
contentfilterpath = '/opt/willow/filters/content'
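Since the documentation is scarce, it can help to sanity-check the configuration before starting Willow. The sketch below assumes the file consists only of simple "name = literal" lines like the example, and that each enabled filter has a matching "<name>path" setting, which is a pattern inferred from the example rather than documented behaviour. The helpers parse_conf and missing_filter_paths are hypothetical and not part of Willow.

```python
import ast


def parse_conf(text):
    """Parse simple 'name = literal' lines into a dict (assumed file format)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition("=")
        # literal_eval only accepts literals (strings, numbers, lists), not code.
        settings[name.strip()] = ast.literal_eval(value.strip())
    return settings


def missing_filter_paths(settings):
    """Return enabled filters that have no '<name>path' entry (assumed convention)."""
    return [name for name in settings.get("filters", [])
            if name + "path" not in settings]
```

Reading willow.conf, passing its text to parse_conf(), and checking that missing_filter_paths() returns an empty list catches a filter that was enabled without a path.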
Run it
/opt/willow/willow.py --config=/opt/willow/willow.conf
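Once it is running, a quick way to confirm that something is listening is to try a TCP connection to the configured port. This sketch assumes Willow accepts TCP connections on the port from willow.conf (8080 in the example) on the local machine. The helper port_is_open is hypothetical and only checks that the port accepts connections, not that filtering works.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

For the example configuration, port_is_open("127.0.0.1", 8080) should return True once Willow has started.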
Sources
Willow homepage: http://www.digitallumber.net/software/willow/