How to Create a CIPA Compliant Content Filter Using Free Software.

In this post I will describe how to use free, open source software to create a simple CIPA compliant web filter that can be used in a small to medium public library.  I am using Ubuntu Server 10.04 with Squid, Dansguardian and Webmin. I am using this post as a guide.

The first step is to install Ubuntu Server 10.04.  I chose the 64-bit version, since the machine I was installing it on had an AMD Turion 64 chip, but the 32-bit version will also work.  If you don’t have an extra machine to put this on, or if you just feel more comfortable using a virtual machine, you can use VMware on an existing Windows or Linux machine.

After the install I ran:
sudo apt-get update
sudo apt-get upgrade

Change to static IP
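
As a rough sketch of what this step can look like: Ubuntu Server 10.04 reads its network settings from /etc/network/interfaces, and a static address is a short stanza in that file. The helper below only prints such a stanza so it can be reviewed first. The function name static_stanza, the interface name eth0 and the 192.168.1.x addresses are all placeholders I made up for illustration, so substitute the values for your own network.

```shell
#!/bin/sh
# Print a static-IP stanza in /etc/network/interfaces format.
# Arguments: interface, address, netmask, gateway (all supplied by the caller).
static_stanza() {
    iface="$1"; address="$2"; netmask="$3"; gateway="$4"
    cat <<EOF
auto $iface
iface $iface inet static
    address $address
    netmask $netmask
    gateway $gateway
EOF
}

# Example values only: eth0 and the 192.168.1.x addresses are assumptions.
static_stanza eth0 192.168.1.10 255.255.255.0 192.168.1.1
```

If the output looks right, it would replace the existing dhcp stanza for that interface in /etc/network/interfaces, followed by sudo /etc/init.d/networking restart (or a reboot) to apply it.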

After installing the base system, I installed openssh-server, so I can log into the system using ssh, and run this server as a headless box.

sudo apt-get install openssh-server

Then I connect to the server from my local machine (a terminal for me, as my machine is running Ubuntu; if you are using Windows you can use PuTTY) using:
ssh -l username ipaddress
where username is the name of the user you wish to log in as and ipaddress is the IP address of your server.

Ok, now it is time to start installing some software.

sudo apt-get install squid clamav-daemon dansguardian apache2

You may see this error:

Setting up clamav-daemon (0.96+dfsg-2ubuntu1.2) ...
 * Clamav signatures not found in /var/lib/clamav
 * Please retrieve them using freshclam or install the clamav-data package
 * Then run '/etc/init.d/clamav-daemon start'

So I ran

sudo freshclam
sudo /etc/init.d/clamav-daemon restart

Configure the Squid proxy server

Make a backup copy of the config file:

sudo cp /etc/squid/squid.conf /etc/squid/squid.conf.bak
Edit the squid.conf file
sudo vim /etc/squid/squid.conf
Search for the line TAG: visible_hostname
/TAG: visible_hostname

Move the cursor to the bottom line of that section after # none and hit the “i” key to insert the text
visible_hostname squid

Find the line # http_access allow localnet
and delete the #. This allows all clients on the local network to reach the Internet through the proxy.
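
If you would rather not make these two edits by hand in vim, they can also be scripted. This is only a sketch: configure_squid is a name I made up, it relies on GNU sed (the one Ubuntu ships), and it appends visible_hostname at the end of the file instead of inside the commented section. The demonstration runs against a scratch file, so nothing under /etc is touched.

```shell
#!/bin/sh
# Apply the two squid.conf edits described above to the file given as $1.
configure_squid() {
    conf="$1"
    # Keep the first backup; do not overwrite it on later runs.
    [ -f "$conf.bak" ] || cp "$conf" "$conf.bak"
    # Uncomment the localnet rule (with or without a space after the #).
    sed -i 's/^#[[:space:]]*http_access allow localnet/http_access allow localnet/' "$conf"
    # Add visible_hostname only if no uncommented line sets it already.
    if ! grep -q '^visible_hostname' "$conf"; then
        echo 'visible_hostname squid' >> "$conf"
    fi
}

# Demonstration on a scratch file with a few sample lines.
demo=$(mktemp)
printf '%s\n' '#  TAG: visible_hostname' '# none' \
    '# http_access allow localnet' 'http_access deny all' > "$demo"
configure_squid "$demo"
cat "$demo"
rm -f "$demo" "$demo.bak"
```

On the real server the call would be configure_squid /etc/squid/squid.conf, run as root.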

Restart the squid service
sudo /etc/init.d/squid restart

Configure Dansguardian

The main configuration for Dansguardian is located in /etc/dansguardian/dansguardian.conf
sudo vim /etc/dansguardian/dansguardian.conf

Add a # in front of the line:
UNCONFIGURED - Please remove this line after configuration
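
The same change can be made without opening an editor. A minimal sketch, assuming GNU sed; mark_configured is a made-up helper name, and the demonstration works on a scratch file, not the real config.

```shell
#!/bin/sh
# Comment out the UNCONFIGURED marker line in the config file given as $1.
mark_configured() {
    conf="$1"
    # Only a line that starts with UNCONFIGURED is changed; the rest is left alone.
    sed -i 's/^UNCONFIGURED/#UNCONFIGURED/' "$conf"
}

# Demonstration on a scratch file containing the marker and one other setting.
demo=$(mktemp)
printf '%s\n' 'UNCONFIGURED - Please remove this line after configuration' \
    'filterport = 8080' > "$demo"
mark_configured "$demo"
cat "$demo"
rm -f "$demo"
```

On the server this would be run as root against /etc/dansguardian/dansguardian.conf.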

Restart dansguardian:
sudo /etc/init.d/dansguardian restart

Configure your browser to use the proxy settings

Internet Explorer: Go to Tools->Internet Options->Connections->LAN Settings and check the box for “Use a proxy server for your LAN…”, then type the address of the Dansguardian machine and set the port to 8080 (this is the port that Dansguardian uses). Check the box for “Bypass proxy server for local addresses”.

Firefox: Tools->Options->Advanced->Network tab->Settings. Select Manual proxy configuration and set the HTTP proxy to the IP address of the Dansguardian machine and the port to 8080. Check the box “Use this proxy server for all protocols”.

The system should now be working as a proxy server and content filter, but we will need some way to configure the settings in Dansguardian.  I find that it is fairly restrictive out of the box and that I need to add site exceptions as time goes by and patrons tell me that sites are being blocked that should not be.  So, in order to have a GUI to edit these files, I will install Webmin and the Webmin Dansguardian module.

Install and configure Webmin

Download Webmin:


Install the Webmin dependencies:

sudo apt-get install perl libnet-ssleay-perl openssl libauthen-pam-perl libpam-runtime libio-pty-perl

One other dependency is libmd5-perl, but it has been deprecated and is no longer available in the Ubuntu repositories for 10.04. To install it we need a workaround: download the .deb package by hand and install it with dpkg:

sudo dpkg -i libmd5-perl_2.03-1_all.deb

Now it is time to Unpack Webmin

sudo dpkg -i webmin_1.510-2_all.deb

Whoops, one more dependency problem…

dpkg: dependency problems prevent configuration of webmin:
webmin depends on apt-show-versions; however:
Package apt-show-versions is not installed.
dpkg: error processing webmin (--install):
dependency problems - leaving unconfigured
Processing triggers for ureadahead ...
Errors were encountered while processing:

So we need to

sudo apt-get install apt-show-versions

and we get:

Reading package lists... Done
Building dependency tree
Reading state information... Done
You might want to run `apt-get -f install' to correct these:
The following packages have unmet dependencies:
apt-show-versions: Depends: libapt-pkg-perl (>= 0.1.21) but it is not going to be installed
E: Unmet dependencies. Try 'apt-get -f install' with no packages (or specify a solution).

so now we try

sudo apt-get -f install

Hooray, we now have Webmin installed

Webmin install complete. You can now login to https://dansguardian:10000/
as root with your root password, or as any user who can use sudo
to run commands as root.

So try it out: open a browser and go to https://ipaddress:10000/, where ipaddress is the IP of your Ubuntu server. You will need to get through the security certificate warnings, then log in as a user in your sudoers file.

Now we need to install the Dansguardian module for Webmin

Download the module:

Then open Webmin and go to Webmin>Webmin Configuration>Webmin Modules.
Choose install from local file and navigate to home>user, where user is your user name, then choose the file and click install module.

Now click on servers and you will see that Dansguardian Web Content is in the list. Unfortunately when we click on it we get the error:

Warning – DansGuardian binary file not found, maybe you need to update your module config (especially the directory paths).
(Expected location: /sbin/dansguardian)

Warning – the version of DansGuardian you have is not supported by this Webmin module version
Webmin Module Version 0.7.0beta1b supports DG version 2.10 (& 2.9)
Currently installed DG version ?

Warning – running as root(superuser) risks new files not being readable by production DansGuardian

Kinda frightening, but have no fear. Click on module config and change the full path to Dansguardian Binary to /usr/sbin/dansguardian.

We now have a functioning proxy server with a content filter and a GUI to help configure it. Dansguardian has zillions of configuration options, but I find that I generally only worry about the exception site (domain) list, /etc/dansguardian/lists/exceptionsitelist, which can be found under View/Edit A Filter Group’s Lists. I have also changed the banned (file) extension list, /etc/dansguardian/lists/bannedextensionlist, as the default list of banned extensions is more restrictive than what we allow here.
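
Since both lists are plain text files with one entry per line, an exception can also be added from the shell when Webmin is not handy. Here is a small sketch; add_exception is a name I made up, and example.org is only a placeholder domain. It adds the domain only if it is not already listed, and the demonstration uses a scratch file in place of the real list.

```shell
#!/bin/sh
# Append a domain to a DansGuardian list file unless it is already there.
# Arguments: path to the list file, domain (without http:// or www.).
add_exception() {
    list="$1"; domain="$2"
    # -x matches the whole line, -F treats the domain as a literal string.
    if ! grep -qxF "$domain" "$list"; then
        echo "$domain" >> "$list"
    fi
}

# Demonstration on a scratch file; the second call adds nothing.
demo=$(mktemp)
add_exception "$demo" example.org
add_exception "$demo" example.org
cat "$demo"
rm -f "$demo"
```

On the server the first argument would be /etc/dansguardian/lists/exceptionsitelist, run as root and followed by sudo /etc/init.d/dansguardian restart so that the change is picked up.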

Anyway, I hope this helps someone out there. Please let me know if I have made any mistakes, or if you have any questions.

About ashkev

The Fiddling Librarian 3.0 is Kevin Smith’s personal weblog. Kevin is a recent graduate with an MLS from IUPUI. Kevin is the Assistant Director of the Cass District Library which is in Cass County Michigan.