Ad blocking using DNS and Privoxy (with Squid for caching)

  • Sam Hetherington-Hawthorne

I hate ads in my browser. I know a lot of sites rely on them for revenue but i’m of the opinion that i can pick and choose what parts of a web page i allow into my network, and i choose to disallow ads.

Over the years i’ve used various methods for blocking ads but these days i use two solutions in combination:

  • Return NXDOMAIN for DNS lookups on common ad-provider domains.
  • Privoxy for pattern-matching of ads that aren’t on the DNS blacklist.

Privoxy pattern-matches URLs that look like they’re ads, so it doesn’t need an explicit list of ad server domains (although it can be supplied with one for direct blocking). The main drawback to Privoxy is that it can’t pattern-match anything delivered over SSL connections, because all it sees of an HTTPS request is the hostname, and a lot of ad pedlars are starting to serve their ads via SSL.

This is where DNS blacklists come in. When DNS resolution of the ad domain is blocked it doesn’t matter whether the ads are served via http or https: the browser simply can’t connect to the server and retrieve them.

To do this the same way as me you need to be using Dnsmasq for your DNS (i use my OpenWrt router) and a server with Privoxy installed (i use a jail on my FreeNAS).

Ad blocking using Dnsmasq

I use a list of common ad server hostnames from pgl.yoyo.org. The list comes pre-formatted for Dnsmasq and, rather than using the traditional method of responding with a fake IP, makes Dnsmasq answer NXDOMAIN (non-existent domain) for each ad domain so that the browser never even attempts a connection to pull the ad.

On the OpenWrt router run:

wget -O /etc/dnsmasq.conf.ads-yoyo "http://pgl.yoyo.org/as/serverlist.php?hostformat=dnsmasq-server&showintro=0&startdate%5Bday%5D=&startdate%5Bmonth%5D=&startdate%5Byear%5D=&mimetype=plaintext"

echo "conf-file=/etc/dnsmasq.conf.ads-yoyo" >> /etc/dnsmasq.conf

echo "/etc/dnsmasq.conf.ads-yoyo" >> /etc/sysupgrade.conf

/etc/init.d/dnsmasq restart

This will:

  • download the list of common ad servers to /etc/dnsmasq.conf.ads-yoyo
  • add the file as an additional config file for dnsmasq
  • add /etc/dnsmasq.conf.ads-yoyo to the list of files included in config backups
  • restart Dnsmasq so your router is now blocking those ad domains
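Since the downloaded file is loaded straight into Dnsmasq's configuration, it may be worth checking it before installing it, for instance when refreshing the list from cron. The following is a minimal sketch, assuming the dnsmasq-server format yields lines of the form server=/hostname/; check_adlist and install_adlist are hypothetical helper names, not part of Dnsmasq or OpenWrt.

```shell
#!/bin/sh
# Sketch only: refuse to install a list that contains anything other than
# "server=/hostname/" lines (assumed format of the dnsmasq-server download).

# check_adlist FILE
# Succeeds if FILE has at least one server=/.../ line and every other line
# is blank or a comment.
check_adlist() (
    file=$1
    [ -s "$file" ] || exit 1
    grep -q '^server=/' "$file" || exit 1
    # grep -v prints lines that do NOT match the allowed forms;
    # finding any such line means the file is rejected.
    if grep -Ev '^(#.*|[[:space:]]*|server=/[A-Za-z0-9._-]+/)$' "$file" >/dev/null; then
        exit 1
    fi
    exit 0
)

# install_adlist NEW LIVE
# Moves NEW over LIVE only when NEW passes the check; otherwise LIVE is untouched.
install_adlist() (
    new=$1
    live=$2
    check_adlist "$new" && mv "$new" "$live"
)

# Possible use on the router (paths as in the commands above, /tmp/ads.new is
# an arbitrary temporary name):
#   wget -O /tmp/ads.new "<list URL from above>"
#   install_adlist /tmp/ads.new /etc/dnsmasq.conf.ads-yoyo && /etc/init.d/dnsmasq restart
```

If the check fails the old list stays in place and Dnsmasq is not restarted, so a broken or truncated download doesn't take DNS down with it.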

Privoxy

On the server install Privoxy from your package repository.

The following config file is for the FreeBSD port of Privoxy. It should work fine for Linux versions but you should double check at least the file paths (Linux distros usually use /etc rather than /usr/local/etc):

/usr/local/etc/privoxy/config:

confdir /usr/local/etc/privoxy
logdir /var/log/privoxy
logfile privoxy.log
filterfile default.filter
filterfile user.filter
actionsfile match-all.action # Actions that are applied to all sites and may be overruled later on.
actionsfile default.action   # Main actions file
actionsfile user.action      # User customisations
listen-address :8118
listen-address [::]:8118
toggle  1
enable-remote-toggle  0
enable-remote-http-toggle  0
enable-edit-actions 0
enforce-blocks 0
buffer-limit 4096
forwarded-connect-retries  0
allow-cgi-request-crunching 0
split-large-forms 0
handle-as-empty-doc-returns-ok 1
max-client-connections 256
debug   4096 # Startup banner and warnings
debug   8192 # Errors - *we highly recommended enabling this*
admin-address user@example.org
proxy-info-url http://wpad.lan.example.org/proxy-service.html

The admin-address and proxy-info-url are shown on error pages so users know where to get more information on the proxy service.

If you aren’t using the DNS blacklisting above you can download the same ad server list in a format Privoxy understands and have Privoxy block those domains instead.

In the directory where your Privoxy action files are stored run the following:

Linux

wget -O ads.yoyo.action "http://pgl.yoyo.org/as/serverlist.php?hostformat=junkbuster&showintro=0&startdate%5Bday%5D=&startdate%5Bmonth%5D=&startdate%5Byear%5D=&mimetype=plaintext"

sed -i '1 i\
{+block{Ad Domains from pgl.yoyo.org} +handle-as-empty-document}' ads.yoyo.action

FreeBSD

curl -o ads.yoyo.action "http://pgl.yoyo.org/as/serverlist.php?hostformat=junkbuster&showintro=0&startdate%5Bday%5D=&startdate%5Bmonth%5D=&startdate%5Byear%5D=&mimetype=plaintext"

sed -i '' '1i\
{+block{Ad Domains from pgl.yoyo.org} +handle-as-empty-document}' ads.yoyo.action

This will:

  • Download the pgl.yoyo.org list in Privoxy format
  • Add a directive so Privoxy will block those domains
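The Linux and FreeBSD variants above differ only because of how each sed handles in-place editing. As a rough alternative, the header can be prepended without sed at all, which should behave the same on both systems. This is a sketch, not the method used above; prepend_block_header is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: put the Privoxy block directive on the first line of a
# downloaded host list, using nothing but printf, cat and mv.

# prepend_block_header FILE
prepend_block_header() (
    file=$1
    header='{+block{Ad Domains from pgl.yoyo.org} +handle-as-empty-document}'
    [ -f "$file" ] || exit 1
    # Safe to re-run: do nothing if the header is already the first line.
    [ "$(head -n 1 "$file")" = "$header" ] && exit 0
    tmp="$file.tmp.$$"
    # Write header + original content to a temporary file, then swap it in.
    { printf '%s\n' "$header"; cat "$file"; } > "$tmp" && mv "$tmp" "$file"
)

# Possible use, in the Privoxy action file directory:
#   prepend_block_header ads.yoyo.action
```

Because the function checks the first line before writing, it can be run again after each fresh download without stacking up duplicate headers.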

Edit your Privoxy config file and add the ad domain list as an action file:

actionsfile ads.yoyo.action	# Ad domains from pgl.yoyo.org
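If you'd rather script this step than edit the file by hand, a small helper can add the line only when it isn't there yet, so re-running a setup script doesn't pile up duplicates. This is a minimal sketch; append_once is a hypothetical helper name, and the config path is the FreeBSD one used above.

```shell
#!/bin/sh
# Sketch only: append a line to a config file unless an identical line exists.

# append_once LINE FILE
append_once() (
    line=$1
    file=$2
    # -x: match whole lines only, -F: treat LINE as a fixed string, -q: quiet.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
)

# Possible use:
#   append_once 'actionsfile ads.yoyo.action' /usr/local/etc/privoxy/config
```

The match is on the exact line, so a copy of the directive that carries a trailing comment would not be recognised as the same line. Privoxy needs a restart (or a config reload) before it picks up the new action file.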

Optional web caching with Squid

Squid isn’t used for the ad-stripping but can be used as a caching proxy to speed up browsing of common sites. It’s set up to forward requests to Privoxy so that ads get stripped out before the rest of the site’s content is cached.

The following config file is for the FreeBSD port of Squid. It should work fine for Linux versions but you should double check at least the file paths (Linux distros usually use /etc rather than /usr/local/etc):

/usr/local/etc/squid/squid.conf:

#
# Recommended minimum configuration:
#

# Example rule allowing access from your local networks.
# Adapt to list your (internal) IP networks from where browsing
# should be allowed
acl localnet src 10.0.0.0/8	# RFC1918 possible internal network
acl localnet src 172.16.0.0/12	# RFC1918 possible internal network
acl localnet src 192.168.0.0/16	# RFC1918 possible internal network
acl localnet src fc00::/7       # RFC 4193 local private network range
acl localnet src fe80::/10      # RFC 4291 link-local (directly plugged) machines

acl SSL_ports port 443
acl Safe_ports port 80		# http
acl Safe_ports port 21		# ftp
acl Safe_ports port 443		# https
acl Safe_ports port 70		# gopher
acl Safe_ports port 210		# wais
acl Safe_ports port 1025-65535	# unregistered ports
acl Safe_ports port 280		# http-mgmt
acl Safe_ports port 488		# gss-http
acl Safe_ports port 591		# filemaker
acl Safe_ports port 777		# multiling http
acl CONNECT method CONNECT

#
# Recommended minimum Access Permission configuration:
#
# Deny requests to certain unsafe ports
http_access deny !Safe_ports

# Deny CONNECT to other than secure SSL ports
http_access deny CONNECT !SSL_ports

# Only allow cachemgr access from localhost
http_access allow localhost manager
http_access deny manager

# We strongly recommend the following be uncommented to protect innocent
# web applications running on the proxy server who think the only
# one who can access services on "localhost" is a local user
http_access deny to_localhost

#
# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
#

# Example rule allowing access from your local networks.
# Adapt localnet in the ACL section to list your (internal) IP networks
# from where browsing should be allowed
http_access allow localnet
http_access allow localhost

# And finally deny all other access to this proxy
http_access deny all

# Squid normally listens to port 3128
http_port 3128

# Disk cache directory (adjust the path and the size in MB to suit).
cache_dir aufs /var/squid/cache 5000 16 256

# Leave coredumps in the first cache dir
coredump_dir /var/squid/cache

#
# Add any of your own refresh_pattern entries above these.
#
refresh_pattern ^ftp:		1440	20%	10080
refresh_pattern ^gopher:	1440	0%	1440
refresh_pattern -i (/cgi-bin/|\?) 0	0%	0
refresh_pattern .		0	20%	4320

#
# Forward to Privoxy for ad removal
#
cache_peer localhost parent 8118 7 no-query default no-digest no-netdb-exchange

#
# Define ACL for protocol FTP
#
acl ftp proto FTP

#
# Do not forward FTP requests to Privoxy
#
always_direct allow ftp

#
# Prefer to go through Privoxy but go direct if Privoxy is down
#
prefer_direct off
nonhierarchical_direct off

#
# No stupid 30 second wait for restart
#
shutdown_lifetime 0 seconds

#
# Email address displayed to users on error pages
#
cache_mgr user@example.org

#
# Stop Squid from rotating its own logs.  Let FreeBSD's newsyslog do it.
#
logfile_rotate 0
debug_options ALL,1 rotate=0
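To see whether Squid really is handing requests to Privoxy rather than going direct, one option is to summarise the hierarchy codes in Squid's access log. The sketch below assumes the default native log format, where the ninth field looks like CODE/peer-host; peer_summary is a hypothetical helper name and the log path in the usage comment is an assumption that depends on your install.

```shell
#!/bin/sh
# Sketch only: count requests per hierarchy code in a native-format
# Squid access.log read from standard input.

# peer_summary < access.log
peer_summary() {
    # Field 9 is "HIERARCHY_CODE/host"; keep the part before the slash.
    awk 'NF >= 9 { split($9, h, "/"); n[h[1]]++ }
         END { for (k in n) print k, n[k] }' | sort
}

# Possible use (log location assumed):
#   peer_summary < /var/log/squid/access.log
```

Requests forwarded to Privoxy should show up under a parent code such as FIRST_UP_PARENT or DEFAULT_PARENT, cache hits under HIER_NONE, and anything fetched without the proxy chain under HIER_DIRECT. A large HIER_DIRECT count for plain HTTP traffic would suggest Privoxy was down or unreachable.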

Browser configuration

Manual

In your browser’s proxy settings enter the IP or hostname for the proxy server with port 3128 (for Squid + Privoxy) or 8118 (just Privoxy).

Autoconfig

If you have a web server you can have it serve a wpad.dat file that supplies the browser with proxy details and rules on when to use it.

You can use your DHCP server (such as Dnsmasq) to tell client devices the URL to retrieve the wpad.dat file from.
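With Dnsmasq this is usually done with DHCP option 252, which is the option code clients conventionally read the autoconfig URL from. A minimal sketch follows; set_wpad_dhcp is a hypothetical helper name, the URL is an assumption based on the wpad.lan.example.org host used elsewhere in this post, and support for the option varies between client operating systems and browsers.

```shell
#!/bin/sh
# Sketch only: add a dhcp-option line advertising the wpad.dat URL to a
# Dnsmasq config file, unless the same line is already present.

# set_wpad_dhcp URL CONF
set_wpad_dhcp() (
    url=$1
    conf=$2
    line="dhcp-option=252,$url"
    # Append only when no identical line exists (safe to re-run).
    grep -qxF -- "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
)

# Possible use on the OpenWrt router, followed by a Dnsmasq restart:
#   set_wpad_dhcp "http://wpad.lan.example.org/wpad.dat" /etc/dnsmasq.conf
#   /etc/init.d/dnsmasq restart
```

Clients only see the new option when they next renew their DHCP lease.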

If you don’t supply the URL in DHCP then client devices may fall back to DNS and look for a host named wpad within your network’s search domain. For example, if your search domain is lan.example.org a client will attempt to download the wpad.dat file from wpad.lan.example.org and, if that host doesn’t exist, it will then try wpad.example.org.

Here’s the simple wpad.dat i use:

var privoxy = "PROXY proxy.lan.example.org:8118; DIRECT";
var squid = "PROXY proxy.lan.example.org:3128; DIRECT";
var direct = "DIRECT";

function FindProxyForURL(url, host) {

// If the hostname matches, send direct.
    if (dnsDomainIs(host, ".lan.example.org") ||
        shExpMatch(host, "(*.lan.example.org)"))
        return direct;

// If the requested website is hosted within the internal network, send direct.
    if (isPlainHostName(host) ||
        shExpMatch(host, "*.local") ||
        isInNet(dnsResolve(host), "10.0.0.0", "255.0.0.0") ||
        isInNet(dnsResolve(host), "172.16.0.0",  "255.240.0.0") ||
        isInNet(dnsResolve(host), "192.168.0.0",  "255.255.0.0") ||
        isInNet(dnsResolve(host), "127.0.0.0", "255.0.0.0"))
        return direct;

// If the requested traffic is FTP or HTTP go via the Proxy Chain (FTP won't be forwarded
// from Squid to Privoxy), or fail-over to direct if the proxy isn't available.
    if (url.substring(0, 4) == "ftp:" ||
        url.substring(0, 5) == "http:")
        return squid;

// There's no point in sending HTTPS traffic through Squid since it can't cache the content, but we still want Privoxy
// preventing access to any black-listed domains.  Send HTTPS traffic through Privoxy.
    if (url.substring(0, 6) == "https:")
        return privoxy;

// DEFAULT RULE: All other traffic, go direct.
    return direct;
}
