User:Tepples/hosts builder

From Pin Eight

This specification proposes a free software tool to build a hosts file.

Configuration

The tool reads its configuration from a file in a format called Innie, an INI-like format also used by the Action 53 ROM builder. (We reject Python's similar configparser module for lack of explicit support for duplicate sections and keys.)
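As a rough illustration of why duplicate keys matter, the sketch below parses INI-like text into an ordered list of key/value pairs per section instead of a dictionary. It is not the actual Innie parser: the function name parse_innie is hypothetical, and the grammar ([section] headers, key=value lines, # comments) is only assumed from the example configuration on this page.

```python
def parse_innie(text):
    """Parse INI-like text into a list of (section_name, pairs) tuples.

    pairs is a list of (key, value) tuples in file order, so duplicate
    sections and duplicate keys are preserved rather than overwritten.
    Assumed grammar (not taken from an Innie specification):
    [section] headers, key=value lines, and # comment lines.
    """
    sections = []
    current = None
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            current = (line[1:-1].strip(), [])
            sections.append(current)
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"line {lineno}: expected key=value")
        if current is None:
            raise ValueError(f"line {lineno}: key outside any section")
        # Whitespace around the key and value is dropped, so
        # "path  =test_servers.txt" yields ("path", "test_servers.txt").
        current[1].append((key.strip(), value.strip()))
    return sections
```

Keeping pairs in a list means a later stage can see every source= line in [sources], which a plain dictionary would collapse into one.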

The [options] section contains settings that control the entire output. All can be overridden on the command line.

Output path
INI: output=filename; command line: -o filename, --output filename
Write the file to this path. The special file name - (a single hyphen) denotes standard output. Default is -.
Hosts per line
INI: hosts-per-line=count; command line: -n count, --hosts-per-line count
Number of hostnames to associate with each IP address in each line of the output file. The default is 1, as some operating systems' hosts file parsers support only 1.
Default IP address
INI: map-to=ip; command line: --map-to ip
Set the default IP address for blacklists. The default is 0.0.0.0, but some computer security tools reportedly need 127.0.0.1.
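The override rule for these three options could be sketched with Python's argparse as follows. The option names and defaults come from the list above; the positional config argument and the function names build_arg_parser and merge_options are assumptions made for illustration.

```python
import argparse

# Defaults as stated in the option descriptions above.
DEFAULTS = {"output": "-", "hosts-per-line": 1, "map-to": "0.0.0.0"}

def build_arg_parser():
    parser = argparse.ArgumentParser(description="Build a hosts file")
    # Hypothetical: the specification does not say how the
    # configuration file itself is named on the command line.
    parser.add_argument("config", help="path to the configuration file")
    parser.add_argument("-o", "--output", metavar="filename")
    parser.add_argument("-n", "--hosts-per-line", type=int, metavar="count")
    parser.add_argument("--map-to", metavar="ip")
    return parser

def merge_options(ini_options, args):
    """Combine defaults, [options] values, and command line switches.

    ini_options is a dict of strings from the [options] section;
    args is the argparse.Namespace. Later layers win.
    """
    merged = dict(DEFAULTS)
    merged.update(ini_options)
    merged["hosts-per-line"] = int(merged["hosts-per-line"])
    for key in DEFAULTS:
        cli_value = getattr(args, key.replace("-", "_"))
        if cli_value is not None:  # switch was given on the command line
            merged[key] = cli_value
    if merged["hosts-per-line"] < 1:
        raise ValueError("hosts-per-line must be at least 1")
    return merged
```

For example, with hosts-per-line = 5 in the file and -n 3 on the command line, the merged value is 3.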

The [sources] section lists sources from which to build a blacklist. Each source must begin with source=, which gives the source a title. Each must also include a format= and a path= that names a file in the local file system.

Update URL (url=)
A URL from which to download an updated copy of the source. The URL must use HTTPS, HTTP, FTP, or another scheme that Python's urllib module supports.
Update frequency (expires=)
How often to copy the contents of url into the file at path. The value is a positive whole number followed by minute, minutes, hour, hours, day, or days. The default is 7 days.
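The update keys could be handled roughly as below. This is a sketch under assumptions: the specification does not say how staleness is measured, so the file's modification time is used here, and the names parse_expires, needs_update, and refresh are hypothetical.

```python
import os
import re
import shutil
import time
import urllib.request
from datetime import timedelta

# A positive whole number, then one of the six unit words.
# Assumption: number and unit are separated by whitespace, as in "7 days".
_EXPIRES_RE = re.compile(r"^(\d+)\s+(minute|hour|day)s?$")
DEFAULT_EXPIRES = timedelta(days=7)

def parse_expires(value):
    """Convert a string such as '7 days' to a timedelta."""
    match = _EXPIRES_RE.match(value.strip())
    if not match or int(match.group(1)) < 1:
        raise ValueError(f"bad expires value: {value!r}")
    count, unit = int(match.group(1)), match.group(2)
    return timedelta(**{unit + "s": count})

def needs_update(path, expires=DEFAULT_EXPIRES, now=None):
    """Return True if path is missing or at least `expires` old.

    Assumption: age is judged by the file's modification time.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except FileNotFoundError:
        return True
    return now - mtime >= expires.total_seconds()

def refresh(path, url):
    """Copy the resource at url into path using urllib."""
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)
```

A caller would run refresh(path, url) only when a source has a url= and needs_update(path, expires) is true, then parse the local file as usual.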

Example config file:

# Example configuration file

# command line switches can override these options
[options]
output = hosts
hosts-per-line = 5
map-to=0.0.0.0

[sources]
source=Local Test Server
path  =test_servers.txt
format=hosts
map-to=127.0.0.1

source=Staging Test Servers
path  =test_servers.txt
format=hosts
action=none

source=Popular Sites
path  =popular_sites.txt
format=hostnames
action=resolve

source=Known Trackers
path  =trackers.txt
format=hostnames

source=MVPS
path  =mvps_hosts.txt
url   =http://winhelp2002.mvps.org/hosts.txt
expires=7 days
format=hosts
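Because every source in [sources] starts with a source= line, the flat key/value list can be split into one record per source. A minimal sketch, assuming the section has already been read into ordered (key, value) pairs; the function name group_sources is hypothetical.

```python
def group_sources(pairs):
    """Split ordered (key, value) pairs into one dict per source.

    A new record starts at each source= key; the keys that follow
    belong to that record until the next source=.
    """
    sources = []
    for key, value in pairs:
        if key == "source":
            sources.append({"source": value})
        elif not sources:
            raise ValueError(f"{key}= appears before the first source=")
        else:
            sources[-1][key] = value
    # format= and path= are required for every source.
    for src in sources:
        for required in ("format", "path"):
            if required not in src:
                raise ValueError(
                    f"source {src['source']!r} is missing {required}=")
    return sources
```

Applied to the example above, this would yield five records, the last of which carries url and expires in addition to path and format.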

File formats

The tool receives blacklist and whitelist sources in two formats:

Hosts file (format=host or format=hosts)
This associates hostnames with IP addresses. If the first word (sequence of nonblank characters) in a line forms a valid IPv4 or IPv6 address, all following words on the same line are treated as hostnames.
Hostname file (format=hostname or format=hostnames)
This is a simpler format. All words that are DNS names with at least two parts are considered hostnames.

In both hosts files and hostname files, a line whose first nonblank character is the pound sign (#) is a comment and thus ignored.
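The two formats could be read roughly as follows. This sketch makes some assumptions the specification does not state: a "DNS name" is checked with a conventional letters, digits, and hyphens pattern, and a word that parses as an IP address is not counted as a hostname. The function names are hypothetical.

```python
import ipaddress
import re

# One DNS label: 1 to 63 letters, digits, or hyphens,
# not starting or ending with a hyphen.
_LABEL_RE = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")

def _is_ip(word):
    try:
        ipaddress.ip_address(word)
    except ValueError:
        return False
    return True

def _is_hostname(word):
    # Assumption: an IP address such as 10.0.0.1 is not a hostname.
    if _is_ip(word) or len(word) > 253:
        return False
    labels = word.rstrip(".").split(".")
    return len(labels) >= 2 and all(_LABEL_RE.match(l) for l in labels)

def parse_hosts_lines(lines):
    """Yield (ip, hostname) pairs from lines in hosts format."""
    for line in lines:
        words = line.split()
        if not words or words[0].startswith("#"):
            continue  # blank line or comment
        if _is_ip(words[0]):
            # As specified, every following word is taken as a hostname.
            for hostname in words[1:]:
                yield words[0], hostname

def parse_hostname_lines(lines):
    """Yield hostnames from lines in hostname format."""
    for line in lines:
        words = line.split()
        if not words or words[0].startswith("#"):
            continue  # blank line or comment
        for word in words:
            if _is_hostname(word):
                yield word
```

Both functions accept any iterable of strings, so an open text file can be passed directly.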

Actions

The tool can do any of several things with the data loaded from a file:

Hosts (action=)
For hosts files, include these hostnames literally in the output. For hostname files, treat as blacklist. This is the default.
Blacklist (action=blacklist, action=map-to, or map-to=)
Map all hostnames in this source to the same host, such as 0.0.0.0. This is the default for hostname files.
Resolve (action=resolve or action=whitelist)
Look up each hostname using a recursive resolver, such as Python's socket.getaddrinfo(hostname, 80). This is good for whitelisting your most commonly visited websites.
None (action=none or action=ignore)
Ignore this source.
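The actions above, together with the hosts-per-line option, could be combined roughly as follows. This is a sketch, not the tool's implementation: entries are assumed to be (ip, hostname) pairs with ip set to None for hostname files, an omitted or empty action is taken as the default, unresolvable hostnames are silently skipped, and all function names are hypothetical.

```python
import socket

def resolve_hostname(hostname):
    """Return the first address from socket.getaddrinfo, or None."""
    try:
        infos = socket.getaddrinfo(hostname, 80)
    except socket.gaierror:
        return None
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    return infos[0][4][0] if infos else None

def apply_action(entries, action, map_to="0.0.0.0",
                 resolver=resolve_hostname):
    """Turn (ip_or_None, hostname) entries into (ip, hostname) pairs.

    A caller would pass action="map-to" when a source sets map-to=.
    """
    if action in ("none", "ignore"):
        return []
    if action in ("resolve", "whitelist"):
        resolved = ((resolver(name), name) for _, name in entries)
        # Assumption: hostnames that fail to resolve are dropped.
        return [(ip, name) for ip, name in resolved if ip is not None]
    if action in ("blacklist", "map-to"):
        return [(map_to, name) for _, name in entries]
    if action in (None, ""):
        # Default: keep addresses from hosts files; entries without
        # an address (hostname files) are treated as a blacklist.
        return [(ip if ip is not None else map_to, name)
                for ip, name in entries]
    raise ValueError(f"unknown action: {action!r}")

def format_lines(pairs, hosts_per_line=1):
    """Group hostnames by IP address and emit hosts file lines."""
    by_ip = {}
    for ip, name in pairs:
        by_ip.setdefault(ip, []).append(name)
    lines = []
    for ip, names in by_ip.items():
        for i in range(0, len(names), hosts_per_line):
            lines.append(ip + " " + " ".join(names[i:i + hosts_per_line]))
    return lines
```

Passing the resolver as a parameter keeps the lookup replaceable, for example with a cached or stubbed function during testing.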