Friday, August 19, 2005

Managing DNS zone files with dnspython

I've been using dnspython lately for transferring some DNS zone files from one name server to another. I found the package extremely useful, but poorly documented, so I decided to write this post as a mini-tutorial on using dnspython.

Running DNS queries

This is one of the things that's clearly spelled out on the Examples page. Here's how to run a DNS query to get the mail servers (MX records) for dnspython.org:

import dns.resolver

answers = dns.resolver.query('dnspython.org', 'MX')
for rdata in answers:
    print 'Host', rdata.exchange, 'has preference', rdata.preference

To run other types of queries, for example for IP addresses (A records) or name servers (NS records), replace MX with the desired record type (A, NS, etc.).

Reading a DNS zone from a file

In dnspython, a DNS zone is available as a Zone object. Assume you have the following DNS zone file called db.example.com:

$TTL 36000
example.com. IN SOA ns1.example.com. hostmaster.example.com. (
        2005081201 ; serial
        28800      ; refresh (8 hours)
        1800       ; retry (30 mins)
        2592000    ; expire (30 days)
        86400 )    ; minimum (1 day)

example.com. 86400 NS ns1.example.com.
example.com. 86400 NS ns2.example.com.
example.com. 86400 MX 10 mail.example.com.
example.com. 86400 MX 20 mail2.example.com.
example.com. 86400 A 192.168.10.10
ns1.example.com. 86400 A 192.168.1.10
ns2.example.com. 86400 A 192.168.1.20
mail.example.com. 86400 A 192.168.2.10
mail2.example.com. 86400 A 192.168.2.20
www2.example.com. 86400 A 192.168.10.20
www.example.com. 86400 CNAME example.com.
ftp.example.com. 86400 CNAME example.com.
webmail.example.com. 86400 CNAME example.com.

To have dnspython read this file into a Zone object, you can use this code:

import dns.zone
from dns.exception import DNSException

domain = "example.com"
print "Getting zone object for domain", domain
zone_file = "db.%s" % domain

try:
    zone = dns.zone.from_file(zone_file, domain)
    print "Zone origin:", zone.origin
except DNSException, e:
    print e.__class__, e

A zone can be viewed as a dictionary mapping names to nodes. By default, dnspython represents names relative to the 'origin' of the zone. In our zone file, 'example.com' is the origin of the zone, and it gets the special name '@'. A name such as www.example.com is exposed by default as 'www'.
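To make the relative naming concrete, here is a minimal plain-Python sketch of the idea. It is only an illustration, not dnspython's implementation (dnspython's own Name objects do this for you); the helper name relativize_name is hypothetical, and it assumes it is given fully qualified names.

```python
def relativize_name(name, origin):
    """Return a fully qualified `name` as seen relative to `origin`.

    Illustrative only: mimics how names appear in the zone by default.
    """
    # Normalize: DNS names are case-insensitive, trailing dot is optional here
    name = name.rstrip(".").lower()
    origin = origin.rstrip(".").lower()
    if name == origin:
        # The origin itself gets the special name '@'
        return "@"
    suffix = "." + origin
    if name.endswith(suffix):
        # Strip the origin: 'www.example.com' becomes 'www'
        return name[: -len(suffix)]
    # Names outside the zone stay absolute (with a trailing dot)
    return name + "."
```

With the origin 'example.com', the name 'www.example.com.' comes back as 'www' and 'example.com.' comes back as '@', which matches the names used in the rest of this post.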

A name corresponds to a node, and a node contains a collection of record datasets, or rdatasets. A record dataset contains all the records of a given type. In our example, the '@' node corresponding to the zone origin contains 4 rdatasets, one for each record type that we have: SOA, NS, MX and A. The NS rdataset contains a set of rdatas, which are the individual records of type NS. The rdata class has subclasses for all the possible record types, and each subclass contains information specific to that record type.

Enough talking, here is some code that will hopefully make the previous discussion a bit clearer:

import dns.zone
from dns.exception import DNSException
from dns.rdataclass import *
from dns.rdatatype import *

domain = "example.com"
print "Getting zone object for domain", domain
zone_file = "db.%s" % domain

try:
    zone = dns.zone.from_file(zone_file, domain)
    print "Zone origin:", zone.origin
    for name, node in zone.nodes.items():
        rdatasets = node.rdatasets
        print "\n**** BEGIN NODE ****"
        print "node name:", name
        for rdataset in rdatasets:
            print "--- BEGIN RDATASET ---"
            print "rdataset string representation:", rdataset
            print "rdataset rdclass:", rdataset.rdclass
            print "rdataset rdtype:", rdataset.rdtype
            print "rdataset ttl:", rdataset.ttl
            print "rdataset has following rdata:"
            for rdata in rdataset:
                print "-- BEGIN RDATA --"
                print "rdata string representation:", rdata
                if rdataset.rdtype == SOA:
                    print "** SOA-specific rdata **"
                    print "expire:", rdata.expire
                    print "minimum:", rdata.minimum
                    print "mname:", rdata.mname
                    print "refresh:", rdata.refresh
                    print "retry:", rdata.retry
                    print "rname:", rdata.rname
                    print "serial:", rdata.serial
                if rdataset.rdtype == MX:
                    print "** MX-specific rdata **"
                    print "exchange:", rdata.exchange
                    print "preference:", rdata.preference
                if rdataset.rdtype == NS:
                    print "** NS-specific rdata **"
                    print "target:", rdata.target
                if rdataset.rdtype == CNAME:
                    print "** CNAME-specific rdata **"
                    print "target:", rdata.target
                if rdataset.rdtype == A:
                    print "** A-specific rdata **"
                    print "address:", rdata.address
except DNSException, e:
    print e.__class__, e

When run against db.example.com, the code above produces this output.

Modifying a DNS zone file

Let's see how to add, delete and change records in our example.com zone file. dnspython offers several different ways to get to a record if you know its name or its type.

Here's how to modify the SOA record and increase its serial number, a very common operation for anybody who maintains DNS zones. I use the iterate_rdatas method of the Zone class, which is handy in this case, since we know that the rdataset actually contains one rdata of type SOA:
   
for (name, ttl, rdata) in zone.iterate_rdatas(SOA):
    serial = rdata.serial
    new_serial = serial + 1
    print "Changing SOA serial from %d to %d" %(serial, new_serial)
    rdata.serial = new_serial
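
The serial in db.example.com (2005081201) looks like the common YYYYMMDDnn convention, where the last two digits count the changes made on a given day. If you follow that convention, adding 1 is only right when the last change was made today. Here is a minimal sketch of a date-aware alternative; the helper name next_serial is hypothetical, and the sketch assumes the YYYYMMDDnn scheme and does not handle more than 99 changes in one day.

```python
import datetime


def next_serial(current, today=None):
    """Return the next SOA serial, assuming the YYYYMMDDnn convention."""
    if today is None:
        today = datetime.date.today()
    # First serial of the day, e.g. 2005081901 for August 19, 2005
    first_of_day = int(today.strftime("%Y%m%d")) * 100 + 1
    # Never go backwards: if the current serial is already at or past
    # today's first value, fall back to a plain increment
    return max(current + 1, first_of_day)
```

The result could then be assigned in place of new_serial in the loop above.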


Here's how to delete a record by its name. I use the delete_node method of the Zone class:

node_delete = "www2"
print "Deleting node", node_delete
zone.delete_node(node_delete)

Here's how to change attributes of existing records. I use the find_rdataset method of the Zone class, which returns an rdataset containing the records I want to change. In the first section of the following code, I'm changing the IP address of 'mail', and in the second section I'm changing the TTL for all the NS records corresponding to the zone origin '@':

A_change = "mail"
new_IP = "192.168.2.100"
print "Changing A record for", A_change, "to", new_IP
rdataset = zone.find_rdataset(A_change, rdtype=A)
for rdata in rdataset:
    rdata.address = new_IP

rdataset = zone.find_rdataset("@", rdtype=NS)
new_ttl = rdataset.ttl / 2
print "Changing TTL for NS records to", new_ttl
rdataset.ttl = new_ttl

Here's how to add records to the zone file. The find_rdataset method can be used in this case too, with the create parameter set to True, in which case it creates a new rdataset if it doesn't already exist. Individual rdata objects are then created by instantiating their corresponding classes with the correct parameters -- such as rdata = dns.rdtypes.IN.A.A(IN, A, address="192.168.10.30").

I show here how to add records of type A, CNAME, NS and MX:

A_add = "www3"
print "Adding record of type A:", A_add
rdataset = zone.find_rdataset(A_add, rdtype=A, create=True)
rdata = dns.rdtypes.IN.A.A(IN, A, address="192.168.10.30")
rdataset.add(rdata, ttl=86400)

CNAME_add = "www3_alias"
target = dns.name.Name(("www3",))
print "Adding record of type CNAME:", CNAME_add
rdataset = zone.find_rdataset(CNAME_add, rdtype=CNAME, create=True)
rdata = dns.rdtypes.ANY.CNAME.CNAME(IN, CNAME, target)
rdataset.add(rdata, ttl=86400)

A_add = "ns3"
print "Adding record of type A:", A_add
rdataset = zone.find_rdataset(A_add, rdtype=A, create=True)
rdata = dns.rdtypes.IN.A.A(IN, A, address="192.168.1.30")
rdataset.add(rdata, ttl=86400)

NS_add = "@"
target = dns.name.Name(("ns3",))
print "Adding record of type NS:", NS_add
rdataset = zone.find_rdataset(NS_add, rdtype=NS, create=True)
rdata = dns.rdtypes.ANY.NS.NS(IN, NS, target)
rdataset.add(rdata, ttl=86400)

A_add = "mail3"
print "Adding record of type A:", A_add
rdataset = zone.find_rdataset(A_add, rdtype=A, create=True)
rdata = dns.rdtypes.IN.A.A(IN, A, address="192.168.2.30")
rdataset.add(rdata, ttl=86400)

MX_add = "@"
exchange = dns.name.Name(("mail3",))
preference = 30
print "Adding record of type MX:", MX_add
rdataset = zone.find_rdataset(MX_add, rdtype=MX, create=True)
rdata = dns.rdtypes.ANY.MX.MX(IN, MX, preference, exchange)
rdataset.add(rdata, ttl=86400)

Finally, after modifying the zone file via the zone object, it's time to write it back to disk. This is easily accomplished with dnspython via the to_file method. I chose to write the modified zone to a new file, so that I have my original zone available for other tests:

new_zone_file = "new.db.%s" % domain
print "Writing modified zone to file %s" % new_zone_file
zone.to_file(new_zone_file)

The new zone file looks something like this (note that all names have been relativized from the origin):

@ 36000 IN SOA ns1 hostmaster 2005081202 28800 1800 2592000 86400
@ 43200 IN NS ns1
@ 43200 IN NS ns2
@ 43200 IN NS ns3
@ 86400 IN MX 10 mail
@ 86400 IN MX 20 mail2
@ 86400 IN MX 30 mail3
@ 86400 IN A 192.168.10.10
ftp 86400 IN CNAME @
mail 86400 IN A 192.168.2.100
mail2 86400 IN A 192.168.2.20
mail3 86400 IN A 192.168.2.30
ns1 86400 IN A 192.168.1.10
ns2 86400 IN A 192.168.1.20
ns3 86400 IN A 192.168.1.30
webmail 86400 IN CNAME @
www 86400 IN CNAME @
www3 86400 IN A 192.168.10.30
www3_alias 86400 IN CNAME www3

Although it looks much different from the original db.example.com file, this file is also a valid DNS zone -- I tested it by having my DNS server load it.

Obtaining a DNS zone via a zone transfer

This is also easily done in dnspython via the from_xfr function of the zone module. Here's how to do a zone transfer for dnspython.org, trying all the name servers for that domain one by one:

import dns.resolver
import dns.query
import dns.zone
from dns.exception import DNSException
from dns.rdataclass import *
from dns.rdatatype import *

domain = "dnspython.org"
print "Getting NS records for", domain
answers = dns.resolver.query(domain, 'NS')
ns = []
for rdata in answers:
    n = str(rdata)
    print "Found name server:", n
    ns.append(n)

for n in ns:
    print "\nTrying a zone transfer for %s from name server %s" % (domain, n)
    try:
        zone = dns.zone.from_xfr(dns.query.xfr(n, domain))
    except DNSException, e:
        print e.__class__, e


Once we obtain the zone object, we can then manipulate it in exactly the same way as when we obtained it from a file.

Various ways to iterate through DNS records

Here are some other snippets of code that show how to iterate through records of different types assuming we retrieved a zone object from a file or via a zone transfer:

print "\nALL 'IN' RECORDS EXCEPT 'SOA' and 'TXT':"
for name, node in zone.nodes.items():
    rdatasets = node.rdatasets
    for rdataset in rdatasets:
        if rdataset.rdclass != IN or rdataset.rdtype in [SOA, TXT]:
            continue
        print name, rdataset

print "\nGET_RDATASET('A'):"
for name, node in zone.nodes.items():
    rdataset = node.get_rdataset(rdclass=IN, rdtype=A)
    if not rdataset:
        continue
    # Iterate over the individual A records without shadowing the rdataset
    for rdata in rdataset:
        print name, rdata

print "\nITERATE_RDATAS('A'):"
for (name, ttl, rdata) in zone.iterate_rdatas('A'):
    print name, ttl, rdata

print "\nITERATE_RDATAS('MX'):"
for (name, ttl, rdata) in zone.iterate_rdatas('MX'):
    print name, ttl, rdata

print "\nITERATE_RDATAS('CNAME'):"
for (name, ttl, rdata) in zone.iterate_rdatas('CNAME'):
    print name, ttl, rdata

You can find the code referenced in this post in these two modules: zonemgmt.py and zone_transfer.py.

Monday, August 08, 2005

Agile documentation in the Django project

A while ago I wrote a post called "Agile documentation with doctest and epydoc". The main idea was to use unit tests as "executable documentation"; I showed in particular how combining doctest-based unit tests with a documentation system such as epydoc can result in up-to-date documentation that is synchronized with the code. This type of documentation not only shows the various modules, classes, methods, functions and variables exposed by the code, but -- more importantly -- it also provides examples of how the code API gets used in "real life" via the unit tests.

I'm happy to see the Django team take a similar approach in their project. They announced on the project blog that API usage examples for Django models are available and are automatically generated from the doctest-based unit tests written for the model functionality. For example, a test module such as tests/testapp/models/basic.py gets automatically rendered into the 'Bare-bones model' API usage page. The basic.py file contains almost exclusively doctests in the form of a string called API_TESTS. The rest of the file contains some simple markers that are interpreted into HTML headers and such. Nothing fancy, but the result is striking.
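
To give a feel for the mechanism, here is a minimal sketch of doctests kept in a string and run with the standard doctest module. This is not Django's actual test runner or rendering code: only the API_TESTS name comes from the Django test modules, while the sample tests and the run_api_tests helper are assumptions made up for illustration.

```python
import doctest

# Doctests kept in a plain string, in the style of an API_TESTS string
API_TESTS = """
>>> zone = {}
>>> zone["www"] = "192.168.10.10"
>>> zone["www"]
'192.168.10.10'
>>> sorted(zone)
['www']
"""


def run_api_tests(text=API_TESTS):
    """Run the doctests found in `text`; return (failed, attempted) counts."""
    # Parse the string into a DocTest object with an empty namespace
    test = doctest.DocTestParser().get_doctest(text, {}, "API_TESTS", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted
```

Because the same string is both executed and readable as a session transcript, it can double as usage documentation, which is the point of the approach.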

I wish more projects would adopt this style of automatically generating documentation for their APIs from their unit test code. It can only help speed up their adoption. As an example, I wish the dnspython project had more examples of how to use the API it offers. That project does have epydoc-generated documentation, but if it also showed how the API actually gets used (via unit tests preferably), it would help its users avoid a lot of hair-pulling. Don't get me wrong, I think dnspython offers an incredibly useful API and I intend to post about some of my experiences using it, but it does require you to dig and sweat in order to uncover all its intricacies.

Anyway, kudos to the Django team for getting stuff right.

Monday, August 01, 2005

White-box vs. black-box testing

As I mentioned in my previous post, there's an ongoing discussion on the agile-testing mailing list on the merits of white-box vs. black-box testing. I had a lively exchange of opinions on this theme with Ron Jeffries. If you read my "Quick black-box testing example" post, you'll see the example of an application under test posted by Ron, as well as a list of black-box test activities and scenarios that I posted in reply. Ron questioned most of these black-box test scenarios, on the grounds that they provide little value to the overall testing process. In fact, I came away with the conclusion that Ron values black-box testing very little. He is of the opinion that white-box testing in the form of TDD is pretty much sufficient for the application to be rock-solid and as bug-free as any piece of software can hope to be.

I never had the chance to work on an agile team, so I can't really tell if Ron's assertion is true or not. But my personal opinion is that there is no way developers doing TDD can catch several classes of bugs that are outside of their code-only realm. I'm thinking most of all about the various quality criteria categories, also known as 'ilities', popularized by James Bach. Here are some of them: usability, installability, compatibility, supportability, maintainability, portability, localizability. All these are qualities that are very hard to test in a white-box way. They all involve interactions with the operating system, with the hardware, with the other applications running on the machine hosting the AUT. To this list I would add performance/stress/load testing, security testing, error recoverability testing. I don't see how you can properly test all these things if you don't do black-box testing in addition to white-box type testing.

In fact, there's an important psychological distinction between developers doing TDD and 'traditional' testers doing mostly black-box testing. A developer thinks "This is my code. It works fine. In fact, I'm really proud of it.", while a tester is more likely to think "This code has some really nasty bugs. Let me discover them before our customer does." These two approaches are complementary. You can't perform just one at the expense of the other, or else your overall code quality will suffer. You need to build code with pride before you try to break it in various devious ways.

Here's one more argument from Ron as to why white-box testing is more valuable than black-box testing:

To try to simplify: the search method in question has been augmented with an integer "hint" that is used to say where in the large table we should start our search. The idea is that by giving a hint, it might speed up the search, but the search must always work even if the hint is bad.

The question I was asking was how we would test the hinting aspect.

I expect questions to arise such as those Michael Bolton would suggest, including perhaps:

What if the hint is negative?
What if the hint is after the match?
What if the hint is bigger than the size of the table?
What if integers are actually made of cheese?
What if there are more records in the table than a 32-bit int?

Then, I propose to display the code, which will include, at the front, some lines like this:

if (hint < 1) hint = 0;
if (hint > table.size) hint = 0;

Then, I propose to point out that if we know that code is there, there are a couple of tests we can save. Therefore white box testing can help make testing more efficient, QED.

My counter-argument was this: what if you mistakenly build a new release of your software out of some old revision of the source code, a revision which doesn't contain the first 2 lines of the search method? Presumably the old version of the code was TDD-ed, but since the 2 lines weren't there, we didn't have unit tests for them either. So if you didn't have black-box tests exercising those values of the hint argument, you'd let an important bug escape out in the wild. I don't think it's that expensive to create automated tests that verify the behavior of the search method with various well-chosen values of the hint argument. Having such a test harness in place goes a long way in protecting against admittedly weird situations such as the 'old code revision' I described.
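
A minimal sketch of such a harness, in Python. The search function below is a hypothetical stand-in (a linear scan that starts at the hint and wraps around), not Ron's actual code, and its 0-based guard differs slightly from the two lines quoted above. The checks treat the function as a black box: whatever the hint, the answer must be the same.

```python
def search(table, target, hint=0):
    """Return the index of target in table, or -1 if it is absent.

    hint is only a suggested starting index; a bad hint must not
    break the search.
    """
    size = len(table)
    # Guard clause: fall back to the start on an out-of-range hint
    if hint < 0 or hint >= size:
        hint = 0
    # Scan from the hint to the end, then wrap around to cover the rest
    for offset in range(size):
        index = (hint + offset) % size
        if table[index] == target:
            return index
    return -1


def check_hints():
    """Black-box checks: the result must not depend on the hint."""
    table = ["alpha", "bravo", "charlie", "delta", "echo"]
    # Negative, valid, past-the-match, past-the-end and huge hints
    hints = [-1, 0, 1, 2, 4, 5, 1000, 2 ** 40]
    for hint in hints:
        assert search(table, "charlie", hint) == 2
        assert search(table, "zulu", hint) == -1
        assert search([], "charlie", hint) == -1
    return len(hints)
```

If a build accidentally dropped the guard clause, the negative and oversized hints in check_hints are the kind of values that would give the black-box tests a chance to notice.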

In fact, as Amir Kolsky remarked on the agile-testing list, TDD itself can be seen as black-box testing, since when we unit test some functionality, we usually test the behavior of that piece of code and not its implementation, thus we're not really doing white-box testing. To this, Ron Jeffries and Ilja Preuss replied that in TDD, you write the next test with an eye on the existing code. In fact, you write the next test so that the next piece of functionality for the existing code fails. Then you make it pass, and so on. So you're really looking at both the internal implementation of the code and at its interfaces, as exposed in your unit tests. At this point, it seems to me that we're splitting hairs. Maybe we should talk about developer/code producer testing vs. non-developer/code consumer testing. In fact, I just read this morning a very nice blog post from Jonathan Kohl on a similar topic: "Testing an application in layers". Jonathan talks about manual vs. automated testing (another hotly debated topic on the agile-testing mailing list), but many of the ideas in his post can be applied to the white-box vs. black-box discussion.
