Monday, June 13, 2016

NAMED - end of the story


So, one week of full-load flight with the new recursor is finished.

Some results:
 - no software-related problems, except that once we experienced some kind of attack during which DNS traffic tripled. According to the maintenance team's report, the attack was aimed at the authoritative servers. Since they reside on the same hardware as the recursors, this caused significant system degradation, supposedly because named was saturating both CPUs. Unfortunately, no real debugging or analysis is possible now.
 - CPU load reaches 25-30% at peak time, and PowerDNS is able to use both CPU cores without process blocking

Some setup details: one cache, but with all the logic put into two scripts, nxdomain and preresolve. Some auth functions, related to giving different answers to different internal networks, are put in the preresolve script.

Local domains and RFC 1918 (grey) networks are forwarded to auth directly, as the root servers have no idea about their delegation (actually, they are site-specific zones); some blacklisted zones are also processed in preresolve. The scripts are written in Lua, a C/Perl-like language that is pretty simple and easy to understand. According to tests, even complicated lookups in Lua are much faster and more efficient than doing real lookups (blacklisted zone).
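To illustrate the blacklist idea, here is a minimal sketch of the kind of check a preresolve hook can do. It is written in Python rather than Lua purely for illustration and is not our production script; the zone names, `is_blacklisted`, and the `"NXDOMAIN"` marker are all hypothetical.

```python
# Hypothetical blacklist; these zone names are placeholders, not real config.
BLACKLISTED_ZONES = {"ads.example.", "tracker.example."}


def is_blacklisted(qname):
    """Return True if qname equals or falls under a blacklisted zone."""
    labels = qname.lower().rstrip(".").split(".")
    # Check every suffix of the name: "a.b.c." -> "a.b.c.", "b.c.", "c."
    for i in range(len(labels)):
        if ".".join(labels[i:]) + "." in BLACKLISTED_ZONES:
            return True
    return False


def preresolve(qname):
    """Answer blacklisted names locally; return None to let recursion proceed."""
    if is_blacklisted(qname):
        return "NXDOMAIN"  # hypothetical marker for a locally generated answer
    return None
```

The point is that a blacklisted name costs a few set lookups in memory and no network round trip, which is presumably why it tests so much faster than a real lookup.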

Problems: there always are some. The only problem is that the most recent version of pdns-recursor doesn't do round-robin DNS balancing correctly, which overloads some servers. The previous version worked fine, and we have left it in production for now.

The other thing: pdns-recursor also trims UDP responses, so we cannot answer with 40-50 server pools. BUT because of the preresolve script we don't need that anymore: the problematic pool with ~50 servers is divided among 6 networks, and in Lua we can answer with only those servers which are supposed to serve a given network segment. In comparison, BIND only allowed site-specific sorting of the pools, returning some servers first, but the whole pool was still in the answer.
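The per-segment selection can be sketched roughly like this. Again it is Python for illustration only, not the actual Lua script; the networks, server addresses, and the names `POOL_BY_NETWORK`, `DEFAULT_POOL` and `servers_for_client` are made up for the example.

```python
import ipaddress

# Hypothetical mapping of client networks to the pool members serving them.
# All addresses are placeholders, not our real layout.
POOL_BY_NETWORK = {
    ipaddress.ip_network("10.1.0.0/16"): ["192.0.2.1", "192.0.2.2"],
    ipaddress.ip_network("10.2.0.0/16"): ["192.0.2.11", "192.0.2.12"],
}
# Assumed fallback for clients outside every listed network.
DEFAULT_POOL = ["192.0.2.250"]


def servers_for_client(client_ip):
    """Return only the pool members meant for the client's network segment."""
    addr = ipaddress.ip_address(client_ip)
    for network, servers in POOL_BY_NETWORK.items():
        if addr in network:
            return servers
    return DEFAULT_POOL
```

With ~50 servers split among 6 networks, each answer carries only a handful of records, so it stays small enough that the UDP trimming no longer matters.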


Overall: pdns-recursor is a really nice upgrade, with very low memory requirements, simple and efficient. Highly recommended for high recursor loads (5-20Kps of DNS traffic).