Strange / Frustrating Caching Problems
For the past few months, I have been trying (unsuccessfully to this
point) to resolve a problem with a trio of caching-only name servers
that we have in place. The general nature of the problem is as follows.
A DHCP client originally gets an IP address on subnet A but at some
point prior to lease expiration moves to subnet B, where it obtains
a new IP address successfully. The problem that I am seeing is that
after the move to subnet B, one or more of our caching-only name
servers are still returning the old IP address when a lookup of
the hostname occurs. This behavior seems reasonable at first glance
since caching only servers should retain the information they have
in cache until the TTL expires and/or the cache is flushed. After
digging into this further, I'm finding that the TTL for the
hosts whose forward lookups are returning the wrong IP is set to
604800 seconds, or 168 hours (7 days). I've determined this by dumping
and viewing the cache.
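
For anyone wanting to repeat the check, here is a minimal sketch of scanning a cache dump for records whose remaining TTL is above the configured default. It assumes the dump is plain master-file style text with lines of the form `owner TTL [IN] TYPE RDATA`, that comment lines start with `;`, and that continuation lines omit the owner name. The function name, the sample data and the hostnames are made up for illustration; the exact dump layout can differ between BIND versions, so treat the parsing as an assumption to verify against your own dump.

```python
# Hypothetical helper: list cached records whose remaining TTL exceeds a threshold.
# Assumes master-file style lines: "owner TTL [IN] TYPE RDATA".

SAMPLE_DUMP = (
    "; answer\n"
    "laptop1.example.com.\t604795\tIN A\t192.0.2.10\n"
    "laptop2.example.com.\t10790\tIN A\t192.0.2.11\n"
)


def find_high_ttl_records(dump_text, threshold):
    """Return (owner, ttl, rtype, rdata) tuples for records with ttl > threshold."""
    hits = []
    owner = None
    for line in dump_text.splitlines():
        # Skip blank lines, comments (";") and directives such as "$DATE".
        if not line.strip() or line.lstrip().startswith((";", "$")):
            continue
        fields = line.split()
        # A line starting with whitespace reuses the previous owner name.
        if not line[0].isspace():
            owner = fields.pop(0)
        if not fields or not fields[0].isdigit():
            continue
        ttl = int(fields.pop(0))
        # The class field is optional in this assumed format.
        if fields and fields[0].upper() == "IN":
            fields.pop(0)
        if not fields or ttl <= threshold:
            continue
        hits.append((owner, ttl, fields[0], " ".join(fields[1:])))
    return hits
```

Calling `find_high_ttl_records(SAMPLE_DUMP, 10800)` on the made-up sample returns only the `laptop1` record. A cached TTL counts down from the value the authoritative server handed out, so anything still above the domain default must have been published with a larger TTL.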
In addition, I've also discovered that the TTL for the reverse record
for the same client is also set to this high value. This behavior
would seem reasonable if this high value was the TTL value configured
for the domain, which is not the case here. We have the default TTL
in our environment set to 10800 seconds, or 3 hours. Thus, I'm a
little baffled as to why the TTL for some of these DHCP clients
is being set to such a high value when other clients have their
TTLs set to the 10800 value configured at the domain level. I've
checked the registration at the object level (in our IP management
application) and the TTL field is blank, thus implying the default
TTL is in place.
Aside from the above details, I can also note that
the problematic lookups seem to involve the same DHCP
clients. The only reason I know about these clients is that they
are unable to SSH to some Unix boxes in a DMZ that restrict access
to hosts that they can perform both forward and reverse lookups
for. In this scenario, the forward lookup is failing since it's
returning the old IP address of the client. When this problem occurs,
it tends to affect one or two of the caching servers but not all
three. Furthermore, it is somewhat random as to which of the 3 servers
are affected.
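
The check the DMZ boxes perform can be sketched roughly as below: reverse-resolve the connecting address, forward-resolve the resulting name, and accept only if the original address is among the answers. This is an illustration of the general idea, not the actual access-control code on those hosts. The function name is hypothetical, and the lookups go through whatever resolver the machine running it is configured to use.

```python
import socket


def forward_confirms_reverse(client_ip,
                             reverse=socket.gethostbyaddr,
                             forward=socket.gethostbyname_ex):
    """Return (ok, hostname, forward_ips) for a forward-confirmed reverse lookup.

    The lookup functions are parameters so the logic can be exercised
    without touching a live resolver.
    """
    try:
        hostname, _aliases, _addrs = reverse(client_ip)
    except OSError:
        # No PTR record (socket.herror and socket.gaierror are OSError subclasses).
        return (False, None, [])
    try:
        _name, _aliases, forward_ips = forward(hostname)
    except OSError:
        return (False, hostname, [])
    # A stale cached A record shows up here: the name resolves,
    # but not to the address the client is connecting from.
    return (client_ip in forward_ips, hostname, forward_ips)
```

With a stale forward record, the reverse lookup of the new address succeeds but the forward lookup still lists only the old address, so the function returns `False` as its first element, which matches the SSH failures described above.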
The caching servers in question are all Solaris
9 running BIND 9.3.2. If anyone can provide some insight here, it
would be much appreciated. I can provide additional information
and/or elaborate on something as needed.
---------------------------------------------------------------------------
Nameservers do what the DHCP servers tell them to do. The TTL is set
by the DHCP server. Try lowering the DHCP lease time, as that
influences the DNS TTL.
---------------------------------------------------------------------------
In an environment where people can wander with their laptops from
subnet to subnet, why do you have caching only name servers?
These name servers should, at least, have the local zones defined
as forward or stub zones to minimize the amount of erroneous data
being returned in a volatile environment.
In a volatile environment, you do not want the DHCP server to set
the TTL to the lease time. I've yet to see a user release the system's
IP address before picking up his laptop and going to his next meeting.
To minimize the impact of this behaviour, define ddns-ttl for each
DHCP pool. The DHCP server will use the value of ddns-ttl for the
TTL when updating DNS. The value of ddns-ttl should be set to the
maximum number of seconds you are willing to accept erroneous DNS
answers.
For this to work correctly, you need to configure the DHCP server
to update both forward and reverse zones and not permit DHCP clients
to update any zone information.
----------------------------------------------------------------------------------------
This is interesting then. We have roughly 10,000
DHCP clients in total here, with only a small handful exhibiting
this high TTL value. The handful could certainly be more than I know
about, but I would have expected to hear of similar problems from
other users. In addition, the same template (of IP settings) is
being applied to the "problematic" clients as to others whose
TTLs are fine. If the behavior is a by-product of the lease time,
why would we not be seeing this behavior on a larger number of clients?
Our standard lease time here is 14 days and has been for some time.
It has only been within the last few months that I've been made
aware of the noted problem. That said, best practice seems to dictate
that the RR TTL for DHCP clients should not exceed 1/3 of the lease time,
which would not be the case here (right at about 50% in some cases).
All this aside though, is there any DHCP option available to more
tightly control the TTL value or is this something that should be
configurable at a more global level? I may also follow-up with the
vendor of my IP Management product since I'm using their DHCP server.
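
To make the numbers concrete, here is a small sketch of the arithmetic, using the figures quoted in this thread (14-day lease, 604800-second problem TTL, 10800-second default) and the 1/3 rule of thumb mentioned above. The function and constant names are made up for illustration.

```python
LEASE_SECONDS = 14 * 24 * 60 * 60   # 14-day standard lease = 1209600 seconds
OBSERVED_TTL = 604800               # TTL seen on the problem records
DEFAULT_TTL = 10800                 # default TTL configured for the domain


def ttl_report(lease_seconds, ttl_seconds):
    """Compare a record TTL against the lease time and the 1/3 rule of thumb."""
    ceiling = lease_seconds // 3
    return {
        "ratio": ttl_seconds / lease_seconds,   # TTL as a fraction of the lease
        "ceiling": ceiling,                     # largest TTL the 1/3 rule allows
        "exceeds": ttl_seconds > ceiling,
    }
```

`ttl_report(LEASE_SECONDS, OBSERVED_TTL)` gives a ratio of 0.5 against a ceiling of 403200 seconds, so the problem records sit at exactly half the lease time, while the 10800-second default is well under the ceiling.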