Some Umbrella customers using Roaming Clients and/or Virtual Appliances have noticed issues with port exhaustion in firewalls that use Port Address Translation. This is most likely in environments that have a large number of Roaming Clients and/or a high volume of traffic running through the VAs. Symptoms can include DNS queries returning slowly or timing out.
Neither Roaming Clients nor Virtual Appliances cache answers to DNS queries, so every query leaves the network and consumes its own PAT translation on the firewall. Furthermore, Roaming Clients send frequent "probe" DNS requests to analyze the networking environment and to act as health checks.
- Ensure that your Internal Domains are properly configured within Domain Management on your Umbrella dashboard. They should contain your Active Directory zone (and/or other internal zones) in order to reduce the volume of high-frequency queries.
- Review some of the PAT settings on the firewall:
- A long UDP session timeout can be an issue, because each translation holds its port until the timeout expires. We typically recommend UDP session timeouts of ~15 seconds. However, if other applications on your network use UDP heavily, they may need longer timeouts, which you should take into account.
- Depending on your firewall, it may be possible to increase the size of its PAT pool in order to increase the number of simultaneous connections.
- If you have IP addresses that you can dedicate to the VAs, use 1:1 NAT instead of PAT on the firewall. Note: "1:1 NAT" is sometimes referred to as "Direct NAT", but "1:1 NAT" is the correct technical term.
- Review your per-IP connection limits. Often, a policy not expected to apply to the device in question is indeed applying a limit. See the next section for how to confirm.
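To see why the UDP timeout and the PAT pool size matter, the steady-state load can be estimated with some rough arithmetic: the number of translations held open is approximately the query rate multiplied by the timeout. The sketch below is illustrative only. It assumes each DNS query uses a fresh source port that is held for the full timeout, and that a PAT address offers about 64,000 usable ports; both figures and the 2,000 queries-per-second rate are assumptions, not values from any particular firewall.

```python
import math

# Assumption: roughly 64,000 usable source ports per PAT address. The real
# figure depends on the firewall and on its reserved port ranges.
USABLE_PORTS_PER_PAT_IP = 64_000


def concurrent_translations(queries_per_second: float, udp_timeout_s: float) -> float:
    """Steady-state count of UDP translations held open (rate x hold time)."""
    return queries_per_second * udp_timeout_s


def pat_addresses_needed(queries_per_second: float, udp_timeout_s: float,
                         ports_per_ip: int = USABLE_PORTS_PER_PAT_IP) -> int:
    """Smallest PAT pool size whose ports cover the steady-state load."""
    held = concurrent_translations(queries_per_second, udp_timeout_s)
    return max(1, math.ceil(held / ports_per_ip))


if __name__ == "__main__":
    rate = 2_000  # hypothetical DNS queries per second leaving the VAs
    for timeout in (120, 30, 15):
        held = concurrent_translations(rate, timeout)
        needed = pat_addresses_needed(rate, timeout)
        print(f"timeout={timeout:>3}s  held={held:>8.0f}  pat_ips={needed}")
```

Under these assumptions, cutting the timeout from 120 seconds to 15 seconds reduces the translations held open from 240,000 to 30,000, which fits within a single PAT address.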
Check for per-IP connection limits on an ASA
Follow the steps below:
- Configure the ASA with a capture to see why packets are being dropped by the firewall:
- capture asp type asp-drop all match ip any host 18.104.22.168
- Look for packets being dropped for the IP in question. A connection limit reason will appear as "Drop-reason: (conn-limit)"
- Examine the host connection limit by using the command
- show local-host detail | begin <IP Address of VA or roaming client>
- Does the connection count stay fixed at a certain value (e.g. 999) and never increase? If so, this indicates a connection limit.
- Check for a service-policy that is applying this; if you find it, check its policy-map.
- show run service-policy, show policy-map NAME
- If you find a policy-map "NAME" that sets the per-host connection limit to 1000 (for example), any new DNS packets from the device will be dropped until more connections are available. UDP is connectionless and does not retransmit, so a dropped query is lost unless the client itself retries.
- To resolve, remove that service-policy (for example, no service-policy NAME interface inside, if it is applied to the inside interface). Connections should then start exceeding the 1K limit from our example. This occurs more quickly for a VA than for a roaming client.
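The "static at a certain limit" check above can be sketched in code. The example assumes you have noted the host's connection count from several successive runs of the show local-host command and entered the numbers by hand; the function name, the sample values, and the thresholds are hypothetical, and nothing here parses ASA output.

```python
from typing import Sequence


def looks_capped(samples: Sequence[int], min_samples: int = 5, tolerance: int = 1) -> bool:
    """Return True when the latest samples all sit at the observed ceiling.

    A count that hovers at its maximum and never exceeds it suggests a
    per-host connection limit. This is a heuristic, not proof: a host with
    steady, low traffic can produce the same pattern.
    """
    if len(samples) < min_samples:
        return False  # not enough observations to judge
    recent = samples[-min_samples:]
    ceiling = max(samples)
    return all(ceiling - value <= tolerance for value in recent)


if __name__ == "__main__":
    # Hypothetical counts recorded a few seconds apart.
    pinned = [400, 820, 999, 999, 1000, 999, 999, 999]
    healthy = [400, 520, 610, 480, 700, 650, 590]
    print("pinned: ", looks_capped(pinned))
    print("healthy:", looks_capped(healthy))
```

If the counts look capped, confirm with the asp-drop capture and the service-policy check before changing anything.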
If the above doesn't help, then a possible workaround would be as follows:
- Use the Umbrella dashboard --> Reporting --> Top Domains report to identify one or more domains that have a large number of requests within the last 24 hours.
- In the Umbrella dashboard --> Configuration --> Domain Management, add one or more of the high-volume domains to the list, setting "Applies to" to "All Appliances and Devices".
- After that, queries for those domains will be forwarded by the VAs to the local DNS servers. Ideally, the local DNS servers should be configured to forward to the Umbrella DNS at 208.67.222.222/208.67.220.220, but they could be configured to forward to any external DNS.
- The local DNS servers will answer queries for any domains for which they are authoritative.
- Presuming the local DNS does accept queries for non-local domains, queries for those other domains will be forwarded to the external DNS.
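The effect of this workaround can be pictured as a simple routing decision: names that fall under an entry in Domain Management go to the local DNS, and everything else goes to Umbrella. The sketch below is a simplified model, not Umbrella's actual matching logic. It assumes entries match the name itself and any subdomain on label boundaries; the function names and example domains are hypothetical.

```python
from typing import AbstractSet


def matches_internal(domain: str, internal_domains: AbstractSet[str]) -> bool:
    """True if the name equals a listed domain or is a subdomain of one."""
    labels = domain.rstrip(".").lower().split(".")
    # Test every suffix on a label boundary: a.b.c, then b.c, then c.
    return any(".".join(labels[i:]) in internal_domains for i in range(len(labels)))


def next_hop(domain: str, internal_domains: AbstractSet[str]) -> str:
    """Where a VA would send the query under this simplified model."""
    return "local DNS" if matches_internal(domain, internal_domains) else "Umbrella"


if __name__ == "__main__":
    # Hypothetical list: one AD zone plus one high-volume domain added as a workaround.
    listed = {"corp.example.com", "cdn.example.net"}
    for name in ("dc01.corp.example.com", "CDN.example.net.", "example.com"):
        print(f"{name:<24} -> {next_hop(name, listed)}")
```

Each high-volume domain moved to the local DNS can then be answered from the local servers' cache where they cache responses, which is what reduces the number of translations the firewall has to hold.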