A common issue when blocking or allowing access to a site (for example, www.example.newsite.com) is that the site depends on additional underlying domains, which also need to be accounted for.
If you wanted to block/allow www.sfgate.com for instance, it would be great to just put sfgate.com in a list.
Unfortunately, it's not that simple. If we capture the DNS queries made while loading www.sfgate.com, we see that the following DNS records are needed to completely render the site:
Most websites pull some of their content from other sites, and those sites may fall into a category you have blocked. When that happens, elements of the page won't load properly, or the page will render without formatting.
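To illustrate why a single list entry is often not enough, here is a minimal sketch in Python. It assumes an allow list that matches a domain and all of its subdomains; your filtering product may match differently. The hostnames are placeholders and the function names are made up for this example.

```python
def is_allowed(host, allowed_domains):
    """Return True if host equals an allowed domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


def blocked_hosts(required_hosts, allowed_domains):
    """List the hosts a page needs that the allow list does not cover."""
    # Normalize list entries the same way hosts are normalized above.
    allowed = {d.lower().rstrip(".") for d in allowed_domains}
    return sorted(h for h in required_hosts if not is_allowed(h, allowed))


# Placeholder hostnames standing in for what a real page might request.
required = [
    "www.example.com",
    "static.example.com",
    "cdn.example.net",
    "fonts.example.org",
]

# Only example.com is on the allow list, so the other two domains are missed.
print(blocked_hosts(required, ["example.com"]))
# → ['cdn.example.net', 'fonts.example.org']
```

Every host in the output would need its own entry (or a covering parent domain) before the page renders completely.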
An easy way to find the required domains is to use Google Chrome's DNS prefetch tool, which logs your queries.
More detailed information about the feature is available at http://www.chromium.org/developers/design-documents/dns-prefetching
Once the feature is turned on in your browser (it's most likely on for you right now), you can visit the site you want to collect information about.
After the site has completely rendered (all elements are downloaded), you can enter the following into the URL bar in the browser:
Now use CTRL-F to find the domain you are looking for. I've used www.bostonglobe.com as an example.
These entries would be required to allow this site to render completely:
This issue is most prevalent in allow-only mode and in policies that make heavy use of content filtering. When everything is blocked by default, it takes a bit of work to get access just right.
Other methods of identifying these domains include packet captures (collected by tools such as Wireshark), collecting HAR files, or using websites such as webpagetest.org.
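For the HAR approach, the list of hostnames can be pulled out with a short script. This is a minimal sketch: it assumes a standard HAR export (JSON with request URLs under log.entries[].request.url), and the function names and the file name in the usage note are hypothetical.

```python
import json
from urllib.parse import urlsplit


def hosts_from_har(har):
    """Return the sorted, unique hostnames requested in a parsed HAR document."""
    hosts = set()
    for entry in har.get("log", {}).get("entries", []):
        url = entry.get("request", {}).get("url", "")
        # data: and blob: URLs have no hostname and are skipped.
        host = urlsplit(url).hostname
        if host:
            hosts.add(host.lower())
    return sorted(hosts)


def hosts_from_har_file(path):
    """Load a HAR file from disk and return its unique hostnames."""
    # utf-8-sig tolerates a leading byte-order mark, which some exports include.
    with open(path, encoding="utf-8-sig") as fh:
        return hosts_from_har(json.load(fh))
```

For example, after saving a capture as page.har (a hypothetical file name), running `print("\n".join(hosts_from_har_file("page.har")))` prints one hostname per line, ready to compare against your allow or block list.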