URL Filter Databases: Overview

Security vendors use URL filter databases to categorize websites and assess their reputation. These databases help block access to malicious or inappropriate content and are a key component of web security solutions.

How URL Filter Databases Work

  1. Categorization: URLs are categorized based on their content. Categories can include topics like news, social media, adult content, gambling, etc.
  2. Web Reputation: URLs are also assessed for their reputation, which involves checking for malicious activity, phishing, malware distribution, etc.
  3. URL Checking Process:
    • New URL: When a new URL is encountered, it is analyzed using various parameters such as domain age, IP address reputation, content analysis, and historical data.
    • Known URL: Known URLs are periodically re-checked against the same parameters used for new URLs, so that their categorization and reputation stay up to date.
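
The new-URL and known-URL paths above can be sketched as a lookup with a re-check interval. This is a minimal illustration, not any vendor's implementation: the Verdict fields, the 0-100 reputation scale, the 24-hour RECHECK_INTERVAL, and the analyze callback are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Verdict:
    category: str      # e.g. "news", "gambling"
    reputation: int    # hypothetical scale: 0 (malicious) to 100 (trusted)
    checked_at: float  # time of the last analysis, in seconds


# Assumed re-check period; real products choose their own schedule.
RECHECK_INTERVAL = 24 * 3600


def lookup(
    url: str,
    db: Dict[str, Verdict],
    analyze: Callable[[str], Tuple[str, int]],
    now: float,
) -> Verdict:
    """Return the verdict for url, analyzing it if it is new or stale."""
    entry = db.get(url)
    is_new = entry is None
    is_stale = entry is not None and now - entry.checked_at >= RECHECK_INTERVAL
    if is_new or is_stale:
        # New and known URLs go through the same analysis step.
        category, reputation = analyze(url)
        entry = Verdict(category, reputation, now)
        db[url] = entry
    return entry
```

The caller supplies analyze, which stands in for the domain age, IP reputation, content, and history checks; within the interval, repeated lookups are answered from the stored verdict.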

False Positives and False Negatives

  • False Positives: Legitimate URLs incorrectly categorized as malicious or inappropriate.
  • False Negatives: Malicious or inappropriate URLs incorrectly categorized as safe.
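
Both error types can be quantified once ground truth is available for a sample of URLs. The sketch below assumes a hypothetical input format, a list of (blocked, actually_bad) pairs, and computes the standard false positive and false negative rates.

```python
from typing import Iterable, Tuple


def error_rates(verdicts: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Compute (false_positive_rate, false_negative_rate).

    Each item is (blocked, actually_bad): what the filter decided,
    and what the URL really was.
    """
    fp = fn = tp = tn = 0
    for blocked, actually_bad in verdicts:
        if blocked and actually_bad:
            tp += 1   # correctly blocked
        elif blocked and not actually_bad:
            fp += 1   # legitimate URL blocked
        elif not blocked and actually_bad:
            fn += 1   # bad URL let through
        else:
            tn += 1   # correctly allowed
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr
```

The false positive rate is taken over all legitimate URLs in the sample, and the false negative rate over all malicious or inappropriate ones, so the two can be compared independently of how the sample is balanced.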

Major Security Vendors and Their URL Databases

Here’s a comparison table of URL databases from major vendors:

Vendor                  Name of URL Database     Products Using This Database
Google                  Safe Browsing            Chrome
Zscaler                 TBD                      Zscaler Internet Access (ZIA)
Netskope                TBD                      Netskope TBD
Microsoft               SmartScreen              Edge and Internet Explorer
Check Point             ThreatCloud              Firewall, Endpoint Security
Palo Alto Networks      PAN-DB                   Next-Gen Firewall (NGFW), Prisma
Cisco                   Talos                    Umbrella, Firepower
McAfee/Skyhigh/Trellix  TrustedSource            Web Gateway, Cloud Security, Endpoint Security
Blue Coat/Broadcom      WebPulse                 ProxySG, Secure Web Gateway
Fortinet                FortiGuard URL Database  FortiGate, FortiProxy, Forti*
VirusTotal              VirusTotal               VirusTotal

URL Checking Parameters

When building a reputation score for a URL, vendors check many parameters (sometimes more than 50). Examples include:

  • Domain Age: Older domains are generally considered more trustworthy.
  • IP Address Reputation: IP addresses associated with malicious activity are flagged.
  • Content Analysis: The actual content of the website is analyzed for malicious scripts, phishing attempts, etc.
  • Historical Data: Past behavior and reputation of the URL/domain.
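
One simple way to combine such parameters is a weighted sum. The sketch below is only an illustration of that idea: the UrlSignals fields, the WEIGHTS values, the one-year age cutoff, and the cap of three hits are invented for the example and do not reflect any vendor's actual scoring.

```python
from dataclasses import dataclass


@dataclass
class UrlSignals:
    domain_age_days: int          # Domain Age
    ip_flagged: bool              # IP Address Reputation
    suspicious_content_hits: int  # Content Analysis findings
    past_incidents: int           # Historical Data


# Hypothetical weights; they sum to 100 so the score lands in 0..100.
WEIGHTS = {"young_domain": 20, "ip_flagged": 30, "content": 35, "history": 15}


def risk_score(s: UrlSignals) -> float:
    """Return a risk score from 0 (no risk signals) to 100 (all signals maxed)."""
    # 1.0 for a brand-new domain, falling linearly to 0.0 after one year.
    young = max(0.0, 1.0 - s.domain_age_days / 365)
    ip = 1.0 if s.ip_flagged else 0.0
    # Cap the counts at 3 so a single noisy signal cannot dominate.
    content = min(s.suspicious_content_hits, 3) / 3
    history = min(s.past_incidents, 3) / 3
    return (
        WEIGHTS["young_domain"] * young
        + WEIGHTS["ip_flagged"] * ip
        + WEIGHTS["content"] * content
        + WEIGHTS["history"] * history
    )
```

With these assumed weights, a ten-year-old domain with no other findings scores 0, while a domain registered today on a flagged IP address with repeated content and history hits scores 100.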

Process Description

  1. Initial Categorization: When a URL is first encountered, it is categorized based on its content using automated tools and sometimes human review.
  2. Reputation Assessment: The URL is checked against databases of known malicious URLs, and its reputation is assessed based on factors such as domain age, IP reputation, and content analysis.
  3. Continuous Monitoring: URLs are continuously monitored and re-evaluated to ensure their categorization and reputation remain accurate.
  4. User Feedback: Users can often report false positives or false negatives, which are then reviewed and corrected by the security vendor.
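
The user feedback step can be sketched as a review queue whose approved reports overwrite the stored category. The Report type, the plain dictionary used as the database, and the approve callback (standing in for the vendor's review) are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Report:
    url: str
    claimed_category: str  # what the reporter says the category should be


def apply_reviewed_reports(
    db: Dict[str, str],
    reports: List[Report],
    approve: Callable[[Report], bool],
) -> List[str]:
    """Apply approved corrections to db and return the URLs that changed."""
    corrected = []
    for report in reports:
        already_correct = db.get(report.url) == report.claimed_category
        # Only reports that pass review change the database.
        if not already_correct and approve(report):
            db[report.url] = report.claimed_category
            corrected.append(report.url)
    return corrected
```

Keeping the review decision in a separate callback reflects the process above: a report alone does not change a categorization, only a report that the vendor has reviewed and accepted.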