Content Filtering


Content filtering is a security measure performed on email and Internet traffic to identify and block suspicious, malicious, and/or inappropriate content in accordance with an organization’s policies.

  • aka web filtering
  • applies Application layer filters based on HTTP data
    • whereas an ACL security rule applies Network or Transport layer filtering
  • can apply general business rules
    • e.g., time of day restrictions, overall time limits
  • most firewalls and proxies support some level of content filtering
  • safeguards an org’s network by blocking users from accessing malicious or inappropriate sites
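
A time-of-day restriction like the one mentioned above can be sketched as a small rule object. This is only an illustration: the class name, the category label, and the lunch-hour window are assumptions, not part of any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class TimeWindowRule:
    """Hypothetical business rule: a category is reachable only inside a window."""
    category: str
    allowed_from: time
    allowed_until: time

    def permits(self, category: str, when: datetime) -> bool:
        # The rule only constrains its own category; everything else passes.
        if category != self.category:
            return True
        return self.allowed_from <= when.time() < self.allowed_until


# Assumed policy: social networking is reachable only from 12:00 to 13:00.
lunch_only = TimeWindowRule("social networking", time(12, 0), time(13, 0))
```

Calling lunch_only.permits("social networking", datetime(2024, 1, 8, 9, 0)) returns False, while the same call with a 12:30 timestamp returns True.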

Types of Filtering

Agent-Based Filtering

  • involves installing a software agent on devices
  • agents
    • enforce compliance with the organization’s web filtering policies
    • communicate with a centralized management server to retrieve filtering policies and rules and then apply them locally on the device
  • typically leverage cloud platforms
    • to ensure they can communicate with devices regardless of the network they are connected to
    • means filtering policies remain in effect even when users are off the corporate network
  • can provide detailed reporting and analytics
    • can log web access attempts and return this data to a management server for analysis:
      • monitor Internet usage patterns
      • identify attempts to access blocked content
      • fine-tune the filtering rules as required
  • filtering occurs locally on the device
    • so provides more granular control
    • e.g.,
      • filtering HTTPS traffic
      • applying different filtering rules for different applications
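
The agent behaviour described above (retrieve a policy, apply it locally, log each attempt) might look roughly like the following. The policy schema, the application names, and the example domains are all hypothetical; a real agent would fetch the policy from its management server rather than from an inline string.

```python
import json
from dataclasses import dataclass, field
from urllib.parse import urlsplit

# Hypothetical policy document standing in for what an agent might
# retrieve from its management server. The schema is an assumption.
POLICY_JSON = """
{
  "default": {"blocked_domains": ["gambling.example"]},
  "per_application": {
    "chat-client": {"blocked_domains": ["files.example"]}
  }
}
"""


@dataclass
class FilteringAgent:
    policy: dict
    access_log: list = field(default_factory=list)

    def is_allowed(self, application: str, url: str) -> bool:
        host = urlsplit(url).hostname or ""
        # Start with the default rules, then add any application-specific ones.
        blocked = set(self.policy["default"]["blocked_domains"])
        app_rules = self.policy["per_application"].get(application, {})
        blocked.update(app_rules.get("blocked_domains", []))
        # A domain rule also covers its subdomains.
        allowed = not any(host == d or host.endswith("." + d) for d in blocked)
        # Record every attempt so it can later be reported for analysis.
        self.access_log.append({"app": application, "url": url, "allowed": allowed})
        return allowed


agent = FilteringAgent(policy=json.loads(POLICY_JSON))
```

Because the decision is made on the device, the same URL can be allowed for one application and blocked for another, and agent.access_log holds the entries that would be sent back to the management server.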

Centralized Filtering

  • centralized proxy server acts as an intermediary between end users and the Internet
    • can effectively control and monitor all inbound and outbound web content
  • primary role is to analyze web requests from users and determine whether to permit or deny access based on established policies
  • can also perform detailed logging and reporting of web activity
    • allows analysts to:
      • track and analyze web usage patterns
      • identify policy violations
      • gather intelligence for refining filtering policies and rules
  • restricts access based on:
    • Uniform Resource Locator (URL) filtering
      • URL contains
        • protocol, domain name, server path, and optional query parameters
      • scans the URL embedded in an HTTP request and allows or blocks it
      • can block using:
        • specific URLs
        • regular expression pattern matching to filter by keywords or path and query parameters
    • content categorization
      • classifies websites into various categories
        • e.g., social networking, gambling, adult content, webmail, etc.
      • can define rules to allow or deny access based on these categories
    • block rules
      • implement block rules based on various factors such as
        • the website’s URL
        • domain
        • IP address
        • content category
        • or even specific keywords within the web content
    • reputation-based filtering
      • leverages continually updated databases that score websites based on their observed behavior and history
      • these databases assign sites a reputation score
      • easier to maintain than allowing or denying individual URLs
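
The restriction methods above can be combined into a single decision function on the proxy. The sketch below is a minimal illustration only: every URL, domain, category mapping, and reputation score is made up, and the 0 to 100 reputation scale, the threshold of 40, and the neutral default of 50 for unknown sites are assumptions.

```python
import re
from urllib.parse import urlsplit

# All data below is hypothetical and for illustration only.
BLOCKED_URLS = {"http://downloads.example/tool.exe"}
BLOCKED_PATTERNS = [re.compile(r"[?&]q=[^&]*casino", re.IGNORECASE)]
CATEGORY_BY_DOMAIN = {
    "social.example": "social networking",
    "bets.example": "gambling",
}
DENIED_CATEGORIES = {"gambling"}
REPUTATION_BY_DOMAIN = {"shady.example": 15}  # assumed scale: 0 (bad) to 100 (good)
MIN_REPUTATION = 40
UNKNOWN_REPUTATION = 50  # assumed neutral score for sites with no history


def decide(url: str) -> tuple:
    """Return (action, reason) for a requested URL."""
    parts = urlsplit(url)  # protocol, domain, path, and query parameters
    host = parts.hostname or ""

    # 1. Specific URLs
    if url in BLOCKED_URLS:
        return ("deny", "specific URL")

    # 2. Regular expression match on path and query parameters
    path_and_query = parts.path + ("?" + parts.query if parts.query else "")
    if any(p.search(path_and_query) for p in BLOCKED_PATTERNS):
        return ("deny", "pattern match")

    # 3. Content category of the site
    if CATEGORY_BY_DOMAIN.get(host) in DENIED_CATEGORIES:
        return ("deny", "category")

    # 4. Reputation score of the site
    if REPUTATION_BY_DOMAIN.get(host, UNKNOWN_REPUTATION) < MIN_REPUTATION:
        return ("deny", "reputation")

    return ("allow", "no rule matched")
```

For example, decide("https://bets.example/") is denied by category, while decide("https://social.example/feed") is allowed because no rule matches. The order of the checks is a design choice here; a real proxy may evaluate them differently.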

Issues

  • Overblocking
    • occurs when the filter is too restrictive and inadvertently blocks access to legitimate and useful websites
      • negatively impacts employee productivity
  • Underblocking
    • occurs when the filter allows access to potentially harmful or inappropriate websites
  • Handling of encrypted traffic (HTTPS)
    • HTTPS is HTTP carried over TLS
    • proxy cannot inspect or modify application data in encrypted traffic
      • cannot decrypt traffic without breaking the end-to-end TLS session between client and site
    • to perform TLS inspection:
      • proxy generates a certificate for each requested domain, signed by the enterprise CA
      • client trusts this certificate
        • it is issued by the enterprise CA (which the client must be configured to trust), but still matches the domain requested
      • proxy establishes its own TLS tunnel with the website to forward requests
    • introduces employee privacy issues and concerns
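
The trust relationship behind TLS inspection can be illustrated with a toy model. This sketch involves no real cryptography or networking: the Cert class is a simplified stand-in for an X.509 certificate, and the CA names and domains are hypothetical. It only shows why a managed client accepts the proxy's per-domain certificate while an unmanaged client would not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    """Toy stand-in for an X.509 certificate (no real cryptography)."""
    subject: str  # domain the certificate is issued for
    issuer: str   # name of the CA that issued it


class InspectingProxy:
    def __init__(self, ca_name: str):
        self.ca_name = ca_name
        self._issued = {}

    def cert_for(self, domain: str) -> Cert:
        # Generate (and cache) one certificate per requested domain.
        if domain not in self._issued:
            self._issued[domain] = Cert(subject=domain, issuer=self.ca_name)
        return self._issued[domain]


def client_accepts(cert: Cert, requested_domain: str, trusted_cas: set) -> bool:
    # Simplified check: issuer must be trusted and subject must match the request.
    return cert.issuer in trusted_cas and cert.subject == requested_domain


proxy = InspectingProxy("Enterprise CA")        # hypothetical CA name
managed_trust = {"Public CA", "Enterprise CA"}  # enterprise CA installed on device
unmanaged_trust = {"Public CA"}                 # device without the enterprise CA
```

With cert = proxy.cert_for("shop.example"), client_accepts(cert, "shop.example", managed_trust) is True, whereas the same check against unmanaged_trust is False, which is why inspection only works on devices configured to trust the enterprise CA.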