Blocking User Agents on Nginx and Apache | NOC.org

Why Block User Agents?

Every HTTP request includes a User-Agent header that identifies the client software making the request. Legitimate browsers send user agent strings like Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36, while automated tools, scrapers, and vulnerability scanners often identify themselves with distinctive user agent strings — or use generic defaults that are easy to detect.
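For illustration, the header is just one line of the request. A minimal GET request from a browser looks like this (host and exact version string are placeholders):

```http
GET / HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
Accept: */*
```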

Blocking known malicious user agents reduces server load, prevents content scraping, thwarts automated vulnerability scanning, and eliminates noise in your access logs. While sophisticated attackers can trivially spoof their user agent, the vast majority of automated attacks use default tool identifiers. Blocking these provides a meaningful first layer of defense before deeper inspection by a web application firewall.

Common Malicious User Agents to Block

The following user agents are associated with automated tools, vulnerability scanners, and content scrapers that should be blocked on most production servers:

Vulnerability Scanners

  • Nikto — popular open-source web server scanner
  • sqlmap — automated SQL injection tool
  • Nessus — network vulnerability scanner
  • OpenVAS — open-source vulnerability assessment
  • w3af — web application attack and audit framework
  • Acunetix — web vulnerability scanner
  • Netsparker — web application security scanner
  • ZmEu — scanner targeting phpMyAdmin vulnerabilities
  • masscan — high-speed port scanner

Content Scrapers and Bots

  • HTTrack — website copier
  • SiteSucker — downloads entire websites
  • WebCopier — website downloading tool
  • Scrapy — Python web scraping framework
  • MJ12bot — aggressive crawler (Majestic SEO)
  • AhrefsBot — Ahrefs SEO crawler (block if you do not use the service)
  • SemrushBot — Semrush SEO crawler (block if you do not use the service)

Default Tool User Agents

  • curl/ — default curl user agent (used by many automated scripts)
  • Wget/ — default wget user agent
  • Python-urllib — Python's default HTTP library user agent
  • python-requests — Python requests library default
  • Go-http-client — Go's default HTTP client
  • Java/ — Java's default HTTP user agent
  • libwww-perl — Perl's LWP library (used by many automated scripts)

Note: Blocking curl and wget default user agents may affect legitimate monitoring tools and health checks. Ensure your monitoring uses custom user agents before blocking these.
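If you do decide to block these defaults, one way to spare your own health checks (sketched here for the Nginx map method shown below; MyMonitor/1.0 is a hypothetical custom agent, not a real product) is to match your monitoring string first, since regex entries in a map are tested in declaration order:

```nginx
map $http_user_agent $block_user_agent {
    default         0;
    "~^MyMonitor/"  0;  # hypothetical custom UA set by your health checks — matched first
    ~*curl          1;
    ~*wget          1;
}
```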

Blocking User Agents on Nginx

Method 1: Using map and if

The most efficient Nginx method uses a map block to evaluate the user agent once and store the result in a variable, then an if block to act on it:

# In the http {} block of nginx.conf
map $http_user_agent $block_user_agent {
    default         0;
    ~*nikto         1;
    ~*sqlmap        1;
    ~*nessus        1;
    ~*openvas       1;
    ~*w3af          1;
    ~*acunetix      1;
    ~*netsparker    1;
    ~*zmeu          1;
    ~*masscan       1;
    ~*httrack       1;
    ~*sitesucker    1;
    ~*scrapy        1;
    ~*python-urllib  1;
    ~*python-requests 1;
    ~*libwww-perl   1;
    ~*go-http-client 1;
    ~*java/         1;
    ""              1;  # Block empty user agents
}

# In the server {} block
server {
    ...
    if ($block_user_agent) {
        return 403;
    }
    ...
}

The ~* prefix makes the regex match case-insensitive. The map variable is evaluated lazily: Nginx computes it only when $block_user_agent is first referenced and caches the result for the rest of the request, so even a long list adds little per-request overhead.
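Before reloading Nginx, you can sanity-check the pattern list from the shell. grep -iE approximates the case-insensitive ~* matching used in the map above (the sample user agent strings are illustrative):

```shell
# Case-insensitive alternation mirroring the map entries above
pattern='nikto|sqlmap|nessus|openvas|w3af|acunetix|netsparker|zmeu|masscan|httrack|sitesucker|scrapy|python-urllib|python-requests|libwww-perl|go-http-client|java/'

# Prints "blocked" if the string would match the list, "allowed" otherwise
check() { echo "$1" | grep -qiE "$pattern" && echo blocked || echo allowed; }

check "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"   # allowed
check "sqlmap/1.7.2#stable (http://sqlmap.org)"     # blocked
check "Python-urllib/3.11"                          # blocked
```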

Method 2: Simple if Block

For a smaller list, you can use a single regex in the server block:

server {
    ...
    if ($http_user_agent ~* (nikto|sqlmap|nessus|zmeu|masscan|httrack|scrapy|libwww-perl)) {
        return 403;
    }
    ...
}

This is simpler but less maintainable for long lists. A map also keeps the patterns in one place and caches its result in the variable; note, though, that only exact-string map entries use a hash table, while regex entries such as these are still tested in order.

Returning Different Responses

# Return 403 Forbidden
return 403;

# Return 444 (Nginx-specific: close connection with no response)
return 444;

# Return 403 with a short plain-text body
return 403 "Access denied.";

# Redirect to a honeypot
return 301 https://example.com/honeypot;

Using return 444 is the most efficient option for Nginx: it closes the connection without sending any response at all, so no bandwidth is spent on the scanner, and many tools simply log the dropped connection as an error and move on.

Blocking User Agents on Apache

Method 1: mod_rewrite in VirtualHost or .htaccess

# Requires mod_rewrite to be enabled (a2enmod rewrite on Debian/Ubuntu)
RewriteEngine On

# Block vulnerability scanners
RewriteCond %{HTTP_USER_AGENT} nikto [NC,OR]
RewriteCond %{HTTP_USER_AGENT} sqlmap [NC,OR]
RewriteCond %{HTTP_USER_AGENT} nessus [NC,OR]
RewriteCond %{HTTP_USER_AGENT} openvas [NC,OR]
RewriteCond %{HTTP_USER_AGENT} zmeu [NC,OR]
RewriteCond %{HTTP_USER_AGENT} masscan [NC,OR]

# Block scrapers
RewriteCond %{HTTP_USER_AGENT} httrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} sitesucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} scrapy [NC,OR]

# Block default tool user agents
RewriteCond %{HTTP_USER_AGENT} python-urllib [NC,OR]
RewriteCond %{HTTP_USER_AGENT} python-requests [NC,OR]
RewriteCond %{HTTP_USER_AGENT} libwww-perl [NC,OR]
RewriteCond %{HTTP_USER_AGENT} go-http-client [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^java/ [NC,OR]

# Block empty user agents
RewriteCond %{HTTP_USER_AGENT} ^$ [NC]

RewriteRule .* - [F,L]

The [NC] flag makes the match case-insensitive. The [OR] flag chains conditions with OR logic (default is AND). The [F,L] flags return a 403 Forbidden response and stop processing further rules.

Method 2: Using a Combined Regex

A single RewriteCond with a combined regex is more compact:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (nikto|sqlmap|nessus|openvas|zmeu|masscan|httrack|sitesucker|scrapy|python-urllib|python-requests|libwww-perl|go-http-client|^java/) [NC]
RewriteRule .* - [F,L]
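Note that the ^java/ alternative is anchored: it only matches user agents that begin with Java/ (Java's default looks like Java/1.8.0_292), not any string that merely contains those characters. You can check the combined pattern with grep -iE, which approximates Apache's [NC] matching:

```shell
# Same alternation as the RewriteCond above
pattern='nikto|sqlmap|nessus|openvas|zmeu|masscan|httrack|sitesucker|scrapy|python-urllib|python-requests|libwww-perl|go-http-client|^java/'

# Prints "blocked" if the string would match, "allowed" otherwise
match() { echo "$1" | grep -qiE "$pattern" && echo blocked || echo allowed; }

match "Java/1.8.0_292"                       # blocked — starts with Java/
match "Mozilla/5.0 (compatible; java/fan)"   # allowed — java/ is not at the start
```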

Method 3: Using mod_setenvif

An alternative to mod_rewrite that some administrators prefer:

# Set an environment variable for bad user agents
SetEnvIfNoCase User-Agent "nikto" bad_bot
SetEnvIfNoCase User-Agent "sqlmap" bad_bot
SetEnvIfNoCase User-Agent "nessus" bad_bot
SetEnvIfNoCase User-Agent "zmeu" bad_bot
SetEnvIfNoCase User-Agent "httrack" bad_bot
SetEnvIfNoCase User-Agent "scrapy" bad_bot
SetEnvIfNoCase User-Agent "libwww-perl" bad_bot
SetEnvIfNoCase User-Agent "^$" bad_bot

# Deny access based on the variable (Apache 2.4+ syntax)
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>

.htaccess Method

If you do not have access to the main server configuration (shared hosting), you can place rules in .htaccess:

# .htaccess
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (nikto|sqlmap|nessus|zmeu|masscan|httrack|scrapy|libwww-perl|python-urllib|^$) [NC]
RewriteRule .* - [F,L]

Note that .htaccess rules are processed on every request and have a performance cost compared to VirtualHost-level configuration. For high-traffic sites, configure blocking in the main server config or VirtualHost block when possible.
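Also note that mod_rewrite directives in .htaccess only take effect where the main configuration permits them; the directory path below is an assumption for illustration:

```apache
# In the VirtualHost or main config — allow .htaccess rewrite rules
<Directory /var/www/html>
    AllowOverride FileInfo
</Directory>
```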

Testing Your Rules

After implementing user agent blocking, test from the command line to verify the rules are working:

# Test with a blocked user agent (should return 403)
curl -A "nikto" -o /dev/null -s -w "%{http_code}" https://example.com
# Expected output: 403

# Test with a normal user agent (should return 200)
curl -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64)" -o /dev/null -s -w "%{http_code}" https://example.com
# Expected output: 200

# Test with an empty user agent
curl -A "" -o /dev/null -s -w "%{http_code}" https://example.com
# Expected output: 403

# Test case sensitivity
curl -A "NIKTO" -o /dev/null -s -w "%{http_code}" https://example.com
# Expected output: 403 (if case-insensitive matching is configured)

Monitoring Blocked Requests

Track blocked requests in your access logs to understand what you are blocking and whether legitimate traffic is affected:

# Nginx — count 403 responses by user agent (last 1000 lines, combined log format)
tail -1000 /var/log/nginx/access.log | awk -F'"' '$3 ~ /^ 403 / {print $6}' | sort | uniq -c | sort -rn

# Apache — same, for the default combined log format
tail -1000 /var/log/apache2/access.log | awk -F'"' '$3 ~ /^ 403 / {print $6}' | sort | uniq -c | sort -rn
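Because the user agent is a quoted field that can contain spaces, splitting on double quotes is more robust than counting space-separated fields. Here is how the parse behaves on a sample combined-log line (fabricated for illustration):

```shell
# A sample combined-log entry: status 403, user agent sqlmap
line='203.0.113.7 - - [01/Jan/2024:00:00:00 +0000] "GET /admin HTTP/1.1" 403 153 "-" "sqlmap/1.7.2#stable"'

# Splitting on double quotes: field 3 holds " status bytes ", field 6 the user agent
echo "$line" | awk -F'"' '$3 ~ /^ 403 / {print $6}'
# prints: sqlmap/1.7.2#stable
```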

Limitations of User Agent Blocking

User agent blocking is a useful first filter, but it has significant limitations:

  • Trivial to bypass: Any attacker can set a custom user agent string. Sophisticated bots use real browser user agent strings.
  • False positives: Legitimate monitoring tools, API clients, and health checkers may use default library user agents. Always whitelist your own tools.
  • Not a substitute for a WAF: User agent blocking catches unsophisticated automated attacks but does nothing against targeted attacks using real browser user agents.
  • Maintenance burden: New tools and bots appear constantly. The block list needs regular updates.

For comprehensive bot detection and attack mitigation, a web application firewall analyzes request patterns, headers, payloads, and behavioral signals — far beyond just the user agent string.

Comprehensive Web Server Protection

User agent blocking is one layer of a defense-in-depth strategy. Combine it with rate limiting, IP-based blocking for known brute force sources, and a WAF that provides intelligent bot detection, request inspection, and real-time threat intelligence. Explore NOC.org's WAF plans for automated protection that goes far beyond user agent filtering.

Improve Your Website's Speed and Security

14-day free trial. No credit card required.