Web Server Potential Spike in Error Response Codes
This rule detects unusual spikes in error response codes (500, 502, 503, 504) from web servers, which may indicate reconnaissance activities such as vulnerability scanning or fuzzing attempts by adversaries. These activities often generate a high volume of error responses as they probe for weaknesses in web applications. Error response codes may also indicate server-side issues that could be exploited.
Elastic rule (View on GitHub)
[metadata]
creation_date = "2025/11/19"
integration = ["nginx", "apache", "apache_tomcat", "iis"]
maturity = "production"
updated_date = "2025/12/01"

[rule]
author = ["Elastic"]
description = """
This rule detects unusual spikes in error response codes (500, 502, 503, 504) from web servers, which may indicate
reconnaissance activities such as vulnerability scanning or fuzzing attempts by adversaries. These activities often
generate a high volume of error responses as they probe for weaknesses in web applications. Error response codes
may also indicate server-side issues that could be exploited.
14"""
15from = "now-9m"
16interval = "10m"
17language = "esql"
18license = "Elastic License v2"
19name = "Web Server Potential Spike in Error Response Codes"
note = """## Triage and analysis

> **Disclaimer**:
> This investigation guide was created using generative AI technology and has been reviewed to improve its accuracy and relevance. While every effort has been made to ensure its quality, we recommend validating the content and adapting it to suit your specific environment and operational needs.

### Investigating Web Server Potential Spike in Error Response Codes

This rule detects bursts of 5xx errors (500–504) from GET traffic, highlighting abnormal server behavior that accompanies active scanning or fuzzing and exposes fragile code paths or misconfigured proxies. Attackers sweep common and generated endpoints while mutating query params and headers—path traversal, template syntax, large payloads—to repeatedly force backend exceptions and gateway timeouts, enumerate which routes fail, and pinpoint inputs that leak stack traces or crash components for follow-on exploitation.

### Possible investigation steps

- Plot error rates per minute by server and client around the alert window to confirm the spike, determine scope, and separate a single noisy client from a platform-wide issue.
- Aggregate the failing URL paths and query strings from the flagged client and look for enumeration sequences, traversal encoding, template injection markers, or oversized inputs indicative of fuzzing.
- Examine User-Agent, Referer, header mix, and TLS JA3 for generic scanner signatures or reuse across multiple clients, and enrich the originating IP with reputation and hosting-provider attribution.
- Correlate the timeframe with reverse proxy/WAF/IDS and application error logs or stack traces to identify which routes threw exceptions or timeouts and whether they align with the client’s input patterns.
- Validate backend and dependency health (upstreams, databases, caches, deployments) to rule out infrastructure regressions, then compare whether only the suspicious client experiences disproportionate failures.

### False positive analysis

- A scheduled deployment or upstream dependency issue can cause normal GET traffic to fail with 502/503/504, and many users egressing through a shared NAT or reverse proxy may be aggregated as one source IP that triggers the spike.
- An internal health-check, load test, or site crawler running from a single host can rapidly traverse endpoints and induce 500 errors on fragile routes, mimicking scanner-like behavior without malicious intent.

### Response and remediation

- Immediately rate-limit or block the originating client(s) at the edge (reverse proxy/WAF) using the observed source IPs, User-Agent/TLS fingerprints, and the failing URL patterns generating 5xx bursts.
- Drain the origin upstream(s) showing repeated 500/502/503/504 on the probed routes, roll back the latest deployment or config change for those services, and disable any unstable endpoint or plugin that is crashing under input fuzzing.
- Restart affected application workers and proxies, purge bad cache entries, re-enable traffic gradually with canary percentage, and confirm normal response rates via synthetic checks against the previously failing URLs.
- Escalate to Security Operations and Incident Response if 5xx spikes persist after blocking or if error pages expose stack traces, credentials, or admin route disclosures, or if traffic originates from multiple global hosting ASNs.
- Deploy targeted WAF rules for path traversal and injection markers seen in the URLs, enforce per-IP and per-route rate limits, tighten upstream timeouts/circuit breakers, and replace verbose error pages with generic responses that omit stack details.
- Add bot management and IP reputation blocking at the CDN/edge, lock down unauthenticated access to admin/debug routes, and instrument alerts that trigger on sustained 5xx bursts per client and per route with automatic edge throttling.
"""
risk_score = 21
rule_id = "6fa3abe3-9cd8-41de-951b-51ed8f710523"
severity = "low"
tags = [
    "Domain: Web",
    "Use Case: Threat Detection",
    "Tactic: Reconnaissance",
    "Data Source: Nginx",
    "Data Source: Apache",
    "Data Source: Apache Tomcat",
    "Data Source: IIS",
    "Resources: Investigation Guide",
]
timestamp_override = "event.ingested"
type = "esql"
query = '''
from logs-nginx.access-*, logs-apache.access-*, logs-apache_tomcat.access-*, logs-iis.access-*
| where
  http.request.method == "GET" and
  http.response.status_code in (
    500, // Internal Server Error
    502, // Bad Gateway
    503, // Service Unavailable
    504  // Gateway Timeout
  )

| eval Esql.url_original_to_lower = to_lower(url.original)

| keep
  @timestamp,
  event.dataset,
  http.request.method,
  http.response.status_code,
  source.ip,
  agent.id,
  host.name,
  Esql.url_original_to_lower
| stats
  Esql.event_count = count(),
  Esql.http_response_status_code_count = count(http.response.status_code),
  Esql.http_response_status_code_values = values(http.response.status_code),
  Esql.host_name_values = values(host.name),
  Esql.agent_id_values = values(agent.id),
  Esql.http_request_method_values = values(http.request.method),
  Esql.url_path_values = values(Esql.url_original_to_lower),
  Esql.event_dataset_values = values(event.dataset)
  by source.ip, agent.id
| where
  Esql.http_response_status_code_count > 10
'''

[[rule.threat]]
framework = "MITRE ATT&CK"

[[rule.threat.technique]]
id = "T1595"
name = "Active Scanning"
reference = "https://attack.mitre.org/techniques/T1595/"

[[rule.threat.technique.subtechnique]]
id = "T1595.002"
name = "Vulnerability Scanning"
reference = "https://attack.mitre.org/techniques/T1595/002/"

[[rule.threat.technique.subtechnique]]
id = "T1595.003"
name = "Wordlist Scanning"
reference = "https://attack.mitre.org/techniques/T1595/003/"

[rule.threat.tactic]
id = "TA0043"
name = "Reconnaissance"
reference = "https://attack.mitre.org/tactics/TA0043/"
Triage and analysis
Disclaimer: This investigation guide was created using generative AI technology and has been reviewed to improve its accuracy and relevance. While every effort has been made to ensure its quality, we recommend validating the content and adapting it to suit your specific environment and operational needs.
Investigating Web Server Potential Spike in Error Response Codes
This rule detects bursts of 5xx errors (500–504) from GET traffic, highlighting abnormal server behavior that accompanies active scanning or fuzzing and exposes fragile code paths or misconfigured proxies. Attackers sweep common and generated endpoints while mutating query params and headers—path traversal, template syntax, large payloads—to repeatedly force backend exceptions and gateway timeouts, enumerate which routes fail, and pinpoint inputs that leak stack traces or crash components for follow-on exploitation.
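For quick triage once this rule fires, one useful pivot is to look for those probe markers directly in the failing requests. The ES|QL sketch below is a minimal example rather than part of the rule: it reuses the rule's index patterns, and the path-traversal and template-syntax strings it matches are illustrative assumptions, not a complete signature set.

from logs-nginx.access-*, logs-apache.access-*, logs-apache_tomcat.access-*, logs-iis.access-*
| where http.response.status_code in (500, 502, 503, 504)
| eval url_lower = to_lower(url.original)
// illustrative probe markers: raw and URL-encoded path traversal plus common template syntax
| where url_lower like "*../*"
    or url_lower like "*%2e%2e*"
    or url_lower like "*{{*"
    or url_lower like "*${*"
| stats request_count = count(), sample_urls = values(url_lower) by source.ip
| sort request_count desc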
Possible investigation steps
- Plot error rates per minute by server and client around the alert window to confirm the spike, determine scope, and separate a single noisy client from a platform-wide issue (this pivot and the next are sketched after this list).
- Aggregate the failing URL paths and query strings from the flagged client and look for enumeration sequences, traversal encoding, template injection markers, or oversized inputs indicative of fuzzing.
- Examine User-Agent, Referer, header mix, and TLS JA3 for generic scanner signatures or reuse across multiple clients, and enrich the originating IP with reputation and hosting-provider attribution.
- Correlate the timeframe with reverse proxy/WAF/IDS and application error logs or stack traces to identify which routes threw exceptions or timeouts and whether they align with the client’s input patterns.
- Validate backend and dependency health (upstreams, databases, caches, deployments) to rule out infrastructure regressions, then compare whether only the suspicious client experiences disproportionate failures.
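The first two pivots can be sketched as two separate ES|QL queries along the following lines; the index patterns mirror the rule's data sources, while the one-hour window and the client address 203.0.113.10 are placeholders to replace with the alert's timeframe and the flagged source.ip.

// Query 1: per-minute 5xx counts by web server and client around the alert window
from logs-nginx.access-*, logs-apache.access-*, logs-apache_tomcat.access-*, logs-iis.access-*
| where @timestamp > now() - 1 hour
    and http.response.status_code in (500, 502, 503, 504)
| eval minute = date_trunc(1 minute, @timestamp)
| stats error_count = count() by minute, host.name, source.ip
| sort minute asc

// Query 2: failing URLs and status codes for one flagged client (placeholder address)
from logs-nginx.access-*, logs-apache.access-*, logs-apache_tomcat.access-*, logs-iis.access-*
| where source.ip == to_ip("203.0.113.10")
    and http.response.status_code in (500, 502, 503, 504)
| eval url_lower = to_lower(url.original)
| stats error_count = count(), status_codes = values(http.response.status_code) by url_lower
| sort error_count desc
| limit 50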
False positive analysis
- A scheduled deployment or upstream dependency issue can cause normal GET traffic to fail with 502/503/504, and many users egressing through a shared NAT or reverse proxy may be aggregated as one source IP that triggers the spike (see the sketch after this list for one way to tell these cases apart).
- An internal health-check, load test, or site crawler running from a single host can rapidly traverse endpoints and induce 500 errors on fragile routes, mimicking scanner-like behavior without malicious intent.
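A practical way to separate these benign cases from a single scanning client is to count how many distinct clients hit each failing URL: errors spread across many clients usually point to a deployment or upstream problem, while errors concentrated in one client look more like probing. A minimal sketch, again assuming the rule's data sources and a one-hour window:

from logs-nginx.access-*, logs-apache.access-*, logs-apache_tomcat.access-*, logs-iis.access-*
| where @timestamp > now() - 1 hour
    and http.response.status_code in (500, 502, 503, 504)
| eval url_lower = to_lower(url.original)
// many distinct clients per failing URL suggests an outage; a single client suggests scanning or fuzzing
| stats error_count = count(), distinct_clients = count_distinct(source.ip) by host.name, url_lower
| sort error_count desc
| limit 50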
Response and remediation
- Immediately rate-limit or block the originating client(s) at the edge (reverse proxy/WAF) using the observed source IPs, User-Agent/TLS fingerprints, and the failing URL patterns generating 5xx bursts.
- Drain the origin upstream(s) showing repeated 500/502/503/504 on the probed routes, roll back the latest deployment or config change for those services, and disable any unstable endpoint or plugin that is crashing under input fuzzing.
- Restart affected application workers and proxies, purge bad cache entries, re-enable traffic gradually with canary percentage, and confirm normal response rates via synthetic checks against the previously failing URLs.
- Escalate to Security Operations and Incident Response if 5xx spikes persist after blocking or if error pages expose stack traces, credentials, or admin route disclosures, or if traffic originates from multiple global hosting ASNs.
- Deploy targeted WAF rules for path traversal and injection markers seen in the URLs, enforce per-IP and per-route rate limits, tighten upstream timeouts/circuit breakers, and replace verbose error pages with generic responses that omit stack details.
- Add bot management and IP reputation blocking at the CDN/edge, lock down unauthenticated access to admin/debug routes, and instrument alerts that trigger on sustained 5xx bursts per client and per route with automatic edge throttling; a starting-point query for that instrumentation is sketched below.
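As a starting point for that last instrumentation item, the sketch below counts 5xx responses per client and per requested URL across the rule's data sources. The threshold of 10 mirrors the rule's default and is an assumption to tune per route and traffic volume; note that url.original includes query strings, so url.path may be a better grouping key where the integration populates it.

from logs-nginx.access-*, logs-apache.access-*, logs-apache_tomcat.access-*, logs-iis.access-*
| where http.response.status_code in (500, 502, 503, 504)
| eval url_lower = to_lower(url.original)
| stats error_count = count() by source.ip, url_lower
// threshold mirrors the rule's; tune per route and per expected traffic volume
| where error_count > 10
| sort error_count desc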
Related rules
- Potential Spike in Web Server Error Logs
- Web Server Discovery or Fuzzing Activity
- Web Server Potential Command Injection Request
- Web Server Suspicious User Agent Requests
- Potential Webshell Deployed via Apache Struts CVE-2023-50164 Exploitation