Web Server Potential Spike in Error Response Codes
This rule detects unusual spikes in error response codes (500, 502, 503, 504) from web servers, which may indicate reconnaissance activities such as vulnerability scanning or fuzzing attempts by adversaries. These activities often generate a high volume of error responses as they probe for weaknesses in web applications. Error response codes may also indicate server-side issues that could be exploited.
Elastic rule (View on GitHub)
[metadata]
creation_date = "2025/11/19"
integration = ["nginx", "apache", "apache_tomcat", "iis"]
maturity = "production"
updated_date = "2025/12/01"

[rule]
author = ["Elastic"]
description = """
This rule detects unusual spikes in error response codes (500, 502, 503, 504) from web servers, which may indicate
reconnaissance activities such as vulnerability scanning or fuzzing attempts by adversaries. These activities often
generate a high volume of error responses as they probe for weaknesses in web applications. Error response codes
may also indicate server-side issues that could be exploited.
14"""
15from = "now-9m"
16interval = "10m"
17language = "esql"
18license = "Elastic License v2"
19name = "Web Server Potential Spike in Error Response Codes"
note = """## Triage and analysis

> **Disclaimer**:
> This investigation guide was created using generative AI technology and has been reviewed to improve its accuracy and relevance. While every effort has been made to ensure its quality, we recommend validating the content and adapting it to suit your specific environment and operational needs.

### Investigating Web Server Potential Spike in Error Response Codes

This rule detects bursts of 5xx errors (500–504) from GET traffic, highlighting abnormal server behavior that accompanies active scanning or fuzzing and exposes fragile code paths or misconfigured proxies. Attackers sweep common and generated endpoints while mutating query params and headers—path traversal, template syntax, large payloads—to repeatedly force backend exceptions and gateway timeouts, enumerate which routes fail, and pinpoint inputs that leak stack traces or crash components for follow-on exploitation.

### Possible investigation steps

- Plot error rates per minute by server and client around the alert window to confirm the spike, determine scope, and separate a single noisy client from a platform-wide issue.
- Aggregate the failing URL paths and query strings from the flagged client and look for enumeration sequences, traversal encoding, template injection markers, or oversized inputs indicative of fuzzing.
- Examine User-Agent, Referer, header mix, and TLS JA3 for generic scanner signatures or reuse across multiple clients, and enrich the originating IP with reputation and hosting-provider attribution.
- Correlate the timeframe with reverse proxy/WAF/IDS and application error logs or stack traces to identify which routes threw exceptions or timeouts and whether they align with the client’s input patterns.
- Validate backend and dependency health (upstreams, databases, caches, deployments) to rule out infrastructure regressions, then compare whether only the suspicious client experiences disproportionate failures.

### False positive analysis

- A scheduled deployment or upstream dependency issue can cause normal GET traffic to fail with 502/503/504, and many users egressing through a shared NAT or reverse proxy may be aggregated as one source IP that triggers the spike.
- An internal health-check, load test, or site crawler running from a single host can rapidly traverse endpoints and induce 500 errors on fragile routes, mimicking scanner-like behavior without malicious intent.

### Response and remediation

- Immediately rate-limit or block the originating client(s) at the edge (reverse proxy/WAF) using the observed source IPs, User-Agent/TLS fingerprints, and the failing URL patterns generating 5xx bursts.
- Drain the origin upstream(s) showing repeated 500/502/503/504 on the probed routes, roll back the latest deployment or config change for those services, and disable any unstable endpoint or plugin that is crashing under input fuzzing.
- Restart affected application workers and proxies, purge bad cache entries, re-enable traffic gradually with canary percentage, and confirm normal response rates via synthetic checks against the previously failing URLs.
- Escalate to Security Operations and Incident Response if 5xx spikes persist after blocking or if error pages expose stack traces, credentials, or admin route disclosures, or if traffic originates from multiple global hosting ASNs.
- Deploy targeted WAF rules for path traversal and injection markers seen in the URLs, enforce per-IP and per-route rate limits, tighten upstream timeouts/circuit breakers, and replace verbose error pages with generic responses that omit stack details.
- Add bot management and IP reputation blocking at the CDN/edge, lock down unauthenticated access to admin/debug routes, and instrument alerts that trigger on sustained 5xx bursts per client and per route with automatic edge throttling.
"""
risk_score = 21
rule_id = "6fa3abe3-9cd8-41de-951b-51ed8f710523"
severity = "low"
tags = [
    "Domain: Web",
    "Use Case: Threat Detection",
    "Tactic: Reconnaissance",
    "Data Source: Nginx",
    "Data Source: Apache",
    "Data Source: Apache Tomcat",
    "Data Source: IIS",
    "Resources: Investigation Guide",
]
timestamp_override = "event.ingested"
type = "esql"
query = '''
from logs-nginx.access-*, logs-apache.access-*, logs-apache_tomcat.access-*, logs-iis.access-*
| where
  http.request.method == "GET" and
  http.response.status_code in (
    500, // Internal Server Error
    502, // Bad Gateway
    503, // Service Unavailable
    504  // Gateway Timeout
  )

| eval Esql.url_original_to_lower = to_lower(url.original)

| keep
  @timestamp,
  event.dataset,
  http.request.method,
  http.response.status_code,
  source.ip,
  agent.id,
  host.name,
  Esql.url_original_to_lower
| stats
  Esql.event_count = count(),
  Esql.http_response_status_code_count = count(http.response.status_code),
  Esql.http_response_status_code_values = values(http.response.status_code),
  Esql.host_name_values = values(host.name),
  Esql.agent_id_values = values(agent.id),
  Esql.http_request_method_values = values(http.request.method),
  Esql.url_path_values = values(Esql.url_original_to_lower),
  Esql.event_dataset_values = values(event.dataset)
  by source.ip, agent.id
| where
  Esql.http_response_status_code_count > 10
'''

[[rule.threat]]
framework = "MITRE ATT&CK"

[[rule.threat.technique]]
id = "T1595"
name = "Active Scanning"
reference = "https://attack.mitre.org/techniques/T1595/"

[[rule.threat.technique.subtechnique]]
id = "T1595.002"
name = "Vulnerability Scanning"
reference = "https://attack.mitre.org/techniques/T1595/002/"

[[rule.threat.technique.subtechnique]]
id = "T1595.003"
name = "Wordlist Scanning"
reference = "https://attack.mitre.org/techniques/T1595/003/"

[rule.threat.tactic]
id = "TA0043"
name = "Reconnaissance"
reference = "https://attack.mitre.org/tactics/TA0043/"
Triage and analysis
Disclaimer: This investigation guide was created using generative AI technology and has been reviewed to improve its accuracy and relevance. While every effort has been made to ensure its quality, we recommend validating the content and adapting it to suit your specific environment and operational needs.
Investigating Web Server Potential Spike in Error Response Codes
This rule detects bursts of 5xx errors (500–504) from GET traffic, highlighting abnormal server behavior that accompanies active scanning or fuzzing and exposes fragile code paths or misconfigured proxies. Attackers sweep common and generated endpoints while mutating query params and headers—path traversal, template syntax, large payloads—to repeatedly force backend exceptions and gateway timeouts, enumerate which routes fail, and pinpoint inputs that leak stack traces or crash components for follow-on exploitation.
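For quick triage once this rule fires, one useful pivot is to look for those probe markers directly in the failing requests. The ES|QL sketch below is a minimal example rather than part of the rule: it reuses the rule's index patterns, and the path-traversal and template-syntax strings it matches are illustrative assumptions, not a complete signature set.

from logs-nginx.access-*, logs-apache.access-*, logs-apache_tomcat.access-*, logs-iis.access-*
| where http.response.status_code in (500, 502, 503, 504)
| eval url_lower = to_lower(url.original)
// illustrative probe markers: raw and URL-encoded path traversal plus common template syntax
| where url_lower like "*../*"
    or url_lower like "*%2e%2e*"
    or url_lower like "*{{*"
    or url_lower like "*${*"
| stats request_count = count(), sample_urls = values(url_lower) by source.ip
| sort request_count desc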
Possible investigation steps
- Plot error rates per minute by server and client around the alert window to confirm the spike, determine scope, and separate a single noisy client from a platform-wide issue (this pivot and the next are sketched after this list).
- Aggregate the failing URL paths and query strings from the flagged client and look for enumeration sequences, traversal encoding, template injection markers, or oversized inputs indicative of fuzzing.
- Examine User-Agent, Referer, header mix, and TLS JA3 for generic scanner signatures or reuse across multiple clients, and enrich the originating IP with reputation and hosting-provider attribution.
- Correlate the timeframe with reverse proxy/WAF/IDS and application error logs or stack traces to identify which routes threw exceptions or timeouts and whether they align with the client’s input patterns.
- Validate backend and dependency health (upstreams, databases, caches, deployments) to rule out infrastructure regressions, then compare whether only the suspicious client experiences disproportionate failures.
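The first two pivots can be sketched as two separate ES|QL queries along the following lines; the index patterns mirror the rule's data sources, while the one-hour window and the client address 203.0.113.10 are placeholders to replace with the alert's timeframe and the flagged source.ip.

// Query 1: per-minute 5xx counts by web server and client around the alert window
from logs-nginx.access-*, logs-apache.access-*, logs-apache_tomcat.access-*, logs-iis.access-*
| where @timestamp > now() - 1 hour
    and http.response.status_code in (500, 502, 503, 504)
| eval minute = date_trunc(1 minute, @timestamp)
| stats error_count = count() by minute, host.name, source.ip
| sort minute asc

// Query 2: failing URLs and status codes for one flagged client (placeholder address)
from logs-nginx.access-*, logs-apache.access-*, logs-apache_tomcat.access-*, logs-iis.access-*
| where source.ip == to_ip("203.0.113.10")
    and http.response.status_code in (500, 502, 503, 504)
| eval url_lower = to_lower(url.original)
| stats error_count = count(), status_codes = values(http.response.status_code) by url_lower
| sort error_count desc
| limit 50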
False positive analysis
- A scheduled deployment or upstream dependency issue can cause normal GET traffic to fail with 502/503/504, and many users egressing through a shared NAT or reverse proxy may be aggregated as one source IP that triggers the spike (see the sketch after this list for one way to tell these cases apart).
- An internal health-check, load test, or site crawler running from a single host can rapidly traverse endpoints and induce 500 errors on fragile routes, mimicking scanner-like behavior without malicious intent.
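A practical way to separate these benign cases from a single scanning client is to count how many distinct clients hit each failing URL: errors spread across many clients usually point to a deployment or upstream problem, while errors concentrated in one client look more like probing. A minimal sketch, again assuming the rule's data sources and a one-hour window:

from logs-nginx.access-*, logs-apache.access-*, logs-apache_tomcat.access-*, logs-iis.access-*
| where @timestamp > now() - 1 hour
    and http.response.status_code in (500, 502, 503, 504)
| eval url_lower = to_lower(url.original)
// many distinct clients per failing URL suggests an outage; a single client suggests scanning or fuzzing
| stats error_count = count(), distinct_clients = count_distinct(source.ip) by host.name, url_lower
| sort error_count desc
| limit 50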
Response and remediation
- Immediately rate-limit or block the originating client(s) at the edge (reverse proxy/WAF) using the observed source IPs, User-Agent/TLS fingerprints, and the failing URL patterns generating 5xx bursts.
- Drain the origin upstream(s) showing repeated 500/502/503/504 on the probed routes, roll back the latest deployment or config change for those services, and disable any unstable endpoint or plugin that is crashing under input fuzzing.
- Restart affected application workers and proxies, purge bad cache entries, re-enable traffic gradually with canary percentage, and confirm normal response rates via synthetic checks against the previously failing URLs.
- Escalate to Security Operations and Incident Response if 5xx spikes persist after blocking or if error pages expose stack traces, credentials, or admin route disclosures, or if traffic originates from multiple global hosting ASNs.
- Deploy targeted WAF rules for path traversal and injection markers seen in the URLs, enforce per-IP and per-route rate limits, tighten upstream timeouts/circuit breakers, and replace verbose error pages with generic responses that omit stack details.
- Add bot management and IP reputation blocking at the CDN/edge, lock down unauthenticated access to admin/debug routes, and instrument alerts that trigger on sustained 5xx bursts per client and per route with automatic edge throttling; a starting-point query for that instrumentation is sketched below.
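As a starting point for that last instrumentation item, the sketch below counts 5xx responses per client and per requested URL across the rule's data sources. The threshold of 10 mirrors the rule's default and is an assumption to tune per route and traffic volume; note that url.original includes query strings, so url.path may be a better grouping key where the integration populates it.

from logs-nginx.access-*, logs-apache.access-*, logs-apache_tomcat.access-*, logs-iis.access-*
| where http.response.status_code in (500, 502, 503, 504)
| eval url_lower = to_lower(url.original)
| stats error_count = count() by source.ip, url_lower
// threshold mirrors the rule's; tune per route and per expected traffic volume
| where error_count > 10
| sort error_count desc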
Related rules
- Potential Spike in Web Server Error Logs
- Web Server Discovery or Fuzzing Activity
- Web Server Potential Command Injection Request
- Web Server Suspicious User Agent Requests
- Potential Webshell Deployed via Apache Struts CVE-2023-50164 Exploitation