Potential Abuse of Resources by High Token Count and Large Response Sizes

Jan 22, 2025 · Domain: LLM · Data Source: AWS Bedrock · Data Source: Amazon Web Services · Data Source: AWS S3 · Use Case: Potential Overload · Use Case: Resource Exhaustion · Mitre Atlas: LLM04 · Resources: Investigation Guide

Detects potential resource exhaustion or data breach attempts by monitoring for users who consistently generate high input token counts, submit numerous requests, and receive large responses. This behavior could indicate an attempt to overload the system or extract an unusually large amount of data, possibly revealing sensitive information or causing service disruptions.

Elastic rule (View on GitHub)

[metadata]
creation_date = "2024/05/04"
maturity = "production"
updated_date = "2025/01/17"
min_stack_comments = "ES|QL rule type is still in technical preview as of 8.13, however this rule was tested successfully; integration in tech preview"
min_stack_version = "8.13.0"

[rule]
author = ["Elastic"]
description = """
Detects potential resource exhaustion or data breach attempts by monitoring for users who consistently generate high input token counts, submit numerous requests, and receive
large responses. This behavior could indicate an attempt to overload the system or extract an unusually large amount of data, possibly revealing sensitive information or
causing service disruptions.
"""
false_positives = ["Authorized heavy usage of the system that is business justified and monitored."]
from = "now-60m"
interval = "10m"
language = "esql"
license = "Elastic License v2"
name = "Potential Abuse of Resources by High Token Count and Large Response Sizes"
references = [
    "https://atlas.mitre.org/techniques/AML.T0051",
    "https://owasp.org/www-project-top-10-for-large-language-model-applications/",
    "https://www.elastic.co/security-labs/elastic-advances-llm-security",
]
risk_score = 47
rule_id = "b1773d05-f349-45fb-9850-287b8f92f02d"
note = """## Triage and analysis

### Investigating Potential Abuse of Resources by High Token Count and Large Response Sizes

Amazon Bedrock is AWS’s managed service that enables developers to build and scale generative AI applications using large foundation models (FMs) from top providers.

Bedrock offers a variety of pretrained models from Amazon (such as the Titan series), as well as models from providers like Anthropic, Meta, Cohere, and AI21 Labs.

#### Possible investigation steps

- Identify the user account that used high prompt token counts and whether it should perform this kind of action.
- Investigate large response sizes and the number of requests made by the user account.
- Investigate other alerts associated with the user account during the past 48 hours.
- Consider the time of day. If the user is a human (not a program or script), did the activity take place during a normal time of day?
- Examine the account's prompts and responses in the last 24 hours.
- If you suspect the account has been compromised, scope potentially compromised assets by tracking Amazon Bedrock model access, prompts generated, and responses to the prompts by the account in the last 24 hours.

### False positive analysis

- Verify the user account that used high prompt and large response sizes, has a business justification for the heavy usage of the system.

### Response and remediation

- Initiate the incident response process based on the outcome of the triage.
- Disable or limit the account during the investigation and response.
- Identify the possible impact of the incident and prioritize accordingly; the following actions can help you gain context:
    - Identify the account role in the cloud environment.
    - Identify if the attacker is moving laterally and compromising other Amazon Bedrock Services.
    - Identify any regulatory or legal ramifications related to this activity.
    - Identify potential resource exhaustion and impact on billing.
- Review the permissions assigned to the implicated user group or role behind these requests to ensure they are authorized and expected to access bedrock and ensure that the least privilege principle is being followed.
- Determine the initial vector abused by the attacker and take action to prevent reinfection via the same vector.
- Using the incident response data, update logging and audit policies to improve the mean time to detect (MTTD) and the mean time to respond (MTTR).
"""
setup = """## Setup

This rule requires that guardrails are configured in AWS Bedrock. For more information, see the AWS Bedrock documentation:

https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-create.html
"""
severity = "medium"
tags = [
    "Domain: LLM",
    "Data Source: AWS Bedrock",
    "Data Source: Amazon Web Services",
    "Data Source: AWS S3",
    "Use Case: Potential Overload",
    "Use Case: Resource Exhaustion",
    "Mitre Atlas: LLM04",
    "Resources: Investigation Guide"
]
timestamp_override = "event.ingested"
type = "esql"

query = '''
from logs-aws_bedrock.invocation-*
| keep user.id, gen_ai.usage.prompt_tokens, gen_ai.usage.completion_tokens
| stats max_tokens = max(gen_ai.usage.prompt_tokens),
         total_requests = count(*),
         avg_response_size = avg(gen_ai.usage.completion_tokens)
  by user.id
// tokens count depends on specific LLM, as is related to how embeddings are generated.
| where max_tokens > 5000 and total_requests > 10 and avg_response_size > 500
| eval risk_factor = (max_tokens / 1000) * total_requests * (avg_response_size / 500)
| where risk_factor > 10
| sort risk_factor desc
'''
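
To make the query's arithmetic easier to follow, here is a minimal Python sketch that mirrors the per-user aggregation, the threshold filter, and the risk_factor formula on a handful of in-memory events. It is an illustration only, not part of the rule: the function name `score_users` and the sample events are hypothetical, and the sketch uses floating-point division throughout, whereas ES|QL may apply integer division to `max_tokens / 1000` when the field is an integer type, so exact values can differ slightly.

```python
from collections import defaultdict

# Thresholds copied from the rule's ES|QL query.
MAX_TOKENS_THRESHOLD = 5000
TOTAL_REQUESTS_THRESHOLD = 10
AVG_RESPONSE_THRESHOLD = 500
RISK_FACTOR_THRESHOLD = 10


def score_users(events):
    """Aggregate events per user.id, then apply the rule's filters and formula.

    `events` is an iterable of dicts keyed by the field names the query keeps.
    (Hypothetical helper for illustration; not an Elastic API.)
    """
    prompt_tokens = defaultdict(list)
    completion_tokens = defaultdict(list)
    for event in events:
        user = event["user.id"]
        prompt_tokens[user].append(event["gen_ai.usage.prompt_tokens"])
        completion_tokens[user].append(event["gen_ai.usage.completion_tokens"])

    flagged = []
    for user, prompts in prompt_tokens.items():
        max_tokens = max(prompts)              # max(gen_ai.usage.prompt_tokens)
        total_requests = len(prompts)          # count(*)
        avg_response_size = sum(completion_tokens[user]) / total_requests
        if not (
            max_tokens > MAX_TOKENS_THRESHOLD
            and total_requests > TOTAL_REQUESTS_THRESHOLD
            and avg_response_size > AVG_RESPONSE_THRESHOLD
        ):
            continue
        risk_factor = (max_tokens / 1000) * total_requests * (avg_response_size / 500)
        if risk_factor > RISK_FACTOR_THRESHOLD:
            flagged.append({"user.id": user, "risk_factor": risk_factor})

    # Equivalent of `sort risk_factor desc`.
    return sorted(flagged, key=lambda row: row["risk_factor"], reverse=True)


# Made-up sample data: one heavy user (12 large requests) and one light user.
sample_events = (
    [{"user.id": "heavy-user",
      "gen_ai.usage.prompt_tokens": 6000,
      "gen_ai.usage.completion_tokens": 1000}] * 12
    + [{"user.id": "light-user",
        "gen_ai.usage.prompt_tokens": 200,
        "gen_ai.usage.completion_tokens": 50}] * 3
)

flagged = score_users(sample_events)
print(flagged)
```

For the heavy user the formula works out to (6000 / 1000) × 12 × (1000 / 500) = 144. Note that, with these thresholds, any user who passes the first filter already has a risk_factor above 55 (more than 5 × 11 × 1), so in this sketch the final `risk_factor > 10` check never excludes anyone on its own.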

Triage and analysis

Investigating Potential Abuse of Resources by High Token Count and Large Response Sizes

Amazon Bedrock is AWS’s managed service that enables developers to build and scale generative AI applications using large foundation models (FMs) from top providers.

Bedrock offers a variety of pretrained models from Amazon (such as the Titan series), as well as models from providers like Anthropic, Meta, Cohere, and AI21 Labs.

Possible investigation steps

- Identify the user account that used high prompt token counts and whether it should perform this kind of action.
- Investigate large response sizes and the number of requests made by the user account.
- Investigate other alerts associated with the user account during the past 48 hours.
- Consider the time of day. If the user is a human (not a program or script), did the activity take place during a normal time of day?
- Examine the account's prompts and responses in the last 24 hours.
- If you suspect the account has been compromised, scope potentially compromised assets by tracking Amazon Bedrock model access, prompts generated, and responses to the prompts by the account in the last 24 hours.
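
Two of the steps above, reviewing an account's activity over the last 24 hours and checking the time of day, can be sketched in Python as follows. This is a rough illustration under stated assumptions: `build_user_activity_query` and `is_outside_working_hours` are hypothetical helper names, the query body uses standard Elasticsearch Query DSL against the `user.id` and `@timestamp` fields, and the 08:00 to 18:00 working window is an arbitrary default that should be adapted to the organization.

```python
from datetime import datetime


def build_user_activity_query(user_id, hours=24, size=100):
    """Return a Query DSL body selecting one user's events in the lookback window.

    Hypothetical helper; the body could be sent to a search request against
    the logs-aws_bedrock.invocation-* index pattern used by the rule.
    """
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "filter": [
                    {"term": {"user.id": user_id}},
                    {"range": {"@timestamp": {"gte": f"now-{hours}h", "lte": "now"}}},
                ]
            }
        },
    }


def is_outside_working_hours(timestamp, start_hour=8, end_hour=18):
    """True if the timestamp's hour falls outside [start_hour, end_hour).

    The window is an assumed default, evaluated in the timestamp's own timezone.
    """
    return not (start_hour <= timestamp.hour < end_hour)


body = build_user_activity_query("example-user-id")
print(body)
print(is_outside_working_hours(datetime(2025, 1, 22, 3, 0)))
```

Activity outside the working window is only a signal for human accounts; scheduled jobs and scripts may legitimately run at any hour.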

False positive analysis

- Verify that the user account responsible for the high prompt token counts and large response sizes has a business justification for this heavy usage of the system.

Response and remediation

- Initiate the incident response process based on the outcome of the triage.
- Disable or limit the account during the investigation and response.
- Identify the possible impact of the incident and prioritize accordingly; the following actions can help you gain context:
    - Identify the account role in the cloud environment.
    - Identify if the attacker is moving laterally and compromising other Amazon Bedrock services.
    - Identify any regulatory or legal ramifications related to this activity.
    - Identify potential resource exhaustion and impact on billing.
- Review the permissions assigned to the implicated user group or role behind these requests to ensure they are authorized and expected to access Amazon Bedrock, and that the principle of least privilege is being followed.
- Determine the initial vector abused by the attacker and take action to prevent reinfection via the same vector.
- Using the incident response data, update logging and audit policies to improve the mean time to detect (MTTD) and the mean time to respond (MTTR).
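
For the billing-impact step above, a back-of-the-envelope estimate can be derived from token counts. The sketch below is a hypothetical helper, not an AWS or Elastic API, and the per-1,000-token prices are placeholders chosen for easy arithmetic; real Amazon Bedrock pricing varies by model and region and should be taken from the current AWS price list.

```python
def estimate_token_cost(prompt_tokens, completion_tokens,
                        input_price_per_1k, output_price_per_1k):
    """Estimate spend from token totals and per-1,000-token prices.

    All prices are caller-supplied assumptions; nothing here reflects
    actual Amazon Bedrock pricing.
    """
    input_cost = (prompt_tokens / 1000) * input_price_per_1k
    output_cost = (completion_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# Placeholder figures: 12 requests of 6000 prompt tokens and 1000 completion
# tokens each, priced at made-up rates of 0.5 and 1.5 per 1,000 tokens.
cost = estimate_token_cost(
    prompt_tokens=12 * 6000,
    completion_tokens=12 * 1000,
    input_price_per_1k=0.5,
    output_price_per_1k=1.5,
)
print(cost)
```

With these placeholder numbers the estimate is 72 × 0.5 + 12 × 1.5 = 54 currency units, which gives a rough order of magnitude to weigh against the account's expected usage.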

References

- https://atlas.mitre.org/techniques/AML.T0051
- OWASP Top 10 for Large Language Model Applications, OWASP Foundation: https://owasp.org/www-project-top-10-for-large-language-model-applications/
- Elastic Advances LLM Security with Standardized Fields and Integrations, Elastic Security Labs: https://www.elastic.co/security-labs/elastic-advances-llm-security
