Rare AWS Error Code

Jun 19, 2024 · Domain: Cloud · Data Source: AWS · Data Source: Amazon Web Services · Rule Type: ML · Rule Type: Machine Learning · Resources: Investigation Guide

A machine learning job detected an unusual error in a CloudTrail message. These can be byproducts of attempted or successful persistence, privilege escalation, defense evasion, discovery, lateral movement, or collection.

Elastic rule (View on GitHub)

[metadata]
creation_date = "2020/07/13"
integration = ["aws"]
maturity = "production"
updated_date = "2024/06/18"

[rule]
anomaly_threshold = 50
author = ["Elastic"]
description = """
A machine learning job detected an unusual error in a CloudTrail message. These can be byproducts of attempted or
successful persistence, privilege escalation, defense evasion, discovery, lateral movement, or collection.
"""
false_positives = [
    """
    Rare and unusual errors may indicate an impending service failure state. Rare and unusual user error activity can
    also be due to manual troubleshooting or reconfiguration attempts by insufficiently privileged users, bugs in cloud
    automation scripts or workflows, or changes to IAM privileges.
    """,
]
from = "now-2h"
interval = "15m"
license = "Elastic License v2"
machine_learning_job_id = "rare_error_code"
name = "Rare AWS Error Code"
setup = """## Setup

This rule requires the installation of associated Machine Learning jobs, as well as data coming in from AWS.

### Anomaly Detection Setup

Once the rule is enabled, the associated Machine Learning job will start automatically. You can view the Machine Learning job linked under the "Definition" panel of the detection rule. If the job does not start due to an error, the issue must be resolved for the job to commence successfully. For more details on setting up anomaly detection jobs, refer to the [helper guide](https://www.elastic.co/guide/en/kibana/current/xpack-ml-anomalies.html).

### AWS Integration Setup
The AWS integration allows you to collect logs and metrics from Amazon Web Services (AWS) with Elastic Agent.

#### The following steps should be executed in order to add the Elastic Agent System integration "aws" to your system:
- Go to the Kibana home page and click “Add integrations”.
- In the query bar, search for “AWS” and select the integration to see more details about it.
- Click “Add AWS”.
- Configure the integration name and optionally add a description.
- Review optional and advanced settings accordingly.
- Add the newly installed “aws” to an existing or a new agent policy, and deploy the agent on the system from which you want to collect AWS logs.
- Click “Save and Continue”.
- For more details on the integration refer to the [helper guide](https://www.elastic.co/docs/current/integrations/aws).
"""
note = """## Triage and analysis

### Investigating Rare AWS Error Code

CloudTrail logging provides visibility on actions taken within an AWS environment. By monitoring these events and understanding what is considered normal behavior within an organization, you can spot suspicious or malicious activity when deviations occur.

This rule uses a machine learning job to detect an unusual error in a CloudTrail message. These can be byproducts of attempted or successful persistence, privilege escalation, defense evasion, discovery, lateral movement, or collection.

Detection alerts from this rule indicate a rare and unusual error code that was associated with the response to an AWS API command or method call.

#### Possible investigation steps

- Examine the history of the error. If the error only manifested recently, it might be related to recent changes in an automation module or script. You can find the error in the `aws.cloudtrail.error_code` field.
- Investigate other alerts associated with the user account during the past 48 hours.
- Validate the activity is not related to planned patches, updates, or network administrator activity.
- Examine the request parameters. These may indicate the source of the program or the nature of the task being performed when the error occurred.
    - Check whether the error is related to unsuccessful attempts to enumerate or access objects, data, or secrets.
- Consider the source IP address and geolocation of the user who issued the command:
    - Do they look normal for the calling user?
    - If the source is an EC2 IP address, is it associated with an EC2 instance in one of your accounts or is the source IP from an EC2 instance that's not under your control?
    - If it is an authorized EC2 instance, is the activity associated with normal behavior for the instance role or roles? Are there any other alerts or signs of suspicious activity involving this instance?
- Consider the time of day. If the user is a human (not a program or script), did the activity take place during a normal time of day?
- If the activity looks suspicious, contact the account owner and confirm whether they are aware of it.
- If you suspect the account has been compromised, scope potentially compromised assets by tracking servers, services, and data accessed by the account in the last 24 hours.

### False positive analysis

- Examine the history of the command. If the command only manifested recently, it might be part of a new automation module or script. If it has a consistent cadence (for example, it appears in small numbers on a weekly or monthly cadence), it might be part of a housekeeping or maintenance process. You can find the command in the `event.action` field.
- The adoption of new services or the addition of new functionality to scripts may generate false positives.

### Related Rules

- Unusual City For an AWS Command - 809b70d3-e2c3-455e-af1b-2626a5a1a276
- Unusual Country For an AWS Command - dca28dee-c999-400f-b640-50a081cc0fd1
- Unusual AWS Command for a User - ac706eae-d5ec-4b14-b4fd-e8ba8086f0e1
- Spike in AWS Error Messages - 78d3d8d9-b476-451d-a9e0-7a5addd70670

### Response and remediation

- Initiate the incident response process based on the outcome of the triage.
- Disable or limit the account during the investigation and response.
- Identify the possible impact of the incident and prioritize accordingly; the following actions can help you gain context:
    - Identify the account role in the cloud environment.
    - Assess the criticality of affected services and servers.
    - Work with your IT team to identify and minimize the impact on users.
    - Identify if the attacker is moving laterally and compromising other accounts, servers, or services.
    - Identify any regulatory or legal ramifications related to this activity.
- Investigate credential exposure on systems compromised or used by the attacker to ensure all compromised accounts are identified. Reset passwords or delete API keys as needed to revoke the attacker's access to the environment. Work with your IT teams to minimize the impact on business operations during these actions.
- Check if unauthorized new users were created, remove unauthorized new accounts, and request password resets for other IAM users.
- Consider enabling multi-factor authentication for users.
- Review the permissions assigned to the implicated user to ensure that the least privilege principle is being followed.
- Implement security best practices [outlined](https://aws.amazon.com/premiumsupport/knowledge-center/security-best-practices/) by AWS.
- Take the actions needed to return affected systems, data, or services to their normal operational levels.
- Identify the initial vector abused by the attacker and take action to prevent reinfection via the same vector.
- Using the incident response data, update logging and audit policies to improve the mean time to detect (MTTD) and the mean time to respond (MTTR).
"""
references = ["https://www.elastic.co/guide/en/security/current/prebuilt-ml-jobs.html"]
risk_score = 21
rule_id = "19de8096-e2b0-4bd8-80c9-34a820813fff"
severity = "low"
tags = [
    "Domain: Cloud",
    "Data Source: AWS",
    "Data Source: Amazon Web Services",
    "Rule Type: ML",
    "Rule Type: Machine Learning",
    "Resources: Investigation Guide",
]
type = "machine_learning"
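
The `from` and `interval` values in the rule determine how much each run's search window overlaps with the previous one. The following is a minimal sketch of that arithmetic, assuming the usual reading that the rule runs once per `interval` and searches from `from` up to the run time. The helper name `parse_duration` is hypothetical and handles only the simple `<number><unit>` forms used here.

```python
import re
from datetime import timedelta

# Units limited to those needed for this illustration.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '15m' or 'now-2h' into a timedelta (hypothetical helper)."""
    match = re.fullmatch(r"(?:now-)?(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


lookback = parse_duration("now-2h")  # rule `from`
interval = parse_duration("15m")     # rule `interval`

# Portion of each search window that the previous run already covered.
overlap = lookback - interval
print(f"each run searches {lookback}, overlapping the previous run by {overlap}")
```

With these settings each run re-examines 1 hour 45 minutes of data that the previous run already covered, which leaves room for late-arriving CloudTrail events and anomaly results.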

Triage and analysis

Investigating Rare AWS Error Code

CloudTrail logging provides visibility on actions taken within an AWS environment. By monitoring these events and understanding what is considered normal behavior within an organization, you can spot suspicious or malicious activity when deviations occur.

This rule uses a machine learning job to detect an unusual error in a CloudTrail message. These can be byproducts of attempted or successful persistence, privilege escalation, defense evasion, discovery, lateral movement, or collection.

Detection alerts from this rule indicate a rare and unusual error code that was associated with the response to an AWS API command or method call.
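
To make "rare" concrete, the sketch below flags error codes that account for a very small share of the errors observed so far. It is only a frequency-based illustration, not the model used by the Elastic machine learning job. The sample history and the `max_share` cutoff are assumptions chosen for the example.

```python
from collections import Counter


def rare_error_codes(error_codes, max_share=0.01):
    """Return error codes whose share of all observed errors is at most max_share."""
    counts = Counter(error_codes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        code: count / total
        for code, count in counts.items()
        if count / total <= max_share
    }


# Illustrative history: two common codes and one seen only once.
history = (
    ["AccessDenied"] * 600
    + ["ThrottlingException"] * 399
    + ["InvalidKeyPair.NotFound"]
)
print(rare_error_codes(history))
```

In this made-up history only `InvalidKeyPair.NotFound` falls under the cutoff, which mirrors the kind of outlier an alert from this rule points at.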

Possible investigation steps

- Examine the history of the error. If the error only manifested recently, it might be related to recent changes in an automation module or script. You can find the error in the aws.cloudtrail.error_code field.
- Investigate other alerts associated with the user account during the past 48 hours.
- Validate the activity is not related to planned patches, updates, or network administrator activity.
- Examine the request parameters. These may indicate the source of the program or the nature of the task being performed when the error occurred.
    - Check whether the error is related to unsuccessful attempts to enumerate or access objects, data, or secrets.
- Consider the source IP address and geolocation of the user who issued the command:
    - Do they look normal for the calling user?
    - If the source is an EC2 IP address, is it associated with an EC2 instance in one of your accounts or is the source IP from an EC2 instance that's not under your control?
    - If it is an authorized EC2 instance, is the activity associated with normal behavior for the instance role or roles? Are there any other alerts or signs of suspicious activity involving this instance?
- Consider the time of day. If the user is a human (not a program or script), did the activity take place during a normal time of day?
- If the activity looks suspicious, contact the account owner and confirm whether they are aware of it.
- If you suspect the account has been compromised, scope potentially compromised assets by tracking servers, services, and data accessed by the account in the last 24 hours.
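
The first step, examining the history of the error, can be sketched as a first-seen time and per-day count over exported CloudTrail events. This assumes the events are available as dictionaries with flattened keys `@timestamp` (ISO 8601 with an explicit offset) and `aws.cloudtrail.error_code`. The function name `error_history` and the sample records are hypothetical.

```python
from collections import Counter
from datetime import datetime


def error_history(events, error_code):
    """Return (first_seen, per-day counts) for one error code."""
    timestamps = sorted(
        datetime.fromisoformat(event["@timestamp"])
        for event in events
        if event.get("aws.cloudtrail.error_code") == error_code
    )
    if not timestamps:
        return None, {}
    per_day = Counter(ts.date().isoformat() for ts in timestamps)
    return timestamps[0], dict(per_day)


# Made-up sample events for illustration only.
events = [
    {"@timestamp": "2024-06-17T09:15:00+00:00", "aws.cloudtrail.error_code": "AccessDenied"},
    {"@timestamp": "2024-06-18T02:03:00+00:00", "aws.cloudtrail.error_code": "InvalidKeyPair.NotFound"},
    {"@timestamp": "2024-06-18T02:04:30+00:00", "aws.cloudtrail.error_code": "InvalidKeyPair.NotFound"},
    {"@timestamp": "2024-06-18T10:00:00+00:00"},  # successful call, no error code
]

first_seen, per_day = error_history(events, "InvalidKeyPair.NotFound")
print(first_seen, per_day)
```

A first-seen time close to the alert, with no earlier occurrences, supports the "recent change" reading. A long history at a low, steady rate points more toward routine automation.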

False positive analysis

- Examine the history of the command. If the command only manifested recently, it might be part of a new automation module or script. If it has a consistent cadence (for example, it appears in small numbers on a weekly or monthly cadence), it might be part of a housekeeping or maintenance process. You can find the command in the event.action field.
- The adoption of new services or the addition of new functionality to scripts may generate false positives.
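
The cadence check can be approximated by looking at the gaps between occurrences of a command. The following is a rough sketch, assuming you already have the timestamps at which a given `event.action` value appeared. The function name, the one-day tolerance, and the sample dates are assumptions for illustration.

```python
from datetime import datetime, timedelta
from statistics import median


def has_consistent_cadence(timestamps, tolerance=timedelta(days=1)):
    """True if every gap between consecutive occurrences is within tolerance of the median gap."""
    ordered = sorted(timestamps)
    if len(ordered) < 3:
        return False  # too few occurrences to call it a cadence
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    typical = median(gaps)
    return all(abs(gap - typical) <= tolerance for gap in gaps)


# A weekly job versus a command that appears at irregular times.
weekly = [datetime(2024, 5, 6, 3, 0) + timedelta(weeks=i) for i in range(5)]
sporadic = [datetime(2024, 5, 1), datetime(2024, 5, 2), datetime(2024, 6, 15)]

print(has_consistent_cadence(weekly))
print(has_consistent_cadence(sporadic))
```

A regular cadence alone does not rule out malicious activity, so treat the result as one input to the false positive assessment rather than a verdict.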

Related Rules

- Unusual City For an AWS Command - 809b70d3-e2c3-455e-af1b-2626a5a1a276
- Unusual Country For an AWS Command - dca28dee-c999-400f-b640-50a081cc0fd1
- Unusual AWS Command for a User - ac706eae-d5ec-4b14-b4fd-e8ba8086f0e1
- Spike in AWS Error Messages - 78d3d8d9-b476-451d-a9e0-7a5addd70670

Response and remediation

- Initiate the incident response process based on the outcome of the triage.
- Disable or limit the account during the investigation and response.
- Identify the possible impact of the incident and prioritize accordingly; the following actions can help you gain context:
    - Identify the account role in the cloud environment.
    - Assess the criticality of affected services and servers.
    - Work with your IT team to identify and minimize the impact on users.
    - Identify if the attacker is moving laterally and compromising other accounts, servers, or services.
    - Identify any regulatory or legal ramifications related to this activity.
- Investigate credential exposure on systems compromised or used by the attacker to ensure all compromised accounts are identified. Reset passwords or delete API keys as needed to revoke the attacker's access to the environment. Work with your IT teams to minimize the impact on business operations during these actions.
- Check if unauthorized new users were created, remove unauthorized new accounts, and request password resets for other IAM users.
- Consider enabling multi-factor authentication for users.
- Review the permissions assigned to the implicated user to ensure that the least privilege principle is being followed.
- Implement security best practices outlined by AWS (https://aws.amazon.com/premiumsupport/knowledge-center/security-best-practices/).
- Take the actions needed to return affected systems, data, or services to their normal operational levels.
- Identify the initial vector abused by the attacker and take action to prevent reinfection via the same vector.
- Using the incident response data, update logging and audit policies to improve the mean time to detect (MTTD) and the mean time to respond (MTTR).
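
As one way to carry out the key revocation step, the sketch below marks every active access key of an IAM user as inactive. Note that this deactivates keys rather than deleting them, which is a deliberate substitution so the keys stay available for the investigation. It assumes an IAM client created with `boto3.client("iam")` and relies on the IAM `list_access_keys` and `update_access_key` operations. The function name and the user name in the usage comment are hypothetical.

```python
def deactivate_access_keys(iam_client, user_name):
    """Set every active access key of user_name to Inactive; return the affected key IDs."""
    response = iam_client.list_access_keys(UserName=user_name)
    deactivated = []
    for key in response["AccessKeyMetadata"]:
        if key["Status"] == "Active":
            iam_client.update_access_key(
                UserName=user_name,
                AccessKeyId=key["AccessKeyId"],
                Status="Inactive",
            )
            deactivated.append(key["AccessKeyId"])
    return deactivated


# Usage (requires boto3 and credentials allowed to manage IAM access keys):
#   import boto3
#   deactivate_access_keys(boto3.client("iam"), "suspected-user")
```

Passing the client in as a parameter keeps the function easy to exercise against a stub before running it in a real account.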

References

Prebuilt job reference | Elastic Security Solution [8.14] | Elastic
