GitHub Exfiltration via High Number of Repository Clones by User

Detects a high number of repository cloning actions by a single user within a short time frame. Adversaries may clone multiple repositories to exfiltrate sensitive data.

Elastic rule (View on GitHub)

[metadata]
creation_date = "2025/12/16"
integration = ["github"]
maturity = "production"
updated_date = "2026/01/12"

[rule]
author = ["Elastic"]
description = """
Detects a high number of repository cloning actions by a single user within a short time frame. Adversaries may
clone multiple repositories to exfiltrate sensitive data.
"""
from = "now-9m"
interval = "8m"
language = "esql"
license = "Elastic License v2"
name = "GitHub Exfiltration via High Number of Repository Clones by User"
note = """ ## Triage and analysis

> **Disclaimer**:
> This investigation guide was created using generative AI technology and has been reviewed to improve its accuracy and relevance. While every effort has been made to ensure its quality, we recommend validating the content and adapting it to suit your specific environment and operational needs.

### Investigating GitHub Exfiltration via High Number of Repository Clones by User

This rule flags a single user rapidly cloning dozens of repositories, a strong indicator of bulk source code exfiltration. Mass cloning enables quick siphoning of proprietary code, embedded secrets, and build artifacts across teams before defenses can respond. A typical pattern is a stolen personal access token used in a script to enumerate org repositories and clone them in rapid succession from a CI runner or cloud VM, including private and internal repos, to stage data for off-platform transfer.

### Possible investigation steps

- Validate whether the actor is a known automation or service account with a documented need to mass-clone, and quickly confirm intent with the account owner and affected repo admins.
- Enumerate the cloned repositories and their visibility, deprioritizing activity dominated by public repos while fast-tracking private/internal codebases with sensitive content across orgs.
- Pivot on the token identifier to determine the token owner, scopes, and creation/last-use details, compare to normal usage patterns, and revoke/reset credentials if anomalous.
- Analyze the user agent and agent identifier to attribute the activity to a specific host or CI runner, correlating with pipeline logs and login locations/times for anomalies.
- Correlate with endpoint/network telemetry from the originating host for large outbound transfers, external Git remotes, or bulk archiving indicating off-platform exfiltration following the clones.

### False positive analysis

- A developer rebuilding a workstation or creating an approved local mirror may legitimately clone dozens of repositories in a short window, especially when activity is dominated by public or low-sensitivity repos.
- A shared automation/service account running scheduled builds or org-wide maintenance tasks can trigger fresh clones across many repositories due to pipeline configuration or cache resets, inflating counts without exfiltration intent.

### Response and remediation

- Immediately revoke the GitHub token used for the clones, force sign out, require password reset and 2FA re-verification for the user, and suspend the account if unauthorized.
- Block and quarantine the originating host or CI runner by revoking its runner registration, removing its SSH keys/credentials, and firewalling its IP until imaged.
- On the cloned private/internal repositories, remove the user from teams, rotate or disable deploy keys and GitHub App installations, and enforce SAML SSO.
- Rotate repository and organization secrets present in those repos (Actions secrets, PATs, SSH keys, cloud access keys) and invalidate any secrets found in commit history.
- Recover by restoring only minimal access after owner approval, issuing a new fine-grained PAT with least privilege and expiry, and re-enabling builds while monitoring for further clone bursts.
- Escalate to incident response leadership and Legal if any private or export-controlled repos were cloned or cloning continues post-revocation, and harden by enforcing org-wide SSO, disallowing classic PATs, IP allowlisting for PAT use, enabling secret scanning with push protection, and alerting on burst git clone patterns from runners and unusual user agents.
"""
references = [
    "https://www.wiz.io/blog/shai-hulud-2-0-ongoing-supply-chain-attack",
    "https://trigger.dev/blog/shai-hulud-postmortem",
    "https://posthog.com/blog/nov-24-shai-hulud-attack-post-mortem",
]
risk_score = 47
rule_id = "19f3674c-f4a1-43bb-a89c-e4c6212275e0"
severity = "medium"
tags = [
    "Domain: Cloud",
    "Use Case: Threat Detection",
    "Tactic: Exfiltration",
    "Data Source: Github",
    "Resources: Investigation Guide",
]
timestamp_override = "event.ingested"
type = "esql"
query = '''
from logs-github.audit-* metadata _id, _index, _version
| where
  data_stream.dataset == "github.audit" and event.type == "change" and event.action == "git.clone"
| stats
  Esql.event_count = COUNT(*),
  Esql.github_org_values = values(github.org),
  Esql.github_repo_values = values(github.repo),
  Esql.github_repository_public_values = values(github.repository_public),
  Esql.github_token_id_values = values(github.token_id),
  Esql.github_user_agent_values = values(github.user_agent),
  Esql.user_name_values = values(user.name),
  Esql.agent_id_values = values(agent.id),
  Esql.event_dataset_values = values(event.dataset),
  Esql.data_stream_namespace_values = values(data_stream.namespace)

  by user.name

| keep Esql.*

| where
  Esql.event_count >= 25
'''

[[rule.threat]]
framework = "MITRE ATT&CK"

[[rule.threat.technique]]
id = "T1020"
name = "Automated Exfiltration"
reference = "https://attack.mitre.org/techniques/T1020/"

[[rule.threat.technique]]
id = "T1567"
name = "Exfiltration Over Web Service"
reference = "https://attack.mitre.org/techniques/T1567/"

[[rule.threat.technique.subtechnique]]
id = "T1567.001"
name = "Exfiltration to Code Repository"
reference = "https://attack.mitre.org/techniques/T1567/001/"

[rule.threat.tactic]
id = "TA0010"
name = "Exfiltration"
reference = "https://attack.mitre.org/tactics/TA0010/"

Triage and analysis

Disclaimer: This investigation guide was created using generative AI technology and has been reviewed to improve its accuracy and relevance. While every effort has been made to ensure its quality, we recommend validating the content and adapting it to suit your specific environment and operational needs.

Investigating GitHub Exfiltration via High Number of Repository Clones by User

This rule flags a single user who generates 25 or more git.clone audit events within the rule's nine-minute lookback window, which may indicate bulk source code exfiltration. Because the query counts events rather than distinct repositories, repeated clones of the same repository also contribute to the total. Mass cloning enables quick siphoning of proprietary code, embedded secrets, and build artifacts across teams before defenses can respond. A typical pattern is a stolen personal access token used in a script to enumerate org repositories and clone them in rapid succession from a CI runner or cloud VM, including private and internal repos, to stage data for off-platform transfer.
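
The aggregation behind this rule can be illustrated outside Elasticsearch. The following is a minimal Python sketch, not the rule's implementation: it assumes audit events have been exported as flat dictionaries keyed by the field names used in the rule query, and that the dataset filter and time window were already applied upstream. The function name `flag_bulk_cloners` and the export format are hypothetical.

```python
from collections import defaultdict

CLONE_THRESHOLD = 25  # mirrors `Esql.event_count >= 25` in the rule query


def flag_bulk_cloners(events, threshold=CLONE_THRESHOLD):
    """Return per-user clone counts for users at or above the threshold.

    `events` is an iterable of flat dicts (a hypothetical export format)
    already limited to the rule's lookback window and dataset.
    """
    counts = defaultdict(int)
    repos = defaultdict(set)
    for event in events:
        # Same event filter as the rule's first `where` clause.
        if event.get("event.type") != "change":
            continue
        if event.get("event.action") != "git.clone":
            continue
        user = event.get("user.name")
        if user is None:
            # Simplification: events without a user are skipped in this sketch.
            continue
        counts[user] += 1  # counts events, not distinct repositories
        repo = event.get("github.repo")
        if repo:
            repos[user].add(repo)
    return {
        user: {"event_count": n, "repos": sorted(repos[user])}
        for user, n in counts.items()
        if n >= threshold
    }
```

Comparing `event_count` with the length of `repos` for a flagged user shows quickly whether the alert reflects many distinct repositories or repeated clones of a few.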

Possible investigation steps

  • Validate whether the actor is a known automation or service account with a documented need to mass-clone, and quickly confirm intent with the account owner and affected repo admins.
  • Enumerate the cloned repositories and their visibility, deprioritizing activity dominated by public repos while fast-tracking private/internal codebases with sensitive content across orgs.
  • Pivot on the token identifier to determine the token owner, scopes, and creation/last-use details, compare to normal usage patterns, and revoke/reset credentials if anomalous.
  • Analyze the user agent and agent identifier to attribute the activity to a specific host or CI runner, correlating with pipeline logs and login locations/times for anomalies.
  • Correlate with endpoint/network telemetry from the originating host for large outbound transfers, external Git remotes, or bulk archiving indicating off-platform exfiltration following the clones.
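
The first few steps above can be sketched as a small triage summary. This is an illustrative Python example under stated assumptions: the events belong to one flagged user and are flat dictionaries keyed by the fields the rule query aggregates (github.repo, github.repository_public, github.token_id, github.user_agent), and the visibility flag is a boolean. The function name `summarize_clone_burst` is hypothetical.

```python
def summarize_clone_burst(events):
    """Summarize one user's git.clone events for triage.

    Splits cloned repositories by visibility and collects the token
    identifiers and user agents to pivot on.
    """
    public_repos, non_public_repos = set(), set()
    token_ids, user_agents = set(), set()
    for event in events:
        repo = event.get("github.repo")
        if repo:
            # A missing or non-boolean visibility flag is treated as
            # non-public so the repository is not deprioritized by mistake.
            if event.get("github.repository_public") is True:
                public_repos.add(repo)
            else:
                non_public_repos.add(repo)
        token_id = event.get("github.token_id")
        if token_id is not None:
            token_ids.add(str(token_id))
        user_agent = event.get("github.user_agent")
        if user_agent:
            user_agents.add(user_agent)
    # If events disagree about a repository, keep it on the non-public side.
    public_repos -= non_public_repos
    return {
        "non_public_repos": sorted(non_public_repos),
        "public_repos": sorted(public_repos),
        "token_ids": sorted(token_ids),
        "user_agents": sorted(user_agents),
    }
```

More than one token identifier or user agent in a single burst may be worth noting, since it can point to several credentials or hosts being involved.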

False positive analysis

  • A developer rebuilding a workstation or creating an approved local mirror may legitimately clone dozens of repositories in a short window, especially when activity is dominated by public or low-sensitivity repos.
  • A shared automation/service account running scheduled builds or org-wide maintenance tasks can trigger fresh clones across many repositories due to pipeline configuration or cache resets, inflating counts without exfiltration intent.
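
One way to apply the two false positive patterns above during triage is a simple priority function. The Python sketch below is a hypothetical policy rather than part of the rule: the names `triage_priority` and `known_automation`, the three priority labels, and the cut-offs are all assumptions to adapt locally.

```python
def triage_priority(user, non_public_repo_count, known_automation=frozenset()):
    """Assign an illustrative triage priority to a clone-burst alert.

    - "low": only public repositories were cloned.
    - "medium": non-public repositories were cloned by an account on the
      documented automation list (still verify the documented need).
    - "high": non-public repositories were cloned by any other account.
    """
    if non_public_repo_count == 0:
        return "low"
    if user in known_automation:
        return "medium"
    return "high"
```

For example, `triage_priority("ci-bot", 12, known_automation={"ci-bot"})` returns `"medium"`, while the same burst from an unlisted account returns `"high"`.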

Response and remediation

  • Immediately revoke the GitHub token used for the clones, force sign out, require password reset and 2FA re-verification for the user, and suspend the account if unauthorized.
  • Block and quarantine the originating host or CI runner by revoking its runner registration, removing its SSH keys/credentials, and firewalling its IP until imaged.
  • On the cloned private/internal repositories, remove the user from teams, rotate or disable deploy keys and GitHub App installations, and enforce SAML SSO.
  • Rotate repository and organization secrets present in those repos (Actions secrets, PATs, SSH keys, cloud access keys) and invalidate any secrets found in commit history.
  • Recover by restoring only minimal access after owner approval, issuing a new fine-grained PAT with least privilege and expiry, and re-enabling builds while monitoring for further clone bursts.
  • Escalate to incident response leadership and Legal if any private or export-controlled repos were cloned or cloning continues post-revocation, and harden by enforcing org-wide SSO, disallowing classic PATs, IP allowlisting for PAT use, enabling secret scanning with push protection, and alerting on burst git clone patterns from runners and unusual user agents.
