GitHub Exfiltration via High Number of Repository Clones by User

Jan 12, 2026 · Domain: Cloud · Use Case: Threat Detection · Tactic: Exfiltration · Data Source: Github · Resources: Investigation Guide

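Detects a high number of repository cloning actions by a single user within a short time frame: the rule fires when one user generates 25 or more git.clone audit events inside its 9-minute lookback window. Adversaries may clone multiple repositories to exfiltrate sensitive data.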

Elastic rule (View on GitHub)

[metadata]
creation_date = "2025/12/16"
integration = ["github"]
maturity = "production"
updated_date = "2026/01/12"

[rule]
author = ["Elastic"]
description = """
Detects a high number of repository cloning actions by a single user within a short time frame. Adversaries may
clone multiple repositories to exfiltrate sensitive data.
"""
from = "now-9m"
interval = "8m"
language = "esql"
license = "Elastic License v2"
name = "GitHub Exfiltration via High Number of Repository Clones by User"
note = """ ## Triage and analysis

> **Disclaimer**:
> This investigation guide was created using generative AI technology and has been reviewed to improve its accuracy and relevance. While every effort has been made to ensure its quality, we recommend validating the content and adapting it to suit your specific environment and operational needs.

### Investigating GitHub Exfiltration via High Number of Repository Clones by User

This rule flags a single user rapidly cloning dozens of repositories, a strong indicator of bulk source code exfiltration. Mass cloning enables quick siphoning of proprietary code, embedded secrets, and build artifacts across teams before defenses can respond. A typical pattern is a stolen personal access token used in a script to enumerate org repositories and clone them in rapid succession from a CI runner or cloud VM, including private and internal repos, to stage data for off-platform transfer.

### Possible investigation steps

- Validate whether the actor is a known automation or service account with a documented need to mass-clone, and quickly confirm intent with the account owner and affected repo admins.
- Enumerate the cloned repositories and their visibility, deprioritizing activity dominated by public repos while fast-tracking private/internal codebases with sensitive content across orgs.
- Pivot on the token identifier to determine the token owner, scopes, and creation/last-use details, compare to normal usage patterns, and revoke/reset credentials if anomalous.
- Analyze the user agent and agent identifier to attribute the activity to a specific host or CI runner, correlating with pipeline logs and login locations/times for anomalies.
- Correlate with endpoint/network telemetry from the originating host for large outbound transfers, external Git remotes, or bulk archiving indicating off-platform exfiltration following the clones.

### False positive analysis

- A developer rebuilding a workstation or creating an approved local mirror may legitimately clone dozens of repositories in a short window, especially when activity is dominated by public or low-sensitivity repos.
- A shared automation/service account running scheduled builds or org-wide maintenance tasks can trigger fresh clones across many repositories due to pipeline configuration or cache resets, inflating counts without exfiltration intent.

### Response and remediation

- Immediately revoke the GitHub token used for the clones, force sign out, require password reset and 2FA re-verification for the user, and suspend the account if unauthorized.
- Block and quarantine the originating host or CI runner by revoking its runner registration, removing its SSH keys/credentials, and firewalling its IP until imaged.
- On the cloned private/internal repositories, remove the user from teams, rotate or disable deploy keys and GitHub App installations, and enforce SAML SSO.
- Rotate repository and organization secrets present in those repos (Actions secrets, PATs, SSH keys, cloud access keys) and invalidate any secrets found in commit history.
- Recover by restoring only minimal access after owner approval, issuing a new fine-grained PAT with least privilege and expiry, and re-enabling builds while monitoring for further clone bursts.
- Escalate to incident response leadership and Legal if any private or export-controlled repos were cloned or cloning continues post-revocation, and harden by enforcing org-wide SSO, disallowing classic PATs, IP allowlisting for PAT use, enabling secret scanning with push protection, and alerting on burst git clone patterns from runners and unusual user agents.
"""
references = [
    "https://www.wiz.io/blog/shai-hulud-2-0-ongoing-supply-chain-attack",
    "https://trigger.dev/blog/shai-hulud-postmortem",
    "https://posthog.com/blog/nov-24-shai-hulud-attack-post-mortem",
]
risk_score = 47
rule_id = "19f3674c-f4a1-43bb-a89c-e4c6212275e0"
severity = "medium"
tags = [
    "Domain: Cloud",
    "Use Case: Threat Detection",
    "Tactic: Exfiltration",
    "Data Source: Github",
    "Resources: Investigation Guide",
]
timestamp_override = "event.ingested"
type = "esql"
query = '''
from logs-github.audit-* metadata _id, _index, _version
| where
  data_stream.dataset == "github.audit" and event.type == "change" and event.action == "git.clone"
| stats
  Esql.event_count = COUNT(*),
  Esql.github_org_values = values(github.org),
  Esql.github_repo_values = values(github.repo),
  Esql.github_repository_public_values = values(github.repository_public),
  Esql.github_token_id_values = values(github.token_id),
  Esql.github_user_agent_values = values(github.user_agent),
  Esql.user_name_values = values(user.name),
  Esql.agent_id_values = values(agent.id),
  Esql.event_dataset_values = values(event.dataset),
  Esql.data_stream_namespace_values = values(data_stream.namespace)

  by user.name

| keep Esql.*

| where
  Esql.event_count >= 25
'''

[[rule.threat]]
framework = "MITRE ATT&CK"

[[rule.threat.technique]]
id = "T1020"
name = "Automated Exfiltration"
reference = "https://attack.mitre.org/techniques/T1020/"

[[rule.threat.technique]]
id = "T1567"
name = "Exfiltration Over Web Service"
reference = "https://attack.mitre.org/techniques/T1567/"

[[rule.threat.technique.subtechnique]]
id = "T1567.001"
name = "Exfiltration to Code Repository"
reference = "https://attack.mitre.org/techniques/T1567/001/"

[rule.threat.tactic]
id = "TA0010"
name = "Exfiltration"
reference = "https://attack.mitre.org/tactics/TA0010/"

Triage and analysis

Disclaimer: This investigation guide was created using generative AI technology and has been reviewed to improve its accuracy and relevance. While every effort has been made to ensure its quality, we recommend validating the content and adapting it to suit your specific environment and operational needs.

Investigating GitHub Exfiltration via High Number of Repository Clones by User

This rule flags a single user rapidly cloning dozens of repositories, a strong indicator of bulk source code exfiltration. Mass cloning enables quick siphoning of proprietary code, embedded secrets, and build artifacts across teams before defenses can respond. A typical pattern is a stolen personal access token used in a script to enumerate org repositories and clone them in rapid succession from a CI runner or cloud VM, including private and internal repos, to stage data for off-platform transfer.
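
The aggregation behind the rule can be approximated offline as a per-user count of git.clone events compared against the threshold of 25 taken from the rule query. The following is a minimal Python sketch, not the rule's implementation: it assumes audit events are available as flat dictionaries keyed by dotted field names, the function name `users_over_threshold` is hypothetical, and it skips the dataset and event.type filters that the real query applies. Note that the query counts events rather than distinct repositories, so repeated clones of one repository also count toward the threshold.

```python
from collections import Counter
from typing import Iterable, Mapping

# Mirrors `Esql.event_count >= 25` in the rule query.
CLONE_THRESHOLD = 25


def users_over_threshold(
    events: Iterable[Mapping[str, str]],
    threshold: int = CLONE_THRESHOLD,
) -> dict[str, int]:
    """Count git.clone events per user and keep users at or above the threshold.

    Assumption: each event is a flat dict with dotted keys such as
    "event.action" and "user.name". Events without a user name are skipped,
    which is a simplification of the rule's `stats ... by user.name`.
    """
    counts = Counter(
        event["user.name"]
        for event in events
        if event.get("event.action") == "git.clone" and event.get("user.name")
    )
    return {user: n for user, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    sample = [{"event.action": "git.clone", "user.name": "alice"}] * 30
    sample += [{"event.action": "git.clone", "user.name": "bob"}] * 3
    sample += [{"event.action": "repo.create", "user.name": "bob"}] * 40
    print(users_over_threshold(sample))
```

In this sample only alice is returned: bob has 3 clone events, and his 40 repo.create events are ignored because they are not git.clone actions.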

Possible investigation steps

- Validate whether the actor is a known automation or service account with a documented need to mass-clone, and quickly confirm intent with the account owner and affected repo admins.
- Enumerate the cloned repositories and their visibility, deprioritizing activity dominated by public repos while fast-tracking private/internal codebases with sensitive content across orgs.
- Pivot on the token identifier to determine the token owner, scopes, and creation/last-use details, compare to normal usage patterns, and revoke/reset credentials if anomalous.
- Analyze the user agent and agent identifier to attribute the activity to a specific host or CI runner, correlating with pipeline logs and login locations/times for anomalies.
- Correlate with endpoint/network telemetry from the originating host for large outbound transfers, external Git remotes, or bulk archiving indicating off-platform exfiltration following the clones.
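
The enumeration and pivoting steps above can be sketched as a small summary over the alerting user's clone events. This is an illustrative Python sketch under stated assumptions: events are flat dictionaries keyed by the field names the rule query aggregates (github.repo, github.repository_public, github.token_id, github.user_agent), the visibility field is assumed to be a boolean, and the function name `summarize_clones` is hypothetical.

```python
from collections import defaultdict
from typing import Any, Iterable, Mapping


def summarize_clones(events: Iterable[Mapping[str, Any]]) -> dict[str, Any]:
    """Split cloned repos by visibility and group them by token and user agent."""
    private_repos: set[str] = set()
    public_repos: set[str] = set()
    by_token: dict[str, set[str]] = defaultdict(set)
    by_user_agent: dict[str, set[str]] = defaultdict(set)

    for event in events:
        repo = event.get("github.repo")
        if not repo:
            continue
        # Assumption: a missing visibility flag is treated as non-public,
        # so the repository lands in the bucket that is reviewed first.
        if event.get("github.repository_public") is True:
            public_repos.add(repo)
        else:
            private_repos.add(repo)
        by_token[event.get("github.token_id") or "unknown"].add(repo)
        by_user_agent[event.get("github.user_agent") or "unknown"].add(repo)

    return {
        "private_or_internal": sorted(private_repos),
        "public": sorted(public_repos),
        "by_token": {k: sorted(v) for k, v in by_token.items()},
        "by_user_agent": {k: sorted(v) for k, v in by_user_agent.items()},
    }
```

Reviewing the private_or_internal list first matches the guidance to fast-track private and internal codebases, while the per-token and per-user-agent groupings show whether a single credential or client accounts for the burst.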

False positive analysis

- A developer rebuilding a workstation or creating an approved local mirror may legitimately clone dozens of repositories in a short window, especially when activity is dominated by public or low-sensitivity repos.
- A shared automation/service account running scheduled builds or org-wide maintenance tasks can trigger fresh clones across many repositories due to pipeline configuration or cache resets, inflating counts without exfiltration intent.
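
The two false positive patterns above can be turned into a rough triage hint. The Python sketch below is only an illustration: the function name `triage_hint`, the list of known automation accounts, and the 0.9 public-share cutoff are all hypothetical values that an organization would need to define for itself, and the result is a prioritization label, not a verdict that the activity is benign.

```python
from typing import Iterable, Sequence


def triage_hint(
    user: str,
    repo_public_flags: Sequence[bool],
    known_automation: Iterable[str],
    public_share_cutoff: float = 0.9,  # hypothetical cutoff, tune per environment
) -> str:
    """Return a coarse label for an alert based on the false positive patterns.

    repo_public_flags holds one boolean per clone event (True = public repo).
    """
    if user in set(known_automation):
        return "known-automation"
    if repo_public_flags:
        public_share = sum(1 for flag in repo_public_flags if flag) / len(repo_public_flags)
        if public_share >= public_share_cutoff:
            return "mostly-public"
    return "needs-review"
```

A "known-automation" or "mostly-public" label still warrants confirming intent with the account owner, since a compromised service account would receive the same label.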

Response and remediation

- Immediately revoke the GitHub token used for the clones, force sign out, require password reset and 2FA re-verification for the user, and suspend the account if unauthorized.
- Block and quarantine the originating host or CI runner by revoking its runner registration, removing its SSH keys/credentials, and firewalling its IP until imaged.
- On the cloned private/internal repositories, remove the user from teams, rotate or disable deploy keys and GitHub App installations, and enforce SAML SSO.
- Rotate repository and organization secrets present in those repos (Actions secrets, PATs, SSH keys, cloud access keys) and invalidate any secrets found in commit history.
- Recover by restoring only minimal access after owner approval, issuing a new fine-grained PAT with least privilege and expiry, and re-enabling builds while monitoring for further clone bursts.
- Escalate to incident response leadership and Legal if any private or export-controlled repos were cloned or cloning continues post-revocation, and harden by enforcing org-wide SSO, disallowing classic PATs, IP allowlisting for PAT use, enabling secret scanning with push protection, and alerting on burst git clone patterns from runners and unusual user agents.

References

Sha1-Hulud 2.0 Supply Chain Attack: 25K+ Repos Exposed | Wiz Blog

Shai-Hulud is back, spreading an npm malware worm through thousands of GitHub repos. Learn the impact, attacker methods, and how to defend your supply chain.

How we got hit by Shai-Hulud: A complete post-mortem | Trigger.dev

On November 25th, one of our engineers was compromised by the Shai-Hulud npm supply chain worm. Here's what happened, how we responded, and what we've changed.

Post-mortem of Shai-Hulud attack on November 24th, 2025 - PostHog

At 4:11 AM UTC on November 24th, a number of our SDKs and other packages were compromised, with a malicious self-replicating worm - Shai-Hulud 2.…