Attachment: Duplicated header pages in fraudulent multi-page PDF Request for Quotation

Detects inbound PDF attachments of two or three pages in which the first three lines of header text on page 1 reappear nearly unchanged (Levenshtein distance of 5 or less) on page 2 or page 3, and in which the OCR text matches at least three of four groups of phrases commonly found in fraudulent Request for Quotation (RFQ) documents: procurement language, payment conditions, preference point systems, and signature or declaration wording. This pattern is consistent with fabricated or manipulated procurement documents used in business email compromise (BEC) or fraud schemes.
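
The header comparison described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Sublime's implementation: the function names (`first_three_lines`, `levenshtein`, `has_duplicated_header`) are hypothetical, the per-page OCR text is assumed to be already available as a list of strings, and the comparison is assumed to be case-sensitive.

```python
import re
from typing import Optional

# Same shape as the rule's pattern: three consecutive non-empty lines,
# each terminated by a line break, anchored at the start of the page text.
HEADER_RE = re.compile(r'^((?:[^\r\n]+[\r\n]+){3})')


def first_three_lines(page_text: str) -> Optional[str]:
    """Return the first three lines of a page, or None if the page has fewer."""
    match = HEADER_RE.match(page_text)
    return match.group(1) if match else None


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) using two rolling rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def has_duplicated_header(pages: list[str], max_distance: int = 5) -> bool:
    """True if page 2 or page 3 opens with nearly the same header as page 1."""
    if not 2 <= len(pages) <= 3:
        return False
    head = first_three_lines(pages[0])
    if not head:
        return False
    for page in pages[1:]:
        other = first_three_lines(page)
        if other is not None and levenshtein(head, other) <= max_distance:
            return True
    return False


# Illustrative data only: page 2 drops a comma from the address line.
page_1 = "ACME Holdings Ltd\n12 Main Road, Springfield\nTel: 555-0100\nREQUEST FOR QUOTATION\n"
page_2 = "ACME Holdings Ltd\n12 Main Road Springfield\nTel: 555-0100\nItem list\n"
print(has_duplicated_header([page_1, page_2]))
```

The small distance budget tolerates trivial differences between repeated headers, such as a dropped comma in an address, while still rejecting pages that open with unrelated text.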

Sublime rule (View on GitHub)

name: "Attachment: Duplicated header pages in fraudulent multi-page PDF Request for Quotation"
description: "Detects inbound PDF attachments between 2-3 pages where the first three lines of the header text appear nearly identical (within a Levenshtein distance of 5) on multiple pages, combined with terminology commonly found in fraudulent Request for Quotation (RFQ) documents such as procurement language, payment conditions, and preference point systems. This pattern is consistent with fabricated or manipulated procurement documents used in BEC or fraud schemes."
type: "rule"
severity: "medium"
source: |
  type.inbound
  and any(attachments,
          .file_type == "pdf"
          //
          // This rule makes use of a beta feature and is subject to change without notice
          // using the beta feature in custom rules is not suggested until it has been formally released
          //
          and 1 < beta.parse_exif(.).page_count <= 3
          //
          // This rule makes use of a beta feature and is subject to change without notice
          // using the beta feature in custom rules is not suggested until it has been formally released
          //
          and beta.ocr(.).page_results[0].text != ""
          // extract the first 3 lines from the first page
          and any(regex.iextract(beta.ocr(.).page_results[0].text,
                                 '^(?P<page_1_3_lines>(?:[^\r\n]+[\r\n]+){3})'
                  ),
                  // make sure we have something
                  .named_groups["page_1_3_lines"] != ""
                  // either page 2 or page 3 contain VERY close text
                  // we have to do very close because sometimes the address format changes in trivial ways
                  // like a comma is removed or something
                  and (
                    any(regex.iextract(beta.ocr(..).page_results[1].text,
                                       '^(?P<page_2_3_lines>(?:[^\r\n]+[\r\n]+){3})'
                        ),
                        strings.levenshtein(.named_groups["page_2_3_lines"],
                                            ..named_groups["page_1_3_lines"]
                        ) <= 5
                    )
                    or any(regex.iextract(beta.ocr(..).page_results[2].text,
                                          '^(?P<page_3_3_lines>(?:[^\r\n]+[\r\n]+){3})'
                           ),
                           strings.levenshtein(.named_groups["page_3_3_lines"],
                                               ..named_groups["page_1_3_lines"]
                           ) <= 5
                    )
                  )
          )
          and (
            3 of (
              strings.icontains(beta.ocr(.).text,
                                "Contact Person",
                                "Date of issue",
                                "Date Issued",
                                "Closing Time and Date",
                                "Closing Date",
                                "QUOTATIONS ARE HEREBY INVITED FOR THE SUPPLY OF",
                                "COMPULSORY BIDDERS MUST QUOTE",
                                "Method of RFQ Submission"
              ),
              strings.icontains(beta.ocr(.).text,
                                "Must be inclusive",
                                "Tax on Price Quotation",
                                "100% payment made in the form",
                                "Conditions for Release of Payment",
                                "Period of Validity of Quotes",
                                "Partial Bids",
                                "PRICING QUOTATION",
                                "Payment Terms and Conditions",
                                "freight, insurance until acceptance"
              ),
              strings.icontains(beta.ocr(.).text,
                                "80/20 preference point system",
                                "80:20",
                                "Selection of suppliers will be based on",
                                "plant upgrade and maintenance",
                                "plant maintenance",
                                "without an Official Purchase Order",
                                "If unable to quote"
              ),
              strings.icontains(beta.ocr(.).text,
                                "remain binding upon me/us",
                                "Authorized Signature",
                                "Name and Capacity",
                                "must be completed and accompanied by",
                                "This is not a Purchase Order"
              )
            )
          )
  )
attack_types:
  - "BEC/Fraud"
tactics_and_techniques:
  - "PDF"
  - "Social engineering"
detection_methods:
  - "Optical Character Recognition"
  - "Content analysis"
  - "File analysis"
id: "947071ba-ae40-5ac2-b5cb-9d939dea268f"
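
The `3 of (...)` clause in the rule source can be read as a threshold over four phrase groups: a group counts once if any of its phrases occurs in the OCR text, ignoring case, and the clause holds when at least three groups count. The Python sketch below illustrates that reading. It is an approximation under stated assumptions, not the MQL engine's behavior: the group labels (`rfq_header`, `payment_terms`, `supplier_selection`, `declaration`) and the function names are hypothetical, and only the phrases themselves are taken from the rule.

```python
# Phrase lists copied from the rule; the group labels are hypothetical.
KEYWORD_GROUPS: dict[str, tuple[str, ...]] = {
    "rfq_header": (
        "Contact Person", "Date of issue", "Date Issued",
        "Closing Time and Date", "Closing Date",
        "QUOTATIONS ARE HEREBY INVITED FOR THE SUPPLY OF",
        "COMPULSORY BIDDERS MUST QUOTE", "Method of RFQ Submission",
    ),
    "payment_terms": (
        "Must be inclusive", "Tax on Price Quotation",
        "100% payment made in the form",
        "Conditions for Release of Payment",
        "Period of Validity of Quotes", "Partial Bids",
        "PRICING QUOTATION", "Payment Terms and Conditions",
        "freight, insurance until acceptance",
    ),
    "supplier_selection": (
        "80/20 preference point system", "80:20",
        "Selection of suppliers will be based on",
        "plant upgrade and maintenance", "plant maintenance",
        "without an Official Purchase Order", "If unable to quote",
    ),
    "declaration": (
        "remain binding upon me/us", "Authorized Signature",
        "Name and Capacity", "must be completed and accompanied by",
        "This is not a Purchase Order",
    ),
}


def matching_groups(ocr_text: str) -> list[str]:
    """Return the labels of groups with at least one phrase in the text."""
    haystack = ocr_text.casefold()  # case-insensitive substring search
    return [
        label
        for label, phrases in KEYWORD_GROUPS.items()
        if any(phrase.casefold() in haystack for phrase in phrases)
    ]


def looks_like_rfq(ocr_text: str, min_groups: int = 3) -> bool:
    """True when at least `min_groups` distinct phrase groups match."""
    return len(matching_groups(ocr_text)) >= min_groups


# Illustrative text only: it hits three different groups.
sample = (
    "Contact Person: J. Doe\n"
    "Payment Terms and Conditions apply\n"
    "This is not a Purchase Order\n"
)
print(matching_groups(sample), looks_like_rfq(sample))
```

Counting groups rather than individual phrases means several hits from the same group (for example, three different date fields) do not satisfy the threshold on their own; the document has to touch several distinct aspects of an RFQ.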