
Detection Pipeline Architecture

Detection Pipelines are the technical backbone of Payment Service Providers (PSPs), Banks, and Risk Platforms. They ingest merchant data, analyze website content, reconstruct payment flows, and score risk in real time.

This section outlines the kind of technical architecture that large PSPs such as Stripe, Adyen, and PayPal use to identify Payment Cloaking and Merchant Evasion.


🏗 The Request Flow

When a customer initiates a transaction or when a merchant applies for an account, the data flows through a rigorous series of checks.

```mermaid
flowchart TD
    User([Customer / User]) -->|Visit| Site["Merchant Website<br/>Hidden or Visible"]
    Site -->|Payment Data| Gateway["Payment Gateway<br/>Tokenization"]
    Gateway -->|Transaction Data| RiskEngine["PSP Risk Engine<br/>The Pipeline"]
    Crawlers([Crawlers & Scanners]) -->|External Data| RiskEngine

    subgraph Analysis["Pipeline Analysis"]
        RiskEngine --> F["1. Fingerprint Analysis"]
        RiskEngine --> C["2. Content Fetching"]
        RiskEngine --> B["3. Behavioral Scoring"]
    end

    Analysis --> Decision{"Decision Engine"}
    Decision -->|Green| Allow["Allow"]
    Decision -->|Amber| Review["Review"]
    Decision -->|Red| Block["Block"]

    %% Styles
    style User fill:#0ea5e9,stroke:#0369a1,color:#fff
    style Site fill:#6366f1,stroke:#4338ca,color:#fff
    style Gateway fill:#0891b2,stroke:#0e7490,color:#fff
    style RiskEngine fill:#7c3aed,stroke:#5b21b6,color:#fff
    style Crawlers fill:#fb923c,stroke:#ea580c,color:#fff
    style Analysis fill:#1e293b,stroke:#0f172a,color:#fff
    style F fill:#15803d,stroke:#14532d,color:#fff
    style C fill:#4f46e5,stroke:#3730a3,color:#fff
    style B fill:#0f766e,stroke:#115e59,color:#fff
    style Decision fill:#475569,stroke:#334155,color:#fff
    style Allow fill:#16a34a,stroke:#166534,color:#fff
    style Review fill:#eab308,stroke:#a16207,color:#000
    style Block fill:#dc2626,stroke:#7f1d1d,color:#fff
```

Pipeline Stages

  1. Ingestion: Raw data collection (Transaction details, HTTP headers, Device Fingerprints).
  2. Enrichment: Adding external context (WHOIS data, IP reputation, Historical TPV).
  3. Analysis: Running algorithmic checks (Velocity models, Content mismatch, Graph clustering).
  4. Decisioning: Final logic gate (Approve, Decline, or Queue for Manual Review).
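
The four stages can be sketched as a chain of small functions. This is a minimal illustration, not any PSP's actual implementation: the `Event` fields, the `merchant_profiles` lookup, and the "one signal means review, two or more means block" rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Event:
    """Stage 1 (Ingestion): raw transaction data. Field names are illustrative."""
    merchant_id: str
    amount: float
    ip_country: str
    context: Dict[str, object] = field(default_factory=dict)  # filled by enrichment
    signals: List[str] = field(default_factory=list)          # filled by analysis


def enrich(event: Event, merchant_profiles: Dict[str, dict]) -> Event:
    """Stage 2 (Enrichment): attach external context from a hypothetical profile store."""
    event.context.update(merchant_profiles.get(event.merchant_id, {}))
    return event


def analyze(event: Event) -> Event:
    """Stage 3 (Analysis): run checks and record which ones fired."""
    typical_max = event.context.get("typical_max_ticket")
    if typical_max is not None and event.amount > typical_max:
        event.signals.append("ticket_size_anomaly")
    declared = event.context.get("declared_country")
    if declared is not None and declared != event.ip_country:
        event.signals.append("geo_mismatch")
    return event


def decide(event: Event) -> str:
    """Stage 4 (Decisioning): map fired signals to allow / review / block."""
    if not event.signals:
        return "allow"
    if len(event.signals) == 1:
        return "review"
    return "block"


def run_pipeline(event: Event, merchant_profiles: Dict[str, dict]) -> str:
    return decide(analyze(enrich(event, merchant_profiles)))
```

Keeping each stage a separate function means new checks can be added to `analyze` without touching ingestion or decisioning. A production decision stage would more likely use weighted scores than a simple signal count.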

🧠 Behind-the-Scenes Engine Logic

1. Web Crawler / Content Fetcher

The "eyes" of the risk engine. It visits the URL provided by the merchant to verify legitimacy.

  • HTML Snapshot: Captures the raw HTML as served, before any JavaScript runs, to check for prohibited keywords (e.g., "Casino", "Pharma").
  • JavaScript Rendering: Uses headless browsers (e.g., Puppeteer, Playwright) to execute JS. This detects Client-Side Cloaking where content changes after load.
  • User-Agent Rotation: The crawler masquerades as a standard iPhone or Chrome Desktop user to bypass simple "Bot Detection" scripts used by cloakers.
  • Pre-Approval vs. Live Scanning:
    • Pre-Approval: Deep scan of every page during onboarding.
    • Live: Periodic, random re-scans to detect "Bait and Switch" tactics.
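
The comparison between the HTML snapshot and the JavaScript-rendered page can be sketched as follows. The example assumes the crawler has already produced two HTML strings (one as served, one after rendering in a headless browser); fetching and rendering are out of scope here. The keyword set and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Set

# Taken from the examples above; a real list would be far larger.
PROHIBITED_KEYWORDS = {"casino", "pharma"}


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_words(html: str) -> Set[str]:
    """Lower-cased words from the page text, with surrounding punctuation removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return {w.strip(".,!?:;\"'()") for w in text.split()} - {""}


def prohibited_hits(html: str) -> Set[str]:
    return visible_words(html) & PROHIBITED_KEYWORDS


def cloaking_signals(static_html: str, rendered_html: str) -> Set[str]:
    """Prohibited keywords that appear only after JavaScript has run."""
    return prohibited_hits(rendered_html) - prohibited_hits(static_html)
```

A non-empty result from `cloaking_signals` suggests the page shows different content to a script-executing browser than to a plain HTTP fetch, which is the Client-Side Cloaking pattern described above.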

2. Payment Flow Reconstruction

Advanced engines don't just look at the homepage; they simulate a purchase.

  • Checkout Simulation: The bot adds an item to the cart and proceeds to checkout to verify that the payment page is hosted on the merchant's own domain or on a recognized PSP-hosted checkout.
  • Redirect Tracing: Tracks HTTP 3xx redirects (e.g., 301, 302, 307) and Meta Refresh tags.
    • Risk Signal: Shop-A.com redirects to Payment-B.com without a clear business relationship.
  • Iframe Inspection: Checks if the payment form is embedded in an iframe from a different, unknown domain.
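
Redirect tracing and iframe inspection both reduce to comparing hostnames against a set of domains the merchant is expected to use. The sketch below assumes the crawler has already recorded the redirect chain as a list of URLs and captured the checkout page HTML. The `trusted_hosts` set and all function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import Iterable, List, Set, Tuple
from urllib.parse import urlparse


def host_of(url: str) -> str:
    """Lower-cased hostname, or '' for relative URLs."""
    return (urlparse(url).hostname or "").lower()


def same_site(host: str, trusted: str) -> bool:
    """True if host equals the trusted domain or is a subdomain of it."""
    return host == trusted or host.endswith("." + trusted)


def untrusted_hops(chain: Iterable[str], trusted_hosts: Set[str]) -> List[Tuple[str, str]]:
    """Return (from_host, to_host) pairs where a redirect leaves the trusted set."""
    hosts = [host_of(u) for u in chain]
    flagged = []
    for src, dst in zip(hosts, hosts[1:]):
        if src != dst and not any(same_site(dst, t) for t in trusted_hosts):
            flagged.append((src, dst))
    return flagged


class _IframeCollector(HTMLParser):
    """Collects the src attribute of every <iframe> on the page."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def untrusted_iframes(html: str, trusted_hosts: Set[str]) -> List[str]:
    """Hosts of iframes that load from outside the trusted set."""
    collector = _IframeCollector()
    collector.feed(html)
    collector.close()
    result = []
    for src in collector.sources:
        host = host_of(src)  # relative src has no host and is treated as same-origin
        if host and not any(same_site(host, t) for t in trusted_hosts):
            result.append(host)
    return result
```

The suffix match in `same_site` is a simplification. A stricter version would compare registrable domains using the Public Suffix List, which the standard library does not provide.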

3. Behavioral Analytics (Backend Logic)

Analyzes the metadata of the traffic rather than the visual content.

  • Velocity Models:
    • Rule: "New merchants in the 'Books' category typically process < $500/day."
    • Anomaly: Merchant #123 processed $50,000 in its first hour. -> Flagged.
  • MCC vs. Ticket Size:
    • Rule: MCC 5192 (Books) usually has a ticket size of $15-$50.
    • Anomaly: Transactions are consistently $499. -> Flagged.
  • Geographic Mismatch:
    • Claim: "We are a local bakery in London."
    • Reality: 95% of customer IP addresses geolocate to Brazil. -> Flagged.
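
The three rules above can be expressed as simple predicates over a merchant's transactions. This is a sketch under stated assumptions: the `Txn` fields, the 50% outlier and foreign-traffic cutoffs, and the 24-hour bucketing are illustrative choices, not values from any real risk engine.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Txn:
    amount: float
    hour: int        # hours elapsed since the merchant went live
    ip_country: str  # country the customer's IP geolocates to


def velocity_flag(txns: Sequence[Txn], daily_limit: float) -> bool:
    """Flag if volume in any 24-hour bucket exceeds the category's expected daily limit."""
    per_day = defaultdict(float)
    for t in txns:
        per_day[t.hour // 24] += t.amount
    return any(total > daily_limit for total in per_day.values())


def ticket_size_flag(txns: Sequence[Txn], low: float, high: float,
                     max_outlier_share: float = 0.5) -> bool:
    """Flag if most tickets fall outside the range expected for the MCC."""
    if not txns:
        return False
    outliers = sum(1 for t in txns if not (low <= t.amount <= high))
    return outliers / len(txns) > max_outlier_share


def geo_mismatch_flag(txns: Sequence[Txn], declared_country: str,
                      max_foreign_share: float = 0.5) -> bool:
    """Flag if most customer IPs are outside the merchant's declared country."""
    if not txns:
        return False
    foreign = sum(1 for t in txns if t.ip_country != declared_country)
    return foreign / len(txns) > max_foreign_share
```

For the examples in the list, a call such as `velocity_flag(txns, daily_limit=500)` would cover the Books velocity rule, `ticket_size_flag(txns, 15, 50)` the MCC 5192 ticket range, and `geo_mismatch_flag(txns, "GB")` the London bakery case.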

🕵️‍♂️ Sandbox vs. Production Evasion

A common cloaking strategy is Bait and Switch.

  1. Onboarding (Sandbox): Merchant uploads a compliant "T-Shirt Shop" template. The automated underwriting bot approves it.
  2. Production (Live): 24 hours later, the merchant swaps the content to "IPTV Subscriptions."

Counter-Measure: PSPs use Continuous Monitoring (or "Eternal Scanning").

  • Snapshot Diffing: The system captures the homepage (a screenshot plus the rendered text) every 24 hours and compares it to the "Master" snapshot from onboarding.
  • Significant Change Alert: If the text similarity score drops below 80% (e.g., "Cotton" replaced with "Channel List"), the account is suspended.
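
A text similarity check of this kind can be sketched with Python's standard `difflib`. The source does not say which similarity measure is used, so `SequenceMatcher.ratio()` over word lists is an assumption here. Only the 80% threshold comes from the text above.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.80  # the 80% cutoff mentioned above


def text_similarity(baseline: str, current: str) -> float:
    """Word-level similarity in [0, 1], computed with difflib's ratio."""
    a = baseline.lower().split()
    b = current.lower().split()
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def significant_change(baseline: str, current: str,
                       threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """True when the current page text has drifted too far from the master snapshot."""
    return text_similarity(baseline, current) < threshold
```

Comparing words rather than characters makes the score less sensitive to small typographic edits and more sensitive to product vocabulary being replaced, which is the change this check is meant to catch.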

🔮 Future of Detection: Machine Learning

Modern pipelines use Unsupervised Learning to detect Merchant Clusters.

  • Clustering: Finding merchants who share no obvious details (Name/Email) but share subtle technical fingerprints (Same CSS structure, same Hosting ASN, same "About Us" text).
  • Graph Databases: Mapping relationship networks. If Merchant A is banned for fraud, and Merchant B shares the same Google Analytics ID, Merchant B is auto-banned.
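
The relationship mapping can be illustrated without a graph database: link merchants that share any fingerprint value, then walk the graph outward from a banned merchant. The fingerprint strings, merchant names, and function names below are hypothetical, and a plain breadth-first search stands in for whatever graph query a real system would run.

```python
from collections import defaultdict, deque
from typing import Dict, Set


def build_graph(fingerprints: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Link merchants that share at least one fingerprint value."""
    by_value = defaultdict(set)
    for merchant, values in fingerprints.items():
        for value in values:
            by_value[value].add(merchant)
    graph: Dict[str, Set[str]] = {m: set() for m in fingerprints}
    for merchants in by_value.values():
        for m in merchants:
            graph[m] |= merchants - {m}
    return graph


def linked_merchants(graph: Dict[str, Set[str]], banned: str) -> Set[str]:
    """Breadth-first search: every merchant reachable from the banned one."""
    seen = {banned}
    queue = deque([banned])
    while queue:
        current = queue.popleft()
        for neighbour in graph.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen - {banned}
```

This sketch treats every shared attribute as equally strong. In practice a shared Hosting ASN is a much weaker link than a shared Google Analytics ID, so edges would likely need weights before the result is used for automatic bans.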


Risk Science Documentation - Payment Cloaking & Evasion