Detection Pipeline Architecture
Detection pipelines are the technical backbone of Payment Service Providers (PSPs), banks, and risk platforms. They ingest merchant data, analyze website content, reconstruct payment flows, and score risk in real time.
This section outlines the kind of technical architecture that large PSPs such as Stripe, Adyen, and PayPal use to identify Payment Cloaking and Merchant Evasion.
🏗 The Request Flow
When a customer initiates a transaction or when a merchant applies for an account, the data flows through a rigorous series of checks.
flowchart TD
User([Customer / User]) -->|Visit| Site["Merchant Website<br/>Hidden or Visible"]
Site -->|Payment Data| Gateway["Payment Gateway<br/>Tokenization"]
Gateway -->|Transaction Data| RiskEngine["PSP Risk Engine<br/>The Pipeline"]
Crawlers([Crawlers & Scanners]) -->|External Data| RiskEngine
subgraph Analysis["Pipeline Analysis"]
RiskEngine --> F["1. Fingerprint Analysis"]
RiskEngine --> C["2. Content Fetching"]
RiskEngine --> B["3. Behavioral Scoring"]
end
Analysis --> Decision{"Decision Engine"}
Decision -->|Green| Allow["Allow"]
Decision -->|Amber| Review["Review"]
Decision -->|Red| Block["Block"]
%% Styles
style User fill:#0ea5e9,stroke:#0369a1,color:#fff
style Site fill:#6366f1,stroke:#4338ca,color:#fff
style Gateway fill:#0891b2,stroke:#0e7490,color:#fff
style RiskEngine fill:#7c3aed,stroke:#5b21b6,color:#fff
style Crawlers fill:#fb923c,stroke:#ea580c,color:#fff
style Analysis fill:#1e293b,stroke:#0f172a,color:#fff
style F fill:#15803d,stroke:#14532d,color:#fff
style C fill:#4f46e5,stroke:#3730a3,color:#fff
style B fill:#0f766e,stroke:#115e59,color:#fff
style Decision fill:#475569,stroke:#334155,color:#fff
style Allow fill:#16a34a,stroke:#166534,color:#fff
style Review fill:#eab308,stroke:#a16207,color:#000
style Block fill:#dc2626,stroke:#7f1d1d,color:#fff
Pipeline Stages
- Ingestion: Raw data collection (Transaction details, HTTP headers, Device Fingerprints).
- Enrichment: Adding external context (WHOIS data, IP reputation, Historical TPV).
- Analysis: Running algorithmic checks (Velocity models, Content mismatch, Graph clustering).
- Decisioning: Final logic gate (Approve, Decline, or Queue for Manual Review).
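The four stages can be read as a chain of small functions. The following is a minimal sketch under stated assumptions, not a description of any real PSP's engine: the `Event` type, the `ip_reputation` lookup table, the $1,000 amount threshold, and the signal-count mapping to Allow / Review / Block are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One transaction or onboarding request moving through the pipeline."""
    raw: dict                                          # filled by Ingestion
    context: dict = field(default_factory=dict)        # filled by Enrichment
    signals: list[str] = field(default_factory=list)   # filled by Analysis


def ingest(payload: dict) -> Event:
    # Ingestion: copy the raw fields so later stages cannot mutate the input.
    return Event(raw=dict(payload))


def enrich(event: Event, ip_reputation: dict[str, str]) -> Event:
    # Enrichment: a plain dict stands in for an external IP-reputation service.
    ip = event.raw.get("ip", "")
    event.context["ip_reputation"] = ip_reputation.get(ip, "unknown")
    return event


def analyze(event: Event) -> Event:
    # Analysis: two toy checks; real engines run many more.
    if event.context.get("ip_reputation") == "bad":
        event.signals.append("bad_ip")
    if event.raw.get("amount", 0) > 1000:  # illustrative threshold
        event.signals.append("high_amount")
    return event


def decide(event: Event) -> str:
    # Decisioning: 0 signals -> allow, 1 -> review, 2 or more -> block.
    count = len(event.signals)
    if count == 0:
        return "allow"
    if count == 1:
        return "review"
    return "block"


def run_pipeline(payload: dict, ip_reputation: dict[str, str]) -> str:
    return decide(analyze(enrich(ingest(payload), ip_reputation)))
```

Keeping each stage as a separate function means a new check can be added to `analyze` without touching ingestion or decision logic.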
🧠 Behind-the-Scenes Engine Logic
1. Web Crawler / Content Fetcher
The "eyes" of the risk engine. It visits the URL provided by the merchant to verify legitimacy.
- HTML Snapshot: Captures the static DOM to check for prohibited keywords (e.g., "Casino", "Pharma").
- JavaScript Rendering: Uses headless browsers (e.g., Puppeteer, Playwright) to execute JS. This detects Client-Side Cloaking where content changes after load.
- User-Agent Rotation: The crawler masquerades as a standard iPhone or Chrome Desktop user to bypass simple "Bot Detection" scripts used by cloakers.
- Pre-Approval vs. Live Scanning:
  - Pre-Approval: Deep scan of every page during onboarding.
  - Live: Periodic, random re-scans to detect "Bait and Switch" tactics.
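The keyword check and the client-side cloaking check can be sketched without a browser by comparing two HTML strings: the static snapshot and the DOM captured after JavaScript has run. This is a minimal sketch that assumes both strings have already been fetched (for example by a headless browser); the `PROHIBITED` list and all function names are hypothetical.

```python
import re
from html.parser import HTMLParser

PROHIBITED = ("casino", "pharma")  # illustrative list, taken from the examples above


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace into single spaces.
    return " ".join(" ".join(parser.chunks).split())


def prohibited_hits(html: str, keywords=PROHIBITED) -> set[str]:
    text = visible_text(html).lower()
    # Match at a word start so "pharma" also catches "pharmacy".
    return {kw for kw in keywords if re.search(rf"\b{re.escape(kw)}", text)}


def cloaking_signal(static_html: str, rendered_html: str) -> set[str]:
    """Prohibited keywords that appear only after JavaScript has run."""
    return prohibited_hits(rendered_html) - prohibited_hits(static_html)
```

A non-empty result from `cloaking_signal` means the page looked clean in the static snapshot but showed prohibited terms once rendered, which is the pattern the JavaScript Rendering step is meant to catch.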
2. Payment Flow Reconstruction
Advanced engines don't just look at the homepage; they simulate a purchase.
- Checkout Simulation: The bot adds an item to the cart and proceeds to checkout to verify the payment page is hosted on the same domain.
- Redirect Tracing: Tracks 301/302 redirects and Meta Refresh tags.
  - Risk Signal: If Shop-A.com redirects to Payment-B.com without a clear business relationship.
- Iframe Inspection: Checks if the payment form is embedded in an iframe from a different, unknown domain.
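Redirect tracing and iframe inspection both reduce to comparing hostnames. The sketch below assumes the crawler has already recorded the redirect chain as a list of URLs and has the checkout page's HTML; `trusted_hosts` is a hypothetical allowlist of hosts with a known business relationship. It compares exact hostnames, so `www.shop-a.com` and `shop-a.com` count as different; comparing registrable domains would need a Public Suffix List lookup, which is left out here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


def host(url: str) -> str:
    # urlsplit().hostname is already lowercased; it is None for relative URLs.
    return urlsplit(url).hostname or ""


def cross_domain_hops(chain, trusted_hosts=frozenset()):
    """Return (from_host, to_host) for each hop that lands on an untrusted host."""
    hops = []
    for src, dst in zip(chain, chain[1:]):
        src_host, dst_host = host(src), host(dst)
        if src_host != dst_host and dst_host not in trusted_hosts:
            hops.append((src_host, dst_host))
    return hops


class _IframeCollector(HTMLParser):
    """Collects the src attribute of every <iframe> on the page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def foreign_iframes(page_url, html, trusted_hosts=frozenset()):
    """Return hosts of iframes served from a different, untrusted host."""
    collector = _IframeCollector()
    collector.feed(html)
    collector.close()
    page_host = host(page_url)
    flagged = []
    for src in collector.sources:
        frame_host = host(urljoin(page_url, src))  # resolve relative src values
        if frame_host != page_host and frame_host not in trusted_hosts:
            flagged.append(frame_host)
    return flagged
```

With the chain `["https://shop-a.com/cart", "https://payment-b.com/pay"]` and an empty allowlist, `cross_domain_hops` reports the hop to `payment-b.com`; adding that host to `trusted_hosts` clears it.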
3. Behavioral Analytics (Backend Logic)
Analyzes the metadata of the traffic rather than the visual content.
- Velocity Models:
  - Rule: "New merchants in the 'Books' category typically process < $500/day."
  - Anomaly: Merchant #123 processed $50,000 in hour 1. -> Flagged.
- MCC vs. Ticket Size:
  - Rule: MCC 5192 (Books) usually has a ticket size of $15-$50.
  - Anomaly: Transactions are consistently $499. -> Flagged.
- Geographic Mismatch:
  - Claim: "We are a local bakery in London."
  - Reality: 95% of IP addresses are from Brazil. -> Flagged.
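The three rules above can be expressed as simple checks over a window of transactions. This is a minimal sketch: the `BASELINES` table reuses the numbers from the examples, while the `Txn` type, the 50% cut-offs, and the assumption that `txns` already holds one monitoring window (for example one day) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    amount: float   # transaction amount in USD
    country: str    # country code derived from the payer's IP address


# Illustrative per-MCC baselines, using the figures quoted in the rules above.
BASELINES = {
    "5192": {"daily_volume": 500.0, "ticket_range": (15.0, 50.0)},
}


def behavioral_flags(mcc, declared_country, txns, foreign_share_limit=0.5):
    """Return the names of the rules that the transaction window violates."""
    flags = []
    if not txns:
        return flags

    baseline = BASELINES.get(mcc)
    if baseline:
        # Velocity: total volume in the window exceeds the category norm.
        if sum(t.amount for t in txns) > baseline["daily_volume"]:
            flags.append("velocity")
        # Ticket size: most transactions fall outside the usual range.
        low, high = baseline["ticket_range"]
        outside = sum(1 for t in txns if not low <= t.amount <= high)
        if outside / len(txns) > 0.5:
            flags.append("ticket_size")

    # Geographic mismatch: most payers are outside the declared country.
    foreign = sum(1 for t in txns if t.country != declared_country)
    if foreign / len(txns) > foreign_share_limit:
        flags.append("geo_mismatch")
    return flags
```

For a merchant declared in `"GB"` under MCC 5192, a window of twenty $499 transactions with nineteen from `"BR"` trips all three rules, while two transactions of $20 and $35 from `"GB"` trip none.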
🕵️‍♂️ Sandbox vs. Production Evasion
A common cloaking strategy is Bait and Switch.
- Onboarding (Sandbox): Merchant uploads a compliant "T-Shirt Shop" template. The automated underwriting bot approves it.
- Production (Live): 24 hours later, the merchant swaps the content to "IPTV Subscriptions."
Counter-Measure: PSPs use Continuous Monitoring (or "Eternal Scanning").
- Snapshot Diffing: The system captures the homepage (screenshot and rendered text) every 24 hours and compares it to the "Master" snapshot from onboarding.
- Significant Change Alert: If the text similarity score drops below 80% (e.g., product copy about "Cotton" replaced with a "Channel List"), the account is suspended.
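One way to compute a text similarity score is the standard library's `difflib.SequenceMatcher`, applied to word tokens. This is a sketch of that approach, not necessarily the metric a given PSP uses; the 0.80 threshold mirrors the 80% figure above, and the function names are hypothetical.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.80  # mirrors the 80% figure quoted above


def text_similarity(master: str, current: str) -> float:
    """Word-level similarity in [0.0, 1.0] between two page texts."""
    master_words = master.lower().split()
    current_words = current.lower().split()
    # autojunk=False disables the popularity heuristic, which can distort
    # scores on long pages with many repeated words.
    return SequenceMatcher(None, master_words, current_words, autojunk=False).ratio()


def significant_change(master: str, current: str,
                       threshold: float = SIMILARITY_THRESHOLD) -> bool:
    return text_similarity(master, current) < threshold
```

Comparing words rather than characters keeps small edits (a price change, a typo fix) from moving the score much, while a wholesale swap of the product copy drives it toward zero.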
🔮 Future of Detection: Machine Learning
Modern pipelines use Unsupervised Learning to detect Merchant Clusters.
- Clustering: Finding merchants who share no obvious details (Name/Email) but share subtle technical fingerprints (Same CSS structure, same Hosting ASN, same "About Us" text).
- Graph Databases: Mapping relationship networks. If Merchant A is banned for fraud, and Merchant B shares the same Google Analytics ID, Merchant B is auto-banned.
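The relationship-mapping idea can be sketched without a graph database by treating each shared fingerprint as an edge and finding connected components with union-find. This is a minimal sketch: the merchant names, the `ga:` and `asn:` attribute labels, and the ID values are all made up for illustration.

```python
from collections import defaultdict


def merchant_clusters(fingerprints: dict[str, set[str]]) -> list[set[str]]:
    """Group merchants that share at least one fingerprint, directly or transitively."""
    parent = {merchant: merchant for merchant in fingerprints}

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every merchant to the first merchant seen with the same attribute.
    first_seen = {}
    for merchant, attrs in fingerprints.items():
        for attr in attrs:
            if attr in first_seen:
                union(first_seen[attr], merchant)
            else:
                first_seen[attr] = merchant

    groups = defaultdict(set)
    for merchant in fingerprints:
        groups[find(merchant)].add(merchant)
    return [group for group in groups.values() if len(group) > 1]


def linked_to_banned(fingerprints, banned: set[str]) -> set[str]:
    """Merchants that sit in the same cluster as a banned merchant."""
    linked = set()
    for cluster in merchant_clusters(fingerprints):
        if cluster & banned:
            linked |= cluster - banned
    return linked
```

The function only reports which merchants are linked; whether they are auto-banned, as described above, or queued for review is a policy decision made downstream.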
