Thesis: a small slice of FX customers trade on a rhythm you can predict — weekly, biweekly, monthly. Find them, score them, and email two days before their next trade instead of blasting everyone on Tuesday.
Built for a money-transfer company. Synthetic dataset (10,000 trades, 300 customers, 24 months) stands in for the proprietary real one; the method and code are production-faithful.
Objective: score each customer's trading cadence, classify into tiers, and trigger an email 2 days before their predicted next trade.
Of 300 customers in the dataset, 27 turned out to be predictable enough to act on — and they punch well above their weight on revenue. Here's the punchline before the walkthrough:
The output isn't a segment — it's a dated send list. Each predictable customer gets a predicted next-trade date (last trade + their average gap) and an email trigger date two days earlier. Drop the CSV into Salesforce Marketing Cloud, and the campaign runs itself.
Most email marketing sends on the company's calendar. Some customers, though, live on their own rhythm — payday remittance, rent abroad, a monthly treasury sweep. If we can spot that rhythm in the data, we can meet them at their moment instead of ours.
Each block below is a faithful rebuild of a cell from the notebook. Code is real; the note next to it says why the step matters in plain English.
Real customer data is proprietary, so I generated a synthetic replica that mirrors it — 300 customers, 10,000 trades, 24-month window, weighted across 20 popular currency corridors (USD→EUR, GBP→USD, etc.) using approximate mid-market rates.
8% of customers are seeded as predictable (weekly, biweekly, or monthly cadence with ±1 day jitter). The other 92% trade at random intervals. The model doesn't know which is which — it has to find the predictable ones from the gap data alone.
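For readers who want to see the seeding idea in code, here is a minimal standard-library sketch. It is not the notebook's generator: the function names, the 30-day length for "monthly", and the choice to apply the ±1 day jitter to each gap (rather than to each date on a fixed grid) are all assumptions made for illustration.

```python
import random
from datetime import date, timedelta

# Assumed gap lengths per cadence; "monthly" is approximated as 30 days.
CADENCE_DAYS = {"weekly": 7, "biweekly": 14, "monthly": 30}

def predictable_dates(start, end, cadence, rng, jitter=1):
    """Trade dates on a fixed cadence, each gap nudged by up to +/- jitter days."""
    step = CADENCE_DAYS[cadence]
    dates, current = [], start
    while current <= end:
        dates.append(current)
        current += timedelta(days=step + rng.randint(-jitter, jitter))
    return dates

def random_dates(start, end, n_trades, rng):
    """Trade dates scattered uniformly over the window, with no rhythm."""
    span = (end - start).days
    return sorted(start + timedelta(days=rng.randint(0, span)) for _ in range(n_trades))

rng = random.Random(42)  # fixed seed so the ground truth is reproducible
start, end = date(2024, 1, 1), date(2025, 12, 31)
monthly = predictable_dates(start, end, "monthly", rng)
irregular = random_dates(start, end, 30, rng)
```

Because the generator records which customers got a cadence, the later classification can be scored against a known answer.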
Seed with known ground truth, so we can check the model's work.

| transaction_id | customer_id | date | send_amount | sell | buy | rate | revenue |
|---|---|---|---|---|---|---|---|
| TXN_003001 | CUST_0001 | 2024-11-12 | 2,475.97 | USD | JPY | 148.77 | 37.14 |
| TXN_009478 | CUST_0002 | 2024-01-01 | 1,626.55 | GBP | JPY | 189.20 | 24.40 |
| TXN_009479 | CUST_0002 | 2024-02-10 | 3,412.80 | EUR | CHF | 0.96 | 51.19 |
| TXN_009480 | CUST_0002 | 2024-03-01 | 6,822.90 | USD | AUD | 1.53 | 102.34 |
| TXN_009481 | CUST_0002 | 2024-03-15 | 2,995.68 | USD | AUD | 1.53 | 44.94 |
Sort every customer's trades by date, then use .diff() to get the days between each trade and the one before it. That list of gaps per person is the raw signal the whole rest of the project sits on top of.
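The gap step can be sketched without pandas as follows. This is a plain-Python equivalent of a sort followed by a per-customer `.diff()`, not the notebook cell itself; `gaps_by_customer` is a hypothetical helper name, and the dates are the ones shown in the sample rows above.

```python
from collections import defaultdict
from datetime import date

def gaps_by_customer(trades):
    """Map customer_id -> list of day gaps between consecutive trades."""
    dates = defaultdict(list)
    for customer_id, trade_date in trades:
        dates[customer_id].append(trade_date)
    gaps = {}
    for customer_id, ds in dates.items():
        ds.sort()  # order matters: gaps are between neighbouring trades
        gaps[customer_id] = [(b - a).days for a, b in zip(ds, ds[1:])]
    return gaps

# Dates taken from the sample rows, deliberately out of order.
trades = [
    ("CUST_0002", date(2024, 3, 1)),
    ("CUST_0002", date(2024, 1, 1)),
    ("CUST_0002", date(2024, 2, 10)),
    ("CUST_0002", date(2024, 3, 15)),
    ("CUST_0001", date(2024, 11, 12)),
]
gaps = gaps_by_customer(trades)
```

A customer with a single trade ends up with an empty gap list, which is why a minimum-trade filter is needed before scoring.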
Then per-customer aggregates: mean gap, standard deviation, min, max, total revenue, tenure. One row per customer, ready for scoring.
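A rough sketch of the one-row-per-customer summary, assuming tenure means days from first to last trade and that the standard deviation is the sample version. Both are my assumptions, and `summarize` is a hypothetical name; the revenue figures come from the CUST_0002 sample rows.

```python
from datetime import date
from statistics import mean, stdev

def summarize(customer_trades):
    """One summary row from a non-empty list of (trade_date, revenue) pairs."""
    ordered = sorted(customer_trades)
    dates = [d for d, _ in ordered]
    gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
    return {
        "n_trades": len(dates),
        "mean_gap": mean(gaps) if gaps else None,
        "std_gap": stdev(gaps) if len(gaps) >= 2 else None,  # sample std needs 2+ gaps
        "min_gap": min(gaps) if gaps else None,
        "max_gap": max(gaps) if gaps else None,
        "total_revenue": round(sum(r for _, r in ordered), 2),
        "tenure_days": (dates[-1] - dates[0]).days,  # assumed: first to last trade
    }

row = summarize([
    (date(2024, 1, 1), 24.40),
    (date(2024, 2, 10), 51.19),
    (date(2024, 3, 1), 102.34),
    (date(2024, 3, 15), 44.94),
])
```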
Quick explainer — take a customer's gaps (say, 30, 29, 31, 30 days). Divide how much they wobble (standard deviation) by their average. Small wobble = low score = predictable. Statisticians call this ratio the Coefficient of Variation, or CV.
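To make the ratio concrete, here is the example above worked in code. It assumes the sample standard deviation (dividing by n - 1, which is also the pandas default); the population version would give a slightly smaller number. The second gap list is made up purely for contrast.

```python
from statistics import mean, stdev

def coefficient_of_variation(gaps):
    """Sample standard deviation of the gaps divided by their mean."""
    return stdev(gaps) / mean(gaps)

steady = coefficient_of_variation([30, 29, 31, 30])   # about 0.027
erratic = coefficient_of_variation([3, 45, 12, 60])   # about 0.90 (made-up gaps)
```

Because the wobble is divided by the mean, a weekly trader and a monthly trader with the same relative consistency get comparable scores.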
Compute CV for every customer with at least 3 trades — fewer than that and you can't tell a rhythm from a coincidence. The eligible pool: 278 customers. The CV distribution runs from a tight 0.01 (perfect clockwork) up to 1.55 (chaotic).
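The eligibility rule can be sketched like this. `score_customers` and the toy gap lists are hypothetical, and the sketch assumes no customer has a mean gap of zero (all trades on one day), which would need its own handling.

```python
from statistics import mean, stdev

MIN_TRADES = 3  # threshold stated in the text

def score_customers(gaps_by_customer):
    """Return {customer_id: CV} for customers with at least MIN_TRADES trades."""
    scores = {}
    for customer_id, gaps in gaps_by_customer.items():
        n_trades = len(gaps) + 1  # n trades produce n - 1 gaps
        if n_trades < MIN_TRADES:
            continue  # too few trades to tell rhythm from coincidence
        scores[customer_id] = stdev(gaps) / mean(gaps)
    return scores

scores = score_customers({
    "A": [30, 29, 31, 30],  # 5 trades, steady
    "B": [12],              # 2 trades, skipped
    "C": [7, 8],            # 3 trades, the minimum
})
```

Three trades is also the smallest count for which a sample standard deviation of the gaps is defined, since it needs at least two gaps.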
Minimum 3 trades. Two points make a line, not a pattern.

Apply the thresholds: CV ≤ 0.10 = Highly Predictable, 0.10–0.20 = Moderately Predictable, everything above = Irregular. That lands:
Total actionable: 27 customers, 9.0% of the base. And the seeded ground truth was 24 — the model recovered all of them and picked up 3 more irregulars who happened to fall into a tight rhythm by chance.
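The tiering and the ground-truth check can be sketched as below. Treating 0.20 as inclusive for the moderate tier is my assumption, since the text does not say which side the boundary falls on. The integer IDs in the recovery check are placeholders that only reproduce the counts reported above (24 seeded, 27 flagged, all seeded ones found).

```python
def classify(cv):
    """Map a CV score to a tier using the thresholds from the text."""
    if cv <= 0.10:
        return "Highly Predictable"
    if cv <= 0.20:  # assumed inclusive upper bound
        return "Moderately Predictable"
    return "Irregular"

def recovery(flagged, seeded):
    """Recall and precision of the flagged set against the seeded ground truth."""
    hits = flagged & seeded
    return {"recall": len(hits) / len(seeded), "precision": len(hits) / len(flagged)}

# Two CVs from the send list plus one hypothetical irregular customer.
tiers = {cid: classify(cv) for cid, cv in
         {"CUST_0183": 0.02, "CUST_0290": 0.12, "CUST_X": 0.80}.items()}

seeded = set(range(24))   # placeholder IDs for the 24 seeded customers
flagged = set(range(27))  # the 24 seeded plus 3 chance matches
metrics = recovery(flagged, seeded)
```

With those counts, recall is 100% and precision is 24 of 27, or about 89%.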
Two-tier split: high tier gets priority, moderate gets volume.

A low CV should mean "their trades line up on a grid when you plot them." Bars are trades (height = send amount), red dashed line is the predicted next trade date. The pattern is visually clear without needing any stats background, which is useful when presenting to marketing and ops.
The #1 customer (CUST_0183, monthly cadence) has a CV of 0.02: 24 trades, mean gap 30.00 days. The next trade is predicted for Jan 4, 2026. Trigger the email on Jan 2.
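The date arithmetic behind that prediction is short enough to show. In this sketch the last-trade date of 2025-12-05 is back-derived from the predicted date and the 30-day mean gap; it is not stated in the write-up. Rounding the mean gap to whole days is also an assumption.

```python
from datetime import date, timedelta

LEAD_DAYS = 2  # email goes out this many days before the predicted trade

def next_trade_and_trigger(last_trade, mean_gap_days, lead_days=LEAD_DAYS):
    """Predicted next trade = last trade + mean gap; trigger = lead_days earlier."""
    predicted = last_trade + timedelta(days=round(mean_gap_days))
    trigger = predicted - timedelta(days=lead_days)
    return predicted, trigger

# CUST_0183: mean gap 30.00 days; last trade date is inferred, not sourced.
predicted, trigger = next_trade_and_trigger(date(2025, 12, 5), 30.00)
```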
Being predictable is only useful if they also spend. Joining the revenue data onto the classification:
Total annual revenue from predictable customers: $33,249, or 12.5% of the book. That is 12.5% of revenue from 9.0% of customers, so each predictable customer carries roughly 1.4× their headcount share.
| Metric | Predictable | Irregular | All (3+ trades) |
|---|---|---|---|
| Count | 27 | 251 | 278 |
| Avg Total Revenue | $2,339 | $1,608 | $1,679 |
| Median Total Revenue | $890 | $653 | $722 |
| Avg Annual Revenue | $1,231 | $871 | $906 |
| Avg Trades | 52.4 | 34.1 | 35.9 |
| Avg Send Amount | $2,620 | $3,841 | $3,722 |
| Avg Tenure (days) | 649 | 639 | 640 |
Where do these predictable customers actually trade? Breaking their transactions down by corridor and comparing to the general population:
These are the hallmarks of expat living expenses and recurring remittance — not one-off trades. A Swiss worker paying UK bills. A retiree sending from Europe to Switzerland. That's the acquisition profile to target.
Use the cohort to guide marketing, not just retention.

| corridor | % of predictable trades | % of all trades | over-index (100 = parity) | # customers |
|---|---|---|---|---|
| AUD → GBP | 2.0% | 0.8% | 250 | 1 |
| CHF → USD | 2.4% | 1.1% | 218 | 1 |
| EUR → CHF | 11.4% | 5.4% | 211 | 4 |
| GBP → JPY | 4.0% | 2.4% | 167 | 2 |
| USD → NZD | 8.1% | 5.0% | 162 | 4 |
| EUR → JPY | 4.0% | 2.5% | 160 | 2 |
| USD → EUR | 22.6% | 16.8% | 135 | 8 |
| USD → CHF | 2.8% | 2.2% | 127 | 2 |
| USD → JPY | 8.8% | 7.1% | 124 | 4 |
| EUR → USD | 8.7% | 8.4% | 104 | 4 |
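The over-index column in the table above is the cohort's share divided by the overall share, times 100. A small sketch, with hypothetical function names, that reproduces four of the rows from their published shares and shows how the shares themselves could be counted:

```python
from collections import Counter

def corridor_shares(corridors):
    """Percent of trades per corridor, from one corridor label per trade."""
    counts = Counter(corridors)
    total = sum(counts.values())
    return {corridor: 100 * n / total for corridor, n in counts.items()}

def over_index(pct_predictable, pct_all):
    """Index where 100 means the cohort matches the whole book."""
    return round(100 * pct_predictable / pct_all)

# (% of predictable trades, % of all trades) copied from the table.
table_rows = {
    "AUD → GBP": (2.0, 0.8),
    "EUR → CHF": (11.4, 5.4),
    "USD → EUR": (22.6, 16.8),
    "EUR → USD": (8.7, 8.4),
}
indexes = {c: over_index(p, a) for c, (p, a) in table_rows.items()}

# Hypothetical toy cohort, only to show the share calculation.
shares = corridor_shares(["USD → EUR", "USD → EUR", "EUR → CHF", "USD → JPY"])
```

Rows backed by a single customer, such as AUD → GBP, can swing a long way on very little data, so the customer count is worth reading alongside the index.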
This is what the marketing tool actually ingests. One row per predictable customer. For each: their predicted next trade date, their trigger date (2 days earlier), their cadence, their primary corridor, and their expected annual revenue. Sorted by trigger date so the campaign manager can see the week ahead.
Total annual revenue tied to this list: $33,249. Projected 3-year LTV for the moderate tier alone (the weekly shoppers) is $7,182 per customer — 3.7× the highly-predictable tier, because they trade so often. Counter-intuitive but real.
Trigger 2 days early, chosen via production A/B vs same-day.

| customer_id | tier | cadence | CV | predicted next | trigger date | corridor | annual rev |
|---|---|---|---|---|---|---|---|
| CUST_0171 | High | Biweekly | 0.07 | 2025-04-11 | 2025-04-09 | AUD→USD | $499 |
| CUST_0290 | Moderate | Weekly | 0.12 | 2026-01-01 | 2025-12-30 | USD→EUR | $4,088 |
| CUST_0238 | Moderate | Weekly | 0.12 | 2026-01-02 | 2025-12-31 | EUR→CHF | $8,388 |
| CUST_0234 | Moderate | Weekly | 0.14 | 2026-01-02 | 2025-12-31 | CHF→USD | $834 |
| CUST_0149 | Moderate | Weekly | 0.12 | 2026-01-02 | 2025-12-31 | USD→GBP | $399 |
| CUST_0166 | Moderate | Weekly | 0.13 | 2026-01-02 | 2025-12-31 | GBP→USD | $873 |
| CUST_0183 | High | Monthly | 0.02 | 2026-01-04 | 2026-01-02 | USD→EUR | $188 |
| CUST_0058 | High | Monthly | 0.03 | 2026-01-23 | 2026-01-21 | GBP→USD | $1,067 |
| CUST_0046 | High | Monthly | 0.02 | 2026-01-28 | 2026-01-26 | EUR→USD | $97 |
| … 18 more rows … | | | | | | | |
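A send list like the one above could be assembled roughly as follows. This is a sketch under assumptions: the CSV column names are hypothetical and would need to match whatever the marketing tool's import expects, and the last-trade dates and the 7-day weekly gap are back-derived from the table for illustration.

```python
import csv
import io
from datetime import date, timedelta

# Hypothetical column names; adjust to the import format actually in use.
FIELDS = ["customer_id", "tier", "cadence", "predicted_next", "trigger_date"]

def build_send_list(customers, lead_days=2):
    """One row per customer with predicted and trigger dates, sorted by trigger."""
    rows = []
    for c in customers:
        predicted = c["last_trade"] + timedelta(days=round(c["mean_gap"]))
        rows.append({
            "customer_id": c["customer_id"],
            "tier": c["tier"],
            "cadence": c["cadence"],
            "predicted_next": predicted.isoformat(),
            "trigger_date": (predicted - timedelta(days=lead_days)).isoformat(),
        })
    # ISO date strings sort chronologically, so a plain string sort is enough.
    return sorted(rows, key=lambda r: r["trigger_date"])

def to_csv(rows):
    """Serialize the send list to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

customers = [  # last_trade and the weekly mean_gap are inferred, not sourced
    {"customer_id": "CUST_0183", "tier": "High", "cadence": "Monthly",
     "last_trade": date(2025, 12, 5), "mean_gap": 30.0},
    {"customer_id": "CUST_0290", "tier": "Moderate", "cadence": "Weekly",
     "last_trade": date(2025, 12, 25), "mean_gap": 7.0},
]
rows = build_send_list(customers)
csv_text = to_csv(rows)
```

Sorting by trigger date rather than predicted date keeps the file in the order the campaign manager will act on it.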
The shape of the win: a small, high-value slice of the book that you can now meet at exactly the moment they're about to trade. No extra product, no new channel — just better timing on the email you were already sending.