# Scenario 02 — Ship-to-Ship Rendezvous (Gulf of Finland)

> **Disclaimer:** Synthetic demo data inspired by real Baltic geography, MMSI / OUI
> conventions, and infrastructure. Not real observations. All vessel names, MMSIs,
> MAC addresses, sensor IDs and coordinates are synthetic and have been harmonized
> against the canonical catalogs under `catalogs/`.

## Story

The 158 × 24 m Finnish general-cargo vessel **MV SAIMAA AURORA** (MMSI
`230999081`, callsign `OJYY2`) and the 145 × 22 m Russian general-cargo vessel
**MV NEVA CASCADE** (MMSI `273999142`, callsign `UFZZ3`) leave their declared
routes in the central Gulf of Finland and converge **~24 NM south of Hanko**
(rendezvous point **59.4200 N, 23.0500 E**). Over a ~35 minute window they
slow to 0.4–0.8 kn within 55–80 m of each other, drifting in parallel with
`nav_status` left at "under way using engine" — no STS notice was filed and the
position sits well outside any designated anchorage.

A Finnish Border Guard patrol drone — **`RAD-DRN-PAT-01`** carrying the
airborne MAC sensor **`MAC-AIR-DRN-01`** — orbits the meet at 1.5 km radius
(1200 m AGL, descending to 700 m at the peak of the burst) and captures a
cloud of **22 transient, locally-administered MAC addresses** in the `02:*`
range with `deviceManufacturer = null`. A Dornier 228-class MPA (`RAD-PLN-01`
carrying `MAC-AIR-PLN-01`) passes ~22 NM south on an unrelated transit and
contributes peripheral context.

About nine hours after the meet (T+09:15), **6 of those 22 burst MACs**
resurface at the Hanko port cluster (`MAC-HKO-PORT-01/02/03`) within ±90 min
of AURORA's berthing. At T+13:40, **5 burst MACs** show up at Helsinki West
Harbour (`MAC-HEL-PORT-01/02`) within ±90 min of CASCADE's berthing. **2 of
them overlap with the Hanko set**, providing physical-layer evidence that
devices crossed between hulls during the meet.

The AIS picture alone looks benign (two cargo ships near each other for half
an hour). The fused MAC + radar + AIS picture is the anomaly.

## Timeline (UTC)

| t_rel | wall clock | event | signals |
|---|---|---|---|
| T-03:00:00 | 2025-04-12T08:00:00Z | AURORA on WB transit off Kotka; CASCADE EB from Tallinn approach | ais |
| T-02:15:00 | 2025-04-12T08:45:00Z | AURORA 12° starboard course change, off declared route | ais |
| T-01:50:00 | 2025-04-12T09:10:00Z | CASCADE drops SOG 13.2 → 9.5 kn | ais |
| T-01:20:00 | 2025-04-12T09:40:00Z | Drone `RAD-DRN-PAT-01` launches from Hanko CGS (59.8240, 22.9650) | drone_radar, mac |
| T-00:45:00 | 2025-04-12T10:15:00Z | Drone enters 1.5 km orbit at 1200 m AGL over predicted RV | drone_radar, mac |
| T-00:30:00 | 2025-04-12T10:30:00Z | Ships within 2 NM of each other; both below 8 kn | ais, drone_radar |
| T-00:15:00 | 2025-04-12T10:45:00Z | Ships within 500 m; SOG 2.1 / 1.8 kn; first airborne crew MAC hits | ais, drone_radar, mac |
| **T+00:00:00** | **2025-04-12T11:00:00Z** | **Near-stop window BEGINS — distance 55–80 m; burst of 22 `02:*` MACs appears** | **all** |
| T+00:08:00 | 2025-04-12T11:08:00Z | Drone descends to 700 m → burst RSSI jumps +10 to +14 dB | drone_radar, mac |
| T+00:20:00 | 2025-04-12T11:20:00Z | Burst MAC count peaks (≈ 22 unique `02:*` MACs across both hulls) | mac |
| T+00:30:00 | 2025-04-12T11:30:00Z | `RAD-PLN-01` closest pass ~22 NM south at 3000 m | plane_radar, mac |
| **T+00:35:00** | **2025-04-12T11:35:00Z** | **Near-stop window ENDS; ships separate and speed up to 3 kn** | **all** |
| T+00:50:00 | 2025-04-12T11:50:00Z | Distance > 1 NM; courses diverge (AURORA → NW, CASCADE → SE then NE) | ais |
| T+01:10:00 | 2025-04-12T12:10:00Z | Drone breaks orbit, RTB Hanko | drone_radar, mac |
| T+02:30:00 | 2025-04-12T13:30:00Z | Both vessels back on plausible commercial headings | ais |
| **T+09:15:00** | **2025-04-12T20:15:00Z** | **AURORA berths Hanko quay 4; `MAC-HKO-PORT-01/02/03` pick up 6 burst MACs** | **mac, ais** |
| **T+13:40:00** | **2025-04-13T00:40:00Z** | **CASCADE berths Helsinki West Harbour; `MAC-HEL-PORT-01/02` pick up 5 burst MACs (2 overlap)** | **mac, ais** |
| T+14:30:00 | 2025-04-13T01:30:00Z | Dense window closes; analyst pipeline runs fused proximity + co-occurrence join | compute |

(Full machine-readable version: `timeline.json` — same events.)

## Signals Overview

| Signal | Source files | Cadence | Notes |
|---|---|---|---|
| **AIS** | `data/realtime/ais.ndjson`, `data/realtime/ais_snapshot.geojson` | 10 s subject ships, 30 s ambient | Both protagonist ships + ~600 ambient transits across the gulf. |
| **Drone radar** | `data/realtime/drone_radar.ndjson`, `data/realtime/drone_radar_targets.ndjson` | 2 s | `RAD-DRN-PAT-01` orbit telemetry + per-target plots for `TGT-A` (AURORA) and `TGT-B` (CASCADE). Altitude descends 1200 → 700 → 1200 m around T+00:08. |
| **Plane radar** | `data/realtime/plane_radar.ndjson` | 4 s | `RAD-PLN-01` Dornier 228-class peripheral transit ~22 NM south of RV. |
| **MAC sessions** | `data/realtime/mac.ndjson`, `data/realtime/mac.csv` | 1–5 s airborne, 2 min static | Canonical 12-column CSV header. Persistent crew (`P-SAU-*`, `P-NEV-*`) + 22 transient `02:*` burst MACs + cross-port re-appearance + background noise. |
| **Static layers** | `data/static/{area_of_interest, sensors_used, infrastructure_used, rendezvous_zone, ship_routes, airborne_platforms, decoy_sts_lightering}.geojson` | once | Includes 100 m / 500 m / 2 NM concentric RV zones and the drone orbit + plane transit paths. |
| **Historical baseline** | `data/historical/{ais_baseline.ndjson, mac_baseline.{ndjson,csv}}` | 30 s AIS | 14 prior daily transits per ship + decoy STS lightering on 2025-04-05. |

## MAC Fusion Narrative

The fused detector turns the otherwise-benign AIS picture into a defensible
incident by chaining the five steps below.

1. **Cluster MACs by persistent ship-attribution.** Historical baselines
   (14 days, `data/historical/mac_baseline.*`) attach the `P-SAU-*`
   block (suffix range `..:21` … `..:29`) to AURORA and the `P-NEV-*` block
   (suffix range `..:31` … `..:39`) to CASCADE. Both use **real OUIs**
   (Apple `A4:83:E7`, Samsung `38:F9:D3`, Huawei `00:E0:FC`, Xiaomi
   `04:CF:8C`, Lenovo `8C:16:45`, Apple-BLE `B0:7D:64`).

2. **Detect the burst.** At the RV window, `MAC-AIR-DRN-01` observes a
   sudden population of `02:*` MACs (locally-administered first-octet bit
   set, `deviceManufacturer = null` — the real-world schema gap for
   randomized MACs). 22 unique values appear within a five-minute window;
   none of them exist in the prior 21-day baseline at *any* coastal or
   port sensor. `mac_first_seen_ratio → 1.0` on the airborne sensor.

3. **Attribute via airborne geometry.** Because the drone orbits and
   descends, RSSI follows path loss as range and altitude change. The
   descent from 1200 m to 700 m at T+00:08 raises ship-attributed MAC RSSI
   by roughly 12 dB (+10 to +14 dB in the timeline). A MAC captured while
   the drone is closer to AURORA than to CASCADE is tagged "side A";
   otherwise it is tagged "side B". When ≥ 2 cross-ship MACs sit above the
   strong-signal threshold simultaneously, `co_observation_score = 1`.

4. **Spike vs baseline.** The per-minute count of MACs on `MAC-AIR-DRN-01`
   during the RV peak is ~20× the 90-day baseline mean; `mac_count_zscore`
   = `sigmoid((z − 3) / 2) ≈ 1.0`.

5. **Cross-port confirmation.** Of the 22 burst MACs, **6** re-appear
   at the Hanko port cluster (`MAC-HKO-PORT-01/02/03`) within ±90 min of
   AURORA's berthing, **5** at Helsinki West Harbour (`MAC-HEL-PORT-01/02`)
   within ±90 min of CASCADE's berthing, with **2** of them appearing in
   *both* port sets. This is the smoking gun — devices physically crossed
   hulls during the meet.
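
Steps 2 and 4 can be sketched in Python as follows. This is a minimal illustration, not the pipeline's code: the function names and the sample MACs are hypothetical, and defining `mac_first_seen_ratio` as the share of observed MACs absent from the baseline is an assumption. The locally-administered bit test and the `sigmoid((z − 3) / 2)` squash are the ones described above.

```python
import math


def is_locally_administered(mac: str) -> bool:
    """True when bit 0x02 of the first octet is set, as in the `02:*` burst."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)


def first_seen_ratio(observed: set[str], baseline: set[str]) -> float:
    """Assumed definition: share of observed MACs absent from the baseline."""
    if not observed:
        return 0.0
    return len(observed - baseline) / len(observed)


def squash_zscore(z: float) -> float:
    """sigmoid((z - 3) / 2), the squash quoted for mac_count_zscore."""
    return 1.0 / (1.0 + math.exp(-(z - 3.0) / 2.0))


# Hypothetical sample values, not taken from the dataset.
baseline = {"A4:83:E7:00:00:21"}
observed = {"02:11:22:33:44:55", "02:AA:BB:CC:DD:EE", "A4:83:E7:00:00:21"}

# Keep only the randomized (locally administered) addresses.
burst = {m for m in observed if is_locally_administered(m)}
print(sorted(burst))
print(first_seen_ratio(burst, baseline))
print(squash_zscore(3.0), squash_zscore(15.0))
```

With this squash, `z = 3` maps to exactly 0.5 and large z-scores saturate toward 1, which is why a spike far above the baseline mean reads as ≈ 1.0.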

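The composite per `weights.json` combines four scored signals: the
AIS-derived `vessel_pair_proximity_score` plus the three MAC scores from
steps 2–4. Cross-port confirmation (step 5) carries no weight in the table;
it corroborates the alert once the ships have berthed.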

| Signal | Weight |
|---|---|
| `vessel_pair_proximity_score` | 0.35 |
| `mac_count_zscore` | 0.25 |
| `co_observation_score` | 0.20 |
| `mac_first_seen_ratio` | 0.20 |

Alert threshold: **0.70**. The S2 event raises `incident_score ≈ 0.97`
(all four signals near 1 at the peak). The decoy STS lightering
(2025-04-05, see below) scores ≈ 0.27 and stays well below threshold.
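
The weighted sum and threshold check can be reproduced in a few lines of Python. This is a sketch for checking the arithmetic, not the pipeline's scoring code: the function name `incident_score` and the rule that a missing signal counts as 0 are assumptions. The weights, the 0.70 threshold and the decoy's proximity value of 0.78 are the ones quoted in this document.

```python
# Weights as listed in the table above (attributed there to weights.json).
WEIGHTS = {
    "vessel_pair_proximity_score": 0.35,
    "mac_count_zscore": 0.25,
    "co_observation_score": 0.20,
    "mac_first_seen_ratio": 0.20,
}
ALERT_THRESHOLD = 0.70


def incident_score(signals: dict[str, float]) -> float:
    """Weighted sum of the four signals; a missing signal counts as 0 (assumption)."""
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())


# Decoy lightering: only the proximity signal is non-zero.
decoy = {"vessel_pair_proximity_score": 0.78}
decoy_score = incident_score(decoy)
print(round(decoy_score, 3), decoy_score >= ALERT_THRESHOLD)  # 0.273 False
```

Because the weights sum to 1.0, the composite stays in the range 0 to 1 whenever each input signal does, so the 0.70 threshold can only be crossed when the MAC signals contribute alongside proximity (proximity alone tops out at 0.35).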

## Decoy Explanation — Why the Legitimate STS Lightering Does Not Trip

A naive "any two ships near each other for > 10 minutes = incident" rule
would fire on the legitimate, OPA-90-compliant lightering of **MV NORDLYS
HARMONY** (MMSI `230888301`, FI tanker, type 80) and **MV BALTIC EMERALD**
(MMSI `256888422`, MT tanker, type 80) on **2025-04-05** at
`59.6500 N, 23.7500 E`, about 25 NM ENE of the S2 rendezvous point.

The correctly weighted composite rejects them because:

- `vessel_pair_proximity_score ≈ 0.78` — the static vessel does sit
  alongside for 6 h within ~50 m.
- `mac_count_zscore = 0` — `RAD-DRN-PAT-01` is not even in the air for
  this event, so the airborne sensor records nothing.
- `co_observation_score = 0` — no airborne sensor observed any `02:*`
  cloud.
- `mac_first_seen_ratio = 0` — only persistent crew MACs of the two
  tankers appear, all with known real-OUI vendors and prior baseline
  history.
- Composite: `0.35 × 0.78 + 0 + 0 + 0 = 0.273`, safely below the 0.70
  alert threshold.

The decoy is present in `data/historical/ais_baseline.ndjson` (with the
static vessel's `nav_status` correctly set to `5` (moored) during the
6-hour window) and `data/historical/mac_baseline.*` (decoy crew MACs
only, no burst). See `data/static/decoy_sts_lightering.geojson`.

## KQL Sketches

All identifiers below match the catalog. Timestamps are UTC ISO 8601.

```kusto
// 1) Proximity self-join on AIS (5-min bins) — feeds vessel_pair_proximity_score
let win = 5m;
AisMessages
| where timestamp between (datetime(2025-04-12T08:00:00Z) .. datetime(2025-04-12T14:00:00Z))
| extend tb = bin(timestamp, win)
| project mmsi_a=mmsi, lat_a=lat, lon_a=lon, sog_a=sog_kn, tb
| join kind=inner (
    AisMessages
    | extend tb = bin(timestamp, win)
    | project mmsi_b=mmsi, lat_b=lat, lon_b=lon, sog_b=sog_kn, tb
  ) on tb
| where mmsi_a < mmsi_b
| extend dist_m = geo_distance_2points(lon_a, lat_a, lon_b, lat_b)
| where dist_m < 500 and sog_a < 2 and sog_b < 2
| summarize min_dist=min(dist_m), samples=count() by mmsi_a, mmsi_b, tb
```
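
For spot-checking `dist_m` outside Kusto, a haversine function gives a great-circle approximation of the same quantity. The sketch below is illustrative only: the function names, the mean Earth radius constant and the two sample fixes are assumptions, and results may differ slightly from what `geo_distance_2points` returns.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; an assumption of this sketch


def distance_m(lon_a: float, lat_a: float, lon_b: float, lat_b: float) -> float:
    """Haversine distance in metres; argument order mirrors the
    geo_distance_2points(lon, lat, lon, lat) call in the query above."""
    phi_a, phi_b = math.radians(lat_a), math.radians(lat_b)
    d_phi = phi_b - phi_a
    d_lam = math.radians(lon_b - lon_a)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi_a) * math.cos(phi_b) * math.sin(d_lam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def is_near_stop(dist_m: float, sog_a: float, sog_b: float) -> bool:
    """Same predicate as the query: within 500 m and both ships under 2 kn."""
    return dist_m < 500 and sog_a < 2 and sog_b < 2


# Two hypothetical fixes near the rendezvous point, 0.0006 degrees of latitude apart.
d = distance_m(23.0500, 59.4200, 23.0500, 59.4206)
print(round(d, 1), is_near_stop(d, sog_a=0.6, sog_b=0.5))
```

The sample separation falls inside the 55–80 m band described in the story, so the pair passes the near-stop predicate.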

```kusto
// 2) Airborne MAC burst — feeds mac_first_seen_ratio and co_observation_score
let baselineMacs = toscalar(MacSessions
    | where processingTimestamp between (datetime(2025-03-22T00:00:00Z) .. datetime(2025-04-12T08:00:00Z))
    | summarize make_set(macAddress));
let burst = MacSessions
    | where deviceId == "MAC-AIR-DRN-01"
    | where processingTimestamp between (datetime(2025-04-12T11:00:00Z) .. datetime(2025-04-12T11:35:00Z))
    | where macAddress startswith "02:"
    | where macAddress !in (baselineMacs)
    | distinct macAddress;
MacSessions
| where macAddress in (burst)
| summarize first_seen=min(processingTimestamp), sensors=make_set(deviceId) by macAddress
```

```kusto
// 3) Cross-port re-appearance join — surfaces the smoking-gun MACs
let burst = MacSessions
    | where deviceId == "MAC-AIR-DRN-01"
    | where processingTimestamp between (datetime(2025-04-12T11:00:00Z) .. datetime(2025-04-12T11:35:00Z))
    | where macAddress startswith "02:"
    | distinct macAddress;
let hko = MacSessions
    | where deviceId in ("MAC-HKO-PORT-01","MAC-HKO-PORT-02","MAC-HKO-PORT-03")
    // ±90 min around AURORA's berthing at 2025-04-12T20:15:00Z
    | where processingTimestamp between (datetime(2025-04-12T18:45:00Z) .. datetime(2025-04-12T21:45:00Z))
    | where macAddress in (burst)
    | distinct macAddress;
let hel = MacSessions
    | where deviceId in ("MAC-HEL-PORT-01","MAC-HEL-PORT-02")
    // ±90 min around CASCADE's berthing at 2025-04-13T00:40:00Z
    | where processingTimestamp between (datetime(2025-04-12T23:10:00Z) .. datetime(2025-04-13T02:10:00Z))
    | where macAddress in (burst)
    | distinct macAddress;
union (hko | extend port="Hanko"), (hel | extend port="HelsinkiWH")
| summarize ports=make_set(port) by macAddress
| extend smoking_gun = (array_length(ports) >= 2)
```
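
For an offline check against the generated files, the same overlap logic reduces to set intersections. The Python sketch below mirrors the query's structure under stated assumptions: the function name `cross_port_overlap` and the sample MACs are hypothetical, and the port labels simply reuse the ones from the query.

```python
def cross_port_overlap(
    burst: set[str], port_hits: dict[str, set[str]]
) -> dict[str, list[str]]:
    """Map each burst MAC seen at any port to the sorted list of ports it hit."""
    seen: dict[str, list[str]] = {}
    for port in sorted(port_hits):
        # Intersect with the burst set so non-burst MACs are ignored.
        for mac in sorted(port_hits[port] & burst):
            seen.setdefault(mac, []).append(port)
    return seen


# Hypothetical MACs for illustration; real values come from the MAC session files.
burst = {"02:00:00:00:00:01", "02:00:00:00:00:02", "02:00:00:00:00:03"}
port_hits = {
    "Hanko": {"02:00:00:00:00:01", "02:00:00:00:00:02", "A4:83:E7:00:00:21"},
    "HelsinkiWH": {"02:00:00:00:00:02", "02:00:00:00:00:03"},
}

ports_by_mac = cross_port_overlap(burst, port_hits)
smoking_gun = sorted(mac for mac, ports in ports_by_mac.items() if len(ports) >= 2)
print(smoking_gun)  # ['02:00:00:00:00:02']
```

Only MACs that appear in both port sets are flagged, matching the `array_length(ports) >= 2` condition in the query.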

```kusto
// 4) Airborne MAC count z-score vs 90-day baseline — feeds mac_count_zscore
let baseline = MacSessions
    | where deviceId == "MAC-AIR-DRN-01"
    | where processingTimestamp between (datetime(2025-01-12T00:00:00Z) .. datetime(2025-04-12T08:00:00Z))
    | summarize per_min = count() by bin(processingTimestamp, 1m)
    | summarize mu = avg(per_min), sigma = stdev(per_min);
MacSessions
| where deviceId == "MAC-AIR-DRN-01"
| where processingTimestamp between (datetime(2025-04-12T10:55:00Z) .. datetime(2025-04-12T11:40:00Z))
| summarize cnt = count() by bin_min = bin(processingTimestamp, 1m)
| extend z = (toreal(cnt) - toscalar(baseline | project mu)) / toscalar(baseline | project sigma)
| extend mac_count_zscore = 1.0 / (1.0 + exp(-(z - 3) / 2))
```

```kusto
// 5) Fused incident score (weights from weights.json, Σ = 1.0)
ProximityScore
| join kind=inner ZScore on EventId
| join kind=inner CoObsScore on EventId
| join kind=inner FirstSeenRatio on EventId
| extend incident_score =
      0.35 * vessel_pair_proximity_score
    + 0.25 * mac_count_zscore
    + 0.20 * co_observation_score
    + 0.20 * mac_first_seen_ratio
| where incident_score >= 0.70           // alert threshold
| project EventId, incident_score, vessel_pair_proximity_score, mac_count_zscore,
          co_observation_score, mac_first_seen_ratio
```

## Ingestion Notes

- **NDJSON** files start with a `__meta__: "synthetic"` disclaimer record. Skip
  the first line on ingest or filter `where __meta__ != "synthetic"`.
- **CSV** files start with `# {"__meta__":"synthetic", …}` as a leading comment
  row. Configure the CSV reader to treat `#` as a comment prefix.
- **GeoJSON** carries the disclaimer in a top-level `_meta` property on the
  FeatureCollection.
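- All timestamps are UTC ISO 8601 with trailing `Z`, except MAC
  `ingestion_ts`, which is an epoch-millisecond integer. Map it to a 64-bit
  `long` in Kusto; the values overflow a 32-bit `int`.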
- `mac.csv` uses the canonical 12-column header exactly as it appears in
  `generators/mac_generator.py::MAC_CSV_HEADER`. Burst MACs surface as
  rows with `deviceManufacturer = None` — preserve the null on ingest (do
  not coerce to a string).
- For the cross-port query, materialize burst MACs once from
  `MAC-AIR-DRN-01` then `lookup` against the port-cluster streams to avoid
  full self-joins.
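
The NDJSON and CSV rules above can be sketched with the Python standard library. This is an illustration under assumptions, not the project's loader: the function names are hypothetical, the sample uses a three-column subset instead of the canonical 12-column header, and treating either an empty field or the literal text `None` as null is a guess about how the CSV encodes it.

```python
import csv
import io
import json


def read_ndjson(lines):
    """Yield data records, dropping any record that carries the __meta__ marker."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("__meta__") == "synthetic":
            continue
        yield record


def read_mac_csv(lines):
    """Yield CSV rows as dicts, skipping '#' comment rows and keeping a
    missing deviceManufacturer as Python None (assumed null encodings)."""
    data_lines = (ln for ln in lines if not ln.startswith("#"))
    for row in csv.DictReader(data_lines):
        if row.get("deviceManufacturer") in ("", "None", None):
            row["deviceManufacturer"] = None
        yield row


# Hypothetical miniature inputs that follow the conventions listed above.
ndjson_file = io.StringIO(
    '{"__meta__": "synthetic"}\n'
    '{"macAddress": "02:00:00:00:00:01", "deviceId": "MAC-AIR-DRN-01"}\n'
)
csv_file = io.StringIO(
    '# {"__meta__":"synthetic"}\n'
    "macAddress,deviceId,deviceManufacturer\n"
    "02:00:00:00:00:01,MAC-AIR-DRN-01,\n"
)

records = list(read_ndjson(ndjson_file))
rows = list(read_mac_csv(csv_file))
print(len(records), rows[0]["deviceManufacturer"])
```

Filtering on the `__meta__` key rather than blindly skipping the first line keeps the reader correct if files are concatenated or re-ordered.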

## Reproducing

```pwsh
python scenarios/02-ship-to-ship-rendezvous/generate.py
```

The script imports the shared generators from the `generators/` package
(no copy-paste of shared code) and writes all files under `data/`. The
PRNG seeds are fixed so the 22 burst MACs and their cross-port split are
byte-identical across runs (see `data/_generation_summary.json` after
running).
