# Scenario 01 — AIS Dark Near Cable

> **Disclaimer:** Synthetic demo data inspired by real Baltic geography, MMSI / OUI
> conventions, and infrastructure. Not real observations. All vessel names, MMSIs,
> MAC addresses, sensor IDs and cable alignments are synthetic and have been
> harmonized against the canonical catalogs under `catalogs/`.

## Story

The 189 × 32 m Finnish bulk carrier **MV AALLOTAR** (MMSI `230999401`,
callsign `OJZZ1`; abbreviated AAL below) is on a routine Sillamäe → Hanko run
across the Gulf of Finland when, mid-transit, her AIS transponder goes silent
for **33 min 20 s**, precisely while she crosses the **Estlink 1 + Estlink 2**
synthetic alignment between Helsinki and Tallinn. The plane-borne radar
`RAD-PLN-01` keeps holding the surface track straight through the dark window.
At the same time a previously unseen locally-administered MAC
(`7E:2A:F1:09:44:C8`, `deviceManufacturer = null`) appears on
`MAC-PRV-COAST-01` and `MAC-HEL-PORT-03`, both inside the catalog's 500 m
Estlink buffer. AIS comes back ~14.1 NM further west with a 12° course change
and a 1.8 kn speed reduction that do not match AAL's historical
traffic-separation behaviour. **MV VENLA RESEARCH** (MMSI `230888011`) loiters
in the same area during a legitimate seabed survey and acts as the decoy that
a naive "AIS gap + new MAC near a cable" rule must learn to reject.

## Timeline (UTC)

| t_rel | wall clock | event | signals |
|---|---|---|---|
| T-00:30:00 | 2025-03-18T05:00:00Z | AAL enters detection plane SE of Finland | ais |
| T+00:00:00 | 2025-03-18T05:30:00Z | Dense window opens; AAL at 60.0500, 27.5500, 12.4 kn / 268° | ais, plane_radar |
| T+00:08:20 | 2025-03-18T05:38:20Z | AAL crosses notional FI EEZ line | ais |
| T+00:10:00 | 2025-03-18T05:40:00Z | Decoy VENLA RESEARCH enters from W (60.05, 23.10), 6.0 kn | ais |
| T+00:14:10 | 2025-03-18T05:44:10Z | First MAC hit at MAC-KTK-COAST-01: crew MAC A4:83:E7:5C:9B:11 | mac |
| T+00:18:40 | 2025-03-18T05:48:40Z | MAC-KTK-COAST-01 RSSI peak -71 dBm | mac |
| T+00:24:00 | 2025-03-18T05:54:00Z | MAC-KTK-COAST-01 session ends (~590 s) | mac |
| T+00:33:50 | 2025-03-18T06:03:50Z | MAC-PRV-COAST-01 picks up crew MAC | mac |
| T+00:41:30 | 2025-03-18T06:11:30Z | MAC-PRV-COAST-01 RSSI peak -68 dBm | mac, ais |
| T+00:45:00 | 2025-03-18T06:15:00Z | VENLA begins survey loiter (1.5–3 kn) near Estlink | ais |
| T+00:48:00 | 2025-03-18T06:18:00Z | AAL 6 NM east of Estlink 500 m buffer | ais, plane_radar |
| **T+00:55:10** | **2025-03-18T06:25:10Z** | **AIS goes silent** at 60.1800, 25.1800 | **ais (gap)** |
| T+00:57:00 | 2025-03-18T06:27:00Z | Plane radar still holds track; SOG drop, course swing | plane_radar |
| T+01:02:40 | 2025-03-18T06:32:40Z | Burner MAC 7E:2A:F1:09:44:C8 first hit at MAC-PRV-COAST-01 | mac |
| T+01:08:00 | 2025-03-18T06:38:00Z | AAL enters Estlink-1 500 m buffer (radar-only) | plane_radar |
| T+01:12:20 | 2025-03-18T06:42:20Z | Burner RSSI peak -79 dBm | mac |
| T+01:18:00 | 2025-03-18T06:48:00Z | Plane radar SOG dips to 8.4 kn (loiter over alignment) | plane_radar |
| T+01:24:10 | 2025-03-18T06:54:10Z | Burner MAC last seen, RSSI -101 dBm | mac |
| T+01:28:00 | 2025-03-18T06:58:00Z | AAL exits Estlink-2 500 m buffer west edge | plane_radar |
| **T+01:28:30** | **2025-03-18T06:58:30Z** | **AIS reappears** at 60.2100, 24.6200, 10.2 kn / 256° | **ais (gap end)** |
| T+01:35:00 | 2025-03-18T07:05:00Z | MAC-HEL-COAST-01 picks up crew MAC again | mac |
| T+01:46:00 | 2025-03-18T07:16:00Z | MAC-HEL-COAST-01 RSSI peak -73 dBm | mac |
| T+02:05:00 | 2025-03-18T07:35:00Z | AAL south of Inkoo, SOG back to 12.0 kn | ais, plane_radar |
| T+02:20:00 | 2025-03-18T07:50:00Z | MAC-HKO-COAST-01 picks up crew MAC | mac |
| T+02:30:00 | 2025-03-18T08:00:00Z | Dense window closes; AAL on Hanko approach | all |

(For the full machine-readable version see `timeline.json`, which lists the same 25 events.)
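
The length of the dark window follows directly from the two bold rows in the table. A minimal sketch of that arithmetic, using only the wall-clock values quoted above (the variable names are illustrative and not taken from `timeline.json`):

```python
from datetime import datetime

# Wall-clock values copied from the "AIS goes silent" and "AIS reappears" rows.
gap_start = datetime.fromisoformat("2025-03-18T06:25:10+00:00")
gap_end = datetime.fromisoformat("2025-03-18T06:58:30+00:00")

gap_s = int((gap_end - gap_start).total_seconds())
minutes, seconds = divmod(gap_s, 60)
print(f"{gap_s} s = {minutes} min {seconds} s")  # 2000 s = 33 min 20 s
```

The resulting `gap_s = 2000` is the value that feeds the `ais_gap_score` sigmoid.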

## Signals Overview

| Signal | Source files | Cadence | Notes |
|---|---|---|---|
| **AIS** | `data/realtime/ais.ndjson`, `data/realtime/ais_snapshot.geojson` | 3 s subject + decoy, 30 s ambient | Subject ship has the 33 min 20 s dark window. |
| **Plane radar** | `data/realtime/plane_radar.ndjson` | 4 s | `RAD-PLN-01` — sees through the dark window. |
| **MAC sessions** | `data/realtime/mac.ndjson`, `data/realtime/mac.csv` | session-closed | Canonical 12-column CSV header; crew + burner + background noise. |
| **Static layers** | `data/static/{area_of_interest,sensors_used,infrastructure_used}.geojson` | once | Estlink corridors + 500 m buffers + EEZ + sensors that appear. |
| **Historical baseline** | `data/historical/{ais_baseline.ndjson,mac_baseline.{ndjson,csv}}` | 30 s AIS | 6 prior daily AAL transits + 6 VENLA transits + 6 days of MAC sensor noise. |

## MAC Fusion Narrative

1. **Detect** any AIS gap > 90 s for a single MMSI. Compute `ais_gap_score` via
   the canonical sigmoid `1 / (1 + exp(-(gap_s - 300) / 600))`.
2. **Spatial join** the gap polyline against the Estlink 500 m buffer polygons.
   `spatial_proximity_infra_score = 1.0` if the polyline intersects a buffer;
   otherwise the score fades linearly to 0 at 1000 m from the buffer.
3. **Crew-MAC attribution**: any MAC historically `co_observed` with the same
   MMSI's AIS within ±15 min raises `mac_attribution_score`. Crew MAC
   `A4:83:E7:5C:9B:11` (P-AAL-MASTER) is the canonical persistent device for
   AALLOTAR across S1, S3, S5 and S6.
4. **Reconstruct dark-window position** from crew-MAC sensor hits using RSSI
   ranges → multilateration polygon (UI overlay).
5. **Burner candidate flag**: any MAC seen inside the gap window with zero
   baseline matches drives `mac_first_seen_ratio` toward 1.0. A set
   locally-administered (LAA) bit (first octet `(0x7E & 0x02) == 0x02`) and
   `deviceManufacturer = null` strengthen the signal.
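
Steps 1, 2 and 5 reduce to small scalar functions. The sketch below is one possible Python rendering, assuming the sigmoid and the 1000 m linear fade exactly as written above. The function names, the choice to return 0 at or below the 90 s cutoff, and the `distance_m` argument (distance from the gap polyline to the nearest buffer, 0 when it intersects) are illustrative assumptions, not code from the `generators/` package.

```python
import math

def ais_gap_score(gap_s: float, min_gap_s: float = 90.0) -> float:
    """Sigmoid from step 1; gaps at or below the detection cutoff score 0 (assumption)."""
    if gap_s <= min_gap_s:
        return 0.0
    return 1.0 / (1.0 + math.exp(-(gap_s - 300.0) / 600.0))

def spatial_proximity_infra_score(distance_m: float, fade_m: float = 1000.0) -> float:
    """1.0 on intersection (distance 0), fading linearly to 0.0 at fade_m."""
    return max(0.0, 1.0 - max(0.0, distance_m) / fade_m)

def is_locally_administered(mac: str) -> bool:
    """True when the locally-administered bit (0x02) of the first octet is set."""
    first_octet = int(mac.split(":")[0], 16)
    return (first_octet & 0x02) == 0x02

print(round(ais_gap_score(2000), 3))                 # 0.944
print(spatial_proximity_infra_score(250.0))          # 0.75
print(is_locally_administered("7E:2A:F1:09:44:C8"))  # True  (burner MAC)
print(is_locally_administered("A4:83:E7:5C:9B:11"))  # False (crew MAC)
```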

The four signals combine to a composite per `weights.json`:

| Signal | Weight |
|---|---|
| `ais_gap_score` | 0.35 |
| `mac_attribution_score` | 0.20 |
| `mac_first_seen_ratio` | 0.20 |
| `spatial_proximity_infra_score` | 0.25 |

Alert threshold: **0.70**. AAL fires at ≈ 0.91. VENLA caps at ≈ 0.22.
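
As a rough illustration of how the table is applied, the composite can be read as a plain weighted sum. The sketch below hard-codes the weights from the table rather than reading `weights.json`, whose exact layout is not shown here; the function name and dictionary keys are assumptions.

```python
WEIGHTS = {
    "ais_gap_score": 0.35,
    "mac_attribution_score": 0.20,
    "mac_first_seen_ratio": 0.20,
    "spatial_proximity_infra_score": 0.25,
}
ALERT_THRESHOLD = 0.70

def incident_score(scores: dict) -> float:
    """Weighted sum of the four signals; a missing signal counts as 0."""
    return sum(w * scores.get(name, 0.0) for name, w in WEIGHTS.items())

# The weights are meant to sum to 1.0, so a vessel with every signal at 1.0 scores 1.0.
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
all_ones = {name: 1.0 for name in WEIGHTS}
print(round(incident_score(all_ones), 2))  # 1.0
```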

## Decoy Explanation — Why VENLA RESEARCH Does Not Trip

A naive "any AIS gap + any new MAC near Estlink = incident" rule would fire on
**MV VENLA RESEARCH** (`230888011`) because she loiters inside the 500 m
Estlink buffer for ~75 minutes performing a legitimate seabed survey, and one
new MAC (`38:F9:D3:11:22:81`, Samsung, a researcher's new phone after a device
refresh) appears at `MAC-INK-COAST-01` during her transit.

The correctly weighted composite rejects her because:

- `ais_gap_score = 0` — VENLA never goes dark.
- `mac_first_seen_ratio` is moderate (Samsung OUI maps to a known vendor, not
  null/LAA).
- `mac_attribution_score` to her own MMSI in earlier weeks is high (her
  persistent crew MACs are well-known to the sensors).
- Even though `spatial_proximity_infra_score = 1.0`, the missing `ais_gap_score`
  caps her composite well below the 0.70 alert threshold.
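
The last bullet can be checked with the weights alone. Assuming the composite is the plain weighted sum from the weights table, a vessel with `ais_gap_score = 0` tops out at 0.20 + 0.20 + 0.25 = 0.65 even when every other signal is saturated. The snippet is illustrative and not part of the scenario code.

```python
# Weights copied from the table in "MAC Fusion Narrative".
WEIGHTS = {
    "ais_gap_score": 0.35,
    "mac_attribution_score": 0.20,
    "mac_first_seen_ratio": 0.20,
    "spatial_proximity_infra_score": 0.25,
}
ALERT_THRESHOLD = 0.70

# Best case for an alert without an AIS gap: all remaining signals at 1.0.
no_gap_ceiling = sum(w for name, w in WEIGHTS.items() if name != "ais_gap_score")
print(round(no_gap_ceiling, 2), no_gap_ceiling < ALERT_THRESHOLD)  # 0.65 True
```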

## KQL Sketches

```kusto
// 1) Detect AIS gaps per MMSI (gap > 90 s) — feeds ais_gap_score
AisMessages
| where timestamp between (datetime(2025-03-18 05:30) .. datetime(2025-03-18 08:00))
| where mmsi == 230999401
| order by timestamp asc
| extend prev_ts = prev(timestamp), prev_lat = prev(lat), prev_lon = prev(lon)
| extend gap_s = datetime_diff('second', timestamp, prev_ts)
| where gap_s > 90
| extend ais_gap_score = 1.0 / (1.0 + exp(-(toreal(gap_s) - 300.0) / 600.0))
```

```kusto
// 2) Spatial join: does the gap polyline cross the Estlink buffer? — feeds spatial_proximity_infra_score
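// geo_union_polygons_array() is a scalar function, so collect the buffers with make_list() first.
let cable_buf = toscalar(Infrastructure
    | where featureId in ("estlink-1-buffer","estlink-2-buffer")
    | summarize polys = make_list(geom)
    | project geo_union_polygons_array(polys));
AisGaps
// dynamic() literals cannot reference columns, so build the GeoJSON line at query time.
| extend gap_line = bag_pack("type", "LineString",
    "coordinates", pack_array(pack_array(start_lon, start_lat), pack_array(end_lon, end_lat)))
| extend crosses = geo_intersects_line_with_polygon(gap_line, cable_buf)
// Approximation: distance from the nearer gap endpoint stands in for line-to-polygon distance.
| extend dist_m = min_of(geo_distance_point_to_polygon(start_lon, start_lat, cable_buf),
                         geo_distance_point_to_polygon(end_lon, end_lat, cable_buf))
| extend spatial_proximity_infra_score =
    iff(crosses, 1.0, max_of(0.0, 1.0 - dist_m / 1000.0))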
```

```kusto
// 3) Crew-MAC attribution via 42-day baseline co-occurrence — feeds mac_attribution_score
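let win_start = datetime(2025-03-18 05:30);   // assumed: dense-window start from the timeline
let lookback = 42d;
// 15 min bins in which the subject MMSI was reporting AIS during the baseline
let aisBins = AisMessages
    | where mmsi == 230999401
    | where timestamp between ((win_start - lookback) .. win_start)
    | summarize by ais_bin = bin(timestamp, 15m);
MacSessions
| where processingTimestamp between ((win_start - lookback) .. win_start)
| extend mac_bin = bin(processingTimestamp, 15m)
// KQL joins are equality-only, so co-occurrence is approximated by a shared 15 min bin
| join kind=leftouter aisBins on $left.mac_bin == $right.ais_bin
// hits = sessions that overlap subject AIS; total = all baseline sessions for that MAC
| summarize hits = countif(isnotnull(ais_bin)), total = count() by macAddress, deviceId
| extend mac_attribution_score = todouble(hits) / todouble(total)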
```

```kusto
// 4) Burner-MAC novelty inside the gap window — feeds mac_first_seen_ratio
let win_start = datetime(2025-03-18 05:30);   // assumed: dense-window start from the timeline
let baselineMacs = toscalar(MacSessions
    | where processingTimestamp between ((win_start - 42d) .. (win_start - 3h))
    | summarize make_set(macAddress));
MacSessions
| where processingTimestamp between (datetime(2025-03-18 06:25) .. datetime(2025-03-18 07:00))
// count distinct MACs rather than sessions so a chatty device is not over-weighted
| summarize new_macs = dcountif(macAddress, macAddress !in (baselineMacs)),
            total_macs = dcount(macAddress) by deviceId
| extend mac_first_seen_ratio = todouble(new_macs) / todouble(total_macs)
```

```kusto
// 5) Composite incident score (weights from weights.json, sum = 1.0)
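// KQL joins need an equality key; a constant key emulates the cross join of the three inputs.
GapsWithSpatial
| where mmsi == 230999401
| extend k = 1
| join kind=inner (CrewAttribution | extend k = 1) on k
| join kind=inner (SensorNovelty | extend k = 1) on k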
| extend incident_score =
       0.35 * ais_gap_score
     + 0.20 * mac_attribution_score
     + 0.20 * mac_first_seen_ratio
     + 0.25 * spatial_proximity_infra_score
| where incident_score > 0.70
```

## Ingestion Notes

- **NDJSON** files start with a `__meta__: "synthetic"` disclaimer record. Skip
  the first line on ingest or filter `where __meta__ != "synthetic"`.
- **CSV** files start with `# {"__meta__":"synthetic", …}` as a leading comment
  row. Configure the CSV reader to treat `#` as a comment prefix.
- **GeoJSON** carries the disclaimer in a top-level `_meta` property on the
  FeatureCollection.
- All timestamps are UTC ISO 8601 with a trailing `Z`, except MAC
  `ingestion_ts`, which is an epoch-milliseconds `int`.
- `mac.csv` uses the canonical 12-column header exactly as it appears in
  `generators/mac_generator.py::MAC_CSV_HEADER`.
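
For readers ingesting the files outside Kusto, the rules above might look like the following Python sketch. The inline sample records and their field names (other than `__meta__` and `ingestion_ts`) are made up for illustration; the real `mac.csv` uses the 12-column header from `MAC_CSV_HEADER`, which is not reproduced here.

```python
import csv
import io
import json
from datetime import datetime, timezone

def read_ndjson(stream):
    """Yield records, dropping the synthetic-disclaimer record."""
    for line in stream:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("__meta__") == "synthetic":
            continue
        yield record

def read_csv(stream):
    """Yield rows as dicts, treating lines that start with '#' as comments."""
    data_lines = (line for line in stream if not line.startswith("#"))
    yield from csv.DictReader(data_lines)

def epoch_ms_to_utc(ms: int) -> datetime:
    """Convert an epoch-milliseconds integer to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# Hypothetical samples, not copied from the data files.
ndjson_sample = io.StringIO('{"__meta__": "synthetic"}\n{"mmsi": 230999401}\n')
csv_sample = io.StringIO(
    '# {"__meta__":"synthetic"}\n'
    "macAddress,ingestion_ts\n"
    "7E:2A:F1:09:44:C8,1742279560000\n"
)

ais_rows = list(read_ndjson(ndjson_sample))
mac_rows = list(read_csv(csv_sample))
print(ais_rows)
print(epoch_ms_to_utc(int(mac_rows[0]["ingestion_ts"])).isoformat())
```

Filtering on the `__meta__` key rather than skipping line 1 keeps the reader correct even if files are concatenated.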

## Reproducing

```pwsh
python scenarios/01-ais-dark-near-cable/generate.py
```

The script imports the shared generators from the `generators/` package
(no copy-paste of shared code) and writes all files under `data/`.
