Runbook: wp_users / wp_usermeta surgical restore (WPCA UID-collision incident)

This runbook is the case-specific application of the generic skill mysql-operator-backup-restore (in ~/dev/ai-skills-develop/skills/devops/). That skill covers how to pull an Oracle MySQL Operator backup from S3 into a target schema. This runbook covers what to do with that snapshot once it’s loaded: scoping the damage, transactional apply against live, re-importing downstream data, verification, and cleanup.

1. Context

1.1. What happened

On or before 2026-04-18, an Event Participant (EP) import was run against wpca_prod using the legacy-format CSVs:

  • input/123-participants.csv (Duynefontein, event 119)

  • input/125-participants.csv (Redhill, event 120)

  • input/126-participants.csv (Killarney 1, event 121)

Each CSV starts with a UID column holding the legacy WPCA system’s wp_users.ID values. The EP import code maps UIDPERSON_EXTERNAL_ID and passes it to person.setId(…​) on the request DTO (see EventParticipantImportXLS.java:108 and :196). Downstream, EventParticipantServiceEx.register() treats this as "the Person already exists" and loads wp_users by PK.

The legacy WPCA wp_users.ID space does not match the new system’s wp_users.ID space. So the import loaded unrelated new-system users by coincidentally-matching IDs and overwrote their personal/identity fields with the WPCA participant data. 519 distinct legacy UIDs were at risk.

1.2. Scope of overwrite

wp_users (direct columns): display_name, user_nicename. Evidence: UserWrapper.setName() writes both whenever setFirstName/setLastName is called on an existing user. No other wp_users field is touched (user_login, user_pass, user_email, etc. are only set in PersonWrapper.init() for brand-new users).

wp_usermeta: any meta key the EP import’s PersonWrapper setters write — first_name, last_name, id_number, id_type, id_country, date_of_birth, gender, contact_number, person_email, ice_name, ice_number, iceRelationship, parent_name, parent_number, parent_relationship, main_member_id, billing_address_1, billing_city, billing_postcode, last_edited.

1.3. Constraints

  • wpca_prod is live. Full restore is not an option — the rest of the system has moved on since the incident.

  • The 792 EventParticipant rows currently in events 119/120/121 (imported on 2026-04-21 with corrected -no-uid CSVs) may be linked to corrupted wp_users records — we cannot trust those EP→person links until the wp_users restore is complete. So EPs and RaceResults for the 3 events must be purged before the restore, and re-imported after.

  • Backups are Oracle MySQL Operator util.dumpInstance format in S3 — see the mysql-operator-backup-restore skill for the access workflow.

2. Scope

2.1. Affected rows

Measure Count Notes

Distinct UIDs across the 3 CSVs

519

awk -F, 'FNR>1 && $1!="" {print $1}' input/12{3,5,6}-participants.csv | sort -un

UIDs present as wp_users in live (susceptible to damage)

467

Included in the restore set

UIDs absent from live (bad import silently failed with stale-object error)

52

Safe to skip — no damage occurred

UIDs with actual diff on wp_users.display_name/user_nicename vs 2026-04-17 snapshot

233

Within the 467

UIDs with any wp_usermeta diff vs 2026-04-17 snapshot

298

Within the 467

Top diff meta keys (live vs 2026-04-17 snapshot): id_number (199), first_name (177), date_of_birth (176), last_name (174), id_type (165), gender (138), person_email (135).

2.2. Out of scope

  • EP rows for events 119/120/121 — purged and re-imported as part of the runbook, but the re-imports follow the established EP import procedure (not new work).

  • System / admin accounts (ID ≤ 10) — pre-flight check confirmed UID 8 (iisaacs, Igshaan Isaacs) is a regular user; no admin account is in the affected set.

3. Execution

3.1. Step 0 — Sign-offs

  1. Confirm incident timestamp. If the bad import ran on or after the daily backup window (02:00 UTC), step forward one day. For this incident: 2026-04-17 02:00 UTC backup is pre-incident.

  2. Confirm 30-day retention of pre-restore snapshot tables is acceptable.

  3. Confirm no system/admin user is in the 519-UID set.

3.2. Step 1 — Build UID list and pre-flight

mkdir -p recon/logs/wp-users-restore
awk -F, 'FNR>1 && $1!="" {print $1}' \
  input/123-participants.csv input/125-participants.csv input/126-participants.csv \
  | sort -un > recon/logs/wp-users-restore/affected_uids.txt
wc -l recon/logs/wp-users-restore/affected_uids.txt   # expect 519
-- Pre-flight: how many UIDs exist in live wp_users?
SELECT COUNT(*) FROM wp_users WHERE ID IN (<519>);   -- expect 467

-- Pre-flight: any admin / system accounts?
SELECT ID, user_login FROM wp_users
WHERE ID IN (<519>) AND (ID <= 10 OR user_login IN ('admin','administrator'));
-- If any row returns, STOP and re-plan.

3.3. Step 2 — Extract 2026-04-17 backup into wp_restore_260417 schema

Use the mysql-operator-backup-restore skill. Key parameters:

  • Source: s3://idl-xnl-jhb1-rc1-backup/mysql/idealogic-prod/idealogic-prod-daily-backup260417020003

  • Include schemas: wpca_prod

  • Include tables: wpca_prod.wp_users, wpca_prod.wp_usermeta

  • Target schema: wp_restore_260417

  • Loader: ephemeral pod in mysql namespace (community-operator image, mysqlsh 8.4)

  • Privileges: claude needs temporary CREATE + SUPER globally; ALL on wp_restore_260417

Validation after load:

SELECT COUNT(*) FROM wp_restore_260417.wp_users;         -- expect ~33780
SELECT COUNT(*) FROM wp_restore_260417.wp_usermeta;      -- expect ~410726

-- Sentinel: source untouched
SELECT COUNT(*) FROM wpca_prod.wp_users;                 -- unchanged vs pre-load
SELECT COUNT(*) FROM wpca_prod.wp_usermeta;              -- unchanged vs pre-load

3.4. Step 3 — Purge RaceResults + EventParticipants for events 119/120/121

START TRANSACTION;

DELETE rr FROM race_result rr
  JOIN result_set rs ON rs.id = rr.result_set_id
  WHERE rs.event_id IN (119, 120, 121);
-- expect ~696 rows

DELETE FROM event_participant WHERE event_id IN (119, 120, 121);
-- expect ~792 rows

SELECT
  (SELECT COUNT(*) FROM race_result rr JOIN result_set rs ON rs.id = rr.result_set_id
   WHERE rs.event_id IN (119,120,121)) AS rr_remaining,
  (SELECT COUNT(*) FROM event_participant WHERE event_id IN (119,120,121)) AS ep_remaining,
  (SELECT COUNT(*) FROM result_set WHERE event_id IN (119,120,121)) AS rs_preserved;
-- expect 0, 0, 30

COMMIT;

result_set rows stay (empty containers that the Results re-import will fill).

3.5. Step 4 — Pre-restore rollback snapshot

The idealogic-prod cluster runs MySQL Group Replication, which enforces that every table must have an explicit PRIMARY KEY at CREATE time. CREATE TABLE …​ AS SELECT fails with error 3098. Use CREATE TABLE …​ LIKE + INSERT …​ SELECT to inherit the PK.

CREATE TABLE wp_users_prerestore_20260422 LIKE wp_users;
INSERT INTO wp_users_prerestore_20260422 SELECT * FROM wp_users WHERE ID IN (<519>);
-- expect 467 rows

CREATE TABLE wp_usermeta_prerestore_20260422 LIKE wp_usermeta;
INSERT INTO wp_usermeta_prerestore_20260422 SELECT * FROM wp_usermeta WHERE user_id IN (<519>);
-- expect 10422 rows

Retain for 30 days (drop 2026-05-22). These are the rollback dataset.

3.6. Step 5 — Apply restore (single transaction)

The restore uses two patterns:

  • Field-level UPDATE on wp_users — only display_name + user_nicename (the code-confirmed fields touched).

  • Blanket DELETE+INSERT on wp_usermeta — simpler than field-level, and safe because umeta_id values are preserved from snapshot.

3.6.1. Why blanket replace on wp_usermeta is safe

umeta_id is an AUTO_INCREMENT PK. Analysis:

  • If the bad import UPDATEd an existing meta row (common case — key already existed), the row’s umeta_id is unchanged. Deleting it frees that value; the snapshot re-insert uses the same umeta_id — no collision.

  • If the bad import INSERTed a new meta row (key didn’t exist pre-incident), the new umeta_id is higher than anything in the snapshot. Deleting it frees a value that the snapshot never used. No collision.

  • MySQL AUTO_INCREMENT never re-issues a freed value, so no other user’s row has taken those umeta_id values in the meantime.

3.6.2. Transaction

SET SESSION innodb_lock_wait_timeout = 30;

START TRANSACTION;

-- 5a. wp_users: restore display_name + user_nicename
UPDATE wpca_prod.wp_users u
  JOIN wp_restore_260417.wp_users s ON s.ID = u.ID
SET u.display_name = s.display_name,
    u.user_nicename = s.user_nicename
WHERE u.ID IN (<519>);
-- MySQL UPDATE ROW_COUNT counts rows actually changed, not matched.
-- Expect ~233-237; the difference from 467 is rows where live already matched snapshot.

-- 5b. wp_usermeta: blanket delete + insert
DELETE FROM wpca_prod.wp_usermeta WHERE user_id IN (<519>);
-- expect 10422 rows

INSERT INTO wpca_prod.wp_usermeta (umeta_id, user_id, meta_key, meta_value)
  SELECT umeta_id, user_id, meta_key, meta_value
  FROM wp_restore_260417.wp_usermeta
  WHERE user_id IN (<519>);
-- expect 10579 rows (+157 restored rows that the bad import had DELETED)

-- Verification gates: READ EACH COUNT ALOUD before COMMIT
SELECT 'gate_1_users_match_snapshot (want 0)' AS gate, COUNT(*) AS cnt
FROM wpca_prod.wp_users u JOIN wp_restore_260417.wp_users s ON s.ID = u.ID
WHERE u.ID IN (<519>)
  AND (u.display_name <> s.display_name OR u.user_nicename <> s.user_nicename);

SELECT 'gate_2_meta_row_count' AS gate,
  (SELECT COUNT(*) FROM wpca_prod.wp_usermeta WHERE user_id IN (<519>)) AS live,
  (SELECT COUNT(*) FROM wp_restore_260417.wp_usermeta WHERE user_id IN (<519>)) AS snap;
-- live must equal snap

SELECT 'gate_3_umeta_id_no_cross_user_collision (want 0)' AS gate, COUNT(*) AS cnt
FROM wpca_prod.wp_usermeta m
WHERE m.user_id IN (<519>)
  AND EXISTS (SELECT 1 FROM wpca_prod.wp_usermeta m2
              WHERE m2.umeta_id = m.umeta_id AND m2.user_id <> m.user_id);

SELECT 'gate_4_sentinel_wp_users_total (want pre-run value)' AS gate, COUNT(*) AS cnt
FROM wpca_prod.wp_users;

COMMIT;

If any gate fails, ROLLBACK — do not COMMIT.

3.7. Step 6 — Re-import EPs

Use the -typed.xlsx files that carry the IDType=NATIONAL workaround column (for Bug #471). See Import Operations Guide.

for e in 119 120 121; do
  curl -X PUT -H "X-API-KEY: $ADMIN_KEY" \
    -F "file=@input/${e}-participants-typed.xlsx" \
    "https://admin-service.idealogic.co.za/api/event-participants/import\
?eventId=${e}&sheetIndex=0&createCustom1=true&createCustom2=true&createCustom3=true"
done

Events 120 (238 rows) and 121 (353 rows) usually 504 at the nginx 180s timeout. The backend continues processing. Poll DB until stable, then retry to fill any missing rows (imports are idempotent with IDType=NATIONAL).

Expected outcome after retry-to-fill:

Event EPs Notes

119 Duynefontein

205

Clean

120 Redhill

237

1 legitimate duplicate registration rejected (Niguse, same id_number on two regids)

121 Killarney 1

350

3 legitimate duplicates (Arendse, Darries, Pagel)

Verify zero id_number duplicates:

SELECT um.meta_value AS id_number, COUNT(DISTINCT um.user_id) AS users
FROM event_participant ep
JOIN wp_usermeta um ON um.user_id = ep.person_id AND um.meta_key='id_number' AND um.meta_value<>''
WHERE ep.event_id IN (119,120,121)
GROUP BY um.meta_value HAVING users > 1;
-- Expected: 0 rows

3.8. Step 7 — Re-import Results

Use the -fixed.csv variants for events 120 and 121 (they carry the regid swaps that recover two results lost to duplicate-registration, detailed in design-journal/2026-04/result-import-improvements.adoc). Event 119 uses the original CSV.

curl -X PUT -H "X-API-KEY: $ADMIN_KEY" \
  -F "file=@input/119-ResultExport.csv" \
  "https://admin-service.idealogic.co.za/api/result-sets/import-bulk\
?eventId=119&participantIdMode=regid&pointsCalculator=wpca-road-league"

curl -X PUT -H "X-API-KEY: $ADMIN_KEY" \
  -F "file=@input/120-ResultExport-fixed.csv" \
  "https://admin-service.idealogic.co.za/api/result-sets/import-bulk\
?eventId=120&participantIdMode=regid&pointsCalculator=wpca-road-league"

curl -X PUT -H "X-API-KEY: $ADMIN_KEY" \
  -F "file=@input/121-ResultExport-fixed.csv" \
  "https://admin-service.idealogic.co.za/api/result-sets/import-bulk\
?eventId=121&participantIdMode=regid&pointsCalculator=wpca-road-league"

Expected: 182 / 201 / 314 RaceResults. 100% laps populated. WPCA points scheme applied (top-12 scoring + 2 participation points for 13+).

3.9. Step 8 — Delete Yaghya Cat 4 Drop artifact

The Cat 4 "Drop" row in 121-ResultExport.csv for Yaghya Darries (regid 30274) still lands in the Cat 4 ResultSet on re-import. Since the same EP now holds his Cat 3 Pos 78 result (via the regid swap in the -fixed.csv), we delete the Cat 4 artifact per the "can only race once" decision:

DELETE rr FROM race_result rr
  JOIN result_set rs ON rs.id = rr.result_set_id
  JOIN race r ON r.id = rs.race_id
  JOIN event_category ec ON ec.id = r.event_category_id
  JOIN event_participant ep ON ep.id = rr.event_participant_id
WHERE rs.event_id = 121
  AND ep.registration_id = '30274'
  AND ec.name = 'Category 4'
  AND rr.status IS NULL AND rr.position IS NULL;
-- expect 1 row

This step is NOT CSV-durable: any future re-import of the original 121-ResultExport.csv will recreate the Cat 4 row. Archive the originals and use the -fixed.csv files going forward.

3.10. Step 9 — Post-run verification

  1. wp_users/wp_usermeta match snapshot — a few diffs are expected from the Step 6 EP re-import writing CSV values (Emailperson_email, ContactNumbercontact_number) back onto matched persons. All such diffs should correlate with UIDs that were matched by an EP re-import.

  2. Zero id_number duplicates across events 119/120/121.

  3. Mahder Niguse has a Pos 3 (40 points) result in event 120 U/19 Women. Yaghya Darries has a Pos 78 (2 points) result in event 121 Category 3 and NO Cat 4 result.

  4. EP→person linkage sanity — sample 5 random EPs, confirm their first_name matches the linked person’s first_name meta.

3.11. Step 10 — Cleanup

  1. Delete the loader pod: kubectl -n mysql delete pod wp-restore-loader

  2. Revoke temporary SUPER + global CREATE from claude.

  3. Schedule retention:

    • Drop wp_restore_260417 schema on 2026-04-29 (7 days).

    • Drop wp_users_prerestore_20260422 + wp_usermeta_prerestore_20260422 on 2026-05-22 (30 days).

4. Decisions made

Decision Rationale

Restore into a new schema on the same cluster (not a standalone MySQL)

Simpler restore SQL (cross-schema JOINs). Schema-name typo risk mitigated by distinctive name (wp_restore_260417 — date-encoded, not _backup/_snapshot) and always-qualified DML in every statement.

Blanket DELETE+INSERT on wp_usermeta (not field-level allow-list)

Deterministic; simpler; no allow-list drift. umeta_id collision risk is zero given AUTO_INCREMENT monotonicity and Hibernate’s UPDATE-in-place semantics.

Field-level UPDATE on wp_users (not DELETE+INSERT)

Only 2 columns are touched by the import (display_name, user_nicename); updating in place avoids churn on auto-generated columns and keeps FK references stable.

Purge EPs/Results BEFORE the wp_users restore

Today’s EP re-import (2026-04-21) matched persons against corrupted wp_users; those EPs may point to the wrong user. After restore, re-import from scratch to rebuild the EP→person links against clean data.

Accept that EP re-import writes CSV values back onto restored persons

The CSV is authoritative for participant profile fields at registration time. The snapshot is a starting point; the CSV-sourced values (Email, ContactNumber, etc.) correctly override.

Manual commit gate on the restore transaction

519 users × ~22 meta rows is small enough that automation offers no benefit; the single most important control is the operator reading row counts before typing COMMIT.

30-day retention of pre-restore snapshots

Gives time for any user whose legitimate post-incident edit was reverted to surface a complaint. After 30 days, the snapshots can be dropped.

5. Retention & calendar

Artefact Drop date Notes

wp_restore_260417 schema

2026-04-29

The extracted 2026-04-17 snapshot. Keep 7 days for any follow-up investigation.

wpca_prod.wp_users_prerestore_20260422

2026-05-22

Rollback dataset — restore from here if anyone reports lost legitimate edits.

wpca_prod.wp_usermeta_prerestore_20260422

2026-05-22

Companion to the above.

recon/logs/wp-users-restore/affected_uids.txt

Permanent

Audit log — the authoritative list of UIDs processed.

6. References

6.1. Skill

  • mysql-operator-backup-restore (at ~/dev/ai-skills-develop/skills/devops/mysql-operator-backup-restore/SKILL.md) — generic workflow for loading an Oracle MySQL Operator backup into a target schema. Covers backup identification, S3 access, loader pod, util.loadDump, common pitfalls.

  • mysql-idealogic-prod (at ~/dev/ai-skills-develop/skills/devops/mysql-idealogic-prod/SKILL.md) — tunnel + credential setup for the cluster.

6.2. Source code inspected during planning

  • admin-service/src/main/java/za/co/idealogic/event/admin/service/EventParticipantImportXLS.java:108 — UID column → PERSON_EXTERNAL_ID mapping

  • admin-service/src/main/java/za/co/idealogic/event/admin/service/EventParticipantImportXLS.java:196 — person.setId(…​) triggers JPA load-by-ID

  • admin-service/src/main/java/za/co/idealogic/event/admin/service/EventParticipantServiceEx.java:661-712 — updateIdentificationDetails

  • wordpress-database/src/main/java/za/co/idealogic/wordpress/UserWrapper.java:92-106 — setName() writes user_nicename + display_name on wp_users

  • database/src/main/java/za/co/idealogic/event/domain/PersonWrapper.java — full list of wp_usermeta keys the import writes

  • Bug #471: EP Import — matchPerson OTHER branch creates duplicate Person. Workaround was IDType=NATIONAL in the CSVs used for Step 6.

  • Feature #442: WPCA Event Migration — EP Import & Result Import Improvements.

  • design-journal/2026-04/wpca-event-migration.adoc — broader migration context.

  • design-journal/2026-04/result-import-improvements.adoc — ADO-452 (laps), ADO-459 (WPCA points), regid-swap rationale for -fixed.csv files.

  • design-journal/2026-04/wp-users-backup-restore.adoc — journal entry for the 2026-04-22 execution of this runbook.