My database instances appear to treat FK constraint ON DELETE RESTRICT differently and I don't know why. On production database instance (and a clone, where I do my testing), I believe the RESTRICT is working correctly. But on Test, Staging and developer machines it isn't.
All databases are Postgres 10.17, AWS RDS instances or Win 10. All databases have identical schema. Our databases are small so Dev, Test and Staging use a database that has been taken from production by being dumped (pg_dump), restored (pg_restore), anonymised and then dumped (pg_dump) again. Could that be the key to the difference between production and the rest?
Table Structure
A
|- B
|- C
|- D
It's a simple hierarchy of one-to-many relations going from top to bottom. For legacy reasons, there are still IDs FKs that point to parents further up the tree. For example, there id a FK relationship between D and B, even though it's inferred via its relationship with C. Performance on write isn't a problem for the application, so we're parking that change for now.
The foreign keys are all ON DELETE CASCADE, except between D and C. That is ON DELETE RESTRICT. For our app, that's a bug (we know how to fix) but that's not why I'm here. I'm here because production was the first place we saw it and I don't like that.
Expected, Correct Behaviour
On production clone, I try a simple delete from the top of the hierarchy and am told that it cannot delete because the foreign key constraints, as expected.
db=> SET search_path TO dbo;
db=> delete from "A" where "A_ID"=1234;
ERROR: update or delete on table "C" violates foreign key constraint "FK_D_C_C_ID" on table "D"
DETAIL: Key (C_ID)=(654321) is still referenced from table "D".
Table D description (boring columns removed):
db=> \d "D"
Table "dbo.D"
Column | Type | Collation | Nullable | Default
----------------------+-----------------------------+-----------+----------+----------------------------------------------------
D_ID | integer | | not null | generated by default as identity
B_ID | integer | | not null | 0
C_ID | integer | | |
Indexes:
"PK_dbo.D" PRIMARY KEY, btree ("D_ID")
"IX_D_B_ID" btree ("B_ID")
"IX_D_C_ID" btree ("C_ID")
Foreign-key constraints:
"FK_D_B_B_ID" FOREIGN KEY ("B_ID") REFERENCES "B"("B_ID") ON DELETE CASCADE
"FK_D_C_C_ID" FOREIGN KEY ("C_ID") REFERENCES "C"("C_ID") ON DELETE RESTRICT
The schema is identical on the other database instances and the same command completes correctly. The data is identical between all instances.
On our staging:
db=> delete from "A" where "A_ID"=1267
db=>
Which is clearly wrong, it should throw the same error as we saw on production and the production clone.
I'm suspicious of our dump/restore process
My hunch is that by dumping, restoring, anonymising and then dumping again, something is happening to the database so that the ON DELETE RESTRICT isn't triggered on a delete.
The dump command we use is:
pg_dump <DB CONNECTION ARGUMENTS> --format custom --blobs --no-privileges --no-tablespaces --verbose --no-unlogged-table-data --file <FILE PATH>
And restore is:
pg_restore <DB CONNECTION ARGUMENTS> -w <FILE PATH>
Anonymisation only changes varchar/text values and in none of the A,B,C,D tables, so I've omitted that.
I can't see anything particularly out of the ordinary with our dump/restore.
Thanks in advance.