I’m using SQL Server (SQL Server 2016 and Azure SQL) and I have this
MERGE statement which uses a rather gnarly
WHEN MATCHED condition to only update rows where values are actually different.
This is done for two reasons:
- The table has a
rowversioncolumn that will change when an
UPDATEoperation is performed, even if all of the values are the same. The
rowversionvalues are useful for reducing client-activity (the application uses
rowversionfor optimistic concurrency).
- The table is also a Temporal Table, and SQL Server’s implementation of Temporal Tables will add copies of live data to the History table whenever an
UPDATEis performed, even if no values are actually changed.
CREATE PROCEDURE UpsertItems @groupId int, @items dbo.ItemsList READONLY -- This is a table-valued parameter. The UDT Table-Type has the same design as the `dbo.Items` table. WITH existing AS -- Using a CTE as the MERGE target to allow *safe* use of `WHEN NOT MATCHED BY SOURCE THEN DELETE` and apparently it's good for performance. ( SELECT groupId, itemId, a, b, c, d, e, f, -- etc FROM dbo.Items WHERE groupId = @groupId ) MERGE INTO existing WITH (HOLDLOCK) AS tgt USING @items AS src ON tgt.itemId = src.itemId WHEN MATCHED AND ( -- This part is painful, but unfortunately these are all NULLable columns so they need the full `x IS DISTINCT FROM y`-equivalent comparison: ( ( tgt.a <> src.a OR tgt.a IS NULL OR src.a IS NULL ) AND NOT ( tgt.a IS NULL AND src.a IS NULL ) ) OR ( ( tgt.b <> src.b OR tgt.b IS NULL OR src.b IS NULL ) AND NOT ( tgt.b IS NULL AND src.b IS NULL ) ) OR ( ( tgt.c <> src.c OR tgt.c IS NULL OR src.c IS NULL ) AND NOT ( tgt.c IS NULL AND src.c IS NULL ) ) OR ( ( tgt.d <> src.d OR tgt.d IS NULL OR src.d IS NULL ) AND NOT ( tgt.d IS NULL AND src.d IS NULL ) ) OR ( ( tgt.e <> src.e OR tgt.e IS NULL OR src.e IS NULL ) AND NOT ( tgt.e IS NULL AND src.e IS NULL ) ) OR ( ( tgt.f <> src.f OR tgt.f IS NULL OR src.f IS NULL ) AND NOT ( tgt.f IS NULL AND src.f IS NULL ) ) -- etc ) THEN UPDATE SET tgt.a = src.a, tgt.b = src.b, tgt.c = src.c, tgt.d = src.d, tgt.e = src.e, tgt.f = src.f, -- etc WHEN NOT MATCHED BY TARGET THEN INSERT ( groupId, itemId, a, b, c, d, e, f, -- etc ) VALUES ( src.groupId, src.itemId, src.a, src.b, src.c, src.d, src.e, src.f, -- etc ) WHEN NOT MATCHED BY SOURCE THEN DELETE OUTPUT $action AS [Action], inserted.groupId AS Ins_groupId, deleted .groupId AS Del_groupId, inserted.itemId AS Ins_itemId, deleted .itemId AS Del_itemId, inserted.a AS Ins_a, deleted .a AS Del_a, inserted.b AS Ins_b, deleted .b AS Del_b, inserted.c AS Ins_c, deleted .c AS Del_c, inserted.d AS Ins_d, deleted .d AS Del_d, inserted.e AS Ins_e, deleted .e AS Del_e, inserted.f AS Ins_f, deleted .f AS Del_f, -- etc ;
As you can see, this is quit the pain to maintain!
I already use tools like T4 to automate generating the repetitive parts of this query, but the sheer… scale and pain of this
MERGE statement makes me feel like I’m doing something very wrong (because software is meant to light-the-way via the Pit of Success, so if one is encountering difficulties trying to do the right thing you’re probably doing it wrong), but I can’t think or see a better way of accomplishing this (
BULK INSERT notwithstanding, but for the purposes of this question that’s not a possibility).
I know this statement can be simplified in other RDBMS which support
x IS DISTINCT FROM y (which replaces the horrible-but-necessary
NULL-safe checks in the
WHEN MATCHED AND, but SQL Server still doesn’t support it.
Another pain is the lack of DRY in SQL in general – and the difficulty of implementing a DRY database in SQL Server (e.g. there’s no support for Deferred Constraints or Table Inheritance so you can’t implement the Subclass Table Pattern, which means needless repetition of data-design in multiple tables and weaker constraints) – but that’s another topic. I’m just frustrated with how backward SQL programming seems today compared to the many time-saving and keypress-saving features in modern languages like Kotlin and TypeScript.
I’d love to be able to do something like this, and not have to do with any gotchas (like how
MERGE is unsafe by default without an explicit
HOLDLOCK – that’s crazy!):
MERGE INTO dbo.Items AS tgt WHERE tgt.groupId = @groupId FROM @items AS src ON tgt.itemId = src.itemId WHEN MATCHED AND DIFFERENT THEN UPDATE ( automap ) WHEN NOT MATCHED BY TARGET THEN INSERT ( automap ) WHEN NOT MATCHED BY SOURCE THEN DELETE OUTPUT ALL;
automap would be an amazing feature in SQL Server that automatically maps source and destination columns by name and throws a compile-time error if it cannot automatically map columns to each other), and
OUTPUT ALL would output all
deleted values with different column names – using the same column names but with
deleted values in adjacent rows).
Go to Source