Improving the performance of a JOIN with OR conditions by splitting it and recombining with UNION

A colleague drew my attention to the performance of this query:

SELECT
	d0."id",
	d0."version",
	d0."class",
	d0."description",
	d0."metadata",
	d0."group",
	d0."name",
	d0."type",
	d0."deleted_at",
	d0."hash",
	d0."ghash",
	d0."lhash",
	d0."data_structure_id",
	d0."inserted_at",
	d0."updated_at"
FROM "data_structure_versions" AS d0
INNER JOIN "data_structures" AS d1
ON d1."id" = d0."data_structure_id"
WHERE (
	((d1."updated_at" >= '2022-09-10 00:02:51') OR (d0."updated_at" >= '2022-09-10 00:02:51'))
	OR (d0.deleted_at >= '2022-09-10 00:02:51')
)

Examining it with EXPLAIN ANALYZE:

Gather  (cost=145157.46..1097083.29 rows=398 width=784) (actual time=2981.334..4174.031 rows=17 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Hash Join  (cost=144157.46..1096043.49 rows=166 width=784) (actual time=2904.393..3790.212 rows=6 loops=3)
        Hash Cond: (d0.data_structure_id = d1.id)
        Join Filter: ((d0.updated_at >= '2022-09-10 00:02:51'::timestamp without time zone) OR (d1.updated_at >= '2022-09-10 00:02:51'::timestamp without time zone) OR (d0.deleted_at >= '2022-09-10 00:02:51'::timestamp without time zone))
        Rows Removed by Join Filter: 1660765
        ->  Parallel Seq Scan on data_structure_versions d0  (cost=0.00..527622.63 rows=2075963 width=784) (actual time=0.025..800.242 rows=1660770 loops=3)
        ->  Parallel Hash  (cost=111065.65..111065.65 rows=1903665 width=16) (actual time=470.178..470.178 rows=1522932 loops=3)
              Buckets: 131072  Batches: 128  Memory Usage: 2752kB
              ->  Parallel Seq Scan on data_structures d1  (cost=0.00..111065.65 rows=1903665 width=16) (actual time=0.020..206.613 rows=1522932 loops=3)
Planning Time: 1.265 ms
Execution Time: 4174.234 ms

Four seconds: a bit sluggish. In production it was even worse, because the REST API that runs this query was timing out. The plan shows parallel sequential scans over both tables, with the join filter discarding more than 1.6 million rows. Looking for ideas to improve the performance, I found this:

If the individual joins work fast, the composite OR condition is probably slow because it cannot use a single index whereas the individual conditions can each use a single index for the three joining conditions. Because it can’t use one index, it is probably doing non-indexed sequential table scans. (You should study the query plan so you understand what the optimizer is actually doing.)

Given that the individual queries work reasonably fast, therefore, you should get dramatically better performance using UNION (unless the optimizer in your DBMS has a blind spot):

[…code]

That should give you a result approximately as quickly as the 3 individual queries. It won’t be quite as fast because UNION does duplicate elimination (which the individual queries do not, of course). You could use UNION ALL, but if there are many rows in the two tables where 2 or 3 of the pairs of fields match, that could lead to a lot of repetition in the results.

https://stackoverflow.com/a/27696729

So that's what I did: I split the OR conditions in the WHERE into two queries, one with the conditions on data_structure_versions and one with the condition on data_structures, and recombined them with a UNION:

SELECT dsv.*
FROM data_structure_versions dsv
INNER JOIN data_structures ds ON dsv.data_structure_id = ds.id
WHERE dsv.updated_at >= '2022-09-10 00:02:51'
   OR dsv.deleted_at >= '2022-09-10 00:02:51'
UNION
SELECT dsv.*
FROM data_structure_versions dsv
INNER JOIN data_structures ds ON dsv.data_structure_id = ds.id
WHERE ds.updated_at >= '2022-09-10 00:02:51'

It improved considerably:

HashAggregate  (cost=654902.76..654902.96 rows=20 width=3409) (actual time=579.409..579.424 rows=17 loops=1)
  Group Key: dsv.id, dsv.version, dsv.data_structure_id, dsv.inserted_at, dsv.updated_at, dsv.class, dsv.deleted_at, dsv.description, dsv."group", dsv.metadata, dsv.name, dsv.type, dsv.hash, dsv.lhash, dsv.ghash, ds.id, ds.system, ds.last_change_by, ds.inserted_at, ds.updated_at, ds.confidential, ds.external_id, ds.system_id, ds.source_id, ds.domain_ids, ds.alias
  ->  Gather  (cost=1000.43..654901.46 rows=20 width=3409) (actual time=213.936..579.815 rows=17 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Append  (cost=0.43..653899.46 rows=20 width=3409) (actual time=333.193..576.739 rows=6 loops=3)
              ->  Nested Loop  (cost=0.43..538044.69 rows=5 width=1935) (actual time=435.311..435.311 rows=0 loops=3)
                    ->  Parallel Seq Scan on data_structure_versions dsv  (cost=0.00..538002.44 rows=5 width=786) (actual time=435.310..435.310 rows=0 loops=3)
                          Filter: ((updated_at >= '2022-09-10 00:02:51'::timestamp without time zone) OR (deleted_at >= '2022-09-10 00:02:51'::timestamp without time zone))
                          Rows Removed by Filter: 1660770
                    ->  Index Scan using data_structures_pkey on data_structures ds  (cost=0.43..8.45 rows=1 width=1149) (never executed)
                          Index Cond: (id = dsv.data_structure_id)
              ->  Nested Loop  (cost=0.43..115854.46 rows=3 width=1935) (actual time=211.898..212.139 rows=8 loops=2)
                    ->  Parallel Seq Scan on data_structures ds_1  (cost=0.00..115824.81 rows=2 width=1149) (actual time=211.840..211.967 rows=8 loops=2)
                          Filter: (updated_at >= '2022-09-10 00:02:51'::timestamp without time zone)
                          Rows Removed by Filter: 2284389
                    ->  Index Scan using data_structure_versions_data_structure_id_version_index on data_structure_versions dsv_1  (cost=0.43..14.79 rows=4 width=786) (actual time=0.016..0.016 rows=1 loops=17)
                          Index Cond: (data_structure_id = ds_1.id)
Planning Time: 0.483 ms
Execution Time: 580.106 ms
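
The HashAggregate node at the top of the plan is the duplicate elimination that UNION performs: the same version row can satisfy both branches when its own updated_at and its structure's updated_at are both past the cutoff. As the quoted answer notes, UNION ALL skips that step; a hedged sketch, only worth considering if the caller can tolerate duplicate rows:

SELECT dsv.*
FROM data_structure_versions dsv
INNER JOIN data_structures ds ON dsv.data_structure_id = ds.id
WHERE dsv.updated_at >= '2022-09-10 00:02:51'
   OR dsv.deleted_at >= '2022-09-10 00:02:51'
UNION ALL  -- no deduplication: a row matching both branches appears twice
SELECT dsv.*
FROM data_structure_versions dsv
INNER JOIN data_structures ds ON dsv.data_structure_id = ds.id
WHERE ds.updated_at >= '2022-09-10 00:02:51'

In this case the deduplication is cheap (only 17 rows survive the filters), so I kept UNION.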

Later, I added indexes on the columns used in the WHERE clauses. The DDL was along these lines (plain B-tree indexes; the names are the ones that show up in the plan below):
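
CREATE INDEX data_structure_versions_updated_at_index
	ON data_structure_versions (updated_at);
CREATE INDEX data_structure_versions_deleted_at_index
	ON data_structure_versions (deleted_at);
CREATE INDEX data_structures_updated_at_index
	ON data_structures (updated_at);

With the indexes in place, it improved quite a bit more: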

HashAggregate  (cost=6609.10..6613.09 rows=399 width=2268) (actual time=0.298..0.314 rows=17 loops=1)
  Group Key: dsv.id, dsv.version, dsv.data_structure_id, dsv.inserted_at, dsv.updated_at, dsv.class, dsv.deleted_at, dsv.description, dsv."group", dsv.metadata, dsv.name, dsv.type, dsv.hash, dsv.lhash, dsv.ghash
  ->  Append  (cost=9.31..6594.14 rows=399 width=2268) (actual time=0.060..0.221 rows=17 loops=1)
        ->  Nested Loop  (cost=9.31..17.35 rows=1 width=784) (actual time=0.015..0.015 rows=0 loops=1)
              ->  Bitmap Heap Scan on data_structure_versions dsv  (cost=8.88..12.90 rows=1 width=784) (actual time=0.014..0.014 rows=0 loops=1)
                    Recheck Cond: ((updated_at >= '2022-09-10 00:02:51'::timestamp without time zone) OR (deleted_at >= '2022-09-10 00:02:51'::timestamp without time zone))
                    ->  BitmapOr  (cost=8.88..8.88 rows=1 width=0) (actual time=0.012..0.012 rows=0 loops=1)
                          ->  Bitmap Index Scan on data_structure_versions_updated_at_index  (cost=0.00..4.44 rows=1 width=0) (actual time=0.007..0.007 rows=0 loops=1)
                                Index Cond: (updated_at >= '2022-09-10 00:02:51'::timestamp without time zone)
                          ->  Bitmap Index Scan on data_structure_versions_deleted_at_index  (cost=0.00..4.44 rows=1 width=0) (actual time=0.004..0.004 rows=0 loops=1)
                                Index Cond: (deleted_at >= '2022-09-10 00:02:51'::timestamp without time zone)
              ->  Index Only Scan using data_structures_pkey on data_structures ds  (cost=0.43..4.45 rows=1 width=8) (never executed)
                    Index Cond: (id = dsv.data_structure_id)
                    Heap Fetches: 0
        ->  Nested Loop  (cost=0.86..6570.81 rows=398 width=784) (actual time=0.044..0.202 rows=17 loops=1)
              ->  Index Scan using data_structures_updated_at_index on data_structures ds_1  (cost=0.43..1166.43 rows=365 width=8) (actual time=0.021..0.043 rows=17 loops=1)
                    Index Cond: (updated_at >= '2022-09-10 00:02:51'::timestamp without time zone)
              ->  Index Scan using data_structure_versions_data_structure_id_version_index on data_structure_versions dsv_1  (cost=0.43..14.77 rows=4 width=784) (actual time=0.007..0.008 rows=1 loops=17)
                    Index Cond: (data_structure_id = ds_1.id)
Planning Time: 0.822 ms
Execution Time: 0.405 ms

You can see that the Parallel Seq Scans are gone: the plan now uses index scans via data_structure_versions_updated_at_index, data_structure_versions_deleted_at_index and data_structures_updated_at_index, and the execution time dropped from about 4 seconds to well under a millisecond.
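
To double-check in production that the new indexes keep being used (and not just in a one-off EXPLAIN ANALYZE), PostgreSQL's statistics views can help; a small sketch using pg_stat_user_indexes:

-- Cumulative number of scans per index; idx_scan should keep
-- growing as the REST API endpoint receives traffic.
SELECT relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE indexrelname IN (
	'data_structure_versions_updated_at_index',
	'data_structure_versions_deleted_at_index',
	'data_structures_updated_at_index'
);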