I am not an expert at Postgres/GIS subjects and I have an issue with a large database (over 20 million records) of geometries. First of all my set up looks like this:
mmt=# select version();
-[ RECORD 1 ]-------------------------------------------------------------------------------------------------------------
version | PostgreSQL 13.2 (Debian 13.2-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
mmt=# select PostGIS_Version();
-[ RECORD 1 ]---+--------------------------------------
postgis_version | 3.1 USE_GEOS=1 USE_PROJ=1 USE_STATS=1
The table that I am querying contains the following columns:
mmt=# \d titles
                                              Table "public.titles"
        Column        |           Type           | Collation | Nullable |                 Default                 
----------------------+--------------------------+-----------+----------+-----------------------------------------
 ogc_fid              | integer                  |           | not null | nextval('titles_ogc_fid_seq'::regclass)
 wkb_geometry         | bytea                    |           |          | 
 timestamp            | timestamp with time zone |           |          | 
 end                  | timestamp with time zone |           |          | 
 gml_id               | character varying        |           |          | 
 validfrom            | character varying        |           |          | 
 beginlifespanversion | character varying        |           |          | 
 geom_bounding_box    | geometry(Geometry,4326)  |           |          | 
Indexes:
    "titles_pkey" PRIMARY KEY, btree (ogc_fid)
    "geom_idx" gist (geom_bounding_box)
The geom_bounding_box column holds the bounding box of the wkb_geometry. I have created that bounding box column because the wkb geometries exceed the default size limits for items in a GIST index. Some of them are quite complex geometries with several dozens of points making up a polygon. Using a bounding box instead meant I was able to put an index on that column as a way of speeding up the search.. at least that's the theory.
My search aims to find geometries which fall within 100 metres of a given point as follows, however this takes well over two minutes to return. I want to get that under one second!:
select ogc_fid, web_geometry from titles where ST_DWithin(geom_bounding_box, 'SRID=4326;POINT(-0.145872 51.509691)'::geography, 100);
Below is a basic explain output. What can I do to speed this thing up?
Thank you!
mmt=# explain select ogc_fid from titles where ST_DWithin(geom_bounding_box, 'SRID=4326;POINT(-0.145872 51.509691)'::geography, 100);
-[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN | Gather  (cost=1000.00..243806855.33 rows=2307 width=4)
-[ RECORD 2 ]----------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN |   Workers Planned: 2
-[ RECORD 3 ]----------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN |   ->  Parallel Seq Scan on titles  (cost=0.00..243805624.63 rows=961 width=4)
-[ RECORD 4 ]----------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN |         Filter: st_dwithin((geom_bounding_box)::geography, '0101000020E61000006878B306EFABC2BF6308008E3DC14940'::geography, '100'::double precision, true)
-[ RECORD 5 ]----------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN | JIT:
-[ RECORD 6 ]----------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN |   Functions: 4
-[ RECORD 7 ]----------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN |   Options: Inlining true, Optimization true, Expressions true, Deforming true