
I am working with PostgreSQL and PostGIS on a very large geometry dataset. My goal is to subsample the data in a way similar to point cloud (PCD) subsampling: select one representative point every 10 meters while keeping the actual values.

I have considered:

  • ST_ClusterDBSCAN() → Scans the entire dataset, which is inefficient for my case.
  • ST_SnapToGrid(geom, 10) → Does not select representative points but adjusts them to a grid.

To achieve my goal, I wanted to use ST_Resample(), but I found that it only works with raster data and not geom.

Why does ST_Resample() work only with raster and not with geom? Is there an equivalent function or method for resampling geom data at fixed intervals (e.g., 10m) while preserving real values?

  • Raster data is a gridded data structure: each cell has a known dimension, a known grid index, and a known neighborhood whose members are cells themselves and share the same inherent grid relations. Vector data is discrete; the data itself carries no inherent relationships, e.g. a polygon does not know its neighbors without an explicit analytical step. Establishing those relationships on your vector data is unavoidable, and ST_ClusterDBSCAN is indeed a powerful tool for the general case. For points only, creating a grid with ST_SnapToGrid can work just as well. Commented Mar 18 at 9:34
  • Thank you so much! Your answer has given me direction. I really appreciate it! Commented Mar 20 at 8:35
  • What I want to do is start with a specific point (a representative point), eliminate all other data within a 10m radius, and then continue this process by selecting a new representative point and removing any remaining data within its radius. I want to have only one representative point at a time, similar to how Google Maps' Street View moves step by step. Would you be able to share any good ideas on how to achieve this? Commented Mar 20 at 9:03
  • As I said, since you seem to have point data, you have two options: 1) use ST_ClusterDBSCAN to cluster your points and extract the cluster centroids, whose location depends on the actual distribution of your points; 2) use ST_SnapToGrid to snap your points to a grid, so the resulting points are regularly distributed in grid cells. Do you have a preference? Note that in both cases your point coordinates need to be in a projection suitable for your area of interest that uses meters as its unit! Commented Mar 20 at 11:34
  • Thanks a lot. Your insight really helped me see things from a new angle and think differently about my initial objective. Commented Mar 21 at 1:38
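
A minimal sketch of option 2 from the comments above, adapted so that it keeps one real point per 10 m grid node instead of moving points onto the grid. The table `pts` and its columns `id` and `geom` are hypothetical names, and `geom` is assumed to be stored in a projected CRS with meter units. `ST_SnapToGrid` serves only as a grouping key here; the row returned for each node is the original point closest to it. Points kept for neighboring nodes can still lie closer than 10 m to each other, so this approximates the spacing rather than guaranteeing it.

```sql
-- Hypothetical table: pts(id bigint, geom geometry(Point)) in a meter-based CRS.
-- Keep one original row per 10 m grid node: the point nearest to that node.
SELECT DISTINCT ON (ST_SnapToGrid(geom, 10))
       id,
       geom                                            -- original coordinates, unchanged
FROM   pts
ORDER  BY ST_SnapToGrid(geom, 10),                     -- groups points by grid node
          ST_Distance(geom, ST_SnapToGrid(geom, 10)),  -- nearest to the node first
          id;                                          -- deterministic tie-break
```

Because each row is processed on its own, this needs a single pass and a sort rather than a neighborhood search, which is why it tends to scale better than clustering on very large tables.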

1 Answer


Vector data is not sampled. Instead of a grid, it is described by the vertices of its boundary lines. You can't resample it, but you can typically reduce the size of the data with the ST_Simplify function (see also its variants ST_SimplifyPreserveTopology and ST_SimplifyVW, which might be preferable in some cases). It removes vertices as long as the new line stays within the tolerance parameter of the old one.

  • Thank you for your detailed response! Unfortunately, my data consists of point geometries, so I couldn't use ST_Simplify or similar methods. What I want to do is start with a specific point (a representative point), eliminate all other data within a 10m radius, and then continue this process by selecting a new representative point and removing any remaining data within its radius. I want to have only one representative point at a time, similar to how Google Maps' Street View moves step by step. Would you be able to share any good ideas on how to achieve this? Commented Mar 20 at 9:04
  • I would use ST_ClusterDBSCAN. To avoid scanning the whole dataset, partition it at some suitable coarse granularity, e.g. by state or city boundaries, or by a short GeoHash string. Commented Mar 20 at 19:47
  • Your answer shed new light on my thinking and made me revisit my original intentions. Thank you! Commented Mar 21 at 1:39
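
A rough sketch of the partitioned ST_ClusterDBSCAN idea from the comments above. The table `pts`, its columns `id` and `geom`, and the GeoHash length of 5 are assumptions for illustration; `geom` is assumed to carry a meter-based SRID so that `eps := 10` means 10 m and `ST_Transform` to EPSG:4326 works for the GeoHash. For each cluster, the query returns the original point closest to the cluster centroid, so real values are kept. Two caveats: with `minpoints := 1` DBSCAN chains neighbors together, so a cluster can span far more than 10 m, and clusters that straddle a GeoHash cell boundary are split in two.

```sql
-- Hypothetical table: pts(id bigint, geom geometry(Point)) in a meter-based CRS.
WITH clustered AS (
  SELECT id,
         geom,
         ST_GeoHash(ST_Transform(geom, 4326), 5) AS cell,   -- coarse partition key
         ST_ClusterDBSCAN(geom, eps := 10, minpoints := 1)
           OVER (PARTITION BY ST_GeoHash(ST_Transform(geom, 4326), 5)) AS cid
  FROM   pts
),
centered AS (
  SELECT id,
         geom,
         cell,
         cid,
         -- centroid of all points in the same cluster (cid is unique only per cell)
         ST_Centroid(ST_Collect(geom) OVER (PARTITION BY cell, cid)) AS center
  FROM   clustered
)
-- One representative per cluster: the real point nearest the cluster centroid.
SELECT DISTINCT ON (cell, cid)
       id,
       geom
FROM   centered
ORDER  BY cell, cid, ST_Distance(geom, center), id;
```

A shorter GeoHash gives larger partitions (fewer split clusters, more work per partition); a longer one does the opposite, so the length is worth tuning against the actual data.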
