PostGIS
Name
ST_ClusterDBSCAN — Window function that returns a cluster id for each input geometry using the DBSCAN algorithm.
Description
Returns cluster number for each input geometry, based on a 2D
implementation of the Density-based
spatial clustering of applications with noise (DBSCAN) algorithm.
Unlike ST_ClusterKMeans, it does not require
the number of clusters to be specified, but instead uses the desired
distance (eps
) and density (minpoints
)
parameters to construct each cluster.
An input geometry will be added to a cluster if it is either:
Note that border geometries may be within eps
distance of core
geometries in more than one cluster; in this case, either assignment
would be correct, and the border geometry will be arbitrarily asssigned
to one of the available clusters. In these cases, it is possible for a
correct cluster to be generated with fewer than minpoints
geometries. When assignment of a border geometry is ambiguous, repeated
calls to ST_ClusterDBSCAN will produce identical results if an ORDER BY
clause is included in the window definition, but cluster assignments may
differ from other implementations of the same algorithm.
|
Availability: 2.3.0
Examples
Assigning a cluster number to each polygon within 50 meters of each other. Require at least 2 polygons per cluster
within 50 meters at least 2 per cluster. singletons have NULL for cid
|
|
bucket -------------------------------------+-------- Manulife Tower |
0 Park Lane Seaport I |
0 Park Lane Seaport II |
0 Renaissance Boston Waterfront Hotel |
0 Seaport Boston Hotel |
0 Seaport Hotel & World Trade Center |
0 Waterside Place |
0 World Trade Center East |
0 100 Northern Avenue |
1 100 Pier 4 |
1 The Institute of Contemporary Art |
1 101 Seaport |
2 District Hall |
2 One Marina Park Drive |
2 Twenty Two Liberty |
2 Vertex |
2 Vertex |
2 Watermark Seaport |
2 Blue Hills Bank Pavilion |
NULL World Trade Center West |
Combining parcels with the same cluster number into a single geometry. This uses named argument calling
SELECT cid, ST_Collect(geom) AS cluster_geom, array_agg(parcel_id) AS ids_in_cluster FROM (
SELECT parcel_id, ST_ClusterDBSCAN(geom, eps := 0.5, minpoints := 5) over () AS cid, geom
FROM parcels) sq
GROUP BY cid;