The per-feature verbs in Streaming
spatial operations treat each geometry on its own. A second family
of verbs reads a whole feature set at once: the constructions and
topology operations that need every feature in a group together, not one
row at a time. These ride a shared partition tier. A by
argument routes the layer into one shard per group, each shard is
processed as an independent coverage, and the shards stream so the
resident budget is one group rather than the whole layer.
This vignette covers building polygons from lines, cleaning a polygon coverage, decomposing geometry into its parts, set-wise constructions, and linear referencing. Every block runs on the optional sf package.
spatial_polygonize() builds the polygonal faces enclosed
by a line network, the inverse of taking polygon boundaries. A group’s
lines are unioned and noded, then the faces of that arrangement are
returned, one per row.
grid <- st_sfc(
st_linestring(rbind(c(0, 0), c(2, 0))),
st_linestring(rbind(c(0, 1), c(2, 1))),
st_linestring(rbind(c(0, 2), c(2, 2))),
st_linestring(rbind(c(0, 0), c(0, 2))),
st_linestring(rbind(c(1, 0), c(1, 2))),
st_linestring(rbind(c(2, 0), c(2, 2))))
f <- tempfile(fileext = ".vtr")
write_vtr(data.frame(geometry = st_as_binary(grid, hex = TRUE)), f)
tbl(f) |> spatial_polygonize() |> collect_sf()
#> Simple feature collection with 4 features and 0 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 0 ymin: 0 xmax: 2 ymax: 2
#> CRS: NA
#> geometry
#> 1 POLYGON ((1 0, 0 0, 0 1, 1 ...
#> 2 POLYGON ((2 0, 1 0, 1 1, 2 ...
#> 3 POLYGON ((1 1, 0 1, 0 2, 1 ...
#> 4 POLYGON ((2 1, 1 1, 1 2, 2 ...spatial_line_merge() is the line counterpart of a
dissolve: it sews segments that meet end to end into maximal
linestrings, one row per chain. Segments meeting at a junction of degree
greater than two stay separate.
seg <- st_sfc(
st_linestring(rbind(c(0, 0), c(1, 0))),
st_linestring(rbind(c(1, 0), c(2, 0))),
st_linestring(rbind(c(2, 0), c(3, 0))))
g <- tempfile(fileext = ".vtr")
write_vtr(data.frame(geometry = st_as_binary(seg, hex = TRUE)), g)
tbl(g) |> spatial_line_merge() |> collect_sf()
#> Simple feature collection with 1 feature and 0 fields
#> Geometry type: LINESTRING
#> Dimension: XY
#> Bounding box: xmin: 0 ymin: 0 xmax: 3 ymax: 0
#> CRS: NA
#> geometry
#> 1 LINESTRING (0 0, 1 0, 2 0, ...Real coverages arrive with jittered shared borders, sub-pixel slivers, and more detail than a map needs. Three verbs clean them while keeping adjacent polygons edge-matched.
spatial_snap_grid() rounds coordinates to a regular grid
and repairs the result, one batch at a time. It is the fixed-precision
snap-rounding that spatial_overlay() applies internally,
exposed as a standalone verb, so a layer can be pre-noded to a common
precision without running a full overlay.
p <- st_polygon(list(rbind(c(0.04, 0.03), c(1.02, 0.01),
c(0.98, 1.03), c(0.01, 0.97), c(0.04, 0.03))))
h <- tempfile(fileext = ".vtr")
write_vtr(data.frame(id = 1L, geometry = st_as_binary(st_sfc(p), hex = TRUE)), h)
tbl(h) |> spatial_snap_grid(0.1) |> collect_sf()
#> Simple feature collection with 1 feature and 1 field
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 0 ymin: 0 xmax: 1 ymax: 1
#> CRS: NA
#> id geometry
#> 1 1 POLYGON ((0 1, 1 1, 1 0, 0 ...spatial_snap() instead pulls a streamed layer onto a
resident reference layer: vertices and edges within
tolerance of y are moved onto it, so two
layers that should share a boundary line up exactly.
ref <- st_sfc(st_linestring(rbind(c(0, 0), c(10, 0))))
line <- st_linestring(rbind(c(0, 0.2), c(5, 0.1), c(10, 0.2)))
sn <- tempfile(fileext = ".vtr")
write_vtr(data.frame(id = 1L, geometry = st_as_binary(st_sfc(line), hex = TRUE)), sn)
tbl(sn) |> spatial_snap(ref, tolerance = 0.5) |> collect_sf()
#> Simple feature collection with 1 feature and 1 field
#> Geometry type: LINESTRING
#> Dimension: XY
#> Bounding box: xmin: 0 ymin: 0 xmax: 10 ymax: 0.1
#> CRS: NA
#> id geometry
#> 1 1 LINESTRING (0 0, 5 0.1, 10 0)spatial_eliminate() cleans a polygon coverage by
absorbing every feature smaller than max_area into a
neighbour, the QGIS “Eliminate”. Each sliver joins the neighbour it
shares the longest border with, or the largest-area neighbour with
into = "largest_area". An area-rooted union-find collapses
chains of slivers so a connected run flows to its single largest member,
whose attributes survive.
big <- st_polygon(list(rbind(
c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0))))
sliver <- st_polygon(list(rbind(
c(10, 0), c(10.3, 0), c(10.3, 10), c(10, 10), c(10, 0))))
e <- tempfile(fileext = ".vtr")
write_vtr(data.frame(
id = c("keep", "sliver"),
geometry = st_as_binary(st_sfc(big, sliver), hex = TRUE)), e)
tbl(e) |> spatial_eliminate(max_area = 5) |> collect_sf()
#> Simple feature collection with 1 feature and 1 field
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 0 ymin: 0 xmax: 10.3 ymax: 10
#> CRS: NA
#> id geometry
#> 1 keep POLYGON ((0 0, 0 10, 10 10,...The thin sliver is gone, absorbed into the square it bordered, and
the survivor keeps the keep attributes.
spatial_simplify() simplifies a polygon coverage without
tearing shared edges. Boundaries are unioned so a shared border is one
line, noded into arcs, each arc simplified once with its junction
endpoints pinned, then re-polygonized. Adjacent polygons stay
edge-matched with no slivers. This is the topology-preserving
simplification a per-feature
spatial_map(~ sf::st_simplify()) cannot give, because that
simplifies each polygon’s copy of a shared border independently.
p1 <- st_polygon(list(rbind(
c(0, 0), c(1, 0), c(1, 0.5), c(1, 1), c(0, 1), c(0, 0))))
p2 <- st_polygon(list(rbind(
c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0.5), c(1, 0))))
s <- tempfile(fileext = ".vtr")
write_vtr(data.frame(
id = c("a", "b"),
geometry = st_as_binary(st_sfc(p1, p2), hex = TRUE)), s)
tbl(s) |> spatial_simplify(tolerance = 0.6) |> collect_sf()
#> Simple feature collection with 2 features and 1 field
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 0 ymin: 0 xmax: 2 ymax: 1
#> CRS: NA
#> id geometry
#> 1 a POLYGON ((1 0, 0 0, 0 1, 1 ...
#> 2 b POLYGON ((1 0, 1 0.5, 1 1, ...spatial_explode() splits every multipart geometry into
its single-part components, a MULTIPOLYGON into one row per
polygon and so on, copying the source attributes onto each part. An
optional part argument names a 1-based part-index column.
It is the streaming “multipart to singleparts” tool and the inverse of
spatial_dissolve().
mp <- st_multipolygon(list(
list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))),
list(rbind(c(2, 2), c(3, 2), c(3, 3), c(2, 3), c(2, 2)))))
m <- tempfile(fileext = ".vtr")
write_vtr(data.frame(
id = 1L, geometry = st_as_binary(st_sfc(mp), hex = TRUE)), m)
tbl(m) |> spatial_explode(part = "part_id") |> collect_sf()
#> Simple feature collection with 2 features and 2 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 0 ymin: 0 xmax: 3 ymax: 3
#> CRS: NA
#> id part_id geometry
#> 1 1 1 POLYGON ((0 0, 1 0, 1 1, 0 ...
#> 2 1 2 POLYGON ((2 2, 3 2, 3 3, 2 ...spatial_topology() decomposes a polygon coverage into
the arcs of its planar topology. The unioned boundaries are noded so a
shared border becomes one arc, tagged with the identifiers of the (up to
two) polygons on either side: two for an internal shared edge, one for
an outer edge. It is the inverse of
spatial_polygonize().
q1 <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
q2 <- st_polygon(list(rbind(c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0))))
tp <- tempfile(fileext = ".vtr")
write_vtr(data.frame(
id = c("a", "b"),
geometry = st_as_binary(st_sfc(q1, q2), hex = TRUE)), tp)
tbl(tp) |> spatial_topology(id = "id") |> collect()
#> face1 face2
#> 1 a <NA>
#> 2 a b
#> 3 a <NA>
#> 4 b <NA>
#> geometry
#> 1 01020000000200000000000000000000000000000000000000000000000000f03f0000000000000000
#> 2 010200000002000000000000000000f03f0000000000000000000000000000f03f000000000000f03f
#> 3 010200000003000000000000000000f03f000000000000f03f0000000000000000000000000000f03f00000000000000000000000000000000
#> 4 010200000004000000000000000000f03f0000000000000000000000000000004000000000000000000000000000000040000000000000f03f000000000000f03f000000000000f03fThe shared edge between a and b appears
once, with both face columns filled; each outer edge appears once with
the second face NA.
spatial_centerline() traces the centerline (medial axis)
of each polygon from the Voronoi diagram of its densified boundary: the
Voronoi edges that fall inside the polygon are its skeleton, merged into
maximal lines. density sets the boundary sampling and
prune drops the short spurs the skeleton grows toward
convex corners. It is the usual approximation for a river or road
centerline from a filled shape; non-polygon geometry passes through
unchanged.
strip <- st_polygon(list(rbind(
c(0, 0), c(10, 0), c(10, 2), c(0, 2), c(0, 0))))
cl <- tempfile(fileext = ".vtr")
write_vtr(data.frame(geometry = st_as_binary(st_sfc(strip), hex = TRUE)), cl)
tbl(cl) |> spatial_centerline(density = 0.25, prune = 0.5) |> collect_sf()
#> Simple feature collection with 5 features and 0 fields
#> Geometry type: LINESTRING
#> Dimension: XY
#> Bounding box: xmin: 0.125 ymin: 0.125 xmax: 9.875 ymax: 1.875
#> CRS: NA
#> geometry
#> 1 LINESTRING (1 1, 0.875 0.87...
#> 2 LINESTRING (0.125 1.875, 0....
#> 3 LINESTRING (1 1, 1.125 1, 1...
#> 4 LINESTRING (9.875 1.875, 9....
#> 5 LINESTRING (9 1, 9.125 0.87...spatial_construct() builds a geometry construction from
a whole feature set, the constructions a per-feature
spatial_map() cannot express. A kind argument
selects it: "convex_hull", "concave_hull",
"envelope", "oriented_box",
"enclosing_circle", "inscribed_circle",
"pole" (the pole of inaccessibility, the centre of the
maximum inscribed circle), "voronoi", and
"delaunay". A by argument builds one
construction per group; the enclosing kinds emit one feature per group,
the tessellations one polygon per cell.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc$band <- nc$SID74 > 5
n <- tempfile(fileext = ".vtr")
write_vtr(data.frame(
band = nc$band,
geometry = st_as_binary(st_geometry(nc), hex = TRUE)), n)
tbl(n) |>
spatial_construct("convex_hull", by = "band", crs = st_crs(nc)) |>
collect_sf()
#> Simple feature collection with 2 features and 1 field
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
#> Geodetic CRS: NAD27
#> band geometry
#> 1 FALSE POLYGON ((-84.32385 34.9890...
#> 2 TRUE POLYGON ((-82.88111 35.6735...A tessellation kind returns one polygon per cell, each carrying the
group’s by values:
xy <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0.4, 0.6))
cloud <- st_sfc(lapply(seq_len(nrow(xy)), function(i) st_point(xy[i, ])))
v <- tempfile(fileext = ".vtr")
write_vtr(data.frame(geometry = st_as_binary(cloud, hex = TRUE)), v)
tbl(v) |> spatial_construct("voronoi") |> collect_sf()
#> Simple feature collection with 5 features and 0 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: -1 ymin: -1 xmax: 2 ymax: 2
#> CRS: NA
#> geometry
#> 1 POLYGON ((2 2, 2 0.5, 0.9 0...
#> 2 POLYGON ((-1 2, 0.5 2, 0.5 ...
#> 3 POLYGON ((-1 -1, -1 0.5, -0...
#> 4 POLYGON ((0.5 1.1, 0.9 0.5,...
#> 5 POLYGON ((2 -1, 0.5 -1, 0.5...spatial_locate() places streamed points along a resident
line layer: each point gets its nearest line’s identifier, the measure
(distance along that line to the point’s projection), and the
perpendicular offset, with an optional snap onto the line.
The inverse, a measure back to a point, is
sf::st_line_interpolate() through
spatial_map().
roads <- st_sf(road = c("main", "side"), geometry = st_sfc(
st_linestring(rbind(c(0, 0), c(10, 0))),
st_linestring(rbind(c(0, 5), c(0, 15)))))
pts <- data.frame(id = 1:2, x = c(3, 1), y = c(1, 9))
l <- tempfile(fileext = ".vtr")
write_vtr(pts, l)
tbl(l) |>
spatial_locate(roads, coords = c("x", "y"), y_id = "road") |>
collect()
#> id x y line measure distance geometry
#> 1 1 3 1 main 3 1 01010000000000000000000840000000000000f03f
#> 2 2 1 9 side 4 1 0101000000000000000000f03f0000000000002240Point 1 is one unit off main at measure 3; point 2 is
one unit off side at measure 4. With
snap = TRUE each point’s geometry is replaced by its
projection onto the matched line.