The geofi package provides tools to access spatial data
from Statistics Finland’s OGC API, including
administrative boundaries, population data by administrative units, and
population data by statistical grid cells. This vignette demonstrates
how to use the package’s core functions to retrieve and visualize each of these datasets.
The package handles pagination, spatial filtering, and coordinate
reference system (CRS) transformations, delivering data as
sf objects compatible with the sf package for
spatial analysis and visualization.
The geofi package includes the following key functions
for accessing Statistics Finland data:

- ogc_get_statfi_area(): Retrieves administrative area polygons (e.g., municipalities, wellbeing areas) for specified years and scales.
- ogc_get_statfi_area_pop(): Fetches administrative area polygons with associated population data, pivoted into a wide format.
- ogc_get_statfi_statistical_grid(): Retrieves population data for statistical grid cells at different resolutions (1 km or 5 km).

All functions return spatial data as sf objects, making
it easy to integrate with spatial analysis workflows in R.
The ogc_get_statfi_area() function retrieves polygons
for Finnish administrative units, such as municipalities
(kunta), wellbeing areas (hyvinvointialue), or
regions (maakunta). You can customize the output with
parameters like:
- year: The year of the boundaries (2020–2022).
- scale: Map resolution (1:1,000,000 or 1:4,500,000).
- tessellation: Type of administrative unit (e.g., kunta, hyvinvointialue).
- crs: Coordinate reference system (EPSG:3067 or EPSG:4326).
- limit: Maximum number of features (or NULL for all).
- bbox: Bounding box for spatial filtering.

Fetch all municipalities for 2022 at the 1:4,500,000 scale:
Visualize the municipalities using ggplot2:
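The fetch and plot steps described above might look roughly like the sketch below. It mirrors the argument values used elsewhere in this vignette (scale = 4500 for the 1:4,500,000 scale) and assumes that limit and bbox can be left at their defaults; the plot title and styling are illustrative choices, not taken from the package.

```r
library(geofi)
library(ggplot2)

# Fetch all municipalities for 2022 at the 1:4,500,000 scale.
# Assumption: limit and bbox are optional and default to "no restriction".
muni <- ogc_get_statfi_area(
  year = 2022,
  scale = 4500,
  tessellation = "kunta",
  crs = 3067
)

# According to this vignette, NULL is returned (with a warning)
# when no data is retrieved, so check before plotting.
if (!is.null(muni)) {
  p <- ggplot(muni) +
    geom_sf() +
    theme_minimal() +
    labs(title = "Finnish municipalities (2022)")
  print(p)
}
```

Checking for NULL first keeps the plotting code from failing with a less helpful error when the request returns nothing.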
To retrieve municipalities within a specific area (e.g., southern Finland), use the bbox parameter.
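library(geofi)

# Bounding box as "xmin,ymin,xmax,ymax" in EPSG:4326 (longitude, latitude)
bbox_finland_south <- "18.797607,59.573288,30.476074,61.695082"

# Municipalities in the bounding box, returned in EPSG:3067
muni_south <- ogc_get_statfi_area(
  year = 2022,
  scale = 4500,
  tessellation = "kunta",
  bbox = bbox_finland_south,
  crs = 3067
)

Visualize the filtered results: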
The ogc_get_statfi_area_pop() function fetches
administrative area polygons with associated population data, pivoted
into a wide format where each population variable is a column.
Parameters include:
- year: The year of the data (2019–2021).
- crs: Coordinate reference system (EPSG:3067 or EPSG:4326).
- limit: Maximum number of features (or NULL for all).
- bbox: Bounding box for spatial filtering.

Retrieve population data for 2021.
By default, the function returns all available regional breakdowns,
and it is the user's task to filter for the breakdown of interest.
At the moment, this can be done using regular expressions on the prefix
of the variable areaStatisticalUnit_inspireId_localId.
The following prefixes are available:
"avi", "ely", "kunta", "maakunta", "seutukunta", "suuralue"
Visualize the share of the female population at the municipality
(kunta) level.
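A possible version of this step is sketched below. It first checks the prefix pattern against the prefix list given above, because "maakunta" and "seutukunta" also contain "kunta" and only the anchored pattern "^kunta" separates them. It then fetches the 2021 data and maps female_percentage. The object names (pop_data, pop_kunta) and the plot styling are illustrative, and the call assumes limit and bbox can be left at their defaults.

```r
library(geofi)
library(ggplot2)

# The anchor "^" matters: without it, "maakunta" and "seutukunta" would match too.
prefixes <- c("avi", "ely", "kunta", "maakunta", "seutukunta", "suuralue")
prefixes[grepl("^kunta", prefixes)]

# Fetch 2021 population data for all regional breakdowns.
pop_data <- ogc_get_statfi_area_pop(year = 2021, crs = 3067)

if (!is.null(pop_data)) {
  # Keep only rows whose identifier starts with "kunta" (municipalities).
  is_kunta <- grepl("^kunta", pop_data$areaStatisticalUnit_inspireId_localId)
  pop_kunta <- pop_data[is_kunta, ]

  p <- ggplot(pop_kunta) +
    geom_sf(aes(fill = female_percentage)) +
    scale_fill_viridis_c(option = "plasma") +
    theme_minimal() +
    labs(fill = "share of females (%)")
  print(p)
}
```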
Fetch population data within a bounding box:
library(dplyr)
library(ggplot2)

# Fetch 2021 population data inside the bounding box, returned in EPSG:4326
pop_south <- ogc_get_statfi_area_pop(year = 2021, bbox = bbox_finland_south, crs = 4326)

# Keep only municipality (kunta) rows and map the share of females
ggplot(data = pop_south |> filter(grepl("^kunta", areaStatisticalUnit_inspireId_localId))) +
  geom_sf(aes(fill = female_percentage)) +
  scale_fill_viridis_c(option = "plasma") +
  theme_minimal() +
  labs(title = "Population by Administrative Area (2021)", fill = "share of females (%)")

The ogc_get_statfi_statistical_grid() function retrieves
population data for statistical grid cells at 1 km or 5 km resolution.
Data is returned in EPSG:3067 (ETRS89 / TM35FIN). Parameters
include:
- year: The year of the data (2019–2021).
- resolution: Grid cell size (1000 m or 5000 m).
- limit: Maximum number of features (or NULL for all).
- bbox: Bounding box for spatial filtering.

Retrieve population data for a 5 km grid in 2021:
grid_data <- ogc_get_statfi_statistical_grid(
  year = 2021,
  resolution = 5000,
  bbox = bbox_finland_south
)

Visualize the grid data:
When limit = NULL, the
fetch_ogc_api_statfi() function automatically paginates
through large datasets, fetching up to 10,000 features per request. This
ensures all available data is retrieved, even for large administrative
or grid datasets.
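The general idea of this kind of pagination can be illustrated with a small offline sketch. It does not reproduce the internals of fetch_ogc_api_statfi(): fetch_page() and fetch_all() are hypothetical names, the "server" is a plain vector of 25 made-up features, and the page size is 10 rather than 10,000 so the loop is easy to follow.

```r
# Mock "server" holding 25 features (hypothetical stand-in for the API).
all_features <- seq_len(25)

# Return at most `page_size` features, starting after `offset`.
fetch_page <- function(offset, page_size) {
  idx <- seq_along(all_features)
  all_features[idx > offset & idx <= offset + page_size]
}

# Request page after page until a page comes back short or empty.
fetch_all <- function(page_size = 10) {
  pages <- list()
  offset <- 0
  repeat {
    page <- fetch_page(offset, page_size)
    pages[[length(pages) + 1]] <- page
    # Fewer features than requested means nothing is left to fetch.
    if (length(page) < page_size) break
    offset <- offset + page_size
  }
  unlist(pages)
}

result <- fetch_all(page_size = 10)
length(result)  # 25
```

Here three requests are made (10, 10, and 5 features), and the short third page ends the loop.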
The package includes robust error handling: the functions return
NULL with a warning if no data is retrieved,
helping users diagnose issues.

The functions support two CRS options: EPSG:3067 (ETRS89 / TM35FIN) and EPSG:4326 (WGS 84).
Note that ogc_get_statfi_statistical_grid() is fixed to
EPSG:3067, as per the API’s design.
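When data that arrives in EPSG:3067 is needed in EPSG:4326, it can be reprojected afterwards with sf::st_transform(). The sketch below uses a single made-up point in southern Finland instead of real grid data, so it runs without calling the API; the same call applies to any sf object.

```r
library(sf)

# A made-up point in EPSG:3067 (easting, northing in metres), southern Finland.
pt_3067 <- st_sfc(st_point(c(385000, 6672000)), crs = 3067)

# Reproject to EPSG:4326 (longitude, latitude in degrees).
pt_4326 <- st_transform(pt_3067, 4326)

st_crs(pt_4326) == st_crs(4326)
st_coordinates(pt_4326)
```

For a full dataset, the equivalent call would be st_transform(grid_data, 4326), where grid_data is an object returned by ogc_get_statfi_statistical_grid().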
The bbox parameter allows spatial filtering to focus on
specific regions. A bounding box given in EPSG:4326 works with
both crs = 4326 and crs = 3067. A bounding box given in EPSG:3067 requires
the crs argument to be set to 3067 as well. Example format:
"18.797607,59.573288,30.476074,61.695082".
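A bounding box string in this format can also be built from spatial objects rather than typed by hand. The sketch below is one way to do that with sf; bbox_to_string() is a hypothetical helper, not a geofi function. The EPSG:3067 variant is only an approximation, since the box around the two reprojected corners is not exactly the reprojected rectangle.

```r
library(sf)

# Hypothetical helper (not part of geofi): format the bounding box of an
# sf/sfc object as "xmin,ymin,xmax,ymax".
bbox_to_string <- function(x) {
  paste(as.numeric(sf::st_bbox(x)), collapse = ",")
}

# Two corner points in EPSG:4326 (longitude, latitude).
corners_4326 <- st_sfc(
  st_point(c(18.797607, 59.573288)),
  st_point(c(30.476074, 61.695082)),
  crs = 4326
)

bbox_4326 <- bbox_to_string(corners_4326)
bbox_4326  # "18.797607,59.573288,30.476074,61.695082"

# Approximate equivalent in EPSG:3067, intended for use together with crs = 3067.
bbox_3067 <- bbox_to_string(st_transform(corners_4326, 3067))
bbox_3067
```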
- Use limit or bbox to estimate runtime before fetching all features.
- Use EPSG:3067 for Finnish data unless you need EPSG:4326 for compatibility with other systems.
- Check the available tessellation options (kunta, hyvinvointialue, etc.) when using ogc_get_statfi_area().
- Data from ogc_get_statfi_area_pop() and ogc_get_statfi_statistical_grid() is pivoted into wide format. Check column names to identify available variables.

The geofi package simplifies access to Statistics
Finland’s spatial and population data, enabling analyses of
administrative boundaries, population distributions, and grid-based
statistics. With no API key required, users can quickly retrieve and
visualize data using sf and ggplot2. Try the
examples above to explore Finland’s spatial and demographic
datasets!