Tidyverse-friendly interface to the IBGE Aggregate Data API (version 3), which powers SIDRA, the automatic data retrieval system for all surveys and censuses conducted by IBGE (Brazilian Institute of Geography and Statistics).
Installation
# install.packages("remotes")
remotes::install_github("StrategicProjects/ibger")
Note: if you encounter the error "curl_modify_url is not an exported object", update the curl package with install.packages("curl"). Version 6.0.0 or higher is required.
Quick start
library(ibger)
# Browse available aggregates
ibge_aggregates()
# Inspect an aggregate (full metadata)
meta <- ibge_metadata(7060)
meta
meta$variables
meta$classifications
# Available periods and localities
ibge_periods(7060)
ibge_localities(7060, level = "N6")
# Get data — the main function
ibge_variables(7060, localities = "BR")
Automatic validation
ibge_variables() and ibge_localities() validate parameters against the aggregate metadata before querying. Invalid parameters stop execution with a clear error showing the allowed values:
ibge_variables(7060, localities = "N3")
#> Error:
#> ! Geographic level(s) "N3" not available for aggregate 7060.
#> ℹ Available levels: "N1", "N6", and "N7".
Examples
# Monthly IPCA in Brazil, last 12 months
ibge_variables(7060, variable = 63, periods = -12, localities = "BR")
# IPCA by product group for specific municipalities
ibge_variables(
  aggregate = 7060,
  variable = 63,
  periods = -6,
  localities = list(N6 = c(1200401, 2800308)),
  classification = list("315" = c(7169, 7170, 7445))
)
Value column and IBGE conventions
In accordance with official IBGE data standards, the value column returned by the API may be of type character rather than numeric.
This occurs because IBGE uses special symbols to represent specific data conditions, which are part of the statistical dissemination standard and should not be treated as errors.
The possible values are:
| Value | Meaning |
|---|---|
| - | Numeric zero (not resulting from rounding) |
| .. | Not applicable |
| ... | Data not available |
| X | Suppressed to avoid identifying individual respondents |
As a consequence, users should not assume that the value column is always numeric. When numerical analysis is required, these symbols must be handled explicitly before coercion.
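A minimal sketch of handling these symbols explicitly in base R follows. The helper name `to_numeric_value()` is hypothetical and is not part of ibger. The package's own `parse_ibge_value()` (listed under Utilities) exists for this purpose, and its exact behavior may differ from this sketch. Here `-` is mapped to zero and the other three symbols to `NA`, which is one reasonable reading of the table above:

```r
# Hypothetical helper (not part of ibger): coerce an IBGE value column to numeric.
to_numeric_value <- function(x) {
  x <- trimws(as.character(x))
  out <- rep(NA_real_, length(x))

  # "-" means a true numeric zero; "..", "..." and "X" carry no usable number.
  is_zero <- !is.na(x) & x == "-"
  is_missing <- is.na(x) | x %in% c("..", "...", "X")

  out[is_zero] <- 0
  is_number <- !is_zero & !is_missing
  out[is_number] <- as.numeric(x[is_number])
  out
}

values <- c("0.56", "-", "..", "...", "X", "12")
to_numeric_value(values)
```

Mapping the three non-zero symbols to a single `NA` loses the distinction between "not applicable", "not available", and "suppressed". If that distinction matters for the analysis, keep the original character column alongside the numeric one.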
Design
- Tidyverse native: all functions return tidy tibbles
- snake_case: function and column names
- Natural interface: classifications as named lists, flexible localities
- Validation: checks metadata before querying, stops with clear errors
- Caching: metadata cached in memory to avoid duplicate calls
- Feedback: clear cli messages at each step
- Robust: automatic retry via httr2
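The caching idea in the list above can be sketched with an environment used as an in-memory key-value store. This is an illustration only: `cache_env`, `cached_fetch()`, and `fake_fetch()` are hypothetical names, and ibger's internal cache may be implemented differently.

```r
# Hypothetical in-memory cache: an environment keyed by aggregate ID.
cache_env <- new.env(parent = emptyenv())

cached_fetch <- function(id, fetch) {
  key <- as.character(id)
  if (!exists(key, envir = cache_env, inherits = FALSE)) {
    # First request for this ID: call the (possibly slow) fetcher and store the result.
    assign(key, fetch(id), envir = cache_env)
  }
  get(key, envir = cache_env, inherits = FALSE)
}

# Stand-in for a network call; counts how often it is actually invoked.
n_calls <- 0
fake_fetch <- function(id) {
  n_calls <<- n_calls + 1
  list(id = id, name = paste("aggregate", id))
}

first <- cached_fetch(7060, fake_fetch)
second <- cached_fetch(7060, fake_fetch)  # served from the cache, no second fetch
n_calls

# Emptying the environment forces the next call to fetch again.
rm(list = ls(cache_env), envir = cache_env)
```

A cache of this kind lives only for the current R session, which matches the "in memory" wording above. Nothing is written to disk.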
Functions
Aggregates API (data retrieval)
| Function | Description |
|---|---|
| ibge_aggregates() | List aggregates grouped by survey |
| ibge_metadata() | Full metadata for an aggregate |
| ibge_periods() | Available periods |
| ibge_localities() | Localities by level (with validation) |
| ibge_variables() | Get data (with validation) |
| ibge_subjects() | Look up IBGE subject (theme) codes |
Survey catalog (Metadata API)
| Function | Description |
|---|---|
| ibge_surveys() | List all IBGE surveys with status and category |
| ibge_survey_periods() | Available periods for a survey's metadata |
| ibge_survey_metadata() | Institutional/methodological metadata for a survey |
Utilities
| Function | Description |
|---|---|
| parse_sidra_url() | Parse SIDRA URL into ibger parameters |
| fetch_sidra_url() | Fetch data directly from a SIDRA URL |
| parse_ibge_value() | Convert value column to numeric |
| ibge_clear_cache() | Clear metadata cache |
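To illustrate what parsing a SIDRA URL involves, here is a rough base R sketch. It assumes a SIDRA API URL of the form `apisidra.ibge.gov.br/values/...`, where the path after `/values/` alternates between keys and values. The helper `split_sidra_path()` and the example URL are hypothetical; this is not the implementation of `parse_sidra_url()`, and the structure that function returns may differ.

```r
# Hypothetical helper (not ibger's parse_sidra_url()): split a SIDRA API path
# into key/value pairs, assuming the path after /values/ alternates keys and values.
split_sidra_path <- function(url) {
  path <- sub("^https?://[^/]+/values/", "", url)
  parts <- strsplit(path, "/", fixed = TRUE)[[1]]
  parts <- parts[nzchar(parts)]
  if (length(parts) == 0 || length(parts) %% 2 != 0) {
    stop("Expected a non-empty, even number of path segments after /values/.",
         call. = FALSE)
  }
  keys <- parts[seq(1, length(parts), by = 2)]
  # Decode percent-encoded values such as "last%2012" one element at a time.
  values <- vapply(
    parts[seq(2, length(parts), by = 2)],
    utils::URLdecode, character(1), USE.NAMES = FALSE
  )
  stats::setNames(as.list(values), keys)
}

# Example URL written for illustration only.
params <- split_sidra_path(
  "https://apisidra.ibge.gov.br/values/t/7060/n1/all/v/63/p/last%2012"
)
str(params)
```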
Learn more
- vignette("getting-started"): full walkthrough
- vignette("api-concepts"): understanding the IBGE API data model
- vignette("ipca-example"): real-world IPCA inflation analysis
- vignette("tutorial"): tracking state GDP components with IBGE data
Interactive Aggregate Explorer
The function ibge_explorer() launches an interactive Shiny application that allows you to browse, filter, and export the full catalog of IBGE aggregates available via the API.
It is designed to make exploration easier before calling functions such as ibge_metadata(), ibge_variables(), or ibge_aggregates().
✨ Features
- 📊 Summary value boxes with total counts
- 🔍 Global and column-level table filtering
- 🧭 Filter by survey
- 📥 CSV download of filtered results
- 🆔 Click a row to display the corresponding ibge_metadata() call
🚀 Usage
# Open in RStudio Viewer (default behavior)
ibge_explorer()
# Open in your system browser
ibge_explorer(launch.browser = TRUE)
📦 Dependencies
The explorer requires the following packages:
- shiny
- DT
- bslib
- bsicons
If any of them are not installed, ibge_explorer() will display a friendly CLI error message.
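A dependency check of this kind can be sketched in base R with `requireNamespace()`. The helper `check_installed()` below is hypothetical and uses a plain `stop()` rather than cli, so the message ibger actually shows will look different.

```r
# Hypothetical helper (not ibger's code): stop with a readable message
# if any of the given packages is not installed.
check_installed <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  absent <- pkgs[!is_installed]
  if (length(absent) > 0) {
    stop(
      "Missing package(s): ", paste(absent, collapse = ", "),
      ". Install with install.packages(c(",
      paste0('"', absent, '"', collapse = ", "), ")).",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# "stats" ships with R, so this passes silently.
check_installed("stats")

# A deliberately non-existent package name triggers the error path.
msg <- tryCatch(
  check_installed("notARealPkg12345"),
  error = function(e) conditionMessage(e)
)
msg
```

For the explorer, the same check would be run on `c("shiny", "DT", "bslib", "bsicons")` before launching the app.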
💡 When to Use It?
Use ibge_explorer() when you:
- Don’t remember an aggregate ID
- Want to search by survey name
- Need a quick way to copy the correct ibge_metadata() call
- Prefer a visual workflow before writing code
Comparison with other packages
Several R packages provide access to IBGE data. Here is how ibger differs:
| Feature | ibger | sidrar | PNADcIBGE / SIPDIBGE |
|---|---|---|---|
| Data source | IBGE Aggregates API (v3) | SIDRA API (apisidra) | IBGE microdata (FTP/download) |
| API base URL | servicodados.ibge.gov.br | apisidra.ibge.gov.br | n/a |
| Row limit per request | 100,000 | 20,000 | n/a |
| Output | Tibbles (tidy, long format) | data.frames (wide by default) | survey design objects (survey) |
| Parameter format | Named R lists | Positional + string codes | File paths / year + quarter |
| Metadata discovery | Dedicated endpoints (/metadados, /periodos, /localidades) | HTML scraping of desctabapi.aspx | n/a |
| Pre-flight validation | Checks metadata before querying | None (errors come from the API) | n/a |
| Caching | In-memory metadata + aggregates cache | None | Local file cache |
| HTTP stack | httr2 (automatic retry, structured errors) | httr v1 (no retry) | n/a |
| Feedback | cli progress messages | None | None |
vs sidrar
sidrar is the closest alternative. Both packages access IBGE aggregate data, but they talk to different APIs hosted by IBGE:
- Different endpoints: sidrar uses the legacy SIDRA API (apisidra.ibge.gov.br/values), while ibger uses the IBGE Aggregates API v3 (servicodados.ibge.gov.br/api/v3/agregados). IBGE's own documentation describes the Aggregates API as the standardized version of the SIDRA API.
- Higher row limit: the SIDRA API caps responses at 20,000 rows (sidrar's error message: "more than 20k values"); the Aggregates API allows up to 100,000 values per request, a 5× increase.
- Structured metadata: ibger queries dedicated JSON endpoints for metadata (/metadados, /periodos, /localidades/{nivel}). sidrar scrapes an HTML page (desctabapi.aspx) to discover classifications, and info_sidra() opens metadata in the web browser rather than returning it as data.
- Parameter ergonomics: sidrar requires positional string codes (classific = "c315", geo = "City", geo.filter = list("State" = 50)), while ibger uses standard R objects (classification = list("315" = 7169), localities = list(N6 = 5002704)).
- Pre-flight validation: ibger checks your parameters (levels, periods, variables, classifications) against the aggregate metadata before hitting the API. Invalid requests fail fast with clear error messages showing the allowed values, instead of returning an opaque API error.
- Modern stack: ibger uses httr2 (automatic retry, structured error handling), cli (progress messages), and returns tibbles by default. sidrar uses httr v1 with no retry mechanism.
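The pre-flight validation point can be made concrete with a small base R sketch. It assumes the allowed geographic levels have already been read from the aggregate metadata. The helper `validate_levels()` is hypothetical, and ibger's internal checks (which also cover periods, variables, and classifications) may be structured differently.

```r
# Hypothetical pre-flight check (not ibger's internals): compare requested
# geographic levels against the levels listed in the aggregate metadata.
validate_levels <- function(requested, available, aggregate) {
  invalid <- setdiff(requested, available)
  if (length(invalid) > 0) {
    stop(
      "Geographic level(s) ", paste0('"', invalid, '"', collapse = ", "),
      " not available for aggregate ", aggregate, ".\n",
      "Available levels: ", paste0('"', available, '"', collapse = ", "), ".",
      call. = FALSE
    )
  }
  invisible(requested)
}

# Levels taken from the error example shown earlier for aggregate 7060.
available <- c("N1", "N6", "N7")

validate_levels("N6", available, 7060)  # valid level: returns invisibly

# Invalid level: the error is raised locally, before any request is sent.
res <- tryCatch(
  validate_levels("N3", available, 7060),
  error = function(e) conditionMessage(e)
)
res
```

Failing locally means the user sees the list of allowed values right away, without a round trip to the API.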
vs PNADcIBGE and SIPDIBGE
These packages serve a fundamentally different purpose. They download and process microdata (individual survey responses) from household surveys like PNAD Contínua, POF, and MUNIC, returning survey design objects suitable for complex survey analysis with the survey package.
ibger works with aggregate data — the pre-tabulated summary tables published through SIDRA. If you need individual-level records and proper survey weights, use PNADcIBGE or SIPDIBGE. If you need ready-made indicators, time series, and cross-tabulations (such as IPCA, GDP, census totals, or agricultural production by municipality), use ibger.
Disclaimer
This package is an independent, open-source project and is not affiliated with, endorsed by, or officially connected to the Instituto Brasileiro de Geografia e Estatística (IBGE) in any way.
All data retrieved through this package is sourced from the IBGE Aggregates API and the IBGE Metadata API and remains the intellectual property of IBGE. Users must comply with IBGE’s terms of use when using, publishing, or redistributing the data.
The data is provided as-is, without warranty of any kind. The package authors are not responsible for the accuracy, completeness, or timeliness of the data returned by the API. For official statistics and methodology, always refer to ibge.gov.br.
API availability, rate limits, and response formats are controlled by IBGE and may change without notice.