ibger: Access the IBGE Aggregate Data API from R

Tidyverse-friendly interface to the IBGE Aggregate Data API (version 3), which powers SIDRA, the automatic data retrieval system for all surveys and censuses conducted by IBGE (Brazilian Institute of Geography and Statistics).

Installation

# install.packages("remotes")
remotes::install_github("StrategicProjects/ibger")

Note: if you encounter the error "curl_modify_url is not an exported object", update the curl package with install.packages("curl"). ibger requires curl version 6.0.0 or higher.

Quick start

library(ibger)

# Browse available aggregates
ibge_aggregates()

# Inspect an aggregate (full metadata)
meta <- ibge_metadata(7060)
meta
meta$variables
meta$classifications

# Available periods and localities
ibge_periods(7060)
ibge_localities(7060, level = "N6")

# Get data — the main function
ibge_variables(7060, localities = "BR")

Automatic validation

ibge_variables() and ibge_localities() validate parameters against the aggregate metadata before querying. Invalid parameters stop execution with a clear error showing the allowed values:

ibge_variables(7060, localities = "N3")
#> Error:
#> ! Geographic level(s) "N3" not available for aggregate 7060.
#> ℹ Available levels: "N1", "N6", and "N7".
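The fail-fast idea behind this check can be sketched in base R. This is only an illustration of the pattern, not ibger's internal code: the helper check_levels() is hypothetical, and the list of available levels is copied from the error message above rather than fetched from the API.

```r
# Hypothetical pre-flight check (not part of ibger): compare the requested
# geographic levels with the levels listed in the aggregate metadata and
# stop before any data request is made.
check_levels <- function(requested, available, aggregate) {
  bad <- setdiff(requested, available)
  if (length(bad) > 0) {
    stop(
      sprintf(
        "Geographic level(s) %s not available for aggregate %s. Available levels: %s.",
        paste(sprintf('"%s"', bad), collapse = ", "),
        aggregate,
        paste(sprintf('"%s"', available), collapse = ", ")
      ),
      call. = FALSE
    )
  }
  invisible(requested)
}

# Levels taken from the error message shown above (assumed, not fetched).
available <- c("N1", "N6", "N7")

check_levels("N6", available, 7060)  # valid level: passes silently

# An invalid level stops with a message that lists the allowed values.
result <- tryCatch(
  check_levels("N3", available, 7060),
  error = function(e) conditionMessage(e)
)
```

Wrapping a call in tryCatch(), as in the last lines, is one way to handle such an error in a script instead of letting it stop execution.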

Examples

# Monthly IPCA in Brazil, last 12 months
ibge_variables(7060, variable = 63, periods = -12, localities = "BR")

# IPCA by product group for specific municipalities
ibge_variables(
  aggregate      = 7060,
  variable       = 63,
  periods        = -6,
  localities     = list(N6 = c(1200401, 2800308)),
  classification = list("315" = c(7169, 7170, 7445))
)

Value column and IBGE conventions

In accordance with official IBGE data standards, the value column returned by the API may be of type character rather than numeric.

This occurs because IBGE uses special symbols to represent specific data conditions, which are part of the statistical dissemination standard and should not be treated as errors.

The possible values are:

Value	Meaning
`-`	Numeric zero (not resulting from rounding)
`..`	Not applicable
`...`	Data not available
`X`	Suppressed to avoid identifying individual respondents

As a consequence, users should not assume that the value column is always numeric. When numerical analysis is required, these symbols must be handled explicitly before coercion, for example with parse_ibge_value() (see Utilities below).
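A minimal base R sketch of that handling, using the symbol table above. The helper symbols_to_numeric() is hypothetical and is not part of ibger; the package's own parse_ibge_value() may behave differently. The sketch also assumes that numeric values arrive as strings with a dot as the decimal separator.

```r
# Hypothetical helper (not part of ibger): convert a character value column
# to numeric, treating the IBGE symbols explicitly.
symbols_to_numeric <- function(x) {
  x <- trimws(as.character(x))
  out <- rep(NA_real_, length(x))

  is_zero    <- !is.na(x) & x == "-"                  # "-" means numeric zero
  is_missing <- is.na(x) | x %in% c("..", "...", "X") # no usable number

  out[is_zero] <- 0
  is_number <- !is_zero & !is_missing
  # Anything else is parsed as a number; unparseable strings become NA.
  out[is_number] <- suppressWarnings(as.numeric(x[is_number]))
  out
}

# Illustrative values only, not real API output.
values  <- c("4.62", "-", "..", "...", "X", "0.35")
numeric_values <- symbols_to_numeric(values)
```

Mapping `..`, `...`, and `X` to NA loses the distinction between "not applicable", "not available", and "suppressed". If that distinction matters for the analysis, keep the original character column alongside the numeric one.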

Design

Tidyverse native: all functions return tidy tibbles
snake_case: function and column names
Natural interface: classifications as named lists, flexible localities
Validation: checks metadata before querying, stops with clear errors
Caching: metadata cached in memory to avoid duplicate calls
Feedback: clear cli messages at each step
Robust: automatic retry via httr2
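
The caching idea can be illustrated with a small environment-based store. This is a sketch under assumptions, not ibger's implementation: cached_fetch(), clear_cache(), and slow_fetch() are hypothetical names, and slow_fetch() stands in for a real metadata request.

```r
# Hypothetical in-memory cache: an environment keyed by aggregate ID.
.cache <- new.env(parent = emptyenv())

# Return the cached value for `key`, calling `fetch()` only on a cache miss.
cached_fetch <- function(key, fetch) {
  key <- as.character(key)
  if (!exists(key, envir = .cache, inherits = FALSE)) {
    assign(key, fetch(), envir = .cache)
  }
  get(key, envir = .cache, inherits = FALSE)
}

# Drop everything stored so far.
clear_cache <- function() {
  rm(list = ls(.cache, all.names = TRUE), envir = .cache)
  invisible(NULL)
}

# Stand-in for a slow metadata request; counts how often it actually runs.
calls <- 0
slow_fetch <- function() {
  calls <<- calls + 1
  list(id = 7060)
}

first  <- cached_fetch(7060, slow_fetch)  # runs slow_fetch()
second <- cached_fetch(7060, slow_fetch)  # served from the cache
calls_after_two_lookups <- calls
```

A cache like this lives only for the current R session, which matches the "in memory" description above. In ibger itself, ibge_clear_cache() is the function listed for clearing the metadata cache.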

Functions

Aggregates API (data retrieval)

Function	Description
`ibge_aggregates()`	List aggregates grouped by survey
`ibge_metadata()`	Full metadata for an aggregate
`ibge_periods()`	Available periods
`ibge_localities()`	Localities by level (with validation)
`ibge_variables()`	Get data (with validation)
`ibge_subjects()`	Look up IBGE subject (theme) codes

Survey catalog (Metadata API)

Function	Description
`ibge_surveys()`	List all IBGE surveys with status and category
`ibge_survey_periods()`	Available periods for a survey’s metadata
`ibge_survey_metadata()`	Institutional/methodological metadata for a survey

Utilities

Function	Description
`parse_sidra_url()`	Parse SIDRA URL into ibger parameters
`fetch_sidra_url()`	Fetch data directly from a SIDRA URL
`parse_ibge_value()`	Convert value column to numeric
`ibge_clear_cache()`	Clear metadata cache

Learn more

vignette("getting-started") — full walkthrough
vignette("api-concepts") — understanding the IBGE API data model
vignette("ipca-example") — real-world IPCA inflation analysis
vignette("tutorial") — tracking state GDP components with IBGE data

Interactive Aggregate Explorer

The function ibge_explorer() launches an interactive Shiny application that allows you to browse, filter, and export the full catalog of IBGE aggregates available via the API.

It is designed to make exploration easier before calling functions such as ibge_metadata(), ibge_variables(), or ibge_aggregates().

✨ Features

📊 Summary value boxes with total counts
🔍 Global and column-level table filtering
🧭 Filter by survey
📥 CSV download of filtered results
🆔 Click a row to display the corresponding ibge_metadata() call

🚀 Usage

# Open in RStudio Viewer (default behavior)
ibge_explorer()

# Open in your system browser
ibge_explorer(launch.browser = TRUE)

📦 Dependencies

The explorer requires the following packages:

shiny
DT
bslib
bsicons

If any of them are not installed, ibge_explorer() will display a friendly CLI error message.
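
To check for these packages yourself before launching the explorer, a small base R sketch follows. The helper missing_packages() is hypothetical and is not part of ibger.

```r
# Packages the explorer needs, as listed above.
explorer_deps <- c("shiny", "DT", "bslib", "bsicons")

# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

missing_deps <- missing_packages(explorer_deps)
if (length(missing_deps) > 0) {
  message("Install before launching: ", paste(missing_deps, collapse = ", "))
}
```

Any packages reported as missing can then be installed with install.packages(missing_deps).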

💡 When to Use It?

Use ibge_explorer() when you:

Don’t remember an aggregate ID
Want to search by survey name
Need a quick way to copy the correct ibge_metadata() call
Prefer a visual workflow before writing code

Comparison with other packages

Several R packages provide access to IBGE data. Here is how ibger differs:

Feature	ibger	sidrar	PNADcIBGE / SIPDIBGE
Data source	IBGE Aggregates API (v3)	SIDRA API (`apisidra`)	IBGE microdata (FTP/download)
API base URL	`servicodados.ibge.gov.br`	`apisidra.ibge.gov.br`	—
Row limit per request	100,000	20,000	—
Output	Tibbles (tidy, long format)	data.frames (wide by default)	survey design objects (`survey`)
Parameter format	Named R lists	Positional + string codes	File paths / year + quarter
Metadata discovery	Dedicated endpoints (`/metadados`, `/periodos`, `/localidades`)	HTML scraping of `desctabapi.aspx`	—
Pre-flight validation	Checks metadata before querying	None (errors come from the API)	—
Caching	In-memory metadata + aggregates cache	None	Local file cache
HTTP stack	httr2 (automatic retry, structured errors)	httr v1 (no retry)	—
Feedback	cli progress messages	None	None

vs sidrar

sidrar is the closest alternative. Both packages access IBGE aggregate data, but they talk to different APIs hosted by IBGE:

Different endpoints: sidrar uses the legacy SIDRA API (apisidra.ibge.gov.br/values), while ibger uses the IBGE Aggregates API v3 (servicodados.ibge.gov.br/api/v3/agregados). IBGE’s own documentation describes the Aggregates API as the standardized version of the SIDRA API.
Higher row limit: the SIDRA API caps responses at 20,000 rows (sidrar’s error message: “more than 20k values”); the Aggregates API allows up to 100,000 values per request — a 5× increase.
Structured metadata: ibger queries dedicated JSON endpoints for metadata (/metadados, /periodos, /localidades/{nivel}). sidrar scrapes an HTML page (desctabapi.aspx) to discover classifications, and info_sidra() opens metadata in the web browser rather than returning it as data.
Parameter ergonomics: sidrar requires positional string codes (classific = "c315", geo = "City", geo.filter = list("State" = 50)), while ibger uses standard R objects (classification = list("315" = 7169), localities = list(N6 = 5002704)).
Pre-flight validation: ibger checks your parameters (levels, periods, variables, classifications) against the aggregate metadata before hitting the API. Invalid requests fail fast with clear error messages showing the allowed values, instead of returning an opaque API error.
Modern stack: ibger uses httr2 (automatic retry, structured error handling), cli (progress messages), and returns tibbles by default. sidrar uses httr v1 with no retry mechanism.
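
To make the parameter mapping concrete, the sketch below turns named R lists into an Aggregates API v3 query string. It is a sketch under assumptions, not ibger's actual implementation. The helpers collapse_named() and build_query_url() are hypothetical. The URL layout (/agregados/{id}/periodos/{p}/variaveis/{v} with localidades and classificacao query parameters, bracketed codes, and a `|` separator between entries) is an assumption based on the public API documentation and should be verified there.

```r
# Hypothetical helper: collapse list(N6 = c(1, 2)) into "N6[1,2]",
# joining several entries with "|".
collapse_named <- function(x) {
  parts <- vapply(
    names(x),
    function(nm) {
      codes <- format(x[[nm]], scientific = FALSE, trim = TRUE)
      sprintf("%s[%s]", nm, paste(codes, collapse = ","))
    },
    character(1)
  )
  paste(parts, collapse = "|")
}

# Hypothetical helper: assemble the request URL (assumed layout, see above).
build_query_url <- function(aggregate, variable, periods,
                            localities, classification) {
  sprintf(
    "https://servicodados.ibge.gov.br/api/v3/agregados/%s/periodos/%s/variaveis/%s?localidades=%s&classificacao=%s",
    aggregate, periods, variable,
    collapse_named(localities), collapse_named(classification)
  )
}

# Same parameters as the municipality example earlier in this README.
url <- build_query_url(
  aggregate      = 7060,
  variable       = 63,
  periods        = -6,
  localities     = list(N6 = c(1200401, 2800308)),
  classification = list("315" = c(7169, 7170, 7445))
)
```

The sketch leaves the brackets unencoded for readability. A real client would normally build the request with an HTTP library such as httr2 instead of formatting the string by hand.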

vs PNADcIBGE and SIPDIBGE

These packages serve a fundamentally different purpose. They download and process microdata (individual survey responses) from household surveys like PNAD Contínua, POF, and MUNIC, returning survey design objects suitable for complex survey analysis with the survey package.

ibger works with aggregate data — the pre-tabulated summary tables published through SIDRA. If you need individual-level records and proper survey weights, use PNADcIBGE or SIPDIBGE. If you need ready-made indicators, time series, and cross-tabulations (such as IPCA, GDP, census totals, or agricultural production by municipality), use ibger.

Disclaimer

This package is an independent, open-source project and is not affiliated with, endorsed by, or officially connected to the Instituto Brasileiro de Geografia e Estatística (IBGE) in any way.

All data retrieved through this package is sourced from the IBGE Aggregates API and the IBGE Metadata API and remains the intellectual property of IBGE. Users must comply with IBGE’s terms of use when using, publishing, or redistributing the data.

The data is provided as-is, without warranty of any kind. The package authors are not responsible for the accuracy, completeness, or timeliness of the data returned by the API. For official statistics and methodology, always refer to ibge.gov.br.

API availability, rate limits, and response formats are controlled by IBGE and may change without notice.