Air Quality, Open Data, and You

4 minute read #airquality #opendata #openaq

In which I introduce OpenAQ, the open data platform for air quality, and talk about how I got involved with the project. Finally, I’ll try to persuade you to start contributing to OpenAQ.

What’s OpenAQ? #

OpenAQ is a non-profit promoting and providing open data for air quality.

As you might know, air pollution is quite deadly and harmful:

  1. It kills 8.8 million people a year (study from 2019)
  2. It causes heart disease and lung disease, increases cancer risk, and more.

Open data makes it much easier to study the impact of air pollution.

Many countries publish their data, but not in an easily accessible way. Each country usually maintains its own sensor stations and its own data access methods.

Think of a mixture of HTML tables, TXT files, XML feeds, etc.

Can you imagine countries not using a standardized data format?

OpenAQ’s goal is to provide a single API to access the available data.

It takes care of several things:

  1. Maintain dedicated data adapters for each data source.
  2. Collect research-grade raw metric data.
  3. Provide public access to the data.

In its simplest form, OpenAQ is a website where you can download your hometown’s air quality data.

It’s also more than that.

The Ulaanbaatar Workshop #

How did I get involved with OpenAQ?

In 2015, I came across an announcement of a workshop about an open source air quality project. I registered immediately.

My interest in the topic was quite simple: Air pollution was a recurring problem for Ulaanbaatarers. It affected everyone who lived in the city. It was personal.

Ulaanbaatar smog in 2017

After the workshop I started to contribute to the projects. It started very small: a documentation update at first. Then some updates to the API, and new data source adapters.

OK, enough history, let’s talk about how OpenAQ works.

What kind of data? #

OpenAQ keeps track of several common air pollutants. These are:

  1. PM2.5 - Particulate matter less than 2.5 micrometers in diameter
  2. PM10 - Particulate matter less than 10 micrometers in diameter
  3. O₃ - Ozone
  4. CO - Carbon Monoxide
  5. NO₂ - Nitrogen Dioxide
  6. SO₂ - Sulfur Dioxide
  7. BC - Black Carbon

Platform Architecture #

It starts with the fetch job. The fetch job triggers parallel tasks for each data source. It reads from the sources and parses current data points. This happens every 10 minutes.
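The fan-out described above can be sketched in a few lines. This is only an illustration, not the actual openaq-fetch code (which is written in Node): the adapter functions, the `ADAPTERS` table, and the shape of a measurement record are all hypothetical, and the 10-minute schedule is left to an external scheduler such as cron.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical adapters: each one returns the current data points of one source.
# A real adapter would download and parse an HTML table, a TXT file, an XML feed, etc.
def fetch_source_a():
    return [{"location": "Station A", "parameter": "pm25", "value": 12.0}]


def fetch_source_b():
    return [{"location": "Station B", "parameter": "o3", "value": 0.03}]


def broken_source():
    raise RuntimeError("source is down")


ADAPTERS = {
    "source_a": fetch_source_a,
    "source_b": fetch_source_b,
    "broken_source": broken_source,
}


def run_fetch_job(adapters):
    """Run every adapter in parallel and collect measurements and failures."""
    measurements, failures = [], {}
    with ThreadPoolExecutor(max_workers=len(adapters) or 1) as pool:
        # One task per data source, all submitted before any result is awaited.
        futures = {name: pool.submit(adapter) for name, adapter in adapters.items()}
        for name, future in futures.items():
            try:
                measurements.extend(future.result())
            except Exception as error:
                # A failing source is recorded but does not stop the others.
                failures[name] = str(error)
    return measurements, failures


if __name__ == "__main__":
    measurements, failures = run_fetch_job(ADAPTERS)
    print(len(measurements), "measurements,", len(failures), "failed sources")
```

Isolating failures per source matters here because every data source has its own format and its own uptime. One broken feed should not block the rest of the run.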

OpenAQ system overview

The extracted data is then saved to a public S3 bucket and a Postgres database.

An HTTP API serves user requests by reading from the database.

That’s basically it, but there are extra moving parts of course.

Data Access Methods #

There are several ways to access data from OpenAQ.


The API has endpoints like measurements and locations. This is useful if you want some current data for a given city or a geolocation.
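As a rough sketch, a request for recent PM2.5 measurements in a city could be built like this. The base URL, the version prefix, and the parameter names (`city`, `parameter`, `limit`) are assumptions for illustration; check the current OpenAQ API documentation before relying on any of them.

```python
from urllib.parse import urlencode

# Assumption: the base URL and version prefix may differ from the live API.
API_BASE = "https://api.openaq.org/v1"


def build_url(endpoint, **params):
    """Build a request URL for an endpoint such as "measurements" or "locations"."""
    # Drop unset parameters and sort the rest so the URL is stable.
    pairs = sorted((key, value) for key, value in params.items() if value is not None)
    query = urlencode(pairs)
    return f"{API_BASE}/{endpoint}?{query}" if query else f"{API_BASE}/{endpoint}"


if __name__ == "__main__":
    # Hypothetical parameter names; the real API may use different ones.
    print(build_url("measurements", city="Ulaanbaatar", parameter="pm25", limit=5))
```

The resulting URL can then be fetched with any HTTP client, for example `urllib.request.urlopen`, and the response decoded with `json.load`.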

There are several wrapper libraries created by the community:

AWS S3 #

As I said before, the raw metrics are dumped into an S3 bucket, and the bucket is publicly available. You can do anything you want with the data.

You can use tools like awscli to automate the process.

AWS Athena #

With Athena, you can query the S3 bucket with SQL! Yes, it’s good old SQL, but you get to run it over the whole collection of JSON files. If you want specialized results, this might be helpful.

Open source projects #

There are two main projects:

  1. openaq-api - the API, implemented with Hapi.
  2. openaq-fetch - the fetch tool, implemented in Node.

The projects are stable, but any project needs maintenance and improvements.

On Contributing #

Are you interested in helping out? Super!

Contributing comes in many shapes and forms:

Asking Questions #

When it comes to contributing, nothing is too small. People say, start with updating documentation. I would say, start by asking questions.

By asking questions and starting discussions, you become part of the community. Your question might bring to light something that was confusing to a lot of people.

Where to Start #

If you’re interested, come to OpenAQ’s Slack and introduce yourself!

Learn More #

Some highlights from the OpenAQ blog, and other interesting pages: