Air Quality, Open Data, and You
In which I introduce OpenAQ, the open data platform for air quality. I talk about how I got involved with the project. Finally, I’ll persuade you to start contributing to OpenAQ.
What’s OpenAQ? #
OpenAQ is a non-profit promoting and providing open data for air quality.
As you might know, air pollution is quite deadly and harmful:
- It kills an estimated 8.8 million people a year, according to a 2019 study.
- It causes heart and lung disease, increases cancer risk, and more.
To better study the impact of air pollution, open data is very helpful.
Many countries provide access to their data, but not in an accessible way. Countries usually maintain their own sensor stations and data access methods.
Think of a mixture of HTML tables, TXT files, XML feeds, etc.
Can you imagine countries not using a standardized data format?
OpenAQ’s goal is to provide a single API to access the available data.
It takes care of several things:
- Maintain dedicated data adapters for each data source.
- Collect research-grade raw metric data.
- Provide public access to the data.
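To make the adapter idea concrete, here is a minimal sketch of two adapters that turn differently formatted sources into one common record shape. The source formats, the field names, and the hard-coded unit are all assumptions made up for illustration; this is not OpenAQ's actual adapter code.

```python
import csv
import io
import xml.etree.ElementTree as ET


def parse_csv_source(text):
    """Adapter for a hypothetical source that publishes CSV."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {
            "location": row["station"],
            "parameter": row["pollutant"].lower(),
            "value": float(row["reading"]),
            "unit": "µg/m³",  # assumed unit, hard-coded for the sketch
        }
        for row in rows
    ]


def parse_xml_source(text):
    """Adapter for a hypothetical source that publishes an XML feed."""
    root = ET.fromstring(text)
    return [
        {
            "location": node.get("site"),
            "parameter": node.get("type").lower(),
            "value": float(node.text),
            "unit": "µg/m³",  # assumed unit, hard-coded for the sketch
        }
        for node in root.iter("measurement")
    ]


# Two made-up payloads in different formats.
csv_text = "station,pollutant,reading\nCentral,PM25,41.5\n"
xml_text = '<feed><measurement site="Harbor" type="PM10">18</measurement></feed>'

# Whatever the source format, every record ends up with the same keys.
records = parse_csv_source(csv_text) + parse_xml_source(xml_text)
for record in records:
    print(record)
```

Each adapter only knows about its own source, so adding a new country would mean adding one more function like these.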
In its simplest form, OpenAQ is a website where you can download your hometown’s air quality data.
It’s also more than that:
- An API that lets you query for air quality data from around the world.
- A non-profit that advocates for open air quality data, publishes air quality data availability reports, and organizes workshops around the world.
- A community of people interested in air quality, including journalists, scientists, software engineers, and teachers.
- A set of open source projects powering the platform.
The Ulaanbaatar Workshop #
How did I get involved with OpenAQ?
In 2015, I came across an announcement of a workshop about an air quality open source project. I registered immediately.
My interest in the topic was quite simple: Air pollution was a recurring problem for Ulaanbaatarers. It affected everyone who lived in the city. It was personal.
- Before the workshop, I made a web client for the API. It allowed me to quickly learn how the API worked.
- At the workshop, I showed what I made to everyone.
- I met a lot of people there: OpenAQ cofounders, scientists, journalists, and researchers. It was eye-opening.
After the workshop I started to contribute to the projects. It started very small: a documentation update at first. Then some updates to the API, and new data source adapters.
OK, enough history, let’s talk about how OpenAQ works.
What kind of data? #
OpenAQ keeps track of several common air pollutants. These are:
- PM2.5 - Particulate matter less than 2.5 micrometers in diameter
- PM10 - Particulate matter less than 10 micrometers in diameter
- O₃ - Ozone
- CO - Carbon Monoxide
- NO₂ - Nitrogen Dioxide
- SO₂ - Sulfur Dioxide
- BC - Black Carbon
Platform Architecture #
It starts with the fetch job. The fetch job triggers a parallel task for each data source. Each task reads from its source and parses the current data points. This happens every 10 minutes.
The extracted data is then saved to a public S3 bucket and a Postgres database.
An HTTP API serves user requests by reading from the database.
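The flow above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the adapter functions, the in-memory list standing in for the database and the S3 bucket, and the choice to skip a failing source are all hypothetical and not taken from the real fetch job.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_source_a():
    # Stand-in for reading and parsing one real data source.
    return [{"location": "Station A", "parameter": "pm25", "value": 12.0}]


def fetch_source_b():
    return [{"location": "Station B", "parameter": "o3", "value": 0.03}]


def fetch_broken_source():
    # Simulates a source that is unreachable during this run.
    raise RuntimeError("source is down")


def safe_fetch(adapter):
    """Run one adapter; a failing source yields no data instead of aborting the job."""
    try:
        return adapter()
    except Exception:
        return []


def run_fetch_job(adapters):
    """Run every adapter in parallel and flatten the results into one list."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        batches = list(pool.map(safe_fetch, adapters))
    return [measurement for batch in batches for measurement in batch]


database = []  # stand-in for the Postgres database and the S3 bucket
database.extend(run_fetch_job([fetch_source_a, fetch_broken_source, fetch_source_b]))
print(len(database))
```

In a real deployment a scheduler would call something like `run_fetch_job` every 10 minutes; here it runs once so the sketch stays self-contained.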
That’s basically it, but there are extra moving parts of course.
Data Access Methods #
There are several ways to access data from OpenAQ.
The HTTP API #
The API has endpoints like
This is useful if you want some current data for a given city or a geolocation.
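As a rough idea of what a query looks like, here is a small sketch that builds a request URL. The base URL, the `measurements` endpoint, and the `city`, `parameter`, and `limit` query parameters are assumptions on my part and may not match the current API, so check the official API documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumption: base URL and API version; the live API may differ.
BASE_URL = "https://api.openaq.org/v1"


def build_url(endpoint, **params):
    """Build a query URL; parameters are sorted so the output is predictable."""
    query = urlencode(sorted(params.items()))
    if query:
        return f"{BASE_URL}/{endpoint}?{query}"
    return f"{BASE_URL}/{endpoint}"


# Hypothetical query: a handful of PM2.5 measurements for one city.
url = build_url("measurements", city="Ulaanbaatar", parameter="pm25", limit=5)
print(url)
```

To actually send the request, you could pass `url` to `urllib.request.urlopen` and decode the response with `json.load`.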
There are several wrapper libraries created by the community:
AWS S3 #
As I said before, the raw metrics are dumped into S3, and they’re publicly available. You can do anything you want with them.
You can use tools like awscli to automate the process.
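Once you have some files downloaded, a few lines of Python are enough to summarize them. The sketch below assumes the files contain one JSON object per line with `parameter` and `value` fields; that layout and the sample records are my assumptions, so inspect a real file first.

```python
import json
from collections import defaultdict
from statistics import mean

# Made-up sample standing in for the contents of one downloaded file.
sample = "\n".join([
    '{"location": "Station A", "parameter": "pm25", "value": 30.0}',
    '{"location": "Station A", "parameter": "pm25", "value": 50.0}',
    '{"location": "Station B", "parameter": "pm10", "value": 20.0}',
])


def average_by_parameter(text):
    """Group values by pollutant and return the mean for each one."""
    values = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        values[record["parameter"]].append(record["value"])
    return {parameter: mean(vals) for parameter, vals in values.items()}


averages = average_by_parameter(sample)
print(averages)
```

For a real file, read its contents with `open(path).read()` and pass the text to `average_by_parameter`.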
AWS Athena #
With Athena, you can query the S3 bucket with SQL! Yes, it’s good old SQL, but you get to run it over the whole collection of JSON files. If you want specialized results, this might be helpful.
Open source projects #
There are 2 main projects:
The projects are stable, but any project needs maintenance and improvements.
- There are new data sources that need adapters.
- Some API endpoints don’t behave as expected in certain situations.
- Documentation can be missing useful details.
On Contributing #
Are you interested in helping out? Super!
Contributing comes in many shapes and forms:
- Answering questions on Slack
- Writing blog posts about air quality
- Triaging GitHub issues
- Updating documentation
- Fixing bugs
- Improving or adding features
Asking Questions #
When it comes to contributing, nothing is too small. People say, start with updating documentation. I would say, start by asking questions.
By asking questions and starting discussions, you become part of the community. Your question might bring to light something that was confusing to a lot of people.
Where to Start #
If you’re interested, come to OpenAQ’s Slack and introduce yourself!
Learn More #
Some highlights from the OpenAQ blog, and other interesting pages: