Getting Started with InfluxDB

5 minute read #timeseriesdb #influxdb

Let’s learn how to start using InfluxDB, an open source time series database.

At the end of this post you will have a graph generated from your data, showing air quality change in Berlin:

Data Explorer with visualization

What’s a time series database and what is it good for? #

Time series databases are ✨specially designed✨ databases for storing data points that change over time. For example:

What makes them better suited for storing this type of data than relational databases? That’s a very good question. This requires some knowledge of how databases actually store data on disk.

I’ll try to keep it short and simple.

Relational databases store each group of objects (tables) in separate places on disk. This makes finding related data expensive. You literally need to go all over the disk to find what you need. Indexes are used to speed up this process.

But, if you have a type of data that changes regularly, it kinda makes sense to keep those values close together on disk. And later, when you want to query them, because the data points are adjacent to each other, it would be very fast and cheap to read them! Voila, you have a time-series database. It’s an oversimplification, but it’ll do for this post 😅
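To make the "adjacent data is cheap to read" idea a bit more concrete, here is a toy, in-memory analogy in Python. It is only a sketch: the timestamps and values are made up, and this is not how InfluxDB actually lays out data on disk. The point is that when points are kept in time order, a time range query boils down to finding two positions and reading one contiguous slice.

```python
from bisect import bisect_left, bisect_right

# Toy series kept sorted by timestamp (Unix seconds), standing in for points
# stored next to each other on disk. The values are made up for illustration.
timestamps = [1591398000, 1591401600, 1591405200, 1591408800, 1591412400]
values = [5.91, 6.20, 7.05, 6.80, 6.10]


def range_query(start, stop):
    """Return the values with start <= timestamp <= stop as one contiguous slice."""
    lo = bisect_left(timestamps, start)   # first index with timestamp >= start
    hi = bisect_right(timestamps, stop)   # first index with timestamp > stop
    return values[lo:hi]


print(range_query(1591401600, 1591408800))
```

No scanning of unrelated rows is needed; the two binary searches locate the range and everything in between belongs to the answer.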

There are more features that make up TSDBs, such as data compression, down-sampling, and support for time series aware queries. If you want to learn more about TSDBs, InfluxData has a nice explanation.
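Down-sampling is the easiest of these to show in a few lines. The sketch below is plain Python, not an InfluxDB API: the `downsample` helper and the readings are hypothetical. It averages raw points into fixed-width time buckets, which is roughly what you do when you turn 20-minute readings into hourly ones.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of the bucket it falls into.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return [(start, mean(vals)) for start, vals in sorted(buckets.items())]


# Hypothetical readings: (Unix seconds, PM10 value), one every 20 minutes.
readings = [
    (1591398000, 6.0),
    (1591399200, 8.0),
    (1591400400, 10.0),
    (1591401600, 4.0),
]
hourly = downsample(readings, 3600)
print(hourly)
```

The first three readings fall into the same hour and collapse into one averaged point, so you store less data at the cost of some detail.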

InfluxDB 2.0 #

Starting from version 2.0, InfluxDB is becoming more than just a database. It now includes Kapacitor for processing, and Chronograf for the user interface. Add Telegraf to that, and you get the TICK stack (Telegraf, InfluxDB, Chronograf, Kapacitor).

InfluxData, the company behind the TICK stack, offers InfluxDB Cloud 2.0 as a fully managed service. We’ll use its free plan to get started with InfluxDB quickly.

I found the free plan only useful for trying out the product. If you want to get serious, you may want to spin up your own database (using the OSS version) – or pay for the upgrades.

Register on InfluxDB Cloud 2.0 and create a bucket #

Get InfluxDB webpage

Go ahead and sign up for a free InfluxDB Cloud account, and create your first bucket.

After signing up, go to the Load Data > Buckets page (visible in the sidebar) and create a bucket.

Create Bucket modal window

We’ll write data into this bucket later.

Ways to write data to InfluxDB #

There are lots of ways to write data to InfluxDB:

So that’s a lot of options to interact with InfluxDB.

Many of these tools use line protocol to describe a data point.

Line Protocol #

InfluxDB uses a line protocol to represent data point entries. Let’s see an example first, then break down its elements.

Here’s a single air quality measurement containing a pollutant value and some metadata:

aq_measurement,location=DEBE010,city=Berlin,country=DE,parameter=pm10,unit=µg/m³,latitude=52.543041,longitude=13.349326 value=5.91 1591398000000000000
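If you want to poke at that line yourself, here is a small Python sketch that splits it into its parts (with the unit written as µg/m³). The `parse_line` helper is hypothetical and deliberately simplified: it assumes there are no escaped spaces or commas and no quoted string fields, which holds for this example but not for line protocol in general.

```python
from datetime import datetime, timezone


def parse_line(line):
    """Split one line protocol entry into measurement, tags, fields, and time.

    Simplification: assumes no escaped characters and no quoted string fields.
    """
    head, field_part, ts = line.split(" ")
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = dict(pair.split("=", 1) for pair in field_part.split(","))
    # The timestamp here is in nanoseconds since the Unix epoch.
    time = datetime.fromtimestamp(int(ts) // 1_000_000_000, tz=timezone.utc)
    return measurement, tags, fields, time


line = (
    "aq_measurement,location=DEBE010,city=Berlin,country=DE,parameter=pm10,"
    "unit=µg/m³,latitude=52.543041,longitude=13.349326 "
    "value=5.91 1591398000000000000"
)
measurement, tags, fields, time = parse_line(line)
print(measurement, tags["city"], fields["value"], time.isoformat())
```

The long number at the end turns out to be 5 June 2020, 23:00 UTC.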

Line protocol is a text-based format that describes a data point’s measurement, tag set, field set, and timestamp:

measurementName,tagKey=tagValue fieldKey="fieldValue" 1465839830100400200
--------------- --------------- --------------------- -------------------
       |               |                  |                    |
  Measurement       Tag set           Field set            Timestamp
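Going the other way, the four parts in the diagram can be assembled into a line with a few lines of Python. This is a minimal sketch, not an official client: `escape`, `format_field`, and `to_line` are names I made up, and the escaping covers only the common cases (commas, spaces, and equals signs in keys and tag values, quotes and backslashes in string fields).

```python
def escape(text, chars):
    """Backslash-escape each of the given special characters."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def format_field(value):
    """Render a field value: booleans, integers (i suffix), floats, strings."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement, tags, fields, timestamp_ns):
    """Assemble measurement, tag set, field set, and timestamp into one line."""
    head = [escape(measurement, ", ")]
    # Tags are sorted by key here, which is optional.
    for key, value in sorted(tags.items()):
        head.append(f"{escape(key, ',= ')}={escape(value, ',= ')}")
    field_set = ",".join(
        f"{escape(key, ',= ')}={format_field(value)}"
        for key, value in fields.items()
    )
    return f"{','.join(head)} {field_set} {timestamp_ns}"


print(to_line(
    "aq_measurement",
    {"city": "Berlin", "parameter": "pm10"},
    {"value": 5.91},
    1591398000000000000,
))
```

Tag values are always strings, while field values carry a type, which is why only fields go through `format_field`.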

Write some data 💾 #

I’ve prepared 3 months of air quality measurements as line protocol for you. Let’s write them to our empty bucket.

From your InfluxDB Cloud dashboard, go to Load Data > Buckets, and find your bucket.

Click the Add Data button, and choose Line Protocol.

Buckets list - add data button visible

You have the option to upload a file or enter data manually. Let’s choose Enter Manually.

Copy and paste the air quality data into the text box.

Click Write Data. You should see a “Data Written Successfully” message.

Message showing 'written successfully'

That’s it! Now your data is ready to be queried, further processed, and visualized!
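If you’d rather script this step than paste into the UI, the same write can go through the InfluxDB v2 HTTP API (`POST /api/v2/write`). The Python sketch below only builds the request and does not send it. The base URL, org, bucket, and token are placeholders, and it assumes you have already created an API token with write access to your bucket.

```python
import urllib.parse
import urllib.request


def build_write_request(base_url, org, bucket, token, lines):
    """Build (but do not send) a POST request for the /api/v2/write endpoint."""
    query = urllib.parse.urlencode(
        {"org": org, "bucket": bucket, "precision": "ns"}
    )
    return urllib.request.Request(
        url=f"{base_url}/api/v2/write?{query}",
        # The body is plain line protocol, one data point per line.
        data="\n".join(lines).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
        method="POST",
    )


# Placeholder values: replace with your own region URL, org, bucket, and token.
request = build_write_request(
    base_url="https://example.invalid",
    org="my-org",
    bucket="air-quality",
    token="my-token",
    lines=[
        "aq_measurement,city=Berlin,parameter=pm10 value=5.91 1591398000000000000",
    ],
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

With real values filled in, a successful write should come back as an empty HTTP 204 response.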

Let’s see how we can explore and visualize this data.

Using the Data Explorer 📈 #

In a few steps you’ll have a chart showing the change in Berlin air quality over time.

There are several lines, each representing data from one air quality station.

Data Explorer with query visualization

This query and visualization will not be preserved. If you want to keep it, click the Save As button in the top right corner.

Add cell to dashboard

The graph will be added as a cell in the dashboard. Go to the Boards page and check it out.

What’s next #

That was a taste of what you can do with InfluxDB.

Now you might be saying: “But I only entered some prepared data. I want to keep feeding the database with fresh data!” I hear you.

As I mentioned earlier in the post, there are numerous ways to feed the database. Just to mention a few:

We didn’t cover the Flux query language, how to process data with InfluxDB tasks, or how to use client libraries to write data. Why not keep using your InfluxDB Cloud account and try these out?

This post was born from my research notes while I was making an app that feeds InfluxDB from AWS Lambda. I’ll talk more about it in a later post.