Learn how to use the Bokeh library to make beautiful interactive plots, and also save them to PDF.

Indigo Curnick

October 28, 2024

We need to begin by setting up the environment, so start with making a venv

```
$ python -m venv .venv
$ source .venv/bin/activate
```

Make a `requirements.txt` and place it in the root. Add `bokeh` to that file, and then run `pip install -r requirements.txt`

Let’s start by making a super simple line chart. This will show some of the basic concepts of Bokeh.

We can start by importing the fundamentals

`from bokeh.plotting import figure, show`

And then we create some data

```
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]
```

It’s important that these lists are the same length. The next step is to create the figure.

```
p = figure(title="Simple line example", x_axis_label="x", y_axis_label="y")
p.line(x, y, legend_label="Values", line_width=2)
```

Notice how we first create a figure, and then add a line to it. We also get a lot of options to customise the figure and line - we can add titles, labels and legends. Finally, we want to see this plot so we can use

`show(p)`

Bokeh is intended for the web, so when we run this it will open the chart on a page in your default web browser. Here’s the final chart.
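If you would rather write the chart to a specific HTML file than have a browser window pop up, something like the following sketch should work. It uses `output_file` and `save` from `bokeh.plotting`; the file name `simple_line.html` and the use of the temporary directory are assumptions made purely for illustration.

```python
import os
import tempfile

from bokeh.plotting import figure, output_file, save

x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

p = figure(title="Simple line example", x_axis_label="x", y_axis_label="y")
p.line(x, y, legend_label="Values", line_width=2)

# Hypothetical output location; any writable path will do.
html_path = os.path.join(tempfile.gettempdir(), "simple_line.html")
output_file(html_path, title="Simple line example")

# save() writes the HTML file without opening a browser.
# show(p) would write the same file and then open it.
save(p)
```

The resulting file is a standalone web page, so you can open it later or share it with someone else.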

Let’s add some more data - this is really easy to do! All we have to do is create more lines. Let’s start with some more data

```
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 2, 1]
y1 = [2, 3, 4, 5, 6]
y2 = [5, 4, 3, 2, 1]
```

And then we make more calls to `line`

```
p = figure(title="Multiple lines", x_axis_label="x", y_axis_label="y")
p.line(x, y, legend_label="Values", line_width=2, color="blue")
p.line(x, y1, legend_label="More Values", line_width=2, color="red")
p.line(x, y2, legend_label="Even More Values", line_width=2, color="purple")
```

And this is the chart it produces

So now we can see the basic workflow of Bokeh

- Prepare some data, usually into lists or maybe a numpy array
- Create a figure
- Add as many series as you have data
- Show the plot

We can customise each step of this flow quite extensively, as we will see in this article
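As a small taste of that customisation, here is a minimal sketch that styles the lines and the legend. The data and labels are made up for illustration; `line_dash`, `legend.location` and `legend.click_policy` are existing Bokeh properties, but the values chosen here are just one option among many.

```python
from bokeh.plotting import figure

# Step 1: prepare some data (illustrative values)
x = [1, 2, 3, 4, 5]
y1 = [2, 3, 4, 5, 6]
y2 = [5, 4, 3, 2, 1]

# Step 2: create a figure
p = figure(title="Customised lines", x_axis_label="x", y_axis_label="y")

# Step 3: add the series, styling each one differently
p.line(x, y1, legend_label="Rising", line_width=2, color="red", line_dash="dashed")
p.line(x, y2, legend_label="Falling", line_width=2, color="purple")

# Move the legend and let the reader click an entry to hide that line
p.legend.location = "top_left"
p.legend.click_policy = "hide"

# Step 4: show(p) would open the chart in the browser, as before
```

The clickable legend is a nice example of the interactivity you get for free because the chart lives in a web page.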

Bokeh supports many different kinds of “glyphs” - that’s basically what Bokeh calls different items that can be displayed on the figure. Let’s explore some of these options.

```
p.vbar(x=x, top=y, legend_label="Bar", color="blue", width=0.5, bottom=0)
p.scatter(x, y1, legend_label="Scatter Crosses", color="red", size=16, marker="x")
p.scatter(x, y2, legend_label="Scatter Circles", color="purple", size=16)
```

Here we use `vbar` and `scatter`. `scatter` works a lot like `line`, but we can customise the size of the marker and the marker type. Circles and crosses are the most common, but there are others too. `vbar` is a little more involved - we need to set the `x` and `top` as named arguments. You can customise the width of the bars, as well as where the bottom starts (usually, you just want this to be 0).

And then when we display it we get the following chart
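To get a feel for the other marker types, a sketch like the one below draws the same data once per marker, shifted up each time so the rows do not overlap. The five marker names used are a small selection of the ones Bokeh accepts, and the data is made up for illustration.

```python
from bokeh.plotting import figure

x = [1, 2, 3, 4, 5]

# A handful of the marker names that scatter() accepts
markers = ["circle", "square", "triangle", "diamond", "x"]

p = figure(title="Marker types", x_axis_label="x", y_axis_label="y")

for offset, marker in enumerate(markers):
    # Shift each series up by one so every marker type gets its own row
    y = [value + offset for value in x]
    p.scatter(x, y, marker=marker, size=12, legend_label=marker)

# show(p) would open the chart in the browser
```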

A really useful feature in Bokeh is annotations - they let you mark certain areas of the plot. To give a real life example of this, we’re going to plot some random data, as well as display the standard deviation of that data. Let’s start by generating those numbers and standard deviations

```
import random
import statistics

N = 30  # Number of random numbers to generate

x = list(range(N))
random_numbers = [random.random() for _ in range(N)]
mean = statistics.mean(random_numbers)
std_dev = statistics.stdev(random_numbers)  # Sample standard deviation, used for the annotation bounds
```

Now we’ll create the basic line graph - it’s the same as we did before so this should be familiar.

```
p = figure(title="Standard Deviation Example", x_axis_label="x", y_axis_label="y")
p.line(x, random_numbers, line_width=2, color="black")
```

In order to add annotations to this plot, we need to import the following

`from bokeh.models import BoxAnnotation`

Then we can create the annotations. Since we want to annotate the areas within and outside the standard deviation of the data, we want the “inside” region to be between the mean plus and minus the standard deviation. The middle box has two bounds - the top and bottom. The high and low box have no bottom and top bound respectively, meaning they extend to the end of the plot

```
low = mean - std_dev
high = mean + std_dev

low_box = BoxAnnotation(top=low, fill_alpha=0.2, fill_color="red")
mid_box = BoxAnnotation(bottom=low, top=high, fill_alpha=0.2, fill_color="green")
high_box = BoxAnnotation(bottom=high, fill_alpha=0.2, fill_color="red")

p.add_layout(low_box)
p.add_layout(mid_box)
p.add_layout(high_box)
```

We then simply add the three boxes to the plot with `add_layout` - and we get the following plot
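To sanity-check the boxes, it can help to count how many points land in each band. The sketch below uses only the standard library. The fixed seed is an assumption added so that the run is repeatable, and `statistics.stdev` (the sample standard deviation) is one reasonable choice for `std_dev`.

```python
import random
import statistics

random.seed(0)  # Assumption: fixed seed so the numbers are repeatable

N = 30
random_numbers = [random.random() for _ in range(N)]

mean = statistics.mean(random_numbers)
std_dev = statistics.stdev(random_numbers)

# Same bounds as the box annotations
low = mean - std_dev
high = mean + std_dev

# Count the points that fall in each of the three bands
counts = {"below": 0, "inside": 0, "above": 0}
for value in random_numbers:
    if value < low:
        counts["below"] += 1
    elif value > high:
        counts["above"] += 1
    else:
        counts["inside"] += 1

print(counts)
```

Most of the points should land in the "inside" band, which corresponds to the green box on the plot.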

An interesting plotting project we can use to show off some of Bokeh’s potential is plotting K-means. For this we need a few more dependencies, so add the following to the `requirements.txt` (and make sure you run `pip install -r requirements.txt`)

```
numpy
scikit-learn
```

Since this isn’t a K-means tutorial, we’ll skip over the details - but if you don’t know what K-means does, the basic idea is to group data into K groups. Here’s the code we’ll use for this

```
import numpy as np
from sklearn.cluster import KMeans

N = 3  # Number of clusters (K); reused for grouping and colours

# Three Gaussian blobs of 100 two-dimensional points each
data = np.vstack(
    [
        np.random.normal(loc=(0, 0), scale=1.0, size=(100, 2)),
        np.random.normal(loc=(5, 5), scale=1.0, size=(100, 2)),
        np.random.normal(loc=(0, 5), scale=1.0, size=(100, 2)),
    ]
)

kmeans = KMeans(n_clusters=N)
pred = kmeans.fit_predict(data)
```

Now that we have the groups, we need to arrange the data in a more convenient way for plotting.

```
plotting_data = {}

for i in range(N):
    plotting_data[i] = []

for point, group in zip(data, pred):
    plotting_data[group].append(point.tolist())
```

This will be nice and generic, so if we increase `N` in future it still works - we are essentially building a dictionary that maps each group to a list of the coordinates in that group.
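An alternative, sketched below with a few made-up points standing in for `data` and `pred`, is `collections.defaultdict`. It creates each group's list on first use, so the groups do not need to be initialised up front.

```python
from collections import defaultdict

# Made-up stand-ins for the K-means input points and predicted labels
points = [(0.1, 0.2), (5.1, 4.9), (0.3, 5.2), (4.8, 5.0)]
labels = [0, 1, 2, 1]

# Each missing key starts out as an empty list
grouped = defaultdict(list)

for point, group in zip(points, labels):
    grouped[group].append(list(point))

for group in sorted(grouped):
    print(group, grouped[group])
```

One difference to keep in mind: a group that receives no points never appears as a key, whereas the explicit loop always creates an empty list for it.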

We can make the basic plot again with

`p = figure(title="K-means", x_axis_label="x", y_axis_label="y")`

The next thing to do is sort out the colours. For this kind of plot the best colour scheme to use would be viridis. In order to create the viridis colours we can do the following

```
from bokeh.palettes import Viridis256
colors = Viridis256[::len(Viridis256) // N]
```

This gives us a list of colours, which we can index by group number. Fortunately, scikit-learn numbers the groups from 0 to N-1, which is exactly the same format as the colours we just generated! Therefore, we can plot with the following

```
for k in plotting_data:
    v = plotting_data[k]

    x = [row[0] for row in v]
    y = [row[1] for row in v]

    p.scatter(x, y, legend_label="Group: {}".format(k), size=8, color=colors[k])

show(p)
```

And this generated the following graph
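One detail worth knowing about the slicing used for `colors`: the step is rounded down, so the slice can contain more colours than `N`. That is harmless here because we only ever index the first `N` entries. Bokeh also ships a `viridis(n)` helper in `bokeh.palettes` that returns exactly `n` colours. The sketch below compares the two, assuming `N = 3` to match the three clusters.

```python
from bokeh.palettes import Viridis256, viridis

N = 3  # Assumption: three clusters, as in the K-means example

# Slicing: the step is 256 // 3 = 85, which picks indices 0, 85, 170 and 255
sliced = Viridis256[:: len(Viridis256) // N]

# Helper: asks Bokeh for exactly N colours spread across the palette
exact = viridis(N)

print(len(sliced), len(exact))
```

Either approach works for the plot above; the helper spreads the colours over the full range of the palette, which can make neighbouring groups a little easier to tell apart.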

Bokeh doesn’t have a built-in way to save to PDF. However, we can export to an SVG and then convert that into a PDF plot. We need a few other dependencies to do this, so add the following to the `requirements.txt` (and make sure you run `pip install -r requirements.txt`)

```
svglib
reportlab
selenium
```

We also need to have a web browser installed. According to the docs, Firefox or Chrome will work, but I couldn’t make it work with Firefox on my Arch Linux system. I just had to install Chromium and it worked fine (`sudo pacman -S chromium` on Arch).
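If the export fails, a quick first check is whether a browser and a matching WebDriver are actually on your `PATH`. The sketch below uses only the standard library. The executable names are assumptions and vary between systems and distributions, and recent Selenium versions may be able to fetch a driver for you, so treat a "not found" as a hint rather than a verdict.

```python
import shutil

# Assumed executable names; adjust these for your system
candidates = [
    "chromium",
    "chromium-browser",
    "google-chrome",
    "chromedriver",
    "firefox",
    "geckodriver",
]

# shutil.which returns the full path if the executable is on PATH, else None
found = {name: shutil.which(name) for name in candidates}

for name, path in found.items():
    print(f"{name}: {path or 'not found'}")
```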

First, we need to import a few things

```
from bokeh.io import export_svgs
import svglib.svglib as svglib
from reportlab.graphics import renderPDF
```

And then I turned saving to PDF into a simple function

```
def save_to_pdf(p, name):
    # Step 1: Save to SVG
    p.output_backend = "svg"
    export_svgs(p, filename=name + ".svg")

    # Step 2: Read in SVG
    # The font path is machine-specific; point it at a Helvetica TTF on your system
    svglib.register_font("helvetica", "/home/fonts/Helvetica.ttf")
    svg = svglib.svg2rlg(name + ".svg")

    # Step 3: Save as PDF
    renderPDF.drawToFile(svg, name + ".pdf")
```

All you have to do is provide the plot and the name of the PDF (without the .pdf extension). An example usage looks like this

```
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 2, 1]

p = figure(title="Save in PDF", x_axis_label="x", y_axis_label="y")
p.line(x, y, line_width=2, color="blue")

save_to_pdf(p, "pdf_test")
```

Also, this will keep the SVG saved on your system, which is helpful as you can also use that in many places where you might want to use a PDF!

In conclusion, Bokeh is a very powerful library for creating beautiful interactive plots. When it comes to the workflow of using it just remember the four steps

- Prepare some data, usually into lists or maybe a numpy array
- Create a figure
- Add as many series as you have data
- Show the plot

The examples in this guide should be enough to get you started in most applications. There’s a huge amount of customisation which Bokeh supports, but too much to cover everything in this article. You can find the full reference here.