Jupyter Notebook

Jupyter is a web application that allows users to create and share documents containing live code, equations, visualizations and narrative text. This section describes how to configure a Jupyter Python 3 notebook to allow access to Koverse data sets.

The prerequisites for accessing Koverse data sets in a Jupyter Python 3 notebook are:

Spark 1.6
Python 3.5

Once the prerequisites are met, download the koverse-spark-datasource JAR file. Its version should match the version of Koverse you have installed. You can find the JAR files here:


Next, set the following environment variables. Be sure to replace /opt/spark with the location of your installed Spark 1.6 and /usr/local/bin/python with the location of your Python 3 binary executable:

export SPARK_HOME=/opt/spark
export PYSPARK_PYTHON=/usr/local/bin/python
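
Before launching anything, it can help to confirm that both variables point at real locations. The following is a minimal sketch of such a check. The function name check_pyspark_env is hypothetical and is not part of Spark or Koverse. It only inspects the two variables set above.

```python
import os


def check_pyspark_env(env):
    """Return a list of human-readable problems with the PySpark settings in env.

    env is any mapping of variable names to values, such as os.environ.
    An empty list means both variables look usable.
    """
    problems = []

    # SPARK_HOME must name an existing directory.
    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        problems.append("SPARK_HOME is not set")
    elif not os.path.isdir(spark_home):
        problems.append("SPARK_HOME is not a directory: " + spark_home)

    # PYSPARK_PYTHON must name an existing, executable file.
    python_bin = env.get("PYSPARK_PYTHON")
    if not python_bin:
        problems.append("PYSPARK_PYTHON is not set")
    elif not (os.path.isfile(python_bin) and os.access(python_bin, os.X_OK)):
        problems.append("PYSPARK_PYTHON is not an executable file: " + python_bin)

    return problems


if __name__ == "__main__":
    # Print any problems found in the current shell environment.
    for problem in check_pyspark_env(os.environ):
        print(problem)
```

Running the script in the same shell where the export commands were issued prints nothing when both settings look valid.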

You are now ready to start the Jupyter notebook using pyspark, which is part of the Spark installation:

pyspark --jars <location of koverse-spark-datasource JAR file downloaded, above>
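
If you prefer to start pyspark from a script rather than typing the command, the same invocation can be assembled in Python. This is only a sketch. The helper name build_pyspark_command and the JAR path in the comment are hypothetical. It assumes the pyspark launcher is in the bin directory under SPARK_HOME, which is where a standard Spark installation places it.

```python
import os


def build_pyspark_command(spark_home, jar_path):
    """Return the argument list for launching pyspark with one extra JAR.

    spark_home is the Spark installation directory (the value of SPARK_HOME).
    jar_path is the location of the downloaded koverse-spark-datasource JAR.
    """
    pyspark = os.path.join(spark_home, "bin", "pyspark")
    return [pyspark, "--jars", jar_path]


# Example use (the JAR path is a placeholder, not a real file name):
#   import subprocess
#   subprocess.call(build_pyspark_command(os.environ["SPARK_HOME"],
#                                         "/path/to/koverse-spark-datasource.jar"))
```

Passing the arguments as a list avoids shell quoting problems when either path contains spaces.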

An example of reading a Koverse data set in a Jupyter Python 3 notebook is shown below.
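
A read of this kind might look like the sketch below, which uses the standard Spark DataFrame reader calls (read.format, options, load). The format name com.koverse.spark and the option names hostname and apiToken are assumptions that this section does not confirm, so check them against the documentation for your Koverse version. The helper name read_koverse_dataset is hypothetical.

```python
def read_koverse_dataset(sql_context, hostname, api_token, dataset_name):
    """Load a Koverse data set into a Spark DataFrame.

    ASSUMPTIONS (not confirmed by this section): the data source format
    name 'com.koverse.spark' and the option names 'hostname' and 'apiToken'.
    """
    reader = sql_context.read.format("com.koverse.spark")
    reader = reader.options(hostname=hostname, apiToken=api_token)
    return reader.load(dataset_name)


# In a notebook started through pyspark, a SparkContext named sc is
# already defined, so a call might look like this (all values are placeholders):
#   from pyspark.sql import SQLContext
#   df = read_koverse_dataset(SQLContext(sc), "<koverse hostname>",
#                             "<api token>", "<data set name>")
#   df.show()
```

Wrapping the call in a function keeps the connection details in one place if several data sets are read in the same notebook.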


Note that there is currently a limitation requiring Koverse data sets to be written as the user Koverse.