Jupyter Notebook

by Paulina Wyrwas
August 26, 2020, 15:00

Jupyter is a programming environment for writing, modifying and running code in a web browser. It is a powerful tool used mainly in Data Science and Machine Learning, thanks to the convenience of cell-by-cell execution and its well-developed visualisation capabilities.

What exactly is a Jupyter notebook?

Jupyter emerged from the IPython project and is open-source software released under a BSD license. The name ‘Jupyter’ comes from the main programming languages it supports: Julia, Python and R. Its main feature is the ability to execute code cell by cell. This means that each part of the code can be run separately, while the parts that have already been executed stay in kernel memory together with their variables and function definitions. This is especially useful when some chunks of code need adjusting or have to be re-run for visualisation. What is more, Jupyter lets you add text, comments and extensive Markdown descriptions, which makes the code clearer and more readable.
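The way state persists between cells can be illustrated with a small sketch in plain Python. It is only a toy model and not how the Jupyter kernel is implemented: the names kernel_namespace and run_cell are made up for this illustration, and each "cell" is simply a string executed in one shared namespace.

```python
# A toy stand-in for the kernel: one namespace shared by every cell.
# (kernel_namespace and run_cell are hypothetical names for this sketch.)
kernel_namespace = {}


def run_cell(source: str) -> None:
    """Execute one 'cell' of source code in the shared namespace."""
    exec(source, kernel_namespace)


# Cell 1: define a variable and a function.
run_cell("values = [1, 2, 3]\ndef total(xs):\n    return sum(xs)")

# Cell 2: executed later, but it still sees everything cell 1 defined.
run_cell("result = total(values)")
print(kernel_namespace["result"])  # 6

# Adjust and re-run only the data; the function definition is still in memory.
run_cell("values = [10, 20, 30]")
run_cell("result = total(values)")
print(kernel_namespace["result"])  # 60
```

The point of the sketch is that only the changed chunk has to be re-run; everything defined earlier remains available.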

Why should you use notebooks?

Jupyter can be an extremely useful tool, but it is not suitable for building large, commercial applications - so why use it at all? The main answer is: for convenience. Notebooks are a good way to build parts of more complex projects - they are great for testing and rewriting individual functions and features without having to build and run the entire program. You can also create notebooks to view and analyse the input and output data of your programs. Finally, notebooks can serve as manuals for running an application, which is helpful both within a team and when working with a customer.

How to set up Jupyter for Python?

You can set up your Jupyter environment either with pip, by running the command pip install jupyter, or by installing Anaconda, a free distribution of Python and R that comes with Jupyter (as well as libraries such as NumPy, pandas and Matplotlib) already installed.
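After either installation route, a quick way to confirm what is available is to ask Python whether the packages can be found. The snippet below is a minimal sketch that uses only the standard library; the helper name is_installed is made up for this example, and it assumes that the classic notebook server is importable under the module name notebook.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# "notebook" is assumed to be the module behind the classic Jupyter Notebook;
# the remaining names are the libraries mentioned above.
for name in ("notebook", "numpy", "pandas", "matplotlib"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

Any module reported as missing can then be installed with pip in the same environment.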

With Jupyter installed, you can open your first notebook. Go to the terminal and navigate to the folder you want to keep your notebook in. To start Jupyter, run the command: jupyter notebook.
It will open your default browser at http://localhost:8888/tree. This page mirrors the contents of your folder and lets you create notebooks and navigate between them.

To create a new notebook, click New and select a Python kernel (for example Python 3) - this will open a blank notebook named Untitled.

Now you can start writing your code and adding Markdown descriptions. Everything should be divided into logical chunks that can be executed separately. There are three types of cells - code, Markdown and Raw NBConvert. Cells can be inserted, moved up and down, copied and removed. A cell is executed by pressing Shift + Enter.

Code cells are the primary cells in every project. Once they are executed, the variables defined in them stay in kernel memory. Markdown cells are for titles, extensive code comments and descriptions. You can style your text to make it more readable: for example, adding # before a line of text turns it into a header, and the more # characters you use, the smaller the header.

You can also create bullet lists with sublists using *.

Raw NBConvert cells are useful when you want to convert a notebook to another format, such as LaTeX or HTML (by default, notebooks are saved as JSON text files). Their content is not evaluated by the notebook and is passed to the converter as-is, so it should be written in the target output format.
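Because a notebook is just a JSON text file, the three cell types can be shown by building a tiny notebook structure by hand. The sketch below uses only the standard json module and follows the version 4 notebook layout as I understand it; the helper make_cell and the cell contents are invented for illustration, and in practice Jupyter writes this file for you.

```python
import json


def make_cell(cell_type: str, source: str) -> dict:
    """Build a minimal cell dictionary (hypothetical helper for this sketch)."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells additionally record their outputs and execution counter.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


notebook = {
    "cells": [
        # Markdown cell: headers with #, a bullet list with a sublist using *.
        make_cell("markdown", "# Title\n## Smaller header\n* item\n  * sub-item"),
        # Code cell: executed by the kernel.
        make_cell("code", "x = 1 + 1\nprint(x)"),
        # Raw cell: here a LaTeX command meant for a LaTeX export.
        make_cell("raw", "\\newpage"),
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# This string is what would be stored inside an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
print([cell["cell_type"] for cell in json.loads(notebook_json)["cells"]])
```

Opening a real .ipynb file in a text editor shows the same kind of structure, with extra metadata such as the kernel name.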

One last important thing before you start working with Python notebooks is getting to know the kernel options. Sometimes, especially when you’re running complex machine learning models, you might want to stop the kernel and adjust some parameters - this is when the Interrupt option comes in handy. You can also Restart your kernel (this removes all variables and function definitions from memory, but all outputs, graphs etc. remain displayed), Restart & Clear Output (like Restart, but also removes all outputs from the notebook) or Restart & Run All (removes all variables and function definitions from memory, clears the outputs and runs all cells one by one, starting from the top).

You can save your notebook and exit Jupyter at any point. Notebooks are saved as .ipynb files, each accompanied by a checkpoint copy that is also an .ipynb file. By default, your notebook is autosaved every 2 minutes; however, autosave updates only the main file, not the checkpoint. Under File -> Revert to Checkpoint there is an option to revert the notebook to the state of its last manual save (not the last autosave).
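To see where the checkpoint copy ends up, the following sketch computes its expected location. It assumes the default layout of the classic notebook server, in which checkpoints are kept in a hidden .ipynb_checkpoints folder next to the notebook; the function name checkpoint_path and the example file name are made up, and a differently configured installation may store checkpoints elsewhere.

```python
from pathlib import Path


def checkpoint_path(notebook_path: str) -> Path:
    """Return the assumed default checkpoint location for a notebook file."""
    nb = Path(notebook_path)
    # Assumption: <folder>/.ipynb_checkpoints/<name>-checkpoint.ipynb
    return nb.parent / ".ipynb_checkpoints" / f"{nb.stem}-checkpoint{nb.suffix}"


# Hypothetical notebook name, used only for illustration.
print(checkpoint_path("analysis/Untitled.ipynb").as_posix())
```

Comparing the modification times of the two files is a simple way to check whether the latest changes were saved manually or only autosaved.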

With this quick introduction, you should be able to set up and start writing Jupyter notebooks. They can be a great asset to any developer working in Python, especially when dealing with data analysis.