Marimo Tutorial: Reactive and Reproducible Python Notebooks


By Ruby Abdullah · tutorial
Tags: Marimo, Notebook, Reactive, Data Science, Reproducibility, Python

Marimo: Reactive, Reproducible Python Notebooks

Marimo is a Python notebook that stores its content as a plain .py file and runs cells reactively, much like a spreadsheet recalculates dependent formulas. It was designed to address recurring pain points in traditional notebook workflows, especially hidden state and out-of-order execution. This tutorial walks through what Marimo is, why its model differs from Jupyter, and how to build an interactive data-exploration notebook that also runs as a script and as a web app.

Why Another Notebook?

Notebooks are popular for exploration, teaching, and reporting. They are also notorious for a specific class of bugs that come from how the classic notebook model works. Before looking at Marimo's features, it helps to name the problems it tries to solve.

Hidden State

In a typical Jupyter session, the kernel keeps every variable you ever defined, even after you delete the cell that created it. A notebook can appear to work simply because a now-deleted cell once ran. Reopen it later, run top to bottom, and it breaks. The visible code no longer matches the runtime state.
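The failure mode can be reproduced without Jupyter. The sketch below is a simplified stand-in, not how a real kernel is implemented: a plain dict plays the role of the kernel namespace, and the hypothetical helper `run_cell` executes one cell's source in it.

```python
# A plain dict stands in for the kernel's namespace (an assumption for
# illustration; a real Jupyter kernel is more involved).
def run_cell(source, namespace):
    """Execute one cell's source in the shared namespace."""
    exec(source, namespace)

FILTER_CELL = "kept = [v for v in (0.2, 0.7, 0.9) if v > threshold]"

# Interactive session: the user runs a cell, then deletes it from the notebook.
session = {}
run_cell("threshold = 0.5", session)   # this cell is later deleted
run_cell(FILTER_CELL, session)
session_result = session["kept"]       # works: threshold is still in memory

# Fresh run: only the cells still visible in the notebook are executed.
fresh = {}
try:
    run_cell(FILTER_CELL, fresh)
    fresh_error = None
except NameError as exc:
    fresh_error = exc

print(session_result)                  # [0.7, 0.9]
print(type(fresh_error).__name__)      # NameError
```

The same visible cell succeeds in the long-lived session and fails in the fresh one, which is the mismatch between visible code and runtime state described above.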

Out-of-Order Execution

Jupyter lets you run cells in any order. The In [n] counter records the order you happened to use, not a reproducible sequence. Two people running the same notebook can reach different results depending on which cells they ran and when.
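A small sketch makes the order dependence concrete. As before, this only imitates a kernel with a dict and `exec`; the cell names and the helper `run_in_order` are hypothetical.

```python
# Three "cells" as source strings; a dict stands in for the kernel namespace.
cells = {
    "a": "total = 10",
    "b": "total = total * 2",
    "c": "report = f'total is {total}'",
}

def run_in_order(order):
    """Run the named cells in the given order and return the final report."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace["report"]

top_to_bottom = run_in_order(["a", "b", "c"])
reran_b = run_in_order(["a", "b", "b", "c"])   # someone ran cell b twice

print(top_to_bottom)   # total is 20
print(reran_b)         # total is 40
```

Both runs use identical source, yet the result depends on the execution history, which the notebook file does not record in a replayable form.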

Poor Diffs and Version Control

A .ipynb file is JSON that embeds source code, execution counts, and base64-encoded outputs (images, tables) in one document. A one-line code change can produce a large, noisy diff. Reviewing a notebook pull request is awkward, and merge conflicts are common.
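To see why the diffs get noisy, the sketch below builds two minimal notebook documents in the style of the nbformat 4 JSON layout and compares them with `difflib`. It is a rough illustration under simplifying assumptions: the helpers `make_notebook` and `changed_lines` are hypothetical, the output is a short text stream rather than a base64 image, and real notebooks carry far more metadata.

```python
import difflib
import json

def make_notebook(source, execution_count, output_text):
    """Build a minimal nbformat-4-style notebook with one code cell."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {
                        "name": "stdout",
                        "output_type": "stream",
                        "text": [output_text],
                    }
                ],
                "source": [source],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }

def changed_lines(before, after):
    """Count added or removed lines in a unified diff."""
    diff = list(difflib.unified_diff(before, after, lineterm=""))
    # The first two diff lines are the "---" and "+++" file headers.
    return sum(1 for line in diff[2:] if line.startswith(("+", "-")))

old_src, new_src = "print(2 + 2)", "print(2 + 3)"

# The same one-line edit, stored as .ipynb JSON after a re-run ...
old_nb = json.dumps(make_notebook(old_src, 3, "4\n"), indent=1).splitlines()
new_nb = json.dumps(make_notebook(new_src, 7, "5\n"), indent=1).splitlines()
ipynb_changes = changed_lines(old_nb, new_nb)

# ... and stored as plain source.
py_changes = changed_lines([old_src], [new_src])

print(ipynb_changes, py_changes)
```

Even in this tiny case the JSON diff touches the source, the execution count, and the stored output, while the plain-source diff touches only the edited line. With embedded images the gap grows much larger.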

Hard to Reuse

Turning a notebook into a script or a module usually means copying cells into a .py file and untangling the execution order by hand. The notebook itself is not directly importable or runnable as a program.

Marimo takes a different position on each of these points. The next sections explain how.

Installation and First Steps

Marimo is a standard Python package.

pip install marimo

Confirm the install and check the version:

marimo --version

Create a new notebook:

marimo new

Open an existing notebook in the editor:

marimo edit notebook.py

The editor runs in your browser, but the notebook file lives on disk as ordinary Python. If you already have Jupyter notebooks, convert them:

marimo convert oldanalysis.ipynb > newanalysis.py

To serve a notebook as a read-only interactive application instead of an editable document:

marimo run notebook.py

In run mode the code cells are hidden and only the UI and outputs are shown. The same file serves as the editor document, the app, and a script that you can execute directly with python notebook.py. There is no separate export step to get a working program.

The .py File Format

A Marimo notebook is a regular Python file. Each cell is a function decorated with @app.cell, and the file ends with a small runner block. A minimal notebook looks like this:

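import marimo

app = marimo.App()


@app.cell
def _():
    # Names in the return tuple become available to other cells.
    import marimo as mo
    return (mo,)


@app.cell
def _(mo):
    # The parameter list names what this cell reads from other cells.
    x = 21
    mo.md(f"x is {x}")
    return (x,)


if __name__ == "__main__":
    app.run()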

Two things are worth noticing. First, this is valid Python that you can lint, format with tools like Black or Ruff, and review in a normal diff. A code change shows up as a code change, not as a blob of JSON. Second, the function arguments and return values are not boilerplate you maintain by hand. Marimo generates them from the variables each cell reads and defines. Those declarations are exactly what powers the reactive model.
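To make the idea of generated signatures concrete, here is a minimal sketch that derives a cell's reads and definitions from its source using the standard `ast` module. It is not Marimo's implementation: the helper `cell_signature` is hypothetical, and it ignores function and class definitions, nested scopes, `del`, and Marimo's handling of underscore-prefixed names.

```python
import ast
import builtins

def cell_signature(source):
    """Return (refs, defs): names a cell reads from elsewhere and names it defines."""
    tree = ast.parse(source)
    defs, loads = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs.add(node.id)      # assignment target
            else:
                loads.add(node.id)     # name being read
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as b" binds "b".
                defs.add(alias.asname or alias.name.split(".")[0])
    # Anything read but neither defined here nor a builtin must come from another cell.
    refs = loads - defs - set(dir(builtins))
    return sorted(refs), sorted(defs)

first = cell_signature("import marimo as mo")
second = cell_signature('x = 21\nmo.md(f"x is {x}")')

print(first)    # ([], ['mo'])
print(second)   # (['mo'], ['x'])
```

The results line up with the minimal notebook above: the first cell takes no arguments and returns `mo`, and the second takes `mo` and returns `x`.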

The Reactive Execution Model

This is the central idea in Marimo and the clearest break from Jupyter.
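The spreadsheet analogy from the introduction can be sketched in a few lines: when one cell changes, rerun it and everything that depends on it, in dependency order, and leave unrelated cells alone. The code below is a simplified illustration of that idea, not Marimo's actual runtime. The cell names, the `(reads, defines, function)` tuples, and the helpers are all hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical cells: name -> (names read, names defined, function to run).
cells = {
    "load":   (set(),       {"price"},   lambda ns: ns.update(price=100)),
    "double": ({"price"},   {"doubled"}, lambda ns: ns.update(doubled=ns["price"] * 2)),
    "report": ({"doubled"}, {"label"},   lambda ns: ns.update(label=f"doubled price: {ns['doubled']}")),
    "banner": (set(),       {"title"},   lambda ns: ns.update(title="Demo")),
}

def build_graph(cells):
    """Map each cell to the set of cells that define a name it reads."""
    owner = {name: cell for cell, (_, defs, _) in cells.items() for name in defs}
    return {
        cell: {owner[name] for name in refs if name in owner}
        for cell, (refs, _, _) in cells.items()
    }

def descendants(graph, start):
    """Cells that depend on `start`, directly or transitively."""
    found, frontier = set(), [start]
    while frontier:
        current = frontier.pop()
        for cell, parents in graph.items():
            if current in parents and cell not in found:
                found.add(cell)
                frontier.append(cell)
    return found

def run_reactively(cells, namespace, changed):
    """Run `changed`, then every dependent cell, in dependency order."""
    graph = build_graph(cells)
    stale = {changed} | descendants(graph, changed)
    order = [c for c in TopologicalSorter(graph).static_order() if c in stale]
    for cell in order:
        cells[cell][2](namespace)
    return order

# Initial full run.
ns = {}
for cell in TopologicalSorter(build_graph(cells)).static_order():
    cells[cell][2](ns)

# Edit one cell, then let the dependents follow.
cells["load"] = (set(), {"price"}, lambda ns: ns.update(price=250))
rerun = run_reactively(cells, ns, "load")

print(rerun)        # ['load', 'double', 'report']
print(ns["label"])  # doubled price: 500
```

The `banner` cell is not rerun because nothing it reads has changed, which mirrors how a spreadsheet recalculates only the formulas downstream of an edited value.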
