You are reading the work-in-progress edition of Tidy Finance with Python. Code chunks and text might change over the next couple of months. We are always looking for feedback. Meanwhile, the complete R version of the book is also available.

Why does this book exist?

Our book Tidy Finance with R received great feedback from students and teachers alike. However, one of the most common pieces of feedback we received was that many interested coders are constrained to using Python at their institutions. We really love R for data analysis tasks, but we acknowledge the flexibility and popularity of Python. Hence, we decided to expand our team of authors with a Python expert and extend our original work following the same tidy principles.

Why Tidy?

As you start working with data, you quickly realize that you spend a lot of time reading, cleaning, and transforming your data. In fact, it is often said that more than 80% of the time spent on data analysis goes into preparing data. By tidying data, we aim to structure datasets in a way that facilitates further analyses. As Wickham (2014) puts it:

[T]idy datasets are all alike, but every messy dataset is messy in its own way. Tidy datasets provide a standardized way to link the structure of a dataset (its physical layout) with its semantics (its meaning).

In essence, tidy data follows these three principles:

  1. Every column is a variable.
  2. Every row is an observation.
  3. Every cell is a single value.

Throughout this book, we try to follow these principles as closely as possible. If you want to learn more about tidy data principles in an informal manner, we refer you to the tidy data vignette that is part of Wickham and Girlich (2022).
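To make the three principles concrete, here is a minimal sketch using pandas. The tickers `AAA` and `BBB`, the return values, and the column names `symbol` and `ret` are made up for illustration and do not come from the book's data. The wide table hides the variable "symbol" in its column names, and reshaping it with `melt` yields one row per observation.

```python
import pandas as pd

# Hypothetical "messy" layout: one column per ticker, so the ticker
# (a variable) is stored in the column names rather than in a column.
messy = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-02", "2023-01-03"]),
    "AAA": [0.010, -0.005],
    "BBB": [0.002, 0.007],
})

# Reshape to long format: every column is a variable (date, symbol, ret),
# every row is one (date, symbol) observation, every cell a single value.
tidy = messy.melt(id_vars="date", var_name="symbol", value_name="ret")

print(tidy)
```

In the long format, filtering by ticker or grouping by date works the same way no matter how many tickers the data contains.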

In addition to the data layer, there are also tidy coding principles outlined in the tidy tools manifesto that we try to follow:

  1. Reuse existing data structures.
  2. Compose simple functions with the pipe.
  3. Embrace functional programming.
  4. Design for humans.
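These coding principles were formulated for R, where the pipe is an operator. In Python, one way to approximate them is pandas method chaining together with `DataFrame.pipe`. The sketch below rests on that assumption: the helper `add_excess_return`, the risk-free rate of 0.001, and the sample data are hypothetical and only serve to illustrate the style.

```python
import pandas as pd


def add_excess_return(df, rf):
    """Return a copy of df with an excess-return column (hypothetical helper)."""
    return df.assign(ret_excess=df["ret"] - rf)


# Made-up returns in tidy (long) format.
returns = pd.DataFrame({
    "symbol": ["AAA", "AAA", "BBB", "BBB"],
    "ret": [0.010, -0.005, 0.002, 0.007],
})

# Each step takes a DataFrame and returns a new one, so the steps compose
# into a chain that reads top to bottom.
summary = (
    returns
    .pipe(add_excess_return, rf=0.001)
    .groupby("symbol", as_index=False)
    .agg(mean_excess=("ret_excess", "mean"))
)

print(summary)
```

The chain reuses the DataFrame as its only data structure, builds the result from small functions without side effects, and leaves `returns` itself unchanged.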

About the Authors


This book is licensed to you under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0). The code samples in this book are licensed under Creative Commons CC0 1.0 Universal (CC0 1.0), i.e., public domain. You can cite this work-in-progress version of the Python project as follows:

Frey, C., Scheuch, C., Voigt, S., & Weiss, P. (2023). Tidy Finance with Python.

  title = {Tidy Finance with Python},
  author = {Frey, Christoph and Scheuch, Christoph and Voigt, Stefan and Weiss, Patrick},
  year = {2023},
  edition = {work-in-progress},
  url = {}


Wickham, Hadley. 2014. “Tidy data.” Journal of Statistical Software 59 (1): 1–23.
Wickham, Hadley, and Maximilian Girlich. 2022. tidyr: Tidy messy data.