Deutsche Bahn Statistics is a (German) website with plots and tables about Deutsche Bahn, together with the Python code that creates them. The statistics are updated automatically every month and use publicly available data. The data is available in a separate repo: https://github.com/piebro/deutsche-bahn-data.
git clone https://github.com/piebro/deutsche-bahn-statistics
cd deutsche-bahn-statistics
uv sync
bash download_data.sh
uv run -m http.server
uv run run_all_calculations.py
New questions are added manually: copy the folder structure of a previous question. An overview of the dataset and its columns can be found in the data repository.
After adding a new question, update all the generated parts of index.html using the following script.
START_DATE=$(date -d "$(date +%Y-%m-01) -3 month" +%Y.%m.%d)
END_DATE=$(date +%Y.%m.01)
uv run generate_html_links.py $START_DATE $END_DATE "allgemein,zeitraum,direkter_zug,zugverbindung,verspaetungsverlauf_zugfahrt,verspaetung_pro_bahnhof,zuggattungen_pro_bahnhof,bahnhof,zuggattung"
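The `date -d` calls above rely on GNU date and may not work on systems that ship a different `date` (for example macOS). As a rough sketch, the same two values can be computed in Python. The function name `report_window` is hypothetical and not part of this repository; it only mirrors the shell logic: the first day of the current month minus three months as the start, and the first day of the current month as the end.

```python
from datetime import date


def report_window(today: date) -> tuple[str, str]:
    """Return (start, end) as YYYY.MM.DD strings, mirroring the shell snippet.

    Hypothetical helper for illustration; not part of the repository.
    """
    # First day of the current month, like `date +%Y.%m.01`.
    end = today.replace(day=1)
    # Count months since year 0 and step back three months.
    month_index = end.year * 12 + (end.month - 1) - 3
    start = date(month_index // 12, month_index % 12 + 1, 1)
    return start.strftime("%Y.%m.%d"), end.strftime("%Y.%m.%d")


if __name__ == "__main__":
    start_date, end_date = report_window(date.today())
    # Print the two values so they can be passed to generate_html_links.py.
    print(start_date, end_date)
```

The printed values can be passed as the first two arguments to generate_html_links.py in place of $START_DATE and $END_DATE.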
Use ruff check and ruff format to lint and format the Python code before committing new code.
There are a few other projects that look at similar data.
Contributions are welcome. Open an Issue if you want to report a bug, have an idea, or want to propose a change.
The website uses lightweight tracking with Plausible to see how many people visit. Anyone interested can look at these stats here: https://plausible.io/piebro.github.io%2Fdeutsche-bahn-statistics?period=30d. Only users without an AdBlocker are counted, so these statistics underestimate the actual number of visitors. I would guess that quite a few people visiting the site (including me) use an AdBlocker.
All code in this project is licensed under the MIT License. The data is licensed under Attribution 4.0 International (CC BY 4.0) by Deutsche Bahn.