Dr habil. Marek Gagolewski
I'm currently a Senior Lecturer in Applied AI/Data Science at Deakin University in Melbourne, Australia, and an Associate Professor (on leave) at the Systems Research Institute, Polish Academy of Sciences.
My research interests include machine learning, data aggregation and clustering, computational and applied statistics, and mathematical modelling (the science of science, sport, economics, social sciences, psychometrics, bibliometrics, etc.).
In my spare time, I write books for my students and develop free (libre) data analysis software.
Start here:
Open-access textbooks (non-profit, free, independent; please spread the news!):
- Minimalist Data Wrangling in Python (HTML) (PDF) (GitHub) (Amazon: AU CA DE ES FR IT JP NL PL SE UK US)
- Deep R Programming (HTML) (PDF) (GitHub) (draft)
and many more.
Python packages:
- genieclust – Fast and robust hierarchical clustering with noise point detection (GitHub) (PyPI) (paper)
- clustering-benchmarks – A framework for benchmarking clustering algorithms (GitHub) (PyPI) (paper)
R packages:
- stringi – Fast and portable character string processing in R (one of the most frequently downloaded R packages) (GitHub) (CRAN) (paper)
- genieclust – Fast and robust hierarchical clustering with noise point detection (GitHub) (CRAN) (paper)
- stringx – Drop-in replacements for base R string functions powered by stringi (GitHub) (CRAN)
- realtest – Where expectations meet reality: Realistic unit testing in R (GitHub) (CRAN)
- TurtleGraphics – Learn computer programming in R while having a jolly time! (GitHub) (CRAN)
Data:
- Clustering Benchmarks (framework, datasets, results)
- Teaching Resources

