Why This Book Exists
Many students arrive at graduate-level study with a strong dislike of statistics and, importantly, without the preparation modern research demands. The gap here isn’t only a statistical one. It’s also about the tools and workflows that make statistical work reliable, transparent, and reproducible.
Open-source tools such as R, Git, Markdown, LaTeX, and many others are now standard in many research settings. Students paying for a rigorous education should be introduced to these tools early. Yet many never learn that they exist, and those who do typically discover them by accident, since most programs barely mention them. Those who do encounter them often learn under pressure, quickly and inconsistently, because a deadline leaves them with no alternative. That path is stressful and prone to mistakes. It’s the route I and many others took, and it left me frustrated and resentful. How could an overpriced education fail so badly at preparing students for the practical realities of contemporary research?
These tools are not distractions from learning statistics; they support the process. They help students compute, document, verify, and communicate their work with far less friction than they would otherwise experience. In an era of unprecedented access to computing power, it makes little sense to train students as if that power, and the practices that harness it, barely matter. When we cling to older methods by default, we risk mistaking tradition for rigor and blaming students for problems our training choices create.
This book is an attempt to close that gap: to teach statistics alongside the open-source tools and workflows that students will actually use, so they can spend less time fighting the process and more time doing careful, credible research.
The primary tool used throughout this book is the R programming language, though we will introduce other tools (Markdown, Git, LaTeX) when appropriate. Over the long term, my goal is for this text to function as a practical, open-access guide that leads complete beginners step by step through both statistical thinking and statistical programming in a clear, thorough, and usable way. Please treat everything here as a work in progress: the book will evolve.
Errata
No manuscript is free from error. Should you unearth any mistakes, please file an issue on GitHub: