The Curriculum
44 units. One per day, fifteen minutes each. Ten weeks to statistical self-defence. Follow the sequence or jump to any unit from the reference library.
A million and a billion sound like cousins. They are not. Most numerical manipulation in public life depends on you not feeling the difference. This unit corrects that, permanently.
The vocabulary of proportion. Percentages, fractions, and ratios are the building blocks of almost every statistical claim — and the source of some of the most reliably effective manipulation. This unit covers the percentage point versus percentage change distinction, what 'X times more likely' actually means, and why every percentage needs a denominator.
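The percentage-point distinction can be made concrete in a few lines of Python. The rates below are invented for illustration, not taken from the unit:

```python
# Illustrative numbers: a rate rises from 10% to 12%.
# Is that "up 2%" or "up 20%"? Both, depending on which measure you quote.
old_rate, new_rate = 0.10, 0.12

percentage_point_change = (new_rate - old_rate) * 100    # 2 percentage points
percent_change = (new_rate - old_rate) / old_rate * 100  # a 20% relative increase

print(f"{percentage_point_change:.0f} percentage points")
print(f"{percent_change:.0f}% relative change")
```

A headline writer can pick whichever of the two numbers sounds more dramatic, which is exactly why the distinction matters.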
The word 'average' conceals three different calculations, each giving a different answer. This unit shows why mean and median diverge in skewed distributions — and how choosing between them is one of the most common tools of numerical manipulation.
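A quick sketch of the divergence, using made-up salary figures (not the unit's own example):

```python
import statistics

# Hypothetical salaries: nine modest incomes and one very large one,
# a right-skewed distribution.
salaries = [28_000] * 5 + [32_000] * 4 + [1_000_000]

print(statistics.mean(salaries))    # 126800: pulled far up by the outlier
print(statistics.median(salaries))  # 30000: barely notices the outlier
```

Whoever reports the 'average salary' here gets to choose between £126,800 and £30,000, both technically correct.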
A graph communicates through visual impression before you process the numbers. That impression can be engineered. Five tricks — the truncated y-axis, the compressed x-axis, area distortion, dual axes, and the missing baseline — account for the vast majority of graph manipulation you will encounter.
Before formal probability, you need an intuitive vocabulary for risk. This unit covers absolute risk, baselines, and why your gut feeling about risk is systematically, predictably wrong — and how that gets exploited.
Two legitimate interpretations of probability, both useful, often confused. The frequentist view treats probability as long-run frequency; the Bayesian view treats it as a degree of belief. The distinction matters every time you read a forecast, a poll, or a risk figure in public life.
Most people have a wrong model of what randomness looks like. Random sequences contain runs and clusters as a matter of course. Understanding this protects you from one of the most seductive errors in statistics: seeing a pattern in noise and inventing a cause for it.
The four formal rules that govern how probabilities combine — addition for exclusive events, addition for non-exclusive events, multiplication for independent events, and the complementary rule. Plus the specific way these rules get weaponised to mislead.
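All four rules can be verified by brute-force enumeration on two fair dice, a setup chosen here purely for illustration:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def p(event):
    """Exact probability of an event, as a fraction of the 36 outcomes."""
    return Fraction(sum(event(a, b) for a, b in outcomes), len(outcomes))

A = lambda a, b: a == 6   # first die shows six
B = lambda a, b: b == 6   # second die shows six

# Addition rule, exclusive events: P(sum is 2 or 12) = P(sum 2) + P(sum 12)
assert p(lambda a, b: a + b in (2, 12)) == \
       p(lambda a, b: a + b == 2) + p(lambda a, b: a + b == 12)

# Addition rule, non-exclusive events: P(A or B) = P(A) + P(B) - P(A and B)
assert p(lambda a, b: A(a, b) or B(a, b)) == \
       p(A) + p(B) - p(lambda a, b: A(a, b) and B(a, b))

# Multiplication rule, independent events: P(A and B) = P(A) * P(B)
assert p(lambda a, b: A(a, b) and B(a, b)) == p(A) * p(B)

# Complementary rule: P(at least one six) = 1 - P(no sixes)
assert p(lambda a, b: A(a, b) or B(a, b)) == \
       1 - p(lambda a, b: a != 6 and b != 6)
```

Using exact fractions rather than floats means each rule checks out with equality, not approximation.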
The probability of an event changes depending on what you already know. P(A|B) is not the same as P(B|A), and confusing the two has sent innocent people to prison and led doctors to badly misread test results. This unit gives you conditional probability, the tool that sits at the heart of Bayes' theorem.
What probability theory predicts will happen on average, over many trials. How to calculate it, when to use it, when it misleads you, and why the people selling you lottery tickets and insurance policies already know it and are counting on you not to.
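A hedged sketch with invented lottery numbers (a £2 ticket, a £1m jackpot at odds of 1 in 14 million):

```python
# Assumed figures, chosen for illustration only.
price = 2.0
jackpot = 1_000_000
p_win = 1 / 14_000_000

expected_winnings = p_win * jackpot          # about £0.07 per ticket
expected_value = expected_winnings - price   # about -£1.93 per ticket

print(round(expected_value, 2))
```

The seller's expected value is the mirror image of yours, which is why the game exists at all.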
Two data sets with identical averages can be entirely different in character. Variance and standard deviation measure how far outcomes scatter around the centre — and in finance, medicine, and public policy, ignoring spread is how people get badly misled.
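Two illustrative datasets (invented, not from the unit) with the same mean and wildly different character:

```python
import statistics

steady   = [49, 50, 50, 50, 51]
volatile = [0, 25, 50, 75, 100]

# Identical centres...
assert statistics.mean(steady) == statistics.mean(volatile) == 50

# ...entirely different spreads.
print(statistics.stdev(steady))    # about 0.7: outcomes hug the mean
print(statistics.stdev(volatile))  # about 39.5: outcomes scatter widely
```

A fund reporting only its average return is showing you the first line and hiding the second.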
The most important single idea in the curriculum, introduced intuitively. A positive test result for a rare disease is probably wrong, even when the test is 99% accurate. Understanding why requires prior probability, likelihood, and posterior belief — the three ingredients of Bayesian reasoning.
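The headline claim can be checked with natural frequencies. The prevalence here (1 in 1,000) is an assumption chosen for illustration, and '99% accurate' is read as 99% sensitivity and 99% specificity:

```python
# Imagine testing 100,000 people for a disease with 1-in-1,000 prevalence.
population = 100_000
sick = population // 1000        # 100 people actually have the disease
healthy = population - sick      # 99,900 do not

true_positives  = 0.99 * sick    # 99 sick people correctly test positive
false_positives = 0.01 * healthy # 999 healthy people falsely test positive

# Of everyone who tests positive, what fraction is actually sick?
ppv = true_positives / (true_positives + false_positives)
print(f"P(disease | positive) = {ppv:.1%}")  # about 9%
```

The false positives from the huge healthy majority swamp the true positives from the tiny sick minority, so a positive result is wrong roughly nine times out of ten.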
Mean and median were the start. This unit adds variance, standard deviation, percentiles, quartiles, the five-number summary, box plots, and weighted averages — the full toolkit for understanding what a dataset actually looks like, and for noticing when someone has shown you only part of it.
Why the bell curve appears everywhere, what it actually implies, and what happens when the people running your pension fund assume it holds when it does not. The 68-95-99.7 rule, z-scores, and the normality assumption that helped cause the 2008 financial crisis.
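The 68-95-99.7 rule drops straight out of the normal distribution's cumulative function, which the standard library can evaluate via the error function:

```python
import math

def within(z):
    """Probability a normal variate lands within z standard deviations
    of its mean: erf(z / sqrt(2))."""
    return math.erf(z / math.sqrt(2))

for z in (1, 2, 3):
    print(f"within {z} sd: {within(z):.1%}")
# within 1 sd: 68.3%
# within 2 sd: 95.4%
# within 3 sd: 99.7%
```

The rule is exact only for a true normal distribution; applied to fat-tailed financial returns it understates extreme events, which is the 2008 connection.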
Real data is rarely bell-shaped. Skew, fat tails, bimodal distributions, log-normal distributions, and power laws each produce systematic errors when treated as if they were normal. This unit shows what those errors look like and why they cost people money, elections, and lives.
Every study is a sample. The quality of the conclusion depends on the quality of the sample. Population vs sample, random sampling, convenience sampling, self-selection bias, and stratified sampling — with the 1936 Literary Digest disaster as the object lesson.
A thousand randomly chosen people can accurately represent sixty million. A million carefully selected people might tell you almost nothing. Sample size matters, but size alone is not the point. How the sample was gathered is what determines whether the number is meaningful.
Almost everyone who reads a confidence interval misreads it. The misreading is not a minor technicality. It is the difference between knowing something is uncertain and believing it is not.
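What '95% confidence' actually guarantees can be shown by simulation, with parameters invented for illustration: the 95% describes the long-run behaviour of the interval-building procedure, not the chance that any single published interval contains the truth.

```python
import random
import statistics

# Assumed setup: a population with true mean 100 and sd 15; we draw
# samples of 50 and build a standard 95% interval from each.
random.seed(0)
true_mean, sd, n, trials = 100, 15, 50, 2000

covered = 0
for _ in range(trials):
    sample = [random.gauss(true_mean, sd) for _ in range(n)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    # Does this interval happen to contain the true mean?
    if m - 1.96 * se <= true_mean <= m + 1.96 * se:
        covered += 1

print(covered / trials)  # close to 0.95
```

Roughly 95% of the intervals capture the true mean, but any individual interval either contains it or does not; there is no 95% left once the data is in.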
The p-value is the most cited and most misunderstood number in public science. Almost everyone who uses it is using it wrong, including many scientists. Here is what it actually means, why the 0.05 threshold is completely arbitrary, and why a result can be both statistically significant and completely pointless.
Every statistical test can fail in exactly two ways. Which failure you find more tolerable is not a technical question. It is a values question, and the answer determines how the system treats you.
The complete null-hypothesis significance testing procedure, assembled from confidence intervals, p-values, and error types. One-tailed versus two-tailed tests, degrees of freedom, multiple comparisons, and the one trick that inflates false discoveries without anyone technically cheating.
The two dominant frameworks in statistics answer completely different questions. Null-hypothesis testing tells you how surprising your data would be if nothing were going on. Bayesian inference tells you how probable your hypothesis is, given what you now know. These are not the same thing, and the difference matters every time a drug is approved, a trial proceeds to verdict, or a screening programme is designed.
Most published scientific findings do not replicate. This is not a failure of individual scientists — it is a structural consequence of how science is conducted and reported. Here is what it means, why it happened, and how to read a single study properly.
The single most important unit in the curriculum. The most common trick in medicine, policy, and advertising. A treatment that reduces risk from 2% to 1% is '50% effective' — but reduces your absolute risk by one percentage point.
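Worked through with the blurb's own numbers:

```python
# Risk falls from 2% to 1% under treatment.
baseline, treated = 0.02, 0.01

relative_risk_reduction = (baseline - treated) / baseline  # 50%: the headline
absolute_risk_reduction = baseline - treated               # 1 percentage point
number_needed_to_treat  = 1 / absolute_risk_reduction      # 100 treated per event avoided

print(f"RRR = {relative_risk_reduction:.0%}")
print(f"ARR = {absolute_risk_reduction:.1%}")
print(f"NNT = {number_needed_to_treat:.0f}")
```

Both numbers describe the same trial; only one of them sells the drug.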
Any dataset contains patterns. The selection of which patterns to present is a powerful tool. This unit covers cherry-picked time periods, subgroup analyses, and selective citation of studies.
Every trick in the visual statistics arsenal: the truncated axis, the inverted axis, the dual axis, the area distortion, and the missing uncertainty band. With documented real-world examples from governments, media, and financial advertising.
We only see the planes that came back. Abraham Wald noticed this in 1943 and saved bomber crews' lives. The same blind spot distorts investment returns, entrepreneurship mythology, and medical evidence — wherever failures disappear from the data before you see it.
Extreme measurements are followed by less extreme ones — not because anything changed, but because that is how randomness works. Misattributing this statistical law to an intervention is one of the most common errors in medicine, management, and sport.
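A minimal simulation, with assumed parameters, shows the effect with no intervention anywhere in sight: each person's score is a stable skill plus fresh noise, and the top scorers on one test fall back toward the group mean on the next.

```python
import random
import statistics

random.seed(42)

# Assumed model: skill ~ N(100, 10), plus independent noise ~ N(0, 10)
# on each sitting of the test.
skills = [random.gauss(100, 10) for _ in range(10_000)]
test1 = [s + random.gauss(0, 10) for s in skills]
test2 = [s + random.gauss(0, 10) for s in skills]

# Select the 500 highest scorers on test 1 (top 5%).
top = sorted(range(len(test1)), key=lambda i: test1[i], reverse=True)[:500]

mean_then = statistics.mean(test1[i] for i in top)
mean_now  = statistics.mean(test2[i] for i in top)
print(round(mean_then, 1), round(mean_now, 1))  # test-2 mean falls back toward 100
```

Nothing happened to these people between tests; their first scores were simply skill plus lucky noise, and the luck does not repeat.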
The politics of who gets studied and who does not. WEIRD psychology populations, male-dominant clinical trials, the Tuskegee legacy, volunteer bias, and attrition bias — and how conclusions drawn from one group get quietly applied to everyone else.
The most frequently stated statistical fact and the least frequently applied one. Three structural reasons why correlation does not imply causation — confounding, reverse causation, and chance — with the Bradford Hill criteria as a framework for when it might.
Using statistics about groups to draw conclusions about individuals is a structural error with serious real-world consequences — in criminal profiling, public health policy, and cross-national research. Here is how to recognise it and its mirror image.
Confusing P(evidence|innocent) with P(innocent|evidence). This error has contributed to wrongful convictions. The Sally Clark case, DNA database searches, and how to spot the tell in any courtroom probabilistic argument.
Statistical significance can be manufactured without fraud, simply by making enough analysis choices until the right number appears. This is p-hacking, and it is built into how academic publishing works. Understanding it changes how you read every headline that says 'scientists find'.
The studies that get published are not a random sample of all studies conducted. Positive findings are published; negative findings are buried, often by the people funding the research. The result is a scientific literature systematically tilted toward positive conclusions — and a public making decisions based on a distorted picture.
The synthesis unit. NNT, NNH, absolute vs relative risk, natural frequencies, icon arrays, sensitivity, specificity, PPV, NPV, and how PPV collapses when you screen a low-prevalence population. The HIV test worked through in full. The mammography debate as a case study in honest versus dishonest risk communication.
We judge the probability of an event by how easily examples come to mind. Vivid, recent, or emotionally charged events are overweighted. This one cognitive shortcut shapes public policy, insurance markets, and personal decisions in ways that have nothing to do with actual risk.
The most consequential cognitive error in everyday probabilistic reasoning. When a specific piece of evidence arrives, the mind discards the background frequency of the event in the population. The cab problem shows you exactly how this happens, and Bayes' theorem shows you exactly how to stop it.
People consistently judge a specific, detailed scenario as more probable than a broader one that contains it. This violates a foundational rule of probability. Understanding why it happens, and what it looks like in the wild, is the protection.
Past random events are taken to influence future random events, even when there is no causal mechanism connecting them. The casino industry is built, in part, on this error. Understanding it — really understanding it, not just nodding at it — changes how you read any streak.
The belief that a player on a hot streak will keep performing well was declared a cognitive illusion in 1985. Then in 2018 two economists found a flaw in the original analysis. The hot hand may be real after all. This unit covers what changed, why it matters, and how it connects to a deeper point about dependent processes.
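The 2018 correction (Miller and Sanjurjo) can be reproduced in miniature. This is a sketch of the selection effect at the heart of their argument, not their full analysis: flip a fair coin four times, and within each sequence compute the proportion of heads among flips that immediately follow a head. Averaging that proportion across sequences gives noticeably less than 50%, so the 1985 method would have called a perfectly fair coin 'cold-handed'.

```python
import random

random.seed(1)
proportions = []
for _ in range(100_000):
    flips = [random.random() < 0.5 for _ in range(4)]
    # Outcomes of the flips that immediately follow a head.
    after_head = [flips[i + 1] for i in range(3) if flips[i]]
    if after_head:  # sequences with no head in the first three flips drop out
        proportions.append(sum(after_head) / len(after_head))

print(round(sum(proportions) / len(proportions), 3))  # about 0.40, not 0.50
```

Conditioning on 'a head just occurred' inside short sequences quietly selects against further heads; the original study compared players against 50% when the fair benchmark was lower, which is how a real hot hand could hide for thirty years.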
The first number you encounter shapes every estimate that follows, even when you know that number was chosen at random. Anchoring operates in salary negotiations, retail pricing, and courtroom sentencing. Understanding it changes how you read any situation where someone else names a number first.
People's confidence in their own judgements reliably exceeds their accuracy. This is not a personality flaw — it is a systematic feature of how human minds form predictions. Understanding it, and the correctives that actually work, is the difference between someone whose plans routinely collapse and someone who consistently delivers.
Judgements of probability are replaced by judgements of similarity to a prototype. The conjunction fallacy, base rate neglect, and the small sample fallacy are all the same mistake wearing different clothes. When you understand the single mechanism behind all three, you become much harder to fool.
The final unit. Bayes' theorem is not just a calculation — it is a description of how a rational mind should work. Every cognitive failure in this area is, at bottom, a failure to apply Bayes. Calibration. Superforecasting. Prediction markets. And the three questions that, if you ask them habitually, will change how you think.