Episodes 44

1

What Is Statistics

January 24, 2018, 13m

Welcome to Crash Course Statistics! In this series we're going to take a look at the important role statistics play in our everyday lives, because statistics are everywhere! Statistics help us better understand the world and make decisions from what you'll wear tomorrow to government policy. But in the wrong hands, statistics can be used to misinform. So we're going to try to do two things in this series. Help show you the usefulness of statistics, but also help you become a more informed consumer of statistics. From probabilities, paradoxes, and p-values there's a lot to cover in this series, and there will be some math, but we promise only when it's most important. But first, we should talk about what statistics actually are, and what we can do with them. Statistics are tools, but they can't give us all the answers.


2

Mathematical Thinking

January 31, 2018, 11m

Today we’re going to talk about numeracy - that is understanding numbers. From really really big numbers to really small numbers, it's difficult to comprehend information at this scale, but these are often the types of numbers we see most in statistics. So understanding how these numbers work, how to best visualize them, and how they affect our world can help us become better decision makers - from deciding if we should really worry about Ebola to helping improve fighter jets during World War II!


Today we’re going to talk about measures of central tendency - those are the numbers that tend to hang out in the middle of our data: the mean, the median, and mode. All of these numbers can be called “averages” and they’re the numbers we tend to see most often - whether it’s in politics when talking about polling or income equality to batting averages in baseball (and cricket) and Amazon reviews. Averages are everywhere so today we’re going to discuss how these measures differ, how their relationship with one another can tell us a lot about the underlying data, and how they are sometimes used to mislead.
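As a rough illustration of how the three averages can disagree, here is a minimal sketch using Python's standard `statistics` module. The review ratings are made up for this example and do not come from the episode.

```python
from statistics import mean, median, mode

# Hypothetical 1-5 star review ratings (invented for illustration).
# Two very low scores pull the mean below the median and mode.
ratings = [5, 5, 5, 4, 5, 1, 1, 5, 4, 5]

summary = {
    "mean": mean(ratings),      # arithmetic average of all ratings
    "median": median(ratings),  # middle value once the ratings are sorted
    "mode": mode(ratings),      # most frequently occurring rating
}

print(summary)
```

When the mean sits noticeably below the median, as it does here, that gap hints at a few unusually low values in the data.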


4

Measures of Spread

February 14, 2018, 11m

Today, we're looking at measures of spread, or dispersion, which we use to understand how well medians and means represent the data, and how reliable our conclusions are. They can help understand test scores, income inequality, spot stock bubbles, and plan gambling junkets. They're pretty useful, and now you're going to know how to calculate them!


5

Today we're going to start our two-part unit on data visualization. Up to this point we've discussed raw data - which are just numbers - but usually it's much more useful to represent this information with charts and graphs. There are two types of data we encounter, categorical and quantitative data, and they likewise require different types of visualizations. Today we'll focus on bar charts, pie charts, pictographs, and histograms and show you what they can and cannot tell us about their underlying data as well as some of the ways they can be misused to misinform.


Today we’re going to finish up our unit on data visualization by taking a closer look at how dot plots, box plots, and stem and leaf plots represent data. We’ll also talk about the rules we can use to identify outliers and apply our new data viz skills by taking a closer look at how Justin Timberlake’s song lyrics have changed since he went solo.


7

The Shape of Data: Distributions

March 7, 2018, 11m

When collecting data to make observations about the world it usually just isn't possible to collect ALL THE DATA. So instead of asking every single person about student loan debt for instance we take a sample of the population, and then use the shape of our samples to make inferences about the true underlying distribution of our data. It turns out we can learn a lot about how something occurs, even if we don't know the underlying process that causes it. Today, we’ll also introduce the normal (or bell) curve and talk about how we can learn some really useful things from a sample's shape - like if an exam was particularly difficult, how often Old Faithful erupts, or if there are two types of runners that participate in marathons!


8

Correlation Doesn’t Equal Causation

March 14, 2018, 12m

Today we’re going to talk about data relationships and what we can learn from them. We’ll focus on correlation, which is a measure of how two variables move together, and we’ll also introduce some useful statistical terms you’ve probably heard of like regression coefficient, correlation coefficient (r), and r^2. But first, we’ll need to introduce a useful way to represent bivariate continuous data - the scatter plot. The scatter plot has been called “the most useful invention in the history of statistical graphics” but that doesn’t necessarily mean it can tell us everything. Just because two data sets move together doesn’t necessarily mean one CAUSES the other. This gives us one of the most important tenets of statistics: correlation does not imply causation.


9

Controlled Experiments

March 21, 2018, 12m

We may be living IN a simulation (according to Elon Musk and many others), but that doesn't mean we don't need to perform simulations ourselves. Today, we're going to talk about good experimental design and how we can create controlled experiments to minimize bias when collecting data. We'll also talk about single and double blind studies, randomized block design, and how placebos work.


10

Sampling Methods and Bias with Surveys

March 28, 2018, 11m

Today we’re going to talk about good and bad surveys. Surveys are everywhere, from user feedback surveys to telephone polls, and those questionnaires at your doctor's office. Still, although they're easy to create and distribute, they're also susceptible to bias and error. So today we’re going to talk about identifying good and bad survey questions, and how groups (or samples) are selected to represent the entire population since it's often just not feasible to ask everyone.


11

Science Journalism

April 11, 2018, 10m

We’ve talked a lot in this series about how often you see data and statistics in the news and on social media - which is ALL THE TIME! But how do you know who and what you can trust? Today, we’re going to talk about how we, as consumers, can spot flawed studies, sensationalized articles, and just plain poor reporting. And this isn’t to say that all science articles you read on Facebook or in magazines are wrong, but that it's valuable to read those catchy headlines with some skepticism.


Today we’re going to talk about ethical data collection. From the Tuskegee syphilis experiments and Henrietta Lacks’ HeLa cells to the horrifying experiments performed at Nazi concentration camps, many strides have been made from Institutional Review Boards (or IRBs) to the Nuremberg Code to guarantee voluntariness, informed consent, and beneficence in modern statistical gathering. But as we’ll discuss, with the complexities of research in the digital age many new ethical questions arise.


13

Probability Part 1: Rules and Patterns

April 25, 2018, 12m

Today we’re going to begin our discussion of probability. We’ll talk about how the addition (OR) rule, the multiplication (AND) rule, and conditional probabilities help us figure out the likelihood of sequences of events happening - from optimizing your chances of having a great night out with friends to seeing Cole Sprouse at IHOP!


Today we're going to introduce Bayesian statistics and discuss how this new approach to statistics has revolutionized the field from artificial intelligence and clinical trials to how your computer filters spam! We'll also discuss the Law of Large Numbers and how we can use simulations to help us better understand the "rules" of our data, even if we don't know the equations that define those rules.


15

The Binomial Distribution

May 9, 2018, 14m

Today we're going to discuss the Binomial Distribution and a special case of this distribution known as a Bernoulli Distribution. The formulas that define these distributions provide us with shortcuts for calculating the probabilities of all kinds of events that happen in everyday life. They can also be used to help us look at how probabilities are connected! For instance, knowing the chance of getting a flat tire today is useful, but knowing the likelihood of getting one this year, or in the next five years, may be more useful. And heads up, this episode is going to have a lot more equations than normal, but to sweeten the deal, we added zombies!
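The flat-tire question can be sketched with the binomial formula. This is only an illustration: the daily flat-tire probability of 0.001 and the assumption that days are independent are hypothetical choices, not figures from the episode.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with success probability p. With n = 1 this is a Bernoulli trial."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least_one(n, p):
    """Probability of one or more successes in n trials: 1 - P(zero successes)."""
    return 1 - binomial_pmf(0, n, p)

# Hypothetical chance of a flat tire on any single day (an assumption).
daily_p = 0.001

print("today:      ", prob_at_least_one(1, daily_p))
print("this year:  ", prob_at_least_one(365, daily_p))
print("five years: ", prob_at_least_one(5 * 365, daily_p))
```

Under these assumptions a small daily risk compounds into a much larger risk over longer windows, which is the connection between probabilities that the description refers to.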


Geometric probabilities, and probabilities in general, allow us to guess how long we'll have to wait for something to happen. Today, we'll discuss how they can be used to figure out how many Bertie Bott's Every Flavour Beans you could eat before getting the dreaded vomit flavored bean, and how they can help us make decisions when there is a little uncertainty - like getting a Pikachu in a pack of Pokémon Cards! We'll finish off this unit on probability by taking a closer look at the Birthday Paradox (or birthday problem) which asks the question: how many people do you think need to be in a room for there to likely be a shared birthday? (It's likely much fewer than you would expect!)
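A minimal sketch of both ideas follows. It assumes 365 equally likely birthdays (ignoring leap years and seasonal patterns), and the bean probability is a made-up value for illustration only.

```python
def geometric_pmf(k, p):
    """Probability that the first success happens on trial k (k = 1, 2, ...),
    when each independent trial succeeds with probability p."""
    return (1 - p) ** (k - 1) * p

def prob_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday,
    assuming every day is equally likely."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

def smallest_group(threshold=0.5, days=365):
    """Smallest group size whose shared-birthday probability exceeds threshold."""
    n = 1
    while prob_shared_birthday(n, days) <= threshold:
        n += 1
    return n

# Hypothetical: 1 in 20 beans is the dreaded flavor (an assumption).
print("first bad bean on the 5th try:", geometric_pmf(5, 0.05))
print("group size for a likely shared birthday:", smallest_group())
```

Under the equal-likelihood assumption, the smallest group with a better-than-even chance of a shared birthday is 23 people.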


17

Randomness

May 23, 2018, 12m

There are a lot of events in life that we just can’t predict, but just because something is random doesn’t mean we don’t know or can’t learn anything about it. Today, we’re going to talk about how we can extract information from seemingly random events starting with the expected value or mean of a distribution and walking through the first four “moments” - the mean, variance, skewness, and kurtosis.


18

Z-Scores and Percentiles

May 30, 2018, 10m

Today we’re going to talk about how we compare things that aren’t exactly the same - or aren’t measured in the same way. For example, you might want to know whether a 1200 on the SAT is better than a 25 on the ACT. For this, we need to standardize our data using z-scores - which allow us to make comparisons between two sets of data as long as they’re normally distributed. We’ll also talk about converting these scores to percentiles and discuss how percentiles, though valuable, don’t actually tell us how “extreme” our data really is.
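The SAT versus ACT comparison can be sketched as below. The means and standard deviations are hypothetical round numbers chosen for illustration, not official test statistics, and the percentile step assumes normally distributed scores.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """Number of standard deviations a value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

def percentile(z):
    """Fraction of a standard normal distribution that falls below z."""
    return NormalDist().cdf(z)

# Hypothetical population parameters (assumptions for this sketch only).
SAT_MEAN, SAT_SD = 1000, 200
ACT_MEAN, ACT_SD = 21, 5

z_sat = z_score(1200, SAT_MEAN, SAT_SD)
z_act = z_score(25, ACT_MEAN, ACT_SD)

print(f"SAT 1200: z = {z_sat:.2f}, percentile = {percentile(z_sat):.1%}")
print(f"ACT 25:   z = {z_act:.2f}, percentile = {percentile(z_act):.1%}")
```

With these assumed parameters the SAT score has the larger z-score, so it would be the stronger result; different parameters could reverse that conclusion.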


19

The Normal Distribution

June 6, 2018, 11m

Today is the day we finally talk about the normal distribution! The normal distribution is incredibly important in statistics because distributions of means are normally distributed even if populations aren't. We'll get into why this is so - due to the Central Limit Theorem - but it's useful because it allows us to make comparisons between different groups even if we don't know the underlying distribution of the population being studied.


20

Confidence Intervals

June 13, 2018, 13m

Today we’re going to talk about confidence intervals. Confidence intervals allow us to quantify our uncertainty, by allowing us to define a range of values for our predictions and assigning a likelihood that something falls within that range. Confidence intervals come up a lot: in the delivery windows you get for packages, in the margins of error pollsters cite during elections, and in everyday decisions, where we use them instinctively. But confidence intervals also demonstrate the tradeoff of accuracy for precision - the greater our confidence, usually the less useful our range.


21

How P-Values Help Us Test Hypotheses

June 27, 2018, 11m

Today we're going to begin our three-part unit on p-values. In this episode we'll talk about Null Hypothesis Significance Testing (or NHST) which is a framework for comparing two sets of information. In NHST we assume that there is no difference between the two things we are observing and use our p-value as a predetermined cutoff for whether something seems sufficiently rare to allow us to reject that these two observations are the same. This p-value tells us if something is statistically significant, but as you'll see that doesn't necessarily mean the information is significant or meaningful to you.


22

P-Value Problems

July 11, 2018, 12m

Last week we introduced p-values as a way to set a predetermined cutoff when testing if something seems unusual enough to reject our null hypothesis - that they are the same. But today we’re going to discuss some problems with the logic of p-values, how they are commonly misinterpreted, how p-values don’t give us exactly what we want to know, and how that cutoff is arbitrary - and arguably not stringent enough in some scenarios.


23

Playing with Power: P-Values Pt 3

July 18, 2018, 12m

We're going to finish up our discussion of p-values by taking a closer look at how they can get it wrong, and what we can do to minimize those errors. We'll discuss Type 1 (when we think we've detected an effect, but there actually isn't one) and Type 2 (when there was an effect we didn't see) errors and introduce statistical power - which tells us the chance of detecting an effect if there is one.


24

You Know I’m All About that Bayes

July 25, 2018, 12m

Today we’re going to talk about Bayes Theorem and Bayesian hypothesis testing. Bayesian methods like these are different from how we've been approaching statistics so far, because they allow us to update our beliefs as we gather new information - which is how we tend to think naturally about the world. And this can be a really powerful tool, since it allows us to incorporate both scientifically rigorous data AND our previous biases into our evolving opinions.


25

Bayes in Science and Everyday Life

August 1, 2018, 11m

Today we're going to finish up our discussion of Bayesian inference by showing you how it can be used for continuous data sets and be applied both in science and everyday life. From A/B testing of websites and getting a better understanding of psychological disorders to helping with language translation and purchase recommendations, Bayesian statistics really are being used everywhere!


26

Test Statistics

August 8, 2018, 12m

Test statistics allow us to quantify how close things are to our expectations or theories. Instead of going on our gut feelings, they allow us to add a little mathematical rigor when asking the question: “Is this random… or real?” Today, we’ll introduce some examples using both t-tests and z-tests and explain how critical values and p-values are different ways of telling us the same information. We’ll get to some other test statistics like F tests and chi-square in a future episode.


27

T-Tests: A Matched Pair Made in Heaven

August 15, 2018, 11m

Today we're going to walk through a couple of statistical approaches to answer the question: "is coffee from the local cafe, Caf-fiend, better than that other cafe, The Blend Den?" We'll build a two sample t-test which will tell us how many standard errors away from the mean our observed difference is in our tasting experiment, and then we'll introduce matched pair t-tests, which allow us to remove variation in the experiment. All of these approaches rely on the test statistic framework we introduced last episode.


28

Degrees of Freedom and Effect Sizes

August 22, 2018, 13m

Today we're going to talk about degrees of freedom - which are the number of independent pieces of information that make up our models. More degrees of freedom typically mean more concrete results. But something that is statistically significant isn't always practically significant. And to measure that, we'll introduce another new concept - effect size.


29

Chi-Square Tests

August 29, 2018, 11m

Today we're going to talk about Chi-Square Tests - which allow us to measure differences in strictly categorical data like hair color, dog breed, or academic degree. We'll cover the three main Chi-Square tests: goodness of fit test, test of independence, and test of homogeneity, and explain how we can use each of these tests to make comparisons.


30

P-Hacking

September 5, 2018, 11m

Today we're going to talk about p-hacking (also called data dredging or data fishing). P-hacking is when data is analyzed to find patterns that produce statistically significant results, even if there really isn't an underlying effect, and it has become a huge problem in science since many scientific theories rely on p-values as proof of their existence! Today, we're going to talk about a few ways researchers have "hacked" their data, and give you some tips for identifying and avoiding these types of problems when you encounter stats in your own lives.


31

The Replication Crisis

September 26, 2018, 14m

Replication (re-running studies to confirm results) and reproducibility (the ability to repeat an analysis on data) have come under fire over the past few years. The foundation of science itself is built upon statistical analysis and yet there has been more and more evidence that suggests possibly even the majority of studies cannot be replicated. This "replication crisis" is likely being caused by a number of factors which we'll discuss as well as some of the proposed solutions to ensure that the results we're drawing from scientific studies are reliable.


32

Regression

October 3, 2018, 12m

Today we're going to introduce one of the most flexible statistical tools - the General Linear Model (or GLM). GLMs allow us to create many different models to help describe the world - you see them a lot in science, economics, and politics. Today we're going to build a hypothetical model to look at the relationship between likes and comments on a trending YouTube video using the Regression Model. We'll be introducing other popular models over the next few episodes.


33

ANOVA

October 10, 2018, 13m

Today we're going to continue our discussion of statistical models by showing how we can find if there are differences between multiple groups using a collection of models called ANOVA. ANOVA, which stands for Analysis of Variance, is similar to regression (which we discussed in episode 32), but allows us to compare three or more groups for statistical significance.


34

Do you think a red minivan would be more expensive than a beige one? Now what if the car was something sportier like a Corvette? Last week we introduced the ANOVA model which allows us to compare measurements of more than two groups, and today we’re going to show you how it can be applied to look at data that belong to multiple groups that overlap and interact. Most things after all can be grouped in many different ways - like a car has a make, model, and color - so if we wanted to try to predict the price of a car, it’d be especially helpful to know how those different variables interact with one another.


35

Fitting Models Is like Tetris

October 24, 2018, 11m

Today we're going to wrap up our discussion of General Linear Models (or GLMs) by taking a closer look at two final common models: ANCOVA (Analysis of Covariance) and RMA (Repeated Measures ANOVA). We'll show you how additional variables, known as covariates, can be used to reduce error, and show you how to tell if there's a difference between 2 or more groups or conditions. Between Regression, ANOVA, ANCOVA, and RMA you should have the tools necessary to better analyze both categorical and continuous data.


36

Supervised Machine Learning

October 31, 2018, 11m

We've talked a lot about modeling data and making inferences about it, but today we're going to look towards the future at how machine learning is being used to build models to predict future outcomes. We'll discuss three popular types of supervised machine learning models: Logistic Regression, Linear Discriminant Analysis (or LDA) and K Nearest Neighbors (or KNN). For a broader overview of machine learning, check out our episode in Crash Course Computer Science!


37

Unsupervised Machine Learning

November 7, 2018, 10m

Today we're going to discuss how machine learning can be used to group and label information even if those labels don't exist. We'll explore two types of clustering used in Unsupervised Machine Learning: k-means and Hierarchical clustering, and show how they can be used in many ways - from book suggestions and medical interventions, to giving people better deals on pizza!


38

Intro to Big Data

November 14, 2018, 11m

Today, we're going to begin our discussion of Big Data. Everything from which videos we click (and how long we watch them) on YouTube to our likes on Facebook says a lot about us - and increasingly sophisticated algorithms are being designed to learn about us from our clicks and not-clicks. Today we're going to focus on some ways Big Data impacts our lives from what liking Hello Kitty says about us to how Netflix chooses just the right thumbnail to encourage us to watch more content. But Big Data isn't necessarily a good thing; next week we're going to discuss some of the problems that arise from collecting all that data.


39

Big Data Problems

November 21, 2018, 12m

There is a lot of excitement around the field of Big Data, but today we want to take a moment to look at some of the problems it creates. From questions of bias and transparency to privacy and security concerns, there is still a lot to be done to manage these problems as Big Data plays a bigger role in our lives.


40

Statistics in the Courts

November 28, 2018, 11m

As we near the end of the series, we're going to look at how statistics impacts our lives. Today, we're going to discuss how statistics is often used and misused in the courtroom. We're going to focus on three stories in which three huge statistical errors were made: the handwriting analysis of French officer Alfred Dreyfus in 1894, the murder charges of mother Sally Clark in 1998, and the expulsion of student Jonathan Dorfman from UC San Diego in 2011.


41

Neural Networks

December 12, 2018, 12m

Today we're going to talk big picture about what Neural Networks are and how they work. Neural Networks, which are computer models that act like neurons in the human brain, are really popular right now - they're being used in everything from self-driving cars and Snapchat filters to even creating original art! As data gets bigger and bigger neural networks will likely play an increasingly important role in helping us make sense of all that data.


42

War

December 19, 2018, 11m

Today we're going to discuss the role of statistics during war. From helping the Allies break Nazi Enigma codes and estimate tank production rates to finding sunken submarines, statistics have played and continue to play a critical role on the battlefield.


43

When Predictions Fail

January 2, 2019, 10m

Today we’re going to talk about why many predictions fail - specifically we’ll take a look at the 2008 financial crisis, the 2016 U.S. presidential election, and earthquake prediction in general. From inaccurate or just too little data to biased models and polling errors, knowing when and why we make inaccurate predictions can help us make better ones in the future. And even knowing what we can’t predict can help us make better decisions too.


44

When Predictions Succeed

Season Finale
January 9, 2019, 11m

In our series finale, we're going to take a look at some of the times we've used statistics to gaze into our crystal ball, and actually got it right! We'll talk about how stores know what we want to buy (which can sometimes be a good thing), how baseball was changed forever when Paul DePodesta created a record-winning Oakland A's baseball team, and how statistics keeps us safe with the incredible strides we've made in weather forecasting. Statistics are everywhere, and even if you don't remember all the formulae and graphs we've thrown at you in this series, we hope you take with you a better appreciation of the many ways statistics impacts your life, and hopefully we've given you a more math-y perspective on how the world works. Thanks so much for watching! DFTBAQ!
