Can We Trust the Estimates?
Standard errors, t-tests, and confidence intervals
If we collected new data, the line would shift. How much can we trust our estimate β̂₁?
Here's how we'll figure out if the slope is trustworthy or just noise.
How much would the slope change if we collected new data?
Is the slope far enough from zero to be meaningful?
What's the probability this result happened by chance?
What range of slopes is plausible?
Collect a fresh dataset, refit the line, and repeat 500 times. The resulting slopes form a distribution — the sampling distribution.
Distribution of Estimated Slopes (500 samples)
The spread of that distribution is the standard error.
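We can sketch this experiment in code. The data-generating process below (a true slope of 375, noise, and 15 houses per sample) is made up for illustration — it is not the chapter's actual dataset — but it shows how refitting the line on fresh data produces a distribution of slopes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data-generating process: price = 40000 + 375 * size + noise.
# The true slope (375) and noise level are illustrative assumptions.
true_intercept, true_slope, noise_sd = 40_000, 375, 28_000
sizes = rng.uniform(800, 2600, size=15)  # 15 houses per sample

slopes = []
for _ in range(500):
    # Fresh noise each run = a fresh "dataset"; refit the line every time.
    prices = true_intercept + true_slope * sizes + rng.normal(0, noise_sd, sizes.size)
    slope, intercept = np.polyfit(sizes, prices, 1)  # least-squares fit
    slopes.append(slope)

slopes = np.array(slopes)
print(f"mean slope  : {slopes.mean():.1f}")   # centers near the true slope
print(f"spread (SE) : {slopes.std(ddof=1):.1f}")
```

The standard deviation printed at the end is exactly the spread the next slide names: the standard error.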
We just saw that slopes vary across samples. Ideally, we'd measure that spread by running hundreds of experiments — but in real life, we only have one dataset.
So statisticians developed a formula that estimates the spread from a single sample. That estimate is the standard error (SE).
Sampling Distribution with Standard Error
What's MSE? It stands for Mean Squared Error — the average size of the squared residuals. It's the SSE from Chapter 4, divided by n − 2 (degrees of freedom): MSE = SSE / (n − 2).
The SE formula has two ingredients: MSE (how noisy the data is) in the numerator, and ∑(xi − x̄)² (the spread of x-values) in the denominator: SE(β̂₁) = √(MSE / ∑(xi − x̄)²). More noise → bigger SE. More spread in x → smaller SE.
Computed from one sample using the formula above. This is what you'd use in practice — you only have one dataset.
The actual standard deviation of slopes across 500 samples. This is the "true" spread, but requires repeating the experiment many times.
Run the simulations above to see how the formula-based SE compares to the actual spread of slopes across many samples.
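Here is the formula applied to a single sample. The dataset is a small made-up example (15 houses with assumed sizes, prices, and noise), not the chapter's data:

```python
import numpy as np

rng = np.random.default_rng(1)

# One hypothetical sample of 15 houses; numbers are illustrative assumptions.
x = rng.uniform(800, 2600, size=15)               # house sizes (sqft)
y = 40_000 + 375 * x + rng.normal(0, 28_000, 15)  # prices ($) with noise

# Fit the line, then apply the SE formula to this one sample.
b1, b0 = np.polyfit(x, y, 1)
residuals = y - (b0 + b1 * x)

n = x.size
mse = np.sum(residuals**2) / (n - 2)              # SSE / degrees of freedom
se_b1 = np.sqrt(mse / np.sum((x - x.mean())**2))  # formula-based SE

print(f"slope estimate : {b1:.1f}")
print(f"formula SE     : {se_b1:.2f}")
```

Computed this way from a single dataset, the formula-based SE should land close to the empirical standard deviation of slopes you would get by repeating the experiment hundreds of times.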
The t-statistic: how many standard errors is β̂₁ away from zero?
The p-value: if the true slope were 0, how likely is a t-stat this extreme?
The 95% confidence interval: we're 95% confident the true slope is in this range.
More data → smaller standard error → tighter confidence interval.
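All three quantities follow mechanically from the slope estimate and its SE. A sketch using SciPy's t-distribution, with the estimate, SE, and sample size (n = 15) taken as given inputs:

```python
from scipy import stats

# Slope estimate and SE from a regression with n = 15 houses (given inputs).
n = 15
b1, se_b1 = 375.1, 46.89
df = n - 2  # degrees of freedom

t_stat = b1 / se_b1                              # standard errors from zero
p_value = 2 * stats.t.sf(abs(t_stat), df)        # two-sided p-value
t_crit = stats.t.ppf(0.975, df)                  # 97.5th percentile of t(df)
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)  # 95% confidence interval

print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
print(f"95% CI: [{ci[0]:.1f}, {ci[1]:.1f}]")
```

Note the critical value comes from a t-distribution with n − 2 degrees of freedom, not the familiar 1.96 from the normal distribution — with small samples, the t critical value is noticeably larger, which widens the interval.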
In practice, you'll see results presented as a regression table. Here's how to read one.
| Term | Estimate | Std. Error | t-statistic | p-value | 95% CI |
|---|---|---|---|---|---|
| Intercept (β₀) | 44262.2 | 18037.07 | 2.45 | 0.0290 | [5302.1, 83222.2] |
| House Size (β₁) | 375.1 | 46.89 | 8.00 | < 0.001 | [273.8, 476.4] |
Let's walk through what each part of the table tells us.
The predicted price when house size is 0 sqft. In practice, this is rarely meaningful on its own — no house has 0 sqft. It anchors the line so that predictions in the observed range are accurate.
p = 0.0290 — statistically significant, but that just means the line doesn't pass through the origin.
For each additional 1 sqft of house size, the predicted price increases by $375. This is the main finding of the regression.
SE = 46.89 — tells us the slope estimate could plausibly shift by this much with different data.
t = 8.00 — the slope is 8.0 standard errors away from zero. That's far.
p < 0.001 — if house size had no relationship with price, the chance of seeing a slope this extreme is essentially zero.
95% CI = [273.8, 476.4] — we're 95% confident the true slope falls in this range. Since the interval doesn't include 0, the relationship is statistically significant.
R² = 0.831 — House size explains 83.1% of the variation in price. The remaining 16.9% is driven by other factors (location, condition, etc.).
RMSE = $27,836 — On average, our predictions are off by about $27,836. This gives you a sense of the model's practical accuracy in the same units as the outcome.
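Both fit measures come straight from the residuals. The tiny arrays of actual and predicted prices below are made up to keep the arithmetic visible — they are not the chapter's housing data:

```python
import numpy as np

# Hypothetical actual and predicted prices for five houses (illustrative).
y = np.array([310_000, 255_000, 420_000, 365_000, 290_000])
y_hat = np.array([300_000, 270_000, 410_000, 350_000, 305_000])

ss_res = np.sum((y - y_hat) ** 2)          # unexplained variation
ss_tot = np.sum((y - y.mean()) ** 2)       # total variation in price
r_squared = 1 - ss_res / ss_tot            # share of variation explained
rmse = np.sqrt(np.mean((y - y_hat) ** 2))  # typical error, in dollars

print(f"R²   : {r_squared:.3f}")
print(f"RMSE : ${rmse:,.0f}")
```

R² is unitless (a share of variance), while RMSE carries the outcome's units — which is why RMSE is usually the more intuitive gauge of practical accuracy.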
Finally, let's be clear about what this analysis can and cannot tell us.
- There is a statistically significant association between house size and price.
- Larger houses tend to have higher prices. On average, each additional sqft is associated with ~$375 more.
- House size alone accounts for 83% of the price variation in this dataset.
- The result is unlikely to be due to random chance (p < 0.001).
- It cannot show that adding a sqft *causes* the price to go up by $375. Correlation is not causation: bigger houses may be in better neighborhoods, have more bedrooms, or be newer — these confounders could be driving the relationship.
- It does not capture the full picture. With R² = 0.83, there is still 17% unexplained variation; important predictors are missing.
- It does not guarantee the relationship is linear everywhere. Our model assumes a straight line, but the true relationship could curve at very small or very large sizes.
- It does not let us predict outside our data range. Extrapolating to, say, a 5,000 sqft mansion is risky — the pattern may not hold beyond the observed range.
To establish causation, we would need a controlled experiment or advanced techniques (like instrumental variables, difference-in-differences, or regression discontinuity) that account for confounders. That's what we'll explore in future courses.