Statistical Inference

Using information from a sample to draw conclusions about a much larger population.

Sampling Distribution

The distribution of values taken by a statistic in all possible samples of the same size from the same population. Different samples yield different values of the statistic (for example, different sample means).
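As a rough illustration of "all possible samples of the same size," the sketch below lists every sample of size 2 from a tiny made-up population and tallies the resulting sample means. The population values and the sample size are assumptions chosen only so that the full list stays small.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

# Hypothetical tiny population, chosen only for illustration.
population = [2, 4, 6, 8, 10]
n = 2  # sample size

# Every possible sample of size n (drawn without replacement) and its mean.
all_sample_means = [mean(sample) for sample in combinations(population, n)]

# The sampling distribution: each possible value of the statistic
# and the number of samples that produce it.
sampling_distribution = Counter(all_sample_means)
for value in sorted(sampling_distribution):
    print(value, sampling_distribution[value])
```

With a realistic population the number of possible samples is far too large to list, which is why a sampling distribution is usually described by its shape, center, and spread instead.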

Parameter

A fixed number that describes a population. It always exists, but we rarely know its value because it is difficult to conduct a census.

Parameter Example

If we wanted to compare the IQs of all Americans and all Asians, measuring everyone would be IMPOSSIBLE; however, the parameters still exist.

Statistic

A number that describes a sample and can change from sample to sample. We often use these to estimate an unknown parameter.

How large does a population need to be compared to the sample?

For an accurate result that maintains approximate independence when sampling without replacement, the population needs to be at least 10 times as large as the sample that came from it (N ≥ 10n).

μ

The mean of a population (parameter)

p (definition)

The proportion of a population (parameter)

x̅

The mean of a sample (Statistic)

p̂ (definition)

The proportion of a sample (Statistic)

Sampling Variability

The value of a statistic varies in repeated random sampling; for example, each new sample gives a different sample mean.

Calculator Notation: Y[-10,45].5

Y minimum at -10, Y maximum at 45, counting by intervals of 5.

Unbiased Estimator

A statistic is an unbiased estimator IF the mean (center) of its sampling distribution is equal to the true value of the population parameter being estimated. Using an unbiased estimator doesn’t guarantee that the statistic from any one sample will be close to the actual value.
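A small exact check can make this definition concrete. The sketch below uses a made-up five-value population and a sample size of 3 (both assumptions for illustration), computes the sample mean for every possible sample, and compares the center of those sample means with the population mean.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population and sample size, chosen only for illustration.
population = [1, 3, 4, 7, 10]
n = 3

# Parameter: the population mean, kept as an exact fraction.
mu = Fraction(sum(population), len(population))

# Statistic: the sample mean, computed for every possible sample of size n.
sample_means = [Fraction(sum(s), n) for s in combinations(population, n)]

# Center of the sampling distribution of the sample mean.
center = sum(sample_means) / len(sample_means)

print(center == mu)                          # the centers match: unbiased
print(min(sample_means), max(sample_means))  # single samples can still miss mu
```

The center matches the parameter exactly, yet the smallest and largest sample means are still well away from it, which is the sense in which an unbiased estimator gives no guarantee for any one sample.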

Low Variability = Smaller Spread

The larger the sample, the smaller the spread of the sampling distribution. To reduce variability, take larger samples, not more samples.
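The effect of sample size on spread can be seen in a quick simulation. This is a sketch under assumptions: the population (10,000 values drawn uniformly from 0 to 100), the two sample sizes, the number of repetitions, and the seed are all made up for illustration.

```python
import random
import statistics

def spread_of_sample_means(population, n, num_samples, rng):
    """Standard deviation of the sample means from num_samples random samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(num_samples)]
    return statistics.stdev(means)

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 values between 0 and 100.
population = [rng.uniform(0, 100) for _ in range(10_000)]

spread_small_n = spread_of_sample_means(population, n=10, num_samples=1000, rng=rng)
spread_large_n = spread_of_sample_means(population, n=100, num_samples=1000, rng=rng)

print(spread_small_n)  # spread of sample means when n = 10
print(spread_large_n)  # noticeably smaller spread when n = 100
```

Both runs use the same number of samples; only the size of each sample changes, so the drop in spread comes from the larger n.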

Shape

In most cases, the sampling distribution of p̂ can be approximated by a Normal curve. Whether this works depends on the sample size n and the population proportion p.

Center

The mean of the sampling distribution of p̂ equals the population proportion p (μ_p̂ = p). This happens because p̂ is an unbiased estimator of p.

Spread

The standard deviation of p̂ gets smaller as n gets larger. Its exact value depends on both n and p.

p̂ =

The number of successes in the sample divided by the sample size: p̂ = X/n. The mean of p̂ reduces down to just p.

Mean

Mean of p̂ is p.

Standard Deviation of p̂

σ_p̂ = √(p(1 − p)/n), as long as N is greater than or equal to 10n (the population is at least 10 times as large as the sample). This is on the formula sheet provided.
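To see the mean and standard deviation of p̂ side by side, here is a minimal simulation sketch. The values p = 0.3 and n = 50, the number of repetitions, and the seed are assumptions for illustration. Each observation is generated independently, which behaves like sampling from a very large population, so the N ≥ 10n requirement is not a concern here.

```python
import math
import random
import statistics

def sd_p_hat(p, n):
    """Standard deviation of the sampling distribution of p-hat: sqrt(p(1 - p)/n)."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical values chosen for illustration.
p, n = 0.3, 50
rng = random.Random(42)  # fixed seed so the run is repeatable

# Simulate many samples; each p-hat is (number of successes) / n.
p_hats = []
for _ in range(5000):
    successes = sum(rng.random() < p for _ in range(n))
    p_hats.append(successes / n)

print(statistics.mean(p_hats), p)                 # simulated mean of p-hat vs. p
print(statistics.pstdev(p_hats), sd_p_hat(p, n))  # simulated SD vs. formula
```

The simulated mean should land close to p and the simulated standard deviation close to the formula value, though neither will match exactly because only 5000 samples are drawn.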

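Large Counts Condition

Satisfied by making sure that np is greater than or equal to 10 and n(1-p) is greater than or equal to 10, meaning the sample is expected to contain at least 10 successes and at least 10 failures. This is separate from the 10% Condition (N greater than or equal to 10n).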

Rule of Thumb

Only use a Normal distribution for p̂ if both conditions are true: N greater than or equal to 10n, and both np and n(1-p) greater than or equal to 10.
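The two checks can be written as a short helper. This is only a sketch: the function names and the example numbers are made up for illustration, and the thresholds are the ones stated on these cards (N ≥ 10n, np ≥ 10, n(1 − p) ≥ 10).

```python
def ten_percent_condition(N, n):
    """Population size N is at least 10 times the sample size n."""
    return N >= 10 * n

def large_counts_condition(n, p):
    """Expected successes np and expected failures n(1 - p) are both at least 10."""
    return n * p >= 10 and n * (1 - p) >= 10

def normal_model_ok(N, n, p):
    """Both conditions hold, so a Normal model for p-hat is reasonable."""
    return ten_percent_condition(N, n) and large_counts_condition(n, p)

print(normal_model_ok(N=5000, n=100, p=0.3))   # True: both conditions met
print(normal_model_ok(N=5000, n=100, p=0.05))  # False: np = 5 is below 10
print(normal_model_ok(N=500, n=100, p=0.3))    # False: N is less than 10n
```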

Does College Board round?

No. College Board does not round the standard deviation in the multiple choice section of the test. To account for this, store the unrounded standard deviation in the calculator's X variable and use the stored value in later calculations.

You should suspect a sample when…

A result would happen less than 5% of the time by chance alone. Such a result is most likely an error and NOT just chance.

Standard Deviation of a mean…

The population standard deviation over the square root of the sample size: σ_x̅ = σ/√n, given that N is greater than or equal to 10n.
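A short worked calculation may help here. The helper name and the numbers (a population standard deviation of 15 with samples of 25 and 100) are assumptions for illustration; the optional population size N is only used to check that N is at least 10n.

```python
import math

def sd_of_sample_mean(sigma, n, N=None):
    """Return sigma / sqrt(n); if a population size N is given, require N >= 10n."""
    if N is not None and N < 10 * n:
        raise ValueError("10% condition not met: need N >= 10n")
    return sigma / math.sqrt(n)

# Hypothetical population standard deviation of 15.
print(sd_of_sample_mean(sigma=15, n=25))           # 3.0
print(sd_of_sample_mean(sigma=15, n=100, N=5000))  # 1.5
```

Quadrupling the sample size from 25 to 100 cuts the standard deviation of the sample mean in half, because n sits under a square root.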

Properties of the mean of a distribution…

The sample mean is less variable and closer to Normal than individual observations. It’s an unbiased estimator of μ.

Central Limit Theorem (CLT)

Says that when n is large (at least 30 as a rule of thumb), the sampling distribution of the sample mean is approximately Normal, even if the population distribution is not.
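One way to watch this happen is to sample from a strongly right-skewed population and measure how lopsided the sample means are. The sketch below is an illustration under assumptions: the exponential population, the sample sizes 2 and 40, the number of repetitions, and the seed are all made up, and skewness (0 for a perfectly symmetric shape) is used as a rough stand-in for "how far from Normal."

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

def skewness(values):
    """Average cubed z-score: near 0 for symmetric data, positive for right skew."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulated_sample_means(n, num_samples=3000):
    """Sample means from a right-skewed (exponential, mean 1) population."""
    return [
        statistics.mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

skew_small_n = skewness(simulated_sample_means(n=2))
skew_large_n = skewness(simulated_sample_means(n=40))

print(skew_small_n)  # still clearly right-skewed when n = 2
print(skew_large_n)  # much closer to 0 when n = 40
```

The population itself never changes shape; only the distribution of the sample mean becomes more symmetric as n grows.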