Everything2

Central Limit Theorem

created by ModernAngel

(idea) by ModernAngel (1.6 d) Thu Feb 10 2000 at 4:00:26

A fundamentally important theorem of statistics. It explains why the bell curve (the normal distribution) turns up so often, and it gives the statistician the ability to form probable conclusions about entire populations based on incomplete data, namely samples drawn from those populations.

If the sample size is sufficiently large, the sampling distribution of the mean is approximately normal.

(thing) by dowjones (2.1 y) Thu Jun 21 2001 at 2:39:12

The Central Limit Theorem is an important theorem in mathematical statistics, used to make inferences about populations based on limited amounts of information.

The principle is that if you have n independent random variables Y1, Y2, ..., Yn, each with the same mean (expected value) u and the same variance s^2, then

U = sqrt(n)*(Y - u)/s, where Y is the average of the realised values of these n random variables and s = sqrt(s^2) is the standard deviation,

will converge in distribution to the standard normal distribution as n approaches infinity. The standard normal distribution has mean 0 and variance 1, and a variable with this distribution is usually denoted Z. Note that the CLT can be applied to any random sample Y1, Y2, ..., Yn as long as n is large (a common rule of thumb is n > 30) and the mean and variance of the Yi are known and finite.

There are two other important ways to think about the CLT.

1. Y is approximately normally distributed with mean u and variance s^2/n. This makes sense because as n gets larger and larger, the variance gets smaller and smaller, making Y a better and better estimate of u. That is, Y ~ N(u, s^2/n), approximately.

2. Alternatively, the sum Y1 + Y2 + ... + Yn is approximately normally distributed with mean n*u and variance n*s^2. That is, Y1 + Y2 + ... + Yn ~ N(n*u, n*s^2), approximately.
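Both views can be checked with a small simulation. The sketch below is only an illustration: the choice of fair six-sided dice (mean 3.5, variance 35/12), the sample size of 50 and the number of repetitions are assumptions made here, not part of the writeup.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

N_DICE = 50      # n: dice per sample (illustrative choice)
TRIALS = 20000   # number of repeated samples (illustrative choice)

U = 3.5          # mean of one fair six-sided die
S2 = 35 / 12     # variance of one fair six-sided die

# Each trial rolls N_DICE dice; keep both the sum and the average.
sums = [sum(random.randint(1, 6) for _ in range(N_DICE)) for _ in range(TRIALS)]
means = [total / N_DICE for total in sums]

mean_of_means = statistics.fmean(means)
var_of_means = statistics.pvariance(means)
mean_of_sums = statistics.fmean(sums)
var_of_sums = statistics.pvariance(sums)

# View 1: the average Y should have mean u and variance s^2/n.
print("average: mean", mean_of_means, "vs", U)
print("average: variance", var_of_means, "vs", S2 / N_DICE)

# View 2: the sum should have mean n*u and variance n*s^2.
print("sum: mean", mean_of_sums, "vs", N_DICE * U)
print("sum: variance", var_of_sums, "vs", N_DICE * S2)
```

The simulated values will not match the theoretical ones exactly, but they should land close to them, and closer still as the number of repetitions grows.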

An Example. Suppose the test scores of all high school students in a certain state have mean 60 and variance 64. A random sample of 100 students from a large high school had mean 58. Is there any evidence to suggest that the high school is inferior?

Let Y denote the mean of the random sample of n=100 scores from a population with mean u = 60 and variance s^2 = 64. We want to find the probability that this sample mean is less than or equal to 58. If this probability is small, then there is reason to suggest that the school is inferior. We know from the CLT that Y is approximately normally distributed with mean u and variance s^2/n from (1).

So, we want: P(Y less than or equal to 58) = P({Y - u}/sqrt(s^2/n) less than or equal to {58 - u}/sqrt(s^2/n)), where both sides of the inequality have been standardised.

The left-hand side of the inequality is now in the form used by the CLT, so we can replace it by Z, where Z has the standard normal distribution, i.e. Z ~ N(0,1).

So, P(Y less than or equal to 58) = P(Z less than or equal to {58 - 60}/sqrt(64/100))
= P(Z less than or equal to -2.5)

Since Z is a continuous random variable, the probability that Z is exactly -2.5 is zero, so we can ignore the "or equal to" part and just consider values of Z less than -2.5. At this point we consult our standard normal distribution tables and look up -2.5, to find that the probability of Z being less than -2.5 is just 0.0062.

That is, if this school has the same abilities as the rest of the state, the probability that its sample would show an average score of 58 or lower is approximately 0.0062, or 0.62%. Hence, there is reason to suggest that the school is inferior.
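For readers without a table to hand, the same calculation can be sketched in a few lines of Python. The variable names are made up for this illustration; `NormalDist` comes from the standard library's `statistics` module and stands in for the printed table.

```python
from math import sqrt
from statistics import NormalDist

u = 60         # state-wide mean score
s2 = 64        # state-wide variance
n = 100        # number of students sampled
observed = 58  # sample mean seen at the school

# Standard deviation of the sample mean under the CLT: sqrt(s^2/n)
standard_error = sqrt(s2 / n)

# Standardise the observed mean, then look up the lower-tail probability
z = (observed - u) / standard_error
p = NormalDist().cdf(z)

print("z =", round(z, 2))   # -2.5
print("p =", round(p, 4))   # 0.0062
```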


(thing) by Semisane (7.2 mon) Tue Jun 21 2005 at 4:56:31

Technically, the central limit theorem refers to the characteristics of a sampling distribution.

Let's imagine we're looking at the heights of people in the population. There is a real value out there for the mean height, say, of British people. We could find it by measuring everyone, but to do so would be excessive. Instead, we take a random sample and measure the heights of those selected. The central limit theorem states that, for any collection of random samples, the mean of the mean values for those samples will approach the mean of the population. (More samples will generally bring you closer to the true value.) Meanwhile, provided each sample is reasonably large, the distribution of those sample means will approximately follow a classic bell curve distribution.

The fact that this is true for any population with finite variance, whether the underlying distribution follows a unimodal, symmetric bell curve or not, is one of the most surprising and useful facts of statistics.
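To see this with a population that is clearly not bell-shaped, here is a rough simulation sketch. Everything specific in it is an assumption chosen for illustration: the population is exponential with mean 1 (strongly skewed), each sample has 40 members, and 10000 samples are drawn.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 40     # members per sample (illustrative choice)
NUM_SAMPLES = 10000  # number of samples drawn (illustrative choice)
POP_MEAN = 1.0       # an exponential with rate 1 has mean 1 and sd 1

# Individual draws from the skewed population
raw = [random.expovariate(1.0) for _ in range(NUM_SAMPLES)]

# Means of many independent samples from the same population
sample_means = [
    sum(random.expovariate(1.0) for _ in range(SAMPLE_SIZE)) / SAMPLE_SIZE
    for _ in range(NUM_SAMPLES)
]

# The mean of the sample means should sit near the population mean.
grand_mean = statistics.fmean(sample_means)

# Skew check: a symmetric distribution has about half its mass below its mean.
frac_raw_below = sum(x < POP_MEAN for x in raw) / NUM_SAMPLES
frac_means_below = sum(m < POP_MEAN for m in sample_means) / NUM_SAMPLES

# Bell-curve check: about 95% of a normal lies within 1.96 sd of its mean.
se = 1.0 / math.sqrt(SAMPLE_SIZE)
frac_within = sum(abs(m - POP_MEAN) < 1.96 * se for m in sample_means) / NUM_SAMPLES

print("mean of sample means:", grand_mean)
print("fraction below mean, raw draws:", frac_raw_below)
print("fraction below mean, sample means:", frac_means_below)
print("fraction of sample means within 1.96 se:", frac_within)
```

Roughly 63% of the raw draws fall below the population mean, which shows how lopsided the population is, while the sample means split much more evenly around it and about 95% of them fall within 1.96 standard errors.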


(idea) by Grayscale (1.9 y) 1 C! Fri Jan 27 2006 at 21:27:58

WARNING: Lots of HTML math ahead!

Foreword

There are several specific results which are known as central limit theorems, each sometimes referred to as "the" central limit theorem. Here we will focus on one particular version.

A word on notation: Here we will use the notation $E(x)$ to denote the expectation value of a random variable $x$. There are other conventions in common use, including angle brackets $\langle x \rangle$. The symbol $i$ will be used for the imaginary unit, while $j$ and $n$ will be used for counting indices.

Theorem

Consider a set $\{x_j\}$, $j = 1, \dots, N$, of $N$ independent random variables with expectation $E(x_j) = \mu_j$ and variance $E(x_j^2) - E(x_j)^2 = \sigma_j^2$, where the $\sigma_j$ are real and finite. (A specific additional condition on the $\sigma_j$ will be discussed later.) Let $\sigma = (\sum_j \sigma_j^2)^{1/2}$ and define a new variable $z = \sum_j (x_j - \mu_j)/\sigma$ as the (scaled and shifted) sum of the $x_j$. Then as $N \to \infty$ the distribution of $z$ approaches normal, i.e. $p(z) = (2\pi)^{-1/2} \exp[-z^2/2]$, where $p(z)$ is the density function of $z$.


Preliminary Definitions

The characteristic function $\Phi(k)$ for a variable $x$ is defined as

$$\Phi(k) = E(\exp[ikx]) = \int \exp[ikx]\, p(x)\, dx$$

This is a calculational device for finding the moments $E(x)$, $E(x^2)$, etc. as

$$\Phi^{(m)}(0) = i^m E(x^m)$$

where $\Phi^{(m)}(k)$ represents the $m$th derivative of $\Phi(k)$. If we can write these moments as derivatives of $\Phi(k)$, we can also do the reverse and write $\Phi(k)$ in a Taylor series:

$$\Phi(k) = \sum_n E(x^n) \frac{(ik)^n}{n!}$$

The logarithm of the characteristic function is known as the cumulant generating function, defined as

$$\Psi(k) = \ln[\Phi(k)] = \sum_n C_n \frac{(ik)^n}{n!}$$

where the $C_n$, known as cumulants, are polynomials in the moments $E(x)$, $E(x^2)$, etc. Of special note are $C_1 = E(x)$ and $C_2 = E(x^2) - E(x)^2 = \sigma^2$. Note that if we try to evaluate $C_0$ the result is always zero, so this term is generally ignored.
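As a quick check of the first two cumulants quoted above, one can expand the logarithm to second order in k using ln(1 + e) = e - e^2/2 + ... The shorthand m_1 = E(x) and m_2 = E(x^2) is introduced here only for brevity and is not part of the notation used elsewhere.

```latex
\begin{align*}
\Phi(k) &= 1 + ik\,m_1 - \tfrac{k^2}{2}\,m_2 + O(k^3), \\
\Psi(k) = \ln\Phi(k)
  &= \left( ik\,m_1 - \tfrac{k^2}{2}\,m_2 \right)
     - \tfrac{1}{2}\left( ik\,m_1 \right)^2 + O(k^3) \\
  &= ik\,m_1 - \tfrac{k^2}{2}\left( m_2 - m_1^2 \right) + O(k^3) \\
  &= C_1\,(ik) + C_2\,\frac{(ik)^2}{2!} + O(k^3).
\end{align*}
```

Matching powers of k gives C_1 = m_1 and C_2 = m_2 - m_1^2. There is no constant term because the characteristic function equals 1 at k = 0, which is why C_0 vanishes.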


Proof

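Let $\Phi_z(k)$ and $\Phi_j(k)$ denote the characteristic functions for $z$ and the $x_j$. Then

$$\Phi_z(k) = E(\exp[ikz]) = E\Big(\exp\Big[ik \sum_j (x_j - \mu_j)/\sigma\Big]\Big) = E\Big(\prod_j \exp[ik(x_j - \mu_j)/\sigma]\Big) = E\Big(\prod_j \exp[ikx_j/\sigma] \exp[-ik\mu_j/\sigma]\Big)$$

As the $x_j$ are independent the product can be moved outside the calculation of the expectation; so can the exponential in $\mu_j$, as it is a constant. This results in

$$\Phi_z(k) = \prod_j E(\exp[ikx_j/\sigma]) \exp[-ik\mu_j/\sigma] = \prod_j \Phi_j(k/\sigma) \exp[-ik\mu_j/\sigma]$$

Now we take the log, to change the characteristic functions into the cumulant-generating functions:

$$\Psi_z(k) = \sum_j \big[ \Psi_j(k/\sigma) - ik\mu_j/\sigma \big]$$

Substituting the Taylor expansions,

$$\sum_n C^z_n \frac{(ik)^n}{n!} = \sum_j \Big[ \sum_n C^j_n \frac{(ik/\sigma)^n}{n!} - ik\mu_j/\sigma \Big]$$

Coefficients of like powers of $k$ must be equal on both sides, so we can solve for the $C^z_n$. As $C^j_1 = \mu_j$ and $C^j_2 = \sigma_j^2$ we find

$$C^z_1 = \sum_j \big[ C^j_1/\sigma - \mu_j/\sigma \big] = \sum_j \big[ \mu_j/\sigma - \mu_j/\sigma \big] = 0$$
$$C^z_2 = \sum_j C^j_2/\sigma^2 = \Big(\sum_j \sigma_j^2\Big) \Big/ \Big(\sum_j \sigma_j^2\Big) = 1$$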

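Now, for $n \ge 2$ the same comparison of coefficients gives $C^z_n = \sum_j C^j_n/\sigma^n \propto 1/\sigma^n$, while $\sigma^2$ is a sum of $N$ finite $\sigma_j^2$, so as $N \to \infty$ it should not be surprising that $C^z_3$ and higher-order $C^z_n$ approach zero. This is straightforward if we make the simplifying assumption that the $x_j$ have equal variance, i.e. that the $\sigma_j$ are all equal: then $\sigma = N^{1/2}\sigma_1$, so $C^z_n = N^{-n/2}\sigma_1^{-n}\sum_j C^j_n$, which scales as $N^{1-n/2}$ and vanishes for $n > 2$ provided the $C^j_n$ stay bounded. However, there are several sufficient, weaker restrictions which we can impose on the distribution of the $x_j$ including the Lyapunov, Lindeberg, and Feller-Lévy conditions; the study and proof of these variants is left to the interested reader. In all cases, we find that $C^z_n \to 0$ for $n > 2$ as $N \to \infty$, so in the limit

$$\Psi_z(k) = \frac{(ik)^2}{2!} = -k^2/2$$

and

$$\Phi_z(k) = \exp[-k^2/2]$$

This is the characteristic function of a standard normal distribution; we can verify this by performing an inverse Fourier transform to recover $p(z)$:

$$p(z) = (2\pi)^{-1} \int \exp[-ikz]\, \Phi_z(k)\, dk = (2\pi)^{-1} \int \exp[-ikz - k^2/2]\, dk = (2\pi)^{-1/2} \exp[-z^2/2]$$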
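The inverse transform can also be checked numerically rather than by hand. The following is a minimal sketch, assuming a plain trapezoid rule on a truncated range of k is accurate enough here (the integrand decays like a Gaussian, so it is); the function names and the cut-off `k_max` are choices made for this illustration.

```python
import cmath
import math

def density_from_characteristic(z, k_max=12.0, steps=4800):
    """Approximate (1/(2*pi)) * integral of exp(-i*k*z) * exp(-k^2/2) dk
    over [-k_max, k_max] with the trapezoid rule."""
    dk = 2 * k_max / steps
    total = 0j
    for m in range(steps + 1):
        k = -k_max + m * dk
        weight = 0.5 if m in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * cmath.exp(-1j * k * z - k * k / 2)
    # The imaginary part cancels by symmetry; keep the real part.
    return (total * dk / (2 * math.pi)).real

def normal_density(z):
    """Standard normal density (2*pi)^(-1/2) * exp(-z^2/2)."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

for z in (0.0, 0.5, 1.0, 2.0):
    print(z, density_from_characteristic(z), normal_density(z))
```

The two columns should agree to many decimal places at every z tried.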

Thus z converges to the standard normal distribution, as desired.
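The key step of the proof, that every cumulant of z beyond the second dies away as N grows, can be made concrete with a small calculation. This sketch assumes a specific case that is not part of the proof itself: N independent, identically distributed exponential variables with rate 1, whose nth cumulant is (n-1)! (a standard result) and whose variance is therefore 1. The function name is made up for the example.

```python
from math import factorial

def standardized_sum_cumulant(n, N):
    """nth cumulant (n >= 2) of z for N i.i.d. rate-1 exponential variables.

    Each variable contributes its own nth cumulant, (n-1)!, and the
    rescaling by sigma divides the total by sigma**n.
    """
    c_n = factorial(n - 1)  # nth cumulant of one rate-1 exponential
    sigma = N ** 0.5        # square root of the sum of N unit variances
    return N * c_n / sigma ** n

for N in (1, 100, 10000):
    second, third, fourth = (standardized_sum_cumulant(n, N) for n in (2, 3, 4))
    print(N, second, third, fourth)
```

The second cumulant stays at 1 for every N, while the third falls off like N^(-1/2) and the fourth like N^(-1), which is the behaviour the proof relies on.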

