
While the binomial distribution is conceptually the simplest distribution to understand, it’s not the most important one. That particular honour goes to the **normal distribution**, which is also referred to as “the bell curve” or a “Gaussian distribution”. A normal distribution is described using two parameters, the mean of the distribution μ and the standard deviation of the distribution σ. The notation that we sometimes use to say that a variable X is normally distributed is as follows:

X∼Normal(μ,σ)

Of course, that’s just notation. It doesn’t tell us anything interesting about the normal distribution itself. As was the case with the binomial distribution, I have included the formula for the normal distribution in this book, because I think it’s important enough that everyone who learns statistics should at least look at it, but since this is an introductory text I don’t want to focus on it, so I’ve tucked it away in Table 9.2. Similarly, the R functions for the normal distribution are `dnorm()`, `pnorm()`, `qnorm()` and `rnorm()`. However, they behave in pretty much exactly the same way as the corresponding functions for the binomial distribution, so there’s not a lot that you need to know. The only thing that I should point out is that the argument names for the parameters are `mean` and `sd`. In pretty much every other respect, there’s nothing else to add.
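As a rough illustration of how the four functions fit together, here is a minimal sketch using a normal distribution with mean 0 and standard deviation 1. The input values (0, 1, and five draws) are arbitrary choices made for this example, not anything taken from the tables or figures.

```r
# Density: the height of the curve at x = 0
d <- dnorm( x = 0, mean = 0, sd = 1 )

# Cumulative probability: P(X <= 1)
p <- pnorm( q = 1, mean = 0, sd = 1 )

# Quantile: the value below which a proportion p of the distribution lies.
# Feeding the output of pnorm() back in should recover the original value 1.
q <- qnorm( p = p, mean = 0, sd = 1 )

# Random draws; set.seed() is only here to make the draws reproducible
set.seed( 1 )
r <- rnorm( n = 5, mean = 0, sd = 1 )

print( d )
print( p )
print( q )
print( r )
```

The density should come out at roughly 0.399 and the cumulative probability at roughly 0.841, and `qnorm()` undoes `pnorm()`.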

Instead of focusing on the maths, let’s try to get a sense for what it means for a variable to be normally distributed. To that end, have a look at Figure 9.6, which plots a normal distribution with mean μ=0 and standard deviation σ=1. You can see where the name “bell curve” comes from: it looks a bit like a bell. Notice that, unlike the plots that I drew to illustrate the binomial distribution, the picture of the normal distribution in Figure 9.6 shows a smooth curve instead of “histogram-like” bars. This isn’t an arbitrary choice: the normal distribution is continuous, whereas the binomial is discrete. For instance, in the die rolling example from the last section, it was possible to get 3 skulls or 4 skulls, but impossible to get 3.9 skulls. The figures that I drew in the previous section reflected this fact: in Figure 9.3, for instance, there’s a bar located at X=3 and another one at X=4, but there’s nothing in between. Continuous quantities don’t have this constraint. For instance, suppose we’re talking about the weather. The temperature on a pleasant Spring day could be 23 degrees, 24 degrees, 23.9 degrees, or anything in between since temperature is a continuous variable, and so a normal distribution might be quite appropriate for describing Spring temperatures.^{145}

With this in mind, let’s see if we can’t get an intuition for how the normal distribution works. Firstly, let’s have a look at what happens when we play around with the parameters of the distribution. To that end, Figure 9.7 plots normal distributions that have different means, but have the same standard deviation. As you might expect, all of these distributions have the same “width”. The only difference between them is that they’ve been shifted to the left or to the right. In every other respect they’re identical. In contrast, if we increase the standard deviation while keeping the mean constant, the peak of the distribution stays in the same place, but the distribution gets wider, as you can see in Figure 9.8. Notice, though, that when we widen the distribution, the height of the peak shrinks. This has to happen: in the same way that the heights of the bars that we used to draw a discrete binomial distribution have to *sum* to 1, the total *area under the curve* for the normal distribution must equal 1.

Before moving on, I want to point out one important characteristic of the normal distribution. Irrespective of what the actual mean and standard deviation are, 68.3% of the area falls within 1 standard deviation of the mean. Similarly, 95.4% of the distribution falls within 2 standard deviations of the mean, and 99.7% of the distribution is within 3 standard deviations. This idea is illustrated in Figure **??**.
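The 68.3%, 95.4% and 99.7% figures can be checked numerically with `pnorm()`. The sketch below is one way to do it; the helper `within_k_sd()` is a hypothetical name introduced here, and the mean of 23 and standard deviation of 4 are arbitrary values chosen to show that the answer does not depend on the parameters.

```r
# Proportion of a normal distribution lying within k standard deviations
# of its mean (hypothetical helper written for this illustration)
within_k_sd <- function( k, mean = 0, sd = 1 ) {
  pnorm( mean + k * sd, mean = mean, sd = sd ) -
    pnorm( mean - k * sd, mean = mean, sd = sd )
}

# Standard normal, for k = 1, 2, 3
areas_standard <- within_k_sd( k = 1:3 )

# A different (arbitrary) mean and standard deviation
areas_shifted <- within_k_sd( k = 1:3, mean = 23, sd = 4 )

print( round( areas_standard, 3 ) )
print( round( areas_shifted, 3 ) )
```

Both calls should print the same three proportions, 0.683, 0.954 and 0.997.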

## Probability density

There’s something I’ve been trying to hide throughout my discussion of the normal distribution, something that some introductory textbooks omit completely. They might be right to do so: this “thing” that I’m hiding is weird and counterintuitive even by the admittedly distorted standards that apply in statistics. Fortunately, it’s not something that you need to understand at a deep level in order to do basic statistics: rather, it’s something that starts to become important later on when you move beyond the basics. So, if it doesn’t make complete sense, don’t worry: try to make sure that you follow the gist of it.

Throughout my discussion of the normal distribution, there have been one or two things that don’t quite make sense. Perhaps you noticed that the y-axis in these figures is labelled “Probability Density” rather than “Probability”. Maybe you noticed that I used p(X) instead of P(X) when giving the formula for the normal distribution. Maybe you’re wondering why R uses the “d” prefix for functions like `dnorm()`. And maybe, just maybe, you’ve been playing around with the `dnorm()` function, and you accidentally typed in a command like this:

`dnorm( x = 1, mean = 1, sd = 0.1 )`

`## [1] 3.989423`

And if you’ve done the last part, you’re probably very confused. I’ve asked R to calculate the probability that `x = 1`, for a normally distributed variable with `mean = 1` and standard deviation `sd = 0.1`; and it tells me that the probability is 3.99. But, as we discussed earlier, probabilities *can’t* be larger than 1. So either I’ve made a mistake, or that’s not a probability.

As it turns out, the second answer is correct. What we’ve calculated here isn’t actually a probability: it’s something else. To understand what that something is, you have to spend a little time thinking about what it really *means* to say that X is a continuous variable. Let’s say we’re talking about the temperature outside. The thermometer tells me it’s 23 degrees, but I know that’s not really true. It’s not *exactly* 23 degrees. Maybe it’s 23.1 degrees, I think to myself. But I know that that’s not really true either, because it might actually be 23.09 degrees. But, I know that… well, you get the idea. The tricky thing with genuinely continuous quantities is that you never really know exactly what they are.

Now think about what this implies when we talk about probabilities. Suppose that tomorrow’s maximum temperature is sampled from a normal distribution with mean 23 and standard deviation 1. What’s the probability that the temperature will be *exactly* 23 degrees? The answer is “zero”, or possibly, “a number so close to zero that it might as well be zero”. Why is this? It’s like trying to throw a dart at an infinitely small dart board: no matter how good your aim, you’ll never hit it. In real life you’ll never get a value of exactly 23. It’ll always be something like 23.1 or 22.99998 or something. In other words, it’s completely meaningless to talk about the probability that the temperature is exactly 23 degrees. However, in everyday language, if I told you that it was 23 degrees outside and it turned out to be 22.9998 degrees, you probably wouldn’t call me a liar. Because in everyday language, “23 degrees” usually means something like “somewhere between 22.5 and 23.5 degrees”. And while it doesn’t feel very meaningful to ask about the probability that the temperature is exactly 23 degrees, it does seem sensible to ask about the probability that the temperature lies between 22.5 and 23.5, or between 20 and 30, or any other range of temperatures.

The point of this discussion is to make clear that, when we’re talking about continuous distributions, it’s not meaningful to talk about the probability of a specific value. However, what we *can* talk about is the probability that the value lies within a particular range of values. To find out the probability associated with a particular range, what you need to do is calculate the “area under the curve”. We’ve seen this concept already: in Figures 9.9 and (fig:sdnorm1b), the shaded areas shown depict genuine probabilities (e.g., in Figure 9.9 it shows the probability of observing a value that falls within 1 standard deviation of the mean).
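To make the “area under the curve” idea concrete, here is a small sketch for the temperature example (mean 23, standard deviation 1). It computes the probability of a value between 22.5 and 23.5 in two ways: as a difference of two `pnorm()` values, and by numerically integrating `dnorm()` with base R’s `integrate()`. The variable names are just illustrative.

```r
# P(22.5 <= X <= 23.5) as a difference of cumulative probabilities
p_range <- pnorm( 23.5, mean = 23, sd = 1 ) - pnorm( 22.5, mean = 23, sd = 1 )

# The same quantity as an explicit area under the density curve;
# the extra arguments mean and sd are passed through to dnorm()
area <- integrate( dnorm, lower = 22.5, upper = 23.5, mean = 23, sd = 1 )

print( p_range )
print( area$value )
```

Both numbers should agree at about 0.383, so on this model there is roughly a 38% chance that the temperature is “23 degrees” in the everyday sense.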

Okay, so that explains part of the story. I’ve explained a little bit about how continuous probability distributions should be interpreted (i.e., area under the curve is the key thing), but I haven’t actually explained what the `dnorm()` function actually calculates. Equivalently, what does the formula for p(x) that I described earlier actually mean? Obviously, p(x) doesn’t describe a probability, but what is it? The name for this quantity p(x) is a **probability density**, and in terms of the plots we’ve been drawing, it corresponds to the *height* of the curve. The densities themselves aren’t meaningful in and of themselves: but they’re “rigged” to ensure that the *area* under the curve is always interpretable as genuine probabilities. To be honest, that’s about as much as you really need to know for now.^{146}
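One way to see how a density larger than 1 can still produce sensible probabilities is to multiply the height of the curve by the width of a very narrow interval. The sketch below does this for a normal distribution with mean 1 and standard deviation 0.1; the interval width of 0.01 is an arbitrary choice for illustration.

```r
# Height of the curve at x = 1 (a density, not a probability)
height <- dnorm( x = 1, mean = 1, sd = 0.1 )

# A narrow interval centred on x = 1; the width is an arbitrary choice
width <- 0.01

# Height times width approximates the area of the thin strip under the curve
approx_prob <- height * width

# The exact area over the same interval, from the cumulative distribution
exact_prob <- pnorm( 1 + width / 2, mean = 1, sd = 0.1 ) -
  pnorm( 1 - width / 2, mean = 1, sd = 0.1 )

print( height )
print( approx_prob )
print( exact_prob )
```

The height is about 3.99, yet the strip has an area of only about 0.04, which is a perfectly legitimate probability.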

## FAQs

### What is 0.95 normal distribution? ›

Thus, the probability is 0.95 that **a normal variable takes a value within 1.96 standard deviations of its mean**. Once again, the Standard Deviation Rule is shown to be just roughly accurate, since it states that the probability is 0.95 that a normal variable takes a value within 2 standard deviations of its mean.

**How do you find a 95% normal distribution? ›**

The 68-95-99.7 rule is based on the mean and standard deviation. It says: 68% of the population is within 1 standard deviation of the mean. **95% of the population is within 2 standard deviations of the mean**.

**How do I calculate normal distribution? ›**

The standard normal distribution (z distribution) is a normal distribution with a mean of 0 and a standard deviation of 1. Any point (x) from a normal distribution can be converted to the standard normal distribution (z) with the formula **z = (x-mean) / standard deviation**.
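A short sketch of this conversion in R follows. The value 26, the mean 23 and the standard deviation 2 are hypothetical numbers chosen for the example.

```r
# Hypothetical observation from a normal distribution with mean 23 and sd 2
x     <- 26
mu    <- 23
sigma <- 2

# Convert to the standard normal scale
z <- ( x - mu ) / sigma

# The cumulative probability is the same on either scale
p_original <- pnorm( x, mean = mu, sd = sigma )
p_standard <- pnorm( z )

print( z )
print( p_original )
print( p_standard )
```

Here z is 1.5, and both calls to `pnorm()` return the same cumulative probability.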

**What is 0.5 in a normal distribution? ›**

For example, **the probability of observing a value less than or equal to zero** on the standard normal density curve is 0.5, since exactly half of the area of the density curve lies to the left of zero.

**What is the top 10% in normal distribution? ›**

As a decimal, the top 10% of marks would be **those marks above the 0.9 quantile** (i.e., 100% - 90% = 10% or 1 - 0.9 = 0.1). First, we should convert our frequency distribution into a standard normal distribution as discussed in the opening paragraphs of this guide.

**What is 0.95 on the Z table? ›**

z (0.95) is located on the left-hand side of the normal distribution since the area to the right is 0.95. The area in the tail to the left then contains the other 0.05, as shown in Figure 6.9. Using Table 3, z (0.95) = **–1.65**.

**What is 99.7 normal distribution? ›**

Key Takeaways. The Empirical Rule states that **99.7% of data observed following a normal distribution lies within 3 standard deviations of the mean**. Under this rule, 68% of the data falls within one standard deviation, 95% within two standard deviations, and 99.7% within three standard deviations from the mean.

**Is 95 1.96 normal distribution? ›**

In probability and statistics, the 97.5th percentile point of the standard normal distribution is a number commonly used for statistical calculations. The approximate value of this number is 1.96, meaning that **95% of the area under a normal curve lies within approximately 1.96 standard deviations of the mean**.

**What is 1.645 normal distribution? ›**

The value 1.645 is the z-score from a standard normal probability distribution that puts **an area of 0.90 in the center, an area of 0.05 in the far left tail, and an area of 0.05 in the far right tail**.

**What is an example of a normal distribution? ›**

All kinds of variables in natural and social sciences are normally or approximately normally distributed. **Height, birth weight, reading ability, job satisfaction, or SAT scores** are just a few examples of such variables.

### What is the normal distribution formula? ›

For a random variable x with mean μ and standard deviation σ, the normal distribution formula is given by: **\( f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \, e^{-(x-\mu)^{2}/(2\sigma^{2})} \)**.

**What is normal distribution value? ›**

The normal distribution is the proper term for a probability bell curve. In a standard normal distribution **the mean is zero and the standard deviation is 1**. Every normal distribution has zero skew and a kurtosis of 3. Normal distributions are symmetrical, but not all symmetrical distributions are normal.

**What is 5% of normal distribution? ›**

The top 5% of the normal distribution **refers to the 5% of the data that lies in the right tail of the standard normal curve**. Because the z-table gives the probability of values less than a given z-score, we look up the probability 0.95 in the z-table to find the cutoff for the top 5% of the normal distribution.

**What is 2.5 in normal distribution? ›**

Z | 0.00 | 0.05 |
---|---|---|
2.4 | 0.4918 | 0.4929 |
2.5 | 0.4938 | 0.4946 |
2.6 | 0.4953 | 0.4960 |
2.7 | 0.4965 | 0.4970 |

**What is the top 5% normal distribution? ›**

“Top 5%” means the minimum percentile rank is 95, which is 0.95 as a proportion.

**What is 90% of a normal distribution? ›**

For any normal distribution a probability of 90% **corresponds to a Z score of about 1.28**. We also could have computed this using R by using the qnorm() function to find the Z score corresponding to a 90 percent probability.
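A minimal sketch of that `qnorm()` call is shown below. It also includes, as an assumption about what a reader might want next, the pair of cutoffs that enclose the central 90% of the distribution, which is a different question from the 90th percentile.

```r
# z-score with 90% of the area to its left (the 90th percentile)
z_90 <- qnorm( 0.90 )

# z-scores enclosing the central 90%, leaving 5% in each tail
z_central_90 <- qnorm( c( 0.05, 0.95 ) )

print( z_90 )
print( z_central_90 )
```

The first value is about 1.28; the second pair is about -1.645 and 1.645.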

**What is 1 normal distribution? ›**

The standard normal distribution is one of the forms of the normal distribution. It occurs when a normal random variable has a mean equal to zero and a standard deviation equal to one. In other words, **a normal distribution with a mean 0 and standard deviation of 1** is called the standard normal distribution.

**How do you calculate z-score? ›**

The z-score of a value is the number of standard deviations between the value and the mean of the set. You can find it by **subtracting the mean from the value, and dividing the result by the standard deviation**.

**What is 0.45 in Z table? ›**

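To find the z value for 0.45, look through the areas in the body of the table and locate the nearest value. It is 0.4505 in our table [Fig-3]. First move to the far left of that row to find the value in the z column. It is **1.6**; the heading of the column containing 0.4505 supplies the second decimal place (0.05), giving z = 1.65.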

**What is the z-score for 95% to the left? ›**

The z value for a two-sided 95 percent confidence interval is **1.96**; the z-score with 95% of the area to its left is about 1.645.

### What is normal distribution z-score? ›

**A Z score represents how many standard deviations an observation is away from the mean**. The mean of the standard normal distribution is 0. Z scores above the mean are positive and Z scores below the mean are negative.

**What is normal distribution in Z? ›**

The standard normal distribution, also called the z-distribution, is **a special normal distribution where the mean is 0 and the standard deviation is 1**. Any normal distribution can be standardized by converting its values into z scores. Z scores tell you how many standard deviations from the mean each value lies.

**How many standard deviation is 95? ›**

Since 95% of values fall within **two standard deviations of the mean** according to the 68-95-99.7 Rule, simply add and subtract two standard deviations from the mean in order to obtain the 95% confidence interval.

**Why 0.975 for 95 confidence interval? ›**

**A 95% confidence interval would encompass all but the bottom 2.5% and the top 2.5%**, whose boundaries correspond to cumulative probabilities of 0.025 and 0.975.

**What is 1.96 2.58 normal distribution? ›**

95% of values fall within 1.96 standard deviations of the mean (μ − 1.96σ ≤ X ≤ μ + 1.96σ), and **99% of values fall within 2.58 standard deviations of the mean (μ − 2.58σ ≤ X ≤ μ + 2.58σ)**.

**What is Z 1.96 in normal distribution table? ›**

The table value for Z is the value of the cumulative normal distribution. For example, the value for 1.96 is **P(Z < 1.96) = 0.9750**.

**What is 0.05 from the normal distribution? ›**

A significance level of 0.05 indicates a **5% risk** of concluding that the data do not follow a normal distribution when the data do follow a normal distribution.

**What is a normal distribution for dummies? ›**

A normal distribution is **symmetrical around the mean**. It reaches its highest point at the mean. It is bell-shaped. The z-score is zero at the mean, and the curve decreases as you move away from the mean on both sides.

**What is normal in statistics? ›**

"Normal" data are **data that are drawn (come from) a population that has a normal distribution**. This distribution is inarguably the most important and the most frequently used distribution in both the theory and application of statistics.

**What is the normal distribution formula simple? ›**

The normal distribution is produced by the normal density function, **\( p(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^{2}/(2\sigma^{2})} \)**. In this exponential function e is the constant 2.71828…, μ is the mean, and σ is the standard deviation.

### What is a normal distribution table? ›

The standard normal distribution table is a compilation of areas from the standard normal distribution, more commonly known as a bell curve, which provides the area of the region located under the bell curve and to the left of a given z-score to represent probabilities of occurrence in a given population.

**Is normal distribution mean 50%? ›**

The mean (the perpendicular line down the center of the curve) of the normal distribution divides the curve in half, so that **50% of the area under the curve is to the right of the mean and 50% is to the left**. Therefore, 50% of test scores are greater than the mean, and 50% of test scores are less than the mean.

**What is 75% in normal distribution? ›**

Assuming a normal distribution, the 75th percentile corresponds to **a z score of about 0.675**. That is, an observation at the 75th percentile is about 0.675 standard deviations above the mean.

**What is 30% of the standard normal distribution? ›**

30% is equal to **0.3**, so the cumulative probability of interest is 0.3. We look through the body of the table for the (cumulative) probability 0.3. The row label of that entry, added to its column label, gives you the value for c.

**What is the top 2.5 percent of normal distribution? ›**

For the given normal distribution, the top 2.5% would be scores **above 12.87** (1.96 standard deviations above the mean).

**What is the z-score for .95 probability? ›**

The Z value for 95% confidence is **Z=1.96**.

**What does 0.95 probability mean? ›**

**A 95% confidence interval** is produced by a procedure that captures the population mean in 95% of repeated samples. It does not mean that 95% of the population distribution is contained in the interval.

**What is the z-score for 0.95 confidence level? ›**

The critical z-score values when using a 95 percent confidence level are **-1.96** and +1.96 standard deviations.

**Why is the z-score 1.96 for 95? ›**

1.96 is used **because the 95% confidence interval has only 2.5% on each side**. The probability for a z score below −1.96 is 2.5%, and similarly for a z score above +1.96; added together this is 5%.
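The arithmetic can be checked in R. The sketch below uses only `qnorm()` and `pnorm()` on the standard normal distribution; the variable names are illustrative.

```r
# The z-score with 97.5% of the area to its left
z_crit <- qnorm( 0.975 )

# Area in each tail beyond -1.96 and +1.96
lower_tail <- pnorm( -1.96 )
upper_tail <- pnorm( 1.96, lower.tail = FALSE )

# What remains in the middle
central <- 1 - lower_tail - upper_tail

print( z_crit )
print( lower_tail + upper_tail )
print( central )
```

The critical value comes out at about 1.96, the two tails together hold about 5%, and the central region holds about 95%.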

**What is the z-score for 96%? ›**

Percentile | z-Score |
---|---|
96 | 1.751 |
97 | 1.881 |
98 | 2.054 |
99 | 2.326 |
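These table entries can be reproduced with `qnorm()`, which is vectorised over its first argument. A minimal sketch, with variable names chosen for this example:

```r
# Percentiles of interest, expressed as proportions for qnorm()
percentiles <- c( 96, 97, 98, 99 )
z_scores <- qnorm( percentiles / 100 )

# Round to three decimal places to match the table
print( data.frame( Percentile = percentiles, z = round( z_scores, 3 ) ) )
```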

### How do you find the z-score in probability? ›

The Z-score formula is **z = (x − μ) / σ**.

**Why is .95 statistical significance? ›**

Declaring that a result is significantly different from another at the 0.05 significance level (95% confidence) means that, **if the treatments were in fact identical, a difference at least this large would arise in fewer than 5% of experiments**.

**What does a probability of 0.99 mean? ›**

An event that is certain to occur has probability 1. Since 0.99 is very nearly equal to 1, an event with probability 0.99 is said to be **very likely to happen**.

**Why do we use 95 confidence interval? ›**

With a 95 percent confidence interval, **you have a 5 percent chance of being wrong**. With a 90 percent confidence interval, you have a 10 percent chance of being wrong. A 99 percent confidence interval would be wider than a 95 percent confidence interval (for example, plus or minus 4.5 percent instead of 3.5 percent).

**What is the z-score for .90 confidence? ›**

Hence, the z value at the 90 percent confidence interval is **1.645**.

**What is the z-score for 94% confidence? ›**

For a 94% z-interval, there will be 6% of the area outside of the interval. That is, there will be 97% of the area less than the upper critical value of z. The nearest entry to 0.97 in the table of standard normal probabilities is 0.9699, which corresponds to a z-score of **1.88**.

**What is the z-score for 5% normal distribution? ›**

A: A z-score of **+/- 1.96 or greater** is considered statistically significant at the 5% level of significance (i.e., p < 0.05).

**What percentile is 95 confidence interval? ›**

The 95th percentile is estimated as 6.2785. The “95%” CI is **(0.49, 6.29)**, which is the entire range of the sample data.