What Does High Variance Mean


This lecture presents some examples of hypothesis testing, focusing on tests of hypothesis about the variance, that is, on using a sample to test hypotheses about the variance of an unknown distribution.

The variance of a population can be computed with the following steps: compute the mean of the measurements; take the difference between each element in the population and the mean; square each difference; average the squared differences. No data set can be judged as good or bad on the basis of its variance alone. Variance is a measure of heterogeneity in a given data set: the higher the variance, the more heterogeneous the data; the smaller the variance, the more homogeneous.

What does efficiency variance mean? Efficiency variance is the difference between the actual quantity of input put into a manufacturing process and the estimated or budgeted quantity. The input could be labor hours or other overhead costs. The efficiency variance shows how productive or efficient the manufacturing process was with its inputs.
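The steps for computing a population variance listed above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `population_variance` and the two small data sets are hypothetical and were chosen so that both sets share the same mean but differ in spread.

```python
def population_variance(values):
    """Population variance: the average squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n                      # step 1: the mean
    deviations = [x - mean for x in values]     # step 2: differences from the mean
    squared = [d * d for d in deviations]       # step 3: squared differences
    return sum(squared) / n                     # step 4: their average

# Hypothetical data sets, both with mean 10 (assumed for illustration).
homogeneous = [9, 10, 10, 11]
heterogeneous = [2, 6, 14, 18]

var_low = population_variance(homogeneous)
var_high = population_variance(heterogeneous)
print(var_low, var_high)
```

The second data set is more heterogeneous, so its variance is much larger even though the means coincide.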

Normal IID samples - Known mean

In this example we make the same assumptions we made in the example of set estimation of the variance entitled Normal IID samples - Known mean. The reader is strongly advised to read that example before reading this one.

The sample

The sample is made of $n$ independent draws from a normal distribution having known mean $\mu$ and unknown variance $\sigma^2$. Specifically, we observe $n$ realizations $x_1$, ..., $x_n$ of $n$ independent random variables $X_1$, ..., $X_n$, all having a normal distribution with known mean $\mu$ and unknown variance $\sigma^2$. The sample is the $n$-dimensional vector $\xi_n = [x_1 \; \ldots \; x_n]$, which is a realization of the random vector $\Xi_n = [X_1 \; \ldots \; X_n]$.

The null hypothesis

We test the null hypothesis that the variance $\sigma^2$ is equal to a specific value $\sigma_0^2 > 0$:

$$H_0 : \sigma^2 = \sigma_0^2$$

The alternative hypothesis

We assume that the parameter space is the set of strictly positive real numbers, i.e., $\sigma^2 \in \mathbb{R}_{++}$. Therefore, the alternative hypothesis is

$$H_1 : \sigma^2 \neq \sigma_0^2$$

The test statistic

To construct a test statistic, we use the following point estimator of the variance:

$$\widehat{\sigma}_n^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2$$

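The test statistic is

$$\chi_n^2 = \frac{n}{\sigma_0^2} \widehat{\sigma}_n^2 = \frac{1}{\sigma_0^2} \sum_{i=1}^{n} (X_i - \mu)^2$$

This test statistic is often called the Chi-square statistic (also written as $\chi^2$-statistic), and a test of hypothesis based on this statistic is called a Chi-square test (also written as $\chi^2$-test).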

The critical region

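Let $z_1, z_2 \in \mathbb{R}_{+}$ and $z_1 < z_2$. We reject the null hypothesis if $\chi_n^2 < z_1$ or if $\chi_n^2 > z_2$. In other words, the critical region is

$$C_{\chi_n^2} = [0, z_1) \cup (z_2, \infty)$$

Thus, the critical values of the test are $z_1$ and $z_2$.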
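As a rough sketch of how the known-mean Chi-square statistic and the two-sided rejection rule fit together, consider the Python snippet below. The function names, the five-point sample, the known mean of 0, and the null variance of 1 are all hypothetical. The critical values 0.831 and 12.833 are the 2.5% and 97.5% quantiles of a Chi-square distribution with 5 degrees of freedom as found in standard tables; they are an assumed choice, not one prescribed by the lecture.

```python
def chi_square_statistic_known_mean(sample, mu, sigma0_sq):
    """Statistic n * sigma_hat^2 / sigma0^2, with deviations taken from the known mean mu."""
    n = len(sample)
    sigma_hat_sq = sum((x - mu) ** 2 for x in sample) / n  # point estimator of the variance
    return n * sigma_hat_sq / sigma0_sq

def in_critical_region(stat, z1, z2):
    """True when the statistic falls below z1 or above z2, i.e. the null is rejected."""
    return stat < z1 or stat > z2

# Hypothetical sample and settings (assumptions for illustration only).
sample = [1.2, -0.4, 0.3, 2.1, -1.0]
stat = chi_square_statistic_known_mean(sample, mu=0.0, sigma0_sq=1.0)
reject = in_critical_region(stat, z1=0.831, z2=12.833)
print(stat, reject)
```

Here the statistic is simply the sum of squared deviations from the known mean divided by the null variance, and it falls between the two critical values, so the null hypothesis is not rejected.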

The power function

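The power function of the test is

$$\pi(\sigma^2) = P_{\sigma^2}\left(\chi_n^2 \notin [z_1, z_2]\right) = 1 - P\left(\frac{\sigma_0^2}{\sigma^2} z_1 \le \kappa_n \le \frac{\sigma_0^2}{\sigma^2} z_2\right)$$

where $\kappa_n$ is a Chi-square random variable with $n$ degrees of freedom and the notation $P_{\sigma^2}$ is used to indicate the fact that the probability of rejecting the null hypothesis is computed under the hypothesis that the true variance is equal to $\sigma^2$.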

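The power function can be written as

$$\begin{aligned}
\pi(\sigma^2) &= P_{\sigma^2}\left(\chi_n^2 \notin [z_1, z_2]\right) \\
&= 1 - P_{\sigma^2}\left(z_1 \le \chi_n^2 \le z_2\right) \\
&= 1 - P_{\sigma^2}\left(z_1 \le \frac{n}{\sigma_0^2} \widehat{\sigma}_n^2 \le z_2\right) \\
&= 1 - P_{\sigma^2}\left(\frac{\sigma_0^2}{\sigma^2} z_1 \le \frac{n}{\sigma^2} \widehat{\sigma}_n^2 \le \frac{\sigma_0^2}{\sigma^2} z_2\right) \\
&= 1 - P\left(\frac{\sigma_0^2}{\sigma^2} z_1 \le \kappa_n \le \frac{\sigma_0^2}{\sigma^2} z_2\right)
\end{aligned}$$

where we have defined

$$\kappa_n = \frac{n}{\sigma^2} \widehat{\sigma}_n^2$$

As demonstrated in the lecture entitled Point estimation of the variance, the estimator $\widehat{\sigma}_n^2$ has a Gamma distribution with parameters $n$ and $\sigma^2$, given the assumptions on the sample we made above. Multiplying a Gamma random variable with parameters $n$ and $\sigma^2$ by $n / \sigma^2$, one obtains a Chi-square random variable with $n$ degrees of freedom. Therefore, the variable $\kappa_n$ has a Chi-square distribution with $n$ degrees of freedom.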

The size of the test

When evaluated at the point $\sigma^2 = \sigma_0^2$, the power function is equal to the probability of committing a Type I error, i.e., the probability of rejecting the null hypothesis when the null hypothesis is true. This probability is called the size of the test and it is equal to

$$\pi(\sigma_0^2) = 1 - P\left(z_1 \le \kappa_n \le z_2\right)$$

where $\kappa_n$ is a Chi-square random variable with $n$ degrees of freedom (this is trivially obtained by substituting $\sigma^2$ with $\sigma_0^2$ in the formula for the power function found above).
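The size and the power of the known-mean test can also be approximated by simulation, which may help make the power function concrete. The sketch below is only an illustration under stated assumptions: the function `rejection_rate` is hypothetical, the sample size is 20, the known mean is 0, the null variance is 1, and the critical values 9.591 and 34.170 are the 2.5% and 97.5% quantiles of a Chi-square distribution with 20 degrees of freedom taken from standard tables.

```python
import random

def rejection_rate(true_var, n, mu, sigma0_sq, z1, z2, reps=20000, seed=0):
    """Monte Carlo estimate of the probability of rejecting the null
    when the data are normal with known mean mu and variance true_var."""
    rng = random.Random(seed)
    sd = true_var ** 0.5
    rejections = 0
    for _ in range(reps):
        # Sum of squared deviations from the known mean, divided by the null
        # variance: this equals n * sigma_hat^2 / sigma0^2.
        stat = sum((rng.gauss(mu, sd) - mu) ** 2 for _ in range(n)) / sigma0_sq
        if stat < z1 or stat > z2:
            rejections += 1
    return rejections / reps

# Assumed settings: n = 20, null variance 1, table quantiles as critical values.
size_estimate = rejection_rate(1.0, 20, 0.0, 1.0, 9.591, 34.170)   # true variance equals the null value
power_estimate = rejection_rate(2.0, 20, 0.0, 1.0, 9.591, 34.170)  # true variance twice the null value
print(size_estimate, power_estimate)
```

With these critical values the first estimate should land near 0.05, the nominal size, while the second is the estimated power at a true variance of 2 and should be considerably larger.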

Normal IID samples - Unknown mean

This example is similar to the previous one. The only difference is that we now relax the assumption that the mean of the distribution is known.

The sample

In this example, the sample is made of $n$ independent draws from a normal distribution having unknown mean $\mu$ and unknown variance $\sigma^2$. Specifically, we observe $n$ realizations $x_1$, ..., $x_n$ of $n$ independent random variables $X_1$, ..., $X_n$, all having a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$. The sample is the $n$-dimensional vector $\xi_n = [x_1 \; \ldots \; x_n]$, which is a realization of the random vector $\Xi_n = [X_1 \; \ldots \; X_n]$.

The null hypothesis

We test the null hypothesis that the variance $\sigma^2$ is equal to a specific value $\sigma_0^2 > 0$:

$$H_0 : \sigma^2 = \sigma_0^2$$

The alternative hypothesis

We assume that the parameter space is the set of strictly positive real numbers, i.e., $\sigma^2 \in \mathbb{R}_{++}$. Therefore, the alternative hypothesis is

$$H_1 : \sigma^2 \neq \sigma_0^2$$

The test statistic

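We construct a test statistic by using the sample mean

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$$

and either the unadjusted sample variance

$$S_n^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2$$

or the adjusted sample variance

$$s_n^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2$$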

The test statistic is

$$\chi_n^2 = \frac{n}{\sigma_0^2} S_n^2 = \frac{n-1}{\sigma_0^2} s_n^2$$

This test statistic is often called the Chi-square statistic (also written as $\chi^2$-statistic), and a test of hypothesis based on this statistic is called a Chi-square test (also written as $\chi^2$-test).
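When the mean is unknown, the statistic can be computed from either the unadjusted or the adjusted sample variance, and the two routes give the same number because both reduce to the sum of squared deviations from the sample mean divided by the null variance. The following Python sketch checks this on a hypothetical five-point sample with an assumed null variance of 2; the function name is illustrative only.

```python
def chi_square_statistic_unknown_mean(sample, sigma0_sq):
    """Return the statistic computed two ways: from the unadjusted
    sample variance (divisor n) and from the adjusted one (divisor n - 1)."""
    n = len(sample)
    sample_mean = sum(sample) / n
    sum_sq = sum((x - sample_mean) ** 2 for x in sample)
    unadjusted = sum_sq / n          # unadjusted sample variance
    adjusted = sum_sq / (n - 1)      # adjusted sample variance
    from_unadjusted = n * unadjusted / sigma0_sq
    from_adjusted = (n - 1) * adjusted / sigma0_sq
    return from_unadjusted, from_adjusted

# Hypothetical sample and null variance (assumptions for illustration only).
sample = [4.1, 5.3, 3.8, 6.0, 4.8]
stat_a, stat_b = chi_square_statistic_unknown_mean(sample, sigma0_sq=2.0)
print(stat_a, stat_b)
```

Both values agree up to floating-point rounding, so the choice between the two sample variances is a matter of convenience.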

The critical region

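Let $z_1, z_2 \in \mathbb{R}_{+}$ and $z_1 < z_2$. We reject the null hypothesis if $\chi_n^2 < z_1$ or if $\chi_n^2 > z_2$. In other words, the critical region is

$$C_{\chi_n^2} = [0, z_1) \cup (z_2, \infty)$$

Thus, the critical values of the test are $z_1$ and $z_2$.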

The power function

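The power function of the test is

$$\pi(\sigma^2) = P_{\sigma^2}\left(\chi_n^2 \notin [z_1, z_2]\right) = 1 - P\left(\frac{\sigma_0^2}{\sigma^2} z_1 \le \kappa_{n-1} \le \frac{\sigma_0^2}{\sigma^2} z_2\right)$$

where the notation $P_{\sigma^2}$ is used to indicate the fact that the probability of rejecting the null hypothesis is computed under the hypothesis that the true variance is equal to $\sigma^2$, and $\kappa_{n-1}$ has a Chi-square distribution with $n-1$ degrees of freedom.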

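The power function can be written as

$$\begin{aligned}
\pi(\sigma^2) &= P_{\sigma^2}\left(\chi_n^2 \notin [z_1, z_2]\right) \\
&= 1 - P_{\sigma^2}\left(z_1 \le \frac{n}{\sigma_0^2} S_n^2 \le z_2\right) \\
&= 1 - P_{\sigma^2}\left(\frac{\sigma_0^2}{\sigma^2} z_1 \le \frac{n}{\sigma^2} S_n^2 \le \frac{\sigma_0^2}{\sigma^2} z_2\right) \\
&= 1 - P\left(\frac{\sigma_0^2}{\sigma^2} z_1 \le \kappa_{n-1} \le \frac{\sigma_0^2}{\sigma^2} z_2\right)
\end{aligned}$$

where we have defined

$$\kappa_{n-1} = \frac{n}{\sigma^2} S_n^2$$

Given the assumptions on the sample we made above, the unadjusted sample variance $S_n^2$ has a Gamma distribution with parameters $n-1$ and $\frac{n-1}{n} \sigma^2$ (see Point estimation of the variance), so that the random variable

$$\frac{n}{\sigma^2} S_n^2$$

has a Chi-square distribution with $n-1$ degrees of freedom.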

The size of the test

The size of the test is equal to

$$\pi(\sigma_0^2) = 1 - P\left(z_1 \le \kappa_{n-1} \le z_2\right)$$

where $\kappa_{n-1}$ has a Chi-square distribution with $n-1$ degrees of freedom (this is trivially obtained by substituting $\sigma^2$ with $\sigma_0^2$ in the formula for the power function found above).

Solved exercises

Below you can find some exercises with explained solutions.

Exercise 1

Denote by $F_n(x)$ the distribution function of a Chi-square random variable with $n$ degrees of freedom. Suppose you observe $40$ independent realizations of a normal random variable. What is the probability, expressed in terms of $F_n(x)$, that you will commit a Type I error if you run a Chi-square test of the null hypothesis that the variance is equal to $1$, based on the $40$ observed realizations, and choosing $z_1 = 15$ and $z_2 = 65$ as the critical values?

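The probability of committing a Type I error is equal to the size of the test:

$$\pi(\sigma_0^2) = 1 - P\left(z_1 \le \kappa_{39} \le z_2\right)$$

where $\kappa_{39}$ has a Chi-square distribution with $n - 1 = 39$ degrees of freedom (the mean is unknown, so one degree of freedom is lost). But, with $z_1 = 15$ and $z_2 = 65$,

$$P\left(15 \le \kappa_{39} \le 65\right) = F_{39}(65) - F_{39}(15)$$

Thus,

$$\pi(\sigma_0^2) = 1 - F_{39}(65) + F_{39}(15)$$

If you wish, you can utilize some statistical software to compute the values of the distribution function. For example, with the MATLAB commands chi2cdf(65,39) and chi2cdf(15,39) we obtain

$$F_{39}(65) \approx 0.9944, \qquad F_{39}(15) \approx 0.0002$$

As a consequence, the size of the test is

$$\pi(\sigma_0^2) \approx 1 - 0.9944 + 0.0002 = 0.0058$$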

Exercise 2

Make the same assumptions as in the previous exercise and denote by $F_n^{-1}$ the inverse of $F_n$. Change the critical value $z_1$ in such a way that the size of the test becomes exactly equal to $5\%$.

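Replace $15$ with $z_1$ in the formula for the size of the test:

$$\pi(\sigma_0^2) = 1 - F_{39}(65) + F_{39}(z_1)$$

You need to set $z_1$ in such a way that $\pi(\sigma_0^2) = 0.05$. In other words, you need to solve

$$1 - F_{39}(65) + F_{39}(z_1) = 0.05$$

which is equivalent to

$$F_{39}(z_1) = F_{39}(65) - 0.95$$

Provided the right-hand side of the equation is positive, this is solved by

$$z_1 = F_{39}^{-1}\left(F_{39}(65) - 0.95\right)$$

If you wish, you can compute $z_1$ numerically. From the previous exercise we know that

$$F_{39}(65) \approx 0.9944$$

Therefore, we need to compute

$$z_1 = F_{39}^{-1}(0.0444)$$

In MATLAB, this is done with the command chi2inv(0.0444,39), which gives as a result a value of approximately $z_1 \approx 25.3$.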
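For readers without MATLAB, the quantities in Exercises 1 and 2 can be approximated with the standard library of Python alone. The sketch below is an assumption-laden illustration, not the lecture's method: `chi2_cdf` and `chi2_inv` are hypothetical helper names, the distribution function is evaluated with the power series for the regularized lower incomplete gamma function, and the inverse is found by bisection. The 39 degrees of freedom, the critical values 15 and 65, and the target size of 5% are read off the MATLAB commands and figures quoted in the exercises.

```python
import math

def chi2_cdf(x, dof):
    """Chi-square distribution function, via the power series for the
    regularized lower incomplete gamma function P(dof/2, x/2)."""
    if x <= 0:
        return 0.0
    a, h = dof / 2.0, x / 2.0
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:      # stop once further terms are negligible
        k += 1
        term *= h / (a + k)
        total += term
    return math.exp(a * math.log(h) - h - math.lgamma(a + 1.0)) * total

def chi2_inv(p, dof, iterations=100):
    """Inverse of chi2_cdf by bisection (assumes 0 < p < 1)."""
    lo, hi = 0.0, float(dof)
    while chi2_cdf(hi, dof) < p:     # widen the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

dof = 39                             # 40 observations, unknown mean
f_low, f_high = chi2_cdf(15.0, dof), chi2_cdf(65.0, dof)
size = 1.0 - f_high + f_low          # Exercise 1: size with critical values 15 and 65

target_size = 0.05                   # Exercise 2: move the lower critical value
new_z1 = chi2_inv(target_size - (1.0 - f_high), dof)
new_size = 1.0 - f_high + chi2_cdf(new_z1, dof)
print(f_low, f_high, size, new_z1, new_size)
```

Because the script uses the unrounded value of the distribution function at 65 rather than 0.9944, the lower critical value it finds can differ slightly in the later decimals from the one obtained with chi2inv(0.0444,39).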

Exercise 3

Make the same assumptions as in Exercise 1 above. If the unadjusted sample variance is equal to 0.9, is the null hypothesis rejected?

In order to carry out the test, we need to compute the test statistic

$$\chi_n^2 = \frac{n}{\sigma_0^2} S_n^2$$

where $n$ is the sample size, $\sigma_0^2$ is the value of the variance under the null hypothesis, and $S_n^2$ is the unadjusted sample variance.

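Thus, with $n = 40$, $\sigma_0^2 = 1$ and $S_n^2 = 0.9$, the value of the test statistic is

$$\chi_n^2 = \frac{40}{1} \cdot 0.9 = 36$$

Since $z_1 = 15$ and $z_2 = 65$, we have that

$$z_1 \le \chi_n^2 \le z_2$$

In other words, the test statistic lies between the two critical values of the test and therefore does not fall in the critical region. As a consequence, the null hypothesis is not rejected.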
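The arithmetic of Exercise 3 is short enough to check directly. The snippet below is a minimal sketch that assumes a sample size of 40, a null variance of 1, and the critical values 15 and 65 from Exercise 1; the variable names are illustrative.

```python
# Assumed inputs, taken from Exercises 1 and 3.
n = 40                    # sample size
sigma0_sq = 1.0           # variance under the null hypothesis (assumed to be 1)
unadjusted_var = 0.9      # observed unadjusted sample variance
z1, z2 = 15.0, 65.0       # critical values

stat = n * unadjusted_var / sigma0_sq
reject = stat < z1 or stat > z2   # two-sided rejection rule
print(stat, reject)
```

The statistic falls well inside the interval between the critical values, so the decision would not change for moderately different sample variances.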

How to cite

Please cite as:

Taboga, Marco (2017). 'Hypothesis tests about the variance', Lectures on probability theory and mathematical statistics, Third edition. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/hypothesis-testing-variance.

What does a high/low standard deviation mean in real terms?

1 Answer

The higher the standard deviation the more variability or spread you have in your data.

Explanation:

Standard deviation measures how far the values in your data set typically lie from the mean.

The larger your standard deviation, the more spread or variation in your data. Small standard deviations mean that most of your data is clustered around the mean.

In the following graph, the mean is 84.47, the standard deviation is 6.92, and the distribution looks like this:
[Graph: distribution of test scores with mean 84.47 and standard deviation 6.92]
Many of the test scores are around the average. There's one student who scored a 96, two students who scored 69, another two who scored 71, but most students scored close or somewhat close to the average of 84.47.

In this second graph, the mean is 80, the standard deviation is 14.57, and the distribution looks like this:
[Graph: distribution of test scores with mean 80 and standard deviation 14.57]
There is greater variability in the test scores. One student scored a 24, which is pretty far from the average test score of 80. Another student scored a 45, which also isn't close to 80.
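The contrast between a tightly clustered and a widely spread set of scores can be reproduced with a few lines of Python. The two lists below are hypothetical scores made up for illustration; they are not the data behind the graphs described above, and the resulting means and standard deviations will not match the figures quoted there.

```python
from statistics import mean, pstdev

# Hypothetical test scores (assumed for illustration only).
clustered_scores = [78, 82, 84, 85, 86, 88, 90]
spread_scores = [24, 45, 80, 88, 95, 98, 100]

clustered_sd = pstdev(clustered_scores)   # population standard deviation
spread_sd = pstdev(spread_scores)

print(mean(clustered_scores), clustered_sd)
print(mean(spread_scores), spread_sd)
```

The second list contains a few scores far from its mean, which is what drives its standard deviation well above that of the first list.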
