Significance tests

Choosing a significance test.

Published 16th February 2017 by Andy Connelly. Updated 9th May 2017.


In an experiment you may hypothesize that there is a relationship between two values or two sets of values. To ensure that the relationship is robust, you need to carry out a test of the significance of that hypothesized relationship.

Significance test: A statistical test to determine whether there is a statistically significant difference between two (or more) sets of data at a defined probability level.

Two common examples of this are:

  • A test to evaluate whether the difference between a measured amount and a standard amount (e.g. a certified reference material) can be explained by the presence of random errors in the measurement process.
  • A test to evaluate whether differences between the results from a new method and those of an accepted (perhaps standard) method are significant.

Here I am going to cover the basics of picking and performing a significance test.

DISCLAIMER: I am not an expert on statistics. The content of this blog is what I have discovered through my efforts to understand the subject. I have done my best to make the information here as accurate as possible. If you spot any errors or omissions, or have any comments, please let me know.

Steps in carrying out a significance test

Figure 3 shows a range of significance tests you can use for different situations. There are many more tests available than are listed here, and each one is designed for a specific circumstance. For all of the tests we use the same basic procedure (see Figure 1).

Figure 1: Flow chart showing the key steps in carrying out a significance test.

State the Research Hypothesis

A research hypothesis states the expected relationship between two variables. It may be stated in general terms, or it may include dimensions of direction and magnitude. For example,

  • The amount of C-14 is related to the distance from the spillage (2-tail).
  • The longer the dissolution of the rock, the greater the concentration of Ca in solution (1-tail).

The research hypothesis will define whether you use a one-tailed or a two-tailed test.

One tail or two?

In general analytical chemistry you will almost always use the standard two-tailed test. The tail refers to the tail of the Gaussian distribution. However, in certain situations you may need to use a one-tailed calculation. This effectively means you take into consideration uncertainty on only one side of the distribution.

The normal distribution is symmetrical about the mean. When we talk about a certain percentage of the distribution (say 95%), we can measure that area from one extreme, which leaves the remaining area (5%) in a single tail (one-tailed), or centre it on the mean, which leaves half of the remaining area (2.5%) in each tail (two-tailed).

This becomes relevant when you are only looking for a difference in one direction. For example, if you are testing two means and have no reason to believe one is bigger or smaller than the other, use a two-tailed test. Use a one-tailed test if you want to know whether one mean is significantly greater than the other.

The difference between the tests is purely the critical value of t that you compare against. There is normally a separate table (or column) of t values to use in a one-tailed test.
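To see how the choice of tails changes the critical value, here is a minimal sketch. Python's standard library has no t distribution, so I have assumed a z-test (normal distribution) instead; a t table behaves the same way, just with larger values for small samples. The function name `critical_z` is my own.

```python
from statistics import NormalDist


def critical_z(alpha: float, tails: int) -> float:
    """Return the positive critical value of the standard normal distribution."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test splits alpha equally between the two ends of the curve;
    # a one-tailed test puts all of alpha at one end.
    tail_area = alpha / tails
    return NormalDist().inv_cdf(1.0 - tail_area)


alpha = 0.05
one_tail = critical_z(alpha, tails=1)
two_tail = critical_z(alpha, tails=2)
print(f"one-tailed critical z: {one_tail:.3f}")  # 1.645
print(f"two-tailed critical z: {two_tail:.3f}")  # 1.960
```

At the same α, the one-tailed critical value is smaller, so a difference in the expected direction is easier to declare significant. That is why the choice should follow from the research hypothesis and not from the data.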

Null hypothesis

The Null Hypothesis is a statement used as a starting point for a significance test – it outlines an assumption which you test. It is very much the “Nothing to see here” hypothesis. The alternative hypothesis usually states the opposite to the Null hypothesis – the “something is going on here” hypothesis.

  • Null hypothesis (H0): The hypothesis that the population parameters being compared (e.g., mean or variance) on the basis of the data are the same, and the observed differences arise from random variation only. This is the hypothesis used in many statistical significance tests that “there is no difference between the factors that are being compared.”

The Null hypothesis, H0:

  • States the assumption (numerical) to be tested
  • Begins with the assumption that the null hypothesis is TRUE
  • Always contains the ‘=’ sign

The alternative hypothesis, Ha:

  • Is the opposite of the null hypothesis
  • Challenges the status quo
  • Does not generally contain the ‘=’ sign
  • Is generally the hypothesis that is believed to be true by the researcher
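In symbols, the two lists above might be written as follows for a test that compares a population mean with a reference value. The notation (μ for the population mean, μ0 for the reference value) is my assumption for illustration. Some textbooks write the one-tailed null hypothesis as μ ≤ μ0, which still contains the equality.

```latex
\begin{align*}
\text{Two-tailed test:} \quad H_0 &: \mu = \mu_0, & H_a &: \mu \neq \mu_0 \\
\text{One-tailed test:} \quad H_0 &: \mu = \mu_0, & H_a &: \mu > \mu_0
\end{align*}
```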

The examples below show the three hypotheses together:

Figure 2: Example of a Null hypothesis.

Select a ‘probability of error’ level (alpha level)

You need to specify the probability at which you want to test your hypothesis. For example, if you want to test at the 95% confidence level, you carry out your test at a significance level of α = 0.05 (that is, 1 − 0.95).

In most situations you will use 95%; however, you may want to vary that depending on the potential errors you are willing to accept.

Choosing a test

There are various tests available as shown in Figure 3; however, this is just a small sample of the many tests available. I will highlight some of the important ones here. The main tests are described briefly in Appendix A.

Typically, a t-test is used to examine the differences between the means of two groups. For example, in an experiment you may want to compare the overall mean for the group on which the manipulation took place against that of a control group. A paired t-test allows a new methodology to be tested against an accepted method by analysing several different samples of slightly varying composition. Paired data means that each data point in one data set is related to exactly one data point in the other data set.
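As a rough sketch of the paired case, the code below calculates the paired t statistic from the per-sample differences. The data are made up for illustration, the function name `paired_t_statistic` is my own, and the critical value is the standard tabulated one for a two-tailed test at α = 0.05 with 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(method_a, method_b):
    """Return (t, degrees of freedom) for a paired t-test on two equal-length data sets."""
    if len(method_a) != len(method_b):
        raise ValueError("paired data need the same number of points in each set")
    # Work on the per-sample differences so that sample-to-sample variation cancels out.
    differences = [a - b for a, b in zip(method_a, method_b)]
    n = len(differences)
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t, n - 1


# Hypothetical results: the same five samples measured by each method.
new_method = [10.2, 12.1, 9.8, 15.3, 11.0]
accepted_method = [10.0, 12.4, 9.9, 15.0, 10.8]

t, dof = paired_t_statistic(new_method, accepted_method)
T_CRITICAL = 2.776  # two-tailed, alpha = 0.05, 4 degrees of freedom (from a t table)

print(f"t = {t:.3f} with {dof} degrees of freedom")
if abs(t) > T_CRITICAL:
    print("The methods differ significantly at the 95% level.")
else:
    print("No significant difference between the methods at the 95% level.")
```

With these invented numbers the statistic is well below the critical value, so the null hypothesis (no difference between the methods) is not rejected.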

If you have more than two groups, you shouldn’t just use multiple t-tests: the error adds up, so you increase your chances of finding an effect when there really isn’t one. Therefore, when you have more than two groups to compare, e.g. in a drugs trial with a high dose, a low dose and a placebo group (so 3 groups), you use ANOVA to examine whether there are any differences between the groups.
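The sketch below puts numbers on both points. The first function estimates how the error adds up across pairwise t-tests, under the simplifying assumption that the comparisons are independent. The second calculates the F statistic for a one-way ANOVA. The function names and the trial data are hypothetical.

```python
from math import comb
from statistics import mean


def familywise_error(alpha: float, n_groups: int) -> float:
    """Chance of at least one false positive across all pairwise t-tests.

    Assumes, as a simplification, that the comparisons are independent.
    """
    n_comparisons = comb(n_groups, 2)
    return 1.0 - (1.0 - alpha) ** n_comparisons


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    # Variation of the individual values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for group, m in zip(groups, group_means) for x in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Three groups need three pairwise t-tests.
print(f"familywise error: {familywise_error(0.05, 3):.3f}")  # 0.143

# Hypothetical responses for the three groups in the drugs trial example.
high_dose = [8.0, 9.0, 10.0]
low_dose = [6.0, 7.0, 8.0]
placebo = [4.0, 5.0, 6.0]

f, df_b, df_w = one_way_anova_f([high_dose, low_dose, placebo])
print(f"F = {f:.1f} with ({df_b}, {df_w}) degrees of freedom")  # F = 12.0 with (2, 6)
```

With three groups the chance of at least one false positive is roughly 14% instead of 5%. For these invented data, F = 12.0 is above the tabulated critical value of about 5.14 for (2, 6) degrees of freedom at α = 0.05, so at least one group mean differs from the others.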

Figure 3: Significance tests for a range of situations. Assumes normally distributed data. This is only a small sample of those available. See Appendix B for links to more information on the tests included here.

Implementing a test

In general, the tests shown in Figure 3 are based around an equation that you use to calculate a value. That calculated value can then be used to test the hypothesis. This is shown in Figure 1 and Table 1.
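As an example of such an equation, the one-sample t-test uses t = (x̄ − μ0) / (s / √n), where x̄ is the sample mean, s the sample standard deviation, n the number of measurements and μ0 the reference value. Below is a minimal sketch with invented replicate measurements of a reference material certified at 50.0. The function name `one_sample_t` and the data are my assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(measurements, reference_value):
    """Return (t, degrees of freedom) for a one-sample t-test."""
    n = len(measurements)
    standard_error = stdev(measurements) / sqrt(n)
    t = (mean(measurements) - reference_value) / standard_error
    return t, n - 1


# Hypothetical replicate measurements of a reference material certified at 50.0.
measurements = [50.4, 50.6, 50.2, 50.5, 50.3]
certified_value = 50.0

t, dof = one_sample_t(measurements, certified_value)
T_CRITICAL = 2.776  # two-tailed, alpha = 0.05, 4 degrees of freedom (from a t table)

print(f"t = {t:.3f} with {dof} degrees of freedom")
print("reject H0" if abs(t) > T_CRITICAL else "do not reject H0")
```

Here the calculated value is larger than the critical value, so random error alone is unlikely to explain the difference from the certified value.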

After following Figure 1 you can see that there are two ways of deciding whether to reject the Null hypothesis.

Table 1: Two ways of judging a significance test.

In these calculations the p-value is normally calculated using a software package, but it can be calculated by hand. Traditionally, the probability hypothesis test has been more common. However, now that software packages are more widely available, the p-value test is gaining popularity and becoming the standard.

In summary, if the probability that the data are consistent with the Null hypothesis falls below a predetermined low value (say 0.05 or 0.01), then the hypothesis is rejected at that probability. Therefore, p<0.05 means that if the null hypothesis were true we would find the observed data (or more accurately the value of the statistic, or greater, calculated from the data) in less than 5% of repeated experiments.
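The sketch below shows that the two routes reach the same decision. I have assumed the two routes are comparing the calculated statistic with a critical value and comparing the p-value with α. To stay within Python's standard library it uses a two-tailed z-test, and the observed statistic is a made-up number.

```python
from statistics import NormalDist


def two_tailed_p_value(z: float) -> float:
    """Probability of a statistic at least as extreme as z if the null hypothesis is true."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


alpha = 0.05
z_observed = 2.30  # hypothetical test statistic calculated from some data

# Route 1: compare the calculated statistic with the critical value.
z_critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
reject_by_critical_value = abs(z_observed) > z_critical

# Route 2: compare the p-value with alpha.
p_value = two_tailed_p_value(z_observed)
reject_by_p_value = p_value < alpha

print(f"critical value: {z_critical:.3f}, reject H0: {reject_by_critical_value}")
print(f"p-value: {p_value:.4f}, reject H0: {reject_by_p_value}")
```

Both routes reject the null hypothesis for this statistic. The p-value has the advantage of showing how far below α the result falls, not just which side of the threshold it is on.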


Significance tests are a fantastic method for comparing sets of data. It took me a long time to understand the key aspects of the significance test. I hope that the contents of this blog have helped you better understand this vital statistical technique.

Further reading

  1. Statistics and Chemometrics for Analytical Chemistry, Miller & Miller, 5th ed., Pearson, 2005.
  2. Data Analysis for Chemistry: An Introductory Guide for Students and Laboratory Scientists, Hibbert & Gooding, 2006.
  3. Statistics: A Guide to the Use of Statistical Methods in the Physical Sciences, Roger Barlow, John Wiley & Sons, 1989.

Appendix A: Details of tests

1 sample t-test summary.
ANOVA summary.
(Fisher) F-test summary.
Independent samples t-test (unpaired) summary.
Z-test summary.

Appendix B – links for more details

These are some links for more information on the significance tests.


There are many versions of ANOVA. The ANOVAs listed in Figure 3 are just some examples. For more see:



