Basic statistical techniques can be used to calculate a standard deviation and so estimate the error margin of a measurement. However, even a standard deviation estimated from a sample rather than the whole population still requires a relatively large number of measurements.
In spectroscopy, only a handful of measurements of a sample are usually taken, typically between 5 and 10. At this sample size, the basic expressions for the standard deviation of a normal distribution do not produce reliable values for the measurement error. Luckily, work carried out in the early 20th century on beer production offers a solution.
William Sealy Gosset. Image Credit: Wikipedia.org
William Sealy Gosset and Guinness
William Sealy Gosset was a statistician who worked at the Guinness Brewery in Dublin as Head Brewer. He was interested in obtaining the best yield from varieties of barley, a key ingredient in beer. He ran into the small sample size problem when he had to draw meaningful statistical conclusions from as few as three grains of barley.
In his 1908 paper, The Probable Error of a Mean, he describes the problem as follows:
“as we decrease the number of experiments, the value of the standard deviation found from the sample of experiments becomes itself subject to an increasing error, until judgements reached in this way may become altogether misleading.”
Gosset devised what came to be known as the Student t-distribution function (so named because he published the paper under the pseudonym ‘Student’) and published tables of values that can be used when working with extremely small sample sizes.
The t-distribution is shorter and wider than the normal distribution, so it allows for more outlying measurements. As the number of measurements increases, it moves towards a classic normal curve.
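The shape difference described above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names t_pdf and normal_pdf are illustrative choices, and the density formula is the standard textbook expression for the t-distribution rather than anything taken from Gosset's tables.

```python
import math

def t_pdf(x, f):
    """Density of the Student t-distribution with f degrees of freedom."""
    # Work in log space so that large f does not overflow the gamma function.
    log_norm = (math.lgamma((f + 1) / 2) - math.lgamma(f / 2)
                - 0.5 * math.log(f * math.pi))
    return math.exp(log_norm - (f + 1) / 2 * math.log1p(x * x / f))

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Compare the height of the peak (x = 0) and of the tail (x = 3).
for f in (2, 4, 9, 100):
    print(f"f={f:>3}: peak={t_pdf(0, f):.4f}  tail={t_pdf(3, f):.4f}")
print(f"normal: peak={normal_pdf(0):.4f}  tail={normal_pdf(3):.4f}")
```

For small f the peak is lower and the tail is higher than for the normal curve, and the two sets of values converge as f grows.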
The Student t-distribution Function
The expression for the confidence interval given by the t-distribution function is:
uc = x̅ +/- Tx
Where:
x̅ : average of the measured values
Tx : value of the t-distribution function. This is calculated from the following formula:
Tx = (t(f, P) × s) / √N
Where:
t : value taken from published tables, which depends on f (the degrees of freedom, equal to the number of measurements minus 1, so f = N - 1) and P (the desired confidence level)
s : standard deviation of the measurement series
N : number of measurements taken
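The two expressions above can be sketched as a short Python function. This is an illustrative sketch, not the method of any particular analyzer: the function name t_confidence_interval is hypothetical, the readings are made-up values, and s is assumed to be the sample standard deviation (N - 1 denominator). The t value still has to be looked up in published tables for the chosen f and P.

```python
import math
import statistics

def t_confidence_interval(readings, t_value):
    """Return (mean, Tx) for a series of readings and a tabulated t value."""
    n = len(readings)
    if n < 2:
        raise ValueError("at least two readings are needed")
    mean = statistics.mean(readings)
    s = statistics.stdev(readings)  # assumption: sample standard deviation
    tx = t_value * s / math.sqrt(n)  # Tx = (t(f, P) x s) / sqrt(N)
    return mean, tx

# Hypothetical readings, for illustration only.
readings = [18.4, 18.5, 18.6, 18.5, 18.7]
T_95_F4 = 2.776  # published two-sided value for P = 95%, f = 5 - 1 = 4

mean, tx = t_confidence_interval(readings, T_95_F4)
print(f"uc = {mean:.2f} +/- {tx:.2f}")
```

Passing the t value in as an argument keeps the table lookup explicit, which mirrors how the formula is normally used by hand.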
Using the t-distribution Function in Spectroscopy
To see how the Student t-distribution function is used to create a confidence interval, consider some real-world composition results for chromium in a component.
Average of 10 readings: 18.54%
Standard deviation: 0.1%
We will select a confidence level of 95%, so the numbers to work with are:
N : 10 (10 readings)
s : 0.1% (standard deviation taken from table above)
t : 2.262 (taken from published tables for a confidence level of 95% and a sample size of 10, so f = N - 1 = 9)
So:
Tx = (2.262 × 0.1%) / 3.162 = 0.072%
We can employ this as the confidence interval:
uc = x̅ +/- Tx
x̅ : 18.54% (mean value of measured results, taken from table above)
uc (95%) = 18.54% +/- 0.07%
This means that we can be around 95% sure that the true value for chromium lies somewhere between 18.47% and 18.61%.
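The arithmetic of this worked example can be reproduced in a few lines of Python. The sketch below uses only the figures quoted in the text (mean, standard deviation, number of readings and the tabulated t value); the variable names are illustrative.

```python
import math

mean = 18.54  # % chromium, average of 10 readings
s = 0.1       # % standard deviation of the readings
n = 10        # number of readings
t = 2.262     # published two-sided value for P = 95%, f = n - 1 = 9

tx = t * s / math.sqrt(n)          # half-width of the confidence interval
lower, upper = mean - tx, mean + tx

print(f"Tx = {tx:.3f}%")
print(f"95% interval: {lower:.2f}% to {upper:.2f}%")
```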
Interestingly, the t-distribution function has provided us with a confidence interval half-width (0.07%) that is smaller than the standard deviation of the individual readings (0.1%). This is because the interval describes the uncertainty in the mean of the 10 readings, which shrinks with the square root of N, so the averaged spectroscopy result is actually more precise than the standard deviation alone may suggest. To use this technique, multiple readings of a single sample must be gathered in your analyzer.
This information has been sourced, reviewed and adapted from materials provided by Hitachi High-Tech Analytical Science.