If one has a perfectly balanced coin, one can calculate from probability theory that the probability of the coin falling heads up is 0.50 and the probability of the coin falling tails up is also 0.50. Therefore, if one flips the coin 100 times, the expected frequency is 50 heads and 50 tails. However, it is also likely that in a sample of 100 coin flips one will observe 45 heads and 55 tails, or 52 heads and 48 tails, or 41 heads and 59 tails, or other combinations differing from 50 heads and 50 tails. Once in a while the observation will be exactly 50 heads and 50 tails, but most of the time one will observe some deviation from it. These deviations are called sampling error because the exact proportions of 0.50 heads and 0.50 tails would be obtained only from an infinite number of flips (an impossibility, of course), and any lesser number is only a sample of that infinite number. Because samples are finite, they usually deviate from the exact expectations derived from calculations.
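To make the idea of sampling error concrete, the following is a minimal Python sketch that computes the exact probability of a given number of heads in 100 flips of a fair coin from the binomial formula. The function name `prob_heads` is a hypothetical helper chosen for this illustration. It shows that the "expected" result of exactly 50 heads turns up only about 8% of the time.

```python
from math import comb


def prob_heads(k: int, n: int) -> float:
    """Exact probability of exactly k heads in n flips of a fair coin."""
    # There are comb(n, k) sequences with k heads out of 2**n equally likely sequences.
    return comb(n, k) / 2 ** n


# Outcomes mentioned in the text for a sample of 100 flips.
for heads in (50, 52, 45, 41):
    tails = 100 - heads
    print(f"{heads} heads, {tails} tails: {prob_heads(heads, 100):.4f}")
```

Each individual outcome is fairly unlikely, and outcomes become rarer the further they lie from 50 heads and 50 tails.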
It is important to understand that statisticians can calculate the exact probability that chance alone will cause a sampling error of any particular magnitude in a sample of a given size. As the size of the sample increases, the probability of getting a sampling error of a certain percentage decreases. To illustrate, let us suppose that one were to flip a perfectly balanced coin 200 times. It is known ahead of time that there is a certain probability of obtaining 100 heads and 100 tails. It is also known that there is an exact probability, one which is smaller, that one would get a sampling error of 10 (10%) from a result of 100 heads and 100 tails, i.e., 110 heads and 90 tails or 90 heads and 110 tails. Finally, if one flips the coin 2000 times, there is a very much smaller probability of getting a sampling error as large as 10% from the expected result of 1000 heads and 1000 tails.
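The effect of sample size can be checked directly. The sketch below, with the hypothetical helper `prob_deviation_at_least`, sums exact binomial probabilities to find how likely it is that the number of heads differs from the expected number by at least a given amount. Note that it adds up every outcome at least that far from the expectation, not just the single 110:90 or 90:110 split.

```python
from math import comb


def prob_deviation_at_least(n: int, deviation: int) -> float:
    """P(|heads - n/2| >= deviation) for n flips of a fair coin (n assumed even)."""
    half = n // 2
    # Count all head/tail sequences whose number of heads is far enough from n/2.
    favorable = sum(comb(n, k) for k in range(n + 1) if abs(k - half) >= deviation)
    return favorable / 2 ** n


for flips in (200, 2000):
    ten_percent = (flips // 2) // 10  # 10% of the expected number of heads
    p = prob_deviation_at_least(flips, ten_percent)
    print(f"{flips} flips: P(deviation of at least {ten_percent}) = {p:.2e}")
```

With 200 flips a 10% sampling error is quite ordinary, while with 2000 flips it is extremely rare.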
The problem that one faces when one obtains a ratio of 60 black and 40 white offspring from a mating which one calculated should yield 50% black and 50% white can be stated very simply. It is the problem of deciding between two alternative hypotheses: (1) The genetic mechanism is one which really produces a 1:1 ratio and the deviation is due to chance (sampling error) or (2) The genetic mechanism is not one which really produces a 1:1 ratio (not necessarily a 6:4 ratio -- perhaps a 3:1 ratio). In order that your decision is not a mere guess, you should apply statistical techniques which will test goodness of fit of your observed results to your expected ratio (in this case 1:1). This test is the Chi-square test (formula presented below). You will arrive at a value for probability (p). You must know (a) what this probability refers to and (b) how to apply it to the making of your decision.
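As an illustration of the goodness-of-fit calculation, here is a minimal Python sketch assuming the standard Pearson form of the statistic: the sum, over all classes, of (observed - expected)² / expected. The function name `chi_square` is a hypothetical helper, and the data are the 60 black and 40 white offspring from the example tested against a 1:1 expectation.

```python
def chi_square(observed, expected):
    """Pearson chi-square: sum of (observed - expected)^2 / expected over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of classes")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [60, 40]  # black, white offspring actually counted
expected = [50, 50]  # counts predicted by a 1:1 ratio for 100 offspring

print(chi_square(observed, expected))  # (10^2 / 50) + (10^2 / 50) = 4.0
```

The expected counts must be scaled to the same total as the observed counts (here 100 offspring), not entered as ratios.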
The probability which you calculate is exactly this: the probable frequency with which a deviation as great as, or greater than, the observed one will appear as a result of chance alone if another similar trial or another similar sample is taken. For example, if the value of the probability which you get from your calculation is 0.50, then there is a 50% probability that another similar trial will produce results which, because of chance alone, deviate as much as or more than yours from the 1:1 ratio being tested. Another way of saying it is that one out of every two trials would be expected to show as much deviation because of sampling errors caused by chance alone. In that case, your deviation from a 1:1 ratio is obviously not significant. This result does not in any way support the hypothesis that the genetic mechanism is one which produces a 1:1 ratio; but, on the other hand, if there is a certain explanation which predicts that the mating should produce a 1:1 ratio, the deviation from the 1:1 ratio places no suspicion upon that explanation, because the deviation can be readily attributed to sampling error.
However, if the value of the probability which you calculate is 0.05, then only 5% of trials like yours will deviate from the expected ratio as much as yours because of chance alone (sampling error). Either you have actually gotten a deviation that is expected to occur by chance alone only 1 out of 20 times, or you have some genetic mechanism which is producing something other than a 1:1 ratio. If you reject the 1:1 ratio hypothesis at the 5% level, then you risk making the error of rejecting a true 1:1 producing genetic mechanism 5% of the time, or in 1 out of every 20 trials or samples. Rather arbitrarily, statisticians accept this level of potential error and call it the first level of significance. Probabilities of 5% or lower are significant; that is, you had better not ignore those 19 to 1 odds against your deviation being due to chance alone, and you should refuse to accept your observations as an indication of a 1:1 ratio. If no acceptable alternative hypothesis gives a ratio with a high goodness of fit, then you should conduct more trials to collect additional observations to see if they show a better fit to the 1:1 ratio. (After all, maybe you hit the 1 out of 20 times that a 1:1 ratio is accompanied by that much deviation because of sampling error alone.) A probability of 0.01 means that another trial would show as much deviation as yours because of chance alone only 1 time out of 100. This is the second level of significance. Your observations should certainly not be considered a 1:1 ratio. Here again, more observations could be made, or perhaps another acceptable hypothesis will be found to have a high goodness of fit.
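The two levels of significance can be expressed as a simple decision rule. The sketch below is an illustration only: `p_value_one_df` and `significance` are hypothetical helper names, and the p value is computed with the complementary error function, which is valid only for one degree of freedom (two classes). The example value of 4.0 is the chi-square that 60 and 40 observed against 50 and 50 expected would give.

```python
from math import erfc, sqrt


def p_value_one_df(chi_square_value: float) -> float:
    """Probability of a chi-square at least this large by chance, for 1 degree of freedom only."""
    # With 1 degree of freedom, chi-square is the square of a standard normal deviate.
    return erfc(sqrt(chi_square_value / 2))


def significance(p: float) -> str:
    """Classify p using the conventional 5% and 1% levels described in the text."""
    if p <= 0.01:
        return "significant at the 1% level (second level)"
    if p <= 0.05:
        return "significant at the 5% level (first level)"
    return "not significant"


p = p_value_one_df(4.0)
print(f"p = {p:.4f}: {significance(p)}")
```

For this example the p value falls a little under 0.05, so the deviation reaches the first level of significance but not the second.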
Note that this statistical test neither proves nor disproves anything. It simply determines the probability that deviations from the expected are due to chance. This helps you decide whether the explanation upon which the expectations were based is probably correct or not. Also note that you do not generally calculate the probability (p) directly; instead you calculate a chi-square value and compare that value to a table of values (presented below) assigned to different p values. Computer programs will often calculate a specific p value associated with each chi-square value, however.
Degrees of freedom means the number of values you need to know, given the total, in order to know all of the values. For example, let's assume there are 100 total individuals in a population and you want to know how many of those individuals have the dominant phenotype and how many have the recessive phenotype. If you count 75 dominant individuals, then you know that there are 25 recessive individuals. Therefore, there is one degree of freedom. If a population has three phenotypes, then you need to know how many individuals there are in total plus how many have phenotypes 1 and 2 to determine the number with phenotype 3, so there are two degrees of freedom. In general, then, the number of degrees of freedom is equal to N - 1, where N is the number of classes (e.g., phenotypes or genotypes).
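The N - 1 rule can be sketched in a few lines of Python. Both function names below are hypothetical helpers introduced for this illustration; the numbers reuse the 100-individual example, plus an invented three-phenotype split (50, 30, 20) used only to show the arithmetic.

```python
def degrees_of_freedom(class_counts) -> int:
    """Degrees of freedom for a goodness-of-fit test: number of classes minus one."""
    return len(class_counts) - 1


def missing_count(total: int, known_counts) -> int:
    """The one class count that is fixed once the total and all other counts are known."""
    return total - sum(known_counts)


# Two phenotypes: counting 75 dominant out of 100 fixes the recessive count.
print(missing_count(100, [75]))        # 25
print(degrees_of_freedom([75, 25]))    # 1

# Three phenotypes (illustrative numbers): two counts must be known to fix the third.
print(missing_count(100, [50, 30]))      # 20
print(degrees_of_freedom([50, 30, 20]))  # 2
```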

| Degrees of Freedom | p = 0.99 | p = 0.95 | p = 0.05 | p = 0.01 | p = 0.001 |
|---|---|---|---|---|---|
| 1 | 0.000 | 0.004 | 3.84 | 6.64 | 10.83 |
| 2 | 0.020 | 0.103 | 5.99 | 9.21 | 13.82 |
| 3 | 0.115 | 0.352 | 7.82 | 11.35 | 16.27 |
| 4 | 0.297 | 0.711 | 9.49 | 13.28 | 18.47 |
| 5 | 0.554 | 1.145 | 11.07 | 15.09 | 20.52 |
| 6 | 0.872 | 1.635 | 12.59 | 16.81 | 22.46 |
| 7 | 1.239 | 2.167 | 14.07 | 18.48 | 24.32 |
| 8 | 1.646 | 2.733 | 15.51 | 20.09 | 26.13 |
| 9 | 2.088 | 3.325 | 16.92 | 21.67 | 27.88 |
| 10 | 2.558 | 3.940 | 18.31 | 23.21 | 29.59 |
| 11 | 3.05 | 4.58 | 19.68 | 24.73 | 31.26 |
| 12 | 3.57 | 5.23 | 21.03 | 26.22 | 32.91 |
| 13 | 4.11 | 5.89 | 22.36 | 27.69 | 34.53 |
| 14 | 4.66 | 6.57 | 23.69 | 29.14 | 36.12 |
| 15 | 5.23 | 7.26 | 25.00 | 30.58 | 37.70 |
| 16 | 5.81 | 7.96 | 26.30 | 32.00 | 39.25 |
| 17 | 6.41 | 8.67 | 27.59 | 33.41 | 40.79 |
| 18 | 7.02 | 9.39 | 28.87 | 34.81 | 42.31 |
| 19 | 7.63 | 10.12 | 30.14 | 36.19 | 43.82 |
| 20 | 8.26 | 10.85 | 31.41 | 37.57 | 45.32 |
| 21 | 8.90 | 11.59 | 32.67 | 38.93 | 46.80 |
| 22 | 9.54 | 12.34 | 33.92 | 40.29 | 48.27 |
| 23 | 10.20 | 13.09 | 35.17 | 41.64 | 49.73 |
| 24 | 10.86 | 13.85 | 36.42 | 42.98 | 51.18 |
| 25 | 11.52 | 14.61 | 37.65 | 44.31 | 52.62 |
| 26 | 12.20 | 15.38 | 38.89 | 45.64 | 54.05 |
| 27 | 12.88 | 16.15 | 40.11 | 46.96 | 55.48 |
| 28 | 13.57 | 16.93 | 41.34 | 48.28 | 56.89 |
| 29 | 14.26 | 17.71 | 42.56 | 49.59 | 58.30 |
| 30 | 14.95 | 18.49 | 43.77 | 50.89 | 59.70 |
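
Reading the table can also be sketched in Python. The example below copies only the first three rows of the table above into a dictionary. The names `P_LEVELS`, `CRITICAL_VALUES`, and `smallest_p_reached` are hypothetical and chosen for this illustration. For a calculated chi-square value and its degrees of freedom, the function reports the smallest tabulated p whose critical value has been met or exceeded.

```python
# Column headings of the table, left to right.
P_LEVELS = (0.99, 0.95, 0.05, 0.01, 0.001)

# Critical chi-square values for 1 to 3 degrees of freedom, copied from the table.
CRITICAL_VALUES = {
    1: (0.000, 0.004, 3.84, 6.64, 10.83),
    2: (0.020, 0.103, 5.99, 9.21, 13.82),
    3: (0.115, 0.352, 7.82, 11.35, 16.27),
}


def smallest_p_reached(chi_square_value: float, df: int):
    """Smallest tabulated p whose critical value is met or exceeded, or None if none is."""
    reached = None
    # Critical values increase from left to right, so the last match is the smallest p.
    for p, critical in zip(P_LEVELS, CRITICAL_VALUES[df]):
        if chi_square_value >= critical:
            reached = p
    return reached


# A chi-square of 4.0 with 1 degree of freedom exceeds 3.84 but not 6.64.
print(smallest_p_reached(4.0, 1))  # 0.05
```

A result of 0.05 or smaller corresponds to a significant deviation in the sense described above; a result of 0.95 or 0.99 means the value lies well short of the 5% column.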