The P-value, or probability value, is a statistical measure that helps scientists judge whether their data are consistent with a hypothesis. It indicates whether the results of an experiment fall within the normal range of values for the observed event. Usually, if the P-value of a given data set falls below a predetermined level (e.g. 0.05), scientists reject the "null hypothesis" of their experiment; in other words, they rule out the hypothesis that the variable under study has no significant effect on the results. You can find the P-value from a table after calculating a few other statistical values. One of the values to determine first is the chi-square.
Steps
Step 1. Determine the expected results from your experiment
Usually, when scientists conduct tests and observe the results, they already have an idea in advance of what is "normal" or "typical". This idea can be based on previous experiments, on a series of reliable data, on scientific literature and / or on other sources. Then, in your experiment, determine what the expected results might be and express them in numerical form.
For example: Let's say previous studies have shown that, nationwide, drivers of red cars receive more speeding tickets than drivers of blue cars, in a ratio of 2:1. You want to find out whether the police in your city follow this statistic and also ticket red cars more often. If you take a random sample of 150 speeding tickets issued to red and blue cars, you should expect 100 to be for red cars and 50 for blue cars, if the police in your city follow the national trend.
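The expected counts in this example can be worked out with a few lines of Python. This is only a minimal sketch: the function name `expected_counts` and the dictionary layout are illustrative choices, not part of any standard method.

```python
def expected_counts(ratio, sample_size):
    """Split sample_size across categories in proportion to the given ratio."""
    total_parts = sum(ratio.values())
    return {name: sample_size * part / total_parts for name, part in ratio.items()}


# National ratio of speeding tickets, red cars to blue cars (2:1).
national_ratio = {"red": 2, "blue": 1}

expected = expected_counts(national_ratio, 150)
print(expected)  # {'red': 100.0, 'blue': 50.0}
```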
Step 2. Determine the observed results of your experiment
Now that you know what to expect, you need to conduct the test to find the real (or "observed") values. Here too, the results must be expressed in numerical form. If you manipulate some external condition and notice that the results differ from those expected, there are two possibilities: it is a coincidence, or your intervention caused the deviation. The purpose of calculating the P-value is to find out whether the resulting data deviate so much from the expected data that the "null hypothesis" (i.e. the hypothesis that there is no correlation between the experimental variable and the observed results) becomes unlikely enough to be rejected.
For example: In your city, the 150 random speeding tickets you considered turn out to be split into 90 for red cars and 60 for blue ones. This deviates from the national (and expected) split of 100 and 50. Did our manipulation of the experiment (in this case, changing the sample from national to local) cause this difference, or do the police in your city follow the national trend and we are just observing a chance variation? The P-value helps us answer exactly that.
Step 3. Determine the degrees of freedom of your experiment
Degrees of freedom are a measure of the amount of variability involved in the experiment, and they are determined by the number of categories you are looking at. The equation for degrees of freedom is: Degrees of freedom = n − 1, where "n" is the number of categories, or variables, you are analyzing.
- Example: Your experiment has two categories, one for red cars and the other for blue cars. So you have 2 − 1 = 1 degree of freedom.
- If you had considered red, blue and green cars, you would have had 2 degrees of freedom, and so on.
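As a quick check of the n − 1 rule, here is a small Python sketch. The helper name `degrees_of_freedom` is hypothetical and is used only for illustration.

```python
def degrees_of_freedom(categories):
    """Return n - 1, where n is the number of categories being compared."""
    if len(categories) < 2:
        raise ValueError("at least two categories are needed")
    return len(categories) - 1


print(degrees_of_freedom(["red", "blue"]))           # 1
print(degrees_of_freedom(["red", "blue", "green"]))  # 2
```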
Step 4. Compare the expected results with the observed ones using the chi square
The chi-square (written "χ²") is a numerical value that measures the difference between the expected and observed data of a test. The equation for chi-square is: χ² = Σ ((o − e)² / e), where "o" is the observed value and "e" is the expected one. Add up the results of this equation for all possible outcomes (see below).
- Note that the equation includes the symbol Σ (sigma). In other words, you have to calculate ((o − e)² / e) for each possible outcome and then add the results together to obtain the chi-square. In the example we are considering there are two outcomes: the car that got the ticket is either red or blue. So we calculate ((o − e)² / e) twice, once for the red cars and once for the blue ones.
- For example: insert the expected and observed values into the equation χ² = Σ ((o − e)² / e). Remember that because of the sigma symbol you have to do the calculation twice, once for the red cars and once for the blue ones. Here is how to do it:
- χ² = ((90 − 100)² / 100) + ((60 − 50)² / 50)
- χ² = ((−10)² / 100) + ((10)² / 50)
- χ² = (100 / 100) + (100 / 50) = 1 + 2 = 3.
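The same sum can be reproduced in Python. The sketch below assumes the observed and expected counts are given as two lists in the same category order; the function name `chi_square` is an illustrative choice.

```python
def chi_square(observed, expected):
    """Chi-square statistic: the sum of (o - e)^2 / e over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [90, 60]    # tickets for red and blue cars in your city
expected = [100, 50]   # counts predicted by the national 2:1 ratio

print(chi_square(observed, expected))  # 3.0
```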
Step 5. Choose a significance level
Now that you have the degrees of freedom and the chi-square, there is one last thing to settle before you can find the P-value: the significance level. In practice, it measures how certain you want to be of your result: a low significance level means you accept only a low probability that a result like yours arose by chance, and vice versa. This value is expressed as a decimal (such as 0.01), which corresponds to a percentage (in this case 1%): the chance you are willing to accept of seeing results this extreme through chance alone when the null hypothesis is true.
- By convention, scientists usually set their significance level at 0.05, or 5%. This means they accept at most a 5% chance that results as extreme as theirs would arise purely by chance if the null hypothesis were true. For most experiments, meeting this threshold (often described as 95% confidence) is considered "satisfactory" evidence that a correlation between two variables exists.
- For example: in your red and blue car test, you follow the convention of the scientific community and set your significance level to 0.05.
Step 6. Use a chi-squared distribution table to approximate your P-value
Scientists and statisticians use large tables to find P in their tests. These tables usually list the degrees of freedom in the vertical column on the left and the corresponding P-values in the horizontal row at the top. First find the row for your degrees of freedom, then read along it from left to right until you find the first number larger than your chi-square. Now go up that column to find the P-value it corresponds to (your P-value usually lies between this one and the next largest one in the table).
- Chi-square distribution tables are widely available: you can find them online or in science and statistics textbooks.
- For example: your chi-square is 3. Use a chi-square distribution table to find the approximate value of P. Since you know your experiment has only 1 degree of freedom, start with the top row. Move from left to right in the table until you find a value larger than 3 (your chi-square). The first number you come across is 3.84. Go up the column and you will see that it corresponds to a P-value of 0.05. This means that your P-value is between 0.05 and 0.1 (the next largest number in the table).
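If you want a number rather than a range, the P-value for this example can also be computed directly. The sketch below covers only the case of 1 degree of freedom, where the chi-square tail probability equals erfc(√(χ²/2)); the function name `p_value_one_degree` is hypothetical, and for more degrees of freedom you would need a table or statistical software instead.

```python
import math


def p_value_one_degree(chi_square):
    """P-value for a chi-square statistic with exactly 1 degree of freedom.

    With 1 degree of freedom, chi-square is the square of a standard normal
    variable, so the upper-tail probability is erfc(sqrt(chi_square / 2)).
    """
    if chi_square < 0:
        raise ValueError("chi-square cannot be negative")
    return math.erfc(math.sqrt(chi_square / 2))


p_value = p_value_one_degree(3.0)
print(round(p_value, 4))  # 0.0833
```

The result, about 0.083, falls between 0.05 and 0.1, which agrees with the range read from the table.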
Step 7. Decide whether to reject or keep your null hypothesis
Now that you have an approximate P-value for your experiment, you can decide whether or not to reject the null hypothesis (remember that the null hypothesis assumes there is no correlation between the variable and the results of the experiment). If P is less than your significance level, congratulations: results like yours would be unlikely if the null hypothesis were true, so you can reject it and conclude that a correlation between the variable and the observed results is likely. If P is greater than your significance level, you cannot rule out that the observed results are simply due to chance.
- For example: the value of P is between 0.05 and 0.1, so it is not less than 0.05. This means that you cannot reject your null hypothesis: you have not reached the 95% confidence threshold needed to conclude that the police in your city ticket red and blue cars in a significantly different proportion from the national average.
- In other words, if the police in your city did follow the national trend, a deviation at least as large as the one you observed would still turn up by chance in 5-10% of samples. Since you set yourself a maximum of 5%, you cannot say with confidence that the police in your city are less "prejudiced" against drivers of red cars.
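The decision rule itself is a single comparison, sketched below in Python. The function name `decide` and the value 0.083 (a P-value inside the 0.05 to 0.1 range found above) are used only for illustration.

```python
def decide(p_value, significance_level=0.05):
    """Compare the P-value with the chosen significance level."""
    if p_value < significance_level:
        return "reject the null hypothesis"
    return "cannot reject the null hypothesis"


# A P-value between 0.05 and 0.1, as in the red and blue car example.
print(decide(0.083))  # cannot reject the null hypothesis
```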
Advice
- Using a scientific calculator will make calculations much easier. You can also find calculators online.
- It is possible to calculate the p-value using various programs, such as common spreadsheet software or more specialized ones for statistical calculation.