In statistics, absolute frequency is the number of times a particular value appears in a data series. Cumulative frequency expresses a different concept: it is the sum of the absolute frequency of the value under consideration and the absolute frequencies of all the values that precede it. The definition may sound technical, but the calculation itself is straightforward.
Steps
Part 1 of 2: Calculating the Cumulative Frequency
Step 1. Sort the data series to study
By series, set or distribution of data we simply mean the group of numbers or quantities that are the object of your study. Sort the values in ascending order, starting with the smallest to get to the largest.
Example: The data series to study shows the number of books read by each student in the last month. After sorting the values, here's what the data set looks like: 3, 3, 5, 6, 6, 6, 8
Step 2. Calculate the absolute frequency of each value
Frequency is the number of times a given value appears within the series (you can call this "absolute frequency" to avoid confusion with the cumulative frequency). The simplest way to keep track of these counts is to arrange them in a table. As the header of the first column, write the word "Values" (or a description of the quantity measured by the series). As the header of the second column, use the word "Frequency". Then fill in the table with all the necessary values.
- Example: in our case the header of the first column could be "Number of Books", while that of the second column will be "Frequency".
- In the second row of the first column, enter the first value of the series under consideration: 3.
- Now calculate the frequency of the first value, i.e. the number of times the number 3 appears in the data series. It appears twice, so enter the number 2 in the same row under the "Frequency" column.
- Repeat the previous step for each value present in the dataset, resulting in the following table:
- 3 | F = 2
- 5 | F = 1
- 6 | F = 3
- 8 | F = 1
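The same tally can be sketched in a few lines of Python. This is only an illustration of the counting step, using the example data from the text. The variable names `books_read` and `frequency` are made up for this sketch.

```python
from collections import Counter

# Data series from the example: books read by each student last month.
books_read = [3, 3, 5, 6, 6, 6, 8]

# Counter tallies how many times each value occurs (its absolute frequency).
frequency = Counter(books_read)

# Print the table rows in ascending order of value.
for value in sorted(frequency):
    print(f"{value} | F = {frequency[value]}")
```

The printed rows should match the table built by hand above.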
Step 3. Calculate the cumulative frequency of the first value
The cumulative frequency answers the question "how many times does this value or a smaller value appear?". Always start the calculation with the smallest value in the data series. Since there are no values smaller than the first element in the series, the cumulative frequency will be equal to the absolute frequency.
- Example: in our case the smallest value is 3. The number of students who read 3 books in the last month is 2. No one has read fewer than 3 books, so the cumulative frequency is 2. Enter the value in the first row of the third column of our table, as follows:
- 3 | F = 2 | CF = 2
Step 4. Calculate the cumulative frequency of the next value
Consider the next value in the example table. At this point we have already identified the number of times the smallest value in our dataset appeared. To calculate the cumulative frequency of the data in question, we simply need to add its absolute frequency to the previous total. In simpler words, the absolute frequency of the current element must be added to the last calculated cumulative frequency.
- Example:
- 3 | F = 2 | CF = 2
- 5 | F = 1 | CF = 2 + 1 = 3
Step 5. Repeat the previous step for all values in the series
Continue by examining the increasing values present within the dataset you are studying. For each value you will need to add its absolute frequency to the cumulative frequency of the previous element.
- Example:
- 3 | F = 2 | CF = 2
- 5 | F = 1 | CF = 2 + 1 = 3
- 6 | F = 3 | CF = 3 + 3 = 6
- 8 | F = 1 | CF = 6 + 1 = 7
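The running total described in this step can be sketched in Python with `itertools.accumulate`, which adds each absolute frequency to the previous cumulative total. This is a minimal illustration on the example data. The names `values`, `absolute` and `cumulative` are chosen for this sketch only.

```python
from collections import Counter
from itertools import accumulate

books_read = [3, 3, 5, 6, 6, 6, 8]

frequency = Counter(books_read)
values = sorted(frequency)                   # distinct values, ascending
absolute = [frequency[v] for v in values]    # absolute frequency of each value
cumulative = list(accumulate(absolute))      # running total of the frequencies

# Print one table row per distinct value.
for v, f, cf in zip(values, absolute, cumulative):
    print(f"{v} | F = {f} | CF = {cf}")
```

Each cumulative value is the previous one plus the current absolute frequency, which is the rule applied by hand in the table.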
Step 6. Check your work
At the end of the calculation you will have performed the sum of all the absolute frequencies of the elements that make up the series in question. The last cumulative frequency should therefore be equal to the number of values present in the set under study. To check that everything is correct you can use two methods:
- Add up the individual absolute frequencies: 2 + 1 + 3 + 1 = 7, which corresponds to the final cumulative frequency of our example.
- Or count the number of elements that make up the data series under consideration. The dataset of our example was as follows: 3, 3, 5, 6, 6, 6, 8. It contains 7 elements, which corresponds to the final cumulative frequency.
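Both checks can be expressed as a short Python function. The sketch below assumes the table columns are already available as plain lists. `check_cumulative` is a hypothetical helper name, not part of any library.

```python
def check_cumulative(data, absolute, cumulative):
    """Return True if the last cumulative frequency passes both checks."""
    total = cumulative[-1]
    # Check 1: it equals the sum of the absolute frequencies.
    # Check 2: it equals the number of elements in the data series.
    return total == sum(absolute) and total == len(data)


data = [3, 3, 5, 6, 6, 6, 8]
absolute = [2, 1, 3, 1]
cumulative = [2, 3, 6, 7]

print(check_cumulative(data, absolute, cumulative))  # True
```

If either comparison fails, a frequency was probably miscounted or a value was skipped while building the table.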
Part 2 of 2: Advanced Use of Cumulative Frequency
Step 1. Understand the difference between discrete and continuous (or dense) data
A data set is discrete when it can be counted in whole units and a fraction of a unit has no meaning. A continuous data set describes quantities that are measured rather than counted, so the values can fall anywhere on the chosen measurement scale. Here are some examples to clarify the idea:
- Number of dogs: discrete. There is no such thing as "half a dog".
- The depth of a snowdrift: continuous. As snow falls, it accumulates gradually and continuously, so its depth cannot be expressed in whole units of measurement. If you measure a snowdrift, the result will almost certainly be a non-whole number, for example 15.6 cm.
Step 2. Group the continuous data into subsets
Continuous data series often contain a large number of unique values. If you tried to use the method described above to calculate the cumulative frequency, the resulting table would be extremely long and difficult to read. Grouping the data into intervals, one per table row, makes everything easier to read. The important thing is that each interval has the same width (e.g. 1-10, 11-20, 21-30, etc.), regardless of how many values fall into it. Below is an example of how to tabulate a continuous data series:
- Data series: 233, 259, 277, 278, 289, 301, 303
- Table (the first column holds the interval, the second the absolute frequency, and the third the cumulative frequency):
- 200–250 | 1 | 1
- 251–300 | 4 | 1 + 4 = 5
- 301–350 | 2 | 5 + 2 = 7
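The grouped table can be reproduced with a short Python sketch. It assumes each interval is given as an inclusive `(low, high)` pair, as in the table above. The names `classes`, `absolute` and `cumulative` are chosen for this illustration.

```python
from itertools import accumulate

data = [233, 259, 277, 278, 289, 301, 303]

# Intervals from the table, with both bounds treated as inclusive.
classes = [(200, 250), (251, 300), (301, 350)]

# Absolute frequency: how many data values fall inside each interval.
absolute = [sum(low <= x <= high for x in data) for low, high in classes]

# Cumulative frequency: running total of the absolute frequencies.
cumulative = list(accumulate(absolute))

for (low, high), f, cf in zip(classes, absolute, cumulative):
    print(f"{low}-{high} | {f} | {cf}")
```

Inclusive whole-number bounds work for this data, but a measurement such as 250.5 would fall between two intervals. For truly continuous values, half-open intervals (for example 200 up to but not including 250) are a common way to avoid such gaps.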
Step 3. Plot the data on a line chart
After calculating the cumulative frequency, you can graph it. Draw the X and Y axes of the chart on a sheet of squared or graph paper. The X axis represents the values present in the data series under consideration, while the Y axis shows the corresponding cumulative frequency. This way the next steps will be much easier.
- For example, if your data series consists of numbers 1 through 8, divide the x-axis into 8 units. For each unit present on the X axis, draw a point corresponding to the respective cumulative frequency present on the Y axis. At the end connect all the contiguous points with a line.
- If there are values for which a point has not been plotted on the graph, it means that their absolute frequency is equal to 0. Therefore, adding 0 to the cumulative frequency of the previous element, the latter does not change. For the value in question you can therefore report on the graph a point corresponding to the same cumulative frequency of the previous element.
- Since the cumulative frequency can only stay the same or increase as absolute frequencies are added, you should get a broken line that rises as you move to the right along the X axis. If the slope of the line is negative at any point, an error has most likely been made in calculating the absolute frequency of the corresponding value.
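The points to plot can be generated with a short Python sketch, including the values whose absolute frequency is 0. It reuses the book example from Part 1 and assumes, for illustration only, an X axis running from 1 to 8. The names `points` and `running_total` are hypothetical.

```python
from collections import Counter

books_read = [3, 3, 5, 6, 6, 6, 8]
frequency = Counter(books_read)

points = []
running_total = 0
for x in range(1, 9):                # X axis units 1 through 8 (assumed range)
    running_total += frequency[x]    # Counter gives 0 for values not in the data
    points.append((x, running_total))

print(points)

# The line must never slope downwards: each Y is at least the previous Y.
never_decreases = all(a[1] <= b[1] for a, b in zip(points, points[1:]))
print(never_decreases)  # True
```

Values such as 4 and 7, which do not occur in the data, simply repeat the cumulative frequency of the previous value. The resulting pairs could then be drawn by hand or passed to any plotting tool.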
Step 4. Find the median (or midpoint) from the line graph
The median is the point that is exactly in the center of the data distribution. So half of the values of the series under consideration will be distributed above the midpoint, while the other half will be below. Here's how to find the median starting from the line graph taken as an example:
- Look at the last point drawn on the far right of the graph. The Y coordinate of said point corresponds to the total cumulative frequency, which therefore corresponds to the number of elements that make up the series of values under consideration. Let's assume that the number of elements is 16.
- Multiply this number by ½ and locate the result on the Y axis. In our example, 16 × ½ = 8, so find the number 8 on the Y axis.
- Now locate the point on the graph line corresponding to the value of the Y axis just calculated. To do this, place your finger on the graph at unit 8 of the Y axis, then move it in a straight line to the right until it intersects the line that graphically describes the cumulative frequency trend. The identified point corresponds to the median of the data set under examination.
- Find the X coordinate of the midpoint. Place your finger exactly on the midpoint you just found, then move it in a straight line downwards until it intersects the X axis. The value found corresponds to the median element of the data series being examined. For example, if this value is 65, it means that half of the elements of the studied data series are distributed below this value while the other half is above.
Step 5. Find the quartiles from the graph
Quartiles are the elements that divide the data series into four sections. The process for finding quartiles is very similar to that used for finding the median. The only difference is in the way in which the coordinates on the Y axis are identified:
- To find the Y coordinate of the lower quartile, multiply the total cumulative frequency by ¼. The X coordinate of the corresponding point on the graph line marks the value below which the first quarter of the elements of the series fall.
- To find the Y coordinate of the upper quartile, multiply the total cumulative frequency by ¾. The X coordinate of the corresponding point on the graph line will graphically divide the data set into the lower ¾ and the upper ¼.
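Reading the median and quartiles off the graph amounts to linear interpolation along the cumulative frequency line. The Python sketch below illustrates this under stated assumptions: the `points` are hypothetical, chosen only so that the total is 16 and the median comes out at 65, matching the figures assumed in the text. `x_at_cumulative` is a made-up helper name.

```python
def x_at_cumulative(points, target):
    """Return the X value where the cumulative frequency line reaches target."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Find the rising segment that spans the target height.
        if y0 <= target <= y1 and y1 > y0:
            # Interpolate along the straight segment between the two points.
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target lies outside the plotted range")


# Hypothetical (x, cumulative frequency) points, not taken from the source.
points = [(40, 0), (50, 2), (60, 6), (70, 10), (80, 14), (90, 16)]
total = points[-1][1]  # last Y coordinate = number of elements (16)

lower_quartile = x_at_cumulative(points, total * 0.25)
median = x_at_cumulative(points, total * 0.5)
upper_quartile = x_at_cumulative(points, total * 0.75)

print(lower_quartile, median, upper_quartile)  # 55.0 65.0 75.0
```

Values read this way are estimates, just like values read off a hand-drawn graph, because the straight segments assume the data is spread evenly between plotted points.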