When analyzing a very large dataset in Excel, you may need to draw a random sample from it, either to make a comparison or to carry out a more in-depth analysis. One way to do this is to assign a random number to each row of your dataset and then sort the rows by that number. Here is how to proceed.
Steps
Step 1. Prepare your dataset in Excel
It can consist of as many rows and columns as you like, with or without a header row.
Step 2. Insert two blank columns to the left of your dataset
To do this, right-click the header of column 'A', then choose 'Insert columns' from the context menu that appears. Do this twice, so that you end up with two blank columns.
Alternatively, you can select column 'A', then choose the 'Columns' option from the 'Insert' menu.
Step 3. In the first empty cell, type the formula '=RAND()' (without quotes)
This should be the first cell in column 'A' below the header row, or the very first cell of column 'A' if your dataset has no header. The 'RAND()' function returns a random number greater than or equal to 0 and less than 1.
Step 4. Copy the 'RAND()' formula and paste it into the remaining cells of column 'A'
Make sure that every row of your dataset has a random number next to it.
Step 5. Select the entire column containing the random numbers (column 'A' in this example)
Copy the selection.
Step 6. Paste the copied data into column 'B'
Use the 'Paste Special' function and choose the 'Values' option. This pastes only the values of the copied cells, not their formulas.
The 'RAND()' function generates a new random number every time the worksheet is recalculated, which happens after each change to its contents. Freezing the values generated by the formula is therefore essential to keep the random numbers stable during your analysis.
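For readers who want to see the same idea outside Excel, here is a minimal Python sketch of steps 3 to 6. It is only an illustration: the function name `attach_random_keys`, the `seed` parameter, and the sample rows are assumptions made for this example, not part of the Excel procedure. Each row receives a random number between 0 (inclusive) and 1 (exclusive), and that number is stored once, which plays the role of pasting the values into column 'B'.

```python
import random

def attach_random_keys(rows, seed=None):
    """Pair every row with a random key in [0, 1), like a helper column of RAND() values."""
    rng = random.Random(seed)  # seed is optional; passing one makes the run repeatable
    # Each key is drawn once and stored as a plain number, so it does not
    # change afterwards (the counterpart of Paste Special > Values).
    return [(rng.random(), row) for row in rows]

# Hypothetical dataset: one tuple per worksheet row.
rows = [("Alice", 34), ("Bob", 27), ("Carol", 45), ("Dave", 31)]
keyed_rows = attach_random_keys(rows, seed=1)

for key, row in keyed_rows:
    print(f"{key:.4f}  {row}")
```

Unlike a live 'RAND()' cell, the keys in `keyed_rows` stay the same no matter what else the program does afterwards.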
Step 7. Sort the dataset by the random values
Select column 'B' in its entirety. Click the sort icon on the toolbar (alternatively, choose 'Sort' from the 'Data' menu) and choose 'Ascending' order.
- Make sure you select the 'Expand the selection' option when prompted, then click the 'Sort' button. This way the data in the other columns of the worksheet is reordered along with column 'B', so every row stays intact.
- You can now delete column 'A' and/or column 'B'. You won't need them again unless you want to re-sort the data.
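The sorting step can be sketched in Python as follows. This is a hedged illustration under the same assumptions as any code translation of a spreadsheet task: `sort_rows_by_random_key` and its `seed` parameter are hypothetical names. Keeping each key and its row together in one pair mirrors the 'Expand Selection' option, and dropping the keys at the end corresponds to deleting the helper columns.

```python
import random

def sort_rows_by_random_key(rows, seed=None):
    """Return the rows in random order by sorting them on a random helper key."""
    rng = random.Random(seed)
    # Build (key, row) pairs so that each row moves together with its key.
    keyed = [(rng.random(), row) for row in rows]
    # Ascending sort on the key only; the row contents never influence the order.
    keyed.sort(key=lambda pair: pair[0])
    # Drop the helper keys, keeping just the reordered rows.
    return [row for _key, row in keyed]

# Hypothetical dataset: one tuple per worksheet row.
rows = [("Alice", 34), ("Bob", 27), ("Carol", 45), ("Dave", 31)]
shuffled = sort_rows_by_random_key(rows, seed=7)
print(shuffled)
```

Sorting only column 'B' without expanding the selection would correspond to reordering the keys while leaving the rows where they are, which shuffles nothing.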
Step 8. Select your data sample
Select as many rows as your sample requires, starting from the top of the list. Because the rows were sorted by a set of random numbers, the rows you select form a random sample of the original dataset.
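Put together, the whole procedure amounts to "sort by a random key, then take the first rows". The short Python sketch below shows this under stated assumptions: the function `take_random_sample`, its parameters, and the numbered dataset are invented for illustration.

```python
import random

def take_random_sample(rows, sample_size, seed=None):
    """Sort rows by a random key and return the first sample_size of them."""
    if not 0 <= sample_size <= len(rows):
        raise ValueError("sample_size must be between 0 and the number of rows")
    rng = random.Random(seed)
    # Assign one random key per row, then sort ascending on that key.
    keyed = sorted(((rng.random(), row) for row in rows), key=lambda pair: pair[0])
    # Taking rows from the top of the sorted list yields a sample
    # without repeats, because every row appears exactly once.
    return [row for _key, row in keyed[:sample_size]]

# Hypothetical dataset: row identifiers 1 to 100.
rows = list(range(1, 101))
sample = take_random_sample(rows, 10, seed=3)
print(sample)
```

In everyday Python code, the standard library call `random.sample(rows, sample_size)` draws a sample without replacement directly; the version above is spelled out only to mirror the spreadsheet steps.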