How to group similar rows together in excel
Sebastian Wright
I have a spreadsheet with about 2000 rows that I need to group together based on the contents of about 12 columns. The key here is that the rows may differ in a number of different columns, but for those specific 12 columns, if they have equivalent values for each row in each respective column, I want to group them together and assign them a number,name,whatever.
What's the best way to do this? Is there an easy way? The problem with filtering based on a certain column's value is that each column could have several different values in it, and the number of combinations is huge so for me to manually select each combination would be a huge headache.
5 Answers
Simple Way:
Create another column that is the concatenation of all 12 columns, then sort by that column.
Harder Way:
Use VBA, probably triggered off of a button.
Concatenate the 12 columns into a string, creating an array of these strings.
Sort all internally, keeping track of which row each is, then output the way you want.
Try pivot tables. They will let you group, summarise, filter and sort your data with a great amount of flexibility.
There are a few requirements on the way your source data should be laid out for pivoting to be most effective. Without seeing your data it is difficult to know how suitable it is.
You can group rows based on a similar values using this formula:
Example:
Name Place
Rajesh Markapur
Rajesh Vijayawada
Suresh Minneapolis
Mahesh Guntur
Mahesh Chennai
Brijesh Markapur
Sateesh Bubaneshwar
Sateesh Markapur And if you want to group these data on the basis of name as below
Excel col A B C
Excelrow Group Name Place 1 1 Rajesh Markapur 2 1 Rajesh Vijayawada 3 2 Suresh Minneapolis 4 3 Mahesh Guntur 5 3 Mahesh ChennaiUse this formula, =if (B2=B3, A1,A2+1)
Grouping similar rows together is a form of clustering. Wikipedia has a decent introduction to multi-dimensional cluster analysis, you can start there.
There are many techniques for performing clustering, but any of these techniques would be a challenge to implement within the context of Excel formulas.
Maybe you should have a look at ASAP Utilities. It's an addon for Excel that, among a lot of things, lets you do conditional row and column select. It's free (for non commercial use) so you might as well give it a try.