How to rank text uniquely without duplicates
Table of Contents
1. How to rank text uniquely without duplicates
Question: How do I rank text cell values uniquely? If text values were sorted alphabetically from A to Z, the first text value would rank 1 and so on... Duplicate text values rank uniquely.
Answer:
The image above shows random text values in cell range B3:B14. The formula in cell C3 creates numbers that represent the order if the text values were sorted from A to Z.
Formula in cell C3:
Copy cell C3 and paste to cells below as far as needed. I'll explain each number in C3:C14 and how the formula produces that result:
- Cell C3 corresponds to "DD" and returns 8. There are 7 values less than "DD" (all AAs, BBs, CC). This is the 1st "DD" encountered: 7 + 1 = 8
- Cell C4 corresponds to "AA" and returns 1. No values less than "AA". This is the 1st "AA" encountered: 0 + 1 = 1
- Cell C5 corresponds to "AA" and returns 2. No values less than "AA". This is the 2nd "AA" encountered: 0 + 2 = 2
- Cell C6 corresponds to "BB" and returns 5. There are 4 "AA" values less than "BB". This is the 1st "BB" encountered: 4 + 1 = 5
- Cell C7 corresponds to "AA" and returns 3. No values less than "AA". This is the 3rd "AA" encountered: 0 + 3 = 3
- Cell C8 corresponds to "DD" and returns 9. There are 7 values less than "DD" (all AAs, BBs, CC). This is the 2nd "DD" encountered: 7 + 2 = 9
- Cell C9 corresponds to "EE" and returns 10. There are 9 values less than "EE" (all AAs, BBs, CC, DDs). This is the 1st "EE" encountered: 9 + 1 = 10
- Cell C10 corresponds to "AA" and returns 4. No values less than "AA". This is the 4th "AA" encountered: 0 + 4 = 4
- Cell C11 corresponds to "BB" and returns 6. There are 4 "AA" values less than "BB". This is the 2nd "BB" encountered: 4 + 2 = 6
- Cell C12 corresponds to "EE" and returns 11. There are 9 values less than "EE" (all AAs, BBs, CC, DDs). This is the 2nd "EE" encountered: 9 + 2 = 11
- Cell C13 corresponds to "CC" and returns 7. There are 6 values less than "CC" (all AAs, BBs). This is the 1st "CC" encountered: 6 + 1 = 7
- Cell C14 corresponds to "EE" and returns 12. There are 9 values less than "EE" (all AAs, BBs, CC, DDs). This is the 3rd "EE" encountered: 9 + 3 = 12
The formula works by counting how many values are less than the current value (first COUNTIF) and adding that to the count of how many times the current value has appeared so far (second COUNTIF), effectively creating a unique rank for each entry.
How does this formula work?
Let us start with cell C3.
=COUNTIF($B$3:$B$14,"<"&B3)+COUNTIF($B$3:B3,B3)
Step 1 - Calculate rank
The COUNTIF function is an incredibly versatile function, in this case, instead of counting values based on a condition we simply check if a value is smaller or larger than the others.
Remember, we are using text values so the function returns a rank number based on the position if the list were sorted alphabetically.
COUNTIF($B$3:$B$14,"<"&B3) returns 7. 7 values are sorted before DD, as if they were sorted from A to Z.
The problem is that value DD has a duplicate and we need to make sure that the duplicate doesn't get the same number as the first instance.
Step 2 - Count current value and prior instances of current value
COUNTIF($B$3:B3,B3)
becomes
COUNTIF("DD","DD")
and returns 1.
Note that the cell reference $B$3:B3 expands as we copy the cell to cells below.
Step 3 - Add values
COUNTIF($B$3:$B$14,"<"&B3)+COUNTIF($B$3:B3,B3)
becomes
7 + 1 and returns 8 in cell C3.
Final notes
Select B3 and use "evaluate formula" to see how the next cell calculates rank. Press with left mouse button on "Evaluate" button and excel calculates cell formula step by step.
Get *.xlsx file
Text values uniquely ranked.xlsx
2. How to rank uniquely based on a condition
The following formula ranks text values in column C uniquely based on the category in column B.
- The formula ranks the "Text" values within each unique "ID" group.
- For each "ID" group, it starts the ranking from 1.
- If there are duplicate "Text" values within an "ID" group, they receive consecutive ranks.
This type of ranking is useful for categorizing items within subgroups while maintaining a consistent ranking system across different categories or IDs.
Formula in D3:
Let's break it down by ID:
For ID 7:
- "parta" gets rank 1 (first occurrence)
- "partb" gets rank 3 (first occurrence after "parta")
- Another "partb" gets rank 4 (second occurrence)
- Another "parta" gets rank 2 (second occurrence)
For ID 11:
- "parta" gets rank 1 (first occurrence)
- "partc" gets rank 3 (first occurrence after "parta")
- "partb" gets rank 2 (first occurrence, comes alphabetically between "parta" and "partc")
It counts how many unique text values come before the current one within the same ID group. Adding 1 to that count to get the rank. Handling duplicates by giving them consecutive ranks within their ID group.
Explaining formula in cell D3
Step 1 - Concatenate cell C3 and current row number
The ROW function returns the row number from a cell reference, if a cell ref is omitted then the row number of the current cell is returned.
The ampersand character & concatenates two values.
C3&ROW()
becomes
"parta"&3
and returns parta3.
Step - 2 Concatenate each cell in column C and corresponding row numbers row-wise
$C$3:$C$9&ROW($C$3:$C$9)
returns {"parta3";"partb4";...;"partb9"}
Step 3 - Compare value with array using the larger than sign
(C3&ROW()>$C$3:$C$9&ROW($C$3:$C$9))
returns {FALSE;....;FALSE}.
The number concatenated to each value makes it unique.
Step 4 - Compare categories with current cell
(B3=$B$3:$B$9)
returns {TRUE;... ;FALSE}
Step - 5 Multiply arrays
(C3&ROW()>$C$3:$C$9&ROW($C$3:$C$9))*(B3=$B$3:$B$9)
returns {0;0;0;0;0;0;0}
Step 6 - Sum values in array
SUMPRODUCT((C3&ROW()>$C$3:$C$9&ROW($C$3:$C$9))*(B3=$B$3:$B$9))
becomes SUMPRODUCT({0;0;0;0;0;0;0}) and returns 0.
Step 7 - Add 1
SUMPRODUCT((C3&ROW()>$C$3:$C$9&ROW($C$3:$C$9))*(B3=$B$3:$B$9)) + 1
becomes
0 + 1 equals 1 in cell D3.
Get Excel *.xlsx file
ranking text with condition.xlsx
Rank category
This article demonstrates how to sort a data set by multipe columns using the SORTBY function, Ecel Table and Excel […]
Excel categories
26 Responses to “How to rank text uniquely without duplicates”
Leave a Reply
How to comment
How to add a formula to your comment
<code>Insert your formula here.</code>
Convert less than and larger than signs
Use html character entities instead of less than and larger than signs.
< becomes < and > becomes >
How to add VBA code to your comment
[vb 1="vbnet" language=","]
Put your VBA code here.
[/vb]
How to add a picture to your comment:
Upload picture to postimage.org or imgur
Paste image link to your comment.
Contact Oscar
You can contact me through this contact form
The formula works exactly as advertised and I have implemented it with no problems. However, even when using the "Evaluate Formula" feature to walk through the steps Excel uses to generate the output, I still don't understand how this formula works.
For instance, using the example file you provided, when I evaluate Cell B2 (which correctly ends up as 10), the final line before producing the output is:
MATCH(7.00000763673575,{6.00000763673575},0)
How is the MATCH function returning a 10 out of this? I sorta think I understand all the steps in the evaluation up to this point, but the money line eludes me. Would it be possible to break this formula down and explain the logic further.
If not, thanks for the formula anyway. I know how to use it even if I don't understand it.
And by Cell B2 I mean Cell B8, oops
Thank you for your comment!
Really good question, I have no answer why MATCH(7.00000763673575,{6.00000763673575},0) works.
I have changed the array formula in this post to a smaller, easier array formula. I also updated the attached excel file.
Oscar,
Thank you, the new formula works just as well and is obviously simpler.
But now you've piqued my interest. Where did the original come from? It seems like a rather complicated formula to have been derived by trial & error (although possible I guess), so somebody must have created it on purpose.
Thanks again.
If you use this formula, then you can enter it normally (that is, you do *not* have to use Ctrl+Shift+Enter, just Enter will work)...
=COUNTIF(A$2:A$20,"<"&A2)+COUNTIF($A$2:A2,A2)
If you use this formula, then you can enter it normally (that is, you do *not* have to use Ctrl+Shift+Enter, just Enter will work)...
=COUNTIF(List,"<"&A2)+COUNTIF($A$2:A2,A2)
Matt Gahlinger,
I created the original formula by trial and error. It is a weird formula that should not work, I think. But it it worked and then I forgot about it.
Rick Rothstein (MVP - Excel),
Thanks!
Guys,
All the formulas work as far as I can tell. Thanks for the multiple solutions.
Will the formula work if "List" (a named range) is substituted by A$2:A$13 instead?
I need to rank several text columns and would like to copy & paste the formulas & not have to define a name for each list.
Steve,
Yes, it will work.
Tried it & it worked perfectly. Thanks.
I got the formula working, and although it's a great formula for criteria ranking it's not suitable for excel sheets with more than a thousand lines. Unless you have a very powerful machine.
Tried it, and copied colums a and b to my excel, and it worked. But I could not copy and paste just the formula in my own excel, nor could I rewrite it without losing the unique part. I use Excel 2007
Brendon,
=COUNTIF(List, "<"&A2)+SUM(IF(A2=$A$2:A2, 1, 0)) You need to adjust cell references in both named range List and in formula.
My real data returned many duplicated rankings with the original formula. I used this to give unique rankings for each entry:
=COUNTIF(List, "<"&A2)+COUNTIF($A$2:A2,A2)
Just noticed that's effectively what "Rick Rothstein (MVP - Excel)" posted, so that makes two of us.
Hi Oscar,
I am trying to rank some scores for a golf sweepstake (without having to do it manually). I know how to organise the scoring into ascending order, but when I match the names to the scores (using vlookup), the players who are on the same score all come up with the same name. I then tried to apply your method, however, I can only get as far as the differentiating the top 4 players (all on 69). Is there anyway I can continue this ranking the whole way through down to the low of 79?
Score Player
73 Lee Westwood
70 Jim Furyk
76 Adam Scott
72 Hunter Mahan
77 Louis Oosthuizen
77 Rory McIlroy
78 Bubba Watson
70 Matt Kuchar
70 Ian Poulter
69 Graeme McDowell
69 Tiger Woods
76 Phil Mickleson
72 Jason Dufner
69 Nick Watney
74 Martin Kaymer
79 Luke Donald
69 Justin Rose
71 Francesco Molinari
77 Zach Johnson
75 Ernie Els
72 Rickie Fowler
75 Dustin Johnston
73 Sergio Garcia
73 Keegan Bradley
72 Nicolas Colsaerts
Jordy,
See attached file:
Jordy.xlsx
Thats fantastic Oscar! Looks far too complicated to have explained!
Cheers!
Jordy,
Formula in D2:
Array formula in E2:
Explaining array formula in cell E2
Step 1 - Remove names already in list
COUNTIF($E$1:E1, $B$2:$B$26)
becomes
COUNTIF("", {"Lee Westwood"; "Jim Furyk"; "Adam Scott"; "Hunter Mahan"; "Louis Oosthuizen"; "Rory McIlroy"; "Bubba Watson"; "Matt Kuchar"; "Ian Poulter"; "Graeme McDowell"; "Tiger Woods"; "Phil Mickleson"; "Jason Dufner"; "Nick Watney"; "Martin Kaymer"; "Luke Donald"; "Justin Rose"; "Francesco Molinari"; "Zach Johnson"; "Ernie Els"; "Rickie Fowler"; "Dustin Johnston"; "Sergio Garcia"; "Keegan Bradley"; "Nicolas Colsaerts"})
and returns {0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0}
Step 2 - Change False to True or True to False
(0 is False and 1 is True)
NOT(COUNTIF($E$1:E1, $B$2:$B$26))
becomes
NOT({0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0})
and returns
{TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE}
Step 3 - Multiply scores with array
$A$2:$A$26*NOT(COUNTIF($E$1:E1, $B$2:$B$26))
becomes
{73; 70; 76; 72; 77; 77; 78; 70; 70; 69; 69; 76; 72; 69; 74; 79; 69; 71; 77; 75; 72; 75; 73; 73; 72}*{TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE; TRUE}
and returns
{73; 70; 76; 72; 77; 77; 78; 70; 70; 69; 69; 76; 72; 69; 74; 79; 69; 71; 77; 75; 72; 75; 73; 73; 72}
Step 4 - Return the relative position of the score
MATCH(D2, $A$2:$A$26*NOT(COUNTIF($E$1:E1, $B$2:$B$26)), 0)
becomes
MATCH(69, {73; 70; 76; 72; 77; 77; 78; 70; 70; 69; 69; 76; 72; 69; 74; 79; 69; 71; 77; 75; 72; 75; 73; 73; 72}, 0)
and returns 10
Step 5 - Return the value of a cell at the intersection of a particular row and column
=INDEX($B$2:$B$26, MATCH(D2, $A$2:$A$26*NOT(COUNTIF($E$1:E1, $B$2:$B$26)), 0))
becomes
=INDEX({"Lee Westwood"; "Jim Furyk"; "Adam Scott"; "Hunter Mahan"; "Louis Oosthuizen"; "Rory McIlroy"; "Bubba Watson"; "Matt Kuchar"; "Ian Poulter"; "Graeme McDowell"; "Tiger Woods"; "Phil Mickleson"; "Jason Dufner"; "Nick Watney"; "Martin Kaymer"; "Luke Donald"; "Justin Rose"; "Francesco Molinari"; "Zach Johnson"; "Ernie Els"; "Rickie Fowler"; "Dustin Johnston"; "Sergio Garcia"; "Keegan Bradley"; "Nicolas Colsaerts"}, 10)
and returns Graeme McDowell in cell E2.
Hi Oscar,
That last explanation was fantastic, but I have another question now, probably slightly easier.
Sticking with golf, if there was a team game, taking a sum of the 3 best players from a team of 5.
However, the problem is if 3 out of 5 players miss the cut for the 3rd & 4th round the SMALL(F5:F9,{1,2,3}) doesn't work.
Is there some way I can add a formula that says something along the lines of, =if 3 or more out of 5 in F5:F9 say MC, then = MC at H10, if not then =(SUM(SMALL(F5:F9,{1,2,3})).
Hi Oscar, sorry to waste your time... I figured it out myself through trial and error...
It was quite simply
=IFERROR(SUM(SMALL(F5:F9,{1,2,3})),$J$24
J24 = N/A
New issue... J24 = OUT now... If more than one team is OUT, then the problem I had in the first place arises again!!! If there are 2 cells that say OUT, I can't get different names adjacent to the two cells that say OUT... Is there are simpler way to do this?
This is exactly what I was looking for, except I want to give duplicate values the same ranking. In short I want to extract a unique list of names, by categories, however, the names show up multiple times. I don't want to use an array formula so I thought a shortcut would be to rank the unique people by each category then bring them into separate tables!
Hi, how can I apply this formula to multiple columns?
Hi Oscar. I tried this formula. For some reason, the numbers were off by 8, so I subtracted 8 at the end of the formula and it now ranks correctly.
But I have several columns that I want to ranked simultaneously. How can I use this formula with multiple columns? Going with your sample data for example, there is List 1 column, List 2 column, and List 3 column. List 2 is a sub-category to List 1. List 3 is a subcategory of List 2.
Hi Oscar, How I use this formula to rank text with multiple criteria-columns?