How to extract a case sensitive unique list from a column
This article demonstrates a formula that extracts unique values from a column also considering upper and lower characters (case sensitive).
Table of Contents
My definition of unique values is values that exist only once in a cell range. The image below shows you a list in column B, some of these values have duplicates.
The list in column D contains only values that are unique, Aa and BB exist only once in the list, all other values have a duplicate.
The formula extracts CC, bb, aa and Cc because they exist only once in column B.
Array formula in cell D3:
If you are looking for a unique distinct list, read this post:Â Extract unique distinct values (case sensitive) [Formula].
This post explains how to do a case-sensitive VLOOKUP and returning multiple values:
Recommended articles
The array formula in cell F5 returns adjacent values from column C where values in column B matches the search […]
Make sure you read the following article if you want to extract duplicate values:
Recommended articles
The array formula in cell C2 extracts duplicate values from column A. Only one duplicate of each value is displayed […]
1.1 How to enter an array formula
- Copy (Ctrl + c) above formula.
- Double press with left mouse button on cell C2.
- Paste (Ctrl + v) to cell C2.
- Press and hold CTRL + SHIFT simultaneously.
- Press Enter once.
- Release all keys.
Your formula now looks like this: {=array_formula}
Don't enter the curly brackets, they appear automatically.
Recommended articles
Array formulas allows you to do advanced calculations not possible with regular formulas.
1.2 Explaining array formula in cell D3:
You can easily follow along if you get the attached file and select cell D3. Then go to tab "Formulas" on the ribbon and press with left mouse button on the "Evaluate Formula" button.
Press with left mouse button on "Evaluate" button shown in above picture to move to next step.
Step 1 - Check if values are case sensitive
The EXACT function is case sensitive function that allows you to compare values. If they match EXACT returns TRUE, if not FALSE.
EXACT($B$3:$B$15,TRANSPOSE(B$3:$B$15))*1
If we use TRANSPOSE we can compare values against each other to build an array, in a single calculation. The following picture shows you this array as an index table, I have highlighted cells that match green. Example, cell C3 shows you the result of a comparison between the value in cell C2 and B3. Since it is the same value they must match and the formula returns 1 and is highlighted green.
It is now obvious that value Aa has a duplicate because cell J3 is also highlighted green.
Incredible that Excel allows you to do such a complicated calculation in a single cell.
Step 2 - MMULT function lets you SUM values column-wise or row-wise
The MMULT function is like the SUMPRODUCT function but on steroids, let me explain. SUMPRODUCT lets you multiply and then sum values, the result is a single value.
MMULT lets you multiply and sum values either column-wise or row-wise, the result is an array.
MMULT(EXACT($B$3:$B$15, TRANSPOSE(B$3:$B$15))*1, ROW($B$3:$B$15)^0) is entered in column P, see picture below.
It is now easy to spot unique values in the index table, if column P contains 1 the corresponding value in column B must be unique.
Step 3 - Check if value in array is equal to 1 and if so return corresponding row number
Column Q shows you corresponding relative row number if value in column P is equal to 1.
IF(MMULT(EXACT($B$3:$B$15, TRANSPOSE(B$3:$B$15))*1, ROW($B$3:$B$15)^0)=1, MATCH(ROW($B$3:$B$15), ROW($B$3:$B$15)), "")
returns this array:Â {"";2;"";""; "";"";7; "";9;"";11; "";""}
Step 4 - Filter the k-th smallest row number
SMALL(IF(MMULT(EXACT($B$3:$B$15, TRANSPOSE(B$3:$B$15))*1, ROW($B$3:$B$15)^0)=1, MATCH(ROW($B$3:$B$15), ROW($B$3:$B$15)), ""),ROWS($A$1:A1))
becomes
SMALL({"";2;"";""; "";"";7; "";9;"";11; "";""},ROWS($A$1:A1))
becomes
SMALL({"";2;"";""; "";"";7; "";9;"";11; "";""},1)
and returns 2.
Step 5 - Return value based on coordinate
INDEX($B$3:$B$15, SMALL(IF(MMULT(EXACT($B$3:$B$15, TRANSPOSE(B$3:$B$15))*1, ROW($B$3:$B$15)^0)=1, MATCH(ROW($B$3:$B$15), ROW($B$3:$B$15)), ""),ROWS($A$1:A1)))
becomes
INDEX($B$3:$B$15, 2)
becomes
INDEX({"Aa"; "CC"; "AA"; "BB"; "BB"; "EE"; "bb"; "Aa"; "aa"; "AA"; "Cc"; "EE"; "BB"}, 2)
and returns "CC" in cell D3.
2. Extract a case sensitive unique list from a column - Excel 365
Formula in cell D3:
The following formula is for Excel 365 users:
2.1 Explaining formula
Step 1 - Convert a vertical array to horizontal
The TRANSPOSE function allows you to convert a vertical range to a horizontal range, or vice versa. This works also with arrays.
TRANSPOSE(B3:B15)
becomes
TRANSPOSE({"Aa"; "CC"; "AA"; "BB"; "BB"; "EE"; "bb"; "Aa"; "aa"; "AA"; "Cc"; "EE"; "BB"})
and returns
{"Aa", "CC", "AA", "BB", "BB", "EE", "bb", "Aa", "aa", "AA", "Cc", "EE", "BB"}
Step 2 - Check if values match (case sensitive)
The EXACT function allows you to check if two values are precisely the same, it returns TRUE or FALSE. The EXACT function also considers upper and lower case letters.
EXACT(B3:B15, TRANSPOSE(B3:B15))
becomes
EXACT({"Aa"; "CC"; "AA"; "BB"; "BB"; "EE"; "bb"; "Aa"; "aa"; "AA"; "Cc"; "EE"; "BB"}, {"Aa", "CC", "AA", "BB", "BB", "EE", "bb", "Aa", "aa", "AA", "Cc", "EE", "BB"})
and returns
{TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE; FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE; FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE; FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE; FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE; FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE; FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE; TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE; FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE; FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE; FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE; FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE; FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE}
The image above shows the array in cell range C3:O15. It compares each value against each other and if there are two or more TRUE the value has a duplicate.
Step 3 - Multiply with 1
The MMULT function can't work with boolean values (TRUE or FALSE) , however, there is a workaround. They have numerical equivalents that work with the MMULT function.
TRUE = 1
FALSE = 0 (zero)
EXACT(B3:B15, TRANSPOSE(B3:B15))*1
becomes
{TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE; FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE; FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE; FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE; FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE; FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE; FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE; TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE; FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE; FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE; FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE; FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE; FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE}
and returns
{1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0; 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0; 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0; 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1; 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1; 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0; 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0; 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0; 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0; 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0; 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0; 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0; 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1}
Step 4 - Create a sequence of 1
The ROW function returns a number representing a row number based on a cell reference. If the cell reference points to multiple cells an array of numbers is returned.
ROW(B3:B15)^0
becomes
{3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15}^0
and returns {1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1}.
Step 5 - Sum values row-wise
The MMULT function calculates the matrix product of two arrays, this can be used to sum values either column-wise or row-wise, the result is an array.
MMULT(EXACT(B3:B15, TRANSPOSE(B3:B15))*1, ROW(B3:B15)^0)
becomes
MMULT({1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0; 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0; 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0; 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1; 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1; 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0; 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0; 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0; 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0; 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0; 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0; 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0; 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1}, {1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1})
and returns {2; 1; 2; 3; 3; 2; 1; 2; 1; 2; 1; 2; 3}
The image above shows the array in column P.
Step 6 - Check if unique
Values that return 1 exist only once, two or more means duplicates.
MMULT(EXACT(B3:B15, TRANSPOSE(B3:B15))*1, ROW(B3:B15)^0)=1
becomes
{2; 1; 2; 3; 3; 2; 1; 2; 1; 2; 1; 2; 3}=1
and returns {FALSE; TRUE; FALSE; FALSE; FALSE; FALSE; TRUE; FALSE; TRUE; FALSE; TRUE; FALSE; FALSE}.
Step 7 - Filter unique values
The FILTER function lets you extract values/rows based on a condition or criteria.
FILTER(B3:B15, MMULT(EXACT(B3:B15, TRANSPOSE(B3:B15))*1, ROW(B3:B15)^0)=1)
becomes
FILTER({"Aa"; "CC"; "AA"; "BB"; "BB"; "EE"; "bb"; "Aa"; "aa"; "AA"; "Cc"; "EE"; "BB"}, {FALSE; TRUE; FALSE; FALSE; FALSE; FALSE; TRUE; FALSE; TRUE; FALSE; TRUE; FALSE; FALSE})
and returns {"CC"; "bb"; "aa"; "Cc"}.
Step 8 - Shorten formula
The LET function assigns names to calculation results. This can shorten formulas considerably and make them run much faster.
LET(name1, name_value1, calculation_or_name2, [name_value2, calculation_or_name3...])
LET(z, B3:B15, FILTER(z, MMULT(EXACT(z, TRANSPOSE(z))*1, ROW(z)^0)=1))
Unique values category
First, let me explain the difference between unique values and unique distinct values, it is important you know the difference […]
I read an article Merging Lists To A List Of Distinct Values at CPearson. The article describes code that you […]
The formula in cell F3 extracts unique values from column C, the formula in cell F3 extracts the corresponding dates […]
Excel categories
Leave a Reply
How to comment
How to add a formula to your comment
<code>Insert your formula here.</code>
Convert less than and larger than signs
Use html character entities instead of less than and larger than signs.
< becomes < and > becomes >
How to add VBA code to your comment
[vb 1="vbnet" language=","]
Put your VBA code here.
[/vb]
How to add a picture to your comment:
Upload picture to postimage.org or imgur
Paste image link to your comment.