Extract unique distinct records from two data sets
The picture above shows an array formula in cell B9:C13 that extracts unique distinct records from two tables in cell range B3:C6 and E3:F6.
If a record exists in both tables only one record is returned by the formula. If a record exists multiple times in one table only one record is returned by the formula.
Example, John 42 exists in both tables, however, the formula returns only one instance of John 42.
Laura 26 exists multiple times but only in the first table, the formula returns only one record of Laura 26.
Array formula in cell B9:
To enter an array formula, type the formula in cell B9 then press and hold CTRL + SHIFT simultaneously, now press Enter once. Release all keys.
The formula bar now shows the formula with a beginning and ending curly bracket telling you that you entered the formula successfully.
Don't enter the curly brackets yourself, they appear automatically.
What is a unique distinct record?
Unique distinct records are all records except duplicates merged into one distinct value.
In other words, duplicate records are removed.
Explaining the formula in cell B9
Use the "Evaluate Formula" tool to examine the calculation steps in greater detail.
Go to tab "Formula" on the ribbon, press with left mouse button on "Evaluate Formula" button to start the tool.
Then press with left mouse button on the "Evaluate" button to see the next step in the calculation, this will make it easier to understand how the formula works.
Step 1 - Count previous records against table 1
The COUNTIFS function allows you to count how many times a record exists in a table.
The previous values in cell B9 are the values in B8 and C8.
The first argument $B$8:B8Â in the COUNTIFS function has both absolute and relative cell references, this allows the formula to automatically expand when you copy it and paste to cells below.
"Name" and "Age" is not found in cell range B3:C6 so the COUNTIFS function returns 0 (zero) for each record.
COUNTIFS($B$8:B8, $B$3:$B$6, $C$8:C8, $C$3:$C$6)
becomes
=COUNTIFS("Name", {"John";"Laura";"Martin";"Laura"}, "Age", {42;26;45;26})
and returns {0;0;0;0}. There are four records and the function returns an array containing 4 values (zeros).
Step 2 - Find the first instance of 0 (zero) in the array
The MATCH function allows you to identify which record to return next.
MATCH(0, COUNTIFS($B$8:B8, $B$3:$B$6, $C$8:C8, $C$3:$C$6), 0)
becomes
MATCH(0, {0;0;0;0}, 0)
and returns 1. The first instance of 0 (zero) is found in position 1 in the array.
Step 3 - Return value from a record
The INDEX function lets get a specific value using a row and column number.
The MATCH function calculates the row number we need to get the correct value, however, the COLUMNS function keeps track of which value in the record to get.
The COLUMNS function calculates the number of columns in a cell reference, the cell reference used here $A$1:A1 is also expanding when the formula is copied to other cells.
COLUMNS($A$1:A1) returns 1.
INDEX($B$3:$C$6, MATCH(0, COUNTIFS($B$8:B8, $B$3:$B$6, $C$8:C8, $C$3:$C$6), 0),COLUMNS($A$1:A1))
becomes
INDEX($B$3:$C$6, 1, COLUMNS($A$1:A1))
becomes
INDEX($B$3:$C$6, 1, 1) and returns John in cell B9.
Step 4 - IFNA function points the calculation in a new direction
Step 1 to 3 explains how the formula extracts unique distinct records from the first table.
The first part of the formula returns a #N/A error when there are no records left in table 1 to extract.
The IFNA function points the calculation to part2 when the error occurs.
The second part of the formula does the exact same thing as the first part except that the cell references this time points to the second table.
The picture above shows that the two last records are extracted from the second table.
Get Excel *.xlsx file
Extract unique distinct records from two data sets.xlsx
Unique distinct records category
davidlim asks: re: Count records between two dates and a criterion based on the example, i was looking for 1 […]
This article demonstrates how to sort records in a data set based on their count meaning the formula counts each […]
Sean asks: If Tea and Coffee has Americano,it will only return Americano once and not twice. I am looking for a […]
Excel categories
Leave a Reply
How to comment
How to add a formula to your comment
<code>Insert your formula here.</code>
Convert less than and larger than signs
Use html character entities instead of less than and larger than signs.
< becomes < and > becomes >
How to add VBA code to your comment
[vb 1="vbnet" language=","]
Put your VBA code here.
[/vb]
How to add a picture to your comment:
Upload picture to postimage.org or imgur
Paste image link to your comment.