Compare two columns and extract differences
This article demonstrates formulas that extract values that exist in only one of two columns.
There are text values in column B and column C.
Update!
Excel 365 formula in cell E3:
Excel 365 formula in cell F3:
The formulas above are entered like regular formulas. They use the SEQUENCE function, which is not available in older Excel versions.
Copy cell E3 and F3 and paste to cells below as far as needed.
Array formula for older Excel versions
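The array formula in cell E3 extracts values existing only in column B, compared to column C:
=INDEX($B$3:$B$15, SMALL(IF(COUNTIF($C$3:$C$11, $B$3:$B$15)=0, MATCH(ROW($B$3:$B$15), ROW($B$3:$B$15)), ""), ROWS($A$1:A1)))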
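The array formula in cell F3 extracts values existing only in column C, compared to column B. It mirrors the formula in cell E3 with the two ranges swapped:
=INDEX($C$3:$C$11, SMALL(IF(COUNTIF($B$3:$B$15, $C$3:$C$11)=0, MATCH(ROW($C$3:$C$11), ROW($C$3:$C$11)), ""), ROWS($A$1:A1)))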
How to enter array formula in cell E3
- Copy above array formula (Ctrl + c).
- Select cell E3.
- Click in the formula bar.
- Paste array formula (Ctrl + v) to the formula bar.
- Press and hold CTRL + SHIFT simultaneously.
- Press Enter once.
- Release all keys.
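The formula is now an array formula. Excel shows curly brackets around it in the formula bar to tell you so. Don't type the curly brackets yourself; they appear automatically when you enter the formula correctly, like this:
{=INDEX($B$3:$B$15, SMALL(IF(COUNTIF($C$3:$C$11, $B$3:$B$15)=0, MATCH(ROW($B$3:$B$15), ROW($B$3:$B$15)), ""), ROWS($A$1:A1)))}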
How to copy array formula
- Select cell E3.
- Copy (Ctrl + c).
- Select cell range E4:E8.
- Paste (Ctrl + v).
Explaining array formula in cell E3
I recommend the "Evaluate Formula" tool when you want to understand, troubleshoot or examine a specific formula.
Select the cell containing the formula you want to evaluate. Go to the "Formulas" tab on the ribbon and click the "Evaluate Formula" button.
A dialog box appears showing the formula. The "Evaluate" button below the formula lets you go through the formula calculations step by step.
Step 1 - Count values in column C based on values in column B
The COUNTIF function counts the cells in a range that meet a condition. If you supply several conditions in the criteria argument, it returns an array with one count per condition instead of a single value.
This is what makes the formula an array formula. Here are the arguments in the COUNTIF function:
COUNTIF(range, criteria)
COUNTIF($C$3:$C$11, $B$3:$B$15)
becomes
COUNTIF({"BB"; "DD"; "EE"; "HH"; "II"; "JJ"; "KK"; "VV"; "PP"}, $B$3:$B$15)
becomes
COUNTIF({"BB"; "DD"; "EE"; "HH"; "II"; "JJ"; "KK"; "VV"; "PP"}, {"AA"; "CC"; "DD"; "EE"; "GG"; "HH"; "II"; "JJ"; "KK"; "MM"; "NN"; "OO"; "PP"})
and returns the following array of values:
{0; 0; 1; 1; 0; 1; 1; 1; 1; 0; 0; 0; 1}
The position of each value in the array is very important, because it makes it possible to identify and extract the values we want. Each position in the array corresponds to the value at the same position in column B.
A 0 (zero) means that the value in column B is not found in column C. A 1 means that the value in column B is found once in column C.
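The array that COUNTIF returns can be reproduced outside Excel. The following is a minimal Python sketch of this step, not the article's method; the list names `column_b` and `column_c` are made up for illustration, and the sketch uses exact, case-sensitive matching, whereas COUNTIF ignores case and supports wildcards.

```python
# Values from the article's example: B3:B15 and C3:C11.
column_b = ["AA", "CC", "DD", "EE", "GG", "HH", "II", "JJ", "KK", "MM", "NN", "OO", "PP"]
column_c = ["BB", "DD", "EE", "HH", "II", "JJ", "KK", "VV", "PP"]

# One count per value in column B: how many times it occurs in column C.
# The order of the counts follows the order of column B.
counts = [column_c.count(value) for value in column_b]

print(counts)  # [0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1]
```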
Step 2 - Check if they are equal to 0 (zero)
The equal sign checks if the values are equal to 0 (zero) and returns the boolean values TRUE or FALSE.
COUNTIF($C$3:$C$11, $B$3:$B$15)=0
becomes
{0; 0; 1; 1; 0; 1; 1; 1; 1; 0; 0; 0; 1}=0
and returns
{TRUE; TRUE; FALSE; FALSE; TRUE; FALSE; FALSE; FALSE; FALSE; TRUE; TRUE; TRUE; FALSE}
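The comparison is applied to every element of the array, one by one. A rough Python equivalent, assuming the counts from step 1 are stored in a list named `counts` (a name chosen here for illustration):

```python
# Counts returned by step 1, one per value in column B.
counts = [0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1]

# Compare each count with 0. True marks a value in column B
# that does not exist in column C.
is_missing = [count == 0 for count in counts]

print(is_missing)
```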
Step 3 - If they are equal to zero, return the corresponding relative row number
The IF function allows you to return a specific value if the logical test is TRUE and another value if FALSE.
IF(logical_test, [value_if_true], [value_if_false])
IF(COUNTIF($C$3:$C$11, $B$3:$B$15)=0, MATCH(ROW($B$3:$B$15), ROW($B$3:$B$15)), "")
becomes
IF({TRUE; TRUE; FALSE; FALSE; TRUE; FALSE; FALSE; FALSE; FALSE; TRUE; TRUE; TRUE; FALSE}, MATCH(ROW($B$3:$B$15), ROW($B$3:$B$15)), "")
The MATCH and ROW functions create an array from 1 to 13, one number for each row in cell range B3:B15, which we then will use to extract the correct value from that range.
MATCH(ROW($B$3:$B$15), ROW($B$3:$B$15))
becomes
MATCH({3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15}, {3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15})
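and returns
{1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13}.
The IF function keeps the relative row number where the logical test is TRUE and returns "" (nothing) where it is FALSE:
{1; 2; ""; ""; 5; ""; ""; ""; ""; 10; 11; 12; ""}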
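As a sketch of what the IF function does with the two arrays in this step, here is a Python version. The names `is_missing` and `if_result` are illustrative only; an empty string stands in for the "" that IF returns when the logical test is FALSE.

```python
# Boolean array from step 2: True where the value in column B
# is not found in column C.
is_missing = [True, True, False, False, True, False, False,
              False, False, True, True, True, False]

# Relative row numbers 1..13, the same array MATCH(ROW(...), ROW(...)) builds.
# Keep the number where the test is True, otherwise return "".
if_result = [
    position if missing else ""
    for position, missing in enumerate(is_missing, start=1)
]

print(if_result)  # [1, 2, '', '', 5, '', '', '', '', 10, 11, 12, '']
```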
Step 4 - Return the k-th smallest row number
The SMALL function returns the k-th smallest number from an array or cell range.
SMALL(IF(COUNTIF($C$3:$C$11, $B$3:$B$15)=0, MATCH(ROW($B$3:$B$15), ROW($B$3:$B$15)), ""), ROWS($A$1:A1))
becomes
SMALL({1; 2; ""; ""; 5; ""; ""; ""; ""; 10; 11; 12; ""}, ROWS($A$1:A1))
The SMALL function ignores the "" values in the array, so only the row numbers of values missing from column C remain. The ROWS function counts the number of rows in a given cell reference. The cell reference in this example expands when you copy the cell and paste to cells below. This makes the SMALL function return a new number in each cell.
SMALL({1; 2; ""; ""; 5; ""; ""; ""; ""; 10; 11; 12; ""}, ROWS($A$1:A1))
becomes
SMALL({1; 2; ""; ""; 5; ""; ""; ""; ""; 10; 11; 12; ""}, 1)
and returns 1.
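To illustrate how the k-th smallest row number changes from cell to cell, here is a small Python sketch. The helper `small` is hypothetical and only mimics two behaviours of Excel's SMALL function: text values in the array are ignored, and a k outside the valid range is an error (Excel returns #NUM!, the sketch raises ValueError).

```python
def small(values, k):
    """Return the k-th smallest number in values, ignoring text such as ""."""
    numbers = sorted(v for v in values if isinstance(v, (int, float)))
    if not 1 <= k <= len(numbers):
        raise ValueError("k is out of range")  # Excel shows #NUM! here
    return numbers[k - 1]


# Array returned by the IF function: row numbers of values in column B
# that are missing from column C, "" for everything else.
if_result = [1, 2, "", "", 5, "", "", "", "", 10, 11, 12, ""]

# ROWS($A$1:A1) is 1 in cell E3, 2 in cell E4, and so on.
row_numbers = [small(if_result, k) for k in range(1, 7)]

print(row_numbers)  # [1, 2, 5, 10, 11, 12]
```

The seventh cell has no seventh number to pick, which is why the formula returns an error when it is copied further down than there are differences.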
Step 5 - Return value
The INDEX function returns a value or multiple values based on a row and/or column number.
INDEX($B$3:$B$15, SMALL(IF(COUNTIF($C$3:$C$11, $B$3:$B$15)=0, MATCH(ROW($B$3:$B$15), ROW($B$3:$B$15)), ""), ROWS($A$1:A1)))
becomes
INDEX($B$3:$B$15, 1)
and returns "AA" in cell E3.
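Putting the five steps together, the whole formula behaves roughly like the Python sketch below. This is an illustration under stated assumptions, not a translation of how Excel evaluates the formula: the function name `only_in_first` is invented, and matching is exact and case-sensitive, while COUNTIF is case-insensitive.

```python
def only_in_first(first, second):
    """Return the values of first that do not occur in second, in original order."""
    results = []
    for value in first:
        # Steps 1 and 2: count occurrences in the other column, keep zero counts.
        if second.count(value) == 0:
            # Steps 3 to 5: the position is kept and the value is fetched,
            # which here amounts to appending the value itself.
            results.append(value)
    return results


# Values from the article's example: B3:B15 and C3:C11.
column_b = ["AA", "CC", "DD", "EE", "GG", "HH", "II", "JJ", "KK", "MM", "NN", "OO", "PP"]
column_c = ["BB", "DD", "EE", "HH", "II", "JJ", "KK", "VV", "PP"]

print(only_in_first(column_b, column_c))  # ['AA', 'CC', 'GG', 'MM', 'NN', 'OO']
print(only_in_first(column_c, column_b))  # ['BB', 'VV']
```

The first call corresponds to the formula in cell E3 and the second to the formula in cell F3, where the two ranges trade places.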
Recommended articles
- How to Compare Two Columns in Excel (for matches & differences)
- How to compare two columns in Excel for matches and differences
- Compare Two Columns
If you want to compare two cell ranges, read this article:
Filter values existing in range 1 but not in range 2 using array formula
If you want to compare text values in two cell ranges, read this article:
Filter text values existing in range 1 but not in range 2 using array formula
I have also written an article about comparing records between two data tables:
Compare two lists of data: Filter records existing in only one list
3 Responses to “Compare two columns and extract differences”
Hi Oscar,
I started with the solution provided here for obtaining values existing only in one of two lists. I know that, to do the job in an orderly way, one should tend to use Excel in a 'database-like' fashion, with columns as fields and rows for data, and so I do.
Anyway, it happened that I needed to compare two lists of data, but they were spread horizontally. I also read your solutions for filtering values existing in different ranges, but since I was in a hurry, I adapted the formulas provided here, and wanted to share my solution. Here are the two alternative formulas that do the job in a 'column fashion':
Let's say we have two lists to compare in ranges G1:V1 and G2:V2 respectively. In the result range, I put the formula:
={INDEX($G$1:$V$1;;SMALL(IF(COUNTIF($G$2:$V$2;$G$1:$V$1)=0;MATCH(COLUMN($G$1:$V$1);COLUMN($G$1:$V$1));"");COLUMN(A1)))}
or, alternatively (thanks to another solution found here):
={INDEX($G$1:$V$1; SMALL(IF(ISERROR(MATCH($G$1:$V$1; $G$2:$V$2; 0)); (COLUMN($G$1:$V$1)-MIN(COLUMN($G$1:$V$1))+1); ""); COLUMN(A$1:A$65536)))}
I noticed that, if I use the same size for all three ranges (lists and results), I end up with some zeroes padding the 2nd result range (Missing data in List 1), whether I use vertical or horizontal lists,
as you can see from the image I provide here:
https://s12.postimg.org/jp0p9y6st/Filter_values_existing_in_column_1_but_not_in_co.jpg
I am wondering how those zeroes appear ?
I uploaded the example excel file.
Bruno,
You are comparing 4 blank cells in $G$2:$V$2 with the values in cell range $G$1:$V$1. Since $G$1:$V$1 contains no blank cells, the formula treats the blank cells as values missing from it and returns them. The INDEX function then returns 0 for each blank cell.
Try this formula in cell G2:
=INDEX($G$2:$O$2, , SMALL(IF(COUNTIF($G$1:$S$1, $G$2:$O$2)=0, MATCH(COLUMN($G$2:$O$2), COLUMN($G$2:$O$2)), ""), COLUMN(A1)))
Hi.
I have excel file:
code bookname language bookcode id
1 book1 en 100
2 book2 fa 101
3 book1 ar 102
4 book3 en 103
5 book2 fa 104
6 book4 az 105
...
i have want to filter by book & language columns and when two columns are exist, value of id column equal is last row value of code column. for example:
book2 is true but book1 is not true. so id book1 = 104
thanks