In this post I am going to describe how to filter duplicate and unique distinct values from a really large dataset. Both array formula and advanced filter are too slow.

In this example, cell range A1:A1 000 000 contains seven random letters from A to Z.

Create a table

  1. Select cell A1
  2. Press Ctrl + A . This selects cell range A1:A1000000.
  3. Click "Insert" on the ribbon
  4. Click "Table" button
  5. Click OK.

Sort values in column A

  1. Click black arrow
  2. Click "Sort A to Z"

Sorting takes 15-20 seconds on my old computer.

Excel Formula - Find duplicates

  1. Select cell B3
  2. Type
    =IF(B3=B2,B3, "")
  3. Press Enter

Formula calculation for the entire range takes maybe 15 to 20 seconds. To make it case sensitive, change formula to:

=IF(EXACT(B3, B2), B3, "")

How to sort case sensitive

Filter duplicates

  1.  Click black arrow on column 2
  2. Disable blanks
  3. Click OK

How to return unique distinct values

Make sure the range is sorted.

Change formula in cell B3:

=IF(B3<>B2, B3, "")

To make the formula case sensitive, change formula to:

=IF(EXACT(B3, B2)=FALSE, B3, "")

How to sort case sensitive

Filter unique distinct values

  1.  Click black arrow on column 2
  2. Disable blanks
  3. Click OK

Download excel file

Filter unique distinct values from a large data set.xlsx
Excel 2007 *.xlsx

Recommended blog posts

Want to learn more about filtering unique distinct values? You must read these blog posts:

Learn more about sorting unique distinct values. Read these blog posts:

Learn more about counting unique distinct values and records. Read these blog posts:

Learn more about filtering and comparing unique distinct records. Read these blog posts:

Read more about custom functions in excel