Extract a unique distinct list and sum amounts based on a condition
This article shows how to perform data aggregation using formulas based on a condition in Excel 365 and earlier Excel versions.
Table of Contents
1. Extract a unique distinct list and sum amounts based on a condition - earlier Excel versions
Anura asks:
Is it possible to extend this by matching items that meet a criteria?
I have a list of credit card transactions showing the name of the card holder, their Branch and the amount. I want to produce a report for each Branch, so what I want is to extract only those people who match the Branch name. For example:
Frank Branch A
Frank Branch A
Frank Branch A
Joe Branch A
Mary Branch B
Jane Branch C
Mike Branch A
Joe Branch A
Dave Branch C
I would like a list of only those people for Branch A, and then be able to summarise the transactions. Or should I do this in two stages?
Answer:
This post demonstrates how to build a formula to extract a unique distinct list using a condition, however, I highly recommend a pivot table for this task: Pivot table - Unique distinct list
Type a branch in cell G2 and the formulas in cell range F5:H7 returns the correct records from B3:D12. If you have more than two values in one record (except the number) use the COUNTIFS function to filter unique distinct records.
Array formula in F5:
Copy cell F5 and paste to cell range F5:G7.
Formula in cell H5:
If you have a version earlier than Excel 2007 use the SUMPRODUCT function
If you are interested in how the SUMPRODUCT function works, read:
Recommended articles
The SUMPRODUCT function calculates the product of corresponding values and then returns the sum of each multiplication.
Watch a video where I explain the formula
Recommended link
Recommended articles
First, let me explain the difference between unique values and unique distinct values, it is important you know the difference […]
How to enter an array formula
- Copy (Ctrl + c) above formula
- Double press with left mouse button on cell E8
- Paste (Ctrl + v) to cell
- Press and hold CTRL + Shift simultaneously
- Press Enter
- Release all keys
Copy cell E8 and paste it down as far as needed. Copy cells and paste into cell range F8 and down as far as needed.
Recommended link
Recommended articles
Array formulas allows you to do advanced calculations not possible with regular formulas.
Explaining array formula in cell E8
Step 1 - Check previous values to make sure they are not repeated
COUNTIF($E$7:$E7, $A$2:$A$11)
If you examine the cell references you may find this strange: $E$7:$E7 It is an absolute and relative cell reference. It expands when the formula is copied across the spreadsheet. Read more here: Absolute and relative references in excel
COUNTIF($E$7:$E7, $A$2:$A$11)
returns {0;0;0;0;0;0;0;0;0;0}
Step 2 - Compare values in column B with the value in cell F1
($B$2:$B$11<>$F$1)
returns {FALSE; ... ; FALSE}
Step 3 - Add arrays
COUNTIF($E$7:$E7, $A$2:$A$11)+($B$2:$B$11<>$F$1)
becomes
{0;0;0;0;0;0;0;0;0;0} + {FALSE; FALSE; FALSE; FALSE; TRUE; TRUE; FALSE; FALSE; TRUE; FALSE}
and returns
{0;0;0;0;1;1;0;0;1;0}
Step 4 - Find first 0 (zero) in array
MATCH(0, COUNTIF($E$7:$E7, $A$2:$A$11)+($B$2:$B$11<>$F$1), 0)
becomes
MATCH(0, {0;0;0;0;1;1;0;0;1;0}, 0)
and returns 1.
Step 5 - Return a value or reference of the cell at the intersection of a particular row and column
INDEX($A$2:$C$11, MATCH(0, COUNTIF($E$7:$E7, $A$2:$A$11)+($B$2:$B$11<>$F$1), 0), COLUMN(A1))
returns "Frank" in cell E8.
Get Excel *.xls file
unique distinct list matching criteria.xls
(Excel 97-2003 Workbook *.xls)
2. Extract a unique distinct list and sum amounts based on a condition - Excel 365
The following dynamic Excel 365 formula aggregates or groups the amounts based "Name" and the specified "Branch" in cell C14, and then sorts the output array based on the amounts.
The image above shows this data table in cell range B2:D12:
Name | Branch | Amount |
Frank | Branch A | 10 |
Frank | Branch A | 20 |
Frank | Branch A | 40 |
Joe | Branch A | 30 |
Mary | Branch B | 5 |
Jane | Branch C | 10 |
Mike | Branch A | 15 |
Joe | Branch A | 40 |
Dave | Branch C | 35 |
Joe | Branch A | 20 |
Formula in cell B17:
The formula in cell B17 returns this dynamic array:
Joe | Branch A | 90 |
Frank | Branch A | 70 |
Mike | Branch A | 15 |
It contains only data based on the specified "Branch" in cell C14 which is "Branch A" in this example. "Branch A" has the following names: Joe, Frank, and Mike and the corresponding amounts are 90, 70, and 15.
Here's a breakdown of the formula:
- SUMIFS(D3:D12,C3:C12,C3:C12,B3:B12,B3:B12): This function calculates the sum of values in column D for each row where the values in columns C and B match the values in the same row. It's essentially a self-referential sumifs that sums up the values in column D for each unique combination of values in columns C and B.
- HSTACK(B3:C12,SUMIFS(...)): This function horizontally stacks (or combines) the values in columns B and C with the results of the sumifs function. This creates a new array with the values from columns B and C, and the corresponding sumifs results.
- FILTER(HSTACK(...),C3:C12=C14): This function filters the array created by the hstack function to only include rows where the value in column C matches the value in cell C14.
- UNIQUE(FILTER(...)): This function removes any duplicate rows from the filtered array, leaving only unique combinations of values.
- SORT(UNIQUE(...),3,-1): This function sorts the unique array in descending order (due to the -1) based on the values in the third column which is the sumifs results.
2.1 Explaining formula
Step 1 - Filter data based on given branch
The SUMIFS function adds numbers based on criteria.
Function syntax: SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2], [criteria2], ...)
SUMIFS(D3:D12,C3:C12,C3:C12,B3:B12,B3:B12)
returns
Step 2 - Stack arrays horizontally
The HSTACK function combines cell ranges or arrays. Joins data to the first blank cell to the right of a cell range or array (horizontal stacking)
Function syntax: HSTACK(array1,[array2],...)
HSTACK(B3:C12,SUMIFS(D3:D12,C3:C12,C3:C12,B3:B12,B3:B12))
returns
Step 3 - Filter data based on given branch
The FILTER function extracts values/rows based on a condition or criteria.
Function syntax: FILTER(array, include, [if_empty])
FILTER(HSTACK(B3:C12,SUMIFS(D3:D12,C3:C12,C3:C12,B3:B12,B3:B12)),C3:C12=C14)
returns
Step 4 - List unique distinct rows
The UNIQUE function returns a unique or unique distinct list.
Function syntax: UNIQUE(array,[by_col],[exactly_once])
UNIQUE(FILTER(HSTACK(B3:C12,SUMIFS(D3:D12,C3:C12,C3:C12,B3:B12,B3:B12)),C3:C12=C14))
returns
Step 5 - Sort array in descending order based on amounts in column 3
The SORT function sorts values from a cell range or array
Function syntax: SORT(array,[sort_index],[sort_order],[by_col])
SORT(UNIQUE(FILTER(HSTACK(B3:C12,SUMIFS(D3:D12,C3:C12,C3:C12,B3:B12,B3:B12)),C3:C12=C14)),3,-1)
returns
Unique distinct values category
First, let me explain the difference between unique values and unique distinct values, it is important you know the difference […]
This article demonstrates Excel formulas that allows you to list unique distinct values from a single column and sort them […]
Question: I have two ranges or lists (List1 and List2) from where I would like to extract a unique distinct […]
Excel categories
45 Responses to “Extract a unique distinct list and sum amounts based on a condition”
Leave a Reply
How to comment
How to add a formula to your comment
<code>Insert your formula here.</code>
Convert less than and larger than signs
Use html character entities instead of less than and larger than signs.
< becomes < and > becomes >
How to add VBA code to your comment
[vb 1="vbnet" language=","]
Put your VBA code here.
[/vb]
How to add a picture to your comment:
Upload picture to postimage.org or imgur
Paste image link to your comment.
Contact Oscar
You can contact me through this contact form
This really is brilliant. Can I ask for one modification? How can I count the number of transactions per person?
Anura,
Formula in H8:
=SUMPRODUCT(--(E8=$A$2:$A$11)) + ENTER.
That is just awesome.
I tried equating the value "Branch A" itself in the formula instead of its address and it worked perfectly.
Thanks Oscar!
Vashi,
You are welcome!
Occar, your breakdown above worked for me, HOWEVER, i am using a date vs. Branch A. Needless to say, my spreadsheet is dynamic so i need the $F$1, in this scenario to map to a different page. When i simply change the formula to ='High Level Summary'!U3 (result = 1/10/2013 for example), my results turn to errors. It works fine as long as my date is just a date and not a formula. I even tried using INDIRECT prior to ='High Level Summary'!U3. A dropdown box is what dictates the date to be used. PLEASE HELP.... :(
Jana
Jana, can you provide the formula? Maybe your dates are not excel dates?
How excel stores dates
Oscar,
This is a pretty awesome solution!
I am trying to include your formula in an dashboard that shows dynamic results depending on the look-up value. I'm struggling with hiding error values (i.e. "#N/A") for the formula in the "Name" column.
I've already tried a few methods for hiding zero and error values that work with other formulas, but I can't seem to make anything work for your formula.
Can you help?
Thanks!
Dave
Looks like one way to do it is:
=IFERROR(INDEX($A$2:$C$11,MATCH(0,COUNTIF($E$7:$E7,$A$2:$A$11)+($B$2:$B$11$F$1),0),COLUMN(A1)),"")
This is a thing of beauty.
Very helpful - thank you!
dubdub,
thank you for commenting!
Thank you. This was very helpful as I wanted to extract personal data for league members (using the team as the criteria) from a league data sheet to a team data sheet.
I used David Myers' suggestion for ignoring #N/A error messages in empty rows but wonder if anyone can suggest a way to "hide" the zeros that appear when I have an incomplete row with empty cells. I assume I will need to use a nested if function in the column reference but can't get the syntax right.
I would really value any suggestions.
Thanks
Does this work if your data is on a different tab?
Tim,
Yes, change cell references.
Example,
=INDEX($A$2:$C$11, MATCH(0, COUNTIF($E$7:$E7, $A$2:$A$11)+($B$2:$B$11<>$F$1), 0), COLUMN(A1))
becomes
=INDEX(sheet2!$A$2:$C$11, MATCH(0, COUNTIF($E$7:$E7, sheet2!$A$2:$A$11)+(sheet2!$B$2:$B$11<>$F$1), 0), COLUMN(A1))
Hi ,
Is these work on the more column. i have 1400 column and this formula not working.
data is much bigger.
I am want agent project list.
i want list of project on which agent has worked.
Regards,
Krunal
Krunal,
Yes, you need a custom function (vba).
how would you add another criteria, I am looking to do this with the date and time as the criteria to match.
john dalton,
If the start date is in F1 and end date is in F2 and the dates are in cell range B2:B11:
=INDEX($A$2:$C$11, MATCH(0, COUNTIF($E$7:$E7, $A$2:$A$11)+(($B$2:$B$11>$F$1)*($B$2:$B$11<$F$2)), 0), COLUMN(A1))
How do I add another additional criteria? Say if I wanted to pull a unique list of branches and invidiuals
Justine,
perhaps this is helpful:
Filter unique distinct row records in excel 2007
Hi Oscar,
I need help as well. Suppose that i needed to find the top 20 names with the highest amounts in a Branch. I've tried with the LARGE function but i can't make it work.
Thanks
Hello,
Thanks for this! It's saved me. However, I would like to modify this formula so that it doesn't summarize by name. So using your example, I would want to see Frank's name three times, Joe's name three times, etc. I want my unique distinct list to be the same as the original list, I just want to filter out those names that aren't Branch A.
Thanks.
Chuck
Array formula in cell E8:
=INDEX($A$2:$C$11, SMALL(IF($F$1<>$B$2:$B$11, MATCH(ROW($B$2:$B$11), ROW($B$2:$B$11)), ""), ROW(A1)), COLUMNS($A$1:A1))
Thanks, Oscar! Works like a champ. However, I would like to point out that this formula only works if your original table starts in A1. So the last part of your formula:
...ROW(A1)), COLUMNS($A$1:A1))
I had to come up with another way to get the correct values here. Once I did it worked great. Thank you.
Hi Oscar,
This works as a charm, however (as in previous comment) I need value be displayed all times it appears in the list. Is there any way to tell this formula not to ignore repetitive values?
I mean the result should look like
Frank Branch A
Frank Branch A
Frank Branch A
Joe Branch A
Mike Branch A
Joe Branch A
Joe Branch A
Thanks in advance!
I don't have any questions. I just want to say that you are awesome.
Thanks for your helpful and clear examples. Following your same example – how about if from a long list of branches, I need to extract the amounts of sub-set of branches. For examples, need the amounts from 7 out of 20 branches, just to illustrate an example. How do I name the string of non-consecutive branch numbers? Thanks again
Thanks for your helpful and clear examples. Following your same example – how about if from a long list of branches, I need to extract the amounts of sub-set of branches. For examples, need the amounts from 7 out of 20 branches, just to illustrate an example. How do I specify a string of non-consecutive branch numbers?
Sorry for the multiple entries, something is wrong on this end
Hi Oscar,
I dint understand how this countif is working. I can see that the range is smaller than the criteria. Can you please explain the logic through the first principles.
Thanks.
Hi Chuck
How can I sort my data while fetching. Lets say, I want my data to be sorted ascending order wise depending on amount. I don't want a help column. How can I do it in one go, while I am pulling the data using your formula
I have what I think is a very simple process I want to do, but I cannot find the answer on the internet because I do not know what to call it in 'excel speak'. I have a list of students and their reading level (numerical). I just want it to dynamically create lists of all the students in each level.
e.g.
Students Level
studentA 2
studentB 9
studentC 7
studentD 3
studentE 9
etc (in two columns) and I want to then create a table with the levels as the headings and the students names to automatically go under the level.
e.g.
1 2 3... ...9
studentA studentB
student E
How can I do this?
Hi Oscar,
I'm starting to learn Excel now. And I have a spreadsheet similar to a data entry. My job is to go through weekly data to see which kid is in trouble. My question is, Is there a formula I can set where it list down data from a certain range. For example if I want to view data from the 5/18/2016 to 5/24/2016.
Thanks
John
I recommend you convert your data table to an excel defined table and apply a date filter to your date column.
https://support.office.com/en-us/article/Overview-of-Excel-tables-7ab0bb7d-3a9e-4b56-a3c9-6c94334e492c
Hi there,
I have a dataset and I wish to show, in a separate tab, a distinct list based on two criterion. What would I need to do?
Thanks, Kevin
Kevin,
This formula has one criterion (bolded):
=INDEX($A$2:$C$11, MATCH(0, COUNTIF($E$7:$E7, $A$2:$A$11)+($B$2:$B$11<>$F$1), 0), COLUMN(A1))
If you need two criteria, see this formula:
=INDEX($A$2:$C$11, MATCH(0, COUNTIF($E$7:$E7, $A$2:$A$11)+($B$2:$B$11<>$F$1)*($A$2:$A$11<>"Frank"), 0), COLUMN(A1))
It looks for a criterion in column A and another criterion in column B.
This formula looks for two criteria ($F$1:$F$2) in cell range $B$2:$B$11:
=INDEX($A$2:$C$11, MATCH(0, COUNTIF($E$7:$E7, $A$2:$A$11)+COUNTIF($F$1:$F$2,"<>"&$B$2:$B$11), 0), COLUMN(A1))
Hi Oscar,
I'm using a very similar formula to yours:
=IFERROR(INDEX(all_names, MATCH(0, IF($D$2=all_groups, COUNTIF(D$4:D4, all_names), ""), 0)),"")
This works well. However, if I try to change it to include an OR function it no longer work:
=IFERROR(INDEX(all_names, MATCH(0, IF(OR($D$2=all_groups,$D$2=all_groups_2,$D$2=all_groups_3), COUNTIF(D$4:D4, all_names), ""), 0)),"")
Any idea what I'm doing wrong?
Thanks,
Jake
Jake,
Try this:
=IFERROR(INDEX(all_names, MATCH(0, IF(($D$2=all_groups)+($D$2=all_groups_2)+($D$2=all_groups_3), COUNTIF(D$4:D4, all_names), ""), 0)),"")
Hello Oscar,
What code is needed to cause cells in Columns F - I to fill with the contents of Columns C - E when a cell in Column B includes a numeric value?
Liam,
See this post:
https://www.get-digital-help.com/2016/12/20/filter-rows-where-a-cell-contains-a-numeric-value/
THANK YOU SO MUCH.... IT IS VERY VERY VERY HELPFUL AND I AM TOO MUCH EXCITING TO LEARN THIS FORMULA..... I WOULD LIKE TO SAY AGAIN *"THANK YOU SO MUCH...."*
Dear
How can I make this work by having the names F5:F7 alphabetically with 2 conditions ?
Thanks
Oliver
[…] Liam asks: […]
[…] Extract a unique distinct list and sum amounts based on a condition […]
Oscar,
I have a list with duplicate values in 2 columns. How do I extract one distinct list with two columns. They are not two separate lists. They are columns with two different column headings: Type & Letter
Type Letter
Car A
Bike D
Car A
Bike D
Bike E
Result:
Car A
Bike D
Bike E
Car A
I made a mistake. The result should be
Type Letter
Car A
Bike D
Bike E