Filter unique distinct words from a cell range [UDF]
This blog post describes how to create a list of unique distinct words from a cell range. Unique distinct words are all words but duplicate words are only listed once.
Cell range A2:A14 contains words, see picture below.
Rick Rothstein (MVP - Excel) helped me out here with a powerful user defined function (udf).
Array formula in cell B2:B23
=ListOfWords($A$2:$A$18, TRUE) + CTRL + SHIFT + ENTER
Select all the cells to be filled, then type the above formula into the Formula Bar and press CTRL+SHIFT+ENTER
Array formula in cell C2:C23
=ListOfWords($A$2:$A$18, FALSE) + CTRL + SHIFT + ENTER
Select all the cells to be filled, then type the above formula into the Formula Bar and press CTRL+SHIFT+ENTER
User defined function
Instructions
- You can select far more cells to load the formulas in than are required by the list. The empty text string will be displayed for cells not having an entry.
- You can specify a larger range than the there are filled in cells as the argument to these macros to allow for future entries in the column.
- You can specify whether the listing is to be case sensitive or not via the optional second argument with the default value being FALSE, meaning duplicated entries with different casing like One, one, ONE, onE, etc.. will all be treated as if they were the same word with the same spelling. If you pass TRUE for that optional second argument, then those words would all be treated as if they were different words.
- For all the "Case Insensitive" listing, the words are listed in Proper Case (first letter upper case, remaining letters lower case). The reason being if you had One, one and ONE then there is not reason to prefer one version over another, so I solved the problem by using Proper Case throughout.
VBA Code:
Function ListOfWords(Rng As Range, Optional CaseSensitive As Boolean) As Variant Dim X As Long, Index As Long, List As String, Words() As String, LoW As Variant With WorksheetFunction Words = Split(.Trim(Replace(Join(.Transpose(Rng)), Chr(160), " "))) LoW = Split(Space(.Max(UBound(Words), Application.Caller.Count) + 1)) For X = 0 To UBound(Words) If InStr(1, Chr(1) & List & Chr(1), Chr(1) & Words(X) & Chr(1), 1 - Abs(CaseSensitive)) = 0 Then List = List & Chr(1) & Words(X) If CaseSensitive Then LoW(Index) = Words(X) Else LoW(Index) = StrConv(Words(X), vbProperCase) End If Index = Index + 1 End If Next ListOfWords = .Transpose(LoW) End With End Function
How to copy above code to your workbook
- Press Alt-F11 to open visual basic editor
- Click Module on the Insert menu
- Copy and paste above user defined function to code module
- Exit visual basic editor
- Select a sheet
- Select a cell range
- Type =ListOfWords($A$2:$A$18, TRUE) into formula bar and press CTRL+SHIFT+ENTER
Download Rick Rothstein´s excel example file
Many thanks to Rick Rothstein (Mvp - Excel)!!
How to count word frequency in a cell range [UDF]
This user defined function creates a unique distinct list of words and how many times they occur in the selected […]
Extract unique distinct values from a filtered Excel defined Table [UDF and Formula]
This article demonstrates two formulas that extract distinct values from a filtered Excel Table, one formula for Excel 365 subscribers […]
List files in a folder and subfolders [UDF]
This article demonstrates a user defined function that lists files in a ggiven folder and subfolders. A user defined function is […]
Search for a file in folder and subfolders [UDF]
The image above demonstrates a user-defined function in cell range B6:D7 that allows you to search a folder and subfolders […]
Split values equally into groups
Question: How do I divide values equally into groups (3 lists or less)? This post shows you two different approaches, […]
Split words in a cell range into a cell each [UDF]
This post describes how to split words in a cell range into a cell each using a custom function. I […]
Count unique distinct values by cell color
This article demonstrates a User Defined Function (UDF) that counts unique distinct cell values based on a given cell color. […]
Substitute multiple text strings [UDF]
The SUBSTITUTE and REPLACE functions can only handle one string, the following User-Defined Function (UDF) allows you to substitute multiple […]
List permutations without repetition [UDF]
This blog post describes how to create permutations, repetition is NOT allowed. Permutations are items arranged in a given order meaning […]
12 Responses to “Filter unique distinct words from a cell range [UDF]”
Leave a Reply
How to comment
How to add a formula to your comment
<code>Insert your formula here.</code>
Convert less than and larger than signs
Use html character entities instead of less than and larger than signs.
< becomes < and > becomes >
How to add VBA code to your comment
[vb 1="vbnet" language=","]
Put your VBA code here.
[/vb]
How to add a picture to your comment:
Upload picture to postimage.org or imgur
Paste image link to your comment.
Hi,
I followed all these steps and ended up with the first word of the first cell of the selected column. Any suggestions to extract all of the unique words? The column I want to extract unique words from has approx. 225,000 rows.
Thanks,
Diana
Diana Bubser,
I am using the example in this blog post.
How to create an array formula.
1. Select cell range B2:B14
2. Type =ListOfWords($A$2:$A$18, FALSE)
3. Press and hold Ctrl + Shift
4. Press Enter once
5. Release all kyes
Greetings,
Are there known limitations to the size of the data in the column that ListofWords can handle? The script works fine when I use it on small columns, but when using it on larger columns, I get #VALUE! errors in my output array. Specifically, Excel reports "A value used in the formula is of the wrong data type." errors when I increase the number of characters past a pre-determined size.
What's odd is that ListofWords runs successfully up until a point. And then if I add an additional character to my list--either in a new row or to an existing word already on my list--the #VALUE! occurs.
I can successfully filter a 3361-row column that consists of 4464 words, 849 of which are unique, so I don't think I'm testing the limits of the Excel memory. I also tested the VBA code in multiple versions of Excel on different computers, all with the same result. Is there a way of tweaking the VBA code to avoid these constraints?
Thanks,
Tom
Tom,
Download *.xls file
Tom.xls
Excuse me, but are there any limitations towards special characters? (-/_=). I have several cells that contain words "merged together", for example, "Asia-Pacific", and I'm getting the #VALUE! response Tom is mentioning in his post.
These are the great UDF's
Thank you so much
I was trying to modify for the same to get the unique list by cells ( not by words)
Can you please help me out
Chinna,
https://www.get-digital-help.com/2009/03/30/how-to-extract-a-unique-list-and-the-duplicates-in-excel-from-one-column/#vba
I'm having the same problem with the limitation, after a certain amount of words everything turns to #VALUE. Has anyone found a solution to this?
Jose,
Transpose function has a limit of 65536 array values.
https://stackoverflow.com/questions/20055784/best-workaround-for-vba-transpose-array-length-limit
I receive an "Ambiguous name detected: ListOfWords" error in VB Any suggestions?
Thomas,
You have two "user defined functions" with the same name in your workbook.
Hello,
I am constantly getting #NAME? this error when I want to use the formula.
WHy is this happening?