Filter unique distinct words from a cell range [UDF]
This blog post describes how to create a list of unique distinct words from a cell range. Unique distinct words are all words but duplicate words are only listed once.
Cell range A2:A14 contains words, see picture below.
Rick Rothstein (MVP - Excel) helped me out here with a powerful user defined function (udf).
Array formula in cell B2:B23
=ListOfWords($A$2:$A$18, TRUE) + CTRL + SHIFT + ENTER
Select all the cells to be filled, then type the above formula into the Formula Bar and press CTRL+SHIFT+ENTER
Array formula in cell C2:C23
=ListOfWords($A$2:$A$18, FALSE) + CTRL + SHIFT + ENTER
Select all the cells to be filled, then type the above formula into the Formula Bar and press CTRL+SHIFT+ENTER
User defined function
Instructions
- You can select far more cells to load the formulas in than are required by the list. The empty text string will be displayed for cells not having an entry.
- You can specify a larger range than the there are filled in cells as the argument to these macros to allow for future entries in the column.
- You can specify whether the listing is to be case sensitive or not via the optional second argument with the default value being FALSE, meaning duplicated entries with different casing like One, one, ONE, onE, etc.. will all be treated as if they were the same word with the same spelling. If you pass TRUE for that optional second argument, then those words would all be treated as if they were different words.
- For all the "Case Insensitive" listing, the words are listed in Proper Case (first letter upper case, remaining letters lower case). The reason being if you had One, one and ONE then there is not reason to prefer one version over another, so I solved the problem by using Proper Case throughout.
VBA Code:
Function ListOfWords(Rng As Range, Optional CaseSensitive As Boolean) As Variant Dim X As Long, Index As Long, List As String, Words() As String, LoW As Variant With WorksheetFunction Words = Split(.Trim(Replace(Join(.Transpose(Rng)), Chr(160), " "))) LoW = Split(Space(.Max(UBound(Words), Application.Caller.Count) + 1)) For X = 0 To UBound(Words) If InStr(1, Chr(1) & List & Chr(1), Chr(1) & Words(X) & Chr(1), 1 - Abs(CaseSensitive)) = 0 Then List = List & Chr(1) & Words(X) If CaseSensitive Then LoW(Index) = Words(X) Else LoW(Index) = StrConv(Words(X), vbProperCase) End If Index = Index + 1 End If Next ListOfWords = .Transpose(LoW) End With End Function
How to copy above code to your workbook
- Press Alt-F11 to open visual basic editor
- Press with left mouse button on Module on the Insert menu
- Copy and paste above user defined function to code module
- Exit visual basic editor
- Select a sheet
- Select a cell range
- Type =ListOfWords($A$2:$A$18, TRUE) into formula bar and press CTRL+SHIFT+ENTER
Get Rick Rothstein´s excel example file
Many thanks to Rick Rothstein (Mvp - Excel)!!
User defined function category
This article demonstrates two ways to calculate the number of times each word appears in a given range of cells. […]
This article demonstrates how to convert a range of cells containing strings separated by a delimiter into a range of […]
This article demonstrates a User Defined Function (UDF) that counts unique distinct cell values based on a given cell color. […]
Excel categories
12 Responses to “Filter unique distinct words from a cell range [UDF]”
Leave a Reply
How to comment
How to add a formula to your comment
<code>Insert your formula here.</code>
Convert less than and larger than signs
Use html character entities instead of less than and larger than signs.
< becomes < and > becomes >
How to add VBA code to your comment
[vb 1="vbnet" language=","]
Put your VBA code here.
[/vb]
How to add a picture to your comment:
Upload picture to postimage.org or imgur
Paste image link to your comment.
Hi,
I followed all these steps and ended up with the first word of the first cell of the selected column. Any suggestions to extract all of the unique words? The column I want to extract unique words from has approx. 225,000 rows.
Thanks,
Diana
Diana Bubser,
I am using the example in this blog post.
How to create an array formula.
1. Select cell range B2:B14
2. Type =ListOfWords($A$2:$A$18, FALSE)
3. Press and hold Ctrl + Shift
4. Press Enter once
5. Release all kyes
Greetings,
Are there known limitations to the size of the data in the column that ListofWords can handle? The script works fine when I use it on small columns, but when using it on larger columns, I get #VALUE! errors in my output array. Specifically, Excel reports "A value used in the formula is of the wrong data type." errors when I increase the number of characters past a pre-determined size.
What's odd is that ListofWords runs successfully up until a point. And then if I add an additional character to my list--either in a new row or to an existing word already on my list--the #VALUE! occurs.
I can successfully filter a 3361-row column that consists of 4464 words, 849 of which are unique, so I don't think I'm testing the limits of the Excel memory. I also tested the VBA code in multiple versions of Excel on different computers, all with the same result. Is there a way of tweaking the VBA code to avoid these constraints?
Thanks,
Tom
Tom,
Get the Excel *.xls file
Tom.xls
Excuse me, but are there any limitations towards special characters? (-/_=). I have several cells that contain words "merged together", for example, "Asia-Pacific", and I'm getting the #VALUE! response Tom is mentioning in his post.
These are the great UDF's
Thank you so much
I was trying to modify for the same to get the unique list by cells ( not by words)
Can you please help me out
Chinna,
https://www.get-digital-help.com/how-to-extract-a-unique-list-and-the-duplicates-in-excel-from-one-column/#vba
I'm having the same problem with the limitation, after a certain amount of words everything turns to #VALUE. Has anyone found a solution to this?
Jose,
Transpose function has a limit of 65536 array values.
https://stackoverflow.com/questions/20055784/best-workaround-for-vba-transpose-array-length-limit
I receive an "Ambiguous name detected: ListOfWords" error in VB Any suggestions?
Thomas,
You have two "user defined functions" with the same name in your workbook.
Hello,
I am constantly getting #NAME? this error when I want to use the formula.
WHy is this happening?