Extract text between words [UDF]
I have a somewhat related question, if you don't mind:
I have a very large amount of text in a single cell, and I would like to extract multiple instances of text that appear between two specific words.
For example, here is the sample text in one cell:
{"date": 5/7/19 headline:"GE Posts Profit" source:"CNBC"}{"date": 5/8/19 headline:"GE Dividend Shrink" source:"MSN"}{"date": 5/9/19 headline:"GE Bankrupt" source:"WSJ"}
The following formula does a good enough job of extracting the first headline:
=MID(C2,SEARCH("headline",C2)+2,SEARCH("source:",C2)-SEARCH("headline",C2)-4)
However, it only extracts the first headline and nothing after it.
If possible, I would like to extract all of the headlines within the text in that cell, and generate a vertical array of those headlines so that it looks like this:
GE Posts Profit
GE Dividend Shrink
GE Bankrupt
Is this possible?
Thanks very much.
The array formula that I entered in cell range B9:B11 is a user-defined function that I created. It extracts text from cell B3 based on the start and end strings specified in cells C5 and C6, respectively.
Before using it, copy the VBA code below and paste it into a regular code module. Instructions follow the code.
Array formula in cell range B9:B11
=ExtractText(B3, C5, C6)
To enter the array formula, select cell range B9:B11. Type the formula, then press and hold CTRL + SHIFT simultaneously and press Enter once. Release all keys.
The formula bar now shows the formula with a beginning and ending curly bracket telling you that you entered the formula successfully. Don't enter the curly brackets yourself.
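If you would rather enter the array formula from code than with CTRL + SHIFT + Enter, a macro can assign it through the Range.FormulaArray property. This is only a sketch: the macro name EnterExtractTextFormula is made up for illustration, and it assumes the active sheet uses the layout described above (text in B3, start and end strings in C5 and C6, results in B9:B11) and that ExtractText already exists in a regular code module.

```vba
Sub EnterExtractTextFormula()
    'Hypothetical helper, not part of the original workbook.
    'Assumes the active sheet has the text in B3 and the
    'start and end strings in C5 and C6.
    ActiveSheet.Range("B9:B11").FormulaArray = "=ExtractText(B3,C5,C6)"
End Sub
```

Excel adds the curly brackets itself, just as it does when you enter the formula by hand.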
User Defined Function Syntax
ExtractText(text, start_word, end_word)
Arguments
text | Required. A cell reference to the cell containing the text you want to extract. |
start_word | Required. The first word you want to search for. |
end_word | Required. The second word you want to search for. Text between the first and second word will be extracted, even if there are multiple instances. |
VBA code
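Function ExtractText(ByVal text As String, start_word As String, end_word As String) As Variant
    'Dimension variables and declare data types
    Dim tmpArr() As Variant
    Dim ccount As Long, i As Long
    Dim StartStr As Long, EndStr As Long

    'Count instances (upper bound of the zero-based result array)
    ccount = UBound(Split(text, start_word)) - 1

    'Return an empty string if start_word is not found
    If ccount < 0 Then
        ExtractText = ""
        Exit Function
    End If

    ReDim tmpArr(ccount)

    'Iterate through text string
    For i = 0 To ccount
        'Find start position of instance
        StartStr = InStr(text, start_word) + Len(start_word)
        'Find end position of instance, searching from the start position
        EndStr = InStr(StartStr, text, end_word)
        'Stop if there is no closing end_word
        If EndStr = 0 Then Exit For
        'Extract first instance
        tmpArr(i) = Mid(text, StartStr, EndStr - StartStr)
        'Remove instance and save to variable again
        text = Mid(text, EndStr + Len(end_word))
    Next i

    'Transpose so the values are returned vertically
    ExtractText = Application.Transpose(tmpArr)
End Function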
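To check the function without touching the worksheet, you can also call it from a small test macro. The sketch below is an illustration only: the macro name DemoExtractText and the shortened two-record sample string are assumptions, and it relies on ExtractText being present in the same workbook. Because the function transposes its result, the returned array has two dimensions (rows, then one column).

```vba
Sub DemoExtractText()
    'Hypothetical test macro, not part of the original article
    Dim sample As String
    Dim result As Variant
    Dim r As Long

    'Two records in the same shape as the sample text in the question
    sample = "{""date"": 5/7/19 headline:""GE Posts Profit"" source:""CNBC""}" & _
             "{""date"": 5/8/19 headline:""GE Dividend Shrink"" source:""MSN""}"

    'Start string is headline:" and end string is " source
    result = ExtractText(sample, "headline:""", """ source")

    'Print each extracted value to the Immediate window
    For r = LBound(result, 1) To UBound(result, 1)
        Debug.Print result(r, 1)
    Next r
End Sub
```

Run it from the VB Editor with the Immediate window open (CTRL + G) to see the extracted headlines. Including the quotation marks in the start and end strings keeps them out of the extracted text.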
Where do I put the code above?
- Copy code above.
- Go to tab "Developer", click the "Visual Basic" button to open VB Editor.
- Click "Insert" on the menu.
- Click "Module" to insert a module to your workbook.
- Paste code to code module.
- Exit VB Editor and return to Excel.
2 Responses to “Extract text between words [UDF]”
I get an error when I use this code. I followed everything to a T...
"Invalid Name Error"
Bob,
Which VBA row was highlighted?