Author: Oscar Cronquist Article last updated on May 13, 2019

Blake asks:

I have a somewhat related question, if you don't mind:

I have very large amount of text in a single cell, and I would like to extract multiple instances of text that appear between two specific words.

For example, here is the sample text in one cell:

{"date": 5/7/19 headline:"GE Posts Profit" source:"CNBC"}{"date": 5/8/19 headline:"GE Dividend Shrink" source:"MSN"}{"date": 5/9/19 headline:"GE Bankrupt" source:"WSJ"}

This following formula does a good enough job of extracting the first headline:

=MID(C2,SEARCH("headline",C2)+2,SEARCH("source:",C2)-SEARCH("headline",C2)-4)

However it only extracts the first headline and nothing after it.

If possible, I would like to extract all of headlines within the text in that cell, and generate a vertical array of those headlines so that it looks like this:

GE Posts Profit
GE Dividends Shrink
GE Bankrupt

Is this possible?

Thanks very much.

The array formula that I entered in cell range B9:B11 is a user-defined function that I created. It extracts text from cell B3 based on a start and end string specified in cell C5 and C6 respectively.

Before using it you need to copy the VBA code below and paste it to a regular code module, instructions below.

Array formula in cell range B9:B11

=ExtractText(B3,C5,C6)

To enter the array formula select cell range B9:B11. Type the formula and then press and hold CTRL + SHIFT simultaneously, now press Enter once. Release all keys.

The formula bar now shows the formula with a beginning and ending curly bracket telling you that you entered the formula successfully. Don't enter the curly brackets yourself.

User Defined Function Syntax

ExtractText(text, start_word, end_word)

Arguments

text Required. A cell reference to the cell containing the text you want to extract.
start_word Required. The first word you want to search for.
end_word Required. The second word you want to search for. Text between the first and second word will be extracted, even if there are multiple instances.

VBA code

Function ExtractText(text As String, start_word As String, end_word As String)

'Dimension variables and declare data types
Dim tmpArr() As Variant

'Count instances
ccount = UBound(Split(text, start_word)) - 1
ReDim tmpArr(ccount)


'Iterate through text string
For i = 0 To ccount

    'Find start position of instance
    StartStr = InStr(text, start_word) + Len(start_word)
    
    'Find end position of instance
    EndStr = InStr(text, end_word)
    
    'Extract first instance
    tmpArr(i) = Mid(text, StartStr, EndStr - StartStr)
    
    'Remove instance and save to variable again
    text = Mid(text, EndStr + Len(EndStr), Len(text))
    
    
Next i

ExtractText = Application.Transpose(tmpArr)

End Function

Where do I put the code above?

  1. Copy code above.
  2. Go to tab "Developer", click the "Visual Basic" button to open VB Editor.
  3. Click "Insert" on the menu.
  4. Click "Module" to insert a module to your workbook.
  5.  Paste code to code module, see above image.
  6. Exit VB Editor and return to Excel.
Note, make sure you save your workbook with file extension *.xlsm (macro-enabled) in order to keep the code.

Download Excel file through email

Enter your email address to receive the workbook. Note, an email will actually be sent to you.
* You will also get a weekly newsletter, unsubscribe anytime!