Author: Oscar Cronquist Article last updated on April 15, 2021

compare files in two different folders and their sub folders

Today I would like to share a macro that compares the content in two different folders and their subfolders. It compares the file names but not the file sizes.

The macro returns a new worksheet populated with duplicate and unique file names. In this example the picture shows the content in these two folders:

  • c:\temp\
  • c:\temp1\

The file information for duplicate files in the first folder is in columns A to C, the second folder is in columns D to F.

As you can see in the image above, duplicate file names are listed together. Duplicate file names in the same folder tree and the corresponding folder tree are grouped together.

Unique files from both folders are in columns A to C, below the duplicate file list. If you have folders with a lot of files and subfolders the macro may take a while to finish.

What happens when you start the macro?

  1. Select the first folder
  2. Select the second folder
  3. A new sheet is inserted
  4. The file data is saved to the new sheet
  5. Macro ends

Where do I save the code?

  1. Copy the VBA code below
  2. Go to the VB Editor (Alt + F11)
  3. Click "Insert" on the menu
  4. Click "Module"
  5. Paste code to the code window
  6. Return to excel

VBA Code

'Name macro
Sub CompareContentsofTwoFolders()

'Dimension variables and declare data types
Dim pth1 As String, pth2 As String
Dim r1 As Single, r2 As Single
Dim arrd() As Variant
Dim arru() As Variant

'Redimension array variables
ReDim arrd(0 To 5, 0)
ReDim arru(0 To 2, 0)

'Show a dialog box and ask for a path to the first folder
With Application.FileDialog(msoFileDialogFolderPicker)
    .Show
    pth1 = .SelectedItems(1) & "\"
End With

'Show another dialog box and ask for a path to the second folder
With Application.FileDialog(msoFileDialogFolderPicker)
    .Show
    pth2 = .SelectedItems(1) & "\"
End With

'Add a new worksheet to workbook
Sheets.Add

'Save new worksheet to object x
Set x = ActiveSheet

'Disable screen refresh while macro is running to speed things up
Application.ScreenUpdating = False

'Write text to cells and format cells
x.Range("A1") = "Duplicate files"
x.Range("A2") = "Path"
x.Range("B2") = "File name"
x.Range("C2") = "Size"
x.Range("D2") = "Path"
x.Range("E2") = "File name"
x.Range("F2") = "Size"
x.Range("A:F").Font.Bold = False
x.Range("A1:F2").Font.Bold = True

'Start macro Recursive using variable pth1 as a parameter
Recursive pth1

'Find last non-empty value in active worksheet
Lrow = ActiveSheet.Range("A" & Rows.Count).End(xlUp).Row

'Sort cell range
x.Range("A2:C" & Lrow).Sort Key1:=x.Range("B1"), Header:=xlYes

'Save values from cell range A3:C? to array variable arr1
arr1 = x.Range("A3:C" & Lrow).Value

'Clear values in cell range cell range A3:C?
x.Range("A3:C" & Lrow).Clear

'Start macro Recursive with parameter pth2
Recursive pth2

'Find last non empty cell in column A
Lrow = ActiveSheet.Range("A" & Rows.Count).End(xlUp).Row

'Sort cell range "A2:C" & Lrow
x.Range("A2:C" & Lrow).Sort Key1:=x.Range("B1"), Header:=xlYes

'Save values from cell range "A3:C" & Lrow to array variable arr2
arr2 = x.Range("A3:C" & Lrow).Value

'Clear cell range "A3:C" & Lrow from values 
x.Range("A3:C" & Lrow).Clear

'For ... Next statement
For r1 = LBound(arr1, 1) To UBound(arr1, 1)

    'Save boolean value False to variable chk
    chk = False
    
        'Check if variable r1 is larger than 1
        'If ... Then ... End If statement 
        If r1 > 1 Then
            
            'Check if the value in array variable arr1 row r1 and column 2 is equal to value in array variable arr1 row r1 -1 snd column 2
            If arr1(r1, 2) = arr1(r1 - 1, 2) Then
                
                'For ...  Next statement 
                For r3 = UBound(arrd, 2) To LBound(arrd, 2) Step -1

                    'Check if value in array variable arrd row 2 column r3 is not equal to nothing AND array variable arrd row 1 column r3 is not equal to array variable arr1 row r1 column 2
                    'If True stop For  ... Next statement
                    If arrd(2, r3) <> "" And arrd(1, r3) <> arr1(r1, 2) Then Exit For

                    'Check if array variable arrd row 1 column r3 is equal to array variable arr1 row r1 column 2
                    If arrd(1, r3) = arr1(r1, 2) Then

                        'Check if number in variable r3 is equal to the upper bound in array variable arrd
                        'If True add another container to array variable arrd
                        If r3 = UBound(arrd, 2) Then ReDim Preserve arrd(UBound(arrd, 1), UBound(arrd, 2) + 1)

                        'Save value in array variable arr1 row r1 column 1 to array variable arrd row 0 column r3 plus one
                        arrd(0, r3 + 1) = arr1(r1, 1)

                        'Save value in array variable arr1 row r1 column 2 to array variable arrd row 1 column r3 plus one
                        arrd(1, r3 + 1) = arr1(r1, 2)

                        'Save value in array variable arr1 row r1 column 3 to array variable arrd row 2 column r3 plus one
                        arrd(2, r3 + 1) = arr1(r1, 3)

                        'Add another column to array variable arrd
                        ReDim Preserve arrd(UBound(arrd, 1), UBound(arrd, 2) + 1)

                        'Stop For ... Next  statement
                        Exit For
                    End If
                Next r3
                
                'For ... Next statement
                For r3 = UBound(arru, 2) To LBound(arru, 2) Step -1

                    'Check if value in array variable arru row 2 column r3 is not equal to nothing AND array variable arru row 1 column r3 is not equal to array variable arr1 row r1 column 2
                    'If True stop For  ... Next statement
                    If arru(2, r3) <> "" And arru(1, r3) <> arr1(r1, 2) Then Exit For

                    'Check if array variable arru row 1 column r3 is equal to array variable arr1 row r1 column 2
                    If arru(1, r3) = arr1(r1, 2) Then

                        'Check if number in variable r3 is equal to the upper bound in array variable arru
                        'If True add another container to array variable arru
                        If r3 = UBound(arru, 2) Then ReDim Preserve arru(UBound(arru, 1), UBound(arru, 2) + 1)

                        'Save value in array variable arr1 row r1 column 1 to array variable arru row 0 column r3 plus one
                        arru(0, r3 + 1) = arr1(r1, 1)
 
                        'Save value in array variable arr1 row r1 column 2 to array variable arru row 1 column r3 plus one
                        arru(1, r3 + 1) = arr1(r1, 2)

                        'Save value in array variable arr1 row r1 column 3 to array variable arru row 2 column r3 plus one
                        arru(2, r3 + 1) = arr1(r1, 3)

                        'Add another column to array variable arru
                        ReDim Preserve arru(UBound(arru, 1), UBound(arru, 2) + 1)

                        'Stop For ... Next  statement
                        Exit For
                    End If
                Next r3

                'Go to statement
                GoTo jmp
            End If
        End If
    
    'For ... Next statement
    For r2 = LBound(arr2, 1) To UBound(arr2, 1)
                
        'Check if array variable arr2 row r2 column 2 is equal to array variable arr1 row r1 column 2
        If arr2(r2, 2) = arr1(r1, 2) Then
        
            'If ... Then ... Else ... End If statement. Check if variable chk is equal to boolean value False
            If chk = False Then

                'Save values from array variable arr1 row r1 to array variable arrd
                arrd(0, UBound(arrd, 2)) = arr1(r1, 1)
                arrd(1, UBound(arrd, 2)) = arr1(r1, 2)
                arrd(2, UBound(arrd, 2)) = arr1(r1, 3)
            
            'Continue here if variable chk is not False
            Else

                'Save nothing to array variable arrd
                arrd(0, UBound(arrd, 2)) = ""
                arrd(1, UBound(arrd, 2)) = ""
                arrd(2, UBound(arrd, 2)) = ""
            End If
            
            'Save values from array variable arr2 row r2 to array variable arrd
            arrd(3, UBound(arrd, 2)) = arr2(r2, 1)
            arrd(4, UBound(arrd, 2)) = arr2(r2, 2)
            arrd(5, UBound(arrd, 2)) = arr2(r2, 3)
            
            'Clear value in array variable arr2 row r2 column 1
            arr2(r2, 1) = ""
            
            'Add another column to array variable arrd
            ReDim Preserve arrd(UBound(arrd, 1), UBound(arrd, 2) + 1)

            'Save boolean value True to variable chk
            chk = True
            
        End If
        
    Next r2
    
    'Check if variable Chk is equal to boolean value False
    If chk = False Then

            'Save value from array variable arr1 row r1 to array variable arru
            arru(0, UBound(arru, 2)) = arr1(r1, 1)
            arru(1, UBound(arru, 2)) = arr1(r1, 2)
            arru(2, UBound(arru, 2)) = arr1(r1, 3)

            'Add another column to array variable arru
            ReDim Preserve arru(UBound(arru, 1), UBound(arru, 2) + 1)
    End If

'Go to location
jmp:
Next r1

'For ... Next statement
For r2 = LBound(arr2, 1) To UBound(arr2, 1)

    'Check if value in array variable arr2 row r2 column 1 is not equal to nothing
    If arr2(r2, 1) <> "" Then

        'Save values from array variable arr2 row r2 to array variable arru
        arru(0, UBound(arru, 2)) = arr2(r2, 1)
        arru(1, UBound(arru, 2)) = arr2(r2, 2)
        arru(2, UBound(arru, 2)) = arr2(r2, 3)

        'Add another column to array variable arru
        ReDim Preserve arru(UBound(arru, 1), UBound(arru, 2) + 1)
    End If
Next r2

'Save values in array variable arrd to a cell range with the same size as the array and the upper left cell is A3
x.Range("A3").Resize(UBound(arrd, 2) + 1, UBound(arrd, 1) + 1) = Application.Transpose(arrd)

'Save text to column A
x.Range("A" & UBound(arrd, 2) + 3) = "Unique files"
x.Range("A" & UBound(arrd, 2) + 4) = "Path"
x.Range("B" & UBound(arrd, 2) + 4) = "File name"
x.Range("C" & UBound(arrd, 2) + 4) = "Size"
x.Range("A" & UBound(arrd, 2) + 3 & ":C" & UBound(arrd, 2) + 4).Font.Bold = True

'Save values from array variable arru to a cell range
x.Range("A" & UBound(arrd, 2) + 5).Resize(UBound(arru, 2) + 1, UBound(arru, 1) + 1) = Application.Transpose(arru)

'Resize columns A:F widths based on text
x.Columns("A:F").AutoFit

'Show changes on the worksheet
Application.ScreenUpdating = True
End Sub

The following macro is a recursive macro that starts for each subfolder in the given folders. You need both macros to be able to iterate through subfolders.

'Name macro and parameters
Sub Recursive(FolderPath As String)

'Dimension variables and declare data types
Dim Value As String, Folders() As String
Dim Folder As Variant, a As Long

'Resize variable Folders
ReDim Folders(0)

'Check if last two characters in variable FolderPath is equal to \\
'If True stop macro
If Right(FolderPath, 2) = "\\" Then Exit Sub

'Returns a String representing the name of a file, directory, or folder that matches a specified pattern or file attribute, or the volume label of a drive.
Value = Dir(FolderPath, &H1F)

'Do Until ... Loop statement
Do Until Value = ""

    'Check if value is equal to . or ..
    If Value = "." Or Value = ".." Then
    Else

        'Check if value is a folder
        If GetAttr(FolderPath & Value) = 16 Or GetAttr(FolderPath & Value) = 48 Then

            'Save value to array variable Folders
            Folders(UBound(Folders)) = Value

            'Add another container to array variable Folders
            ReDim Preserve Folders(UBound(Folders) + 1)

        'Continue here if Value is not a folder
        Else

            'Go to the last non empty value in column A and then move a cell down
            Lrow = ActiveSheet.Range("A" & Rows.Count).End(xlUp).Row + 1

            'Save folder path to Column A row Lrow
            ActiveSheet.Range("A" & Lrow) = FolderPath

            'Save Value to Column B row Lrow
            ActiveSheet.Range("B" & Lrow) = Value

            'Save file size to Column C row Lrow
            ActiveSheet.Range("C" & Lrow) = FileLen(FolderPath & Value)
        End If
    End If
    Value = Dir
Loop

'Iterate through all folders saved in array variable Folders
For Each Folder In Folders

    'Start macro Recursive with parameter Folder and a "\"
    Recursive FolderPath & Folder & "\"

'Continue with next Folder
Next Folder
End Sub