0

I am using VBA to try and hopefully automate my PowerQuery needs.

I've recorded a macro from the import data from PDF in the data tab.

I have then tried creating a user definable source path within the code in the macro (See below)

Sub PDF_PQ()
'
' PDF_PQ Macro
'
Dim q As WorkbookQuery
Dim fName As String
Dim Dir As String

Dir = Worksheets("Settings").Cells(10, 3).Value
fName = Worksheets("Settings").Cells(11, 3).Value

' Load PDF Data

'
    ActiveWorkbook.Queries.Add Name:="Page002", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Pdf.Tables(File.Contents(Dir & fName), [Implementation=""1.3""])," & Chr(13) & "" & Chr(10) & "    Page1 = Source{[Id=""Page002""]}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Page1,{{""Column1"", type text}, {""Column2"", type text}, {""Co" & _
        "lumn3"", type text}, {""Column4"", type text}, {""Column5"", type text}, {""Column6"", type text}, {""Column7"", type text}, {""Column8"", type text}, {""Column9"", type text}, {""Column10"", type text}, {""Column11"", type text}, {""Column12"", type text}, {""Column13"", type text}, {""Column14"", type text}, {""Column15"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=Page002;Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Page002]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Page002"
        '.Refresh BackgroundQuery:=False

    End With
    
End Sub

The original has the PDF filepath baked into the source section of the code (See Below)

Original
Source = Pdf.Tables(File.Contents(""D:\OneDrive\OneDrive - Bendtech\The Laser Factory\6. General Documents\bus build spec\JN 1393 Braybon Industries VIC Scania SB.pdf"")

My Changes
Source = Pdf.Tables(File.Contents(Dir & fName)

I am trying to make the Source user definable by being able to select a different PDF and import that.

My tries at rectifying the problem at hand has been uneventful and results in an error of not importing the data.

All I have changed from the original code is the Source path and it seems to have broken the entire thing.

enter image description here

As stated above I had only made the change to the source of the PDF Source.

Importing different PDF's all utilise the same layout, column names etc. but I am having trouble importing the query into the worksheet.

I am at a loss to what I have done to the code X-D

1

1 Answer 1

2

You need to fix how you add the path into the formula string.

Something like this:

    Dim q As WorkbookQuery, wb As Workbook
    Dim fName As String, fDir As String
    
    Set wb = ActiveWorkbook
    With wb.Worksheets("Settings")
        fDir = .Cells(10, 3).Value
        fName = .Cells(11, 3).Value
    End With
    
    If Right(fDir, 1) <> "\" Then fDir = fDir & "\" 'ensure ending backslash
    
    'reference the added query    
    Set q = wb.Queries.Add(Name:="Page002", Formula:= _
        "let" & vbCrLf & _
        "    Source = Pdf.Tables(File.Contents(""" & fDir & fName & """), [Implementation=""1.3""])," & vbCrLf & _
        "    Page1 = Source{[Id=""Page002""]}[Data]," & vbCrLf & _
        "    #""Changed Type"" = Table.TransformColumnTypes(Page1,{" & _
        "{""Column1"", type text}, {""Column2"", type text}, " & _
        "{""Column3"", type text}, {""Column4"", type text}, " & _
        "{""Column5"", type text}, {""Column6"", type text}, " & _
        "{""Column7"", type text}, {""Column8"", type text}, " & _
        "{""Column9"", type text}, {""Column10"", type text}, " & _
        "{""Column11"", type text}, {""Column12"", type text}, " & _
        "{""Column13"", type text}, {""Column14"", type text}, " & _
        "{""Column15"", type text}})" & vbCrLf & _
        "in" & vbCrLf & "    #""Changed Type""")

        'etc etc

I would change the variable named Dir to (eg) fDir to avoid any confusion with the built-in Dir() method.

Note there is a built-in constant vbCrLf to represent a carriage return followed by a line feed.

Also worth noting that if you want to apply the same column transformation to all columns you can do that without explicitly listing all of the columns (see the code below).
That way you don't need to change your code to handle different-width tables, and you're closer to a re-useable method to which you can pass a file path, and a couple other parameters such as the table id, and where to show the data.

For example:

Sub Tester()
    Const PDF_PATH = "C:\Temp\Risk_Assessment.pdf"
    Dim i As Long
    
    'clear any existing ListObjects
    For i = Sheet2.ListObjects.Count To 1 Step -1
        Sheet2.ListObjects(i).Delete
    Next i
    
    'pull a couple of tables from the PDF and place them on Sheet2
    PlacePdfTable PDF_PATH, "Table005", Sheet2.Range("A1"), "Table1"                    
    PlacePdfTable PDF_PATH, "Table008", Sheet2.Range("I1"), "Table2"

End Sub


'Read data from table `pdfTableId` in PDF file located at `pdfPath`, and
'  display the data in a Listobject named `tableName` as location `tableLocation`
Sub PlacePdfTable(pdfPath As String, pdfTableId As String, _
                  tableLocation As Range, tableName As String)
    
    Dim q As WorkbookQuery, wb As Workbook
    Dim fName As String, fDir As String
    
    Set wb = tableLocation.Parent.Parent 'the target workbook

    'remove any query with the same name as the one we're creating
    On Error Resume Next  'ignore any error if query doesn't exist
    wb.Queries(pdfTableId).Delete
    On Error GoTo 0       'stop ignoring errors
    
    Set q = wb.Queries.Add(Name:=pdfTableId, Formula:= _
        " let Source = Pdf.Tables(File.Contents(""" & pdfPath & """), [Implementation=""1.3""])," & vbCrLf & _
        "    Page1 = Source{[Id=""" & pdfTableId & """]}[Data]," & vbCrLf & _
        "    TransformedTable = Table.TransformColumnTypes(" & _
        "    Page1,List.Transform(Table.ColumnNames(Page1), each {_, type text}))" & vbCrLf & _
        " in TransformedTable")
    
    With tableLocation.Parent.ListObjects.Add(SourceType:=0, Source:= _
            "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;" & _
            "Location=" & q.Name & ";Extended Properties=""""", _
            Destination:=tableLocation).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [" & q.Name & "]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = tableName
        .Refresh
    End With
End Sub
Sign up to request clarification or add additional context in comments.

1 Comment

Please flag as "Accepted" if this answered your question, to help out anyone coming along later with a similar problem.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.