18

I have been looking to convert a .pptx file to a .pdf file through a Python script for several hours but nothing seems to be working.

What I have tried: 1) this script, which uses win32com.client, and 2) unoconv, but neither works for me.

Problems encountered: The script from the first option throws an error (com_error: (-2147352567, 'Exception occurred.', (0, None, None, None, 0, -2147024894), None)), whereas with the second option Python can't seem to recognize unoconv even after installing it using pip.

I also saw Pandoc recommended, but I don't understand how to use it from Python.

Versions I am using: Python 2.7.9, Windows 8.1
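
The numbers in the error above are signed 32-bit HRESULT values, which are easier to look up in hexadecimal. A minimal sketch for decoding them follows; hresult_hex and win32_error_code are hypothetical helper names, not part of pywin32 or comtypes.

```python
def hresult_hex(code):
    """Return a signed 32-bit HRESULT as an unsigned hex string."""
    return "0x{:08X}".format(code & 0xFFFFFFFF)

def win32_error_code(code):
    """Return the wrapped Win32 error number if the facility is 7 (Win32), else None."""
    unsigned = code & 0xFFFFFFFF
    facility = (unsigned >> 16) & 0x1FFF
    return unsigned & 0xFFFF if facility == 7 else None

for value in (-2147352567, -2147024894):
    print(value, hresult_hex(value), win32_error_code(value))
```

The outer code, 0x80020009, is the generic COM "exception occurred" wrapper. The inner code, 0x80070002, wraps Win32 error 2 (the system cannot find the file specified), which points at the path passed to PowerPoint.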

10
  • 1
    I haven't coded in VBA in several years. I was trying to look through some old code that I had, but I can't find the work that I did accessing the filesystem.
    – AMR
    Commented Jul 18, 2015 at 5:35
  • 1
    Try reasking this question on stack exchange Super User and reframe it as a VBA question. I see more VBA questions over there.
    – AMR
    Commented Jul 18, 2015 at 5:37
  • 1
    Thanks for your suggestions
    – user238469
    Commented Jul 18, 2015 at 5:38
  • 1
    Also try this post. There are a lot of similarities to writing Python code to VBA. You only need to learn a few of the Objects in the Object Model and that shouldn't be more than a few hours if you are already advanced enough to be tackling challenges like this. stackoverflow.com/questions/25526335/…
    – AMR
    Commented Jul 18, 2015 at 5:40
  • 1
    @AMR: I solved it with the help of comtypes and this post.
    – user238469
    Commented Jul 18, 2015 at 6:37

11 Answers

31

I found the answer with the help of this post and the answer from this question.

Note that comtypes is only available for Windows. Other platforms will not support this.

import comtypes.client

def PPTtoPDF(inputFileName, outputFileName, formatType = 32):
    powerpoint = comtypes.client.CreateObject("Powerpoint.Application")
    powerpoint.Visible = 1

    if outputFileName[-3:] != 'pdf':
        outputFileName = outputFileName + ".pdf"
    deck = powerpoint.Presentations.Open(inputFileName)
    deck.SaveAs(outputFileName, formatType) # formatType = 32 for ppt to pdf
    deck.Close()
    powerpoint.Quit()
2
  • 2
    Where does the number 32 come from? Is there a list of formats available somewhere? Commented Sep 10, 2018 at 16:42
  • @OskarPersson The number comes from the PpSaveAsFileType enumeration, the complete list is here: learn.microsoft.com/en-us/office/vba/api/…
    – kibibu
    Commented Feb 25, 2019 at 3:06
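
A few members of that enumeration, written as a Python IntEnum so the code can use a name instead of a bare 32. This is only a sketch: the class is a hypothetical local helper (neither comtypes nor pywin32 ships it), and it lists a small subset of the documented values.

```python
from enum import IntEnum

class PpSaveAsFileType(IntEnum):
    """Subset of PowerPoint's PpSaveAsFileType values (hypothetical local helper)."""
    ppSaveAsDefault = 11
    ppSaveAsJPG = 17
    ppSaveAsPNG = 18
    ppSaveAsOpenXMLPresentation = 24
    ppSaveAsPDF = 32
    ppSaveAsXPS = 33

# Convert to a plain int before handing the value to a COM call
pdf_format = int(PpSaveAsFileType.ppSaveAsPDF)
print(pdf_format)
```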
7

I was working with this solution, but I needed to find every .pptx and .ppt file in a directory tree and convert them all to .pdf (Python 3.7.5). Hope it works...

import os
import win32com.client

ppttoPDF = 32

for root, dirs, files in os.walk(r'your directory here'):
    for f in files:

        if f.endswith(".pptx"):
            try:
                print(f)
                in_file=os.path.join(root,f)
                powerpoint = win32com.client.Dispatch("Powerpoint.Application")
                deck = powerpoint.Presentations.Open(in_file)
                deck.SaveAs(os.path.join(root,f[:-5]), ppttoPDF) # formatType = 32 for ppt to pdf
                deck.Close()
                powerpoint.Quit()
                print('done')
                os.remove(os.path.join(root,f))
                pass
            except:
                print('could not open')
                # os.remove(os.path.join(root,f))
        elif f.endswith(".ppt"):
            try:
                print(f)
                in_file=os.path.join(root,f)
                powerpoint = win32com.client.Dispatch("Powerpoint.Application")
                deck = powerpoint.Presentations.Open(in_file)
                deck.SaveAs(os.path.join(root,f[:-4]), ppttoPDF) # formatType = 32 for ppt to pdf
                deck.Close()
                powerpoint.Quit()
                print('done')
                os.remove(os.path.join(root,f))
                pass
            except:
                print('could not open')
                # os.remove(os.path.join(root,f))
        else:
            pass

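The try/except is there for documents that could not be opened, so the script keeps going until the last file instead of exiting. Note that the script deletes each original file after a successful conversion. I would also recommend handling each format separately: first .pptx and then .ppt (or vice versa).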

1
  • This works; however, this approach creates problems if there is a dot (.) in the file name (file_v_1.3.pptx). The workaround is to rename the file first and then rename it back at the end. Is there a better way of doing this?
    – valenzio
    Commented Oct 28, 2020 at 7:17
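
One way to sidestep the dot problem, as a sketch: build the output name with an explicit .pdf extension instead of slicing the extension off and letting PowerPoint add one. The likely cause (an assumption, not confirmed in the answer) is that a name such as file_v_1.3 already looks as if it has an extension. pdf_path_for is a hypothetical helper, and PureWindowsPath is used only so the example behaves the same on any OS.

```python
from pathlib import PureWindowsPath

def pdf_path_for(source_path):
    """Return source_path with only its final extension replaced by .pdf."""
    return str(PureWindowsPath(source_path).with_suffix(".pdf"))

# Only the last suffix (.pptx or .ppt) is replaced; earlier dots are kept
print(pdf_path_for(r"C:\decks\file_v_1.3.pptx"))
print(pdf_path_for(r"C:\decks\old_deck.ppt"))
```

The result can then be passed to deck.SaveAs in place of os.path.join(root, f[:-5]).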
3

I believe the answer has to be updated because comtypes doesn't work anymore.

So this is the code that works (an updated version of the accepted answer):

import win32com.client

def PPTtoPDF(inputFileName, outputFileName, formatType = 32):
    powerpoint = win32com.client.DispatchEx("Powerpoint.Application")
    powerpoint.Visible = 1

    if outputFileName[-3:] != 'pdf':
        outputFileName = outputFileName + ".pdf"
    deck = powerpoint.Presentations.Open(inputFileName)
    deck.SaveAs(outputFileName, formatType) # formatType = 32 for ppt to pdf
    deck.Close()
    powerpoint.Quit()
1

I needed a way to save a PPTX file as a PDF, both with and without notes. Here is my solution:

from comtypes.client import CreateObject, Constants

def PPTtoPDF(inputFileName, outputFileName, formatType = 32):
    powerpoint = CreateObject('Powerpoint.Application')
    constants = Constants(powerpoint)
    powerpoint.Visible = 1

    if outputFileName[-3:] != 'pdf':
        outputFileName = outputFileName + ".pdf"
    deck = powerpoint.Presentations.Open(inputFileName)
    deck.SaveAs(outputFileName, constants.ppSaveAsPDF) # ppSaveAsPDF = 32
    deck.Close()
    powerpoint.Quit()


def PPTtoPDFNote(inputFileName, outputFileName, formatType = 32):
    powerpoint = CreateObject('Powerpoint.Application')
    constants = Constants(powerpoint)
    powerpoint.Visible = 1

    if outputFileName[-3:] != 'pdf':
        outputFileName = outputFileName + ".pdf"
    deck = powerpoint.Presentations.Open(inputFileName)
    deck.ExportAsFixedFormat(
        outputFileName,
        constants.ppFixedFormatTypePDF,
        constants.ppFixedFormatIntentPrint,
        False, # No frame
        constants.ppPrintHandoutHorizontalFirst,
        constants.ppPrintOutputNotesPages,
        constants.ppPrintAll
    )
    deck.Close()
    powerpoint.Quit()

To use it,

PPTtoPDF(r'.\Test.pptx', r'.\Test.pdf')
PPTtoPDFNote(r'.\Test.pptx', r'.\Test_with_Note.pdf')

Note: It is generally best to do this on Windows by automating PowerPoint itself (here through comtypes), since the conversion then supports whatever formats and features the installed Microsoft PowerPoint supports.

0

unoconv is a great tool for this task, and it is indeed built in Python. Your problem might be related to a recurring issue with the Python interpreter that is set in the main unoconv file after installation.

To run it with python3 interpreter, replace #!/usr/bin/env python with #!/usr/bin/env python3 or #!/usr/bin/python3 in unoconv file (/usr/bin/unoconv).

One-liner:

sudo sed -i -e '1s:#!/usr/bin/env python$:#!/usr/bin/env python3:' /usr/bin/unoconv

You could also symlink /usr/bin/unoconv to /usr/local/bin/unoconv.

0

Have a look at the following snippet. It uses unoconv and works as expected on Ubuntu 20.04.

# requirements
# sudo apt install unoconv
# pip install tqdm
# (glob and subprocess are part of the standard library)
import glob
import subprocess
import tqdm
path = "<INPUT FOLDER>"
extension = "pptx"
files = glob.glob(path + "/**/*.{}".format(extension), recursive=True)
for f in tqdm.tqdm(files):
    # Pass the arguments as a list so file names with spaces or quotes need no escaping
    subprocess.run(["unoconv", "-f", "pdf", f])

The same snippet can be adapted to other format conversions.

Original Snippet

1
  • 1
    I dont seem to get output when running this snippet. Where should I be able to find the created pdf?
    – Ger
    Commented Sep 5, 2021 at 22:16
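
As far as I know, unoconv writes the converted file next to the input file unless an output location is given with -o/--output. A minimal sketch that makes the destination explicit follows; build_unoconv_command and the example paths are hypothetical, and the command is only built here, not run.

```python
import os

def build_unoconv_command(src, out_dir):
    """Build an unoconv argument list that writes <out_dir>/<name>.pdf."""
    base = os.path.splitext(os.path.basename(src))[0]
    out_file = os.path.join(out_dir, base + ".pdf")
    return ["unoconv", "-f", "pdf", "-o", out_file, src]

cmd = build_unoconv_command("/data/slides/talk.pptx", "/data/pdf")
print(cmd)
# To actually convert: subprocess.run(cmd, check=True)
```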
0

To convert .pptx/.docx to PDF in a Google Cloud Function, I referred to this GitHub repo: https://github.com/zdenulo/gcp-docx2pdf/tree/master/cloud_function. It uses the Google Drive API. The repo uses the docx MIME type to convert a .docx file to .pdf through Google Drive; you can use other MIME types as well, such as the one for pptx (see https://developers.google.com/drive/api/v3/mime-types). The rest of the code is the same as in the GitHub repo.
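
For reference, the MIME types involved can be kept in a small lookup table. This only sketches the mapping and does not call the Drive API; MIME_TYPES and mime_type_for are hypothetical names, not part of the linked repo.

```python
import os

# Standard MIME types for the Office Open XML formats and for PDF
MIME_TYPES = {
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".pdf": "application/pdf",
}

def mime_type_for(filename):
    """Look up the MIME type from the file extension (case-insensitive)."""
    ext = os.path.splitext(filename)[1].lower()
    return MIME_TYPES[ext]

print(mime_type_for("Deck.PPTX"))
```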

0

Try this code; it works for me:

import os
import win32com.client as win32
import comtypes

# Make sure COM is initialized (comtypes provides CoInitialize)
comtypes.CoInitialize()


# Absolute path to the input PowerPoint document
input_path = os.path.abspath('path/to/input/document.pptx')

# Absolute path to the output PDF file
output_path = os.path.abspath('path/to/output/document.pdf')

# Open PowerPoint document and convert to PDF
powerpoint = win32.Dispatch('Powerpoint.Application')
presentation = powerpoint.Presentations.Open(input_path)
presentation.SaveAs(output_path , 32)
presentation.Close()
powerpoint.Quit()
0

Here is an improvement on @user238469's answer. This function uses the ExportAsFixedFormat method to save as PDF.

import comtypes.client
from pathlib import Path

def PPTtoPDF(inputFileName, outputFileName):
    powerpoint = comtypes.client.CreateObject("Powerpoint.Application")
    powerpoint.Visible = 1

    if outputFileName[-3:] != 'pdf':
        outputFileName = outputFileName + ".pdf"

    # Remove any existing PDF first; exporting over it raised a COMError
    outputWindowsPath = Path(outputFileName)
    if outputWindowsPath.exists():
        outputWindowsPath.unlink()
    deck = powerpoint.Presentations.Open(inputFileName)
    # 2 = ppFixedFormatTypePDF, 1 = ppFixedFormatIntentScreen, 0 = do not frame slides
    deck.ExportAsFixedFormat(outputFileName, 2, 1, 0)
    deck.Close()
    powerpoint.Quit()

Checking whether the PDF file already exists, and deleting it before saving the new one, avoids the following error:

_ctypes.COMError: (-2147467259, 'Unspecified error', (None, None, None, 0, None))
1
  • In my tests it is also important to specify an absolute path for the files. A relative path like 'test\file.pptx' caused the -2147467259, 'Unspecified error'. Commented Dec 6, 2024 at 22:59
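
Following the comment above, a small sketch that resolves paths before they are handed to PowerPoint. PowerPoint runs as a separate process, so a relative path may not be resolved against the script's working directory (this is an assumption about the cause). to_absolute is a hypothetical helper.

```python
import os

def to_absolute(path):
    """Return an absolute, normalized version of path, based on the current directory."""
    return os.path.abspath(path)

input_path = to_absolute(os.path.join("test", "file.pptx"))
# Derive the PDF name from the absolute input path
output_path = os.path.splitext(input_path)[0] + ".pdf"
print(input_path)
print(output_path)
```

Both values can then be passed to PPTtoPDF as they are.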
0

This works:

import os
import win32com.client
from pywintypes import com_error

def convert_ppt_to_pdf(file_path):
    ppttoPDF = 32

    try:
        powerpoint = win32com.client.Dispatch("Powerpoint.Application")
        deck = powerpoint.Presentations.Open(file_path)

        # Get the base filename without extension
        base_filename = os.path.splitext(file_path)[0]

        # Save as PDF
        pdf_path = base_filename + ".pdf"
        deck.SaveAs(pdf_path, ppttoPDF)

        deck.Close()
        powerpoint.Quit()

        print("Conversion successful: {} -> {}".format(file_path, pdf_path))
    except com_error as e:
        print("Failed to convert: {}".format(file_path))
        print("Error:", str(e))

# Example usage
ppt_file_path = r'C:\Users\usern\Desktop\examplefolder\examplefolder\example.ppt'
#ppt_file_path = r'C:\Users\usern\Desktop\examplefolder\examplefolder\example.pptx'
convert_ppt_to_pdf(ppt_file_path)
0

I have achieved converting PPT to PDF using the Spire.Presentation package. See https://www.e-iceblue.com/Introduce/presentation-for-python.html

You have to pay for it: around $600 for a perpetual license. It relies on some DLL. Spire unfortunately does not work perfectly either. It left out the text box in one slide. Very strange.

I have tried using LibreOffice before, but it had a problem with EMF images.

PowerPoint automation as described in the other answers seems to work, but it requires a running desktop session; i.e., you cannot run it inside a scheduled task.
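
For setups without a desktop session, LibreOffice can be driven from the command line in headless mode. A minimal sketch follows, assuming the soffice executable is on PATH; build_soffice_command and the paths are hypothetical, and rendering problems such as the EMF issue mentioned above would still apply.

```python
def build_soffice_command(src, out_dir):
    """Build a LibreOffice headless command that converts src to PDF in out_dir."""
    return ["soffice", "--headless", "--convert-to", "pdf", "--outdir", out_dir, src]

cmd = build_soffice_command("/data/slides/talk.pptx", "/data/pdf")
print(" ".join(cmd))
# To actually convert: subprocess.run(cmd, check=True)
```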