Three ways to convert HTML to PDF using Microsoft Flow
The need to generate PDFs as part of a business process is a common one, mainly driven by compliance. A common way to do this is to create an HTML file and then convert it to PDF.
<TLDR>Go to the conclusion for a summary of the approaches</TLDR>
In Microsoft Flow, there have been options for doing this for a while now. For a start, there are two 3rd party Flow actions available, each part of a broader suite of tools for managing and manipulating documents. They are Muhimbi PDF and Plumsail Documents.
Both offerings are simple to set up and use. They also allow some configuration and tuning. At the time I wrote this article, both allow you to specify page size and orientation, and Muhimbi has a couple of extras like letting you make password protected PDFs. As they are both commercial tools, they come at a cost which depends on how many PDFs you produce. That said, the cost is not particularly excessive.
Below I show you both actions in use. Each one is followed by an action to save the PDF into my OneDrive – easy peasy…
But I am cheap Paul!
For those of you who do not have a budget, or are simply cheap-assed, there is also the OneDrive Convert File action. This one works by saving a HTML file to OneDrive (or OneDrive for Business). You then pass that file into the Convert File action and save the resulting PDF. So instead of two steps like the ones above, you have three.
Head over to John Liu’s page for a great example of this technique…
Now there is only one teeny problem with this. Not so long ago Microsoft broke it and as I type these lines, it remains broken but with a commitment to get it fixed…
Now by the time you read this it may well be fixed, but you need to be aware of another limitation with this approach. Unlike Plumsail and Muhimbi, this converter does not honour CSS page breaks. Therefore your PDF can end up looking pretty ugly as content spills across pages in awkward places…
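For reference, the page breaks in question are ordinary CSS print rules. Here is a minimal sketch, in Python purely for illustration, of building HTML that asks for a new page before every section after the first. The function name, class name and sample sections are made up for this example, and whether the break is actually honoured depends entirely on the converter you feed the HTML to.

```python
from html import escape


def build_report_html(sections):
    """Return an HTML document with a CSS page break before every section except the first.

    `sections` is a list of (title, body) tuples. All names here are illustrative.
    """
    parts = []
    for index, (title, body) in enumerate(sections):
        # Only sections after the first get the class that forces a new page.
        css_class = ' class="new-page"' if index > 0 else ""
        parts.append(
            f"<section{css_class}><h1>{escape(title)}</h1><p>{escape(body)}</p></section>"
        )
    return (
        "<!DOCTYPE html><html><head><style>"
        # page-break-before is the older CSS print property; break-before is its newer form.
        ".new-page { page-break-before: always; break-before: page; }"
        "</style></head><body>" + "".join(parts) + "</body></html>"
    )


html_doc = build_report_html([("Summary", "First page."), ("Details", "Second page.")])
```

A converter that respects these rules should start "Details" on a fresh page. One that ignores them will simply run the two sections together.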
Is there another option? Why bother?
So you might be thinking, okay so just use one of the commercial offerings while Microsoft sorts out a fix? After all, even if it costs you a few bucks, you can always go back to the cheap version later.
I indeed attempted this but I had an issue with my Flow that precluded it. I had no problem signing up for both Muhimbi and Plumsail, but when I added the actions to my flow, I was met with this type of error. My flow simply did not like using 3rd party connections it seemed.
Unable to process template language expressions in action ‘Convert_HTML_to_PDF’ inputs at line ‘1’ and column ‘2336’: ‘The template language expression ‘json(decodeBase64(triggerOutputs().headers[‘X-MS-APIM-Tokens’]))[‘$connections’][‘shared_muhimbi’][‘connectionId’]’ cannot be evaluated because property ‘shared_muhimbi’ doesn’t exist, available properties are ‘shared_sharepointonline, shared_onedriveforbusiness’. Please see https://aka.ms/logicexpressions for usage details.’.
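If that expression looks like gibberish, my reading of it is this: it base64-decodes the X-MS-APIM-Tokens header from the trigger, parses the result as JSON, then looks up a connection by name under '$connections'. The rough Python equivalent below is only a sketch to show why the lookup fails. The token payload and the connection ID values are invented, and I am assuming the header carries nothing more than what the expression itself implies.

```python
import base64
import json


def connection_id(headers, connection_name):
    """Mimic json(decodeBase64(headers['X-MS-APIM-Tokens']))['$connections'][name]['connectionId']."""
    tokens = json.loads(base64.b64decode(headers["X-MS-APIM-Tokens"]))
    connections = tokens["$connections"]
    if connection_name not in connections:
        available = ", ".join(connections)
        raise KeyError(
            f"property '{connection_name}' doesn't exist, available properties are '{available}'"
        )
    return connections[connection_name]["connectionId"]


# Hypothetical token: it only lists the two connections named in the error message.
sample_tokens = {
    "$connections": {
        "shared_sharepointonline": {"connectionId": "example-sharepoint-id"},
        "shared_onedriveforbusiness": {"connectionId": "example-onedrive-id"},
    }
}
sample_headers = {
    "X-MS-APIM-Tokens": base64.b64encode(
        json.dumps(sample_tokens).encode("utf-8")
    ).decode("ascii")
}
```

Asking this sketch for 'shared_onedriveforbusiness' succeeds, while asking for 'shared_muhimbi' raises an error much like the one above. In other words, the token handed to the flow by its trigger did not include the newly added connection.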
Now this error is the subject of an open case with Microsoft so I will update this post when I get an answer. <update>It turns out that for flows with a PowerApps trigger, you need to disconnect the flow from PowerApps and reconnect it before it starts working.</update> But in the meantime I had a deadline and had to demo PDF creation to a client. So I decided to make an Azure function and call it from Flow. After all, it sounded like a perfect scenario for that technology, right?
Now I won’t cover the Azure function stuff in depth here, except to say I tried a heap of HTML to PDF approaches and not a single one worked properly. Eventually I worked out that Azure functions restrict the use of GDI+ libraries. Quoting from the linked article…
For the sake of radical attack surface area reduction, the sandbox prevents almost all of the Win32k.sys APIs from being called, which practically means that most of User32/GDI32 system calls are blocked. For most applications this is not an issue since most Azure Web Apps do not require access to Windows UI functionality (they are web applications after all).
However one common pattern that is affected is PDF file generation
Eventually though I was able to ascertain that if you provision your Azure functions using an app service plan instead of a consumption plan, it will work. The reason for this is that an app service plan runs on dedicated virtual machines.
Of course now you are up for hosting costs for your app plan. Unless you already have an Azure function app provisioned for other purposes, this is no longer free.
Once I got past the Azure function issues with GDI support, I was easily able to find and use a pre-existing HTML to PDF function found here. This uses a tool called wkhtmltopdf which is a pretty powerful PDF generation library. I simply added the necessary files and configuration and was able to test it successfully in minutes.
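To give a feel for what such a function does under the hood, here is a minimal sketch of driving the wkhtmltopdf command-line tool from Python. This is not the function I used, just an illustration. It assumes the wkhtmltopdf binary is installed and on the PATH, and the helper names are my own. The --page-size and --orientation switches are standard wkhtmltopdf options, and a single dash tells it to read from stdin or write to stdout.

```python
import subprocess


def wkhtmltopdf_command(page_size="A4", orientation="Portrait", executable="wkhtmltopdf"):
    """Build a wkhtmltopdf command that reads HTML from stdin and writes the PDF to stdout."""
    return [
        executable,
        "--page-size", page_size,
        "--orientation", orientation,
        "-",  # input: read the HTML from stdin
        "-",  # output: write the PDF to stdout
    ]


def html_to_pdf(html, **options):
    """Run wkhtmltopdf and return the PDF bytes. Requires the wkhtmltopdf binary on PATH."""
    result = subprocess.run(
        wkhtmltopdf_command(**options),
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError if wkhtmltopdf exits with an error
    )
    return result.stdout
```

Calling html_to_pdf("<h1>Hello</h1>", orientation="Landscape") would hand back the PDF as bytes, ready to return from an HTTP-triggered function.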
Finally all I needed to do to call this function was to create a HTTP action in Flow like so…
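If you want to test the function outside of Flow first, the same call can be sketched in a few lines of Python. Treat everything specific here as an assumption: the URL and key are placeholders, and I am assuming the function accepts the raw HTML as the POST body and returns the PDF bytes, which may not match the function you deploy. The x-functions-key header is one of the standard ways to pass an Azure function key.

```python
import urllib.request


def build_pdf_request(function_url, function_key, html):
    """Build a POST request carrying the HTML to a (hypothetical) HTML to PDF function."""
    return urllib.request.Request(
        function_url,
        data=html.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "text/html; charset=utf-8",
            "x-functions-key": function_key,  # Azure function key passed as a header
        },
    )


def fetch_pdf(request):
    """Send the request and return the response body, assumed to be the PDF bytes."""
    with urllib.request.urlopen(request) as response:
        return response.read()


# Placeholder URL and key, for illustration only.
request = build_pdf_request(
    "https://example.azurewebsites.net/api/HtmlToPdf",
    "hypothetical-key",
    "<h1>Hello</h1>",
)
```

The HTTP action in Flow needs the same ingredients: the method, the function URL, the key and the HTML as the body.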
Yay! I had my PDFs!! Even better, this approach does not have the page break issues that the built-in one does!
Conclusion (and comparison)
So here is a little table that summarises the approaches…
Method | Cost | Page Breaks | Features | Complexity |
OneDrive Convert File Action | Free | No | Basic | Low |
Plumsail HTML to PDF action | Not Free | Yes | Medium | Low |
Muhimbi HTML to PDF action | Not Free | Yes | Medium+ | Low |
Azure function | Not Free [1] | Yes | Advanced | Medium [2] |
[1] You will have to pay for the Azure function app subscription, but many orgs will have one already, so the extra cost might be very low.
[2] I marked this as Medium if you are doing basic stuff, but if you want to do things like set page size and orientation, you have to edit code directly, so it could be classified as High.
Now for my real use-case, I would likely use one of the commercial offerings, but if the organisation was going to do a lot of PDF generation, then the Azure function approach could be quite cost effective. Additionally, expanding the code to deal with additional options might also be justified.
I think the key point is that I was able to quickly work around this issue and deliver good outcomes for my client. So they are not adversely impacted while I wait for the various issues to be resolved.
Thanks for reading
Paul Culmsee