PDF extension - most accurate results

From what I read the options we have as of Lucee 6.2 (Tomcat 9 + Java 21) are:

In your personal experience, which PDF plugin currently produces the most accurate documents, with the best features and the fewest errors?

There is a fork of OpenHTMLToPDF that has been getting some fairly regular updates - https://github.com/openhtmltopdf/openhtmltopdf

At work, we use this almost exclusively as the CSS support is pretty good, and it supports PDFBox 3.x which we also use extensively for PDF generation and manipulation. It’s all Java interop at the end of the day, so there’s some initial legwork, but it’s been a positive experience for us.

Have not used the others much. I know there are folks who use wkHTMLtoPDF with success. I’ve wanted to look into OpenPDF more, but haven’t had much of a need to yet.

There’s also Puppeteer or Playwright paired with Chrome for probably the best modern HTML/CSS support, but I’m not sure how performant that path would be in the end. Depending on the setup, it would be an external process, though Playwright does have a Java version that would allow for more direct integration to a CFML app.

Sounds great. Any instructions on how to install it? Just through Maven?

What bothers me most about these HTML-to-PDF plugins/components is that you lose the paging controls embedded in CFM (header, footer, numbering, etc.), which are in fact the main advantage of using PDF files instead of other document formats. Am I wrong?

You could go through Maven, or look up the library on an online repo like Maven Central, to see what dependencies you’ll need and just download the jars from there and throw them in a location mapped to this.javasettings in Application.cfc.

I wrote an article on getting started with OpenHTMLToPDF a few years ago. It’s slightly dated now that the fork and PDFBox 3.x are a thing, but perhaps still useful for getting started.

Article - HTML to PDF With CFML & OpenHTMLToPDF - A Blog By Tony Junkes
Example repo on GitHub - tonyjunkes/cfml-openhtmltopdf-examples: Code examples for using OpenHTMLToPDF in CFML

Regarding paging controls… That will come down to utilizing some of the print features of CSS like @page, which the abovementioned library does support. I know it also has some custom elements you can throw into the HTML to reuse layouts as headers/footers, but I haven’t used them. And since it’s HTML at the end of the day, you can always toss it into a CFM to leverage includes and the usual dynamic benefits and just output a final blob of HTML to be passed to the renderer.
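
To make the @page idea a bit more concrete, here is a minimal sketch of print CSS that adds a running header and "Page X of Y" numbering through page margin boxes. It is written in Java since it’s all Java interop anyway; the class and method names (`PrintCssSketch`, `wrapWithPaging`) are made up for illustration and are not part of OpenHTMLToPDF. It assumes the title contains no double quotes and that the body is well-formed markup. The resulting string is what you would hand to the renderer (for example via `PdfRendererBuilder.withHtmlContent`), which is left out here so the snippet runs without extra jars.

```java
// Hypothetical helper, not part of any library: wraps body HTML in a
// document whose print CSS supplies a running header and page numbers.
final class PrintCssSketch {

    // Assumption: title has no double quotes, bodyHtml is well-formed markup.
    static String wrapWithPaging(String title, String bodyHtml) {
        String css = String.join("\n",
            "@page {",
            "  size: A4;",
            "  margin: 25mm 20mm;",
            // Margin boxes: text repeated in the top/bottom margin of every page.
            "  @top-center { content: \"" + title + "\"; }",
            "  @bottom-center { content: \"Page \" counter(page) \" of \" counter(pages); }",
            "}");
        return "<html><head><style>\n" + css + "\n</style></head><body>"
            + bodyHtml + "</body></html>";
    }

    public static void main(String[] args) {
        // Prints the final HTML blob that would be passed to the renderer.
        System.out.println(wrapWithPaging("Monthly Report", "<h1>Hello</h1>"));
    }
}
```

In a CFML app the same string could just as well come out of a CFM template; the point is only that header, footer, and numbering live in the CSS rather than in tag attributes.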

Very “hacky” :face_with_spiral_eyes:
I know PDF handling is “not the sexiest thing” in an app server, as the @Zackster once put it so eloquently, but it is the main reason CFML (and mainly Adobe’s CF server) is thriving in government infrastructure, and will continue to do so. I also believe it is the main feature holding Lucee back from taking Adobe’s place as an open source alternative, or at least from taking the lion’s share of the CFML server market.

I just wish we could use the embedded functions in the language itself, not 3rd party extensions with wrappers and API calls.

I don’t know if I’d call it hacky :grin:, but when you don’t have a tag/function to do the work you need, you’ve got to do it yourself. I’ve never used Lucee’s PDF extension, and Adobe’s functionality has historically been terrible for my use cases; though they did show it some love in the latest version.

Native support is always a gray area, especially when considering continued efficiency, maintenance/updates, and demand for the feature to begin with. Personally, I’d rather integrate third-party tools myself that are focused on doing that one thing to the best of their ability. Arguably more work, but then you’re only limited by what the library can do, and not by what the platform is giving you access to. This has long been a gripe of mine with CF, where it builds on “doing all the things” but oftentimes only gets half of it out the door and then leaves it riddled with bugs, or out of date.

I think if you’re going to see any native/extension traction here, you’re going to have to sponsor the work, put in a pull request as volunteered work, or roll your own solution.

I use custom headers/footers and page numbering with wkhtmltopdf. It requires a little JavaScript, but there are examples of how to do it online.

I wrote a basic custom tag wrapper and shared it on GitHub a couple of years ago. It supports HeaderURL & FooterURL. I also added other features, as it seemed that some command line arguments needed to be in an explicit sort order.
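
As a rough illustration of the argument ordering, here is a small Java sketch that assembles a wkhtmltopdf command with the options in a fixed order: global options first, then header/footer options, then the input and output. This is not the custom tag mentioned above; the class name, method name, and the chosen page size and margins are assumptions for the example. The flags themselves (`--page-size`, `--margin-top`, `--margin-bottom`, `--header-html`, `--footer-html`, `--footer-center`) are standard wkhtmltopdf options.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: builds the argument list only, it does not run anything.
final class WkhtmltopdfCommandSketch {

    // headerUrl / footerUrl may be null when no HTML header or footer is wanted.
    static List<String> buildCommand(String inputUrl, String outputPdf,
                                     String headerUrl, String footerUrl) {
        List<String> cmd = new ArrayList<>();
        cmd.add("wkhtmltopdf");
        // Global options first (values here are example assumptions).
        cmd.add("--page-size");     cmd.add("A4");
        cmd.add("--margin-top");    cmd.add("25mm");
        cmd.add("--margin-bottom"); cmd.add("20mm");
        // Header and footer options next.
        if (headerUrl != null) {
            cmd.add("--header-html");
            cmd.add(headerUrl);
        }
        if (footerUrl != null) {
            cmd.add("--footer-html");
            cmd.add(footerUrl);
        } else {
            // Plain-text fallback; [page] and [topage] are filled in by wkhtmltopdf.
            cmd.add("--footer-center");
            cmd.add("Page [page] of [topage]");
        }
        // Input page and output file always come last.
        cmd.add(inputUrl);
        cmd.add(outputPdf);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = buildCommand("report.html", "report.pdf", "header.html", null);
        System.out.println(String.join(" ", cmd));
        // To actually run it, the list could be passed to new ProcessBuilder(cmd).
    }
}
```

Keeping the arguments as a list rather than one concatenated string also sidesteps quoting problems with values that contain spaces, such as the footer text.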

NOTE: WKHTMLTOPDF covers most of what we do, but I recently installed Node.js with Puppeteer to use the built-in screenshot feature, because third-party JavaScript feature checking was keeping some embedded content from rendering at all.

WOW! Thank you @Jamo this is indeed extremely useful, very much appreciated.

Another option: we use Docker for our Lucee stuff, and I have been using this Puppeteer container https://hub.docker.com/r/zenato/puppeteer for a while to render some flexbox-based CSS (bulma.io).