HTML to PDF question

The tag is not supported in my Lucee 5.2.8.50.
Alternatively, I can use cfdocument to generate the PDF document.
What's your suggestion?
Thanks.

See the PDF extension: https://download.lucee.org/
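
For reference, once the PDF extension is installed, a minimal cfdocument call could look roughly like the sketch below. The HTML string and the variable name pdfBytes are placeholders of my own, not something from this thread, and how well your CSS renders will depend on the extension version.

```cfml
<cfscript>
	// Placeholder markup; replace with your own HTML.
	html = "<h1>Hello world</h1><p>Generated with cfdocument.</p>";

	// Render the HTML into a binary PDF held in the variable pdfBytes.
	cfdocument(format="pdf", name="pdfBytes") {
		writeOutput(html);
	}

	// Stream the PDF to the browser.
	cfcontent(type="application/pdf", variable=pdfBytes, reset=true);
</cfscript>
```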

The extension has, if I remember correctly, some issues with CSS and images. We use it to create a lot of documents, and the initial work on layout and fonts was really hard.

I made a note for myself to check both of these for future documents:

Maybe this helps you.

Thanks @Michael_Diederich

We have found the current PDF extension unusable. We have been able to create PDFs from HTML pages by downgrading the PDF extension to version 1.0.0.68 and using Java 8. Hopefully this JIRA project will address some issues with the new Flying Saucer PDF engine: https://luceeserver.atlassian.net/browse/LDEV-2348

I found the best solution to this problem was building an HTML-to-PDF Node.js/Chrome AWS Lambda service and having ColdFusion call it via cfhttp. Even Adobe's cfdocument and PDF solutions never handled CSS and images very well.
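
As a rough illustration of the cfhttp side of such a setup, here is a minimal sketch. The endpoint URL, the JSON request shape (an "html" key) and the assumption that the service replies with raw PDF bytes are all hypothetical; the actual contract depends on how the Lambda is written and exposed.

```cfml
<cfscript>
	// Hypothetical endpoint; the real URL depends on how the Lambda is exposed.
	endpoint = "https://example.com/html-to-pdf";

	// Assumed request contract: a JSON body with an "html" key.
	payload = serializeJSON({ "html": "<h1>Hello world</h1>" });

	cfhttp(url=endpoint, method="post", result="res", getAsBinary="yes", timeout=60) {
		cfhttpparam(type="header", name="Content-Type", value="application/json");
		cfhttpparam(type="body", value=payload);
	}

	// statusCode looks like "200 OK"; anything else is treated as a failure.
	if (left(res.statusCode, 3) != "200") {
		throw(message="PDF service call failed", detail=res.statusCode);
	}

	// Assumes the service answers with the raw PDF bytes.
	cfcontent(type="application/pdf", variable=res.fileContent, reset=true);
</cfscript>
```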

There is Open HTML to PDF: https://github.com/danfickle/openhtmltopdf
It is a pure-Java library using PDFBox. I am currently using PDFBox with Lucee, so it shouldn't be too difficult to use openhtmltopdf with Lucee as well.
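
An untested sketch of what calling it from Lucee might look like is below. The JAR folder, the base URI and the sample markup are made up for illustration. It also assumes your Lucee version lets createObject() take an array of JAR paths and that all of openhtmltopdf's dependencies (PDFBox and friends) sit in that folder.

```cfml
<cfscript>
	// Hypothetical folder holding the openhtmltopdf, PDFBox and dependency JARs.
	jarDir = expandPath("./lib/openhtmltopdf");
	jars = directoryList(jarDir, false, "path", "*.jar");

	// openhtmltopdf expects well-formed XHTML.
	xhtml = "<html><head><title>Test</title></head><body><h1>Hello world</h1></body></html>";

	// Hypothetical base URI used to resolve relative image and CSS links.
	baseUri = "http://localhost/";

	out = createObject("java", "java.io.ByteArrayOutputStream").init();
	builder = createObject("java", "com.openhtmltopdf.pdfboxout.PdfRendererBuilder", jars).init();
	builder.withHtmlContent(xhtml, baseUri);
	builder.toStream(out);
	builder.run(); // renders the PDF into the output stream

	pdfBytes = out.toByteArray();
	cfcontent(type="application/pdf", variable=pdfBytes, reset=true);
</cfscript>
```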

Given that https://github.com/danfickle/openhtmltopdf is based on Flying Saucer (the newer PDF engine in Lucee) and is actively maintained, this sounds like a good idea.

The FS project isn’t so actively maintained https://github.com/flyingsaucerproject/flyingsaucer

Any comments @micstriit ?

Presumably it wouldn’t be so hard to switch, or even to add the option of a third PDF engine?

The Java source code is here: https://github.com/lucee/extension-pdf (it might be a nice project for somebody in the community to pick up).

It has lots of nice things, like support for CSS rem units.

replace flying saucer with active openhtmltopdf fork

https://luceeserver.atlassian.net/browse/LDEV-2968

I have never found a good PDF solution that was easy to implement. So what we do is use a CSS print view and save to PDF from the browser. If you figure out a solution, please let me know.

I’ve used wkhtmltopdf for almost ten years now and never looked back. There is a CFC for it; it is thread-safe and it can generate PDFs with JavaScript and images in them, as long as they’re available over the internet. It renders with WebKit, so you get a browser rendering engine without all the CFDOCUMENT issues regarding CSS, page breaks, etc.

Do you have a link to the CFC project you use?

There are a few CFC projects out there, and I have tried a few wkhtmltopdf Docker containers, but I have yet to find something that renders flexbox CSS well and doesn't require cleaning up PDF files left on the file system.

See the full CFC below. It runs on wkhtmltopdf, available at https://wkhtmltopdf.org/downloads.html

Beware that using headers and footers requires HTML files that are generated at PDF-creation time. These might fill up your temp folder. I usually run a cleanup script every so often to remove them.

component output=false hint="source taken from: https://github.com/LoicMahieu/cf-wkhtmltopdf/blob/master/Wkhtmltopdf.cfc" {

/* examples:
==================== ==================== ====================
.fromString(str[, options])
<cfscript>
	wkhtmltopdf = new Wkhtmltopdf();
	pdf = wkhtmltopdf.fromString('<h1>Hello world</h1>');
</cfscript>
<cfcontent type="application/pdf" variable="#pdf#">
==================== ==================== ====================
.fromURL(urlStr[, options])
<cfscript>
	wkhtmltopdf = new Wkhtmltopdf();
	pdf = wkhtmltopdf.fromURL('http://github.com');
</cfscript>
<cfcontent type="application/pdf" variable="#pdf#">
==================== ==================== ====================
.commandPath:
	Default: wkhtmltopdf Allows you to specify path to the wkhtmltopdf.bin or .exe */

	this.commandPath = 'd:/wkhtmltopdf/bin/wkhtmltopdf.exe';

	public function init() {
		return this;
	}

	public function fromString(required string str, struct options = {}, string charset = 'UTF-8') {
		var tmpFile = _tmpFile();
		var fileContent = '';
		try {
			var args = ['--quiet'];
			args.addAll(_optionsToArray(options));
			args.addAll(['-', tmpFile]);
			var p = _exec(this.commandPath, args);
			var output = p.getOutputStream();
			output.write(str.getBytes(charset));
			output.close();
			p.waitFor();
			if (p.exitValue() != 0) {
				return _handleProcessError(p, this.commandPath, args);
			} else {
				fileContent = fileReadBinary(tmpFile);
			}
		} finally {
			fileDelete(tmpFile);
		}
		return fileContent;
	}

	public function fromURL(required string urlStr, struct options = {}) {
		var tmpFile = _tmpFile();
		var fileContent = '';
		try {
			var args = ['--quiet'];
			args.addAll(_optionsToArray(options));
			args.addAll([urlStr, tmpFile]);
			var p = _exec(this.commandPath, args);
			p.waitFor();
			if (p.exitValue() != 0) {
				return _handleProcessError(p, this.commandPath, args);
			} else {
				fileContent = fileReadBinary(tmpFile);
			}
		} finally {
			fileDelete(tmpFile);
		}
		return fileContent;
	}

	// --- privates
	private function _flattenArray() {
		var res = [];
		var keys = structKeyArray(arguments);
		for (var i = 1; i <= arrayLen(keys); i++) {
			var key = keys[i];
			var val = arguments[key];
			// addAll() returns a boolean, so res must not be reassigned
			res.addAll(val);
		}
		return res;
	}

	private function _optionsToArray(required struct options) {
		var res = [];
		var keys = structKeyArray(options);
		for (var i = 1; i <= arrayLen(keys); i++) {
			var key = keys[i];
			var val = options[key];
			if (len(key) == 1) {
				key = '-' & key;
			} else {
				key = '--' & _dasherize(key);
			}
			if (!isNumeric(val) && isBoolean(val) && val != false) {
				res.add(key);
			} else {
				res.add(key & ' ' & _quote(val));
			}
		}
		return res;
	}

	private function _quote(required string val) {
		// escape embedded double quotes and wrap the value in quotes when it contains a space
		// (skipped only when server.os.name is exactly 'UNIX')
		if (server.os.name != 'UNIX') {
			val = replace(val, '"', '\"', 'all');
			if(val CONTAINS " "){
				return '"#val#"';
			}
		}
		return val;
	}

	private function _dasherize(required string input) {
		return LCase(REReplace(REReplace(input, '\W+', '-', 'all'), '([a-z\d])([A-Z])', '\1-\2', 'all'));
	}

	private function _tmpFile() {
		return getTempFile(getTempDirectory(), 'wkhtmltopdf');
	}

	private function _handleProcessError(required any process, required string command, required array args) {
		var IOUtils = createObject('java', 'org.apache.commons.io.IOUtils');
		var error = process.getErrorStream();
		var fullCommand = _fullCommand(command, args);
		var detail = IOUtils.toString(error);
		var message = [
			'wkhtmltopdf has failed.',
			'Command: `#fullCommand#`',
			'Exit code: `#process.exitValue()#`',
			'Message: `#detail#`'
		];
		throw(message = arrayToList(message, ' - '), detail = detail);
	}

	private function _exec(required string name, required array args) {
		var command = _fullCommand(name, args);
		var runtime = createObject('java', 'java.lang.Runtime').getRuntime();
		var p = runtime.exec(command);
		return p;
	}

	private function _fullCommand(required string name, array args = []) {
		return name & ' ' & arrayToList(args, ' ');
	}

}

I’m keen on using Lambda for exactly this kind of stuff! I just haven’t taken the plunge yet! Good to hear that you’re having success with it.

openhtmltopdf is now OSGi compliant (Lucee uses OSGi).

That sounds cool. I’m not really clear on what OSGi is, though I get that it has to do with the plug-and-play nature of some packages.

Basically, it solves some of the problems with classloader conflicts.

OSGi also provides some extra isolation against vulnerabilities, as described here:

https://luceeserver.atlassian.net/browse/LDEV-1530?focusedCommentId=39488

Right now, I’m just happy that the createObject() function can accept an array of JAR paths. I am sure the OSGi stuff is more powerful than that, but the createObject() function is the most I understand at the moment.

Ben, as Zac says, OSGi has several benefits when loading classes, but the key one for me is version isolation. I blogged about this recently.

Oh, that’s a great blog post!

I just added a link to it from the "Using Java in Lucee" page in the docs.
