CFML or HTML to MS WORD

Hi all,

A customer has asked us to get a really nice report we’ve built in a web app (HTML+CSS) into MS Word. I’ve tried a couple of options, like the usual cfcontent/cfheader trick and variants on it, but to no avail. MS Word just does not produce anything near what we see on screen (or in a PDF, for that matter, created with the wkhtmltopdf library). Moreover, we need to link all CSS, JS and image files with absolute URLs (full domain instead of relative paths), and upon saving, Word insists the document is an HTML file. Strictly speaking it is, of course, since we are only spoofing the application/msword content type.
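
For readers who have not seen it, the cfcontent/cfheader trick mentioned above usually looks roughly like the sketch below. The file name and the HTML are placeholders, not taken from the actual report.

```cfml
<!--- Serve plain HTML, but label it as a Word document so the browser hands it to Word --->
<cfheader name="Content-Disposition" value="attachment; filename=report.doc">
<cfcontent type="application/msword" reset="true">
<cfoutput>
<html>
<head><meta charset="utf-8"><title>Report</title></head>
<body>
	<h1>Report</h1>
	<p>Word opens this as HTML, which is why it offers to save it as a web page.</p>
</body>
</html>
</cfoutput>
```

Because the payload is still HTML, Word renders it with its own HTML import engine, which explains both the layout differences and the need for absolute URLs to external assets.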

So my question is, has anyone created something for MS Word worth sharing? I know there is Apache POI and a Java Lib called docx4j, but maybe someone already has done this and would like to share at least some pointers or gotchas before I venture into those libs?

Spreadsheets and PDFs are easy; MS Word, it seems, is not.

Thanks in advance!

Sebastiaan

Anyone? Or am I unique in the world of CFML to want to do this?

Yeah, it’s not a common ask. I sent out a tweet inviting anyone to come help answer.


I can only chime in and say I use docx4j: I read in a pre-existing Word document template, populate the contents with HTML and re-serve it as a Word doc. Docx4j is a bit of a beast (it is a large lib that takes a while to load into memory on first use, etc.), but it does work. All your CSS would need to be Word-specific though (e.g. padding in cm). The advantage of doing it this way is that you can set up all your classes in the Word template itself (so you know they’ll look right) and just use HTML with the respective classes. CSS3/JS is probably a no-no though…! Happy to share the code if you need it.


I have not done it myself, but my experience in the past has been that HTML to Word has always produced mixed results. Just like formatting HTML emails, I would guess that some types of formatting work better than others.

As for an API, have you checked out: https://cloudconvert.com/


Came here from Brad’s tweet. POI won’t help you convert from HTML to DOC. You have two options from my experience.

1 - Use the jodconverter JAR, which leverages Open Office (if you have it installed on the server) to convert from HTML to DOC.

2 - Use a third-party service like CloudConvert.

Option 1:
Roughly this code:

// Assumes this runs inside a function with inFile and outFile arguments, and that
// OpenOffice is running as a service listening on localhost:8100.
LOCAL.inputFile = createObject("java", "java.io.File").init(arguments.inFile);
LOCAL.outputFile = createObject("java", "java.io.File").init(arguments.outFile);

// Open a socket connection to the OpenOffice service
LOCAL.jodOpenOfficeConnection = createObject("java", "com.artofsolving.jodconverter.openoffice.connection.SocketOpenOfficeConnection").init("localhost", javaCast("int", 8100));
LOCAL.jodOpenOfficeConnection.connect();
LOCAL.success = LOCAL.jodOpenOfficeConnection.isConnected();

// Convert; the target format is inferred from the output file's extension
LOCAL.jodDocumentConverter = createObject("java", "com.artofsolving.jodconverter.openoffice.converter.OpenOfficeDocumentConverter").init(LOCAL.jodOpenOfficeConnection);
LOCAL.jodDocumentConverter.convert(LOCAL.inputFile, LOCAL.outputFile);

LOCAL.jodOpenOfficeConnection.disconnect();

Option 2:




<!--- processObj is the deserialized response from the initial CloudConvert "process" call --->
<cfif structKeyExists(processObj, "url")>

	<cfhttp url="https:#processObj.url#" method="post" multipart="yes">
		<cfhttpparam type="formfield" name="input" value="#Attributes.file_input#">
		<cfif Attributes.file_input EQ "upload">
			<cfhttpparam type="file" name="file" file="#Attributes.file_path#">
		<cfelse>
			<cfhttpparam type="formfield" name="file" value="#Attributes.file_path#">
		</cfif>
		<cfhttpparam type="formfield" name="outputformat" value="#Attributes.output_format#">
		<cfif StructKeyExists(Attributes,"margin_top")>
			<cfhttpparam type="formfield" name="converteroptions[margin_top]" value="#Attributes.margin_top#">
		</cfif>
		<cfif StructKeyExists(Attributes,"footer_html")>
			<cfhttpparam type="formfield" name="converteroptions[footer_html]" value="#Attributes.footer_html#">
		</cfif>
		<cfif StructKeyExists(Attributes,"footer_center")>
			<cfhttpparam type="formfield" name="converteroptions[footer_center]" value="#Attributes.footer_center#">
		</cfif>
	</cfhttp>

	<cfset conversionObj = DeSerializeJSON(CFHTTP.FileContent)>

</cfif>

Let me know if either option might work for you and I can provide more details.


Hi @Redtopia, the formatting of HTML e-mails has never really been an issue for me, just use old nested tables and all’s well :expressionless: :wink:
Do you know if the CloudConvert API retains any of the information sent to it? The Word docs that I need to create from an HTML page contain sensitive information that I’d rather not send abroad. We’re based in the Netherlands.
Thanks for chiming in!

Hi @Webauthor, thanks for these two insightful examples. Both seem doable and both seem easier than the docx4j Java library (notwithstanding the need to learn MS Word CSS…). Both of course have their challenges: the first requires installing Libre Office or Open Office on the servers, and the second probably raises some security and data-privacy issues. I will try to make the second option work for me and see what the end result looks like. I assume that the conversionObj needs to be served with cfcontent and cfheader as application/msword or similar to make the browser open an Office program?

@bdw429s that’s what I tried to tell my customer as well, but to no avail, for now. How would you guys go about such a request?

Re docx4j (and for anyone reading this in the distant future), here’s some starter code (this uses 8.1.6, only tested on Lucee):

<cfcontent reset="true">
<cfprocessingdirective pageEncoding="UTF-8">
<cfoutput>
	<cfsavecontent variable="documentContents">
		<h1>Hi!</h1>
	</cfsavecontent>
</cfoutput>
<cfscript>
	// Parse document contents for any odd chars
	// documentContents = CleanHighAscii(documentContents);

	// Select Template Source
	sourceFile = expandPath('/files/templates/Default4.docx');
	input = createObject("java", "java.io.File").init( sourceFile );

	// Load
	wordPackage = createObject("java","org.docx4j.openpackaging.packages.WordprocessingMLPackage").load( input );
 	factory = createObject("java", "org.docx4j.jaxb.Context").getWmlObjectFactory();

 	// Add HTML as giant chunk
	chunkType = createObject("java", "org.docx4j.openpackaging.parts.WordprocessingML.AltChunkType");
	body = wordPackage.getMainDocumentPart();
	body.addAltChunk( chunkType.Html, charsetDecode(documentContents, "UTF-8") );

	// Create Document ID
	documentUUID = createUUID();
	targetFileName = "doc_#documentUUID#.docx";
	targetFile = getTempDirectory() & targetFileName;

	// Create file output
	outFile = createObject("java", "java.io.File").init( targetFile );
	docx4J = createObject("java", "org.docx4j.Docx4J");
	docx4J.save( wordPackage, outFile );

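	cfheader(name="Content-Disposition", value="attachment; filename=#targetFileName#", charset="utf-8");
	// Pass the path string rather than the java.io.File object, and set the .docx MIME type
	cfcontent(file=targetFile, type="application/vnd.openxmlformats-officedocument.wordprocessingml.document");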
</cfscript>

Hi, @sebgmc, here’s a function you can put in a CFC. The function takes the following params:

key: CloudConvert Key
input_format: html (you can find all the formats here - https://cloudconvert.com/api/v2/convert#convert-formats)
output_format: docx
file_input: upload or url
file_path: path to the file locally (for upload) or url path

It returns a URL to the converted file. You can do something like:

<cfhttp method="get" url="#RETURN_FROM_FUNCTION#" getasbinary="YES" path="#SOMEDIRECTORY#" file="#SOMEFILENAME#"/>

<cffunction name="ConvertFile" returntype="string" access="public">
		<cfargument name="key" type="string"/>
		<cfargument name="input_format" type="string"/>
		<cfargument name="output_format" type="string"/>
		<cfargument name="file_input" type="string"/>
		<cfargument name="file_path" type="string"/>

		<cfscript>
			//Cloudconvert API - https://cloudconvert.com/api/v2/convert#convert-tasks

			//Default options
			LOCAL.converteroptions = {
				"page_orientation": "portrait",
				"no_images": "null",
				"no_background": "null",
				"disable_javascript": "null",
				"javascript_delay": "200",
				"image_dpi": "600",
				"image_quality": "94",
				"zoom": "1.0",
				"grayscale": "null",
				"print_media_type": "null",
				"lowquality": "null",
				"page_size": "letter",
				"page_width": "null",
				"page_height": "null",
				"enable_forms": "null",
				"margin_top": "10",
				"margin_bottom": "10",
				"margin_left": 0,
				"margin_right": 0,
				"header_left": "null",
				"header_center": "null",
				"header_right": "null",
				"header_line": "null",
				"header_html": "null",
				"footer_left": "null",
				"footer_center": "null",
				"footer_right": "null",
				"footer_line": "null",
				"footer_html": "null",
				"disable_smart_shrinking": "null",
				"run_script": "null",
				"command": "null"
			};
			
			//If passing margin or footer, override default options
			if (StructKeyExists(arguments,"margin_top")) LOCAL.converteroptions.margin_top = arguments.margin_top;
			if (StructKeyExists(arguments,"footer_html")) LOCAL.converteroptions.footer_html = arguments.footer_html;
			if (StructKeyExists(arguments,"footer_center")) LOCAL.converteroptions.footer_center = arguments.footer_center;
			
			//Make first call to get a process object (task)
			cfhttp(method="POST", charset="utf-8", url="https://api.cloudconvert.com/process", result="LOCAL.process") {
				cfhttpparam(name="apikey", type="formfield", value=arguments.key);
				cfhttpparam(name="inputformat", type="formfield", value=arguments.input_format);
				cfhttpparam(name="outputformat", type="formfield", value=arguments.output_format);
			}
			LOCAL.processObj = DeSerializeJSON(LOCAL.process.FileContent);
			
			//Override all of default objects with process object
			LOCAL.conversionObj = LOCAL.processObj;
			
			//Convert file
			cfhttp(method="POST", charset="utf-8", url="https:#LOCAL.processObj.url#", result="LOCAL.conversion", multipart="yes") {
				cfhttpparam(name="input", type="formfield", value=arguments.file_input);

				if (arguments.file_input EQ "upload") {
					cfhttpparam(type="file",name="file",file=arguments.file_path);
				} else {
					cfhttpparam(type="formfield",name="file",value=arguments.file_path);
				}
				cfhttpparam(type="formfield",name="outputformat",value=arguments.output_format);
				if (StructKeyExists(arguments,"margin_top")) {
					cfhttpparam(type="formfield",name="converteroptions[margin_top]",value=arguments.margin_top);
				}
				if (StructKeyExists(arguments,"footer_html")) {
					cfhttpparam(type="formfield",name="converteroptions[footer_html]",value=arguments.footer_html);
				}
				if (StructKeyExists(arguments,"footer_center")) {
					cfhttpparam(type="formfield",name="converteroptions[footer_center]",value=arguments.footer_center);
				}
			}

			LOCAL.conversionObj = DeSerializeJSON(LOCAL.conversion.FileContent);

			return "https:#LOCAL.conversionObj.output.url#";

		</cfscript>
		

	</cffunction>

Let me know if you run into any issues.
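
On the earlier question about cfcontent and cfheader: since the function returns a URL rather than the file itself, one way to hand the result to the browser is to download it first and then stream it. The following is only a sketch under assumptions: convertedUrl, downloadDir and fileName are hypothetical names, the URL is a placeholder, and it relies on getTempDirectory() returning a path with a trailing separator.

```cfml
<cfscript>
	// Hypothetical values: convertedUrl would be the string returned by ConvertFile()
	convertedUrl = "https://example.com/converted/report.docx";
	downloadDir  = getTempDirectory();
	fileName     = "report_#createUUID()#.docx";

	// Download the converted document to the server as binary
	cfhttp(method="GET", url=convertedUrl, getasbinary="yes", path=downloadDir, file=fileName);

	// Send it to the browser with the .docx MIME type, then delete the temp copy
	cfheader(name="Content-Disposition", value='attachment; filename="#fileName#"');
	cfcontent(
		type="application/vnd.openxmlformats-officedocument.wordprocessingml.document",
		file=downloadDir & fileName,
		deletefile=true
	);
</cfscript>
```

Because the file is a real .docx at this point, Word should open and save it as a normal document instead of treating it as a web page.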