Loading custom fonts for use with cfimage. Also known as JVM exception during Garbage Collection

I am trying to add some custom fonts (client supplied) to our Lucee application. We need these fonts for use with cfimage (not cfdocument PDF generation). We’ve tried four different approaches with very limited success. Please bear with the length as I try to get all (and only) the pertinent details in.

I’m hoping that someone else has beaten this already. We’ve spent literally man-weeks chasing this down.

ENVIRONMENT
Our environment is Java 7 (or 8) and Tomcat 8 (or 8.5), installed in Azure Web Apps (IIS-based hosting). We have tried all sorts of combinations of Java version and Tomcat/Jetty versions. Lucee 4.5.x (specifically 4.5.5.006). Coldbox 4.1.

APPROACHES
Here are the four methods we’ve tried and the issues that have stopped us. If you know how to get past the roadblocks we’ve hit on ANY of these options, it would be most excellent.

  1. Adding the fonts directly to the Windows fonts directory. The Azure Web App security model does not permit this. This simplest of solutions is a non-starter.

  2. Adding the fonts to the fonts.jar file and modifying the pd4fonts.properties file in the Lucee install (as per the iSummation article "Adding New Font to PDF using CFDOCUMENT in RAILO"). This gave access to the new fonts for cfdocument, but NOT for cfimage.

  3. Adding a JAVA_FONTS environment variable specifying the path to our custom fonts. This did not load the fonts for cfimage. (In retrospect, we didn’t check whether they worked for cfdocument.) We tried a number of different permutations of the setting, and none of them seemed to load. I’m not a Java guru, so I’m definitely leaving the door open to having a syntax error.

  4. CLOSEST TO SUCCESS - We wrote some code in our Coldbox onAppInit to manually find the fonts in the folder, build Java objects, and register the fonts with the graphics engine. This DOES LOAD the fonts, and we CAN use them with cfimage. Unfortunately, some time after successfully generating images, it crashes. More specifically, after a variable length of time (between 20 minutes and a full day) we get a Java exception that crashes the JVM. We can generate thousands of images with the fonts on them, but if we let it sit idle for a longish time, it crashes the next time we try to do anything image related. It feels like the Garbage Collection is crashing rather than the image manipulation threads. I’ll paste some of the log dump here, with the code following it. If we run the same tests without loading the custom fonts (and just use one of the built-in ones), the app will run for days. We cannot find a case of a crash when we only use the built-in fonts.

CRASH LOG
Current thread (0x000000001b5c0000): JavaThread "Java2D Disposer" daemon [_thread_in_native, id=1964, stack (0x000000001fbe0000,0x000000001fce0000)]

siginfo: ExceptionCode=0xc0000005, reading address 0xffffffffc0000022

It comes with the following Java frames, always in the FontStrikeDisposer.

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  sun.java2d.d3d.D3DGraphicsDevice.initD3D()Z+0
j  sun.java2d.d3d.D3DGraphicsDevice.<clinit>()V+4
v  ~StubRoutines::call_stub
j  sun.awt.Win32GraphicsEnvironment.makeScreenDevice(I)Ljava/awt/GraphicsDevice;+9
j  sun.java2d.SunGraphicsEnvironment.getScreenDevices()[Ljava/awt/GraphicsDevice;+30
j  sun.awt.Win32GraphicsEnvironment.getDefaultScreenDevice()Ljava/awt/GraphicsDevice;+1
j  sun.font.StrikeCache.disposeStrike(Lsun/font/FontStrikeDisposer;)V+26
j  sun.font.FontStrikeDisposer.dispose()V+19
j  sun.java2d.Disposer.run()V+26
j  java.lang.Thread.run()V+11
v  ~StubRoutines::call_stub
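
Read from the bottom up, the frames above show the Java2D Disposer thread disposing a font strike, asking for the default screen device, and failing inside Direct3D initialisation. The following is only a sketch for reporting how the JVM is configured on that path. It assumes (nothing in this thread confirms it) that a JVM started with -Djava.awt.headless=true or -Dsun.java2d.d3d=false would not reach initD3D(). The class name Java2DModeReport is made up for illustration.

```java
import java.awt.GraphicsEnvironment;

// Hypothetical helper: reports the two settings that influence whether
// Java 2D may try to initialise a Direct3D screen device on Windows.
final class Java2DModeReport {

    // Builds a one-line summary of the headless flag and the D3D switch.
    static String describe() {
        boolean headless = GraphicsEnvironment.isHeadless();
        String d3d = System.getProperty("sun.java2d.d3d", "(unset)");
        return "headless=" + headless + ", sun.java2d.d3d=" + d3d;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The same two values can be read from CFML through CreateObject('java','java.awt.GraphicsEnvironment').isHeadless() and CreateObject('java','java.lang.System').getProperty('sun.java2d.d3d').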

CODE - in the onAppInit method of our Coldbox application

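// Folder that holds the client-supplied font files
local.p = ExpandPath("./includes/fonts/");
local.d = DirectoryList(local.p,false,"name","*.ttf|*.otf");

// local.loadedFonts and local.fontsToLoad are assumed to be arrays set up earlier (not shown here).
// Queue every font file whose base name is not already loaded.
for(local.l in local.d){
	if(not ArrayFindNoCase(local.loadedFonts,ListFirst(local.l,".")))
		ArrayAppend(local.fontsToLoad,local.l);
}

if(ArrayLen(local.fontsToLoad) gt 0){
	local.ge = CreateObject('java','java.awt.GraphicsEnvironment').getLocalGraphicsEnvironment();
	local.f = StructNew();
	for(local.l in local.fontsToLoad){
		// Open the font file as a stream for Font.createFont()
		local.i = CreateObject('java','java.io.FileInputStream').Init(JavaCast("string",local.p&local.l));
		local.f[local.l] = CreateObject('java','java.awt.Font');
		// TRUETYPE_FONT also covers OpenType (.otf) files; registerFont() returns true on success
		local.x = local.ge.registerFont(local.f[local.l].createFont(local.f[local.l].TRUETYPE_FONT, local.i));
		// createFont() does not close the stream, so close it here
		local.i.close();
	}
}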
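
For comparison, the same registration can be sketched in plain Java. This is a minimal sketch, not the code that was deployed: it assumes the File overload of Font.createFont() is acceptable (so there is no stream to manage), and it is not known whether this changes the crash behaviour. FontFolderLoader and the default folder name are hypothetical.

```java
import java.awt.Font;
import java.awt.FontFormatException;
import java.awt.GraphicsEnvironment;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper mirroring the onAppInit logic in plain Java.
final class FontFolderLoader {

    // True for the two extensions the CFML code filters on (case-insensitive).
    static boolean isFontFile(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        return lower.endsWith(".ttf") || lower.endsWith(".otf");
    }

    // Lists font files directly inside dir (no recursion), sorted by path.
    static List<Path> listFontFiles(Path dir) throws IOException {
        List<Path> found = new ArrayList<Path>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry) && isFontFile(entry.getFileName().toString())) {
                    found.add(entry);
                }
            }
        }
        Collections.sort(found);
        return found;
    }

    // Registers each file with Java 2D and returns how many were accepted.
    static int registerAll(List<Path> fontFiles) throws IOException, FontFormatException {
        GraphicsEnvironment ge = GraphicsEnvironment.getLocalGraphicsEnvironment();
        int registered = 0;
        for (Path fontFile : fontFiles) {
            // TRUETYPE_FONT is also the type used for OpenType files.
            Font font = Font.createFont(Font.TRUETYPE_FONT, fontFile.toFile());
            if (ge.registerFont(font)) {
                registered++;
            }
        }
        return registered;
    }

    public static void main(String[] args) throws Exception {
        // The default folder is a placeholder; pass the real one as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "includes/fonts");
        List<Path> files = listFontFiles(dir);
        System.out.println("Registered " + registerAll(files) + " of " + files.size() + " font files");
    }
}
```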

Please let me know if you have questions or suggestions. I don’t really have enough hair to pull any more out at this point. I don’t even know if I should be chasing code bugs or environment configuration. Maybe I’ll just go buy a caseload of RAM on Amazon and find the flag to turn off GC altogether.

Thanks!

You can upload your own custom image to Azure.

or

Use CentOS or another version of Linux on your own virtual machine.

We moved away from VMs because we are trying to move to PaaS for all the usual reasons of reduced infrastructure maintenance costs, better scalability, etc. By dumping the VM we are no longer maintaining the OS and, we had hoped, the Java/Tomcat layers as well.

I know with Azure we can keep using Web Apps, and build out our own Java image so that the OS is still PaaS, but the Java is under our control. I really hope to avoid doing this for the sake of a handful of custom fonts. It seems like a sledgehammer driving a nail. (I also would need to learn how to add fonts to the Java install.)

Then what you are looking for is more than likely a custom Java POI or other add-on that will be required to do what you want. Fonts are initialized at the OS level in Java.

https://docs.oracle.com/javase/8/docs/technotes/guides/intl/fontconfig.html

Other than running ACF, or writing a custom extension itself, I do not believe it is possible to override a java sandbox from inside the sandbox to access your font.

The kicker is that the 4th method I listed (with source code) is loading a custom font. It does allow me to generate thousands of images with my barcodes and customer font on it. It just crashes every day during GC. (I’m almost tempted to put a nightly “service restart” job in place. Almost.) That tells me that the sandbox is allowing it (or the sandbox security has a big hole).

Through a web.config file, I do have access to the command line used to launch the Java sandbox. Is there a way to specify a fonts directory? For example, this is currently in my web.config file:

    <environmentVariable name="CATALINA_OPTS" value="-XX:MaxPermSize=256m -Xmx1524m -XX:+PrintGCDateStamps -verbose:gc -XX:+PrintGCDetails -Xloggc:garbage-collection.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=2M" />

It seems like I should be able to add a command line flag to redirect the fonts directory to one with my custom fonts in it. I just can’t seem to wrap my head around the syntax. The JAVA_FONTS environment variable made logical sense to me - but I didn’t get it working. Maybe I had the syntax wrong, or maybe it didn’t like just the .ttf file sitting in the directory without being wrapped in some sort of jar with a properties file in it?

Any of this ring any bells for you?

Does the same problem occur if you batch up say 1000 images in a single request, or if you make 1000 individual requests which each generate a single image? 1000 being an arbitrary example…

Your app is taking a user-supplied image and overlaying it with text? The variable length of time before the crash suggests that perhaps cfimage is having problems with a certain variety of JPEG.

Being able to provide a full test case that the Lucee team can just run to reproduce is essential.

Part of the problem is that the test case is so hard to narrow down. We are building an ID pass out of about a dozen different elements. Some PNG, some JPG and some text over top. We build a base cfimage, and then overlay the images and text that apply. Passes get different icons depending on where they are allowed to go, and different text/fonts depending on their name, company, barcode, etc.

There does NOT seem to be a correlation between the number of passes printed and the onset of the error. I can print a batch of 100 passes (each a cfimage made up of 10 images overlaid and 5-10 pieces of text) without error. Then I let it sit idle for a “long” time (usually at least an hour, often 8+ hours). I then try to print another pass, and it appears to trigger a GC that is different from all the others, and crashes the JVM.

Other times, I can print a small handful of passes and wait for the “long time” to go past, and then the crash will appear. Still other times, I can print 600-700 of the exact same passes without error - until I leave it idle.

The only “user supplied” image is the photo, and even that has already been loaded into a cfimage and cropped and re-saved prior to any of this happening. I can reproduce the error on passes that do not have an end user supplied photo at all. This means that I do have a full and complete catalog of all possible images that form the pass.

If I take the text off the pass entirely - the problem goes away. If I leave the text on, but only use the fonts available in java - the problem goes away. (or at least the “long time” becomes a “long enough time that several days passed without error”)

I will try to set up a stripped down test case of this. Given that it takes a day for each iteration to figure out if I’ve reproduced it, it might be a little bit.

The timing isn’t great. We are approaching peak support period for the client (month of September) and then this site gets torn down. I’ve separated all of the pass printing users onto their own JVM so they don’t crash the rest of the organization. We will be deploying this again in another client site shortly, so we are still keen to get to the bottom of it. I might not get test cases until October though.

Is there any possibility the Lucee 5 branch would make this go away? We can’t afford the regression testing time to do this with our current client in production, but I might be able to try that in a test environment.

Sorry for the rambling answer. It’s been a long day.

Oh yeah, it’s definitely worth trying the latest build; that’s usually the best bet when tackling a bug in an old version.

I would suggest letting Java handle your fonts

https://docs.oracle.com/javase/8/docs/technotes/guides/intl/fontconfig.html

Mentions “appendedfontpath”

You should be able to edit the jre/lib/fontconfig.Windows.properties

Help on understanding Java Supported Fonts
http://docs.oracle.com/javase/8/docs/technotes/guides/intl/font.html

You should be able to directly copy your font to the jre location, edit the configuration file and restart tomcat / lucee

The other thing that sticks out is your Java config.

MaxPermSize=256m

You may just be running out of memory during garbage collection.

Set it to at least 512m, I usually set this on most production systems to 1-2 GB or more depending on the application.

I’ve updated the web.config file in Azure so it has this line in it:

    <environmentVariable name="CATALINA_OPTS" value="-Dappendedfontpath=D:\home\site\wwwroot\webapps\ROOT\Application\includes\fonts -XX:MaxPermSize=256m -Xmx1524m -XX:+PrintGCDateStamps -verbose:gc -XX:+PrintGCDetails -Xloggc:garbage-collection.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=2M" />

The directory it points to has one ttf and 2 otf font files in it. These are the same files, in the same location, that I was previously loading with code.

When I run this, no extra fonts are loaded. When I dump the java environment variable “appendedfontpath” I get the correct path back. (I tried both with and without a trailing \). Do I need to do something with properties files, or was java supposed to pick up the ttf files in the directory? (there are other files in that directory too, but only the one ttf)
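
Reading a -D value back only shows that the system property was set, not that Java 2D consulted it. Going by the Oracle font configuration page linked earlier in the thread, appendedfontpath appears to be documented as an entry in the fontconfig properties file rather than as a system property, which could explain the result (treat that reading as an assumption). A minimal sketch for checking both things at once follows; FontFamilyCheck and the placeholder family name are hypothetical.

```java
import java.awt.GraphicsEnvironment;

// Hypothetical helper: shows the property value next to what Java 2D actually sees.
final class FontFamilyCheck {

    // Case-insensitive lookup of one family name in a list of family names.
    static boolean containsFamily(String[] families, String wanted) {
        for (String family : families) {
            if (family.equalsIgnoreCase(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "My Custom Font" is a placeholder; pass the real family name as the first argument.
        String wanted = args.length > 0 ? args[0] : "My Custom Font";
        String[] families = GraphicsEnvironment.getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        System.out.println("appendedfontpath=" + System.getProperty("appendedfontpath"));
        System.out.println(wanted + " visible to Java 2D: " + containsFamily(families, wanted));
    }
}
```

If the property prints correctly but the family is reported as not visible, the setting was stored without influencing font discovery.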

Permgen isn’t likely to be the problem, looking in the Lucee overview, the non-heap never climbs above 10%.
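
The non-heap figure can also be read from inside the JVM. This is a minimal sketch, assuming a HotSpot JVM: on Java 7 the permanent generation is reported as a non-heap pool, while Java 8 replaced it with Metaspace and ignores -XX:MaxPermSize. The class name NonHeapReport is illustrative.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the JVM's non-heap memory pools and their usage.
final class NonHeapReport {

    // One line per non-heap pool: name, bytes used, and max (-1 means no limit is defined).
    static List<String> nonHeapPools() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            lines.add(pool.getName() + ": used=" + usage.getUsed() + " max=" + usage.getMax());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : nonHeapPools()) {
            System.out.println(line);
        }
    }
}
```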

Tried the latest Lucee 5 build instead of Lucee 4.5 and had exactly the same problem. It ran for about 18 hours and then crashed during GC after sitting idle overnight.

Sorry for the long break between updates. We just had a very successful peak period deployment. The server still crashed with these JVM errors a couple times a day - but the customer didn’t notice because it recovered quickly enough, and we deployed a separate server for the small subset of users who caused the problem.

Now that I can breathe again…

We were able to remove the manual loading of the fonts by updating our Azure Application Settings to have the following key/value pair:

JAVA_OPTS
-Dsun.java2d.fontpath=D:\home\site\wwwroot\webapps\ROOT\Application\includes\fonts

This properly loaded the fonts. However, the bug still occurs. After 12-18 hours of testing, the error still comes back.

We built a stripped down test case, but can’t get it to crash. Maybe it doesn’t use enough memory to trigger the GC.

We also came across this article that seems to describe the behaviour we are seeing. But it isn’t in reference to any JVM we’ve tried.

http://www-01.ibm.com/support/docview.wss?uid=nas3SI24800

I’ll update if we narrow it down more. I’m guessing that others are using cfimage to put fonts on graphics without problems?