Best practice for "long running batch processes"

stp1 · December 3, 2024, 12:34am

Hi, I am having difficulty figuring out the best way to use CFML as a “non-webserver language”. Just a regular old language… a scripting language, if you will.

Imagine this scenario: You are building a “photo album” app and you expect thousands upon thousands of photos being uploaded. At some point these will need “processing in the background” to some extent. CFML can handle that. We have cfimage!

To my understanding these are your options:

1. Crank up the CFML timeout and add a scheduled task, so the work is done via the CFML web server. How big should we set the timeout? Like 2 weeks? How would we prevent cascading execution? It can be done, but it gets complex. Are long timeouts desirable? Thoughts?
2. Use CommandBox to run your CFML on the command line as a script. I’ve read it is possible. So we could create a cron job to execute “box scriptname.cfm”, and this can work. But I’ve had some problems with it. I’ve never been able to get the context right… It’s almost like, to do this, you need box to run in the same context as your current app, but box isn’t running your current app; the Lucee instance serving your website is. So now we need a separate Lucee context that duplicates the app just to be able to do this right. Not ideal… but maybe someone out there has pulled it off?
3. Abandon CFML. Use cfexecute to launch “python pyscript.py” and Bob’s yer uncle. Except you need to use cfexecute… not ideal, but it can be done securely when absolutely necessary, as long as it never touches any kind of user input. Or avoid CFML altogether and just build a separate Python app to work alongside your current one. But hey folks… this is a SAD OPTION, because we have to abandon CFML. But I have a saying: if you are not using the right tool, YOU are the tool!

Can someone out there PLEASE explain the best way to do this. Is it possible? Or are we using the wrong tool when it comes to this?

Really curious to know what some of the more experienced devs in this forum do when faced with this.

stp1 · December 4, 2024, 10:50pm

Answering my own question here… using cfthread can get you there. You can create a cfthread that is not affected by requestTimeout, and you can throttle it with action="sleep" so it doesn’t bog your server down during intense workloads.
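A minimal sketch of that idea, not a tested production setup. The names photoIds, photoWorker and processPhoto() are made up for illustration, and the 250 ms pause is an arbitrary value. Whether a thread inherits the request timeout may depend on the engine and version, so the sketch sets a long one inside the thread to be safe.

```cfml
<cfscript>
// Hypothetical work list; a real app would read this from a queue or database.
photoIds = [101, 102, 103];

// Placeholder for the real work, e.g. resizing an image with cfimage.
function processPhoto(required numeric id) {
    writeLog(text="Processed photo #arguments.id#", type="information", file="photoworker");
}

// Start a background thread; custom attributes (ids) are copied into the thread.
thread name="photoWorker" action="run" ids=photoIds {
    // Assumption: set a long timeout (in seconds) in case the engine applies one to the thread.
    setting requestTimeout=86400;
    for (id in attributes.ids) {
        processPhoto(id);
        // Pause between items so other requests get CPU and database time.
        sleep(250);
    }
}
// The page request ends here; the thread keeps working in the background.
</cfscript>
```

Inside the thread, sleep(250) does the same job as the tag form with action="sleep" and a duration.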

kenricashe · December 7, 2024, 5:31pm

Can’t exactly put my finger on it at the moment, but I fear the use of threads for that would be problematic. They are more difficult to monitor and could run indefinitely?

For background processes I break up long jobs into smaller pieces from a queue, using cron jobs to periodically run each batch. Generally each batch completes in less than a minute, but I also place a lock on the script (stored in the database although you could also use the file system) to prevent multiple concurrent runs. Then you also need to monitor in case any batch has been locked for too long which indicates an issue such as server was unexpectedly rebooted mid-batch.
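A rough sketch of the "one batch per cron run, skip if already running" part. Note that it swaps in a named cflock for the database lock described here, so it only protects against concurrent runs inside one server instance, and it does not cover the stale-lock monitoring. fetchPendingJobs(), runJob() and the batch size are placeholders.

```cfml
<cfscript>
batchSize = 50; // hypothetical number of queue items per run

// Placeholders for the app's own queue code.
function fetchPendingJobs(required numeric max) {
    return []; // a real version would select up to max unprocessed rows
}
function runJob(required any job) {
    // process a single queue item here
}

// Named exclusive lock: if another run still holds it after 1 second,
// throwOnTimeout=false skips the block instead of raising an error.
lock name="batchRunner" type="exclusive" timeout="1" throwOnTimeout="false" {
    jobs = fetchPendingJobs(batchSize);
    for (job in jobs) {
        runJob(job);
    }
}
</cfscript>
```

A lock row in the database has the advantage that it survives across servers and leaves a timestamp behind that a monitor can check.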

stp1 · December 9, 2024, 5:30pm

Hi, thank you so much for answering my post. I understand the technique you are suggesting; I have actually done this for many years. But my new problem is that I have reached the threshold of what can be achieved with these 1-minute processes and a queue.

A new project I’m taking on is going to require much more intense processing. It will hammer the database hard enough to slow down the app unless I introduce some kind of throttling mechanism, so it doesn’t bog down the other processes on the server.

When building this queuing system, if you’re anything like me, you start to wonder: shouldn’t I just do this in Python? I suggest Python because it can run “indefinitely” like a real scripting language, and you can introduce “sleep” statements to throttle it.

I had to build a Mailchimp clone once, and it needed to send hundreds of thousands of messages. Doing the queue didn’t work. I didn’t know about cfthread at the time. So after many attempts to do it in CFML, I had to do it with Python, because with CFML and the queue technique it would take days to send out the messages instead of just hours.

So python worked brilliantly as my sending engine.

But now I am working for a new company that does not want to embrace another language in the tech stack. I’m stuck with cfml… I’m going to try cfthread for the first time in the coming weeks. I will update this post if I hit a major hurdle with it.

Your advice is not falling on deaf ears. I am going to have to create a “subsystem” to manage the threads, just as the queue technique requires managing the queue and building a subsystem to prevent concurrent runs.

Please check back here and I’ll report my progress.

And for anyone else stumbling upon this post: PLEASE share how YOU solve this issue, as I want to know if I’m missing a technique I should be considering.

Regards,

STP

psarin · December 10, 2024, 5:57pm

Tried runAsync?

stp1 · December 10, 2024, 10:33pm

Thank you for showing this option. I feel like cfthread is maybe a better option, but I’m not 100% certain. Can you (or anyone out there) tell me the performance implications and the pros and cons of using runAsync vs cfthread?

From asking ChatGPT:

When to Use Each

Use cfthread:

If you need precise control over thread behavior.
For advanced scenarios like thread pools, complex dependencies, or thread-specific state management.

Use runAsync():

When you want simplicity and don’t need fine-grained control.
For straightforward tasks where the goal is just to run some code asynchronously and get a result.

So based on that, I will try cfthread for my new project.
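For comparison, here is the same trivial task written both ways. This is only a sketch of the two syntaxes, assuming an engine version that ships runAsync(); it says nothing about performance. The thread name calc is arbitrary.

```cfml
<cfscript>
// runAsync(): returns a Future; then() chains a follow-up step, get() waits for the value.
future = runAsync(function() {
    return 21 * 2;
}).then(function(result) {
    return "runAsync result: " & result;
});
writeOutput(future.get() & "<br>");

// cfthread: the thread stores its result in its own thread scope,
// and the page joins the thread before reading it.
thread name="calc" action="run" {
    thread.result = 21 * 2;
}
thread action="join" name="calc";
writeOutput("cfthread result: " & cfthread.calc.result);
</cfscript>
```

Both versions wait for the result here. For a long-running background job you would normally not call get() or join in the request that starts the work.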

martin · December 11, 2024, 4:09pm

We make use of cfthread to run a queue of jobs used to handle background processes in applications.

We have a singleton component that is responsible for managing the queue of jobs and the thread that processes them. When a ‘job’ gets added it is stored in a database. This contains the name of a handler/component to use for the job and any arguments to pass in. If our background thread is not currently running, it gets started.

Within the thread, it loops over any incomplete jobs and processes each one in turn. If it runs out of jobs, the thread is allowed to ‘complete’ and stops. In this way, the thread is only running when there are jobs that need to be done.

The thread itself is set to have a very large request timeout:

setting requestTimeout=2147483647; /* https://groups.google.com/forum/#!topic/lucee/Xxrb9fTVPAk */

This has worked well for us handling some jobs that take over an hour to complete (processing many large Excel files and large .csv data imports). The queuing prevents the server from being overloaded with these jobs.

It can be complicated debugging the jobs that run in the thread. We have plenty of logging for each one.
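A simplified sketch of how such a queue component could look. This is an illustration under assumptions, not the actual implementation: the component name, the lock name, and the convention that each handler component exposes a run() method are invented, and an in-memory array stands in for the database table.

```cfml
// JobQueue.cfc: hypothetical singleton, e.g. application.jobQueue = new JobQueue()
component {

    variables.jobs = [];       // stand-in for the database table of jobs
    variables.running = false; // true while a worker thread is alive

    // Store the job and start the worker thread if it is not already running.
    public void function addJob(required string handler, struct args = {}) {
        var startNow = false;
        lock name="jobQueueState" type="exclusive" timeout="10" {
            arrayAppend(variables.jobs, { handler: arguments.handler, args: arguments.args });
            if (!variables.running) {
                variables.running = true;
                startNow = true;
            }
        }
        if (startNow) {
            startWorker();
        }
    }

    private void function startWorker() {
        thread name="jobQueueWorker-#createUUID()#" action="run" {
            setting requestTimeout=2147483647;
            while (true) {
                job = {};
                // Take the next job, or mark the worker as stopped, under the same lock
                // that addJob() uses, so a job cannot be added unnoticed in between.
                lock name="jobQueueState" type="exclusive" timeout="10" {
                    if (arrayLen(variables.jobs)) {
                        job = variables.jobs[1];
                        arrayDeleteAt(variables.jobs, 1);
                    } else {
                        variables.running = false;
                    }
                }
                if (structIsEmpty(job)) {
                    break; // queue is empty, let the thread complete
                }
                try {
                    // Assumption: every handler component has a run() method.
                    invoke(createObject("component", job.handler), "run", job.args);
                } catch (any e) {
                    writeLog(text="Job #job.handler# failed: #e.message#", type="error", file="jobqueue");
                }
            }
        }
    }
}
```

Keeping the jobs in a database instead of memory, as described in this post, means unfinished jobs survive a server restart.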

kenricashe · December 17, 2024, 12:09pm

As my business grows with more email marketing clients, I’ll have to deal with exponentially more outbound emails myself some day, so I am curious to know, why did it not work?

I was thinking it could have a crazy long requestTimeout, and sleep() can be emulated with a function that uses while and getTickCount(), but I’m guessing those aren’t as performant?

Were there other reasons?

OTOH for email deliverability, I read years ago that Yahoo was more likely to reject as spam when it receives 50+ emails per minute from the same IP, so I set 49 per minute as the limit. Thus my once per minute queue works great for that, though I do hope it’s not an issue anymore as long as the volume ramps up slowly and there is a high percentage of subscriber engagement.
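On the sleep() point: CFML has a built-in sleep() function that takes a duration in milliseconds, so a while/getTickCount() loop is not needed. A busy-wait loop keeps a CPU core spinning, while sleep() just parks the thread. A small sketch of pacing sends to the 49 per minute cap; sendNextMessage() is a placeholder and the loop count of 5 is arbitrary.

```cfml
<cfscript>
perMinute = 49;                       // per-minute cap mentioned in this post
pauseMs = int(60000 / perMinute);     // milliseconds to wait between sends

// Placeholder for the real mail-sending code.
function sendNextMessage() {
    // cfmail or an API call would go here
}

for (i = 1; i <= 5; i++) {
    sendNextMessage();
    sleep(pauseMs); // built-in pause; does not burn CPU like a while/getTickCount() loop
}
</cfscript>
```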

stp1 · December 22, 2024, 12:28am

I think at the time, it was hard for me to increase the requestTimeout. Like adding it as a command in the script didn’t seem to work. Not sure why. It was overridden by something in Lucee admin. And I didn’t know how to get around it.

And I didn’t like how I had to increase requestTimeout for all the other pages in my app to make this one work. We did it anyway, but in the end, it still could not finish all the work before the long time out. I guess we didn’t make it long enough.

From my research, using threads doesn’t seem that bad. The issue is that you need to code some “infrastructure” around them.

For example, you need to build a page to detect what threads are running, list them, and then give you the option to kill them. Or check on them to see if they are still running.

Also you need logging so you can debug the thread etc…

So it’s not as simple as kicking off a thread. Sounds like you need to build all the code around it to manage and debug the thread.

It’s more work, but in the end, the big bonus is that you can have CFML run indefinitely, without timeouts affecting you. This is very powerful, and it’s what I need to accomplish my goals.

I haven’t yet begun work on my next project that will utilize this. But I do plan to build out all the infrastructure needed to manage the threads and keep tabs on them.

I’ll update this post once I’ve got it built out and report if I hit any issues.