Reading a large file from the end backwards


#1

is there an efficient way to read a file line by line, but starting from the end?

my use case is grabbing just the most recent entries from Lucee log files to show
a current server status page in the Log Analyzer plugin, with a refresh option which shows
what has happened since the status page was last loaded

is cflooping over the file the most memory efficient approach, or does log4j etc have
something to offer here?


#2

I don’t believe this is possible without getting your hands “dirty”.

Now, a way to approach the problem. Depending on the file size, you may place the contents on ram:// and do your work there. This is going to be very fast. If the file is large, however, your code may consume all your RAM, which you don’t want. In that case, you can dump the file content into a temp SQL table and just reverse the order with a simple query.

Hope this helps, or at least gives a first direction.
George


#3

Check out the tail command in CommandBox. I did this with a random access file object which is pretty easy to use. It doesn’t load the entire file into memory.
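
For anyone who wants to see the idea, here is a minimal Java sketch of reading a file backwards with `java.io.RandomAccessFile`. It is not the CommandBox implementation; the class and method names (`ReverseLineReader`, `lastLines`) are made up for illustration, and it assumes UTF-8 log files and simply skips blank lines.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucee or CommandBox.
class ReverseLineReader {

    /** Returns up to maxLines non-blank lines from the end of the file, newest first. */
    static List<String> lastLines(String path, int maxLines) throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long pos = file.length();
            // Bytes of the line currently being collected, in reverse order.
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            while (pos > 0 && lines.size() < maxLines) {
                // Step back one chunk at a time so only 8 KB is held in memory.
                int len = (int) Math.min(chunk.length, pos);
                pos -= len;
                file.seek(pos);
                file.readFully(chunk, 0, len);
                for (int i = len - 1; i >= 0 && lines.size() < maxLines; i--) {
                    byte b = chunk[i];
                    if (b == '\n') {
                        flush(buf, lines);
                    } else if (b != '\r') {
                        buf.write(b);
                    }
                }
            }
            // Reached the start of the file: whatever is buffered is the first line.
            if (lines.size() < maxLines) {
                flush(buf, lines);
            }
        }
        return lines;
    }

    // Reverses the buffered bytes back into reading order and stores the line.
    private static void flush(ByteArrayOutputStream buf, List<String> lines) {
        if (buf.size() == 0) {
            return; // blank line or trailing newline at end of file
        }
        byte[] bytes = buf.toByteArray();
        for (int i = 0, j = bytes.length - 1; i < j; i++, j--) {
            byte t = bytes[i];
            bytes[i] = bytes[j];
            bytes[j] = t;
        }
        lines.add(new String(bytes, StandardCharsets.UTF_8));
        buf.reset();
    }
}
```

Note that this returns physical lines, so a log entry with a multi-line stack trace would come back as several lines that still need to be stitched together.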


#4

See


#5

this is great, i have some nice stuff happening with the log analyzer plugin :slight_smile:


#6

I figured that you might want to update the plugin if you have time for that :wink:


#7

so i’d like to avoid using listToArray :slight_smile: what’s the best way to parse this

"INFO","XNIO-2 task-1","03/31/2018","01:03:13","Instrumentation","found [lucee.runtime.instrumentation.ExternalAgent] in ClassLoader [sun.misc.Launcher$AppClassLoader@5c647e05]" ?


#8

Well, you can’t really use list functions even if you want to, because the “message” might contain the delimiter "," so you have to parse it as CSV.

I actually wrote an implementation once but I’ll have to look it up.
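
In the meantime, a rough sketch of what such a parser could look like in Java. It is not the implementation mentioned above; `LogLineParser` and `parseCsvLine` are hypothetical names, and it assumes a quote inside a quoted field is escaped by doubling it (the usual CSV convention), which may or may not match what the Lucee log layout writes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class LogLineParser {

    /** Splits one CSV record into fields, honouring double-quoted fields. */
    static List<String> parseCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote = literal quote (assumption)
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas inside quotes are plain text
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }
}
```

Because characters inside quotes are appended as-is, a message containing line breaks would also survive, provided the whole record is passed in as one string.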


#9

basically, for each row you can trust the first 5 fields to be well formed, and then you’re looking for a single double quote at the start of a new line?


#10

I’d just use OpenCSV
http://opencsv.sourceforge.net/

You can use the CSVReader interface if it’s coming from a file, or use CSVParser directly on strings.

Or you can write yet another parser :slight_smile:


#11

That “trust” relies on the assertion that the user kept the default layout in the log settings, and did not use a custom pattern or layout class.

If that’s the case (which I would argue it is not, based on the above), then you can use the built-in functions listFirst(logline, ",", true, 5) and listRest(logline, ",", true, 6) to separate the “metadata” from the “message”.
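
For comparison, the same split can be sketched in plain Java by cutting at the fifth comma. This is only a rough analogue of the idea, not how the list functions are implemented; `LogLineSplitter` and `splitMetadata` are hypothetical names, and it relies on the same assumption that none of the first five fields contains a comma.

```java
// Hypothetical helper for illustration only.
class LogLineSplitter {

    /**
     * Splits a log line after the first metaFields comma-separated fields.
     * Returns { metadata, message }; the message is empty if the line is too short.
     */
    static String[] splitMetadata(String line, int metaFields) {
        int idx = -1;
        for (int n = 0; n < metaFields; n++) {
            idx = line.indexOf(',', idx + 1); // position of the next delimiter
            if (idx < 0) {
                return new String[] { line, "" };
            }
        }
        // Everything after the last counted comma is the message, commas included.
        return new String[] { line.substring(0, idx), line.substring(idx + 1) };
    }
}
```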


#12

I’m just going to grab the first chunk up to (i.e the start of the stack) and then do pretty much that

has anyone complained about this old code breaking?

I’m making pretty good progress, even extracting out the cfml stack trace, with click to expand

left on my todo list is some graphic design love and then using the tail function,
showing a summary page which tails all the current logs and shows what’s happened
recently, with filters by severity / log type, with a polling update which just grabs any
changes since the page was displayed / last poll


#14

first alpha of the log analyzer extension version 3.0.0 is up, if you want to try it out,
you can grab the .lex file from here, it’s using this new approach of reading files in reverse

when you click refresh, it just reads the log file in reverse returning any entries since you
last displayed the log file and dynamically inserts them into the page.
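
A simpler way to get "what's new since the last poll", for anyone building something similar, is to remember the byte offset reached on the previous read. To be clear, this is a different technique from the reverse read the extension uses, and the sketch below is only an illustration: `LogPoller` and `readNew` are hypothetical names, it assumes UTF-8, and it assumes the new content between polls fits comfortably in memory.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not what the extension does.
class LogPoller {
    private long offset = 0; // byte position reached by the previous poll

    /** Returns everything appended to the file since the previous call. */
    String readNew(String path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long length = file.length();
            if (length < offset) {
                offset = 0; // file got shorter: assume it was rotated or truncated
            }
            byte[] data = new byte[(int) (length - offset)];
            file.seek(offset);
            file.readFully(data);
            offset = length;
            return new String(data, StandardCharsets.UTF_8);
        }
    }
}
```

One caveat: if a poll lands while a line is still being written, the returned text can end in a partial line, so a caller would want to hold back anything after the last newline.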

I’m keen to know if the log parsing falls over on different log files, please test!


#15

latest improvements

  • the screenshot is the opening page, instant latest logs gratification
  • extracts cfc|cfm|lucee paths, click to search for related logs
  • simply select any text to auto search for other logs with that text
  • auto refresh options


#16

@isapir is there a way to flush the logs if stream timeout isn’t 0?