Thursday, April 7, 2011

Visualizing a Log's Timeline (without going blind reading text)

How to Visualize a Log Timeline

Have you ever tried to read a massive log file? It can be difficult. Logs from multithreaded servers (or worse, clusters) can be miserable to work with. Sometimes a little visualization can help.

Log reading in small single threaded applications is easy. You just read through the log until you see the ERROR, then back up from there to see what went wrong. But what if your server is multithreaded? Then it gets a little more difficult. Or if you're dealing with distributed components like messaging servers and users of those servers. Things can also be ugly there.

Sometimes it's just fun to visualize data, looking for patterns. So how can you do this easily? I like to use R.

Let's take an example. Say you're working with JBoss AS 7, and your log looks something like this:

2011-03-20 21:22:38,854 DEBUG [org.jboss.logging.Log4jService] Installed System.out adapter
2011-03-20 21:22:38,855 DEBUG [org.jboss.logging.Log4jService] Installed System.err adapter
2011-03-20 21:22:38,859 DEBUG [org.jboss.logging.Log4jService] Added ThrowableListener: org.jboss.logging.Log4jService$ThrowableListenerLoggingAdapter@1815338

Pretty dry, right? Working with this log, I'd probably go about visualizing it by computing 2 fields and extracting one for convenience:

  • A timestamp field, formatted so it's easily sortable. You probably could do this in R, or even with shell script utilities, but I took the easy road for this one and used python to work my log for me.

  • A numeric value that maps to the log severity. 0 for 'TRACE', 1 for 'DEBUG', and so on for INFO, WARN and ERROR.

  • For readability, I also extracted the text value of the log level. Shame on me for carrying duplicate data, but it makes the extract file more convient to read.

So the first 4 lines of my extract file look like this:

TimeStamp LogCode LogLevel
76958854 2 DEBUG
76958855 2 DEBUG
76958859 2 DEBUG

The header line is necessary for the R script. I'm sure you would've figured this out, but the first field was calculated by taking the 'hours' value (21 in the first line above) and multiplying it by 60 * 60 * 1000. That was added to the 'minutes' value (22 above) by 60 * 1000. That was added to 38 * 1000, and that was added to 854. As I said above, python works great for that. If you're able to do that quickly in a bash script, my hat's off to you.

But I digress. So we've got our extract file, now to visualize it. Just run it past an R script that reads something like this:

logData <- read.table("my3FiledExtractFile.txt", header=TRUE)
summary (logData)
png("Graph.png", res=200, height=1200, width=1200)
plot (jitter(logData$TimeStamp), logData$LogLevel,
ylab="Log Level", yaxt='n')
axis(2, c(1, 2, 3, 4, 5))

First, the summary() output will provide you with some interesting statistics:

summary (logData)
TimeStamp LogCode LogLevel
Min. :76958854 Min. :2.000 DEBUG:31126
1st Qu.:76997498 1st Qu.:2.000 ERROR: 61
Median :77043075 Median :2.000 INFO : 647
Mean :77096907 Mean :2.027 WARN : 7
3rd Qu.:77236122 3rd Qu.:2.000
Max. :77320903 Max. :5.000

So in an instant we can see what's in this log we're dealing with. But that's not the visualization we're looking for, that's what we see below.

Isn't that cool? We can see we start out with a bunch of DEBUG messages (the heavy line at the bottom), then get our first WARN (the lonely first dot on level 3) , a bunch of more DEBUGs, a bunch of WARNs and then our first ERROR. The timeline reads from left to right, the error levels progress upward for the various levels.

I think data visualization is cool, and I intend to learn more about it to help me draw information from raw data sources. To that end, I've been working with Packt's "R Graphs Cookbook", and will provide a book review here soon.

Happy Visualizing!

1 comment:

F said...

chainsaw + log4j