Plotting bandwidth usage from Apache access logs on a Splunk time chart

This is just a placeholder for a query I don’t want to lose…

The Apache access log does not conform very well to Splunk’s field definitions. I didn’t want to change the way the Apache server logs the information, nor did I want to amend the Splunk extraction, so I had to come up with a query that could pull the bytes portion out of the Apache access logs. All I was after was how many GB of data our Apache server served for static resources such as JavaScript and CSS files, so what I needed was the last portion (the byte size) of each line logged by the Apache server.

The simplest way is to extract the byte size with a regex that captures the number at the end of each line: (?<byte_size>\d+$)

Implementation would look like this:
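A sketch of the full search, pieced together from the stages explained below. It assumes the search is already scoped to the Apache access log events (add your own index or sourcetype in front; none is specified here) and that the byte size really is the last token on each line.

```
GET 200 (.js OR .css)
| rex "(?<byte_size>\d+$)"
| bucket _time span=1h
| timechart sum(byte_size) as "Bandwidth (GB)"
| eval "Bandwidth (GB)"=round('Bandwidth (GB)'/1024/1024/1024,2)
```

Note that timechart also accepts span=1h directly, so the separate bucket step could be folded into it if you prefer a shorter pipeline.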

Explanation:

 

GET 200 (.js OR .css) – get all log lines that include the word GET (someone is pulling data), with response code 200 (successful download), for either a .js or a .css file

| rex "(?<byte_size>\d+$)" – take the last number at the end of the line (that’s our byte size)

| bucket _time span=1h – group events into 1-hour buckets for plotting on the time chart

| timechart sum(byte_size) as "Bandwidth (GB)" – create the time chart and plot the summed bytes under the name Bandwidth (GB)

| eval "Bandwidth (GB)"=round('Bandwidth (GB)'/1024/1024/1024,2) – convert the total bytes to GB, rounded to two decimal places

 

Bandwidth usage by static JS and CSS resources in GB (Visual Representation of the Results)

 

I hope this helps someone…

Conclusion

The above is a quick fix; the proper way is to amend httpd.conf or httpd-ssl.conf in Apache and add a field for every value you want to process in Splunk.

In Apache 2.4, this typically needs to be added as a log format in front of the TransferLog directive. Example that creates fields for the size and response time of each request and rotates the logs daily:
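A minimal sketch of what such a configuration might look like, not necessarily the exact one used here. The field labels (bytes=, time_us=) and both file paths are assumptions for illustration; the key=value form is chosen because Splunk's automatic field extraction generally picks such pairs up at search time.

```
# No nickname is given, so this becomes the default format that TransferLog uses.
# %B = response size in bytes, excluding headers
# %D = time taken to serve the request, in microseconds
LogFormat "%h %l %u %t \"%r\" %>s bytes=%B time_us=%D"

# Pipe the log through rotatelogs to start a new file every 86400 seconds (daily).
# The rotatelogs path and log path are placeholders; adjust them to your install.
TransferLog "|/usr/sbin/rotatelogs /var/log/httpd/access_log.%Y-%m-%d 86400"
```

With fields logged this way, the rex step in the query above would no longer be needed, since the byte size would arrive as a named field.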

 
