Plotting bandwidth usage from Apache access logs on a Splunk time chart

This is just a placeholder for a query I don’t want to lose…

Our Apache access log does not map cleanly onto Splunk’s field definitions. I didn’t want to change the way the Apache server logs its information, nor did I want to amend the Splunk field extraction, so I had to come up with a query that pulls the bytes portion out of the raw access log lines. All I was after was how many GB of data our Apache server served for static resources such as JavaScript and CSS files, so what I needed was the last portion (the byte size) of each line logged by the Apache server.

11/17/19 8:39:01.000 PM 192.168.1.1 - - [19/Nov/2019:13:39:01 -0500] "GET /jquery/jquery-ui.min.js HTTP/1.1" 200 240427
11/17/19 8:38:36.000 PM 192.168.1.1 - - [19/Nov/2019:13:38:36 -0500] "GET /21ab251b74e47f7c26.js HTTP/1.1" 200 11331980

The simplest way is to use a regex like this to extract the byte size:

(?<byte_size>\d+$)

The full query looks like this:

GET 200 (.js OR .css)
| rex "(?<byte_size>\d+$)"
| bucket _time span=1h
| timechart sum(byte_size) as "Bandwidth (GB)"
| eval "Bandwidth (GB)"=round('Bandwidth (GB)'/1024/1024/1024,2)

Explanation:

GET 200 (.js OR .css) – match all log lines that include the word GET (someone is requesting data), the response code 200 (successful download), and either .js or .css

| rex "(?<byte_size>\d+$)" – capture the run of digits at the end of the line into a field named byte_size (that’s our byte size)

| bucket _time span=1h – group events into 1-hour buckets for plotting on the time chart

| timechart sum(byte_size) as "Bandwidth (GB)" – sum the byte sizes per bucket and plot the result as a series named Bandwidth (GB)

| eval "Bandwidth (GB)"=round('Bandwidth (GB)'/1024/1024/1024,2) – convert the total bytes to GB (dividing by 1024 three times) and round to two decimals
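
To sanity-check the logic outside Splunk, the same steps can be roughly sketched in Python. This is only an approximation under a few assumptions, not what Splunk does internally: the function name bandwidth_per_hour is made up for illustration, the hour bucket is taken from the Apache timestamp in square brackets rather than from Splunk’s _time, and the filter is a plain substring check rather than Splunk’s term matching.

```python
import re
from collections import defaultdict
from datetime import datetime

# Same idea as the rex step; Python spells named groups (?P<name>...).
BYTE_SIZE = re.compile(r"(?P<byte_size>\d+)$")
# Apache timestamp inside the square brackets, e.g. 19/Nov/2019:13:39:01 -0500
APACHE_TIME = re.compile(r"\[([^\]]+)\]")


def bandwidth_per_hour(lines):
    """Sum the trailing byte sizes of matching lines into 1-hour buckets (GB)."""
    totals = defaultdict(int)
    for line in lines:
        line = line.rstrip()
        # Rough stand-in for the search terms: GET 200 (.js OR .css)
        if "GET" not in line or " 200 " not in line:
            continue
        if ".js" not in line and ".css" not in line:
            continue
        size = BYTE_SIZE.search(line)
        stamp = APACHE_TIME.search(line)
        if not size or not stamp:
            continue  # e.g. Apache's %b logs "-" when no body was sent
        when = datetime.strptime(stamp.group(1), "%d/%b/%Y:%H:%M:%S %z")
        bucket = when.replace(minute=0, second=0)  # like span=1h
        totals[bucket] += int(size.group("byte_size"))
    # Bytes to GB, rounded to 2 decimals, like the eval step
    return {b: round(t / 1024 / 1024 / 1024, 2) for b, t in totals.items()}


sample = [
    '192.168.1.1 - - [19/Nov/2019:13:39:01 -0500] "GET /jquery/jquery-ui.min.js HTTP/1.1" 200 240427',
    '192.168.1.1 - - [19/Nov/2019:13:38:36 -0500] "GET /21ab251b74e47f7c26.js HTTP/1.1" 200 11331980',
]
for bucket, gb in bandwidth_per_hour(sample).items():
    print(bucket.isoformat(), gb)
```

For the two sample lines above, both requests land in the same 13:00 bucket and add up to 11,572,407 bytes, which rounds to 0.01 GB.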

 

Figure: Bandwidth usage by static JS and CSS resources in GB (visual representation of the results)

I hope this helps someone…

Conclusion

The above is a quick fix. The proper way is to amend httpd.conf or httpd-ssl.conf in Apache and add a labeled field for every value you want to process in Splunk.

In Apache 2.4 this is typically done by adding a LogFormat directive in front of the TransferLog directive. The example below creates labeled fields for the response size (%b, in bytes) and the time taken to serve each request (%D, in microseconds), and rotates the logs daily:

LogFormat "%h %l %u %t \"%r\" %>s SizeBytes=%b RequestTime=%D"
TransferLog "|/apache/bin/rotatelogs /apache/logs/access_log.%Y.%m.%d 86400"
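
With labeled fields in place, the values no longer depend on their position in the line, and Splunk can typically pick up key=value pairs like these at search time. As a rough illustration of why this is easier to work with, here is a small Python sketch. The sample line is made up to match the shape of the LogFormat above, and parse_fields is a hypothetical helper, not part of Splunk or Apache.

```python
import re

# Matches key=value pairs such as SizeBytes=240427 or RequestTime=1532.
# Note: a query string in the request (e.g. ?a=b) would also match.
FIELD = re.compile(r"(\w+)=(\S+)")


def parse_fields(line):
    """Return the labeled fields of one log line as a dict of strings."""
    return dict(FIELD.findall(line))


# Made-up line in the shape the LogFormat above would produce
line = ('192.168.1.1 - - [19/Nov/2019:13:39:01 -0500] '
        '"GET /jquery/jquery-ui.min.js HTTP/1.1" 200 '
        'SizeBytes=240427 RequestTime=1532')

fields = parse_fields(line)
# %b logs "-" when no body was sent, so real code should check before int()
size_bytes = int(fields["SizeBytes"])
# %D is in microseconds; convert to milliseconds
request_ms = int(fields["RequestTime"]) / 1000
print(size_bytes, request_ms)
```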

 
