Apache log analysis using Linux commands

date
Aug 18, 2024
tags
Linux

Access Log format

Apache access logs usually follow the Common Log Format:
LogFormat "%h %l %u %t \"%r\" %>s %b"
Example:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
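
As a rough illustration of how awk's default whitespace splitting maps onto this format, the sketch below pulls a few fields out of the example line. The variable names line and fields are only for illustration, and the field numbers assume exactly the format shown above.

```shell
# The example line from above.
line='127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'

# With whitespace splitting, the timestamp spans $4-$5 and the request spans $6-$8,
# so the status code lands in $9 and the byte count in $10.
fields=$(printf '%s\n' "$line" | awk '{
  print "ip=" $1
  print "user=" $3
  print "method=" substr($6, 2)   # drop the leading double quote
  print "path=" $7
  print "status=" $9
  print "bytes=" $10
}')
printf '%s\n' "$fields"
```

These field numbers are the ones the commands in the rest of this snippet rely on.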
Below are some analyses I have run on Apache logs using Linux commands:

Most common IPs

The most frequently appearing IPs in the access log:
awk '{print $1}' /path/to/logfile | sort | uniq -c | sort -nr | head -n 5
  • awk '{print $1}': extracts the IP address (the first field in the log format above)
  • sort: sorts the IPs
  • uniq -c: counts the occurrence of each unique IP
  • sort -nr: sorts the counts in descending order
  • head -n 5: keeps the 5 most common IPs
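
To see the pipeline end to end, here is a minimal sketch that runs it on a small made-up log. The sample lines, the temporary file, and the trailing awk that reorders the output to "IP count" are assumptions for the example, not part of the original command.

```shell
# Hypothetical sample log, invented for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 1024
10.0.0.2 - - [10/Oct/2000:13:55:37 -0700] "GET /about.html HTTP/1.0" 200 512
10.0.0.1 - - [10/Oct/2000:13:55:38 -0700] "GET /logo.png HTTP/1.0" 200 2048
10.0.0.3 - - [10/Oct/2000:13:55:39 -0700] "GET /index.html HTTP/1.0" 404 0
10.0.0.1 - - [10/Oct/2000:13:55:40 -0700] "POST /login HTTP/1.0" 302 0
10.0.0.2 - - [10/Oct/2000:13:55:41 -0700] "GET /index.html HTTP/1.0" 200 1024
EOF

# Same pipeline as above; the last awk swaps the columns to "IP count".
top_ips=$(awk '{print $1}' "$log" | sort | uniq -c | sort -nr | head -n 5 | awk '{print $2, $1}')
printf '%s\n' "$top_ips"

rm -f "$log"
```

With this sample, 10.0.0.1 comes first because it appears in three of the six lines.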

Number of unique requests by URL path

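awk -v FPAT='(^| )[0-9]+|"[^"]*"' '{print $2}' /path/to/file | sort -T sort_dir | uniq -c
  • Requires gawk: FPAT defines fields by content, so for lines that start with an IPv4 address the quoted request ("GET /path HTTP/1.0") becomes the second field
  • sort -T sort_dir writes sort's temporary files to the existing directory sort_dir, which helps when the log file is extremely large
  • Outputs how many times each distinct request appears
  • Append | wc -l to get only the total number of unique requests, without the per-request counts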
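
Where gawk is not available, a rough alternative is to count on the path field alone. The sketch below assumes whitespace splitting puts the path in $7, as in the format above, and uses made-up sample lines; unlike the FPAT version it ignores the method and protocol, and any query string stays part of the path.

```shell
# Hypothetical sample log, invented for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 1024
10.0.0.2 - - [10/Oct/2000:13:55:37 -0700] "GET /about.html HTTP/1.0" 200 512
10.0.0.1 - - [10/Oct/2000:13:55:38 -0700] "GET /logo.png HTTP/1.0" 200 2048
10.0.0.3 - - [10/Oct/2000:13:55:39 -0700] "GET /index.html HTTP/1.0" 404 0
10.0.0.1 - - [10/Oct/2000:13:55:40 -0700] "POST /login HTTP/1.0" 302 0
10.0.0.2 - - [10/Oct/2000:13:55:41 -0700] "GET /index.html HTTP/1.0" 200 1024
EOF

# Count requests per path ($7), printed as "path count".
path_counts=$(awk '{print $7}' "$log" | sort | uniq -c | awk '{print $2, $1}')
printf '%s\n' "$path_counts"

# Number of distinct paths (tr strips the padding some wc versions add).
unique_paths=$(awk '{print $7}' "$log" | sort -u | wc -l | tr -d '[:space:]')
printf 'unique paths: %s\n' "$unique_paths"

rm -f "$log"
```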

Total bytes sent for a particular request

grep <any expression to match> /path/to/file | awk '{ sum += $10 } END { print sum }'
  • $10 is the response size in bytes when the line is split on whitespace
  • Use whatever expression you need in grep; you can chain several grep commands, or drop grep altogether to total every request
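
As a variation, the filter and the sum can live in a single awk call. This is only a sketch: the sample lines are made up, and matching on the path field ($7) rather than on the whole line is an assumption about what "a particular request" means here.

```shell
# Hypothetical sample log, invented for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 1024
10.0.0.2 - - [10/Oct/2000:13:55:37 -0700] "GET /about.html HTTP/1.0" 200 512
10.0.0.3 - - [10/Oct/2000:13:55:39 -0700] "GET /index.html HTTP/1.0" 304 -
10.0.0.2 - - [10/Oct/2000:13:55:41 -0700] "GET /index.html HTTP/1.0" 200 1024
EOF

# Sum the bytes ($10) of requests whose path ($7) is exactly /index.html.
# A "-" in the bytes field (no body sent) counts as 0 in awk arithmetic,
# and "sum + 0" prints 0 instead of an empty line when nothing matches.
total=$(awk '$7 == "/index.html" { sum += $10 } END { print sum + 0 }' "$log")
printf 'total bytes: %s\n' "$total"

rm -f "$log"
```

Matching on $7 inside awk avoids a grep pattern accidentally hitting another field, such as the user name.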
 
more to come…

Mohamed Irfan © 2025