Apache log analysis using Linux commands
date
Aug 18, 2024
tags
Linux
Access Log format
Apache access logs usually follow the Common Log Format, which is defined as:
LogFormat "%h %l %u %t \"%r\" %>s %b"
Example:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
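To see which awk field number each part of a log line lands in, a quick sketch like the one below can help. It only splits the example line above on whitespace (awk's default); the variable name `line` is just for illustration.

```shell
# The example line from above, stored in a shell variable.
line='127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'

# Print every whitespace-separated field together with its index.
printf '%s\n' "$line" | awk '{ for (i = 1; i <= NF; i++) printf "$%d = %s\n", i, $i }'

# Pick out the fields that matter most for the analyses on this page.
fields=$(printf '%s\n' "$line" | awk '{ print $1, $7, $9, $10 }')
printf '%s\n' "$fields"
```

With this default split, the client IP is $1, the request path is $7, the status code is $9 and the response size is $10. The timestamp and the quoted request each span more than one field because they contain spaces.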
Below are some analyses I have run on Apache logs using Linux commands:
Most common IPs
The most frequently appearing IPs in the access log:
awk '{print $1}' /path/to/logfile | sort | uniq -c | sort -nr | head -n 5
- awk '{print $1}': extracts the IP address (the first field in the default log format)
- sort: sorts the IPs so that identical addresses end up on adjacent lines, which uniq requires
- uniq -c: counts the occurrences of each unique IP
- sort -nr: sorts the counts numerically in descending order
- head -n 5: keeps the 5 most common IPs
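The pipeline above can be tried on a small made-up log before pointing it at a real one. The sketch below writes a few hypothetical entries to a temporary file (the IPs and paths are invented for illustration) and keeps the top 2 instead of the top 5.

```shell
# Hypothetical sample log written to a temporary file.
log=$(mktemp)
printf '%s\n' \
  '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /a HTTP/1.0" 200 100' \
  '10.0.0.2 - - [10/Oct/2000:13:55:37 -0700] "GET /b HTTP/1.0" 200 200' \
  '10.0.0.1 - - [10/Oct/2000:13:55:38 -0700] "GET /a HTTP/1.0" 200 100' \
  '10.0.0.3 - - [10/Oct/2000:13:55:39 -0700] "GET /c HTTP/1.0" 404 50' \
  '10.0.0.2 - - [10/Oct/2000:13:55:40 -0700] "GET /b HTTP/1.0" 200 200' \
  '10.0.0.1 - - [10/Oct/2000:13:55:41 -0700] "GET /a HTTP/1.0" 200 100' \
  > "$log"

# Same pipeline as above, limited to the top 2 addresses.
top_ips=$(awk '{print $1}' "$log" | sort | uniq -c | sort -nr | head -n 2)
printf '%s\n' "$top_ips"

rm -f "$log"
```

Each output line is a count followed by an IP. uniq -c pads the count with leading spaces, so pipe through awk '{print $1, $2}' if the result needs to be parsed further.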
No. unique requests based on URL path
cat /path/to/file | awk '{print $2}' FPAT='(^| )[0-9]+|"[^"]*"' | sort -T sort_dir | uniq -c
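- FPAT is a GNU awk (gawk) feature that defines fields by their content; with this pattern, $2 is the quoted request string (method, path and protocol)
- sort -T sort_dir: puts sort's temporary files in sort_dir (the directory must already exist), which helps when the log file is extremely large
- Outputs how many times each distinct request appears
- Append | wc -l to get only the total number of unique requests, without the count for each one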
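FPAT is specific to GNU awk, so the command above may not work with other awk implementations. As an alternative sketch (not the method used above), the request can be isolated by splitting on the double quote character, and the path taken as the second word of the request. This counts by path alone and ignores the method and protocol. The sample entries are hypothetical.

```shell
# Hypothetical sample log written to a temporary file.
log=$(mktemp)
printf '%s\n' \
  '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 100' \
  '10.0.0.2 - - [10/Oct/2000:13:55:37 -0700] "GET /about.html HTTP/1.0" 200 200' \
  '10.0.0.3 - - [10/Oct/2000:13:55:38 -0700] "POST /index.html HTTP/1.1" 200 100' \
  > "$log"

# With -F'"', $2 is the request string, e.g. GET /index.html HTTP/1.0.
# split() breaks it on spaces; req[2] is the URL path.
path_counts=$(awk -F'"' '{ split($2, req, " "); print req[2] }' "$log" \
  | sort | uniq -c | sort -nr)
printf '%s\n' "$path_counts"

# Total number of distinct paths.
unique_paths=$(printf '%s\n' "$path_counts" | wc -l | tr -d ' ')
printf '%s\n' "$unique_paths"

rm -f "$log"
```

For very large logs, the same -T option can be added to the first sort here.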
Sum of bytes of a particular request
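cat /path/to/file | grep 'PATTERN' | awk '{ sum += $10 } END { print sum }'
- Replace PATTERN with any expression that matches the requests of interest; grep commands can be chained to narrow the match further, or omitted altogether to sum over every request
- $10 is the response size in bytes (the last field in the default log format)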
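A small worked example of the byte sum, using hypothetical log entries, is sketched below. It also shows one edge case: with %b, Apache logs a "-" instead of 0 when no bytes are sent, and awk treats that non-numeric value as 0 in arithmetic, so such lines do not break the sum. The fixed-string match with grep -F is just one way to select the requests.

```shell
# Hypothetical sample log written to a temporary file.
log=$(mktemp)
printf '%s\n' \
  '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326' \
  '10.0.0.2 - - [10/Oct/2000:13:55:37 -0700] "GET /index.html HTTP/1.0" 200 512' \
  '10.0.0.1 - - [10/Oct/2000:13:55:38 -0700] "GET /apache_pb.gif HTTP/1.0" 304 -' \
  '10.0.0.3 - - [10/Oct/2000:13:55:39 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326' \
  > "$log"

# Sum the byte counts ($10) of all requests that mention /apache_pb.gif.
# "sum + 0" makes awk print 0 rather than an empty line when nothing matches.
total=$(grep -F '/apache_pb.gif' "$log" | awk '{ sum += $10 } END { print sum + 0 }')
printf '%s\n' "$total"

rm -f "$log"
```

Here the two 2326-byte responses add up to 4652, and the 304 response with "-" contributes nothing.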
more to come…