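In day-to-day operations work, analyzing nginx logs is one of the most common tasks: you routinely monitor the system and check whether any IP's request count looks abnormal, to guard against malicious access.
Suppose my nginx log looks like this: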
.......
211.253.43.23 - - [03/Jun/2019:11:41:02 +0800] "GET
211.253.43.23 - - [03/Jun/2019:11:41:25 +0800] "POST
211.253.43.23 - - [03/Jun/2019:11:41:25 +0800] "GET
211.253.43.23 - - [03/Jun/2019:11:41:26 +0800] "GET
39.100.41.229 - - [03/Jun/2019:11:41:56 +0800] "GET
39.100.41.229 - - [03/Jun/2019:11:41:56 +0800] "GET
211.253.43.23 - - [03/Jun/2019:11:41:56 +0800] "POST
211.253.43.23 - - [03/Jun/2019:11:41:57 +0800] "GET
39.100.41.229 - - [03/Jun/2019:11:42:00 +0800] "POST
211.253.43.23 - - [03/Jun/2019:11:42:00 +0800] "GET
211.253.43.23 - - [03/Jun/2019:11:42:08 +0800] "POST
211.253.43.23 - - [03/Jun/2019:11:42:11 +0800] "GET
211.253.43.23 - - [03/Jun/2019:11:42:11 +0800] "GET
211.253.43.23 - - [03/Jun/2019:11:42:12 +0800] "GET
......
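The commands in this post pick columns by position (`$1`, `$7`), so it helps to see how awk splits one log line on whitespace. The sketch below is only an illustration: the sample above is cut off after the request method, so the complete line used here is made up and assumes nginx's default "combined" log format.

```shell
#!/bin/sh
# Hypothetical complete log line (assumed "combined" format; the
# path, status, size and user agent are invented for illustration).
line='192.0.2.10 - - [03/Jun/2019:11:41:02 +0800] "GET /index.html HTTP/1.1" 200 612 "-" "curl/7.61.1"'

# awk splits on runs of whitespace: $1 is the client IP, $4 the
# timestamp (with a leading "["), $6 the method (with a leading
# double quote), $7 the request path and $9 the status code.
fields=$(printf '%s\n' "$line" |
    awk '{ printf "ip=%s time=%s method=%s path=%s status=%s\n", $1, $4, $6, $7, $9 }')
printf '%s\n' "$fields"
```

If the server uses a custom `log_format`, the column numbers will differ and need to be adjusted.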
Below are shell commands for various access-count statistics:
1. The 7 IPs with the most requests on June 3, 2019:
[root@hostname ~]# grep '03/Jun/2019' nginx.log | awk '{print $1}' | sort | uniq -c | sort -k1,1nr | head -7
2597 211.253.43.23
64 39.100.41.229
19 118.112.56.37
15 223.72.99.60
10 118.112.58.225
9 182.148.58.232
6 116.236.146.22
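The same count can be done in a single awk pass that matches the date only in the timestamp column instead of anywhere on the line. This is a minimal sketch, not a replacement for the command above: the function name `top_ips` and the sample log lines are hypothetical, and the "combined" log format is assumed.

```shell
#!/bin/sh
# Hypothetical sample log (made-up lines, assumed "combined" format).
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
192.0.2.10 - - [03/Jun/2019:11:41:02 +0800] "GET /a HTTP/1.1" 200 612 "-" "curl/7.61.1"
192.0.2.10 - - [03/Jun/2019:11:41:25 +0800] "POST /b HTTP/1.1" 200 612 "-" "curl/7.61.1"
192.0.2.10 - - [03/Jun/2019:11:41:26 +0800] "GET /a HTTP/1.1" 200 612 "-" "curl/7.61.1"
198.51.100.7 - - [03/Jun/2019:11:41:56 +0800] "GET /a HTTP/1.1" 200 612 "-" "curl/7.61.1"
198.51.100.7 - - [03/Jun/2019:11:42:00 +0800] "POST /b HTTP/1.1" 200 612 "-" "curl/7.61.1"
203.0.113.5 - - [03/Jun/2019:11:42:08 +0800] "GET /a HTTP/1.1" 200 612 "-" "curl/7.61.1"
203.0.113.5 - - [04/Jun/2019:09:00:01 +0800] "GET /a HTTP/1.1" 200 612 "-" "curl/7.61.1"
203.0.113.5 - - [04/Jun/2019:09:00:02 +0800] "GET /a HTTP/1.1" 200 612 "-" "curl/7.61.1"
203.0.113.5 - - [04/Jun/2019:09:00:03 +0800] "GET /a HTTP/1.1" 200 612 "-" "curl/7.61.1"
EOF

# top_ips FILE DAY N: print "count ip" for the N busiest IPs on DAY.
# The date is compared against $4 only, which looks like
# "[03/Jun/2019:11:41:02" in the assumed format.
top_ips() {
    awk -v day="$2" 'index($4, "[" day ":") == 1 { hits[$1]++ }
        END { for (ip in hits) print hits[ip], ip }' "$1" |
        sort -k1,1nr -k2,2 | head -n "$3"
}

top_ips_out=$(top_ips "$sample_log" '03/Jun/2019' 2)
printf '%s\n' "$top_ips_out"
rm -f "$sample_log"
```

The requests from 04/Jun are ignored, so 203.0.113.5 does not reach the top two even though it has the most lines in the file overall.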
2. All IP addresses with more than 10 requests on June 3, 2019:
[root@hostname ~]# grep '03/Jun/2019' nginx.log | awk '{print $1}' | sort | uniq -c | awk '$1 > 10 {print $2}' | sort -nr
223.72.99.60
211.253.43.23
118.112.56.37
39.100.41.229
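The threshold comparison is easy to get off by one: `$1 > 10` leaves out an IP with exactly 10 requests, while `$1 >= 10` keeps it. The sketch below runs both comparisons over the counts printed in step 1; the variable names are only for illustration.

```shell
#!/bin/sh
# Counts copied from the output of step 1 ("count ip" per line).
sample_counts='2597 211.253.43.23
64 39.100.41.229
19 118.112.56.37
15 223.72.99.60
10 118.112.58.225
9 182.148.58.232
6 116.236.146.22'

# Strictly more than 10 requests: 118.112.58.225 (exactly 10) is dropped.
strict=$(printf '%s\n' "$sample_counts" | awk -v min=10 '$1 > min { print $2 }')

# 10 requests or more: 118.112.58.225 is kept.
inclusive=$(printf '%s\n' "$sample_counts" | awk -v min=10 '$1 >= min { print $2 }')

printf 'more than 10:\n%s\n' "$strict"
printf '10 or more:\n%s\n' "$inclusive"
```

Passing the limit in with `-v min=...` keeps the awk program unchanged when the threshold changes.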
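3. The 10 most frequent requests in the log file (the part of each line after GET), for example /s?defs=ascii&project=linux-3.18.6. Blank lines are not allowed in the result, and static files and images such as /robots.txt, *.js, *.css, and *.png are excluded.
[root@hostname ~]# awk '$6 == "\"GET" && $7 != "" && $7 != "/robots.txt" && $7 !~ /\.(js|css|png)([?]|$)/ {print $7}' nginx.log | sort | uniq -c | sort -k1,1nr | head -10 | awk '{print $2}' > output3.txt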
**No data yet**
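Since no output is shown for this step, here is a small sketch of the filtering idea on made-up data. A pattern such as 'txt|js|png|css ' applied to the whole line also drops any request that merely contains "js" (for example /api/json), so this version tests only the path column. The function name `top_requests` and the sample lines are hypothetical, and the "combined" log format is assumed.

```shell
#!/bin/sh
# Hypothetical sample log (made-up lines, assumed "combined" format).
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
192.0.2.10 - - [03/Jun/2019:11:41:02 +0800] "GET /s?defs=ascii&project=linux-3.18.6 HTTP/1.1" 200 512 "-" "curl/7.61.1"
192.0.2.10 - - [03/Jun/2019:11:41:25 +0800] "GET /s?defs=ascii&project=linux-3.18.6 HTTP/1.1" 200 512 "-" "curl/7.61.1"
198.51.100.7 - - [03/Jun/2019:11:41:56 +0800] "GET /api/json HTTP/1.1" 200 87 "-" "curl/7.61.1"
198.51.100.7 - - [03/Jun/2019:11:41:57 +0800] "GET /static/app.js HTTP/1.1" 200 900 "-" "curl/7.61.1"
198.51.100.7 - - [03/Jun/2019:11:42:00 +0800] "GET /robots.txt HTTP/1.1" 200 30 "-" "curl/7.61.1"
203.0.113.5 - - [03/Jun/2019:11:42:08 +0800] "GET /logo.png?v=2 HTTP/1.1" 200 700 "-" "curl/7.61.1"
203.0.113.5 - - [03/Jun/2019:11:42:11 +0800] "POST /login HTTP/1.1" 302 0 "-" "curl/7.61.1"
EOF

# top_requests FILE N: the N most requested GET paths, most frequent
# first. Skips /robots.txt and paths ending in .js, .css or .png,
# with or without a query string.
top_requests() {
    awk '$6 == "\"GET" && $7 != "" && $7 != "/robots.txt" &&
         $7 !~ /\.(js|css|png)([?]|$)/ { print $7 }' "$1" |
        sort | uniq -c | sort -k1,1nr -k2,2 | head -n "$2" |
        awk '{ print $2 }'
}

top_requests_out=$(top_requests "$sample_log" 10)
printf '%s\n' "$top_requests_out"
rm -f "$sample_log"
```

Here /api/json survives the filter because the regular expression requires a literal dot before the extension.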
4. The request paths of all requests in the log file that returned status 404:
[root@hostname ~]# awk '$9 == 404 {print $7}' nginx.log | sort -u > output4.txt
**No data yet**
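Searching for the string "404" anywhere on a line also catches responses that happen to be 404 bytes long or paths that contain 404. The sketch below compares that loose match with a check on the status column only. The sample lines and the function name `not_found_paths` are hypothetical, and the "combined" log format (status code in column 9) is assumed.

```shell
#!/bin/sh
# Hypothetical sample log (made-up lines, assumed "combined" format).
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
192.0.2.10 - - [03/Jun/2019:11:41:02 +0800] "GET /missing HTTP/1.1" 404 153 "-" "curl/7.61.1"
192.0.2.10 - - [03/Jun/2019:11:41:25 +0800] "GET /missing HTTP/1.1" 404 153 "-" "curl/7.61.1"
198.51.100.7 - - [03/Jun/2019:11:41:56 +0800] "GET /old-page HTTP/1.1" 404 153 "-" "curl/7.61.1"
198.51.100.7 - - [03/Jun/2019:11:41:57 +0800] "GET /index.html HTTP/1.1" 200 404 "-" "curl/7.61.1"
203.0.113.5 - - [03/Jun/2019:11:42:08 +0800] "GET /item/404 HTTP/1.1" 200 87 "-" "curl/7.61.1"
EOF

# not_found_paths FILE: unique request paths whose status code is 404.
not_found_paths() {
    awk '$9 == 404 { print $7 }' "$1" | sort -u
}

not_found=$(not_found_paths "$sample_log")

# Lines containing "404" anywhere versus lines whose status is 404.
loose_matches=$(grep -c '404' "$sample_log")
exact_matches=$(awk '$9 == 404 { n++ } END { print n + 0 }' "$sample_log")

printf '%s\n' "$not_found"
printf 'loose=%s exact=%s\n' "$loose_matches" "$exact_matches"
rm -f "$sample_log"
```

On this sample the loose match counts all five lines, while only three of them actually returned 404.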