nginx log analysis 3 - count IPs that visited more than 3 times

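[Programming problem] nginx log analysis 3 - count IPs that visited more than 3 times
  • Popularity index: 12161 Time limit: 1 second for C/C++, 2 seconds for other languages Memory limit: 256M for C/C++, 512M for other languages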
Suppose the nginx log is stored in nowcoder.txt in the following format:
192.168.1.20 - - [21/Apr/2020:14:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.21 - - [21/Apr/2020:15:27:49 +0800] "GET /2/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [21/Apr/2020:21:27:49 +0800] "GET /3/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.23 - - [21/Apr/2020:22:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.24 - - [22/Apr/2020:15:27:49 +0800] "GET /2/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.25 - - [22/Apr/2020:15:26:49 +0800] "GET /3/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.20 - - [23/Apr/2020:08:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.21 - - [23/Apr/2020:09:20:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [23/Apr/2020:10:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [23/Apr/2020:10:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.20 - - [23/Apr/2020:14:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.21 - - [23/Apr/2020:15:27:49 +0800] "GET /2/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [23/Apr/2020:15:27:49 +0800] "GET /3/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.25 - - [23/Apr/2020:16:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.24 - - [23/Apr/2020:20:27:49 +0800] "GET /2/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.25 - - [23/Apr/2020:20:27:49 +0800] "GET /3/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.20 - - [23/Apr/2020:20:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.21 - - [23/Apr/2020:20:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [23/Apr/2020:20:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [23/Apr/2020:22:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.21 - - [23/Apr/2020:23:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
Write a shell script that counts the IPs that visited more than 3 times. Your script should output:
6 192.168.1.22
5 192.168.1.21
4 192.168.1.20
Example 1

Input

192.168.1.20 - - [21/Apr/2020:14:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.21 - - [21/Apr/2020:15:27:49 +0800] "GET /2/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [21/Apr/2020:21:27:49 +0800] "GET /3/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [21/Apr/2020:21:27:49 +0800] "GET /3/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.23 - - [21/Apr/2020:22:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.24 - - [22/Apr/2020:15:27:49 +0800] "GET /2/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.25 - - [22/Apr/2020:15:26:49 +0800] "GET /3/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.20 - - [23/Apr/2020:08:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.21 - - [23/Apr/2020:09:20:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [23/Apr/2020:10:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [23/Apr/2020:10:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.20 - - [23/Apr/2020:14:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.21 - - [23/Apr/2020:15:27:49 +0800] "GET /2/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [23/Apr/2020:15:27:49 +0800] "GET /3/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.25 - - [23/Apr/2020:16:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.24 - - [23/Apr/2020:20:27:49 +0800] "GET /2/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.25 - - [23/Apr/2020:20:27:49 +0800] "GET /3/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.20 - - [23/Apr/2020:20:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.21 - - [23/Apr/2020:20:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [23/Apr/2020:20:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.22 - - [23/Apr/2020:22:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"
192.168.1.21 - - [23/Apr/2020:23:27:49 +0800] "GET /1/index.php HTTP/1.1" 404 490 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"

Output

7 192.168.1.22
5 192.168.1.21
4 192.168.1.20
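
One way to reproduce the expected output above is a small function that takes the log file and the threshold as arguments. This is only a sketch: the function name `frequent_ips` and its two parameters are assumptions introduced for illustration and are not part of the problem statement.

```shell
#!/bin/bash
# frequent_ips FILE [MIN]: print "count ip" for every client IP that occurs
# more than MIN times (default 3) in FILE, highest count first.
# The function name and its arguments are hypothetical, chosen for this sketch.
frequent_ips() {
    local file=$1 min=${2:-3}
    awk -v min="$min" '
        NF { hits[$1]++ }            # field 1 of each non-empty line is the client IP
        END {
            for (ip in hits)
                if (hits[ip] > min)
                    print hits[ip], ip
        }' "$file" | sort -k1,1nr -k2,2   # count descending (numeric), then IP as a tie-break
}
```

Calling `frequent_ips nowcoder.txt` uses the default threshold of 3; a second argument such as `frequent_ips nowcoder.txt 5` raises it.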
awk '{print $1}' | sort | uniq -c | sort -rn | awk '{if($1>3){print($1,$2)}}'
Posted on 2021-12-03 14:39:18 Replies (0)
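awk '{print $1}' nowcoder.txt | sort | uniq -c | awk '$1>3{print $1,$2}' | sort -k1,1nr

awk '{print $1}' nowcoder.txt  # extract every client IP (the first field)
sort | uniq -c                 # sort the IPs so duplicates are adjacent, collapse them, and prefix each IP with its occurrence count
awk '$1>3{print $1,$2}'        # keep only counts greater than 3 and print both columns explicitly (uniq -c pads the count with leading spaces)
sort -k1,1nr                   # sort by the first column, numerically, in descending order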
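
The final sort stage compares counts, so it matters whether `sort` treats them as numbers or as text. A minimal sketch with made-up data: the counts and IPs below are invented for illustration, and the helper names `sort_lexical` and `sort_numeric` are hypothetical.

```shell
#!/bin/bash
# Invented "count ip" lines, shaped like the output of the awk filter stage.
sample=$'4 192.168.1.20\n12 192.168.1.21\n7 192.168.1.22'

# Reverse text comparison: "12" lands below "4" and "7" because '1' < '4'.
sort_lexical() { printf '%s\n' "$sample" | LC_ALL=C sort -r; }

# Reverse numeric comparison restricted to the first field.
sort_numeric() { printf '%s\n' "$sample" | LC_ALL=C sort -k1,1nr; }
```

With these lines, `sort_lexical` prints the count-7 line first and the count-12 line last, whereas `sort_numeric` prints the counts in the order 12, 7, 4. On the sample log in the problem every count is a single digit, so both orderings coincide there.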


Posted on 2022-06-06 11:52:55 Replies (0)
Method 1:
awk '{a[$1]++} END{for(i in a) if(a[i]>3) {print a[i],i}}'|sort -rn
Method 2:
awk '{print $1}'|sort|uniq -c|awk '{if($1>3) {print $1,$2}}'|sort -rn

Posted on 2022-04-23 15:45:26 Replies (1)
awk -F'- -' '{print $1}'|sort |uniq -c|sort -n -r|awk -F' ' '{if($1>3) print $1" "$2}'
Posted on 2021-11-23 13:26:10 Replies (0)
shell
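declare -A map
# Count requests per client IP (the first whitespace-separated field).
while read -r ip _
    do
        ((map["$ip"]++))
    done < nowcoder.txt
# Drop every IP seen 3 times or fewer.
for i in "${!map[@]}"
    do
        [ "${map[$i]}" -le 3 ] && unset "map[$i]"
    done
# Sort the remaining counts in descending order into tmp.
# Despite its name, this is a simple exchange sort.
function InsertSort(){
    tmp=("${map[@]}")
    q=${#tmp[@]}
    for ((i=0;i<q;i++))
        do
            for ((j=i+1;j<q;j++))
                do
                    if [ "${tmp[$i]}" -lt "${tmp[$j]}" ];then
                        t=${tmp[$i]}
                        tmp[$i]=${tmp[$j]}
                        tmp[$j]=$t
                    fi
                done
        done
}
InsertSort
# Print one IP per sorted count. Removing the IP once it is printed
# keeps IPs that share a count from being printed more than once.
for ((i=0; i<q; i++))
    do
        for ve in "${!map[@]}"
            do
                if [ "${tmp[$i]}" -eq "${map[$ve]}" ];then
                    printf '%s %s\n' "${map[$ve]}" "$ve"
                    unset "map[$ve]"
                    break
                fi
            done
    done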
awk
awk '{
    arr[$1]++
} END {
    for (i in arr) {
        if (arr[i]>3){
            printf("%d %s\n", arr[i], i)
        }
    }
}' | sort -rn


Posted on 2021-11-26 11:14:01 Replies (0)
#!/bin/bash
cat nowcoder.txt|awk '{print $1}'|sort -r|uniq -c|tail -3|awk '{print $1,$2}'
This is admittedly a shortcut; whether it actually works depends on the data. I'm no expert, after all.
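
The dependence on the data can be illustrated with a made-up log in which only one IP exceeds three visits. This is a sketch under stated assumptions: the IPs, the log lines, and the helper names `sample_log`, `last_three`, and `above_three` are all invented for the example.

```shell
#!/bin/bash
# Made-up log: 10.0.0.1 appears 4 times, 10.0.0.2 twice, 10.0.0.3 and 10.0.0.4 once each.
sample_log() {
    printf '%s - - [made-up entry]\n' \
        10.0.0.1 10.0.0.1 10.0.0.1 10.0.0.1 \
        10.0.0.2 10.0.0.2 10.0.0.3 10.0.0.4
}

# Fixed-size cut: always keeps three lines, whatever their counts are.
last_three() {
    sample_log | awk '{print $1}' | sort -r | uniq -c | tail -3 | awk '{print $1,$2}'
}

# Threshold filter: keeps only IPs whose count is greater than 3.
above_three() {
    sample_log | awk '{print $1}' | sort | uniq -c | awk '$1>3 {print $1,$2}' | sort -k1,1nr
}
```

On this input `last_three` still prints three lines, including IPs seen only once or twice, while `above_three` prints only `4 10.0.0.1`.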
Posted on 2025-06-13 22:23:07 Replies (0)
awk '{print $1}' nowcoder.txt | sort |uniq -c |sort -r |awk '{print $1,$2}' | head -3
Posted on 2025-01-07 19:29:11 Replies (0)
awk '{print $1}' | sort | uniq -c | sort -rn | sed 's/^ *//' | grep -E '^([4-9]|[1-9][0-9]+) '

Posted on 2024-12-17 09:39:06 Replies (0)
 cat ../data/nginx_log2.txt  | gawk '{ print $1}' | sort | uniq -c | sort -r | gawk '{ if( $1 > 3 ) print $1" " $2 }'
Posted on 2024-09-10 17:27:07 Replies (0)
file="nowcoder.txt"

cat "$file" | awk -F ' - - ' '{print $1}' | sort | uniq -c | awk '{if($1 > 3) {print $1 " " $2}}' | sort -rn
Posted on 2024-07-04 12:48:08 Replies (0)
#!/bin/bash
cat nowcoder.txt |awk -F" " '{print $1}'|sort|uniq -c|sort -rn|awk '{print $1,$2}'|awk '{if($1>3){print($1,$2)}}'
Posted on 2024-02-27 10:36:02 Replies (0)
cat nowcoder.txt|grep -o '192\.168\.[0-9]\.[0-9][0-9]'|sort|uniq -c|sort -rn|awk '{if($1>3) print $1,$2}'
Posted on 2024-02-10 16:01:50 Replies (0)
cat nowcoder.txt|awk '{print $1}'|sort |uniq -c |sort -rn|awk '{if($1>3){print $0}}'|awk '{print $1,$2}'
Edited on 2024-01-23 18:39:45 Replies (0)
awk '{print $1}' nowcoder.txt|sort|uniq -c|awk -F ' ' '{if($1>3)print $1,$2}'|sort -nr


Edited on 2023-12-08 15:11:53 Replies (0)
awk '{a[$1]+=1}END{
    for(i in a){
        if(a[i]>3) print a[i]" "i
        }
}'|sort -rn


Posted on 2023-11-12 03:27:28 Replies (0)
awk '{map[$1]++}END{for(j in map) if(map[j] > 3) printf map[j] " " j "\n"}' | sort -nk1r
Posted on 2023-09-22 18:02:53 Replies (0)
cat nowcoder.txt |awk '{print $1}' |sort |uniq -c |sort|sed -n '4,$p'|sort -r |awk '{print $1,$2}'
Posted on 2022-09-18 15:22:05 Replies (0)
cat nowcoder.txt |awk '{ips[$1]++} END{for(k in ips){if(ips[k]>3)print ips[k],k}}'  | sort -k1nr
Posted on 2022-09-16 17:40:56 Replies (0)

awk array + sort

awk -v FS=" - - " '{
    a[$1]++;
}
END{
    for(i in a){
        if(a[i] > 3){
            print(a[i]" "i)
        }
    }
}' nowcoder.txt |sort -nr 
Posted on 2022-08-27 22:25:34 Replies (0)
awk '{ip[$1]++}END{for(i in ip) {if(ip[i]>3) print ip[i],i }}' nowcoder.txt |sort -k1 -rn
Posted on 2022-08-22 16:49:59 Replies (0)