If I understand you correctly, you want to parse a list of IPs and identify which class B or C network they belong to. If any such network appears more than 10 times, you want to print that in the notation
X.Y.0.0/16
or
X.Y.Z.0/24
respectively into an output file spam.lst.
I propose the following awk program for the task (let's call it sort.awk):
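#!/bin/awk -f
# cl: number of leading octets to keep (2 or 3); th: count threshold
BEGIN {
    FS = OFS = "."
}
NF == 4 {
    NF = cl          # truncate the record to its first cl octets
    count[$0]++      # $0 is rebuilt from the remaining fields
}
END {
    for (n in count) {
        if (count[n] > th) {
            printf "%s", n
            for (i = cl; i < 4; i++) { printf ".0" }   # pad the host part
            printf "/%d\n", 8 * cl                     # prefix length
        }
    }
}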
You would call it as follows:
awk -v cl=2 -v th=1 -f sort.awk ips.txt > spam.lst
The program works as follows:
- You specify the network class as awk variable cl: either 2 for a class B (/16) network, or 3 for a class C (/24) network.
- You specify the occurrence threshold as awk variable th. A subnet is reported once its count is larger than th, so th=10 matches "more than 10 times".
- The program sets the input and output field separators to "." so that input lines are split into fields at each dot.
- Whenever a line contains exactly 4 fields (minimum sanity check for IPs), the number of fields is cut to the value in cl, which truncates the line to the class B or C network "base address". Then, the counter for this (newly regenerated) base address in the array count is increased.
- At end-of-file, we iterate over all indices of the array count (i.e. all unique base addresses that have occurred so far). If the associated count is larger than the threshold, we output the base address, padded to the right with ".0" and with the netmask in CIDR notation appended.
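The truncation step relies on awk rebuilding the line when NF is assigned. gawk and mawk do this, but some older awk implementations may not. If that is a concern, here is a minimal sketch that builds the network string from the individual fields instead. The function name net_prefix is made up for illustration, and the sketch only shows the mapping from an IP to its network; in the counting program you would increment count[key] instead of printing.

```shell
# Hypothetical helper: print the /16 or /24 network of each IP read from stdin.
# $1 is cl, the number of leading octets to keep (2 or 3).
net_prefix() {
    awk -v cl="$1" -F'.' '
    NF == 4 {
        key = $1
        for (i = 2; i <= cl; i++) key = key "." $i   # join the first cl octets
        for (i = cl; i < 4; i++)  key = key ".0"     # pad the host part with .0
        printf "%s/%d\n", key, 8 * cl                # append the prefix length
    }'
}

printf '108.61.12.5\n' | net_prefix 2   # prints 108.61.0.0/16
printf '108.61.12.5\n' | net_prefix 3   # prints 108.61.12.0/24
```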
The output for cl=2, th=1 and the example IP list you showed would look like
108.61.0.0/16
138.68.0.0/16
148.66.0.0/16
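If you want to try the program without your real IP list, the following sketch wraps the same logic in a shell function and feeds it a few made-up addresses. The function name count_nets and the sample IPs are assumptions for illustration only. Since awk does not guarantee any particular order for "for (n in count)", the sketch pipes the result through sort to get a stable listing.

```shell
# Hypothetical wrapper around the program above.
# $1 = cl (octets to keep), $2 = th (threshold); IPs are read from stdin.
count_nets() {
    awk -v cl="$1" -v th="$2" '
    BEGIN { FS = OFS = "." }
    NF == 4 { NF = cl; count[$0]++ }     # truncate, then count the base address
    END {
        for (n in count) {
            if (count[n] > th) {
                printf "%s", n
                for (i = cl; i < 4; i++) printf ".0"
                printf "/%d\n", 8 * cl
            }
        }
    }' | sort                            # for-in order is unspecified, so sort
}

# Made-up sample input; the last line fails the 4-field check and is ignored.
printf '%s\n' 10.1.2.3 10.1.9.9 10.2.0.1 192.0.2.7 192.0.2.8 not-an-ip |
    count_nets 2 1
```

With cl=2 and th=1 this prints 10.1.0.0/16 and 192.0.0.0/16, because only those two /16 networks occur more than once. With cl=3 only 192.0.2.0/24 is printed.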
The original proposition was meant to integrate into the existing script and looked as follows:
awk -v cl=2 -v nw="8.6.0.0" -F'.' 'BEGIN{split(nw,ref,/\./)} NF==4{for (i=1;i<=cl;i++) {if ($i!=ref[i]) next} print}' ips.txt
Here, we would parse the list of IPs to check whether they fall into the same network as a given network base address, specified via awk variable nw.
- In the beginning, the reference network base IP is split by fields into an array ref.
- For each line encountered, the program first checks if it contains 4 fields (minimum sanity check for an IP). If so, it compares the first cl fields of both the current line and the reference IP. If any of them doesn't match, the line is skipped and processing proceeds to the next line. If all relevant fields matched, the line is printed.
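To see the filter in action, here is a small sketch that wraps the one-liner in a shell function and runs it on a few made-up addresses. The function name in_network and the sample IPs are assumptions for illustration, not part of the original script.

```shell
# Hypothetical wrapper: in_network NW CL
# Prints only the IPs from stdin whose first CL octets match those of NW.
in_network() {
    awk -v nw="$1" -v cl="$2" -F'.' '
    BEGIN { split(nw, ref, /\./) }       # reference octets go into ref[1..4]
    NF == 4 {
        for (i = 1; i <= cl; i++) {
            if ($i != ref[i]) next       # first mismatch: skip this line
        }
        print                            # all cl octets matched
    }'
}

# Made-up sample input: only the two addresses in 8.6.0.0/16 pass the filter.
printf '%s\n' 8.6.1.2 8.7.1.2 8.6.200.9 18.6.0.1 | in_network 8.6.0.0 2
```

This prints 8.6.1.2 and 8.6.200.9. With cl=3 and nw=8.6.1.0, only 8.6.1.2 would remain.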