Include original proposed solution by request of OP
AdminBee

If I understand you correctly, you want to parse a list of IPs and identify which class B or C network they belong to. If any such network appears more than 10 times, you want to print that in the notation

X.Y.0.0/16

or

X.Y.Z.0/24

respectively into an output file spam.lst.

I propose the following awk program for the task (let's call it sort.awk):

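#!/bin/awk -f
# Usage: awk -v cl=<octets to keep: 2 or 3> -v th=<threshold> -f sort.awk ips.txt

BEGIN{
    # split input on "." and rejoin rebuilt records with "."
    FS=OFS="."
}

# only consider lines with exactly 4 dot-separated fields
NF==4{
    NF=cl          # truncate to the first cl octets; $0 is rebuilt using OFS
    count[$0]++    # count occurrences of this network prefix
}

END {
    for (n in count) {
        # report networks seen more than th times
        if (count[n]>th) {
            printf "%s",n
            for (i=cl;i<4;i++) {printf ".0"}   # pad the dropped octets with .0
            printf "/%d\n",8*cl                # prefix length: 8 bits per octet
        }
    }
}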

You would call it as follows:

awk -v cl=2 -v th=1 -f sort.awk ips.txt > spam.lst

The program works as follows:

  • You specify the network size via the awk variable cl, which is the number of leading octets to keep: 2 for a class B (/16) network, or 3 for a class C (/24) network.
  • You specify the threshold via the awk variable th. A subnet is reported once it occurs more than th times, so th=10 reports networks that appear more than 10 times.
  • The program sets both the input and output field separators to ., so that input lines are split into fields at each . and rebuilt lines are joined with . again.
  • Whenever a line contains exactly 4 fields (minimum sanity check for IPs), the field count NF is reduced to the value in cl. This makes awk rebuild the line from only the first cl fields, i.e. it truncates the address to the class B or C network "base address". Then, the counter for this (newly regenerated) base address in the array count is increased.
  • At end-of-file, we iterate over all indices of the array count (i.e. all unique base addresses that have occurred so far). If the associated count is larger than the threshold, we output the base address, padded to the right with .0 and with the netmask in CIDR notation appended.
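To make the steps above concrete, here is a small sketch that runs the same logic inline on a made-up input. The addresses in sample are hypothetical and only serve as illustration, and the example assumes an awk that rebuilds the record when NF is assigned (gawk and mawk do).

```sh
# Hypothetical sample input; these addresses are made up for illustration.
sample='108.61.12.34
108.61.12.99
108.61.200.1
138.68.5.5
not-an-ip'

# Step 1: keep only lines with 4 fields and truncate them to cl=3 octets.
truncated=$(printf '%s\n' "$sample" |
    awk -v cl=3 'BEGIN { FS = OFS = "." } NF == 4 { NF = cl; print }')
printf '%s\n' "$truncated"
# prints:
# 108.61.12
# 108.61.12
# 108.61.200
# 138.68.5

# Step 2: count each truncated prefix and report those seen more than th times.
result=$(printf '%s\n' "$sample" | awk -v cl=3 -v th=1 '
    BEGIN { FS = OFS = "." }
    NF == 4 { NF = cl; count[$0]++ }
    END {
        for (n in count)
            if (count[n] > th) {
                printf "%s", n
                for (i = cl; i < 4; i++) printf ".0"
                printf "/%d\n", 8 * cl
            }
    }')
printf '%s\n' "$result"
# prints:
# 108.61.12.0/24
```

Only 108.61.12 occurs more than once here, so it is the only network reported. The line without any dots has a single field and is skipped by the NF==4 check.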

The output for cl=2, th=1 and the example IP list you showed would look like

108.61.0.0/16
138.68.0.0/16
148.66.0.0/16
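Note that awk does not guarantee any particular order when iterating with for (n in count), so the lines may come out in a different order than shown. If a stable order matters, one option (not part of the program above, just a sketch) is to pipe the result through sort, using numeric keys on the first three octets:

```sh
# The three networks from the output above, deliberately shuffled.
unsorted='148.66.0.0/16
108.61.0.0/16
138.68.0.0/16'

# Sort numerically by first, second and third octet.
sorted=$(printf '%s\n' "$unsorted" | sort -t . -k 1,1n -k 2,2n -k 3,3n)
printf '%s\n' "$sorted"
# prints:
# 108.61.0.0/16
# 138.68.0.0/16
# 148.66.0.0/16
```

In practice you would append the same sort call to the awk invocation before redirecting into spam.lst.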

The original proposal was meant to integrate into the existing script and looked as follows:

awk -v cl=2 -v nw="8.6.0.0" -F'.' 'BEGIN{split(nw,ref,/\./)} NF==4{for (i=1;i<=cl;i++) {if ($i!=ref[i]) next} print}' ips.txt

Here, we would parse the list of IPs to check whether they fall into the same network as a given network base address, specified via awk variable nw.

  • In the beginning, the reference network base IP is split by fields into an array ref.
  • For each line encountered, the program first checks if it contains 4 fields (minimum sanity check for an IP). If so, it compares the first cl fields of both the current line and the reference IP. If any of them doesn't match, the line is skipped and processing proceeds to the next line. If all relevant fields matched, the line is printed.
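As a quick illustration of this filter, the following sketch feeds it a handful of made-up addresses (the contents of sample are hypothetical) instead of ips.txt:

```sh
# Hypothetical sample input; these addresses are made up for illustration.
sample='8.6.1.2
8.6.200.7
8.7.1.2
10.0.0.1
garbage'

# Keep only addresses whose first cl=2 octets match those of nw.
matches=$(printf '%s\n' "$sample" |
    awk -v cl=2 -v nw="8.6.0.0" -F'.' '
        BEGIN { split(nw, ref, /\./) }
        NF == 4 {
            for (i = 1; i <= cl; i++)
                if ($i != ref[i]) next
            print
        }')
printf '%s\n' "$matches"
# prints:
# 8.6.1.2
# 8.6.200.7
```

8.7.1.2 is dropped because its second octet differs from the reference, 10.0.0.1 already fails on the first octet, and the last line does not have 4 fields.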