That looks good too. So now let's figure out some ways we could approach the next step of identifying numbers with exactly 2 digits from the set {4,5,6}. My first instinct here is to go for grep. There are also methods for doing this purely in Bash, but I like to use the various tools, grep, awk, and sed, for these types of things, mainly because that's how my mind works.
It also fails for strings where the 2 digits aren't adjacent:
$ echo "41412" | grep -E "[456]{2}"
$
So is this method usable? It is, but we'll have to rejigger the regex.
$ echo -e "41123\n44123\n44423\n41423" | grep -E "[^456]*([456][^456]*){2}"
44123
44423
41423
The above is presenting 4 types of strings. The echo -e "41123\n44123\n44423\n41423" just prints 4 of the numbers from our range.
$ echo -e "41123\n44123\n44423\n41423"
41123
44123
44423
41423
A grep -vE "[456]{3,}" will skip lines that contain 3+ adjacent digits from our set.
$ echo -e "41123\n44123\n44423\n41423" | grep -vE "[456]{3,}"
41123
44123
41423
How does the [^456]*([456][^456]*){2} regex work? It matches zero or more "not [456]" characters, followed by 2 occurrences of a group made up of one [456] digit and zero or more "not [456]" characters. Note that it isn't anchored, so it matches any line with at least 2 digits from our set, which is why 44423 shows up in the output above.
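To see what anchoring changes, here's a minimal sketch that runs the same 4 sample numbers through the pattern with and without ^ and $. The variable names are only for illustration, and the anchored form is a suggestion rather than something the pipeline above uses.

```bash
#!/bin/bash
# The 4 sample numbers used above, one per line.
samples=$(printf '41123\n44123\n44423\n41423')

# Unanchored: matches any line containing at least 2 digits from [456].
at_least_two=$(printf '%s\n' "$samples" | grep -E '[^456]*([456][^456]*){2}')

# Anchored: the whole line must be exactly 2 digits from [456]
# plus any number of other characters around them.
exactly_two=$(printf '%s\n' "$samples" | grep -E '^[^456]*([456][^456]*){2}$')

printf 'at least two:\n%s\n' "$at_least_two"
printf 'exactly two:\n%s\n' "$exactly_two"
```

With the anchors in place, 44423 drops out because it has 3 digits from the set.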
for (( CON1=10000; CON1<=99999; CON1++ )) ;
do
    if echo $CON1 | grep -q -E "[^456]*([456][^456]*){2}"; then
        echo $CON1
    fi
done
But this method proves to be dog slow. The problem is that grep. It's expensive, and we're running it 1 time per iteration through the loop, so that's ~90k times!
To improve that we could move our grep command outside the loop and run it 1 time, after the list's been generated, like so, using our original version of the script that just echoed the numbers out:
$ ./cmd.bash | grep -E "[^456]*([456][^456]*){2}"
$ ./cmd.bash | grep -E "[^456]*([456][^456]*){2}" | paste -s -d"+"
10044+10045+10046+10054+10055+10056+10064+10065+10066+10144+10145+...
$ ./cmd.bash | grep -E "[^456]*([456][^456]*){2}" | paste -s -d"+" | bc
2409327540
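For comparison, the same run-grep-once idea can be sketched without the helper script. This assumes seq is available to stand in for cmd.bash, and it sums with shell arithmetic instead of paste and bc; both are substitutions made for illustration, not part of the pipeline above.

```bash
#!/bin/bash
# seq stands in for ./cmd.bash here (an assumption): it prints 10000..99999.
# grep is started exactly once and filters the whole stream.
matches=$(seq 10000 99999 | grep -E '[^456]*([456][^456]*){2}')

# Count and add up the survivors with plain shell arithmetic.
count=0
total=0
for n in $matches; do
    count=$((count + 1))
    total=$((total + n))
done

echo "matched $count numbers, sum $total"
```

The loop here only does arithmetic, so no extra processes are spawned per number; the expensive part, grep, still runs a single time.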
So we need some method within Bash for testing whether a number has exactly 2 digits from our set that isn't as expensive as calling grep ~90k times. Modern versions of Bash include the ability to match using the =~ operator, which can do matching similar to grep's. Let's take a look at that next.
#!/bin/bash
for (( CON1=10000; CON1<=99999; CON1++ )) ;
do
    if [[ $CON1 =~ [456]{2} ]]; then
        echo $CON1
    fi
done
The =~ operator makes use of the same regex that we developed earlier, using the set notation, [456]{2}. This works, but suffers from a similar issue as grep, namely that it allows numbers with 3+ digits from the set.
$ ./cmd1.bash | grep -E "[456]{3,}" | head -5
10444
10445
10446
10454
10455
So we can use the same trick here:
#!/bin/bash
for (( CON1=10000; CON1<=99999; CON1++ )) ;
do
    if [[ $CON1 =~ [456]{2} && ! $CON1 =~ [456]{3,} ]]; then
        echo $CON1
    fi
done
In the above if statement, we're looking for strings that contain 2 adjacent digits from [456] and do not contain a run of 3 or more. In the notation && ! $CON1 =~ [456]{3,}, the && means AND, the exclamation point means NOT, and $CON1 =~ [456]{3,} matches all the strings that contain 3 or more consecutive digits from [456].
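The combined test is easy to try on a few sample numbers. This is a minimal sketch: the function name adjacent_pair_only is made up for illustration, and the regexes are kept in variables, which sidesteps quoting surprises with =~.

```bash
#!/bin/bash
# Hypothetical helper: succeeds when the argument contains a run of
# 2 digits from [456] but no run of 3 or more.
adjacent_pair_only() {
    local two='[456]{2}' three='[456]{3,}'
    [[ $1 =~ $two && ! $1 =~ $three ]]
}

for n in 44123 44423 41511; do
    if adjacent_pair_only "$n"; then
        echo "$n: match"
    else
        echo "$n: no match"
    fi
done
```

Only 44123 passes: 44423 is rejected by the NOT clause, and 41511 never satisfies the first clause because its 4 and 5 aren't next to each other.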
Still has a problem?
Yes, this will only catch strings where the digits from [456] are adjacent, so it misses a string like 41511. Check it:
$ ./cmd1.bash | grep 41511
$
How can we fix this? Well, one approach would be to change the regex around like so:
    if [[ $CON1 =~ [^456]*([456][^456]*){2} ]]; then
        echo $CON1
    fi
Checking it shows that it works with 41511 now:
$ ./cmd1.bash | grep 41511
41511
How does this regex work? It matches zero or more "not [456]" characters, followed by 2 occurrences of a group made up of one [456] digit and zero or more "not [456]" characters.
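One caveat: inside [[ ... =~ ... ]] the pattern can match anywhere in the string, so as written it also accepts numbers with 3 or more digits from the set. If exactly 2 is the goal, one option is to anchor it with ^ and $. A minimal sketch, using a hypothetical function name exactly_two:

```bash
#!/bin/bash
# Hypothetical helper: succeeds only when the whole argument contains
# exactly 2 digits from [456], adjacent or not.
exactly_two() {
    local re='^[^456]*([456][^456]*){2}$'
    [[ $1 =~ $re ]]
}

for n in 41511 44123 44423 41123; do
    if exactly_two "$n"; then
        echo "$n"
    fi
done
```

Of the 4 samples, this prints 41511 and 44123; 44423 has 3 digits from the set and 41123 has only 1.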