With zsh:
typeset -U groups=( **/*_*_*.*(Ne['REPLY=${${(s[_])REPLY:t}[2]}']) )
- typeset -U groups=(...): define groups as an array with Unique members
- **/*_*_*.*: file names with at least one . and at least two _s before the rightmost ., at or below the current working directory
- (Ne['code']): glob qualifiers to further qualify the glob
  - N: Nullglob: expand to nothing if there's no match
  - e['code']: transform each glob expansion¹ (in $REPLY in the code)
- $REPLY:t: the tail (basename) of the file.
- ${(s[_])var}: splits on _ (and then we take the second with [2]).
With bash (the GNU shell), GNU find and GNU awk, you can do something similar with:
readarray -td '' groups < <(
  LC_ALL=C find . -name '.?*' -prune -o \
    -name '*_*_*.*' -printf '%f\0' |
    gawk -v RS='\0' -v ORS='\0' -F _ '!seen[$2]++ {print $2}'
)
Neither makes any assumption about which characters, or non-characters (bytes that don't form valid characters), may be found between the first two _ characters.
Both skip hidden files and files in hidden directories. To include them, add the D glob qualifier in zsh or remove the -name '.?*' -prune -o in find.
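Where gawk is not available, the deduplication can also be done in bash itself. The following is only a sketch under stated assumptions, not part of the approaches above: unique_groups is a hypothetical helper name, it needs bash 4.0 or newer for associative arrays and a find that supports -print0, and it replaces gawk's -F _ field splitting with parameter expansion on the basename.

```bash
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not from the original answer):
# fills the global array "groups" with the unique second _-separated
# fields of file names matching *_*_*.* at or below directory $1.
unique_groups() {
  local dir=${1:-.} path name rest field
  local -A seen=()
  groups=()
  while IFS= read -r -d '' path; do
    name=${path##*/}      # basename, like find's %f
    rest=${name#*_}       # drop everything up to and including the first _
    field=${rest%%_*}     # keep what precedes the next _
    # The "x" prefix is there because bash rejects an empty
    # associative array subscript (e.g. for a name like h__i.txt).
    [[ ${seen["x$field"]:-} ]] && continue
    seen["x$field"]=1
    groups+=("$field")
  done < <(
    # Same selection as above: skip hidden files and directories.
    cd -- "$dir" &&
      LC_ALL=C find . -name '.?*' -prune -o -name '*_*_*.*' -print0
  )
}
```

Calling unique_groups . and then printf '<%s>\n' "${groups[@]}" shows the result. As with the find and gawk pipeline, the order follows the order in which find reports the files, and file names are passed NUL-delimited so that spaces and newlines in them are harmless.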
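If there's a large list of files, the find-based one will be more memory friendly as it doesn't store the whole list in memory. You can take a similar approach in zsh by recording each group as a key of an associative array; the leading ! makes the code return false, so no file is selected and the glob itself expands to nothing: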
typeset -A seen=()
: **/*_*_*.*(Ne['! seen[${${(s[_])REPLY:t}[2]}]='])
groups=( ${(k)seen} )
¹ The exit status of that code also determines whether the file is selected or not, but in the first example the code (a plain assignment) always returns true.