- We perform a loop over all files starting with "p-" in the current directory.
- The first instruction in the loop ensures that the file exists and is a workaround for empty directories: if nothing matches, the shell leaves the pattern p-* unexpanded and the loop would otherwise run once with that literal string. (This is necessary because on this forum you will always be told not to parse the output of ls, so something like FILES=$(ls p-*); for FILE in $FILES; do ... would be considered a no-go.)
- Then, we extract the numerals between p- and _n needed to generate the first level of your directory structure using awk (as you suspected, with regular expressions), and likewise the numerals between n- and _a for the second level. The idea is to use the match function, which not only looks for the place where the specified regular expression occurs in your input, but also stores the text matched by each group enclosed in round brackets ( ... ) in the array "fields". Note that this array argument is a GNU awk (gawk) extension.
- Third, we check if the directories for the first and second level of your intended directory structure already exist. If not, we create them.
- Last, we move the file to the target directory.
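To see why the existence check matters, here is a small sketch of what the loop actually receives in an empty directory. The helper name inspect_loop and the scratch directory are made up for illustration, and the sketch assumes the shell's default globbing behaviour (no nullglob or failglob set). It uses [ ... ] instead of [[ ... ]] only so that it also runs in a plain POSIX sh.

```shell
# Hypothetical helper: report what the loop sees in directory $1.
inspect_loop() {
    (
        # Subshell, so the cd does not affect the caller.
        cd "$1" || exit 1
        for FILE in p-*
        do
            if [ ! -f "$FILE" ]; then
                # With no match, FILE is the literal pattern string.
                echo "skipped: $FILE"
                continue
            fi
            echo "file: $FILE"
        done
    )
}

scratch="$(mktemp -d)"                 # empty scratch directory
inspect_loop "$scratch"                # prints: skipped: p-*
touch "$scratch/p-1_n-2_a-3.pdf"
inspect_loop "$scratch"                # prints: file: p-1_n-2_a-3.pdf
rm -r "$scratch"
```

In bash, setting shopt -s nullglob before the loop would be another way to get the same effect, since an unmatched pattern then expands to nothing and the loop body never runs.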
This approach can be simplified in a few ways. For one, since the directory names are actually p-<number> and n-<number>, the same as in your filename, we could have let awk do the work of extracting these strings for us, too, by writing
match($1,"(^p-[[:digit:]]+)_(n-[[:digit:]]+)_[[:print:]]*",fields)
We can further offload work to awk by having it generate the directory/subdirectory path at the same time with a suitable argument to print:
awk '{match($1,"(^p-[[:digit:]]+)_(n-[[:digit:]]+)_[[:print:]]*",fields); print fields[1] "/" fields[2]}'
would readily yield (e.g.) p-12345/n-384 for file p-12345_n-384_a-583.pdf. If we combine that with the usage of mkdir -p as indicated by @wurtel, the script could look like
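for FILE in p-*
do
    # Skip the unexpanded pattern (no matches) and anything that is not a regular file
    if [[ ! -f $FILE ]]; then continue; fi
    # Three-argument match() needs GNU awk; fields[1] and fields[2] hold the two groups
    TARGET="$(awk '{match($1,"(^p-[[:digit:]]+)_(n-[[:digit:]]+)_[[:print:]]*",fields); print fields[1] "/" fields[2]}' <<< "$FILE")"
    # If the name does not match, both groups are empty and TARGET is just "/"
    if [[ $TARGET == "/" ]]; then continue; fi
    echo "move $FILE to $TARGET"
    mkdir -p "$TARGET"
    mv "$FILE" "$TARGET"
done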
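Since the array form of match needs GNU awk, here is a rough sketch of how the same target path could be derived with plain POSIX parameter expansion instead. This is a different technique from the awk call above, not a drop-in equivalent: the function name target_for is made up for illustration, and the case pattern is only a loose shape check (it does not insist that everything between p- and _ is a digit).

```shell
# Hypothetical helper (not part of the original script): print the target
# directory for a file name of the form p-<number>_n-<number>_<rest>.
target_for() {
    name=$1
    case $name in
        p-[0-9]*_n-[0-9]*_*) ;;   # loose shape check, not a full validation
        *) return 1 ;;            # anything else is rejected
    esac
    first=${name%%_*}     # everything before the first "_"
    rest=${name#*_}       # everything after the first "_"
    second=${rest%%_*}    # everything before the next "_"
    printf '%s/%s\n' "$first" "$second"
}

target_for "p-12345_n-384_a-583.pdf"   # prints p-12345/n-384
```

Inside the loop, this could stand in for the awk call as TARGET="$(target_for "$FILE")" || continue, which skips any name that does not have the expected shape.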