
Edit2: This is what my current working copy looks like. There might be some bugs I still need to fix. It accepts two directories in the current path, finds identical path names, and removes the ones in the second directory.

#!/bin/bash
echo 'Chosen directories must reside in the current directory.'
echo 'This will find duplicate sub directories between the two and delete the ones in the second path.'
echo ''
read -r -p 'First directory to compare: ' DIR1
read -r -p 'Second directory to compare: ' DIR2

# Absolute path, so the list is still reachable after cd "$DIR2" below.
list="$PWD/list.txt"

# List directories relative to each root; the subshells leave the cwd unchanged.
(cd "$DIR1" && find . -type d) > "$list" || exit 1
(cd "$DIR2" && find . -type d) >> "$list" || exit 1

# Print every path seen twice, skipping "." and names ending in "db".
dups() { awk 'seen[$0]++' "$list" | grep -v 'db$' | grep -v '\.$'; }

echo 'Paths found:'
echo ''
dups
echo ''
# Double quotes so that ${DIR2} is expanded in the prompt.
read -r -p "Delete paths in ${DIR2}? (y/n) " bool
case "$bool" in
    y|Y)
        echo 'deleting:'
        dups
        cd "$DIR2" || exit 1
        # rmdir only removes empty directories; sort -r lists children before parents.
        dups | sort -r | while IFS= read -r dir; do rmdir -- "$dir"; done
        echo ''
        ;;
esac



Edit: Here's some example data to help show what I'm trying to accomplish. Below is a list of two sets of directories.

 
idx1
idx1/defaultdb
idx1/defaultdb/thaweddb
idx1/defaultdb/colddb
idx1/defaultdb/db
idx1/defaultdb/db/rb_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/rb_1541720372_1541194569_2_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1541720372_1541194569_2_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/rb_1558019538_1558019538_5_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1558019538_1558019538_5_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/db_1558019449_1558019418_3_9542F466-F8CA-49EB-8120-5409B813F147
idx1/defaultdb/db/db_1558019449_1558019418_3_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx1/defaultdb/db/rb_1558019389_1558018342_3_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1558019389_1558018342_3_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/rb_1558019898_1558019898_7_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1558019898_1558019898_7_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/db_1557947113_1557947083_0_9542F466-F8CA-49EB-8120-5409B813F147
idx1/defaultdb/db/db_1557947113_1557947083_0_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx1/defaultdb/db/rb_1549909440_1549908720_1_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1549909440_1549908720_1_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/test
idx1/defaultdb/db/rb_1558019813_1558019569_6_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1558019813_1558019569_6_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/rb_1558020652_1558020018_8_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1558020652_1558020018_8_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/db_1541720372_1541194569_2_9542F466-F8CA-49EB-8120-5409B813F147
idx1/defaultdb/db/db_1541720372_1541194569_2_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx1/defaultdb/db/GlobalMetaData
idx1/defaultdb/db/db_1558019873_1558019567_4_9542F466-F8CA-49EB-8120-5409B813F147
idx1/defaultdb/db/db_1558019873_1558019567_4_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx1/defaultdb/db/db_1558020619_1558019927_5_9542F466-F8CA-49EB-8120-5409B813F147
idx1/defaultdb/db/db_1558020619_1558019927_5_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx1/defaultdb/db/rb_1557960001_1557771284_0_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx1/defaultdb/db/rb_1557960001_1557771284_0_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx1/defaultdb/db/db_1558032446_1558018050_1_9542F466-F8CA-49EB-8120-5409B813F147
idx1/defaultdb/db/db_1558032446_1558018050_1_9542F466-F8CA-49EB-8120-5409B813F147/rawdata

idx2
idx2/defaultdb
idx2/defaultdb/thaweddb
idx2/defaultdb/colddb
idx2/defaultdb/db
idx2/defaultdb/db/db_1558019813_1558019569_6_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1558019813_1558019569_6_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/rb_1557947113_1557947083_0_9542F466-F8CA-49EB-8120-5409B813F147
idx2/defaultdb/db/rb_1557947113_1557947083_0_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx2/defaultdb/db/db_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/rb_1558019449_1558019418_3_9542F466-F8CA-49EB-8120-5409B813F147
idx2/defaultdb/db/rb_1558019449_1558019418_3_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx2/defaultdb/db/db_1558019898_1558019898_7_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1558019898_1558019898_7_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/db_1558019538_1558019538_5_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1558019538_1558019538_5_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/rb_1541720372_1541194569_2_9542F466-F8CA-49EB-8120-5409B813F147
idx2/defaultdb/db/rb_1541720372_1541194569_2_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx2/defaultdb/db/db_1541720372_1541194569_2_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1541720372_1541194569_2_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/test
idx2/defaultdb/db/rb_1558032446_1558018050_1_9542F466-F8CA-49EB-8120-5409B813F147
idx2/defaultdb/db/rb_1558032446_1558018050_1_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx2/defaultdb/db/db_1557960001_1557771284_0_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1557960001_1557771284_0_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/db_1558019389_1558018342_3_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1558019389_1558018342_3_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/GlobalMetaData
idx2/defaultdb/db/5_9542F466-F8CA-49EB-8120-5409B813F147
idx2/defaultdb/db/5_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
idx2/defaultdb/db/db_1549909440_1549908720_1_AB8C9371-027D-4FE0-B2F3-BAF93F106480
idx2/defaultdb/db/db_1549909440_1549908720_1_AB8C9371-027D-4FE0-B2F3-BAF93F106480/rawdata
idx2/defaultdb/db/rb_1558019873_1558019567_4_9542F466-F8CA-49EB-8120-5409B813F147
idx2/defaultdb/db/rb_1558019873_1558019567_4_9542F466-F8CA-49EB-8120-5409B813F147/rawdata
 

Say I have the following directory:

idx1/defaultdb/db/rb_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480

I want to check in the idx2 directory to see if defaultdb/db/rb_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480 exists in it, and if it does I want to print it.
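A minimal sketch of that single check, assuming the relative path is already known. The function name `print_if_in_other` is hypothetical and only used for illustration.

```shell
#!/bin/bash
# print_if_in_other REL ROOT2 (hypothetical helper):
# print ROOT2/REL if it is an existing directory, otherwise print nothing.
print_if_in_other() {
    if [ -d "$2/$1" ]; then
        printf '%s\n' "$2/$1"
    fi
}
```

For the example above it would be called as `print_if_in_other defaultdb/db/rb_1558019513_1558019454_4_AB8C9371-027D-4FE0-B2F3-BAF93F106480 idx2`. Going by the sample listing, that call should print nothing, because idx2 only holds a `db_` directory with that suffix.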

The ultimate goal is for every complete directory to be unique across all top-level directories. By "complete" I mean a directory that has no subdirectories: I don't want defaultdb showing up, but rather its children. The result should be a list of subdirectories that exist in both top-level directories. From there I will delete one of the duplicates.
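One possible way to sketch that goal, under a few assumptions: directory names contain no newlines, and "complete" means a directory with no subdirectories of its own. The helper names `leaf_dirs` and `common_leaf_dirs` are made up for this example.

```shell
#!/bin/bash
# leaf_dirs ROOT (hypothetical helper): print the directories under ROOT,
# relative to ROOT, that contain no subdirectories.
leaf_dirs() {
    (cd "$1" && find . -type d) | awk '
        {
            paths[$0] = 1
            parent = $0
            sub("/[^/]*$", "", parent)   # strip the last path component
            has_child[parent] = 1        # "." marks itself, so it is never printed
        }
        END { for (p in paths) if (!(p in has_child)) print p }
    ' | LC_ALL=C sort
}

# common_leaf_dirs ROOT1 ROOT2 (hypothetical helper): print the leaf
# directories whose relative path exists as a leaf under both roots.
# The body is a subshell, so the temporary variable stays local.
common_leaf_dirs() (
    tmp=$(mktemp) || exit 1
    leaf_dirs "$2" > "$tmp"
    # comm -12 keeps only the lines common to both sorted inputs.
    leaf_dirs "$1" | LC_ALL=C comm -12 - "$tmp"
    rm -f "$tmp"
)
```

Running `common_leaf_dirs idx1 idx2` from the directory that holds both roots prints one relative path per line. Both lists are sorted under `LC_ALL=C` so that `comm` sees them in the order it expects. Note that a directory containing only `rawdata` is not a leaf under this definition; its `rawdata` child is.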


How to find duplicate directory paths even if the contents are different?

I've searched high and low, but only one topic I've found (Find and list duplicate directories) actually deals with my situation, and its result isn't quite what I need.

I need a way to find identical complete directory paths under two different starting directories; I do not care about the contents at all.

The answer provided in that post is for a single directory and doesn't compare across two and, unfortunately, matches things like this:

/dir1/sub1/sub2/sub3
/dir1/sub1/sub2/sub3/sub4

However, these are not identical complete paths.

(path one) /dir1/sub1/sub2
(path two) /dir1/sub1/sub2

Additionally, these would not be considered matches because they are not complete paths, as they have sub directories (from above).

Unfortunately my Unix skills are not quite at the level where I can tear apart the provided answer to solve the partial-match problem.

From the linked topic:

ls -lR Top_Dir/ | grep -E $(ls -lR Top_Dir/ | grep ^d | rev | cut -d" " -f1 | rev | sort | uniq -d | head -c -1 | tr '\n' '|') | grep -v ^d | sed 's/://'