One program created lots of nested sub-folders. I tried using the command
rm -fr * to remove them all, but it's very slow. I'm wondering if there is a faster way to delete them all?
- The fastest method is described here: perl is by far the fastest for files, then rm to get the empty directories. unix.stackexchange.com/questions/37329/… – SDsolar, Aug 17, 2017
4 Answers
The fastest way to remove them from that directory is to move them out of there; after that, just remove them in the background:
mkdir ../.tmp_to_remove
mv -- * ../.tmp_to_remove
rm -rf ../.tmp_to_remove &
This assumes that your current directory is not the toplevel of some mounted partition (i.e. that ../.tmp_to_remove is on the same filesystem).
The -- after mv (as edited in by Stéphane) is necessary if you have any file/directory names starting with a -.
The above removes the files from your current directory in a fraction of a second, as it doesn't have to recursively handle the subdirectories. The actual removal of the tree from the filesystem takes longer, but since it is out of the way, its actual efficiency shouldn't matter that much.
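If you do this often, the pattern can be wrapped in a small helper. This is only a sketch of the same idea: the fast_empty name is made up, mktemp -d is used so the sibling directory gets a unique name, and, like the commands above, it skips hidden (dot) files:

fast_empty() {
    # Move everything out of the current directory into a uniquely named
    # sibling directory. ../ must be on the same filesystem, so the mv is
    # just a cheap rename of directory entries.
    tmp=$(mktemp -d ../.to_remove.XXXXXX) || return
    mv -- * "$tmp"
    # The expensive recursive unlinking now happens out of the way.
    rm -rf "$tmp" &
}

Run it from inside the directory you want to empty (cd somedir && fast_empty).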
- @ndemou, never say .*: it matches .. and bad things can happen (though in this case only an error message). – Jasen, Nov 6, 2017
- The second command won't match any dot files. To really delete all files including dot files, use this command: mv -- * .??* .[!.] ../.tmp_to_remove. Note that * matches all non-dot file names, .??* matches all dot files with a name of 3 or more characters, and .[!.] matches all dot files with a name of exactly 2 characters except ".." -- Thanks to @Jasen for leading me down this path (the result is not pretty but is correct). – ndemou, Nov 6, 2017
- This won't help if you are out of space and trying to free up some space by deleting a folder that has a significant number of files. The disk and inode usage would remain the same with this approach. – WaughWaugh, Apr 3, 2019
- @SayanBose Are you claiming that the third command (rm -rf ...) will not work when you are out of space? In my experience that is untrue: even if the disk is at 100%, the above steps allow you to clean out a directory almost immediately (getting free space on your disk of course requires more time). – Anthon, Apr 3, 2019
- @Anthon, no, rm -rf will work when you are out of disk space. Your mkdir won't work if you are out of disk space or inodes. – WaughWaugh, Apr 3, 2019
rsync is surprisingly fast and simple. You need an empty directory to sync from; mktemp -d creates one here:
rsync -a --delete "$(mktemp -d)"/ yourdirectory/
Note the trailing slash on the source: rsync should copy the empty directory's contents, not the directory itself. yourdirectory/ is the directory from which you want to remove the files.
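If you prefer to keep a handle on the temporary directory and clean it up afterwards, an equivalent sketch (the empty variable name is arbitrary):

# create a throwaway empty directory, sync its (empty) contents over the
# target so everything in the target gets deleted, then remove the helper dir
empty=$(mktemp -d)
rsync -a --delete "$empty"/ yourdirectory/
rmdir "$empty"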
- Interesting use of rsync. Is it faster than rm? – pfnuesel, Apr 18, 2016
- @pfnuesel: Yes, see this answer serverfault.com/a/328305/105902. – Rahul, Apr 18, 2016
- I've had to copy thousands of files from one drive to another. Using cp, it crashed the server, eating up all memory. rsync did the trick without a problem, although I kept htop open in a separate session to kill it when needed. So rsync can be a very useful tool. – SPRBRN, Apr 18, 2016
The fastest is rm -rf dirname. I used a snapshotted mountpoint of an ext3 filesystem on RedHat 6.4 with 140520 files and 9699 directories. If rm -rf * is slow, it might be because your top-level directory entry has lots of files, and the shell is busy expanding *, which requires an additional readdir and sort. Go up a directory and do rm -rf dirname/.
Method                          Real time   Sys time   Variance (+/-)
find dir -delete                0m8.108s    0m3.668s   0.055s
rm -rf dir                      0m7.956s    0m3.640s   0.081s
rsync -a --delete empty/ dir/   0m8.305s    0m3.918s   0.029s
Notes:
- rsync version : 3.0.6
- rm/coreutils version: 8.4-19
- find/findutils version: 4.4.2-6
- Confirmed. I have 1 million files to delete, spread over 3000 directories that contain thousands of subdirectories. Using the find method I was able to delete one directory per minute; using rm -rf dirname I am able to delete one directory every 2 seconds. I am using this bash command: for d in */; do rm -rf "$d"; done. Thanks. – Duck, Jun 29, 2018
One problem with rm -rf *, or its more correct equivalent rm -rf -- *, is that the shell first has to list all the (non-hidden) files in the current directory, sort them and pass them to rm. If the list of files in the current directory is big, that adds some unnecessary extra overhead, and it could even fail if the list of files is too big.
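As a rough illustration of that failure mode (the exact message depends on your shell and system), the expanded argument list can exceed the kernel's ARG_MAX limit, in which case rm never even starts:

getconf ARG_MAX   # the limit, in bytes, on the arguments passed to one exec()
rm -rf -- *       # with enough directory entries this can fail with
                  # "Argument list too long" (E2BIG) instead of deleting anything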
Normally, you'd do rm -rf . instead (which would also have the benefit of deleting hidden files). But most rm implementations, including all POSIX conformant ones, will refuse to do that. The reason is that some shells (including all POSIX ones) have the misfeature that the expansion of the .* glob includes . and .., which would mean that rm -rf .* would delete the current and parent directory, so rm has been modified to work around that misfeature of those shells.
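For example, with GNU rm (the exact wording may differ between implementations):

rm -rf .
# rm: refusing to remove '.' or '..' directory: skipping '.'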
Some shells like pdksh (and other Forsyth shell derivatives), zsh or fish don't have that misfeature¹. zsh has a rm builtin, enabled with zmodload zsh/files, that works OK with rm -rf . because zsh's .* includes neither . nor .., so in zsh you can do:
zmodload zsh/files
rm -rf .
On Linux, you can do:
rm -rf /proc/self/cwd/
to empty the current directory or:
rm -rf /dev/fd/3/ 3< some/dir
to empty an arbitrary directory.
(note the trailing /)
On GNU systems, you can do:
find . -delete
Now, if the current directory only has a few entries and the bulk of the files are in subdirs, that won't make a significant difference and rm -rf -- * will probably be the fastest you can get. It's expected for rm -rf (or anything that removes every file) to be expensive as it means reading the content of all directories and calling unlink() on every entry. unlink() itself can be quite expensive as it involves modifying the deleted file's inode, the directory containing the file, and some file system map or other of what areas are free.
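If you're curious, you can watch that cost on Linux with strace (a sketch; dir is a placeholder name):

# Summarize the system calls rm makes while deleting a test tree:
# openat/getdents64 read the directory contents, unlinkat removes each entry.
strace -f -c -e trace=openat,getdents64,unlinkat rm -rf dir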
rm and find (at least the GNU implementations) already sort the list of files by inode number in each directory which can make a huge difference in terms of performance on ext4 file systems as it reduces the number of changes to the underlying block devices when consecutive (or close to each other) inodes are modified in sequence.
rsync sorts the files by name which could drastically reduce performance unless the by-name order happens to match the by-inum order (like when the files have been created from a sorted list of file names).
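To compare the two orders for a given directory with GNU ls (dir is a placeholder):

ls -i dir | sort -n   # entries sorted by inode number (the order rm and find process them in)
ls -i dir             # the same entries in name order (the order rsync processes them in)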
One reason why rsync may be faster in some cases is that it doesn't appear to take the safety precautions that rm or find do to avoid race conditions that could cause it to descend into the wrong directory if a directory is replaced with a symlink while it's working.
To optimize a bit further:
If you know the maximum depth of your directory tree, you can pass it to find:
find . -maxdepth 3 -delete
That saves find having to try and read the content of the directories at depth 3.
¹ see also the globskipdots option in bash 5.2+
- "if the list of files is *too* big." – fduff, Apr 18, 2016
- In the last paragraph you talk about rm -rf being an expensive operation as it calls unlink() on every entry, but is that not what find . -delete would do too? – fduff, Apr 18, 2016
- @fduff, yes. Like I say (maybe not clearly), find -delete won't make much difference if there are few files in the current directory. The only difference would be avoiding creating, sorting and passing around that big list. – Stéphane Chazelas, Apr 18, 2016