Hottest 'file-comparison' Answers

65 votes

Two files comparison in bash script?

To just test whether two files are the same, use cmp -s: #!/bin/bash file1="/home/vekomy/santhosh/bigfiles.txt" file2="/home/vekomy/santhosh/bigfile2.txt" if cmp -s "$file1" "$file2"; then ...

Kusalananda♦

356k

answered Oct 12, 2017 at 10:07
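The cmp -s test in the answer above can be wrapped in small helper functions. This is only a minimal sketch: the function names files_identical and report_files are hypothetical and do not come from the answer. Note that cmp -s also returns a non-zero status when a file cannot be read, so the "differ" branch covers errors too.

```sh
#!/bin/sh
# files_identical FILE1 FILE2
# Succeeds (exit status 0) only when both files have identical contents.
# cmp -s prints nothing; the result is carried by the exit status alone.
files_identical() {
    cmp -s -- "$1" "$2"
}

# report_files FILE1 FILE2
# Prints a one-line verdict based on the exit status of cmp -s.
report_files() {
    if files_identical "$1" "$2"; then
        printf '%s and %s are identical\n' "$1" "$2"
    else
        printf '%s and %s differ\n' "$1" "$2"
    fi
}
```

A call such as report_files "$file1" "$file2" would then print one of the two messages.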

62 votes

rsync compare directories?

Surprisingly no answer in 6 years uses the -i option or gives nice output so here I'll go: TLDR - Just show me the commands rsync -rin --ignore-existing "$LEFT_DIR"/ "$RIGHT_DIR"/|sed -e 's/^[^ ]* /...

ndemou

3,029

answered Aug 17, 2018 at 15:51

29 votes

Two files comparison in bash script?

The easiest way is to use the diff command. Example: suppose the first file is file1.txt and it contains: I need to buy apples. I need to run the laundry. I need to wash the dog. I need to ...

Kingofkech

1,068

answered Oct 12, 2017 at 10:12

16 votes

Accepted

I want to compare values of two files, but not based on position or sequence

Compare the sorted files. In bash (or ksh or zsh), with a process substitution: diff <(sort File1.txt) <(sort File2.txt) In plain sh: sort File1.txt >File1.txt.sorted sort File1.txt >...

Gilles 'SO- stop being evil'

865k

answered Jun 25, 2020 at 19:05
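For shells without process substitution, the "sort, then diff" approach from the answer above might look like the following sketch. The function name same_lines and the use of mktemp for the temporary directory are assumptions made for illustration; the answer itself writes the sorted copies next to the originals.

```sh
#!/bin/sh
# same_lines FILE1 FILE2
# Sorts both files into temporary copies and diffs the copies, so that
# only the set of lines matters, not their order.
# Exit status: 0 if the sorted contents match, non-zero otherwise.
same_lines() {
    sl_tmp=$(mktemp -d) || return 2
    sort -- "$1" > "$sl_tmp/a" &&
        sort -- "$2" > "$sl_tmp/b" &&
        diff "$sl_tmp/a" "$sl_tmp/b"
    sl_status=$?
    rm -rf -- "$sl_tmp"
    return "$sl_status"
}
```

Because plain sort keeps duplicates, a line that occurs twice in one file and once in the other is still reported as a difference.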

12 votes

Accepted

Is there a tool or script that can very quickly find duplicates by only comparing filesize and a small fraction of the file contents?

czkawka is an open source tool which was created to find duplicate files (and images, videos or music) and present them through command-line or graphical interfaces, with an emphasis on speed. This ...

A.L

2,000

answered Jul 12, 2022 at 23:21

11 votes

Is there a tool or script that can very quickly find duplicates by only comparing filesize and a small fraction of the file contents?

You'd probably want to make sure you do a full compare (or hash) on the first and last 1MiB or so, where metadata can live that might be edited without introducing offsets to the compressed data. ...

Peter Cordes

6,690

answered Jul 13, 2022 at 3:48

10 votes

Is there a tool or script that can very quickly find duplicates by only comparing filesize and a small fraction of the file contents?

Does GNU cmp help you? You can use the -s option to suppress output and only use the return value. It checks the file size first to skip any comparison on different file size. With options -i (skip ...

Philippos

13.7k

answered Jul 12, 2022 at 12:45

9 votes

Accepted

ImageMagick compare without generating diff image

I was struggling with the same issue right now and found the answer: Yes! TL; DR: You specify NULL: as the filename for the diff, i.e. compare -metric rmse foo.png bar.png NULL: ImageMagick's ...

z-nexx

106

answered Feb 14, 2021 at 0:41

9 votes

Accepted

Compare the first 20 lines of two files

Using a shell with process substitutions (<(...)), e.g. bash or zsh: diff <( head -n 20 file1 ) <( head -n 20 file2 ) This runs head -n 20 on each file to get the first 20 lines of each, in ...

Kusalananda♦

356k

answered Mar 13, 2021 at 11:04
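Where process substitution is not available, the same comparison can be approximated with one temporary file and diff reading the other side from standard input. This is a sketch under that assumption; the function name diff_head and its parameterised line count are hypothetical and not part of the answer above.

```sh
#!/bin/sh
# diff_head N FILE1 FILE2
# Compares the first N lines of FILE1 with the first N lines of FILE2.
# Exit status is that of diff: 0 if the heads match, 1 if they differ.
diff_head() {
    dh_tmp=$(mktemp) || return 2
    # Store the head of the second file; the head of the first file is
    # fed to diff through a pipe ("-" means standard input).
    head -n "$1" -- "$3" > "$dh_tmp"
    head -n "$1" -- "$2" | diff - "$dh_tmp"
    dh_status=$?
    rm -f -- "$dh_tmp"
    return "$dh_status"
}
```

For the question's case this would be called as diff_head 20 file1 file2.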

8 votes

Compare directories but not content of files

I've just discovered tree. tree old_dir/ > tree_old tree new_dir/ > tree_new vimdiff tree_old tree_new

Valentas

369

answered Dec 18, 2020 at 8:45

8 votes

Diff only words in files

diff -w ignores all horizontal whitespace changes, which takes care of indentation but doesn't help if lines have been wrapped to a different width or if lines have been wrapped after text changes. ...

Gilles 'SO- stop being evil'

865k

answered May 29, 2020 at 16:14

6 votes

"cmp -s file1 file2" doesn't produce any output

-s is for silent, it's to tell cmp not to output anything¹ but only to reflect whether the files are identical or not in its exit status so that it can be used for instance in an if shell statement: ...

Stéphane Chazelas

585k

answered Aug 10, 2018 at 17:28

6 votes

I want to compare values of two files, but not based on position or sequence

Sort the files first (in bash): diff <(sort file1) <(sort file2)

Hauke Laging

94.6k

answered Jun 25, 2020 at 19:02

5 votes

Compare files from a list

There are several tools that do much of this work for you, for example: fdupes jdupes rdfind duff A few years ago I posted comparison runs of fdupes and rdfind at http://www.linuxforums.org/forum/...

drl

848

answered Jan 26, 2018 at 13:43

5 votes

How to know if a text file is a subset of another

With perl: if perl -0777 -e '$n = <>; $h = <>; exit(index($h,$n)<0)' needle.txt haystack.txt then echo needle.txt is found in haystack.txt fi -0octal defines the record delimiter. When ...

Stéphane Chazelas

585k

answered Nov 21, 2017 at 15:43

5 votes

Accepted

Extract the indexes of rows that are swapped in order between two files

This is one of those rare occasions when I'd probably use getline due to the size of your input files so we only save a handful of lines in memory at a time instead of >10G: $ cat tst.awk BEGIN { ...

Ed Morton

35.9k

answered May 30, 2022 at 12:53

5 votes

Is there a tool or script that can very quickly find duplicates by only comparing filesize and a small fraction of the file contents?

Shellscript implementation of the OP's, @vume's, idea. Background with the example rsync: Have a look at rsync. It has several levels of checking if files are identical. The manual man rsync is very ...

sudodus

6,686

answered Jul 12, 2022 at 12:38

5 votes

Is there a tool or script that can very quickly find duplicates by only comparing filesize and a small fraction of the file contents?

There is a tool called imosum that works similarly to e.g. sha256sum, but it only uses three 16 kB blocks. The samples are taken from the beginning, middle and end of the file, and file size is included in ...

jpa

1,562

answered Jul 13, 2022 at 17:35

4 votes

Converting number format and comparing the file

With the numfmt utility from GNU Coreutils: numfmt --delimiter='|' --field=2-3 --format='%.2f' < file 1|2.30|2.30|34 1|0.00|0.00|34 1|0.00|0.00|34 1|11.00|11.00|34 1|0.31|0.31|34 1|0.00|0.00|34 1|...

steeldriver

83.8k

answered Apr 24, 2018 at 19:18

4 votes

Accepted

Compare text files skipping N symbols from each line

Using cut: diff <(cut -c 20- file1) <(cut -c 20- file2) Note: with GNU cut the -c character option actually works on bytes not characters, but this should be fine as long as your output starts ...

jesse_b

41.6k

answered May 31, 2018 at 21:25

4 votes

How to know if a text file is a subset of another

From http://www.catonmat.net/blog/set-operations-in-unix-shell/: Comm compares two sorted files line by line. It may be run in such a way that it outputs lines that appear only in the first ...

alecbz

176

answered Jan 24, 2018 at 15:56
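Reading "subset" line-wise (every line of one file also occurs somewhere in the other, in any order), the comm approach quoted above could be sketched as follows. The function name is_line_subset is hypothetical, and this interpretation is an assumption: it does not test whether one file appears as a contiguous block inside the other.

```sh
#!/bin/sh
# is_line_subset NEEDLE HAYSTACK
# Succeeds when every distinct line of NEEDLE also occurs in HAYSTACK.
# comm needs sorted input; -23 keeps only lines unique to its first input.
is_line_subset() {
    ls_tmp=$(mktemp) || return 2
    sort -u -- "$2" > "$ls_tmp"
    # grep -q '^' succeeds as soon as comm emits any line at all,
    # i.e. as soon as NEEDLE has a line that HAYSTACK lacks.
    if sort -u -- "$1" | comm -23 - "$ls_tmp" | grep -q '^'; then
        ls_status=1
    else
        ls_status=0
    fi
    rm -f -- "$ls_tmp"
    return "$ls_status"
}
```

It fits naturally into a conditional, for example: if is_line_subset needle.txt haystack.txt; then echo subset; fi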

4 votes

Accepted

Compare 2 files based on the first column and print the not matched

You can use grep for this: $ grep -vwf <(cut -d, -f1 file1) file2 test4 Explanation grep options: -v, --invert-match Invert the sense of matching, to select non-matching lines. -w, --word-...

terdon♦

252k

answered Feb 13, 2020 at 17:17
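The grep command in the answer above relies on process substitution. A variant for plain sh might store the keys in a temporary file first, as in this sketch. The function name unmatched_keys is hypothetical, and the extra -F option (treat keys as fixed strings rather than regular expressions) is an addition not present in the answer; -w is widely supported but is not required by POSIX.

```sh
#!/bin/sh
# unmatched_keys FILE1 FILE2
# Prints the lines of FILE2 that contain none of the first
# comma-separated fields of FILE1 as a whole word.
unmatched_keys() {
    uk_tmp=$(mktemp) || return 2
    # Collect the keys: first column of FILE1.
    cut -d, -f1 -- "$1" > "$uk_tmp"
    # -v invert match, -w whole words, -F fixed strings, -f pattern file.
    grep -vwFf "$uk_tmp" -- "$2"
    uk_status=$?
    rm -f -- "$uk_tmp"
    return "$uk_status"
}
```

As with grep itself, the exit status is 1 when no line of the second file is left unmatched.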

4 votes

I want to compare values of two files, but not based on position or sequence

Using awk, you can make a hash index of every distinct input line text, using a command like: awk 'The magic' Q=A fileA Q=B fileB Q=C fileC ... 'The magic' per input line is: { X[$0] = X[$0] Q; } ...

Paul_Pedant

9,414

answered Jun 25, 2020 at 22:56
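The per-line action quoted in the awk answer above can be completed in many ways; the excerpt does not show how the answer itself finishes. One possible completion, restricted to two files, is sketched below. The function name lines_in_one_only, the END block, and the variable names are assumptions. It also assumes no line repeats within a file, since a repeated line would get a tag such as "AA" and be reported.

```sh
#!/bin/sh
# lines_in_one_only FILE_A FILE_B
# Tags every distinct line with the files it was seen in ("A", "B" or "AB")
# and prints the lines that did not occur in both, as TAG<TAB>LINE.
lines_in_one_only() {
    awk '
        # Append the current tag to the entry for this line of text.
        { seen[$0] = seen[$0] tag }
        END {
            for (line in seen)
                if (seen[line] != "AB")
                    print seen[line] "\t" line
        }
    ' tag=A "$1" tag=B "$2" | sort   # awk array order is unspecified
}
```

File names containing "=" would be mistaken for variable assignments by awk, so such names would need a ./ prefix.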

4 votes

Accepted

Comparison of N identical continuous characters from a set of two files with sequences

$ cat tst.awk BEGIN { wid = 30 } sub(/^>/,"") { hdr=$1; next } NR == FNR { a[hdr]=$0; next } { for ( hdrA in a ) { strA = a[hdrA] lgthA = length(strA) for ( ...

Ed Morton

35.9k

answered Dec 12, 2020 at 4:03

4 votes

Awk- Compare Numbers from Two Files and write Differences in New File

Using any POSIX awk: $ cat tst.awk BEGIN { FS = "[]=[]+" f1 = ARGV[1] f2 = ARGV[2] } { gsub(/[[:space:]]+/,"") gsub(/,/,"& ") key = $1 "[ ...

Ed Morton

35.9k

answered Dec 18, 2023 at 11:47

3 votes

rsync compare directories?

It took me a few tries to get this to work. Nils' answer requires that $TARGET ends in a trailing /, as explained by ジョージ. Here is a version that explicitly adds the trailing /: rsync -avun --delete ...

Orafu

133

answered May 15, 2018 at 6:14

3 votes

Compare directories but not content of files

If you only need to know whether files in two file system branches differ (without looking inside the files), you can do something like this: find /opt/branch1 -type f | sort | xargs -i md5sum {} >/tmp/...

Chaky

31

answered Dec 13, 2018 at 14:33
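The general idea in the answer above (checksum every file in each tree, then compare the two listings) might be sketched as follows. This is not the answer's own script, whose remainder is not shown. It swaps in POSIX cksum for md5sum and find -exec for xargs, and the function names tree_checksums and compare_trees are hypothetical. Changing into each directory first keeps the paths relative, so the two listings are directly comparable.

```sh
#!/bin/sh
# tree_checksums DIR
# Prints "checksum size ./relative/path" for every regular file under DIR,
# sorted by path.
tree_checksums() {
    ( cd -- "$1" && find . -type f -exec cksum {} + | sort -k 3 )
}

# compare_trees DIR1 DIR2
# Diffs the checksum listings of both trees.
# Exit status is that of diff: 0 if the listings match, 1 if they differ.
compare_trees() {
    ct_tmp=$(mktemp -d) || return 2
    tree_checksums "$1" > "$ct_tmp/left"
    tree_checksums "$2" > "$ct_tmp/right"
    diff "$ct_tmp/left" "$ct_tmp/right"
    ct_status=$?
    rm -rf -- "$ct_tmp"
    return "$ct_status"
}
```

This reads every file in full, so it can be slow on large trees, and file names containing newlines would break the line-based listing.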

3 votes

Compare files from a list

You could do: find foo* -name 'bar*Test.groovy' -type f -exec cksum {} + | sort (assuming file paths don't contain newline characters) which would give you a checksum (and size) for each file, ...

Stéphane Chazelas

585k

answered Jan 26, 2018 at 13:06

3 votes

Compare files from a list

Use the return value of diff file1 file2 >/dev/null, as it returns zero when the files are the same and nonzero when they differ. Compare the files in two nested for loops. Something like: for file1 in $...

Adam Trhon

1,633

answered Jan 26, 2018 at 9:53

3 votes

Compare two logs line by line and show differences and if the order of words from a line are not the same

Not exactly the format you're asking for, but wdiff is probably your best bet: $ wdiff f1.txt f2.txt She has [-132-] {+123+} apples George [-is-] 18 years {+is+} old {+Florin it's leaving+} Michael it'...

Satō Katsura

13.7k

answered Oct 5, 2017 at 9:20
