Stéphane Chazelas
$ curl -sI https://google.com | sed -n '/content-length/l'
content-length: 220\r$

See the carriage-return (aka CR, \r, ^M) at the end of the line ($ is sed's way to represent the end of the line). HTTP headers are delimited with CRLF, while the Unix line delimiter is LF.
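To see the effect without hitting the network, here is a small sketch that uses a canned header line in place of real curl output (the header variable and its contents are assumptions made up for the demo). It shows the extra character and one way to drop it:

```bash
# Canned header line standing in for what curl -sI prints (demo assumption).
header=$(printf 'content-length: 220\r')
cr=$(printf '\r')

value=${header#*: }      # everything after "content-length: "
echo "${#value}"         # 4: the three digits plus the trailing CR

clean=${value%"$cr"}     # strip one trailing CR, if there is one
echo "${#clean}"         # 3
```

Stripping the CR fixes the parsing problem only; it does nothing about hostile values.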

Also, using unsanitised data in arithmetic expressions in bash and other Korn-like shells is a command injection vulnerability. That is all the more of a problem here because you used the -k (aka --insecure) option, which allows MitM attackers to inject arbitrary headers into responses.
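As a harmless illustration of that injection, assuming bash is installed, the sketch below feeds a made-up hostile value (the hostile variable is hypothetical, and a real attacker would use something nastier than echo) to an arithmetic expansion:

```bash
# Hypothetical hostile header value, as a MitM attacker might send it.
hostile='a[$(echo INJECTED >&2)0]'

# Run in a child bash so the demo behaves the same whatever shell you paste
# it into. The arithmetic expansion evaluates the *contents* of remote_size
# as an expression, which expands the array subscript and runs the embedded
# command substitution.
injected_output=$(
  remote_size=$hostile bash -c ': "$((220 - remote_size))"' 2>&1
)
printf 'the embedded command ran and printed: %s\n' "$injected_output"
```

Nothing in the script asks for the value to be executed; the arithmetic evaluation alone is enough.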

On a GNU system, you can use:

local_size=$(stat -Lc %s -- "$dest/$file") || die
remote_size=$(curl -sI -- "$url" | LC_ALL=C grep -Piom1 '^content-length:\s*\K\d+') ||
  die "No content-length"
case $((local_size - remote_size)) in
  (0) echo same;;
  (-*) echo remote bigger;;
  (*) echo local bigger;;
esac

By only returning what \d+ matches in the C locale, we make sure remote_size only contains decimal ASCII digits, removing the ACE (arbitrary command execution) vulnerability.
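You can check what that grep invocation extracts without a server. The sketch below assumes GNU grep built with PCRE support and uses a canned response I made up for the demo, with CRLF delimiters and a duplicate header:

```bash
# Canned HEAD response standing in for curl -sI output (demo assumption).
remote_size=$(
  printf 'HTTP/1.1 200 OK\r\nContent-Length: 220\r\nContent-Length: 999\r\n\r\n' |
    LC_ALL=C grep -Piom1 '^content-length:\s*\K\d+'
)
# Only the digits of the first matching header are kept: no CR, no name.
printf 'remote_size=%s (%d characters)\n' "$remote_size" "${#remote_size}"
```

-m1 stops at the first matching line, and \K discards everything matched before the digits, so the CR never makes it into the variable.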

die above could be:

die() {
  [ "$#" -eq 0 ] || printf>&2 '%s\n' "$@"
  exit 1
}

(adapt to whatever logging mechanism you want to use).

Also note that, though it is very unlikely in practice, headers can be folded. For instance, the content-length header could be returned as:

Content-Length:<CR>
 123456<CR>
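To make the folding rule concrete, here is a rough sketch of unfolding done by hand: a line starting with a space or tab continues the previous header. The canned response and the variable names are assumptions for the demo, and this is not meant as a replacement for a proper header parser:

```bash
cr=$(printf '\r')
tab=$(printf '\t')

unfolded=$(
  # Canned response with a folded Content-Length (demo assumption).
  printf 'HTTP/1.1 200 OK\r\nContent-Length:\r\n 123456\r\nServer: demo\r\n\r\n' |
    while IFS= read -r line; do
      line=${line%"$cr"}                            # drop the CR of the CRLF delimiter
      case $line in
        (" "* | "$tab"*) printf '%s' "$line";;      # continuation: glue to previous line
        (*) printf '\n%s' "$line";;                 # new header: start a new line
      esac
    done
)
printf '%s\n' "$unfolded"
```

After unfolding, the header reads Content-Length: 123456 on a single line, which a line-based tool like grep can then handle.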

One way to extract the header value is to use formail, a tool designed to work with RFC 822 headers:

remote_size=$(curl... | formail -zcx content-length -U content-length)

With -U content-length, if there's more than one Content-Length header, it's the last one that is returned. Change -U to -u to return the first one instead, as with grep -m1 above.

You'll still want to sanitise the result, or use the -lt/-eq/-gt operators of [ (not those of [[...]]!) instead of ((...)), to avoid the ACE vulnerabilities.
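A minimal sketch of both precautions combined could look like this. is_decimal and compare_sizes are names I made up for the example; adapt them to your script:

```bash
# Succeeds only if $1 is a non-empty string of ASCII decimal digits.
is_decimal() {
  case $1 in
    ('' | *[!0123456789]*) return 1;;
  esac
}

# compare_sizes LOCAL REMOTE: prints the verdict, returns 2 on bad input.
compare_sizes() {
  is_decimal "$1" && is_decimal "$2" || return 2
  # [ treats its operands as plain integers, not as arithmetic expressions.
  if [ "$1" -eq "$2" ]; then echo same
  elif [ "$1" -lt "$2" ]; then echo 'remote bigger'
  else echo 'local bigger'
  fi
}

compare_sizes 220 220                       # prints: same
compare_sizes 220 'a[$(echo x)0]' ||
  echo 'rejected: not a decimal number' >&2
```

The explicit 0123456789 list is there because ranges like 0-9 can match more than the ten ASCII digits in some locales.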

With curl 7.84.0 or newer, you can also get curl to give you the value of that header directly with:

remote_size=$(curl -w '%header{content-length}' -sIo /dev/null -- "$url") || die

Through testing, I find that

  • if there are several occurrences of the header, it returns the first one only
  • it complains if the value doesn't start with a digit (optionally preceded by a +), but the result still needs to be sanitised, as whatever characters follow are passed along.
  • it does support folded headers, but rejects a content-length whose first line has an empty value.