
I am trying to write a bash script for testing that takes a parameter and sends it through curl to a web site. I need to url-encode the value to make sure that special characters are processed properly. What is the best way to do this?

Here is my basic script so far:

#!/bin/bash
host=${1:?'bad host'}
value=$2
shift 2
curl -v -d "param=${value}" "http://${host}/somepath" "$@"

39 Answers


Use curl --data-urlencode; from man curl:

This posts data, similar to the other --data options with the exception that this performs URL-encoding. To be CGI-compliant, the <data> part should begin with a name followed by a separator and a content specification.

Example usage:

curl \
    --data-urlencode "paramName=value" \
    --data-urlencode "secondParam=value" \
    http://example.com

See the man page for more info.

This requires curl 7.18.0 or newer (released January 2008). Use curl -V to check which version you have.
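If a script has to cope with older systems, it can guard on the version before relying on the flag. A minimal sketch (the `version_ge` helper name is mine; `sort -V` is a GNU coreutils extension, and this assumes `curl -V` output begins with `curl <version>`):

```shell
#!/bin/sh
# Succeeds when version $1 >= version $2, comparing component-wise
# via GNU `sort -V`.
version_ge() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Hypothetical guard: only rely on --data-urlencode when curl is new enough.
if version_ge "$(curl -V 2>/dev/null | awk 'NR==1 {print $2}')" 7.18.0; then
    echo "curl supports --data-urlencode"
fi
```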

You can also use it to encode the query string:

curl --get \
    --data-urlencode "p1=value 1" \
    --data-urlencode "p2=value 2" \
    http://example.com
    # http://example.com?p1=value%201&p2=value%202

12 Comments

Seems to only work for http POST. Documentation here: curl.haxx.se/docs/manpage.html#--data-urlencode
@StanJames If you use it like so curl can also do the encoding for a GET request. curl -G --data-urlencode "blah=df ssdf sdf" --data-urlencode "blah2=dfsdf sdfsd " http://whatever.com/whatever
@kberg actually, this will only work for query data. curl will append a '?' followed by the urlencoded params. If you want to urlencode some url postfix (such as a CouchDB GET for some document id), then '--data-urlencode' won't work.
I want to URL encode the URL path (which is used as a parameter in a REST API endpoint). There are no query string parameters involved. How do I do this for a GET request?
@NadavB Escaping the "

Another option is to use jq:

$ printf %s 'input text'|jq -sRr @uri
input%20text
$ jq -rn --arg x 'input text' '$x|@uri'
input%20text

-r (--raw-output) outputs the raw contents of strings instead of JSON string literals. -n (--null-input) doesn't read input from STDIN.

-R (--raw-input) treats input lines as strings instead of parsing them as JSON, and -sR (--slurp --raw-input) reads the input into a single string. You can replace -sRr with -Rr if your input only contains a single line or if you don't want to replace linefeeds with %0A:

$ printf %s\\n multiple\ lines of\ text|jq -Rr @uri
multiple%20lines
of%20text
$ printf %s\\n multiple\ lines of\ text|jq -sRr @uri
multiple%20lines%0Aof%20text%0A
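The same filter works well wrapped in a function for reuse (my packaging; assumes `jq` is installed):

```shell
#!/bin/sh
# Percent-encode a string with jq's @uri filter.
urlencode() {
    printf %s "$1" | jq -sRr @uri
}

urlencode 'input text'   # prints input%20text
```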

Or this percent-encodes all bytes:

xxd -p|tr -d \\n|sed 's/../%&/g'
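The same pipeline as a function, fed via `printf` (my naming; assumes `xxd` is installed):

```shell
#!/bin/sh
# Percent-encode every byte of the argument, including "safe" ones.
# xxd -p dumps plain hex, tr removes xxd's line wrapping, and sed
# prefixes each hex pair with %.
encode_every_byte() {
    printf %s "$1" | xxd -p | tr -d '\n' | sed 's/../%&/g'
}

encode_every_byte 'a b'   # prints %61%20%62
```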

10 Comments

<3 it ... should be top & accepted IMO (yeah if you can tell curl to encode that works and if bash has a builtin that would have been acceptable - but jq seems like a right fit tho i'm far from attaining comfort level with this tool)
for anyone wondering the same thing as me: @uri is not some variable, but a literal jq filter used for formatting strings and escaping; see jq manual for details (sorry, no direct link, need to search for @uri on the page...)
A sample usage of jq to url-encode: printf "http://localhost:8082/" | jq -sRr '@uri'
Note, this is not suitable for binary data. jq can only operate on UTF-8 strings, so binary data which is not invalid UTF will be munged into valid UTF-8 before encoding. To test this, run: printf '\xAB\xCD\xEF' | jq -sRr @uri. There is an open PR which may fix this: github.com/stedolan/jq/pull/2314
This really saves the day, thanks. I'd suggest using Bash's <<< (here-string) rather than a cumbersome printf. It seems that the final newline does not bother jq: jq -rR @uri <<< 'foo bar' outputs foo%20bar

Here is the pure BASH answer.

Update: Since many changes have been discussed, I have placed this on https://github.com/sfinktah/bash/blob/master/rawurlencode.inc.sh for anybody to issue a PR against.

Note: This solution is not intended to encode unicode or multi-byte characters - which are quite outside BASH's humble native capabilities. It's only intended to encode symbols that would otherwise ruin argument passing in POST or GET requests, e.g. '&', '=' and so forth.

Very Important Note: DO NOT ATTEMPT TO WRITE YOUR OWN UNICODE CONVERSION FUNCTION, IN ANY LANGUAGE, EVER. See end of answer.

rawurlencode() {
  local string="${1}"
  local strlen=${#string}
  local encoded=""
  local pos c o

  for (( pos=0 ; pos<strlen ; pos++ )); do
     c=${string:$pos:1}
     case "$c" in
        [-_.~a-zA-Z0-9] ) o="${c}" ;;
        * )               printf -v o '%%%02x' "'$c"
     esac
     encoded+="${o}"
  done
  echo "${encoded}"    # You can either set a return variable (FASTER) 
  REPLY="${encoded}"   #+or echo the result (EASIER)... or both... :p
}

You can use it in two ways:

easier:  echo http://url/q?=$( rawurlencode "$args" )
faster:  rawurlencode "$args"; echo http://url/q?${REPLY}

[edited]

Here's the matching rawurldecode() function, which - with all modesty - is awesome.

# Returns a string in which the sequences with percent (%) signs followed by
# two hex digits have been replaced with literal characters.
rawurldecode() {

  # This is perhaps a risky gambit, but since all escape characters must be
  # encoded, we can replace %NN with \xNN and pass the lot to printf -b, which
  # will decode hex for us

  printf -v REPLY '%b' "${1//%/\\x}" # You can either set a return variable (FASTER)

  echo "${REPLY}"  #+or echo the result (EASIER)... or both... :p
}

With the matching set, we can now perform some simple tests:

$ diff rawurlencode.inc.sh \
        <( rawurldecode "$( rawurlencode "$( cat rawurlencode.inc.sh )" )" ) \
        && echo Matched

Output: Matched

And if you really really feel that you need an external tool (well, it will go a lot faster, and might do binary files and such...) I found this on my OpenWRT router...

replace_value=$(echo $replace_value | sed -f /usr/lib/ddns/url_escape.sed)

Where url_escape.sed was a file that contained these rules:

# sed url escaping
s:%:%25:g
s: :%20:g
s:<:%3C:g
s:>:%3E:g
s:#:%23:g
s:{:%7B:g
s:}:%7D:g
s:|:%7C:g
s:\\:%5C:g
s:\^:%5E:g
s:~:%7E:g
s:\[:%5B:g
s:\]:%5D:g
s:`:%60:g
s:;:%3B:g
s:/:%2F:g
s:?:%3F:g
s^:^%3A^g
s:@:%40:g
s:=:%3D:g
s:&:%26:g
s:\$:%24:g
s:\!:%21:g
s:\*:%2A:g

While it is not impossible to write such a script in BASH (probably using xxd and a very lengthy ruleset) capable of handling UTF-8 input, there are faster and more reliable ways. Attempting to decode UTF-8 into UTF-32 is a non-trivial task to do with accuracy, though very easy to do inaccurately, such that you think it works until the day it doesn't.

Even the Unicode Consortium removed their sample code after discovering it was no longer 100% compatible with the actual standard.

The Unicode standard is constantly evolving, and has become extremely nuanced. Any implementation you can whip together will not be properly compliant, and if by some extreme effort you managed it, it wouldn't stay compliant.

20 Comments

Unfortunately, this script fails on some characters, such as 'é' and '½', outputting 'e%FFFFFFFFFFFFFFCC' and '%FFFFFFFFFFFFFFC2', respectively (b/c of the per-character loop, I believe).
You may have missed the prominent "Note:" at the top of the answer.
In that first block of code what does the last parameter to printf mean? That is, why is it double-quote, single-quote, dollar-sign, letter-c, double-quote? Does does the single-quote do?
From the bash man page, the printf description: Arguments to non-string format specifiers are treated as C constants, except that a leading plus or minus sign is allowed, and if the leading character is a single or double quote, the value is the numeric value of the following character, using the current locale. Nice :-) I didn't notice that little trick
@ColinFraizer the single quote serves to convert the following character into its numeric value. ref. pubs.opengroup.org/onlinepubs/9699919799/utilities/…
@Matthematics, @dmcontador, @Orwellophile: I was wrong in my previous comment. The solution using xxd is better and works in any case (for any character). I have updated my script. Anyway, it looks like the rawurldecode() function works exceptionally well. :)
With proper UTF-8 encoding support: rawurlencode() { local LANG=C ; local IFS= ; while read -n1 -r -d "$(echo -n "\000")" c ; do case "$c" in [-_.~a-zA-Z0-9]) echo -n "$c" ;; *) printf '%%%02x' "'$c" ;; esac ; done }. Then: echo -n "Jogging «à l'Hèze»." | rawurlencode produces Jogging%20%c2%ab%c3%a0%20l%27H%c3%a8ze%c2%bb. as expected.

Use Perl's URI::Escape module and uri_escape function in the second line of your bash script:

...

value="$(perl -MURI::Escape -e 'print uri_escape($ARGV[0]);' "$2")"
...

Edit: Fix quoting problems, as suggested by Chris Johnsen in the comments. Thanks!

8 Comments

URI::Escape might not be installed, check my answer in that case.
I fixed this (use echo, pipe and <>), and now it works even when $2 contains an apostrophe or double-quotes. Thanks!
You do away with echo, too: value="$(perl -MURI::Escape -e 'print uri_escape($ARGV[0]);' "$2")"
Chris Johnsen's version is better. I had ${True} in my test expression and using this via echo tripped up uri_escape / Perl variable expansion.
@jrw32982 yeah, looking back at it, having another language with which to accomplish this task is good. If I could, I'd take back my downvote, but alas it is currently locked in.

Here is one variant; it may be ugly, but it's simple:

urlencode() {
    local data
    if [[ $# != 1 ]]; then
        echo "Usage: $0 string-to-urlencode"
        return 1
    fi
    data="$(curl -s -o /dev/null -w %{url_effective} --get --data-urlencode "$1" "")"
    if [[ $? != 3 ]]; then
        echo "Unexpected error" 1>&2
        return 2
    fi
    echo "${data##/?}"
    return 0
}

Here is the one-liner version for example (as suggested by Bruno):

# Oneliner updated for curl 7.88.1
date | { curl -Gs -w %{url_effective} --data-urlencode @- ./ ||:; } | sed "s/%0[aA]$//;s/^[^?]*?\(.*\)/\1/"

# Verification that it works on input without the trailing \n
printf "%s" "$(date)" | { curl -Gs -w %{url_effective} --data-urlencode @- ./ ||:; } | sed "s/%0[aA]$//;s/^[^?]*?\(.*\)/\1/"

# Explanation of what the oneliner is doing
date  `# 1. Generate sample input data ` \
  | \
    { `# groups a set of commands as a unit` \
      curl -Gs -w %{url_effective} --data-urlencode @- ./ `# 2. @- means read stdin` \
      ||: `# since the curl command exits 6, add "OR true"` \
    } \
  | sed \
    -e "s/%0[aA]$//"         `# strip trailing \n if present` \
    -e "s/^[^?]*?\(.*\)/\1/" `# strip leading chars up to and including 1st ?`

14 Comments

This is absolutely brilliant! I really wish you had left it a one line so that people can see how simple it really is. To URL encode the result of the date command… date | curl -Gso /dev/null -w %{url_effective} --data-urlencode @- "" | cut -c 3- (You have to cut the first 2 chars off, because curl's output is technically a relative URL with a query string.)
@BrunoBronosky Your one-liner variant is good but seemingly adds a "%0A" to the end of the encoding. Users beware. The function version does not seem to have this issue.
To avoid %0A at the end, use printf instead of echo.
the one liner is fantastic
In curl 7.88.1 this one-liner does not seem to work anymore leading to empty value.

For the sake of completeness: many solutions using sed or awk only translate a special set of characters, and are thus quite large by code size, and still don't translate other special characters that should be encoded.

A safe way to urlencode is to just encode every single byte - even those that would've been allowed.

echo -ne 'some random\nbytes' | xxd -plain | tr -d '\n' | sed 's/\(..\)/%\1/g'

xxd is taking care here that the input is handled as bytes and not characters.

edit:

xxd comes with the vim-common package in Debian, and I was just on a system where it was not installed and I didn't want to install it. The alternative is to use hexdump from the bsdmainutils package in Debian. According to the following graph, bsdmainutils and vim-common should have about an equal likelihood of being installed:

http://qa.debian.org/popcon-png.php?packages=vim-common%2Cbsdmainutils&show_installed=1&want_legend=1&want_ticks=1

But nevertheless, here is a version which uses hexdump instead of xxd and avoids the tr call:

echo -ne 'some random\nbytes' | hexdump -v -e '/1 "%02x"' | sed 's/\(..\)/%\1/g'

5 Comments

xxd -plain should happen AFTER tr -d '\n' !
@qdii why? that would not only make it impossible to urlencode newlines but it would also wrongly insert newlines created by xxd into the output.
@josch. This is just plain wrong. First, any \n characters will be translated by xxd -plain into 0a. Don’t take my word for it, try it yourself: echo -n -e '\n' | xxd -plain This proves that your tr -d '\n' is useless here as there cannot be any \n after xxd -plain Second, echo foobar adds its own \n character in the end of the character string, so xxd -plain is not fed with foobar as expected but with foobar\n. then xxd -plain translates it into some character string that ends in 0a, making it unsuitable for the user. You could add -n to echo to solve it.
@qdii indeed -n was missing for echo but the xxd call belongs in front of the tr -d call. It belongs there so that any newline in foobar is translated by xxd. The tr -d after the xxd call is to remove the newlines that xxd produces. It seems you never have foobar long enough so that xxd produces newlines but for long inputs it will. So the tr -d is necessary. In contrast to your assumption the tr -d was NOT to remove newlines from the input but from the xxd output. I want to keep the newlines in the input. Your only valid point is, that echo adds an unnecessary newline.
@qdii and no offence taken - I just think that you are wrong, except for the echo -n which I was indeed missing

I find it more readable in python:

encoded_value=$(python3 -c "import urllib.parse; print(urllib.parse.quote('''$value'''))")

The triple ' ensures that single quotes in value won't hurt. urllib is in the standard library. It works, for example, for this crazy (real-world) URL:

"http://www.rai.it/dl/audio/1264165523944Ho servito il re d'Inghilterra - Puntata 7"
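As the comments below point out, splicing $value into the Python source is fragile (quoting, code injection). A safer sketch passes the value as an argument instead:

```shell
#!/bin/sh
# Pass the value via argv rather than interpolating it into Python source.
value="Test & /me's stuff"
encoded_value=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1]))' "$value")
echo "$encoded_value"   # prints Test%20%26%20/me%27s%20stuff
```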

9 Comments

I had some trouble with quotes and special chars with the triplequoting, this seemed to work for basically everything: encoded_value="$( echo -n "${data}" | python -c "import urllib; import sys; sys.stdout.write(urllib.quote(sys.stdin.read()))" )";
Python 3 version would be encoded_value=$(python3 -c "import urllib.parse; print (urllib.parse.quote('''$value'''))").
The urllib.parse.quote does not encode forward slashes '/'. urlencode() { python3 -c 'import urllib.parse; import sys; print(urllib.parse.quote(sys.argv[1], safe=""))' "$1" }
It would be much safer to refer to sys.argv rather than substituting $value into a string later parsed as code. What if value contained ''' + __import__("os").system("rm -rf ~") + '''?
python -c "import urllib;print urllib.quote(raw_input())" <<< "$data"

I've found the following snippet useful to stick it into a chain of program calls, where URI::Escape might not be installed:

perl -p -e 's/([^A-Za-z0-9])/sprintf("%%%02X", ord($1))/seg'

(source)

6 Comments

worked for me. I changed it to perl -lpe ... (the letter ell). This removed the trailing newline, which I needed for my purposes.
FYI, to do the inverse of this, use perl -pe 's/\%(\w\w)/chr hex $1/ge' (source: unix.stackexchange.com/questions/159253/…)
Depending on specifically which characters you need to encode, you can simplify this to perl -pe 's/(\W)/sprintf("%%%02X", ord($1))/ge' which allows letters, numbers, and underscores, but encodes everything else.
Thanks for response above! Since the use case is for curl: That is: : and / does not need encoding, my final function in my bashrc/zshrc is: perl -lpe 's/([^A-Za-z0-9.\/:])/sprintf("%%%02X", ord($1))/seg
@TobiasFeil it comes from stdin.

If you wish to run a GET request and use pure curl, just add --get to @Jacob's solution.

Here is an example:

curl -v --get --data-urlencode "access_token=$(cat .fb_access_token)" https://graph.facebook.com/me/feed

Comments


This may be the best one:

after=$(echo -e "$before" | od -An -tx1 | tr ' ' % | xargs printf "%s")
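With the two fixes suggested in the comments (no trailing newline, and a `%%` in the printf format), one possible variant looks like this (a sketch; `xargs` tokenization makes the `tr` step unnecessary):

```shell
#!/bin/sh
# od -An -tx1 prints bare hex bytes; xargs re-applies the printf
# format to each byte, yielding %XX per input byte.
before='a b'
after=$(printf %s "$before" | od -An -tx1 | xargs printf '%%%s')
echo "$after"   # prints %61%20%62
```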

6 Comments

This works for me with two additions: 1. replace the -e with -n to avoid adding a newline to the end of the argument and 2. add '%%' to the printf string to put a % in front of each pair of hex digits.
Works after adding $ ahead of the bracket: after=$(echo -e ...
Please explain how this works. The od command is not common.
This does not work with OS X's od because it uses a different output format than GNU od. For example printf aa|od -An -tx1 -v|tr \ - prints -----------61--61-------------------------------------------------------- with OS X's od and -61-61 with GNU od. You could use od -An -tx1 -v|sed 's/ */ /g;s/ *$//'|tr \ %|tr -d \\n with either OS X's od or GNU od. xxd -p|sed 's/../%&/g'|tr -d \\n does the same thing, even though xxd is not in POSIX but od is.
Although this might work, it escapes every single character

Here's a Bash solution which doesn't invoke any external programs:

uriencode() {
  s="${1//'%'/%25}"
  s="${s//' '/%20}"
  s="${s//'"'/%22}"
  s="${s//'#'/%23}"
  s="${s//'$'/%24}"
  s="${s//'&'/%26}"
  s="${s//'+'/%2B}"
  s="${s//','/%2C}"
  s="${s//'/'/%2F}"
  s="${s//':'/%3A}"
  s="${s//';'/%3B}"
  s="${s//'='/%3D}"
  s="${s//'?'/%3F}"
  s="${s//'@'/%40}"
  s="${s//'['/%5B}"
  s="${s//']'/%5D}"
  printf %s "$s"
}
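A quick check of the replacement table (the function is repeated so the snippet runs standalone; as the comments note, multi-byte characters such as á pass through unencoded):

```shell
#!/bin/bash
uriencode() {
  s="${1//'%'/%25}"
  s="${s//' '/%20}"
  s="${s//'"'/%22}"
  s="${s//'#'/%23}"
  s="${s//'$'/%24}"
  s="${s//'&'/%26}"
  s="${s//'+'/%2B}"
  s="${s//','/%2C}"
  s="${s//'/'/%2F}"
  s="${s//':'/%3A}"
  s="${s//';'/%3B}"
  s="${s//'='/%3D}"
  s="${s//'?'/%3F}"
  s="${s//'@'/%40}"
  s="${s//'['/%5B}"
  s="${s//']'/%5D}"
  printf %s "$s"
}

uriencode 'a b&c=d'   # prints a%20b%26c%3Dd
```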

3 Comments

This behaves differently between the bash versions. On RHEL 6.9 the bash is 4.1.2 and it includes the single quotes. While Debian 9 and bash 4.4.12 is fine with the single quotes. For me removing the single quotes made it work on both. s="${s//','/%2C}"
I updated the answer to reflect your finding, @muni764.
Just a warning... this won't encode things like the character á

Direct link to the awk version: http://www.shelldorado.com/scripts/cmds/urlencode
I used it for years and it works like a charm.

:
##########################################################################
# Title      :  urlencode - encode URL data
# Author     :  Heiner Steven ([email protected])
# Date       :  2000-03-15
# Requires   :  awk
# Categories :  File Conversion, WWW, CGI
# SCCS-Id.   :  @(#) urlencode  1.4 06/10/29
##########################################################################
# Description
#   Encode data according to
#       RFC 1738: "Uniform Resource Locators (URL)" and
#       RFC 1866: "Hypertext Markup Language - 2.0" (HTML)
#
#   This encoding is used i.e. for the MIME type
#   "application/x-www-form-urlencoded"
#
# Notes
#    o  The default behaviour is not to encode the line endings. This
#   may not be what was intended, because the result will be
#   multiple lines of output (which cannot be used in an URL or a
#   HTTP "POST" request). If the desired output should be one
#   line, use the "-l" option.
#
#    o  The "-l" option assumes, that the end-of-line is denoted by
#   the character LF (ASCII 10). This is not true for Windows or
#   Mac systems, where the end of a line is denoted by the two
#   characters CR LF (ASCII 13 10).
#   We use this for symmetry; data processed in the following way:
#       cat | urlencode -l | urldecode -l
#   should (and will) result in the original data
#
#    o  Large lines (or binary files) will break many AWK
#       implementations. If you get the message
#       awk: record `...' too long
#        record number xxx
#   consider using GNU AWK (gawk).
#
#    o  urlencode will always terminate its output with an EOL
#       character
#
# Thanks to Stefan Brozinski for pointing out a bug related to non-standard
# locales.
#
# See also
#   urldecode
##########################################################################

PN=`basename "$0"`          # Program name
VER='1.4'

: ${AWK=awk}

Usage () {
    echo >&2 "$PN - encode URL data, $VER
usage: $PN [-l] [file ...]
    -l:  encode line endings (result will be one line of output)

The default is to encode each input line on its own."
    exit 1
}

Msg () {
    for MsgLine
    do echo "$PN: $MsgLine" >&2
    done
}

Fatal () { Msg "$@"; exit 1; }

set -- `getopt hl "$@" 2>/dev/null` || Usage
[ $# -lt 1 ] && Usage           # "getopt" detected an error

EncodeEOL=no
while [ $# -gt 0 ]
do
    case "$1" in
        -l) EncodeEOL=yes;;
    --) shift; break;;
    -h) Usage;;
    -*) Usage;;
    *)  break;;         # First file name
    esac
    shift
done

LANG=C  export LANG
$AWK '
    BEGIN {
    # We assume an awk implementation that is just plain dumb.
    # We will convert an character to its ASCII value with the
    # table ord[], and produce two-digit hexadecimal output
    # without the printf("%02X") feature.

    EOL = "%0A"     # "end of line" string (encoded)
    split ("1 2 3 4 5 6 7 8 9 A B C D E F", hextab, " ")
    hextab [0] = 0
    for ( i=1; i<=255; ++i ) ord [ sprintf ("%c", i) "" ] = i + 0
    if ("'"$EncodeEOL"'" == "yes") EncodeEOL = 1; else EncodeEOL = 0
    }
    {
    encoded = ""
    for ( i=1; i<=length ($0); ++i ) {
        c = substr ($0, i, 1)
        if ( c ~ /[a-zA-Z0-9.-]/ ) {
        encoded = encoded c     # safe character
        } else if ( c == " " ) {
        encoded = encoded "+"   # special handling
        } else {
        # unsafe character, encode it as a two-digit hex-number
        lo = ord [c] % 16
        hi = int (ord [c] / 16);
        encoded = encoded "%" hextab [hi] hextab [lo]
        }
    }
    if ( EncodeEOL ) {
        printf ("%s", encoded EOL)
    } else {
        print encoded
    }
    }
    END {
        #if ( EncodeEOL ) print ""
    }
' "$@"

1 Comment

Is there a simple variation to get UTF-8 encoding instead of ASCII?

What would parse URLs better than javascript?

node -p "encodeURIComponent('$url')"

9 Comments

Out of the OP's question scope. Not bash, not curl. Even if I'm sure it works very well if node is available.
Why down-voting this and not the python/perl answers? Furthermore how this does not respond the original question "How to urlencode data for curl command?". This can be used from a bash script and the result can be given to a curl command.
There is no need to use curl if you have another language at your disposal but it does not mean you cannot use it. From the bash perspective curl is an external command just as node is. The solution I propose is to use node and curl inside a bash script. Yes you need a dependency but it is still bash. I am not proposing to do the whole work with node. Therefore this is a valid solution to the question "How to urlencode data for curl command?". The answer to the question is "urlencode the data with a node one-liner".
While I didn't bother to downvote, the problem with this command is that it requires data to be properly escaped for use in javascript. Like try it with single quotes and some backslash madness. If you want to use node, you better read stuff from stdin like node -p 'encodeURIComponent(require("fs").readFileSync(0))'
Be careful with @MichaelKrelin-hacker's solution if you are piping data in from STDIN make sure not to include a trailing newline. For example, echo | ... is wrong, while echo -n | ... suppresses the newline.
url=$(echo "$1" | sed -e 's/%/%25/g' -e 's/ /%20/g' -e 's/!/%21/g' -e 's/"/%22/g' -e 's/#/%23/g' -e 's/\$/%24/g' -e 's/\&/%26/g' -e 's/'\''/%27/g' -e 's/(/%28/g' -e 's/)/%29/g' -e 's/\*/%2a/g' -e 's/+/%2b/g' -e 's/,/%2c/g' -e 's/-/%2d/g' -e 's/\./%2e/g' -e 's/\//%2f/g' -e 's/:/%3a/g' -e 's/;/%3b/g' -e 's/</%3c/g' -e 's/=/%3d/g' -e 's/>/%3e/g' -e 's/?/%3f/g' -e 's/@/%40/g' -e 's/\[/%5b/g' -e 's/\\/%5c/g' -e 's/\]/%5d/g' -e 's/\^/%5e/g' -e 's/_/%5f/g' -e 's/`/%60/g' -e 's/{/%7b/g' -e 's/|/%7c/g' -e 's/}/%7d/g' -e 's/~/%7e/g')

This will encode the string inside $1 and store it in $url, although you don't have to put it in a variable if you don't want to. BTW, I didn't include a sed rule for tab since I thought it would turn tabs into spaces.

5 Comments

I get the feeling this is not the recommended way to do this.
Explain your feeling, please... because what I have stated works, and I have used it in several scripts, so I know it works for all the chars I listed. So please explain why someone would not use my code and would use perl instead, since the title of this is "URLEncode from a bash script", not a perl script.
Sometimes no perl solution is wanted, so this can come in handy.
This is not the recommended way to do this because a blacklist is bad practice, and it is unicode-unfriendly anyway.
This was the most friendly solution compatible with cat file.txt

Using php from a shell script:

value="http://www.google.com"
encoded=$(php -r "echo rawurlencode('$value');")
# encoded = "http%3A%2F%2Fwww.google.com"
echo $(php -r "echo rawurldecode('$encoded');")
# returns: "http://www.google.com"
  1. http://www.php.net/manual/en/function.rawurlencode.php
  2. http://www.php.net/manual/en/function.rawurldecode.php

Comments


Python 3 based on @sandro's good answer from 2010:

echo "Test & /me" | python3 -c "import urllib.parse; print(urllib.parse.quote(input()))"

Test%20%26%20/me

Comments


This nodejs-based answer will use encodeURIComponent on stdin:

uriencode_stdin() {
    node -p 'encodeURIComponent(require("fs").readFileSync(0))'
}

echo -n $'hello\nwörld' | uriencode_stdin
hello%0Aw%C3%B6rld

1 Comment

Best version out there ;)

uni2ascii is very handy:

$ echo -ne '你好世界' | uni2ascii -aJ
%E4%BD%A0%E5%A5%BD%E4%B8%96%E7%95%8C

1 Comment

This doesn't work for characters inside the ASCII range, that need quoting, like % and space (that last can be remedied with the -s flag)

You can emulate javascript's encodeURIComponent in perl. Here's the command:

perl -pe 's/([^a-zA-Z0-9_.!~*()'\''-])/sprintf("%%%02X", ord($1))/ge'

You could set this as a bash alias in .bash_profile:

alias encodeURIComponent='perl -pe '\''s/([^a-zA-Z0-9_.!~*()'\''\'\'''\''-])/sprintf("%%%02X",ord($1))/ge'\'

Now you can pipe into encodeURIComponent:

$ echo -n 'hèllo wôrld!' | encodeURIComponent
h%C3%A8llo%20w%C3%B4rld!

Comments


For those of you looking for a solution that doesn't need perl, here is one that only needs hexdump and awk:

url_encode() {
 [ $# -lt 1 ] && { return; }

 encodedurl="$1";

 # make sure hexdump exists, if not, just give back the url
 [ ! -x "/usr/bin/hexdump" ] && { return; }

 encodedurl=`
   echo "$encodedurl" | hexdump -v -e '1/1 "%02x\t"' -e '1/1 "%_c\n"' |
   LANG=C awk '
     $1 == "20"                    { printf("%s",   "+"); next } # space becomes plus
     $1 ~  /0[adAD]/               {                      next } # strip newlines
     $2 ~  /^[a-zA-Z0-9.*()\/-]$/  { printf("%s",   $2);  next } # pass through what we can
                                   { printf("%%%s", $1)        } # take hex value of everything else
   '`
}

Stitched together from a couple of places across the net and some local trial and error. It works great!

Comments


Simple PHP option:

echo 'part-that-needs-encoding' | php -R 'echo urlencode($argn);'

1 Comment

If the data to be encoded contains any linefeeds, those will be silently dropped by this implementation.

If you don't want to depend on Perl you can also use sed. It's a bit messy, as each character has to be escaped individually. Make a file with the following contents and call it urlencode.sed

s/%/%25/g
s/ /%20/g
s/\t/%09/g
s/!/%21/g
s/"/%22/g
s/#/%23/g
s/\$/%24/g
s/\&/%26/g
s/'/%27/g
s/(/%28/g
s/)/%29/g
s/\*/%2a/g
s/+/%2b/g
s/,/%2c/g
s/-/%2d/g
s/\./%2e/g
s/\//%2f/g
s/:/%3a/g
s/;/%3b/g
s/</%3c/g
s/=/%3d/g
s/>/%3e/g
s/?/%3f/g
s/@/%40/g
s/\[/%5b/g
s/\\/%5c/g
s/\]/%5d/g
s/\^/%5e/g
s/_/%5f/g
s/`/%60/g
s/{/%7b/g
s/|/%7c/g
s/}/%7d/g
s/~/%7e/g

To use it do the following.

STR1=$(echo "https://www.example.com/change&$ ^this to?%checkthe@-functionality" | cut -d\? -f1)
STR2=$(echo "https://www.example.com/change&$ ^this to?%checkthe@-functionality" | cut -d\? -f2)
OUT2=$(echo "$STR2" | sed -f urlencode.sed)
echo "$STR1?$OUT2"

This will split the string into a part that needs encoding and a part that is fine, encode the part that needs it, then stitch the two back together.

You can put that into a sh script for convenience, maybe have it take a parameter to encode, put it on your path and then you can just call:

urlencode https://www.example.com?isThisFun=HellNo

source

Comments


Here is a POSIX function to do that:

url_encode() {
   awk 'BEGIN {
      for (n = 1; n < 256; n++) {
         m[sprintf("%c", n)] = n
      }
      n = 1
      while (1) {
         s = substr(ARGV[1], n, 1)
         if (s == "") {
            break
         }
         t = s ~ /[[:alnum:]_.!~*\47()-]/ ? t s : t sprintf("%%%02X", m[s])
         n++
      }
      print t
   }' "$1"
}

Example:

value=$(url_encode "$2")
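A standalone check, restating the function with the lookup table built for all byte values 1-255 so that characters such as `}` are mapped too:

```shell
#!/bin/sh
# url_encode restated: a BEGIN-only awk program that walks the first
# argument character by character, passing through unreserved characters
# and emitting %XX for everything else.
url_encode() {
   awk 'BEGIN {
      for (n = 1; n < 256; n++) {
         m[sprintf("%c", n)] = n
      }
      n = 1
      while (1) {
         s = substr(ARGV[1], n, 1)
         if (s == "") {
            break
         }
         t = s ~ /[[:alnum:]_.!~*\47()-]/ ? t s : t sprintf("%%%02X", m[s])
         n++
      }
      print t
   }' "$1"
}

url_encode 'a b}c'   # prints a%20b%7Dc
```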

Comments


Another php approach:

echo "encode me" | php -r "echo urlencode(file_get_contents('php://stdin'));"

2 Comments

echo will append a newline character (hex 0xa). To stop it doing that, use echo -n.
If you have to accept unknown data, always use printf "%s" "$data" to avoid interpreting data as commands. If you do e.g. echo -n $data and $data is a string that starts with -e it can do interesting things to the data.

Here's the node version:

uriencode() {
  node -p "encodeURIComponent('${1//\'/\\\'}')"
}

5 Comments

Won't this break if there are any other characters in the string that aren't valid between single quotes, like a single backslash, or newlines?
Good point. If we're to go to the trouble of escaping all the problematic characters in Bash we might as well perform the replacements directly and avoid node altogether. I posted a Bash-only solution. :)
This variant found elsewhere on the page avoids the quoting issue by reading the value from STDIN: node -p 'encodeURIComponent(require("fs").readFileSync(0))'
You can avoid the quoting issue (and the potential for JavaScript code injection) with: node -p 'encodeURIComponent(process.argv[1])' "$1"
WARNING, I would strongly recommend against using this method if the input may be untrusted. I believe that because backslash is not propertly escaped, you might be able to inject JavaScript code leading to arbitrary code execution.

Here is my version for the busybox ash shell on an embedded system; I originally adapted Orwellophile's variant:

urlencode()
{
    local S="${1}"
    local encoded=""
    local ch
    local o
    for i in $(seq 0 $((${#S} - 1)) )
    do
        ch=${S:$i:1}
        case "${ch}" in
            [-_.~a-zA-Z0-9]) 
                o="${ch}"
                ;;
            *) 
                o=$(printf '%%%02x' "'$ch")                
                ;;
        esac
        encoded="${encoded}${o}"
    done
    echo "${encoded}"
}

urldecode() 
{
    # urldecode <string>
    local url_encoded="${1//+/ }"
    printf '%b' "${url_encoded//%/\\x}"
}

Comments


The question is about doing this in bash and there's no need for python or perl as there is in fact a single command that does exactly what you want - "urlencode".

value=$(urlencode "${2}")

This also works better than, for example, the perl answer above, which doesn't encode all characters correctly. Try it with the long dash you get from Word and you get the wrong encoding there.

Note, you need "gridsite-clients" installed to provide this command:

sudo apt install gridsite-clients

5 Comments

My version of bash (GNU 3.2) doesn't have urlencode. What version are you using?
I have 4.3.42, but the urlencode command is provided by "gridsite-clients". Try installing that and you should be fine.
So your answer is not better than any that require others things installed (python, perl, lua, …)
Except that it only requires installing a single utility instead of an entire language (and libraries), plus is super simple and clear to see what it's doing.
A link first for the package / project page providing this command would have been useful.

Ruby, for completeness

value="$(ruby -r cgi -e 'puts CGI.escape(ARGV[0])' "$2")"

Comments


Here's a one-line conversion using Lua, similar to blueyed's answer except with all the RFC 3986 Unreserved Characters left unencoded (like this answer):

url=$(echo 'print((arg[1]:gsub("([^%w%-%.%_%~])",function(c)return("%%%02X"):format(c:byte())end)))' | lua - "$1")

Additionally, you may need to ensure that newlines in your string are converted from LF to CRLF, in which case you can insert a gsub("\r?\n", "\r\n") in the chain before the percent-encoding.

Here's a variant that, in the non-standard style of application/x-www-form-urlencoded, does that newline normalization, as well as encoding spaces as '+' instead of '%20' (which could probably be added to the Perl snippet using a similar technique).

url=$(echo 'print((arg[1]:gsub("\r?\n", "\r\n"):gsub("([^%w%-%.%_%~ ])",function(c)return("%%%02X"):format(c:byte())end):gsub(" ","+")))' | lua - "$1")

Comments


In this case, I needed to URL encode the hostname. Don't ask why. Being a minimalist, and a Perl fan, here's what I came up with.

url_encode()
  {
  echo -n "$1" | perl -pe 's/[^a-zA-Z0-9\/_.~-]/sprintf "%%%02x", ord($&)/ge'
  }

Works perfectly for me.

Comments
