
I am parsing options with getopts but would like to handle long options as well.

print-args ()
{
 title="$1" ; shift
 local i=0   # reset the counter on every call
 printf "\n%s\n" "${title}: \$@:"
 for arg in "$@"; do
   (( i = i + 1 ))
   printf "%s |%s|\n" "${i}." "$arg"
 done
}

getopts_test ()
{
 aggr=()
 for arg in "$@"; do
   case $arg in
    ("--colour"|"--color")     aggr+=( "-c" ) ;;
    ("--colour="*|"--color="*) aggr+=( "-c" "${arg#*=}" ) ;;
    (*)  aggr+=( "$arg" ) ;;
   esac
 done

 print-args "print" "$@"

 eval set -- "${aggr[@]}"
 print-args "eval" "$@"

 set -- "${aggr[@]}"
 print-args "set" "$@"

 local OPTIND OPTARG
 local shortopts="c:"   # lowercase, to match the -c added above and the case below
 while getopts "$shortopts" arg; do
   case $arg in
    ("c") context="$OPTARG" ;;
    (*) break ;;
   esac
 done
 shift $(( OPTIND - 1 ))
}

But I wonder whether the use of set -- "${aggr[@]}" is correct.

Or is the following (using eval) more appropriate?

eval set -- "${aggr[@]}"

I ran the test shown below. With eval, the strings "170 20" and "Gunga Din" are each split in two, whereas with set -- "${aggr[@]}" each one is kept as a single argument.

getopts_test -f -g 130 --colour="170 20" "Gunga Din"

print: $@:
1. |-f|
2. |-g|
3. |130|
4. |--colour=170 20|
5. |Gunga Din|

eval: $@:
1. |-f|
2. |-g|
3. |130|
4. |-c|
5. |170|
6. |20|
7. |Gunga|
8. |Din|

set: $@:
1. |-f|
2. |-g|
3. |130|
4. |-c|
5. |170 20|
6. |Gunga Din|

Then I ran another function that uses getopt from util-linux (which is not a GNU tool).

getopt_test ()
{
 shortopts="Vuhv::H::w::e::n::l::C:"
 shortopts="${shortopts}bgcrmo"
 longopts="version,usage,help,verbosity::"
 longopts="${longopts},heading::,warning::,error::"
 longopts="${longopts},blu,grn,cyn,red,mgn,org"
 
 opts=$( getopt -o "$shortopts" -l "$longopts" -n "${0##*/}" -- "$@" )

 print-args "\$@:" "$@"
 print-args "opts:" "$opts"

 set -- "$opts"
 print-args "set -- \"$opts\"" "$@"

 eval set -- "$opts"
 print-args "eval set -- \"$opts\"" "$@"

}

This resulted in the following

getopt_test --warning=3 "foo'bar" "Gunga Din"

$@:
1. |--warning=3|
2. |foo'bar|
3. |Gunga Din|

opts:
1. | --warning '3' -- 'foo'\''bar' 'Gunga Din'|

set -- "$opts"
1. | --warning '3' -- 'foo'\''bar' 'Gunga Din'|

eval set -- "$opts"
1. |--warning|
2. |3|
3. |--|
4. |foo'bar|
5. |Gunga Din|

As shown, getopt returns its result as a single string, with the options moved ahead of the positional arguments and everything quoted for the shell. This is why eval set -- "$opts" is needed: it splits the opts string back into five separate entries for option parsing and processing.

  • Do you have the GNU getopt tool? It'll handle quite a lot of this for you. (Here, getopt --version reports "getopt from util-linux 2.33.1".) Commented Oct 30, 2021 at 10:42
  • @roaima, note that util-linux is not part of the GNU project. Commented Oct 30, 2021 at 14:04
  • @StéphaneChazelas I'd had the impression that the newer getopt was GNU Commented Oct 30, 2021 at 17:33
  • @roaima, no, it appears to be associated with the Linux kernel developers rather than GNU, at least if we believe Wikipedia (en.wikipedia.org/wiki/Util-linux), and e.g. the Debian package page also links to www.kernel.org: packages.debian.org/bullseye/util-linux Commented Oct 30, 2021 at 22:40

2 Answers

The idea there is to preprocess the arguments and change each --context to -C which getopts can then process? I suppose that would work, but note that GNU-style long options can also take arguments in the format --context=foobar, and your construct here doesn't support that. The user would need to know that this particular tool here requires --context foobar as two distinct arguments. Or you'd need to make the preprocessing more complex.

You might also want to check all arguments that start with --, as otherwise e.g. a mistyped --cotnext would go to getopts as-is, and you'd get complaints about unknown options. (Or worse, wrong options would be enabled.)
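A minimal sketch of such a preprocessing step, assuming bash and a single long option --context mapped to -C. The function name preprocess_args is hypothetical, and the result is left in the array aggr used in the question. Note one limitation of this sketch: an option argument that itself starts with -- (as in -C --foo) would also be rejected.

```bash
#!/usr/bin/env bash
# Hypothetical preprocessor: translate known long options to short ones
# and reject any other argument that starts with "--".
preprocess_args() {
    aggr=()
    while (( $# > 0 )); do
        case $1 in
        --)          aggr+=( "$@" ); return 0 ;;   # end of options: copy the rest untouched
        --context)   aggr+=( -C ) ;;               # value follows as the next argument
        --context=*) aggr+=( -C "${1#*=}" ) ;;     # value attached with "="
        --*)         echo "unknown option: $1" >&2; return 1 ;;
        *)           aggr+=( "$1" ) ;;
        esac
        shift
    done
}

if preprocess_args --context="170 20" "Gunga Din"; then
    set -- "${aggr[@]}"        # no eval: one array element per parameter
    printf '|%s|\n' "$@"
fi
```

The positional parameters can then be handed to getopts with an optstring such as "C:".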

But I wonder whether the use of set -- "${aggr[@]}" is correct.

Or is the following (using eval) more appropriate?

set -- "${aggr[@]}" expands the elements of the array to distinct words, and then assigns those words to the positional parameters. Each array element becomes exactly one positional parameter, unchanged.

eval set -- "${aggr[@]}" would expand all the elements of the array, then join them together with spaces, prepend the set -- and evaluate the result as a shell command. That is, if you have the array elements abc def, $(date >&2), ghi'jkl, the command would be

set -- abc def $(date >&2) ghi'jkl 

which would end up with abc and def as two distinct parameters, and it would print the date to stderr, except that the lone single quote will cause a syntax error.
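The difference can be seen by counting the resulting parameters. This is a small bash sketch with harmless data; the function names count_with_set and count_with_eval are made up for the illustration.

```bash
#!/usr/bin/env bash
# Copy the arguments into an array, then restore them with plain "set".
count_with_set() {
    local aggr=( "$@" )
    set -- "${aggr[@]}"        # one parameter per array element
    echo "$#"
}

# Same, but through eval: the elements are joined with spaces and the
# resulting string is parsed again as shell code.
count_with_eval() {
    local aggr=( "$@" )
    eval set -- "${aggr[@]}"   # becomes: set -- -c 170 20 Gunga Din
    echo "$#"
}

count_with_set  -c "170 20" "Gunga Din"   # prints 3
count_with_eval -c "170 20" "Gunga Din"   # prints 5
```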

Using eval would be appropriate if you have something that's designed to produce output that's quoted for shell input.
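As one illustration of "output that's quoted for shell input", bash's printf has a %q format that quotes each argument so the shell can read it back. The sketch below assumes bash (%q is not in POSIX printf), and the function name roundtrip is hypothetical.

```bash
#!/usr/bin/env bash
# Quote every argument for the shell, then let eval parse the result.
roundtrip() {
    local quoted
    quoted=$(printf '%q ' "$@")    # e.g. foo\'bar Gunga\ Din
    eval "set -- $quoted"          # safe here because the string was built by %q
    printf '|%s|\n' "$@"
}

roundtrip "foo'bar" "Gunga Din"
# |foo'bar|
# |Gunga Din|
```

Inside a single script this round trip is unnecessary, since set -- "${aggr[@]}" does the same job directly. It only pays off when the list has to pass through a single string.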


If you're on Linux (and don't care about portability), you could do what roaima suggested in the comments, and use the util-linux version of getopt (without the s). It supports long options too; there are answers showing how to use it in getopt, getopts or manual parsing - what to use when I want to support both short and long options? and in this SO answer and also my answer here.

Incidentally, with that getopt, you would use eval, since as a command, it's limited to producing just a single string as output, not a list like an array, so it uses shell quoting to work around the issue.
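A short sketch of that pattern, assuming the util-linux getopt is the one found in PATH (other getopt implementations may not accept -o, -l and -n). The function name parse_with_getopt and the single option -C/--context are chosen only for the illustration.

```bash
#!/usr/bin/env bash
parse_with_getopt() {
    local opts context=
    # getopt prints one shell-quoted string, or fails on a bad option.
    opts=$(getopt -o 'C:' -l 'context:' -n 'demo' -- "$@") || return 1
    eval set -- "$opts"            # undo the quoting that getopt added
    while :; do
        case $1 in
        -C|--context) context=$2; shift 2 ;;
        --)           shift; break ;;
        *)            return 1 ;;  # not expected with well-formed getopt output
        esac
    done
    printf 'context=%s\n' "$context"
    printf 'arg=%s\n' "$@"
}

parse_with_getopt --context="170 20" "Gunga Din"
```

With the util-linux getopt this should keep "170 20" and "Gunga Din" intact as single values.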

  • Yes, I change each --context to -C. Commented Oct 30, 2021 at 11:01
  • The code is intended to address the portability problem. I have used getopt before. Commented Oct 30, 2021 at 13:30
  • The biggest question is whether to use eval or not, because for the case of getopt, the use of eval seems necessary. Commented Oct 30, 2021 at 13:43
  • @khin, if you have something that's explicitly made to be a shell command, then you use eval. If not, then you don't. Your users probably don't want to enter args with spaces as getopts_test "'foo bar'", instead of the normal getopts_test "foo bar", and they'll probably also expect getopts_test * to work, even if some of the filenames contain whitespace (or shell special characters). Commented Oct 30, 2021 at 17:18
  • For my getopts_test, it looks like using an array is a neat idea. What do you think? Customarily, I pass filenames as non-option arguments by using the break command. Commented Oct 31, 2021 at 4:39

You can parse --foo-style long options with the getopts builtin by adding - as a short option taking an argument to the optstring, and retrieving the actual long option from $OPTARG. Simple example:

while getopts :sc:-: o; do
    case $o in
    :) echo >&2 "option -$OPTARG needs an argument"; continue;;
    '?') echo >&2 "unknown option -$OPTARG"; continue;;
    -) o=${OPTARG%%=*}; OPTARG=${OPTARG#"$o"}; OPTARG=${OPTARG#=};;
    esac
    echo "OPT $o=$OPTARG"
done
shift "$((OPTIND - 1))"
echo "ARGS $*"

which you can then use as either script -c foo or script --context=foo.
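To see what the - branch works with, here is a small trace of a single getopts call. The function name split_long and the variables name and value are made up for this sketch. For an argument such as --context=foo, getopts reports the option letter - and puts the rest of the word in OPTARG.

```bash
#!/usr/bin/env bash
# Run getopts once on the given arguments and show how the long option
# is taken apart with parameter expansions.
split_long() {
    local o name value
    OPTIND=1
    getopts ':-:' o "$@"       # --context=foo gives o='-' and OPTARG='context=foo'
    name=${OPTARG%%=*}         # everything before the first "="
    value=${OPTARG#"$name"}    # "=foo", or empty if there was no "="
    value=${value#=}           # drop the leading "="
    echo "o=$o name=$name value=$value"
}

split_long --context=foo   # prints: o=- name=context value=foo
split_long --silent        # prints: o=- name=silent value=
```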

If you also want to have the long options validated just like the short ones, and also accept abbreviated forms, you need something more complex. There is not much wisdom in over-engineering a poor shell script like that, but if you want an example, here it is:

short_opts=sc:
long_opts=silent/ch/context:/check/co   # those who take an arg END with :

# override via command line for testing purposes
# if [ "$#" -ge 2 ]; then
#   short_opts=$1; long_opts=$2; shift 2
# fi

while getopts ":$short_opts-:" o; do
    case $o in
    :) echo >&2 "option -$OPTARG needs an argument" ;continue;;
    '?') echo >&2 "bad option -$OPTARG" ;continue;;
    -)  o=${OPTARG%%=*}; OPTARG=${OPTARG#"$o"}; lo=/$long_opts/
        case $lo in
        *"/$o"[!/:]*"/$o"[!/:]*) echo >&2 "ambiguous option --$o"; continue;;
        *"/$o"[:/]*) ;;
        *) o=$o${lo#*"/$o"}; o=${o%%[/:]*} ;;
        esac
        case $lo in
        *"/$o/"*) OPTARG= ;;
        *"/$o:/"*)
            case $OPTARG in
            '='*)   OPTARG=${OPTARG#=};;
            *)  eval "OPTARG=\${$OPTIND}"   # braces are needed once OPTIND reaches 10
                if [ "$OPTIND" -le "$#" ] && [ "$OPTARG" != -- ]; then
                    OPTIND=$((OPTIND + 1))
                else
                    echo >&2 "option --$o needs an argument"; continue
                fi;;
            esac;;
        *) echo >&2 "unknown option --$o"; continue;;
        esac
    esac
    echo "OPT $o=$OPTARG"
done
shift "$((OPTIND - 1))"
echo "ARGS $*"

then

$ ./script --context=33
OPT context=33
$ ./script --con=33
OPT context=33
$ ./script --co
OPT co=
$ ./script --context
option --context needs an argument
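The abbreviation matching is the hardest part of that loop to read, so here it is in isolation as a sketch. The helper name resolve_long is hypothetical; it applies the same case patterns to a /-separated list and prints the full option name, or fails for ambiguous and unknown names.

```bash
#!/usr/bin/env bash
# resolve_long NAME LIST: expand an abbreviated long option name.
# LIST uses the same format as long_opts above (":" marks "takes an argument").
resolve_long() {
    local name=$1 list=/$2/
    case $list in
    # two different entries start with NAME and neither is NAME itself
    *"/$name"[!/:]*"/$name"[!/:]*) echo "ambiguous option --$name" >&2; return 1 ;;
    # NAME is an exact entry: keep it
    *"/$name"[:/]*) ;;
    # otherwise append whatever follows the first "/NAME", up to "/" or ":"
    *) name=$name${list#*"/$name"}; name=${name%%[/:]*} ;;
    esac
    case $list in
    *"/$name/"* | *"/$name:/"*) echo "$name" ;;
    *) echo "unknown option --$name" >&2; return 1 ;;
    esac
}

long_opts=silent/ch/context:/check/co
resolve_long con "$long_opts"   # prints: context
resolve_long sil "$long_opts"   # prints: silent
resolve_long co  "$long_opts"   # prints: co (exact match wins over context)
```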
  • Does getopt also take a leading colon? Could I check for : and ? with getopt as well? Commented Oct 31, 2021 at 11:34
  • Might want to use ${OPTARG%%=*} in the first one too (with double %%). The second one has something wrong with recognizing invalid options, e.g. --xyz comes up as xyzsilent (and --sil comes up as silsilent too). But you did say it wasn't properly debugged anyway. Might be easier to just drop support for abbreviated long options. Commented Oct 31, 2021 at 12:02
  • What is the purpose of a leading : with getopt(1)? Why does it shut up error messages? Commented Oct 31, 2021 at 16:41
