Stéphane Chazelas

The tokenisation of [[ ... ]] clashes with regular expression syntax (more on that in my answer to your follow-up question), and \ is overloaded as both a shell quoting operator and a regexp operator (with some interference between the two in bash). Even when there's no apparent reason for a clash, the behaviour can be surprising, and the rules can be confusing.

Who can tell what these will do without trying them (on all possible input) with any given version of bash?

[[ $a = a|b ]]
[[ $a =~ a|b ]]
[[ $a =~ a&b ]]
[[ $a =~ (a|b) ]]
[[ $a =~ ([)}]*) ]]
[[ $a =~ [/\(] ]]
[[ $a =~ \s+ ]]
[[ $a =~ ( ) ]]
[[ $a =~ [ ] ]]
[[ $a =~ ([ ]) ]]

You can't quote the regexps either: since bash 3.2, unless bash 3.1 compatibility has been enabled, quoting a regexp removes the special meaning of its RE operators. For instance,

[[ $a =~ 'a|b' ]]

matches only if $a contains a literal a|b.
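
As a quick check of that quoting rule, here is a minimal sketch, assuming bash 3.2 or newer with the default compatibility level. The variable names quoted_single and quoted_literal are made up for the illustration:

```shell
#!/usr/bin/env bash
# Assumes bash >= 3.2 without bash 3.1 compatibility enabled.

# $a is a plain "a": the quoted | is literal, so the test looks for
# the three-character string a|b and does not find it.
a='a'
if [[ $a =~ 'a|b' ]]; then quoted_single=match; else quoted_single=nomatch; fi

# $a contains the literal text a|b somewhere inside it: now it matches.
a='xa|by'
if [[ $a =~ 'a|b' ]]; then quoted_literal=match; else quoted_literal=nomatch; fi

printf '%s\n' "plain a: $quoted_single" "contains a|b: $quoted_literal"
```

With bash 3.1 compatibility enabled the first test would behave differently, which is part of why the quoted form is best avoided.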

Storing the regexp in a variable avoids all those problems and also makes the code compatible with ksh93 and zsh (provided you limit yourself to POSIX EREs):

regexp='a|b'
[[ $a =~ $regexp ]] # $regexp should *not* be quoted.

There's no ambiguity in the parsing/tokenising of that shell command, and the regexp that is used is the one stored in the variable without any transformation.
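
To make that concrete, here is a small sketch of the variable approach applied to several inputs. The sample strings and the matched array are illustrative assumptions, not something from the question:

```shell
#!/usr/bin/env bash
# The regexp is stored verbatim in single quotes, so the shell
# does not touch |, ( ), [ ] or + when the assignment is parsed.
regexp='^(a|b)[[:space:]]+[0-9]+$'

matched=()
for a in 'a 1' 'b   42' 'c 3' 'a|b 7'; do
  # $regexp is deliberately left unquoted so it is used as a regexp.
  if [[ $a =~ $regexp ]]; then
    matched+=("$a")
  fi
done

printf 'matched: %s\n' "${matched[@]}"
```

Only 'a 1' and 'b   42' should end up in matched: 'c 3' starts with the wrong letter, and in 'a|b 7' the | is ordinary input data rather than part of the pattern.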

In [ implementations that have a =~ operator,

[ "$a" '=~' 'a|b' ]
regexp='a|b'
[ "$a" '=~' "$regexp" ]

is also unambiguous and portable. Of course, parameter expansions must be quoted as in the arguments of any command (in zsh, only to prevent empty removal; in other shells, to prevent split+glob), and | must be quoted as in any argument to any command (as it's otherwise the pipe operator). Beyond that, quoting has no influence on the interpretation of the regular expression: \ is used as usual to escape regexp operators or to introduce other (non-standard) ones such as \<, \d... where available. =~ only needs to be quoted in zsh (when not emulating other shells), as =cmd is a special filename expansion operator there that expands to the path of the cmd command. Quoting it does no harm in other shells.
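
Since not every [ has =~, a script may want to probe for it first. The following is only a sketch under my own assumptions: re_match is a hypothetical helper name, and the grep -E fallback is a substitute I'm picking for illustration (it matches line by line, so it is not equivalent for multi-line strings):

```shell
# Probe: where [ lacks =~, this is a syntax error for [ (non-zero exit
# status), and the error message is discarded.
if [ a '=~' a ] 2>/dev/null; then
  # [ supports =~; quoting =~ keeps zsh from treating it as =cmd.
  re_match() { [ "$1" '=~' "$2" ]; }
else
  # Assumed fallback: grep -E also takes a POSIX ERE, but matches per line.
  re_match() { printf '%s\n' "$1" | grep -Eq -- "$2"; }
fi

regexp='a|b'
if re_match 'b' "$regexp"; then r_b=match; else r_b=nomatch; fi
if re_match 'c' "$regexp"; then r_c=match; else r_c=nomatch; fi
printf '%s\n' "b: $r_b" "c: $r_c"
```

Either way the regexp is passed as an ordinary quoted argument, so the shell's own parsing never interferes with it.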
