I am new to bash. I have a question about determining if all characters of one string occur within another string. For example, if the variables are:

var_1="abcdefg"
var_2="bcg"

Then I want to write an if statement of the form:

if [all characters of var_2 occur within var_1]
then
     echo "All characters of var_2 occur in var_1."
else
     echo "Not all characters of var_2 occur in var_1."
fi

In this example, the output should be All characters of var_2 occur in var_1. What would go in the if statement here?

This is what I tried:

if [[ $var_1 == *$var_2* ]]

... but I think this only determines whether var_2 is a substring of var_1. What I want is to determine whether the characters of var_2 occur within var_1, in no particular order.

3 Answers

The following one-liner should work:

echo -e "$var_2\0$var_1" | sed -E ':a;s/(.)(.*\x0)(.*)\1(.*)/\2\3\4/;ta;s/^\x0.*/1/;s/.*\x0.*/0/'

It will print 0 or 1 to mean false or true respectively.

This is how it works:

  • echo -e allows using escape sequences, and \0 represents the null character, which I'm using to mark the separation between the two strings bcg and abcdefg.
  • The Sed script is not that complex:
    • -E is a non-POSIX option that allows writing capturing groups with ( and ) instead of \( and \) (along with other similar simplifications which I'm not using here);
    • semicolons (;) separate commands;
    • :a is a label, which lets you jump here via ta or ba (I use only the former, keep reading);
    • s/(.)(.*\x0)(.*)\1(.*)/\2\3\4/ does the following (and succeeds if there's at least one character in common between var_2 and var_1):
      • matches and captures a character of var_2 with (.), namely the leftmost one that also occurs in var_1,
      • matches and captures the following part of var_2 together with the null character, (.*\x0) (yes, what you write as \0 in Bash is \x0 in Sed),
      • matches and captures 0 or more characters,
      • matches what was captured by first group, i.e. by (.),
      • matches and captures 0 or more characters up to the end of var_1,
      • substitutes all that was matched with what was captured by the 2nd, 3rd, and 4th capturing groups: in fact, we've got rid of one character in common between var_2 and var_1;
    • ta tests whether the previous substitution was successful and, if so, jumps to :a: this way we run a loop for as long as there's a character in common between var_2 and var_1;
    • when there are no more characters in common between var_2 and var_1, the test fails and control falls through ta;
    • s/^\x0.*/1/ matches whatever is left, but only if the null character \x0 is leading, which happens if all letters of var_2 were found in var_1, and changes everything to just 1;
    • s/.*\x0.*/0/ will match everything, as long as there's still \x0 in the string, which happens only if the previous substitution failed, which means that some letter from var_2 was not found in var_1, and change it to 0.
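
To use the one-liner directly as the condition of an if statement, it can be wrapped in a function. The following is a minimal sketch, not part of the original answer: the function name chars_in_sed is hypothetical, GNU sed is assumed (for \x0), and printf is swapped in for echo -e so that backslashes or leading octal digits in the variables are not interpreted as part of an escape sequence.

```bash
#!/usr/bin/env bash

# chars_in_sed HAYSTACK NEEDLES  (hypothetical helper name)
# Succeeds when the sed script reduces the input to "1".
# Assumes GNU sed and that neither string contains a newline.
chars_in_sed() {
    local haystack=$1 needles=$2 result
    # NEEDLES and HAYSTACK are joined by a NUL byte, as in the one-liner.
    result=$(printf '%s\0%s\n' "$needles" "$haystack" |
        sed -E ':a;s/(.)(.*\x0)(.*)\1(.*)/\2\3\4/;ta;s/^\x0.*/1/;s/.*\x0.*/0/')
    [[ $result == 1 ]]
}

var_1="abcdefg"
var_2="bcg"

if chars_in_sed "$var_1" "$var_2"; then
    echo "All characters of var_2 occur in var_1."
else
    echo "Not all characters of var_2 occur in var_1."
fi
```

Because each successful substitution also deletes the matched character from var_1, a character repeated in var_2 has to occur at least as many times in var_1: with this approach, "bb" is not considered contained in "abc".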

A very simple method in pure bash:

#!/bin/bash

var_1="abcdefg"
var_2="bcg"

if [[ ${var_2//["$var_1"]} ]]; then
    echo "Not all characters of var_2 occur in var_1."
else
    echo "All characters of var_2 occur in var_1."
fi

The expansion ${var_2//["$var_1"]} yields the value of var_2 with every character that occurs in var_1 deleted. All characters of var_2 occur in var_1 only if that expansion is the null (empty) string.
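
To see what the test actually operates on, it can help to print the intermediate expansion. The sketch below is an illustration rather than part of the answer: the function names leftover_chars and all_chars_in are hypothetical, and var_1 is assumed to be non-empty.

```bash
#!/usr/bin/env bash

# leftover_chars HAYSTACK NEEDLES  (hypothetical helper name)
# Prints the characters of NEEDLES that do not occur in HAYSTACK.
leftover_chars() {
    local haystack=$1 needles=$2 leftover
    leftover=${needles//["$haystack"]}
    printf '%s\n' "$leftover"
}

# all_chars_in HAYSTACK NEEDLES  (hypothetical helper name)
# Succeeds only if nothing is left of NEEDLES after the deletion.
all_chars_in() {
    local haystack=$1 needles=$2 leftover
    leftover=${needles//["$haystack"]}
    [[ -z $leftover ]]
}

var_1="abcdefg"

for var_2 in "bcg" "gcb" "bxg"; do
    printf 'var_2=%s leftover=[%s]\n' "$var_2" "$(leftover_chars "$var_1" "$var_2")"
    if all_chars_in "$var_1" "$var_2"; then
        echo "All characters of var_2 occur in var_1."
    else
        echo "Not all characters of var_2 occur in var_1."
    fi
done
```

Repeated characters in var_2 are accepted here: "bbb" passes against "abcdefg", because the bracket expression acts as a set of characters and multiplicity is not counted.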

4 Comments

This appears to work and is elegant. I want to make sure I understand what's going on here though... the // operator replaces all instances of the elements of var_1 (written as an array [$var_1] ) within var_2 with the null string. So if any elements of var_2 do not exist in var_1, that will be output and since it's not equal to the null string, the "else" section is activated. Is this accurate?
@cheddarblake Roughly correct. For more information, you may read Shell Parameter Expansion, the paragraph starting with ${parameter/pattern/string}. Especially the sentence "If string is null, matches of pattern are deleted and the ‘/’ following pattern may be omitted."
@cheddarblake But [$var_1] is not an array at all. It is a bracket expression used in pattern matching. See Pattern Matching
Nice trick +1. And if you use double-quotes [[ ${var_2//["$var_1"]} ]] then it handles correctly any arbitrary sequence of characters!!!

This is not really a single if clause/statement, but something like the following works:

#!/usr/bin/env bash

i=0
var_2="bcg"
var_1="abcdefg"
total_str=${#var_2}

while (( i < total_str )); do
  [[ $var_1 = *"${var_2:i++:1}"* ]] || {
    printf >&2 'Not all characters of the string "%s" occur in the string "%s".\n' "$var_2" "$var_1"
    exit 1
  }
done

printf 'All characters of the string "%s" occur in the string "%s".\n' "$var_2" "$var_1"

Output

All characters of the string "bcg" occur in the string "abcdefg".

Changing the value of var_2 to something like

var_2="bxg"

The output should be:

Not all characters of the string "bxg" occur in the string "abcdefg".
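
If the check is needed as an actual if condition, the same loop can be placed in a function that returns a status instead of calling exit. This is a sketch under that assumption, not the answer's own code; the function name has_all_chars is hypothetical.

```bash
#!/usr/bin/env bash

# has_all_chars HAYSTACK NEEDLES  (hypothetical helper name)
# Returns 0 if every character of NEEDLES occurs somewhere in HAYSTACK.
has_all_chars() {
    local haystack=$1 needles=$2 i
    for (( i = 0; i < ${#needles}; i++ )); do
        # Quoting the character makes it match literally, not as a glob.
        [[ $haystack == *"${needles:i:1}"* ]] || return 1
    done
    return 0
}

var_1="abcdefg"
var_2="bcg"

if has_all_chars "$var_1" "$var_2"; then
    echo "All characters of var_2 occur in var_1."
else
    echo "Not all characters of var_2 occur in var_1."
fi
```

Returning a status rather than exiting lets the caller decide what to do on failure, and the loop still stops at the first character that is missing.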
