added 617 characters in body
Ed Morton

Using any awk:

$ awk 'n=index($0 RS,", characters I don\047t want" RS){$0=substr($0,1,n-1)} 1' file
ABC 123
DEF
GHI, these characters are ok

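That's doing a literal string comparison, so it'd work even if the string you're trying to match contained regexp metachars. Appending RS (a newline by default) to both strings means index() can only match at the end of the line. For example, using this input: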

$ cat file2
ABC 123
DEF, .*, .*
GHI, .* ok

We get the expected output:

$ awk 'n=index($0 RS,", .*" RS){$0=substr($0,1,n-1)} 1' file2
ABC 123
DEF, .*
GHI, .* ok
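
If the string to remove comes from outside the script, one way to keep the comparison literal is to pass it through the environment rather than with -v, since -v would expand backslash escapes in the value. A minimal sketch, assuming a hypothetical shell function named strip_suffix and a hypothetical environment variable named SUFFIX (neither is part of the commands above):

```shell
# strip_suffix is a hypothetical helper: $1 is the literal suffix to remove,
# any remaining arguments are input files (stdin is read if there are none).
strip_suffix() {
    suffix=$1
    shift
    # SUFFIX is an assumed variable name; ENVIRON hands it to awk unmodified.
    SUFFIX=$suffix awk '
        BEGIN { s = ENVIRON["SUFFIX"] }
        # Append RS to both strings so index() can only match at the end of the line.
        (n = index($0 RS, s RS)) { $0 = substr($0, 1, n - 1) }
        { print }
    ' "$@"
}
```

Calling it as strip_suffix ', .*' file2 should print the same three lines as the command above.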

If you didn't care about regexp metachars you could just do:

$ awk '{sub(/, characters I don\047t want$/,"")} 1' file
ABC 123
DEF
GHI, these characters are ok

but then you'd get unexpected output from:

$ awk '{sub(/, .*$/,"")} 1' file2
ABC 123
DEF
GHI

and you'd have to escape the metachars to make them literal to get the expected output:

$ awk '{sub(/, \.\*$/,"")} 1' file2
ABC 123
DEF, .*
GHI, .* ok

which is getting kludgy given all you really wanted was a literal string comparison.

See http://awk.freeshell.org/PrintASingleQuote for why I'm using \047 instead of '.
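
As a quick illustration of that point: a single quote cannot appear inside a single-quoted shell script, but the octal escape \047 stands for the same character in awk strings and regexps. A minimal sketch (the variable name out is only for illustration):

```shell
# \047 is the octal escape for a single quote, so the shell never sees a
# quote character inside the single-quoted awk script.
out=$(awk 'BEGIN { print "characters I don\047t want" }')
printf '%s\n' "$out"
```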

As for why to use awk instead of Python: awk is a mandatory POSIX tool, so it's guaranteed to exist on all POSIX-compliant Unix installations while Python is not, and it usually takes much less code to manipulate text with awk than it does with Python. I suspect we will have to agree to disagree on which is more easily readable and maintainable.

added 17 characters in body
Ed Morton

Using any awk:

$ awk 'n=index($0 RS,", characters I don\047t want" RS){$0=substr($0,1,n-1)} 1' file
ABC 123
DEF
GHI, these characters are ok

That's doing a literal string comparison so it'd work even if the string you're trying to match with contained regexp metachars.

If you didn't care about regexp metachars you could just do:

$ awk '{sub(/, characters I don\047t want$/,"")} 1' file
ABC 123
DEF
GHI, these characters are ok

See http://awk.freeshell.org/PrintASingleQuote for why I'm using \047 instead of '.

As for why to use awk instead of python - awk is a mandatory POSIX tool and so is guaranteed to exist on all POSIX-compliant Unix installations while python is not, and it usually takes much less code to manipulate text with awk than it does with python. I suspect we will have to agree to disagree on which is more easily readable and maintainable.

added 336 characters in body
Ed Morton

Using any awk:

$ awk 'n=index($0 RS,", characters I don\047t want" RS){$0=substr($0,1,n-1)} 1' file
ABC 123
DEF
GHI, these characters are ok

That's doing a literal string comparison so it'd work even if the string you're trying to match with contained regexp metachars.

If you didn't care about regexp metachars you could just do:

$ awk '{sub(/, characters I don\047t want$/,"")} 1' file
ABC 123
DEF
GHI, these characters are ok

See http://awk.freeshell.org/PrintASingleQuote for why I'm using \047 instead of '.

As for why to use awk instead of python - awk is a mandatory POSIX tool and so is guaranteed to exist on all POSIX-compliant Unix installations while python is not, and it usually takes much less code to manipulate text with awk than it does with python. I suspect we will have to agree to disagree on which is more easily readable.

Using any awk:

$ awk 'n=index($0 RS,", characters I don\047t want" RS){$0=substr($0,1,n-1)} 1' file
ABC 123
DEF
GHI, these characters are ok

That's doing a literal string comparison so it'd work even if the string you're trying to match with contained regexp metachars. See http://awk.freeshell.org/PrintASingleQuote for why I'm using \047 instead of '.

Post Undeleted by Ed Morton
Post Deleted by Ed Morton
Ed Morton