  • 2
    the $ anchor in that sed statement isn't necessary -- * is greedy by default Commented Apr 19, 2016 at 10:54
  • 1
    @JeffSchaller, it makes a difference on inputs that contain invalid characters (which . won't match). Commented Apr 19, 2016 at 11:11
  • 1
    I had to go look at pubs.opengroup.org/onlinepubs/9699919799/basedefs/… to see what I missed. "A <period> ( '.' ), when used outside a bracket expression, is a BRE that shall match any character in the supported character set except NUL." Is there an odd locale case to worry about? Commented Apr 19, 2016 at 11:24
  • 3
    In UTF-8 locales (the norm nowadays), . generally won't match bytes above 127 that come from single-byte character sets. é in UTF-8 is c3 a9, while it's e9 in iso8859-1 (the most common western charset before utf-8 became popular). So if you have Stéphane written in the iso8859-1 character set and the current locale's charset is UTF-8, . won't match that 0xe9 byte in between the t and p, as it doesn't form a valid character in UTF-8. So s/:.*$// on an input like foo:St<0xe9>phane:C will yield foo:St<0xe9>phane (with the $ anchor, the match can only start at the second colon), while s/:.*// will yield foo<0xe9>phane:C. Commented Apr 19, 2016 at 11:46
  • 2
    (that's something that you can generally ignore when the input is valid text, but worth mentioning once in a while). Commented Apr 19, 2016 at 11:48
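The difference discussed in the comments above can be reproduced with a rough sketch like the one below. It builds the foo:St<0xe9>phane:C input with printf and runs both sed expressions, first in the C locale (where every byte counts as a character, so both forms should behave the same) and then in a UTF-8 locale. The locale name C.UTF-8 and the variable names are assumptions made for illustration; substitute any UTF-8 locale listed by locale -a. If the chosen locale is not installed, the second part will most likely just behave like the C locale.

```shell
# Input containing a lone 0xe9 byte (octal 351): é in ISO 8859-1, invalid in UTF-8.
input=$(printf 'foo:St\351phane:C')

# C locale: every byte is a character, so . matches 0xe9 and the $ anchor
# makes no difference; both expressions delete from the first colon onward.
c_anchored=$(printf '%s\n' "$input" | LC_ALL=C sed 's/:.*$//')
c_plain=$(printf '%s\n' "$input" | LC_ALL=C sed 's/:.*//')
printf '%s\n' "$c_anchored" "$c_plain"

# UTF-8 locale (name is an assumption; pick one from `locale -a`).
# Here . should stop at the 0xe9 byte, so the two expressions can differ.
# od -c makes the raw byte visible regardless of the terminal's charset.
utf8_locale=C.UTF-8
for expr in 's/:.*$//' 's/:.*//'; do
  printf '%s\n' "$input" | LC_ALL="$utf8_locale" sed "$expr" | od -c
done
```

With valid text as input the two expressions should give identical results, which is why the anchor can usually be left out.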