6
\$\begingroup\$

You are to write a program that takes a list of strings as input. For every string in the list you are to determine the smallest N such that no other string in the list begins with the same N characters as the string in question. Now compose a new list with all these strings. Here is an example, on the left we have the input and on the right we have the output

aba                   aba
aababba               aa
abbaaabbb          -> abba
bbbbbbabababaaab      b
abbbabbaa             abbb

If a there is a string that is also the beginning of another string you should append a $ to the end to represent the end of the string. Here is an example where that is the case

ababababbabab         abababa
ababbbaabababa        ababbb
ababb                 ababb$
bababba            -> bababb
bbbabababab           bb
bababababbabab        bababa
abababbababa          abababb

Rules

  • Input will only contain alphabetic characters.

  • You will not receive a list with the same entry twice.

  • You take input in any ordered container.

  • This is , you should aim to minimize the byte size of your program.

\$\endgroup\$
6
  • 4
    \$\begingroup\$ Isn't there already a challenge like this? \$\endgroup\$ Commented Jul 10, 2017 at 21:13
  • 2
    \$\begingroup\$ Related. \$\endgroup\$ Commented Jul 10, 2017 at 21:16
  • 3
    \$\begingroup\$ Related. \$\endgroup\$ Commented Jul 10, 2017 at 21:38
  • \$\begingroup\$ Also related. This may have been the one you guys were thinking of, though it's not really a dupe (the "ID"s only have to be unique up to that point in the list) \$\endgroup\$ Commented Jul 11, 2017 at 0:12
  • \$\begingroup\$ Is something like ["ba,"baa","bab"] a valid input? It does not have the same entry twice, but there is no N defined for the first string. If it is a valid input should it return ["ba$","baa","bab"], ["b$","baa","bab"], ["ba$$","baa","bab"], ["b$$","baa","bab"], any of the above, or something else? \$\endgroup\$ Commented Jul 11, 2017 at 11:03

7 Answers 7

6
\$\begingroup\$

Husk, 14 13 12 bytes

Ṡzo↑▲‡Ṫ≠TT'$

Try it online!

Less than half the length of the previous best solution!

Explanation

Ṡzo↑▲‡Ṫ≠TT'$
        TT'$     Pad all strings to equal length by appending dollar signs

     ‡Ṫ≠         Create a 2D matrix with all pairwise differences between
                 strings. ≠ returns the smallest index at wich two lists differ
                 (or 0, if they are equal)

Ṡzo↑▲            For each string, take as many characters as the maximum number
                 in its row of the table.

Since my explanation may not be that clear, here's a small worked example:

Starting list: ["abc","abd","b","bcd"]

Table of differences:

     abc abd  b  bcd     max
 abc  0   3   1   1       3
 abd  3   0   1   1       3
 b    1   1   0   2       2
 bcd  1   1   2   0       2

So we take from each string (with added trailing '$') as many characters as stated in the "max" column.

Final result: ["abc","abd","b$","bc"]

\$\endgroup\$
3
\$\begingroup\$

Jelly,  28 26  16 bytes

-10 bytes by implementing the algorithm found by Leo in their brilliant Husk answer.

i€0Ṁḣ@
;€”$µ=þç"

or...

;€”$µ=i0µþ⁸Ṁ€⁸ḣ"

A monadic link taking and returning lists of lists of characters.

Try it online! (the footer makes a full program which prints the result split by newlines.)

How?

i€0Ṁḣ@ - helper link, create an entry of the output: equality row, string with trailing '$'
i€0    - first index of zero in €ach entry of the row
   Ṁ   - maximum
    ḣ@ - head to index with swapped @rguments (the prefix of the string)

;€”$µ=þç" - link: list of lists of characters (list of "strings")
  ”$      - literal '$'
;€        - concatenate for €ach
    µ     - monadic chain separation (call that x)
      þ   - table of (with x on the left and, implicitly, on the right):
     =    -   equals? (vectorises)
        " - zip with (with the table on the left and, implicitly, x on the right)
       ç  - call the last link as a dyad

My original:

-2 bytes thanks to Erik the Outgolfer (replace ḣJ$ with ;\ and ẎċЀЀ$ with ċ@€€Ẏ$)

NMḢ⁹ḣ;⁸Ṃẋ@”$¤Ṗ
;\€ċ@€€Ẏ$ç"

A monadic link taking and returning lists of lists of characters.

Try it online! (the footer makes a full program which prints the result split by newlines.)

I'm almost certain this is beatable, and probably by a decent margin! (although I have attempted to golf the method.)

The same byte count may be achieved without a helper link too, with:

ḣJ$€ẎċЀЀ$ðNMḢ⁹ḣ;⁸Ṃẋ@”$¤Ṗð"

How?

Note: the reusable link is the second line of code, so start there.

NMḢ⁹ḣ;⁸Ṃẋ@”$¤Ṗ - helper link, create an entry of the output: prefix counts, string
               -   ("prefix counts" should be counts of the prefixes of the "string" in
                    the totality of prefixes of *all* the strings)
N              - negate the counts
 M             - maximal indexes (lengths of prefixes appearing least often, ascending)
  Ḣ            - head (finds the minimal length required), call that ml
   ⁹           - chain's right argument (prefixes)
    ḣ          - head (string) to index ml (gets the minimal length prefix)
            ¤  - nilad followed by links as a nilad:
      ⁸        -   chain's left argument (prefix counts)
       Ṃ       -   minimum (this will either be 1 or 2)
          ”$   -   literal '$'
        ẋ@     -   repeat with swapped @rguments (either "$" or "$$")
     ;         - concatenate
             Ṗ - pop (remove the last "$" - leaving one where the prefix occurs in another
               -      string's prefixes, and none otherwise)

;\€ċ@€€Ẏ$ç" - link: list of lists of characters (list of "strings")
  €         - for €ach string
 \          -   cumulative reduce with:
;           -     concatenation
            - (gets a list of lists of prefixes)
        $   - last two links as a monad:
       Ẏ    -   tighten (flatten by one to make a single list of all prefixes)
    @       -   swap arguments
     €€     -     for each prefix in each list of prefixes
   ċ        -       count occurrences in the tightened list (>=1 since it counts itself)
          " - zip with the dyad (right argument is this link's argument):
         ç  -   last link (helper) as a dyad
\$\endgroup\$
2
  • \$\begingroup\$ There's ;\ which you can use instead of ḣJ$ for -1. \$\endgroup\$ Commented Jul 11, 2017 at 9:19
  • \$\begingroup\$ You can replace ẎċЀЀ$ with ċ@€€Ẏ$ for -1. \$\endgroup\$ Commented Jul 11, 2017 at 9:21
3
\$\begingroup\$

Haskell, 83 79 bytes

f l|let h=map(++"$")l;w x|y<-init x,2>sum[1|z<-h,y<=z,y++"{">z]=w y|1<3=x=w<$>h

Try it online! Example usage: f ["abcd","abh","ab","bacd"] yields ["abc","abh","ab$","b"].

Given a list of strings, the function f appends a trailing $ and applies w to each string, where w iteratively drops the last character of the string (with init) until the next application of init would lead to the string now longer being a unique prefix.

Edit: Three bytes off thanks to Ørjan Johansen's shorter prefix test!

\$\endgroup\$
2
  • \$\begingroup\$ Good point, looks like I can't count. \$\endgroup\$ Commented Jul 10, 2017 at 23:30
  • 1
    \$\begingroup\$ y<=z,y++"{">z is a shorter prefix test. \$\endgroup\$ Commented Jul 11, 2017 at 1:16
2
\$\begingroup\$

Haskell, 109 104 98 97 90 84 83 bytes

Probably not very good but Hopefully this will kick things off a bit.

f s=[[x|x<-map(`take`(a++"$"))[0..],[1]==[1|d<-s,and$zipWith(==)x$d++"$"]]!!0|a<-s]

Try it online!

\$\endgroup\$
1
  • \$\begingroup\$ I think x<=d++"$",x++"{">d is a shorter prefix test. \$\endgroup\$ Commented Jul 11, 2017 at 1:32
1
\$\begingroup\$

JavaScript (ES6), 89 86 bytes

a=>a.map(g=(s,i,a,t=s+'$')=>a.some((t,j)=>j-i&&!t.search(s))?t:g(s.slice(0,-1),i,a,s))

Edit: Saved 3 bytes thanks to @CraigAyre. (Supporting arbitrary characters would still have saved two bytes using !t.indexOf(s).)

\$\endgroup\$
1
  • \$\begingroup\$ Can you use search instead of startsWith?: !t.search(s) \$\endgroup\$ Commented Jul 12, 2017 at 14:58
0
\$\begingroup\$

Mathematica, 141 142 bytes

g[x_:0,___]:=x;f=#<>""&/@(Thread@h[x=#~PadRight~99&/@(Characters[#<>"$"]&/@#),Max/@Outer[g@@Join@@Position[#-#2,y_/;y!=0]&,x,x,1]]/.h->Take)&

I think I can golf it a bit. The code doesn't work on Mathics so you may use Wolfram Sandbox to test.

\$\endgroup\$
0
\$\begingroup\$

PHP, 155 129 102 bytes

for(;++$k<$argc;$t=!print"$t
")for($i=0;count(preg_grep("_^".($t.=$argv[$k][$i++]?:"$")._,$argv))>1;);

expects input to not contain regex special chars and that no argument is -.
Run with php -nr <word> <word> ... or try it online.

breakdown

for(;++$k<$argc;            # loop through arguments
    $t=!print"$t\n")            # 2. print $t, reset $t
    for($i=0;                   # 1. loop through string:
    1<count(preg_grep("_^".(        # 2. continue while more than one argument begins with $t
        $t.=$argv[$k][$i++]?:"$"    # 1. append current character or "$" to $t
    )._,$argv)););
\$\endgroup\$
0

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.