5
\$\begingroup\$

The question asked is very straight-forward and is simple enough to solve. What I am looking for is that hopefully I can get some understanding for using the constructs and built-ins of the Haskell language itself.

That said, the question is as follows:

Joseph and Jane are making a contest for apes. During the process, they have to communicate frequently with each other. Since they are not completely human, they cannot speak properly. They have to transfer messages using postcards of small sizes.

To save space on the small postcards, they devise a string compression algorithm:

  • If a character, \$ch\$, occurs \$n(> 1)\$ times in a row, then it will be represented by \$\{ch\}\{n\}\$. where \$\{x\}\$ is the value of \$x\$. For example, if the substring is a sequence of \$4\$ 'a' ("aaaa"). it will be represented as "a4".

  • If a character, \$ch\$, occurs exactly one time in a row, then it will be simply represented as \$\{ch\}\$. For example, if the substring is "a". then it will be represented as "a".

Help Joseph to compress a message. msg.

Input

The only line of input contains a string. msg.

Output

Print the string msg as a compressed message.

Create a messaging system where each word (token) will be compressed based on any consecutive characters.

Single sequenced characters remain the same but multiple identical consecutive characters will be replaced by the character followed by an integer representing the number of times it repeats consecutively.

Example input and output:

Sample Input #00:

> abcaaabbb

Sample Output #00:

> abca3b3

Sample Input #01:

> abcd

Sample Output #01:

> abcd

My code:

-- String representation as a simple Compression Algorithm
--

module Main where
import Text.Printf


compression :: String -> String -> String -> Int -> String
compression input output prevChar count
    | input == "" && output == "" = output
    | prevChar == "" = compression (tail input) output ([head input]) 1
    | input == "" && count > 1 = output ++ prevChar ++ (show count)
    | input == "" && count < 2 = output ++ prevChar
    | prevChar == ([head input]) = compression (tail input) output prevChar (count+1)
    | prevChar /= ([head input]) && count < 2 = compression (tail input) (output ++ prevChar) ([head input]) 1
    | prevChar /= ([head input]) && count > 1 = compression (tail input) (output ++ prevChar ++ (show count)) ([head input]) 1

main :: IO ()
main =
  do
    let s1 = "aaabaaaaccaaaaba"
    printf "Decompressed: %s and Compressed: %s" s1 (compression s1 "" "" 0) 

I feel as though some of my Haskell code can be clunky and could use some more features of the language itself as I mentioned earlier.

Now this isn't a full-blown application or even a part of a module for an actual application but just an exercise to improve my working knowledge and skill for writing Haskell.

\$\endgroup\$
1
  • 1
    \$\begingroup\$ @J_H the text of the description from the image has been added \$\endgroup\$ Commented Sep 13, 2024 at 15:11

3 Answers 3

5
\$\begingroup\$

I agree with some of the criticism mentioned in the other review, notably:

  1. You're trying to write a function from String to String, so you should have such a function.
  2. In general, be more eager to break up your work into smaller named pieces.

I'd like to add another point, which is heavy use of existing functions (take, drop, takeWhile, dropWhile, splitAt, etc.) instead of iterating yourself. You can search for functions on hoogle.haskell.org by entering the desired type signature and it spits out all matching functions. In many cases, you will find an existing function that fits you needs. If not, write it yourself, but keep it as a separate helper function.

Also, use pattern matching in you function signature. If you write input == "" && output == "" in the pattern and then compression (tail input) output ([head input]) in the function body, you probably should have deconstructed the list during pattern matching. In my code below, I use input@(a : _) to extract the head and the full list (including the head), but I could also extract the tail if I wanted to (by replacing _ with a variable name). This line automatically rejects empty lists, so execution will fall through to until it finds a matching pattern.

I strongly disagree with the code provided in the other review, which is equally clunky than your original code, if not more. An idiomatic solution to your problem looks more like this:

compression :: String -> String
compression "" = ""
compression input@(a : _)
  | count > 1 = a : show count ++ compression remainder
  | otherwise = a : compression remainder
  where count = length $ takeWhile (== a) input
        remainder = dropWhile (== a) input

If you want to avoid iterating twice, you can use span, which returns a tuple and is equivalent to (takeWhile, dropWhile). The updated code is:

compression :: String -> String
compression "" = ""
compression input@(a : _)
  | count > 1 = a : show count ++ compression remainder
  | otherwise = a : compression remainder
  where count = length prefix
        (prefix, remainder) = span (== a) input

For absolute best performance, you can create a helper function that replaces dropWhile but also reports the number of dropped items. This avoids creating the prefix list and then counting its length.

compression :: String -> String
compression "" = ""
compression input@(a : _)
  | count > 1 = a : show count ++ compression remainder
  | otherwise = a : compression remainder
  where (remainder, count) = dropWhileAndCount (== a) input

dropWhileAndCount :: (a -> Bool) -> [a] -> ([a], Int)
dropWhileAndCount f = go 0
  where go n (x : xs) | f x = go (n + 1) xs
        go n xs = (xs, n)
\$\endgroup\$
2
  • \$\begingroup\$ Agreed that this is a much nicer solution to the original coding challenge. \$\endgroup\$ Commented Aug 28, 2024 at 16:09
  • \$\begingroup\$ take(drop)While was my exact thought process. I was on Hoogle search for them both right before I had originally posted this. I haven't leveraged where nearly enough and this shows it. I'll start to include this instead of creating more recursive cases. I think this is pretty self-evident with my code provided. Thanks again! \$\endgroup\$ Commented Aug 31, 2024 at 17:42
4
\$\begingroup\$

The first two sections apply to every problem and every programming language.

Use complementary conditions

I recommend against the > 1 and < 2 conditions. When > 1 is one case, the other should be <= 1. How easy is it to spot the mistake in

f x | x < 1 = "foo"
    | x > 2 = "bar"

You would have never written the equivalent

f x | x < 1 = "foo"
    | x >= 3 = "bar"

because it is so obviously wrong. Of course the haskell way of writing it is

f x | x > 1     = "bar"
    | otherwise = "foo"

but I like my code to closely mirror the problem statement, see next section.

Follow the problem statement as closely as possible

This is very pedantic but it helps in larger problems. Your order of conditions is swapped. The question is

  • Single sequenced characters remain the same
  • multiple identical consecutive characters will...

but you handle the multiple case first. Ideally, at some place on code, you can write the code the sequence as the problem statement.

encode count char
  | count == 1 = char       -- Single sequenced characters remain the same but 
  | count >  1 =            -- multiple identical consecutive characters will be replaced
                 char       -- by the character 
                 ++         -- followed by an
                 show count -- integer representing the number of times it repeats consecutively

The comments here are clearly overdoing it, but in the real world you know which part is the most important to the customer or where the requirements are most likely to be changed (slightly).

use standard functions

Learn the standard functions. For the compression algorithm, group jumps to my mind:

group :: Eq a => [a] -> [[a]]

> group "abcaaabbb" 
["a","b","c","aaa","bbb"]

compression = concatMap encode' . group where
   encode' x = encode (length x) x
   encode -- see above
\$\endgroup\$
1
  • \$\begingroup\$ I enjoy reviewing from past times. I re-read your comment and I absolutely agree, and strongly for that matter, with your post. \$\endgroup\$ Commented Aug 19 at 17:29
3
\$\begingroup\$

There are a lot of tools Haskell provides for doing things like this elegantly; I'll try to limit my feedback to boring solutions that will be helpful.

  1. You're trying to write a function from String to String, so you should have such a function. Rename what you've got now as compression', and provide a wrapper.
  2. By convention, try to have "context like" arguments first and "data like" arguments last; this will assist with partial-application of your functions.
  3. Having prevChar :: String isn't good; what if it happened to be "asdf"? Maybe Char would be better.
  4. In general, be more eager to break up your work into smaller named pieces.
  5. The quantity of arguments you're handling, and the quantity of cases you're handling in parallel, should feel like a red flag: you need a better abstraction. You've probably already noticed that Functor won't work for this. With a little experience, you'll start to notice that your current recursion scheme basically is the State monad. I'm using the mtl library for this but we don't need to use the Transformer version; just State will serve.

By the time I got this working, it didn't have much in common with what you wrote, sorry. Hopefully it contains the right number of new things for you to learn!

module Main where

import Control.Monad (forM_, when)
import Control.Monad.State (execState, get, gets, modify, put, State)
import Text.Printf

data CompressorState = CompressorState { output :: String
                                       , prevChar :: Maybe Char
                                       , count :: Int
                                       , input :: String
                                       } deriving (Show)

type Compressor = State CompressorState

compression' :: Compressor ()
compression' = do
  mChar <- getNext
  case mChar of
    Nothing -> flushCount
    Just char -> do
      previous <- getPrevious
      when (Just char /= previous) $
        do flushCount
           emit char
           setPrevious char
      increment
      compression'
  where getNext = do cs@CompressorState{input = i} <- get
                     case i of [] -> return Nothing
                               c:i' -> do put cs{input = i'} -- Using record update syntax
                                          return (Just c)
        getPrevious = gets prevChar
        increment = modify (\cs@CompressorState{count = c} -> cs{count = c + 1})
        flushCount = do c <- gets  count
                        when (1 < c) $ forM_ (show c) emit
                        modify (\cs -> cs{count = 0})
        emit :: Char -> Compressor ()
        emit c = modify (\cs@CompressorState{output = o} -> cs{output = c : o})
        setPrevious :: Char -> Compressor ()
        setPrevious c = modify (\cs -> cs{prevChar = Just c})


compression :: String -> String
compression i = reverse . output $
                  execState compression' CompressorState{ output = ""
                                                        , prevChar = Nothing
                                                        , count = 0
                                                        , input = i
                                                        }

examples :: [(String, String)]
examples = [ ("abcaaabbb", "abca3b3")
           , ("abcd", "abcd")
           , ("aaabaaaaccaaaaba", "a3ba4c2a4ba")
           ]

main :: IO ()
main =
  do
    let results = do -- This is in the List Monad!
                    (input, reference) <- examples
                    let output = compression input
                    return (output == reference, input, output, reference)
    forM_ results print
\$\endgroup\$
5
  • \$\begingroup\$ I have to be honest, I'm not sure I would have used this exact form but I think that is only natural when it comes to writing in a functional form. That said, I was able to learn a few things just by having gone through what you did here and breaking each part down. \$\endgroup\$ Commented Aug 22, 2024 at 13:56
  • \$\begingroup\$ I've seen this Monad scheme many places now where case catches Maybe and Nothing/Just \$\endgroup\$ Commented Aug 23, 2024 at 0:11
  • \$\begingroup\$ All when, put, & get are new to me. I'll stop adding every minor comment but I want to say truly thank you. \$\endgroup\$ Commented Aug 23, 2024 at 1:57
  • 1
    \$\begingroup\$ "I've seen this Monad scheme many places now where case catches Maybe" Just note that case expressions are just expressions, nothing about them is tied to monads. (One extra thing I considered using in my response was the LambdaCase extension, which often makes cases more fun to use.) \$\endgroup\$ Commented Aug 24, 2024 at 0:34
  • \$\begingroup\$ Of course, I've just seen the do notation (where) is used heavily with case where Monads are being used. I'm not at all familiar enough to speak much more on it or LambdaCase. I'm still picking apart several aspects of your example. It's short but packed with nuance. \$\endgroup\$ Commented Aug 24, 2024 at 1:14

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.