2
$\begingroup$

Contiguous arrays do not mix with lazy evaluation. That's why Haskell doesn't have contiguous arrays as a primitive type, and why even GHC has a poor API for them. As such, I sought for a workaround.

One quirk about contiguous arrays is that, though they have O(1) access time, this complexity is an artifact emerging from the word length being finite. If there were no upper bound to the word length, the minimum time complexity of indexing would be logarithmic.

And indeed, even in Haskell, there exists a data structure with log-time indexing, namely the binary tree (relevant CG SE post):

data LogList a = Leaf | Branch a (LogList a) (LogList a)

(!?) :: LogList a -> Int -> Maybe a
Leaf !? _ = Nothing
Branch x _ _ !? 1 = Just x
Branch _ xs ys !? n
  | n <= 0 = Nothing
  | even n = xs !? quot n 2
  | otherwise = ys !? quot n 2

Note that it has log-time cons and uncons as well:

cons :: a -> LogList a -> LogList a
cons x Leaf = Branch x Leaf Leaf
cons x (Branch y xs ys) = Branch x (cons y ys) xs

uncons :: LogList a -> Maybe (a, LogList a)
uncons Leaf = Nothing
uncons (Branch x xs ys) = Just . (,) x $ case uncons xs of
    Nothing -> Leaf
    Just (y, zs) -> Branch y ys zs

And as for insertion and deletion at an arbitrary index, though I've not formally implemented them, I guess they'd take linear-logarithmic time.

Another big bonus is the existence of lazy infinite lists, for example:

repeat :: a -> LogList a
repeat x = Branch x (repeat x) (repeat x)

So why doesn't Haskell provide this data structure in the base package, nor even in the containers package? Is there any caveat I've not seen?

$\endgroup$
4
  • $\begingroup$ Data.Sequence in the containers package uses finger trees. $\endgroup$ Commented May 14, 2024 at 5:45
  • 1
    $\begingroup$ @alephalpha IIRC, Data.Sequence can only have finitely many items. $\endgroup$ Commented May 14, 2024 at 5:46
  • $\begingroup$ Minor typo: "continugous" for "contiguous". $\endgroup$ Commented May 15, 2024 at 13:58
  • 1
    $\begingroup$ I think the main point in favour of using binary trees for immutable lists, is that they allow concatenation in O(1) time, or O(log n) time if you want to keep them balanced. So they might be worth considering over linked lists (aka cons lists) if concatenation is the primary way that lists are meant to be built in your language. $\endgroup$ Commented May 16, 2024 at 19:25

1 Answer 1

2
$\begingroup$

Somewhat

All of the points you've made stand, but here are the most obvious concerns, in a vague order of importance:

  • O(n) space overhead, on top of the data you're storing
  • Pretty bad cache locality
  • Infinite datastructures aren't actually that useful most of the time (big bonus is a stretch at best, I think)
  • Something like a finger tree is really just better for almost everything except allowing infinite structure
$\endgroup$
2
  • 1
    $\begingroup$ What is a "finger tree"? $\endgroup$ Commented May 17, 2024 at 0:12
  • $\begingroup$ @KarlKnechtel See alephalpha's comment above. $\endgroup$ Commented May 17, 2024 at 3:54

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.