The better default then List is... List!
While this is not exactly what the question is looking for, the performance between List and Array is more nuanced then "Array fit everything consecutively in memory so it is good". The benefit of List is also amplified in functional world.
TLDR: Array is still faster in functional languages, but
- List is a bit faster and Array is a bit slower, so the gap is not as big
- List can be made faster with optimizations
- List is more "functional".
List is a bit faster and Array is a bit slower, so the effect is not as big
- Dynamic Array is not necessary more
memory efficient then List. To have amortized O(1) insert, an array allocate more storage then needed to store the elements, copying/moving all elements to a new array when capacity is breached again. This mean dynamic array will use a constant factor more memory than needed. On the other hand, linked list have a overhead (the pointers) multiple of the list length. So basically - the larger the elements size, the more memory array will take up vs list (and array can be more inefficient!). You can use a rootish array, but then you lost some memory locality.
- Memory Strategy matter. In a unmanaged/mark-sweep language, you can't bump allocate, so consecutive call to malloc give non-consecutive address. However in mark-copy/mark-compact you always get consecutive address. Some garbage collector even try to move objects in a way to improve locality so you may gain more.
- If you already box or otherwise store pointers in the array, it is not that cache-friendly and you are paying for cache miss. This make paying for cache miss induced by list less of a bottleneck. And in functional language boxing happens basically everywhere.
List can be made faster with optimizations
Let's look at this classical list definition:
data List0 a = Nil | Cons a (List0 a)
You can imagine the compiler loop-unrolling it, giving you a segment list:
data List1 a = Nil | Cons1 a (List1 a) | Cons2 a a (List1 a)
or
data List2 a = Nil | Cons1 a | Cons2 a a (List2 a)
Note how List2 is isomorphic with List0, but List1 may represent the same List in multiple ways.
This make List1 better then List2 - List2 with odd length, appending with another List2, will cause a deep copy and traversing the other List2. List1 is free from said problem.
If the compiler successfully managed to mostly use Cons2, there will be half as much pointer chasing. You could imagine loop unrolling more - or going even fancier and have List (DynArray a) as the internal represetation for List a. But note that alas, loop unrolling work for other ADT (e.g. treelike) as well, but the DynArray approach become much harder for other ADT.
List is more "functional".
- You can cons a List as many time as you like, but you can only append on an array once. Subsequent apply force a copy or get you into segment-list world.
- Boxing is the bread and butter for functional programming. You want cheap 'copying' for everything? You need boxing! You want laziness? You need boxing! Have polymorphic function but dont want to monomorphize and compile everything multiple time ala C++? You need boxing!
- It is not List, but ADT. In SML/Haskell you dont actually have lots of list, but a wide variety of ADT of different kinds, for example representing ASTs. It is unclear how you can 'extend' Array to store ASTs. And in fact even in unmanaged languages like C, C++, Rust, the canonical solution is to box them and throw locality out of the windows. I would claim that in such case performance is even worse in unmanaged world, as the canonical solution is to use ref-count to managed memory, which is substantially slower than a good GC.
Still, you want Array in your language as they are useful (just not good enough to be a default for general-purpose functional language).
A good functional api will be as follow:
Array: Type -> Type
build: (Nat -> a) -> Nat -> Array a
get: Array a -> Nat -> a (or Maybe a, getting safety for a bit of verboseness)
length: Array a -> Nat
as an example, append could then be written as:
append x y = build (\i -> if i < lx then get x i else get y (i - lx)) (lx + ly)
where
lx = length x
ly = length y
Note two facts:
- It seems inefficient because you have to copy the array over and over, as that is the functional style, but it could be fixed via functional-but-in-place style: <Perceus: Garbage Free Reference Counting with Reuse>. Said approach could even turn the classical quicksort program into an in-place one: Fully in-Place Functional Programming!
- If you want mutability, either for expressiveness reason, or the automated methods above is not good enough: You dont need an imperative array, separated from the above definition! Ref (Array (Ref a)) get you what you want, and you can remove one of the two Ref for precise control. Even cooler - above mutable Array can invoke polymorphic function of immutable form (e.g. map, filter, length)!