16
$\begingroup$

This is something that has slightly annoyed me for a while. A map as a mathematical object (function) is by default "unordered", and the same is for maps as a data structure AKA associative arrays or dictionaries (python). So why did they choose to name the ordered map (usually a binary search tree) std::map and the map data structure (usually a hashmap) std::unordered_map, when std::ordered_map for the ordered map and std::map for the map data structure makes more sense?

For comparison, I'm not sure how browsers implement ES6 Map objects (the standard only seems to require sub-linear time operations), but they only preserve insertion order, so aren't "ordered" in the usual sense.

$\endgroup$
4
  • $\begingroup$ " A map as a mathematical object (function)" ─ This assumes that mathematical functions are the correct model for maps/dictionaries; but functions must define a result for all inputs, whereas dictionaries need not have a value associated with every key. Even then, other mental models such as "a table with two columns" are also viable, and tables naturally have their rows in some order. $\endgroup$ Commented Jul 30, 2024 at 12:14
  • $\begingroup$ @kaya3 every map/dictionary implementation I've seen has to have key-value pairs, even if the value is a null value. The table example is interesting too because the relational model for data doesn't have any concept of row order. $\endgroup$ Commented Jul 30, 2024 at 13:10
  • $\begingroup$ I didn't mean the relational model, I meant actual tables, as in two-dimensional grids with actual rows and columns. The relational model is an abstraction of that; it's more natural to ask why the relational model is unordered, than to ask why table rows are ordered. $\endgroup$ Commented Jul 30, 2024 at 13:14
  • $\begingroup$ Yes, that's true. Map in every other language I've used has not imposed any kind of key order. $\endgroup$ Commented Jul 30, 2024 at 13:28

3 Answers 3

25
$\begingroup$

The short answer: Historical reasons.

Alexander Stepanov's philosophy was always that the algorithms come first. One analogy he always liked was with mathematics. When mathematicians present their work, they generally start with the axioms and work forwards. But when they are actually developing their work, they generally start with the interesting theorems and work backwards.

It's the similar story with the Standard Template Library. He believed that the interesting algorithms come first, and the data structures and their APIs get designed around them. This was especially important if the algorithms were to be generic.

The very earliest versions of the libraries that eventually became Standard Template Library (e.g. the Ada Generic Library Linear Data Structure Packages) didn't include associative containers such as maps and multimaps. Notably, they were also not pitched to the C++ standards committee in the earliest presentations.

Since the Ada Generic Library and original Standard Template Library were designed around algorithms, the interesting algorithms were generally on containers of a single type. When you have containers of single types, it's more or less clear what an "iterator" should look like, for example.

But associative containers have two interesting types: a "key" type and a "value" type. So what does a generic algorithm "iterate" on? Keys? Key-value pairs? We know what Stepanov and Lee eventually settled on, but it had to be worked out.

And that's just the API between containers and generic algorithms. The other difficult part of the API is how a programmer should present their own user-defined types to the STL.

C++ has native syntax for overloading the less-than and equal-test operators. And these need to be implemented already to write generic sort algorithms. Programmers find it quite easy to provide a less-than comparison for their user-defined types compared to, say, a radix query, which is why "library" sort functions tend to be comparison-based rather than radix-based.

In that context, it was clear that implementing associative containers that are based on comparisons was going to be easier and less controversial than ones based on hashing. Hash tables would require designing an API for providing user-declared hash functions. Unlike comparison, C++ didn't have "native" hash function support, so it needed to be designed, and it wasn't clear what the "right" design should be.

(This is a little off track, but it's also worth noting that the research community also wasn't clear as to whether or not "generic" algorithms on hash tables should be "aware" of hash table structure in some way.)

There was an early proposal for adding hash tables to the C++ standard library, more or less tracking HP's implementation, but it was too late to make it into the C++98 standard. What happened in the mean time is that third-party vendors wrote their own hash_map/hash_set (and the multi-variants) implementations that were all broadly similar, but subtly mutually incompatible.

The Boost project was started in 1998 by members of the C++ committee, in part to be a test lab for new proposals to be added to the C++ standards library. When it came time to experiment with hash tables, in around 2003, Jeremy Maitin-Shepard chose unordered rather than hash for Boost's implementation, essentially because none of the vendors had chosen that name, so there's no chance that they would clash.

Boost's proposal eventually was adopted into TR1, and subsequently C++11.

TL;DR version:

Languages like Perl, Python, and ECMAScript started with the data structures and did the algorithms later, which is why the hash tables get the simpler names. C++ started with the algorithms and did the data structures later, which is why the ordered search trees got the simpler names.

$\endgroup$
16
  • 1
    $\begingroup$ Fascinating history. It doesn't make sense to me to start with algorithms before even having data structures to support them. Very often the algorithm to be efficiently implemented depends critically on the data structure. $\endgroup$ Commented Jul 19, 2024 at 3:51
  • 11
    $\begingroup$ @qwr Stepanov's crucial observation is that algorithms don't depend on the data structure specifically, but only on what he called an "iterator category". Quick sort, quick select, etc all work on random-access containers, and hence random access iterators. Similarly, sorting a singly-linked list and a doubly-linked list can use the same algorithm. Sort-decreasing and sort-increasing can also use the same algorithm, only with iterators reversed. And algorithms like "sum all the elements" only ever need generic forward iteration. $\endgroup$ Commented Jul 19, 2024 at 6:22
  • 3
    $\begingroup$ @Pseudonym, and that is why the C++ algorithms are more advanced and flexible than most other languages while simultaneously being a huge pain to use (iterator pair based rather than container based) and hard for a newer developer to reason about. The Ranges library tries to make them somewhat easier to use but ranges themselves are hard to use vs Java streams or C# LINQ. $\endgroup$ Commented Jul 19, 2024 at 14:44
  • 3
    $\begingroup$ That must be why I have to write out std::sort(s.begin(), s.end()); instead of std::sort(s); like a normal language $\endgroup$ Commented Jul 19, 2024 at 17:48
  • 3
    $\begingroup$ We can argue about what's "better" all day. There is a certain amount of pain in making any code generic, especially in an imperative language. Moving the burden from the algorithm writer to the data structure implementor was an interesting and practical design choice. Perhaps even remarkable, when you consider that the world was OO crazy in the late 80s and early 90s, and the STL didn't contain a single virtually inherited anything. $\endgroup$ Commented Jul 20, 2024 at 9:10
4
$\begingroup$

Historical accident. map was part of the original Standard Template Library approved by ANSI/ISO in 1994. std::unordered_map was added to the C++ standard library with C++ 11.

$\endgroup$
0
$\begingroup$

I think a major reason is that a std::map is not ordered, so calling it a ordered_map would be confusing. The term ordered map usually refers to a data structure that preserves items in an order (like a list or vector) and ALSO supports fast lookup by key. The order is independent of the key type (and different ordered maps may contain the same keys and values in a different order).

Maybe you could call std::map a sorted_map (as it stores items sorted by key)

$\endgroup$
3
  • 2
    $\begingroup$ I'd say it is ordered, given that the total order maintained is on the keys. Python calls one of its collections objects OrderedDict that preserves insertion order like your terminology. (Apparently to support insertion order and key lookup, it has a doubly linked list and second dict for that linked list...) I'm totally fine with the name sorted_map and it's probably clearer, but the main point is that a unqualified "map" is unordered/unsorted. $\endgroup$ Commented Jul 19, 2024 at 14:22
  • 2
    $\begingroup$ std::map entries are absolutely ordered. They may not be linear in memory, but neither are the nodes in std::list. Every std::map has a comparator, which defines the order. For a type to be "ordered" simply means that for any elements of the type, a b, and c, less(a, b) implies !less(b, a), less(a, a) is false, and less(a, b) && less(b, c) implies less(a, c). The order is "independent of the key type", but dependent upon the comparator (which may be customized per key type). $\endgroup$ Commented Jul 19, 2024 at 22:29
  • 2
    $\begingroup$ The fact that the name unordered_map was selected for hash-maps specifically to highlight the difference with map makes it pretty clear that from the C++ standard point of view map is considered to be ordered. Also, CS uses "Partial Order" and "Total Order" to describe comparison properties... $\endgroup$ Commented Jul 21, 2024 at 11:16

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.