g++ unordered_multimap: an exercise

I discovered this randomly several weeks ago while debugging something else at work, and I thought it was worth sharing since the g++ STL is widely used.



Step 1: Save the following file as mapdemo.cpp:




#include <iostream>
#include <unordered_map>

int main() {
typedef std::unordered_multimap<int, int> int_multimap;
int_multimap map;
for (int i = 0; i < 10000; ++i) {
map.insert(int_multimap::value_type(17, i));
}
std::cerr << *static_cast<int*>(0);
return 0;
}


Step 2: Compile the file:




g++ --std=c++0x -g mapdemo.cpp


Step 3: Load the file in gdb and examine the buckets:




$ gdb -silent ./a.out
Reading symbols from xxx/a.out...done.
(gdb) run
Starting program: xxx/a.out

Program received signal SIGSEGV, Segmentation fault.
0x0000000000400b4d in main () at mapdemo.cpp:10
10 std::cerr << *static_cast<int*>(0);
(gdb) p map._M_bucket_count
$1 = 15173
(gdb) p map._M_buckets[0]
$2 = (std::__detail::_Hash_node<std::pair<int const, int>, false> *) 0x0
(gdb) p map._M_buckets[15172]
$3 = (std::__detail::_Hash_node<std::pair<int const, int>, false> *) 0x0
(gdb) p map._M_buckets[17]
$4 = (std::__detail::_Hash_node<std::pair<int const, int>, false> *) 0x605080
(gdb) p *map._M_buckets[17]
$5 = {_M_v = {first = 17, second = 0}, _M_next = 0x670c90}


Yes, g++'s implementation of unordered_multimap (known as hash_multimap in pre-C++0x versions of C++) uses bucket hashing, but the size of the backing array is proportional to the count of elements in the multimap, not the count of distinct keys.



Exercise for the reader: Explain what I just did; explain why the result of step 3 is curious; and then explain why the authors might have chosen to do it this way anyway.



It occurs to me that this would have made a decent interview question if I hadn't written it up here. Oh well, I have others.

0 Response to "g++ unordered_multimap: an exercise"

Post a Comment

Iklan Atas Artikel

Iklan Tengah Artikel 1

Iklan Tengah Artikel 2

Iklan Bawah Artikel