g++ unordered_multimap: an exercise
I discovered this randomly several weeks ago while debugging something else at work, and I thought it was worth sharing since the g++ STL is widely used.
Step 1: Save the following file as mapdemo.cpp:
#include <iostream>
#include <unordered_map>
int main() {
typedef std::unordered_multimap<int, int> int_multimap;
int_multimap map;
for (int i = 0; i < 10000; ++i) {
map.insert(int_multimap::value_type(17, i));
}
std::cerr << *static_cast<int*>(0);
return 0;
}
Step 2: Compile the file:
g++ --std=c++0x -g mapdemo.cpp
Step 3: Load the file in gdb and examine the buckets:
$ gdb -silent ./a.out
Reading symbols from xxx/a.out...done.
(gdb) run
Starting program: xxx/a.out
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400b4d in main () at mapdemo.cpp:10
10 std::cerr << *static_cast<int*>(0);
(gdb) p map._M_bucket_count
$1 = 15173
(gdb) p map._M_buckets[0]
$2 = (std::__detail::_Hash_node<std::pair<int const, int>, false> *) 0x0
(gdb) p map._M_buckets[15172]
$3 = (std::__detail::_Hash_node<std::pair<int const, int>, false> *) 0x0
(gdb) p map._M_buckets[17]
$4 = (std::__detail::_Hash_node<std::pair<int const, int>, false> *) 0x605080
(gdb) p *map._M_buckets[17]
$5 = {_M_v = {first = 17, second = 0}, _M_next = 0x670c90}
Yes, g++'s implementation of unordered_multimap (known as hash_multimap in pre-C++0x versions of C++) uses bucket hashing, but the size of the backing array is proportional to the count of elements in the multimap, not the count of distinct keys.
Exercise for the reader: Explain what I just did; explain why the result of step 3 is curious; and then explain why the authors might have chosen to do it this way anyway.
It occurs to me that this would have made a decent interview question if I hadn't written it up here. Oh well, I have others.
0 Response to "g++ unordered_multimap: an exercise"
Post a Comment