Warning, /sdk/heaptrack/3rdparty/robin-map/README.md is written in an unsupported language. File is not indexed.

0001 [![CI](https://github.com/Tessil/robin-map/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/Tessil/robin-map/actions/workflows/ci.yml)
0002 
0003 ## A C++ implementation of a fast hash map and hash set using robin hood hashing
0004 
0005 The robin-map library is a C++ implementation of a fast hash map and hash set using open-addressing and linear robin hood hashing with backward shift deletion to resolve collisions.
0006 
0007 Four classes are provided: `tsl::robin_map`, `tsl::robin_set`, `tsl::robin_pg_map` and `tsl::robin_pg_set`. The first two are faster and use a power of two growth policy, the last two use a prime growth policy instead and are able to cope better with a poor hash function. Use the prime version if there is a chance of repeating patterns in the lower bits of your hash (e.g. you are storing pointers with an identity hash function). See [GrowthPolicy](#growth-policy) for details.
0008 
0009 A **benchmark** of `tsl::robin_map` against other hash maps may be found [here](https://tessil.github.io/2016/08/29/benchmark-hopscotch-map.html). This page also gives some advices on which hash table structure you should try for your use case (useful if you are a bit lost with the multiple hash tables implementations in the `tsl` namespace).
0010 
0011 ### Key features
0012 
0013 - Header-only library, just add the [include](include/) directory to your include path and you are ready to go. If you use CMake, you can also use the `tsl::robin_map` exported target from the [CMakeLists.txt](CMakeLists.txt).
0014 - Fast hash table, check the [benchmark](https://tessil.github.io/2016/08/29/benchmark-hopscotch-map.html) for some numbers.
0015 - Support for move-only and non-default constructible key/value.
0016 - Support for heterogeneous lookups allowing the usage of `find` with a type different than `Key` (e.g. if you have a map that uses `std::unique_ptr<foo>` as key, you can use a `foo*` or a `std::uintptr_t` as key parameter to `find` without constructing a `std::unique_ptr<foo>`, see [example](#heterogeneous-lookups)).
0017 - No need to reserve any sentinel value from the keys.
0018 - Possibility to store the hash value alongside the stored key-value for faster rehash and lookup if the hash or the key equal functions are expensive to compute. Note that hash may be stored even if not asked explicitly when the library can detect that it will have no impact on the size of the structure in memory due to alignment. See the [StoreHash](https://tessil.github.io/robin-map/classtsl_1_1robin__map.html#details) template parameter for details.
0019 - If the hash is known before a lookup, it is possible to pass it as parameter to speed-up the lookup (see `precalculated_hash` parameter in [API](https://tessil.github.io/robin-map/classtsl_1_1robin__map.html#a35021b11aabb61820236692a54b3a0f8)).
0020 - Support for efficient serialization and deserialization (see [example](#serialization) and the `serialize/deserialize` methods in the [API](https://tessil.github.io/robin-map/classtsl_1_1robin__map.html) for details).
0021 - The library can be used with exceptions disabled (through `-fno-exceptions` option on Clang and GCC, without an `/EH` option on MSVC or simply by defining `TSL_NO_EXCEPTIONS`). `std::terminate` is used in replacement of the `throw` instruction when exceptions are disabled.
0022 - API closely similar to `std::unordered_map` and `std::unordered_set`.
0023 
0024 ### Differences compared to `std::unordered_map`
0025 
0026 `tsl::robin_map` tries to have an interface similar to `std::unordered_map`, but some differences exist.
0027 - The **strong exception guarantee only holds** if the following statement is true `std::is_nothrow_swappable<value_type>::value && std::is_nothrow_move_constructible<value_type>::value` (where `value_type` is `Key` for `tsl::robin_set` and `std::pair<Key, T>` for `tsl::robin_map`). Otherwise if an exception is thrown during the swap or the move, the structure may end up in a undefined state. Note that per the standard, a `value_type` with a noexcept copy constructor and no move constructor also satisfies this condition and will thus guarantee the strong exception guarantee for the structure (see [API](https://tessil.github.io/robin-map/classtsl_1_1robin__map.html#details) for details).
0028 - The type `Key`, and also `T` in case of map, must be swappable. They must also be copy and/or move constructible.
0029 - Iterator invalidation doesn't behave in the same way, any operation modifying the hash table invalidate them (see [API](https://tessil.github.io/robin-map/classtsl_1_1robin__map.html#details) for details).
0030 - References and pointers to keys or values in the map are invalidated in the same way as iterators to these keys-values.
0031 - For iterators of `tsl::robin_map`, `operator*()` and `operator->()` return a reference and a pointer to `const std::pair<Key, T>` instead of `std::pair<const Key, T>` making the value `T` not modifiable. To modify the value you have to call the `value()` method of the iterator to get a mutable reference. Example:
0032 ```c++
0033 tsl::robin_map<int, int> map = {{1, 1}, {2, 1}, {3, 1}};
0034 for(auto it = map.begin(); it != map.end(); ++it) {
0035     //it->second = 2; // Illegal
0036     it.value() = 2; // Ok
0037 }
0038 ```
0039 - No support for some buckets related methods (like `bucket_size`, `bucket`, ...).
0040 
0041 These differences also apply between `std::unordered_set` and `tsl::robin_set`.
0042 
0043 Thread-safety guarantees are the same as `std::unordered_map/set` (i.e. possible to have multiple readers with no writer).
0044 
0045 ### Growth policy
0046 
0047 The library supports multiple growth policies through the `GrowthPolicy` template parameter. Three policies are provided by the library but you can easily implement your own if needed.
0048 
0049 * **[tsl::rh::power_of_two_growth_policy.](https://tessil.github.io/robin-map/classtsl_1_1rh_1_1power__of__two__growth__policy.html)** Default policy used by `tsl::robin_map/set`. This policy keeps the size of the bucket array of the hash table to a power of two. This constraint allows the policy to avoid the usage of the slow modulo operation to map a hash to a bucket, instead of <code>hash % 2<sup>n</sup></code>, it uses <code>hash & (2<sup>n</sup> - 1)</code> (see [fast modulo](https://en.wikipedia.org/wiki/Modulo_operation#Performance_issues)). Fast but this may cause a lot of collisions with a poor hash function as the modulo with a power of two only masks the most significant bits in the end.
0050 * **[tsl::rh::prime_growth_policy.](https://tessil.github.io/robin-map/classtsl_1_1rh_1_1prime__growth__policy.html)** Default policy used by `tsl::robin_pg_map/set`. The policy keeps the size of the bucket array of the hash table to a prime number. When mapping a hash to a bucket, using a prime number as modulo will result in a better distribution of the hash across the buckets even with a poor hash function. To allow the compiler to optimize the modulo operation, the policy use a lookup table with constant primes modulos (see [API](https://tessil.github.io/robin-map/classtsl_1_1rh_1_1prime__growth__policy.html#details) for details). Slower than `tsl::rh::power_of_two_growth_policy` but more secure.
0051 * **[tsl::rh::mod_growth_policy.](https://tessil.github.io/robin-map/classtsl_1_1rh_1_1mod__growth__policy.html)** The policy grows the map by a customizable growth factor passed in parameter. It then just use the modulo operator to map a hash to a bucket. Slower but more flexible.
0052 
0053 
0054 To implement your own policy, you have to implement the following interface.
0055 
0056 ```c++
0057 struct custom_policy {
0058     // Called on hash table construction and rehash, min_bucket_count_in_out is the minimum buckets
0059     // that the hash table needs. The policy can change it to a higher number of buckets if needed 
0060     // and the hash table will use this value as bucket count. If 0 bucket is asked, then the value
0061     // must stay at 0.
0062     explicit custom_policy(std::size_t& min_bucket_count_in_out);
0063     
0064     // Return the bucket [0, bucket_count()) to which the hash belongs. 
0065     // If bucket_count() is 0, it must always return 0.
0066     std::size_t bucket_for_hash(std::size_t hash) const noexcept;
0067     
0068     // Return the number of buckets that should be used on next growth
0069     std::size_t next_bucket_count() const;
0070     
0071     // Maximum number of buckets supported by the policy
0072     std::size_t max_bucket_count() const;
0073     
0074     // Reset the growth policy as if the policy was created with a bucket count of 0.
0075     // After a clear, the policy must always return 0 when bucket_for_hash() is called.
0076     void clear() noexcept;
0077 }
0078 ```
0079 
0080 ### Installation
0081 
0082 To use robin-map, just add the [include](include/) directory to your include path. It is a **header-only** library.
0083 
0084 If you use CMake, you can also use the `tsl::robin_map` exported target from the [CMakeLists.txt](CMakeLists.txt) with `target_link_libraries`. 
0085 ```cmake
0086 # Example where the robin-map project is stored in a third-party directory
0087 add_subdirectory(third-party/robin-map)
0088 target_link_libraries(your_target PRIVATE tsl::robin_map)  
0089 ```
0090 
0091 If the project has been installed through `make install`, you can also use `find_package(tsl-robin-map REQUIRED)` instead of `add_subdirectory`.
0092 
0093 The library is available in [vcpkg](https://github.com/Microsoft/vcpkg/tree/master/ports/robin-map) and [conan](https://conan.io/center/tsl-robin-map). It's also present in [Debian](https://packages.debian.org/buster/robin-map-dev), [Ubuntu](https://packages.ubuntu.com/disco/robin-map-dev) and [Fedora](https://apps.fedoraproject.org/packages/robin-map-devel) package repositories.
0094 
0095 The code should work with any C++11 standard-compliant compiler and has been tested with GCC 4.8.4, Clang 3.5.0 and Visual Studio 2015.
0096 
0097 To run the tests you will need the Boost Test library and CMake.
0098 
0099 ```bash
0100 git clone https://github.com/Tessil/robin-map.git
0101 cd robin-map/tests
0102 mkdir build
0103 cd build
0104 cmake ..
0105 cmake --build .
0106 ./tsl_robin_map_tests
0107 ```
0108 
0109 ### Usage
0110 
0111 The API can be found [here](https://tessil.github.io/robin-map/). 
0112 
0113 All methods are not documented yet, but they replicate the behavior of the ones in `std::unordered_map` and `std::unordered_set`, except if specified otherwise.
0114 
0115 
0116 ### Example
0117 
0118 ```c++
0119 #include <cstdint>
0120 #include <iostream>
0121 #include <string>
0122 #include <tsl/robin_map.h>
0123 #include <tsl/robin_set.h>
0124 
0125 int main() {
0126     tsl::robin_map<std::string, int> map = {{"a", 1}, {"b", 2}};
0127     map["c"] = 3;
0128     map["d"] = 4;
0129     
0130     map.insert({"e", 5});
0131     map.erase("b");
0132     
0133     for(auto it = map.begin(); it != map.end(); ++it) {
0134         //it->second += 2; // Not valid.
0135         it.value() += 2;
0136     }
0137     
0138     // {d, 6} {a, 3} {e, 7} {c, 5}
0139     for(const auto& key_value : map) {
0140         std::cout << "{" << key_value.first << ", " << key_value.second << "}" << std::endl;
0141     }
0142     
0143         
0144     if(map.find("a") != map.end()) {
0145         std::cout << "Found \"a\"." << std::endl;
0146     }
0147     
0148     const std::size_t precalculated_hash = std::hash<std::string>()("a");
0149     // If we already know the hash beforehand, we can pass it in parameter to speed-up lookups.
0150     if(map.find("a", precalculated_hash) != map.end()) {
0151         std::cout << "Found \"a\" with hash " << precalculated_hash << "." << std::endl;
0152     }
0153     
0154     
0155     /*
0156      * Calculating the hash and comparing two std::string may be slow. 
0157      * We can store the hash of each std::string in the hash map to make 
0158      * the inserts and lookups faster by setting StoreHash to true.
0159      */ 
0160     tsl::robin_map<std::string, int, std::hash<std::string>, 
0161                    std::equal_to<std::string>,
0162                    std::allocator<std::pair<std::string, int>>,
0163                    true> map2;
0164                        
0165     map2["a"] = 1;
0166     map2["b"] = 2;
0167     
0168     // {a, 1} {b, 2}
0169     for(const auto& key_value : map2) {
0170         std::cout << "{" << key_value.first << ", " << key_value.second << "}" << std::endl;
0171     }
0172     
0173     
0174     
0175     
0176     tsl::robin_set<int> set;
0177     set.insert({1, 9, 0});
0178     set.insert({2, -1, 9});
0179     
0180     // {0} {1} {2} {9} {-1}
0181     for(const auto& key : set) {
0182         std::cout << "{" << key << "}" << std::endl;
0183     }
0184 }  
0185 ```
0186 
0187 #### Heterogeneous lookups
0188 
0189 Heterogeneous overloads allow the usage of other types than `Key` for lookup and erase operations as long as the used types are hashable and comparable to `Key`.
0190 
0191 To activate the heterogeneous overloads in `tsl::robin_map/set`, the qualified-id `KeyEqual::is_transparent` must be valid. It works the same way as for [`std::map::find`](http://en.cppreference.com/w/cpp/container/map/find). You can either use [`std::equal_to<>`](http://en.cppreference.com/w/cpp/utility/functional/equal_to_void) or define your own function object.
0192 
0193 Both `KeyEqual` and `Hash` will need to be able to deal with the different types.
0194 
0195 ```c++
0196 #include <functional>
0197 #include <iostream>
0198 #include <string>
0199 #include <tsl/robin_map.h>
0200 
0201 
0202 struct employee {
0203     employee(int id, std::string name) : m_id(id), m_name(std::move(name)) {
0204     }
0205     
0206     // Either we include the comparators in the class and we use `std::equal_to<>`...
0207     friend bool operator==(const employee& empl, int empl_id) {
0208         return empl.m_id == empl_id;
0209     }
0210     
0211     friend bool operator==(int empl_id, const employee& empl) {
0212         return empl_id == empl.m_id;
0213     }
0214     
0215     friend bool operator==(const employee& empl1, const employee& empl2) {
0216         return empl1.m_id == empl2.m_id;
0217     }
0218     
0219     
0220     int m_id;
0221     std::string m_name;
0222 };
0223 
0224 // ... or we implement a separate class to compare employees.
0225 struct equal_employee {
0226     using is_transparent = void;
0227     
0228     bool operator()(const employee& empl, int empl_id) const {
0229         return empl.m_id == empl_id;
0230     }
0231     
0232     bool operator()(int empl_id, const employee& empl) const {
0233         return empl_id == empl.m_id;
0234     }
0235     
0236     bool operator()(const employee& empl1, const employee& empl2) const {
0237         return empl1.m_id == empl2.m_id;
0238     }
0239 };
0240 
0241 struct hash_employee {
0242     std::size_t operator()(const employee& empl) const {
0243         return std::hash<int>()(empl.m_id);
0244     }
0245     
0246     std::size_t operator()(int id) const {
0247         return std::hash<int>()(id);
0248     }
0249 };
0250 
0251 
0252 int main() {
0253     // Use std::equal_to<> which will automatically deduce and forward the parameters
0254     tsl::robin_map<employee, int, hash_employee, std::equal_to<>> map; 
0255     map.insert({employee(1, "John Doe"), 2001});
0256     map.insert({employee(2, "Jane Doe"), 2002});
0257     map.insert({employee(3, "John Smith"), 2003});
0258 
0259     // John Smith 2003
0260     auto it = map.find(3);
0261     if(it != map.end()) {
0262         std::cout << it->first.m_name << " " << it->second << std::endl;
0263     }
0264 
0265     map.erase(1);
0266 
0267 
0268 
0269     // Use a custom KeyEqual which has an is_transparent member type
0270     tsl::robin_map<employee, int, hash_employee, equal_employee> map2;
0271     map2.insert({employee(4, "Johnny Doe"), 2004});
0272 
0273     // 2004
0274     std::cout << map2.at(4) << std::endl;
0275 }  
0276 ```
0277 
0278 
0279 #### Serialization
0280 
0281 The library provides an efficient way to serialize and deserialize a map or a set so that it can be saved to a file or send through the network.
0282 To do so, it requires the user to provide a function object for both serialization and deserialization.
0283 
0284 ```c++
0285 struct serializer {
0286     // Must support the following types for U: std::int16_t, std::uint32_t, 
0287     // std::uint64_t, float and std::pair<Key, T> if a map is used or Key for 
0288     // a set.
0289     template<typename U>
0290     void operator()(const U& value);
0291 };
0292 ```
0293 
0294 ```c++
0295 struct deserializer {
0296     // Must support the following types for U: std::int16_t, std::uint32_t, 
0297     // std::uint64_t, float and std::pair<Key, T> if a map is used or Key for 
0298     // a set.
0299     template<typename U>
0300     U operator()();
0301 };
0302 ```
0303 
0304 Note that the implementation leaves binary compatibility (endianness, float binary representation, size of int, ...) of the types it serializes/deserializes in the hands of the provided function objects if compatibility is required.
0305 
0306 More details regarding the `serialize` and `deserialize` methods can be found in the [API](https://tessil.github.io/robin-map/classtsl_1_1robin__map.html).
0307 
0308 ```c++
0309 #include <cassert>
0310 #include <cstdint>
0311 #include <fstream>
0312 #include <type_traits>
0313 #include <tsl/robin_map.h>
0314 
0315 
0316 class serializer {
0317 public:
0318     serializer(const char* file_name) {
0319         m_ostream.exceptions(m_ostream.badbit | m_ostream.failbit);
0320         m_ostream.open(file_name, std::ios::binary);
0321     }
0322     
0323     template<class T,
0324              typename std::enable_if<std::is_arithmetic<T>::value>::type* = nullptr>
0325     void operator()(const T& value) {
0326         m_ostream.write(reinterpret_cast<const char*>(&value), sizeof(T));
0327     }
0328     
0329     void operator()(const std::pair<std::int64_t, std::int64_t>& value) {
0330         (*this)(value.first);
0331         (*this)(value.second);
0332     }
0333 
0334 private:
0335     std::ofstream m_ostream;
0336 };
0337 
0338 class deserializer {
0339 public:
0340     deserializer(const char* file_name) {
0341         m_istream.exceptions(m_istream.badbit | m_istream.failbit | m_istream.eofbit);
0342         m_istream.open(file_name, std::ios::binary);
0343     }
0344     
0345     template<class T>
0346     T operator()() {
0347         T value;
0348         deserialize(value);
0349         
0350         return value;
0351     }
0352     
0353 private:
0354     template<class T,
0355              typename std::enable_if<std::is_arithmetic<T>::value>::type* = nullptr>
0356     void deserialize(T& value) {
0357         m_istream.read(reinterpret_cast<char*>(&value), sizeof(T));
0358     }
0359     
0360     void deserialize(std::pair<std::int64_t, std::int64_t>& value) {
0361         deserialize(value.first);
0362         deserialize(value.second);
0363     }
0364 
0365 private:
0366     std::ifstream m_istream;
0367 };
0368 
0369 
0370 int main() {
0371     const tsl::robin_map<std::int64_t, std::int64_t> map = {{1, -1}, {2, -2}, {3, -3}, {4, -4}};
0372     
0373     
0374     const char* file_name = "robin_map.data";
0375     {
0376         serializer serial(file_name);
0377         map.serialize(serial);
0378     }
0379     
0380     {
0381         deserializer dserial(file_name);
0382         auto map_deserialized = tsl::robin_map<std::int64_t, std::int64_t>::deserialize(dserial);
0383         
0384         assert(map == map_deserialized);
0385     }
0386     
0387     {
0388         deserializer dserial(file_name);
0389         
0390         /**
0391          * If the serialized and deserialized map are hash compatibles (see conditions in API), 
0392          * setting the argument to true speed-up the deserialization process as we don't have 
0393          * to recalculate the hash of each key. We also know how much space each bucket needs.
0394          */
0395         const bool hash_compatible = true;
0396         auto map_deserialized = 
0397             tsl::robin_map<std::int64_t, std::int64_t>::deserialize(dserial, hash_compatible);
0398         
0399         assert(map == map_deserialized);
0400     }
0401 } 
0402 ```
0403 
0404 ##### Serialization with Boost Serialization and compression with zlib
0405 
0406 It is possible to use a serialization library to avoid the boilerplate. 
0407 
0408 The following example uses Boost Serialization with the Boost zlib compression stream to reduce the size of the resulting serialized file. The example requires C++20 due to the usage of the template parameter list syntax in lambdas, but it can be adapted to less recent versions.
0409 
0410 ```c++
0411 #include <boost/archive/binary_iarchive.hpp>
0412 #include <boost/archive/binary_oarchive.hpp>
0413 #include <boost/iostreams/filter/zlib.hpp>
0414 #include <boost/iostreams/filtering_stream.hpp>
0415 #include <boost/serialization/split_free.hpp>
0416 #include <boost/serialization/utility.hpp>
0417 #include <cassert>
0418 #include <cstdint>
0419 #include <fstream>
0420 #include <tsl/robin_map.h>
0421 
0422 
0423 namespace boost { namespace serialization {
0424     template<class Archive, class Key, class T>
0425     void serialize(Archive & ar, tsl::robin_map<Key, T>& map, const unsigned int version) {
0426         split_free(ar, map, version); 
0427     }
0428 
0429     template<class Archive, class Key, class T>
0430     void save(Archive & ar, const tsl::robin_map<Key, T>& map, const unsigned int /*version*/) {
0431         auto serializer = [&ar](const auto& v) { ar & v; };
0432         map.serialize(serializer);
0433     }
0434 
0435     template<class Archive, class Key, class T>
0436     void load(Archive & ar, tsl::robin_map<Key, T>& map, const unsigned int /*version*/) {
0437         auto deserializer = [&ar]<typename U>() { U u; ar & u; return u; };
0438         map = tsl::robin_map<Key, T>::deserialize(deserializer);
0439     }
0440 }}
0441 
0442 
0443 int main() {
0444     tsl::robin_map<std::int64_t, std::int64_t> map = {{1, -1}, {2, -2}, {3, -3}, {4, -4}};
0445     
0446     
0447     const char* file_name = "robin_map.data";
0448     {
0449         std::ofstream ofs;
0450         ofs.exceptions(ofs.badbit | ofs.failbit);
0451         ofs.open(file_name, std::ios::binary);
0452         
0453         boost::iostreams::filtering_ostream fo;
0454         fo.push(boost::iostreams::zlib_compressor());
0455         fo.push(ofs);
0456         
0457         boost::archive::binary_oarchive oa(fo);
0458         
0459         oa << map;
0460     }
0461     
0462     {
0463         std::ifstream ifs;
0464         ifs.exceptions(ifs.badbit | ifs.failbit | ifs.eofbit);
0465         ifs.open(file_name, std::ios::binary);
0466         
0467         boost::iostreams::filtering_istream fi;
0468         fi.push(boost::iostreams::zlib_decompressor());
0469         fi.push(ifs);
0470         
0471         boost::archive::binary_iarchive ia(fi);
0472      
0473         tsl::robin_map<std::int64_t, std::int64_t> map_deserialized;   
0474         ia >> map_deserialized;
0475         
0476         assert(map == map_deserialized);
0477     }
0478 }
0479 ```
0480 
0481 ### License
0482 
0483 The code is licensed under the MIT license, see the [LICENSE file](LICENSE) for details.