This link has been bookmarked by 122 people. It was first bookmarked on 13 Jul 2006, by ryan.
-
25 May 15
-
A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
-
Instead, most hash table designs assume that hash collisions (different keys that are assigned by the hash function to the same bucket) will occur and must be accommodated in some way.
-
-
28 Mar 15
-
The name "open addressing" refers to the fact that the location ("address") of the item is not determined by its hash value
-
Linear probing, in which the interval between probes is fixed (usually 1)
-
A drawback of all these open addressing schemes is that the number of stored entries cannot exceed the number of slots in the bucket array. In fact, even with good hash functions, their performance dramatically degrades when the load factor grows beyond 0.7 or so. For many applications, these restrictions mandate the use of dynamic resizing, with its attendant costs.
-
-
03 Feb 15
-
A basic requirement is that the function should provide a uniform distribution of hash values. A non-uniform distribution increases the number of collisions and the cost of resolving them. Uniformity is sometimes difficult to ensure by design, but may be evaluated empirically using statistical tests, e.g., a Pearson's chi-squared test for discrete uniform distributions.[5][6]
The distribution needs to be uniform only for table sizes that occur in the application. In particular, if one uses dynamic resizing with exact doubling and halving of the table size s, then the hash function needs to be uniform only when s is a power of two. On the other hand, some hashing algorithms provide uniform hashes only when s is a prime number.[7]
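The empirical check mentioned above can be sketched as follows. This is a minimal illustration, not a full statistical test: it only computes Pearson's chi-squared statistic for bucket occupancy and leaves the comparison against a critical value to the reader. The function name `chi_squared_uniformity` and the identity hash are assumptions made for the example.

```python
from collections import Counter


def chi_squared_uniformity(keys, num_buckets, hash_fn):
    """Pearson's chi-squared statistic of bucket counts against a uniform spread."""
    keys = list(keys)
    counts = Counter(hash_fn(k) % num_buckets for k in keys)
    expected = len(keys) / num_buckets
    return sum((counts[b] - expected) ** 2 / expected for b in range(num_buckets))


def identity(k):
    """Hypothetical hash function for the demo: an integer hashes to itself."""
    return k


# Consecutive integers spread evenly over 8 buckets, so the statistic is zero.
even = chi_squared_uniformity(range(800), 8, identity)
# Multiples of 8 all land in bucket 0, so the statistic is large.
skewed = chi_squared_uniformity(range(0, 6400, 8), 8, identity)
print(even, skewed)
```

A statistic near zero indicates counts close to the uniform expectation; the skewed key set shows how a hash that is poor for a given table size stands out.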
-
-
16 Jan 15
-
Hash table
-
-
-
A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
-
Ideally, the hash function will assign each key to a unique bucket, but this situation is rarely achievable in practice (usually some keys will hash to the same bucket).
-
In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of key-value pairs, at (amortized[2]) constant average cost per operation.[3][4]
-
In many situations, hash tables turn out to be more efficient than search trees or any other table lookup structure
-
For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets.
-
Hashing
-
The idea of hashing is to distribute the entries (key/value pairs) across an array of buckets.
-
Given a key, the algorithm computes an index that suggests where the entry can be found:
-
Choosing a good hash function
-
Perfect hash function
-
If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash table that has no collision
-
Perfect hashing allows for constant time lookups in the worst case.
-
This is in contrast to most chaining and open addressing methods, where the time for lookup is low on average, but may be very large (proportional to the number of entries) for some sets of keys.
-
Key statistics
-
Collision resolution
-
Separate chaining
-
Separate chaining with linked lists
-
Separate chaining with list head cells
-
Separate chaining with other structures
-
Open addressing
-
Dynamic resizing
-
Resizing by copying all entries
-
Incremental resizing
-
Monotonic keys
-
Other solutions
-
Performance analysis
-
Features
-
Advantages
-
The main advantage of hash tables over other table data structures is speed.
-
This advantage is more apparent when the number of entries is large.
-
Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.
-
If the set of key-value pairs is fixed and known ahead of time (so insertions and deletions are not allowed), one may reduce the average lookup cost by a careful choice of the hash function, bucket table size, and internal data structures.
-
In particular, one may be able to devise a hash function that is collision-free, or even perfect (see below). In this case the keys need not be stored in the table.
-
Drawbacks
-
Although operations on a hash table take constant time on average, the cost of a good hash function can be significantly higher than the inner loop of the lookup algorithm for a sequential list or search tree.
-
Thus hash tables are not effective when the number of entries is very small
-
Uses
-
Associative arrays
-
They are used to implement associative arrays (arrays whose indices are arbitrary strings or other complicated objects),
-
Hash tables are commonly used to implement many types of in-memory tables
-
-
When storing a new item into a typical associative array and a hash collision occurs, but the actual keys themselves are different, the associative array likewise stores both items
-
However, if the key of the new item exactly matches the key of an old item, the associative array typically erases the old item and overwrites it with the new item, so every item in the table has a unique key.
-
Database indexing
-
Hash tables may also be used as disk-based data structures and database indices (such as in dbm) although B-trees are more popular in these applications.
-
Caches
-
Hash tables can be used to implement caches, auxiliary data tables that are used to speed up the access to data that is primarily stored in slower media.
-
In this application, hash collisions can be handled by discarding one of the two colliding entries—usually erasing the old item that is currently stored in the table and overwriting it with the new item, so every item in the table has a unique hash value.
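The overwrite-on-collision policy described above can be sketched as a direct-mapped cache. This is an illustrative sketch only: the class name `DirectMappedCache` and its methods are hypothetical, and it relies on the fact that small non-negative integers hash to themselves in CPython. Note that what is unique per stored item is the slot index, that is, the hash value after reduction to the table size.

```python
class DirectMappedCache:
    """Fixed-size cache in which each slot holds at most one (key, value) pair."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots

    def _index(self, key):
        return hash(key) % len(self.slots)

    def put(self, key, value):
        # A colliding older entry is simply discarded and overwritten.
        self.slots[self._index(key)] = (key, value)

    def get(self, key, default=None):
        entry = self.slots[self._index(key)]
        if entry is not None and entry[0] == key:
            return entry[1]
        return default  # cache miss: the caller falls back to the slower medium


cache = DirectMappedCache(4)
cache.put(1, "one")
cache.put(5, "five")  # 5 and 1 share slot 1 in a 4-slot table, so key 1 is evicted
print(cache.get(1), cache.get(5))
```

Because a miss only costs a trip to the slower store, losing the older entry is acceptable here in a way it would not be for an associative array.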
-
Sets
-
Besides recovering the entry that has a given key, many hash table implementations can also tell whether such an entry exists or not.
-
Those structures can therefore be used to implement a set data structure, which merely records whether a given key belongs to a specified set of keys.
-
Object representation
-
Several dynamic languages, such as Perl, Python, JavaScript, and Ruby, use hash tables to implement objects. In this representation, the keys are the names of the members and methods of the object, and the values are pointers to the corresponding member or method.
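In Python this representation is directly observable, since instance attributes are exposed through a dictionary. The short sketch below uses a hypothetical `Point` class; it illustrates the idea and is not a description of interpreter internals, which vary between versions.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm1(self):
        return abs(self.x) + abs(self.y)


p = Point(3, -4)
print(vars(p))             # the instance's attribute table, keyed by member name
p.__dict__["z"] = 5        # adding a key to the table adds an attribute
print(p.z)
print("norm1" in vars(Point))  # methods live in the class's own table
```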
-
Unique data representation
-
Hash tables can be used by some programs to avoid creating multiple character strings with the same contents.
-
For that purpose, all strings in use by the program are stored in a single string pool implemented as a hash table, which is checked whenever a new string has to be created.
-
This technique was introduced in Lisp interpreters under the name hash consing, and can be used with many other kinds of data (expression trees in a symbolic algebra system, records in a database, files in a file system, binary decision diagrams, etc.)
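A string pool of this kind can be sketched in a few lines. The `StringPool` class below is a hypothetical illustration built on a Python dict (itself a hash table); Python's standard library offers `sys.intern` for the same purpose.

```python
class StringPool:
    """Keeps one canonical object for each distinct string value."""

    def __init__(self):
        self._pool = {}

    def intern(self, s):
        # Return the pooled copy if an equal string exists, else pool this one.
        return self._pool.setdefault(s, s)


pool = StringPool()
# Build two equal strings at run time so that they start as separate objects.
a = pool.intern("".join(["hash", "table"]))
b = pool.intern("".join(["hash", "table"]))
print(a is b)
```

Once every string goes through the pool, equality checks between pooled strings can be done by identity comparison.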
-
Implementations
-
In programming languages
-
In C++11, for example, the unordered_map class provides hash tables for keys and values of arbitrary type.
-
In PHP 5, the Zend 2 engine uses one of the hash functions from Daniel J. Bernstein to generate the hash values used in managing the mappings of data pointers stored in a hash table
-
-
-
The idea of hashing arose independently in different places. In January 1953, H. P. Luhn wrote an internal IBM memorandum that used hashing with chaining.
-
G. N. Amdahl, E. M. Boehme, N. Rochester, and Arthur Samuel implemented a program using hashing at about the same time. Open addressing with linear probing (relatively prime stepping) is credited to Amdahl, but Ershov (in Russia) had the same idea.[25]
-
-
-
distribute the entries (key/value pairs) across an array of buckets
-
This is simply the number of entries divided by the number of buckets, that is, n/k where n is the number of entries and k is the number of buckets.
-
-
24 Oct 14
-
a hash table (also hash map) is a data structure used to implement an associative array, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
-
Ideally, the hash function will assign each key to a unique bucket, but this situation is rarely achievable in practice
-
most hash table designs assume that hash collisions—different keys that are assigned by the hash function to the same bucket—will occur and must be accommodated in some way.
-
the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table
-
Given a key, the algorithm computes an index that suggests where the entry can be found
-
The idea of hashing is to distribute the entries (key/value pairs) across an array of buckets.
-
the hash is independent of the array size, and it is then reduced to an index
-
the function should provide a uniform distribution of hash values.
-
If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash table that has no collisions.
-
-
21 Mar 14
-
In computing, a hash table (also hash map) is a data structure used to implement an associative array, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
-
-
10 Feb 14
-
A critical statistic for a hash table is called the load factor. This is simply the number of entries divided by the number of buckets, that is, n/k where n is the number of entries and k is the number of buckets.
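As a small worked example of the definition above, the load factor can be computed directly. The function name `load_factor` is an assumption for illustration.

```python
def load_factor(num_entries, num_buckets):
    """Load factor n/k: stored entries divided by available buckets."""
    return num_entries / num_buckets


print(load_factor(7, 10))         # 7 entries in 10 buckets
print(load_factor(10_000, 1_000)) # possible with chaining, where n may exceed k
```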
-
-
15 Oct 13
-
a hash table
-
-
hash map
-
A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
-
-
06 Oct 13
-
hash table (also hash map) is a data structure used to implement an associative array, a structure that can map keys to values
-
In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table.
-
A critical statistic for a hash table is called the load factor. This is simply the number of entries divided by the number of buckets, that is, n/k where n is the number of entries and k is the number of buckets.
-
most hash table implementations have some collision resolution strategy to handle such events
-
Chained hash tables remain effective even when the number of table entries n is much higher than the number of slots. Their performance degrades more gracefully (linearly) with the load factor
-
For separate-chaining, the worst-case scenario is when all entries are inserted into the same bucket, in which case the hash table is ineffective and the cost is that of searching the bucket data structure
-
In another strategy, called open addressing, all entry records are stored in the bucket array itself. When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found.
-
Open addressing only saves memory if the entries are small (less than four times the size of a pointer) and the load factor is not too small.
-
Ultimately, used sensibly, any kind of hash table algorithm is usually fast enough; and the percentage of a calculation spent in hash table code is low. Memory usage is rarely considered excessive. Therefore, in most cases the differences between these algorithms are marginal, and other considerations typically come into play
-
The main advantage of hash tables over other table data structures is speed. This advantage is more apparent when the number of entries is large. Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.
-
Several dynamic languages, such as Perl, Python, JavaScript, and Ruby, use hash tables to implement objects.
-
-
18 Jun 13
-
In this method, the hash is independent of the array size,
-
uniform distribution of hash values
-
basic requirement
-
A drawback of all these open addressing schemes is that the number of stored entries cannot exceed the number of slots in the bucket array.
-
In fact, even with good hash functions, their performance dramatically degrades when the load factor grows beyond 0.7 or so.
-
Open addressing schemes also put more stringent requirements on the hash function: besides distributing the keys more uniformly over the buckets, the function must also minimize the clustering of hash values that are consecutive in the probe order.
-
-
11 May 13
-
Chained hash tables also inherit the disadvantages of linked lists. When storing small keys and values, the space overhead of the next pointer in each entry record can be significant. An additional disadvantage is that traversing a linked list has poor cache performance, making the processor cache ineffective.
-
Open addressing only saves memory if the entries are small (less than four times the size of a pointer) and the load factor is not too small. If the load factor is close to zero (that is, there are far more buckets than stored entries), open addressing is wasteful even if each entry is just two words.
-
Open addressing avoids the time overhead of allocating each new entry record, and can be implemented even in the absence of a memory allocator. It also avoids the extra indirection required to access the first entry of each bucket (that is, usually the only one). It also has better locality of reference, particularly with linear probing. With small record sizes, these factors can yield better performance than chaining, particularly for lookups.
-
Hash tables with open addressing are also easier to serialize, because they do not use pointers.
-
Generally speaking, open addressing is better used for hash tables with small records that can be stored within the table (internal storage) and fit in a cache line.
-
Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.
-
Hash tables can be used to implement caches, auxiliary data tables that are used to speed up the access to data that is primarily stored in slower media. In this application, hash collisions can be handled by discarding one of the two colliding entries—usually erasing the old item that is currently stored in the table and overwriting it with the new item, so every item in the table has a unique hash value.
-
-
27 Apr 13
-
11 Apr 13
-
More sophisticated data structures, such as balanced search trees, are worth considering only if the load factor is large (about 10 or more), or if the hash distribution is likely to be very non-uniform, or if one must guarantee good performance even in a worst-case scenario.
-
Chained hash tables also inherit the disadvantages of linked lists. When storing small keys and values, the space overhead of the next pointer in each entry record can be significant. An additional disadvantage is that traversing a linked list has poor cache performance, making the processor cache ineffective.
-
-
24 Jan 13
-
uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
-
hash collisions—different keys that are assigned by the hash function to the same bucket
-
The idea of hashing is to distribute the entries (key/value pairs) across an array of buckets.
-
the hash is independent of the array size, and it is then reduced to an index
-
using a remainder operation (%).
-
A basic requirement is that the function should provide a uniform distribution of hash values.
-
load factor. This is simply the number of entries divided by the number of buckets
-
If the load factor is kept reasonable, the hash table should perform well, provided the hashing is good. If the load factor grows too large, the hash table will become slow, or it may fail to work
-
A low load factor is not especially beneficial. As load factor approaches 0, the proportion of unused areas in the hash table increases, but there is not necessarily any reduction in search cost. This results in wasted memory.
-
The time for hash table operations is the time to find the bucket (which is constant) plus the time for the list operation
-
open hashing or closed addressing
-
by using a self-balancing tree, the theoretical worst-case time of common hash table operations (insertion, deletion, lookup) can be brought down to O(log n) rather than O(n)
-
this approach is only worth the trouble and extra memory cost if long delays must be avoided at all costs
-
When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found.
-
The name "open addressing" refers to the fact that the location ("address") of the item is not determined by its hash value.
-
-
21 Oct 12
-
In particular, if one uses dynamic resizing with exact doubling and halving of s, the hash function needs to be uniform only when s is a power of two
-
On the other hand, some hashing algorithms provide uniform hashes only when s is a prime number
-
Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.
-
the cost of a good hash function can be significantly higher than the inner loop of the lookup
-
hash tables are not effective when the number of entries is very small.
-
there is no efficient way to locate an entry whose key is nearest to a given key
-
a data structure with better worst-case guarantees may be preferable
-
-
08 Aug 12
-
uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number).
-
-
12 Jul 12
-
The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought.
-
Ideally, the hash function should map each possible key to a unique slot index, but this ideal is rarely achievable in practice (unless the hash keys are fixed; i.e. new entries are never added to the table after it is created). Instead, most hash table designs assume that hash collisions—different keys that map to the same hash value—will occur and must be accommodated in some way.
-
In many situations, hash tables turn out to be more efficient than search trees or any other table lookup structure. For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets.
-
A non-uniform distribution increases the number of collisions, and the cost of resolving them.
-
For example, by using a self-balancing tree, the theoretical worst-case time of common hash table operations (insertion, deletion, lookup) can be brought down to O(log n) rather than O(n). However, this approach is only worth the trouble and extra memory cost if long delays must be avoided at all costs (e.g. in a real-time application), or if one must guard against many entries hashed to the same slot (e.g. if one expects extremely non-uniform distributions, or in the case of web sites or other publicly accessible services, which are vulnerable to malicious key distributions in requests).
-
In fact, even with good hash functions, their performance dramatically degrades when the load factor grows beyond 0.7 or so. Thus a more aggressive resize scheme is needed
-
-
11 Jul 12
-
16 May 12
-
widely used in many kinds of computer software, particularly for associative arrays
-
-
09 Jan 12
-
non-uniform distribution increases the number of collisions,
-
A basic requirement is that the function should provide a uniform distribution of hash values
-
The distribution needs to be uniform only for table sizes s that occur in the application.
-
Cryptographic hash functions are believed to provide good hash functions for any table size s
-
If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash table that has no collisions
-
-
20 Dec 11
-
a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number).
-
Ideally, the hash function should map each possible key to a unique slot index, but this ideal is rarely achievable in practice
-
Hash table algorithms calculate an index from the data item's key and use this index to place the data into the array. The implementation of this calculation is the hash function, f:
-
At the heart of the hash table algorithm is a simple array of items; this is often simply called the hash table.
-
-
24 Sep 11
-
a hash table implements an associative array.
-
If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash table that has no collisions. If minimal perfect hashing is used, every location in the hash table can be used as well
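For a small, fixed key set, a minimal perfect hash can be found by brute force: try seeds until every key lands in its own slot. The sketch below is a toy illustration under stated assumptions; the helper names `seeded_index` and `find_perfect_seed` are hypothetical, SHA-256 is used only as a convenient well-mixed seeded hash, and practical perfect-hash generators use far more efficient constructions.

```python
import hashlib


def seeded_index(key, seed, num_slots):
    """Map a string key to a slot, using the seed to select a hash function."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_slots


def find_perfect_seed(keys, num_slots, max_tries=100_000):
    """Return the first seed under which no two keys share a slot."""
    for seed in range(max_tries):
        if len({seeded_index(k, seed, num_slots) for k in keys}) == len(keys):
            return seed
    raise RuntimeError("no collision-free seed found within max_tries")


keys = ["red", "green", "blue", "cyan"]
seed = find_perfect_seed(keys, len(keys))  # one slot per key: minimal
table = [None] * len(keys)
for k in keys:
    table[seeded_index(k, seed, len(keys))] = k
print(seed, table)
```

With as many slots as keys, every location in the table ends up occupied, and a lookup inspects exactly one slot.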
-
-
08 Sep 11
-
12 Aug 11
-
drawback of all these open addressing schemes is that the number of stored entries cannot exceed the number of slots in the bucket array
-
more aggressive resize scheme is needed.
-
besides distributing the keys more uniformly over the buckets, the function must also minimize the clustering of hash values that are consecutive in the probe order.
-
avoids the time overhead of allocating each new entry record, and can be implemented even in the absence of a memory allocator.
-
poor choice for large elements, because these elements fill entire CPU cache lines (negating the cache advantage), and a large amount of space is wasted on large empty table slots.
-
Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.
-
speed
-
Thus hash tables are not effective when the number of entries is very small.
-
For certain string processing applications, such as spell-checking, hash tables may be less efficient than tries,
-
there is no efficient way to locate an entry whose key is nearest to a given key
-
Hash tables in general exhibit poor locality of reference—that is, the data to be accessed is distributed seemingly at random in memory
-
Hash tables become quite inefficient when there are many collisions.
-
Associative arrays
Hash tables are commonly used to implement many types of in-memory tables
-
Hash tables can be used to implement caches,
-
Several dynamic languages, such as Perl, Python, JavaScript, and Ruby, use hash tables to implement objects
-
-
11 Jul 11
Peter Beens: At the heart of the hash table algorithm is a simple array of items; this is often simply called the hash table. Hash table algorithms calculate an index from the data item's key and use this index to place the data into the array.
-
20 May 11
-
basic requirement is that the function should provide a uniform distribution of hash values
-
the hash function should also avoid clustering,
-
-
20 Apr 11
-
15 Dec 10
-
Ideally, the hash function should map each possible key to a unique slot index,
-
-
08 Dec 10
-
Many hash table designs also allow arbitrary insertions and deletions of key-value pairs, at constant average
-
-
26 Oct 10
-
The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought.
-
In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table.
-
Thus, a hash table implements an associative array.
-
For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets.
-
A basic requirement is that the function should provide a uniform distribution of hash values. A non-uniform distribution increases the number of collisions, and the cost of resolving them.
-
index = f(key, arrayLength)
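One way to realize `f` is to compute a hash that is independent of the array size and then reduce it to an index. The sketch below shows both reductions mentioned in these notes, remainder and bit masking; the function names are assumptions, and Python's built-in `hash` stands in for whatever hash function an implementation would use.

```python
def bucket_index(key, array_length):
    """Reduce a size-independent hash to an index with the remainder operation."""
    return hash(key) % array_length


def bucket_index_pow2(key, array_length):
    """Same reduction by bit masking; only valid when the length is a power of two."""
    if array_length <= 0 or array_length & (array_length - 1):
        raise ValueError("array_length must be a power of two")
    return hash(key) & (array_length - 1)


print(bucket_index(42, 10))
print(bucket_index_pow2(42, 16))
```

In Python the two reductions agree for power-of-two lengths, including for negative hash values, because `%` returns a non-negative result for a positive modulus.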
-
Perfect hash function
If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash table that has no collisions
-
For open addressing schemes, the hash function should also avoid clustering, the mapping of two or more keys to consecutive slots. Such clustering may cause the lookup cost to skyrocket, even if the load factor is low and collisions are infrequent. The popular multiplicative hash[2] is claimed to have particularly poor clustering behavior.[5]
-
Cryptographic hash functions are believed to provide good hash functions for any table size s, either by modulo reduction or by bit masking. They may also be appropriate if there is a risk of malicious users trying to sabotage a network service by submitting requests designed to generate a large number of collisions in the server's hash tables.
-
Perfect hashing allows for constant time lookups in the worst case.
-
Hash collisions are practically unavoidable when hashing a random subset of a large set of possible keys
-
In the strategy known as separate chaining, direct chaining, or simply chaining, each slot of the bucket array is a pointer to a linked list that contains the key-value pairs that hashed to the same location.
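Separate chaining can be sketched as follows. This is a minimal illustration, assuming a fixed number of buckets and using Python lists in place of linked lists for the chains; the class name `ChainedHashTable` and its methods are hypothetical.

```python
class ChainedHashTable:
    """Hash table with separate chaining; each bucket is a list of (key, value)."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]
        self.size = 0

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # same key: overwrite the old item
                return
        bucket.append((key, value))       # new key: extend the chain
        self.size += 1

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default

    def remove(self, key):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                self.size -= 1
                return True
        return False


table = ChainedHashTable(num_buckets=4)
table.put(1, "a")
table.put(5, "b")  # 1 and 5 collide in a 4-bucket table and share a chain
table.put(1, "c")  # overwrites the value stored under key 1
print(table.get(1), table.get(5), table.size)
```

The cost of each operation is the cost of scanning one chain, which is why performance tracks the load factor.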
-
The technique is also called open hashing or closed addressing,
-
Chained hash tables with linked lists are popular because they require only basic data structures with simple algorithms, and can use simple hash functions that are unsuitable for other methods.
-
For separate-chaining, the worst-case scenario is when all entries were inserted into the same bucket
-
The cost of a table operation is that of scanning the entries of the selected bucket for the desired key. If the distribution of keys is sufficiently uniform, the average cost of a lookup depends only on the average number of keys per bucket—that is, on the load factor.
-
Chained hash tables remain effective even when the number of entries n is much higher than the number of slots. Their performance degrades more gracefully (linearly) with the load factor. For example, a chained hash table with 1000 slots and 10,000 stored keys (load factor 10) is five to ten times slower than a 10,000-slot table (load factor 1); but still 1000 times faster than a plain sequential list, and possibly even faster than a balanced search tree.
-
Chained hash tables also inherit the disadvantages of linked lists. When storing small keys and values, the space overhead of the next pointer in each entry record can be significant.
-
in which case the hash table is ineffective and the cost is that of searching the bucket data structure.
-
Some chaining implementations store the first record of each chain in the slot array itself.[3] The purpose is to increase cache efficiency of hash table access.
-
The bucket chains are often implemented as ordered lists, sorted by the key field; this choice approximately halves the average cost of unsuccessful lookups, compared to an unordered list. However, if some keys are much more likely to come up than others, an unordered list with move-to-front heuristic may be more effective. More sophisticated data structures, such as balanced search trees, are worth considering only if the load factor is large (about 10 or more), or if the hash distribution is likely to be very non-uniform, or if one must guarantee good performance even in the worst-case.
-
Instead of a list, one can use any other data structure that supports the required operations. By using a self-balancing tree, for example, the theoretical worst-case time of a hash table can be brought down to O(log n) rather than O(n). However, this approach is only worth the trouble and extra memory cost if long delays must be avoided at all costs (e.g. in a real-time application), or if one expects to have many entries hashed to the same slot (e.g. if one expects extremely non-uniform or even malicious key distributions).
-
Separate
-
In another strategy, called open addressing, all entry records are stored in the bucket array itself. When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found.
-
When searching for an entry, the buckets are scanned in the same sequence, until either the target record is found, or an unused array slot is found, which indicates that there is no such key in the table.[10] The name "open addressing" refers to the fact that the location ("address") of the item is not determined by its hash value.
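The insert and search procedures described above can be sketched with linear probing (probe interval 1). This is an illustrative sketch only: the class name `LinearProbingTable` is hypothetical, the table has a fixed size, and deletion is omitted because it would need tombstones or re-insertion to keep probe sequences intact.

```python
_EMPTY = object()  # sentinel marking a never-used slot


class LinearProbingTable:
    """Open addressing: all entries live in the slot arrays themselves."""

    def __init__(self, num_slots=8):
        self.keys = [_EMPTY] * num_slots
        self.values = [None] * num_slots
        self.size = 0

    def _probe(self, key):
        """Yield slot indices starting at the hashed-to slot, stepping by 1."""
        n = len(self.keys)
        start = hash(key) % n
        for step in range(n):
            yield (start + step) % n

    def put(self, key, value):
        for i in self._probe(key):
            if self.keys[i] is _EMPTY:
                self.keys[i], self.values[i] = key, value
                self.size += 1
                return
            if self.keys[i] == key:
                self.values[i] = value
                return
        raise RuntimeError("table is full")  # entries cannot exceed slots

    def get(self, key, default=None):
        for i in self._probe(key):
            if self.keys[i] is _EMPTY:
                return default  # an unused slot means the key is absent
            if self.keys[i] == key:
                return self.values[i]
        return default


table = LinearProbingTable(num_slots=4)
table.put(1, "a")
table.put(5, "b")  # hashes to slot 1 as well, so it is stored in slot 2
print(table.keys[1], table.keys[2], table.get(5), table.get(9))
```

Key 5 ends up in slot 2 even though it hashes to slot 1, which is the sense in which the item's location is not determined by its hash value alone.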
-
A drawback of all these open addressing schemes is that the number of stored entries cannot exceed the number of slots in the bucket array.
-
Open addressing only saves memory if the entries are small
-
Hash tables with open addressing are also easier to serialize, because they don't use pointers.
-
Generally speaking, open addressing is better used for hash tables with small records that can be stored within the table (internal storage) and fit in a cache line. They are particularly suitable for elements of one word or less. In cases where the tables are expected to have high load factors, the records are large, or the data is variable-sized, chained hash tables often perform as well or better.
-
Coalesced hashing
-
A hybrid of chaining and open addressing, coalesced hashing links together chains of nodes within the table itself.[10] Like open addressing, it achieves space usage and (somewhat diminished) cache advantages over chaining. Like chaining, it does not exhibit clustering effects; in fact, the table can be efficiently filled to a high density. Unlike chaining, it cannot have more elements than table slots.
-
Cuckoo hashing
-
Another alternative open-addressing solution is cuckoo hashing, which ensures constant lookup time in the worst case, and constant amortized time for insertions and deletions. It uses two or more hash functions, which means any key/value pair could be in two or more locations. For lookup, the first hash function is used; if the key/value is not found, then the second hash function is used, and so on. If a collision happens during insertion, then the key is re-hashed with the second hash function to map it to another bucket. If all hash functions are used and there is still a collision, then the key it collided with is removed to make space for the new key, and the old key is re-hashed with one of the other hash functions, which maps it to another bucket. If that location also results in a collision, then the process repeats until there is no collision or the process traverses all the buckets, at which point the table is resized. By combining multiple hash functions with multiple cells per bucket, very high space utilisation can be achieved.
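A common two-table variant of cuckoo hashing can be sketched as follows. It differs in detail from the description above: a new key always goes into its slot in the first table, and any displaced entry moves to its alternate slot in the other table. The class name `CuckooHashTable`, the two index functions, and the `max_kicks` limit are assumptions made for illustration; keys that share both indices at every table size would make this toy version grow without bound.

```python
class CuckooHashTable:
    """Two tables, two hash functions; each key has one candidate slot per table."""

    def __init__(self, num_slots=8, max_kicks=16):
        self.n = num_slots
        self.max_kicks = max_kicks
        self.tables = [[None] * num_slots, [None] * num_slots]

    def _index(self, which, key):
        h = hash(key)
        return h % self.n if which == 0 else (h // self.n) % self.n

    def get(self, key, default=None):
        # A lookup inspects at most two slots, whatever the table holds.
        for which in (0, 1):
            entry = self.tables[which][self._index(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return default

    def put(self, key, value):
        for which in (0, 1):  # overwrite in place if the key is already stored
            i = self._index(which, key)
            entry = self.tables[which][i]
            if entry is not None and entry[0] == key:
                self.tables[which][i] = (key, value)
                return
        entry, which = (key, value), 0
        for _ in range(self.max_kicks):
            i = self._index(which, entry[0])
            evicted = self.tables[which][i]
            self.tables[which][i] = entry
            if evicted is None:
                return
            entry, which = evicted, 1 - which  # rehome the displaced entry
        self._grow(entry)

    def _grow(self, pending):
        """Double both tables and reinsert everything, including the pending entry."""
        entries = [e for t in self.tables for e in t if e is not None]
        entries.append(pending)
        self.n *= 2
        self.tables = [[None] * self.n, [None] * self.n]
        for k, v in entries:
            self.put(k, v)


table = CuckooHashTable(num_slots=4)
for k, v in [(1, "a"), (5, "b"), (9, "c")]:  # all three hash to slot 1 of table 0
    table.put(k, v)
print(table.get(1), table.get(5), table.get(9), table.get(13))
```

Lookups stay bounded by two probes; the cost of displacement chains and occasional growth is paid on insertion.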
-
Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.
-
If the set of key-value pairs is fixed and known ahead of time (so insertions and deletions are not allowed), one may reduce the average lookup cost by a careful choice of the hash function, bucket table size, and internal data structures.
-
Hash tables can be more difficult to implement than self-balancing binary search trees. Choosing an effective hash function for a specific application is more an art than a science. In open-addressed hash tables it is fairly easy to create a poor hash function.
-
Thus hash tables are not effective when the number of entries is very small.
-
For certain string processing applications, such as spell-checking, hash tables may be less efficient than tries, finite automata, or Judy arrays.
-
Also, if each key is represented by a small enough number of bits, then, instead of a hash table, one may use the key directly as the index into an array of values. Note that there are no collisions in this case.
-
Although the average cost per operation is constant and fairly small, the cost of a single operation may be quite high. In particular, if the hash table uses dynamic resizing, an insertion or deletion operation may occasionally take time proportional to the number of entries. This may be a serious drawback in real-time or interactive applications.
-
Hash tables in general exhibit poor locality of reference—that is, the data to be accessed is distributed seemingly at random in memory.
-
Compact data structures such as arrays, searched with linear search, may be faster if the table is relatively small and keys are integers or other short strings
-
-
11 Sep 10
Dante-Gabryell Monson: In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number).
-
07 Jun 10
-
25 Feb 10
-
07 Jul 09
-
14 May 09
-
23 Apr 09
-
06 Feb 09
-
15 Jan 09
-
21 Sep 08
Marc Slayton: only lax strength requirements needed for hash tables, but their slowness and complexity make them unappealing. However, using cryp
-
09 Jun 08
-
19 Mar 08
-
Hash tables support the efficient insertion of new entries, in expected O(1) time.
-
a black hat with knowledge of the hash function may be able to supply information to a hash which creates worst-case behavior by causing excessive collisions, resulting in very poor performance (i.e., a denial of service attack)
-
Hash tables in general exhibit poor locality of reference
-
hash table; both insertion and search approach O(1) time
-
Compared to other associative array data structures, hash tables are most useful when large numbers of records are to be stored, especially if the size of the data set can be predicted.
-
Evaluating a good hash function can be a slow operation. In particular, if simple array indexing can be used instead, this is usually faster.
-
With a good hash function, a hash table can typically contain about 70%–80% as many elements as it does table slots and still perform well.
-
1 + 2 + 4 + 8 + ... + n = 2n - 1.
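The identity holds when n is a power of two. One reading of it, assumed here since the highlight gives no context, is the total number of entries copied when a table is grown by repeated doubling: the copies add up to less than twice the final capacity, which is why the amortized cost per insertion stays constant. The function name `total_copies` is hypothetical.

```python
def total_copies(final_capacity):
    """Sum 1 + 2 + 4 + ... + final_capacity, one term per doubling step."""
    copies, capacity = 0, 1
    while capacity <= final_capacity:
        copies += capacity
        capacity *= 2
    return copies


for n in (8, 1024):
    print(n, total_copies(n), 2 * n - 1)
```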
-
-
14 Mar 08
-
20 Feb 08
-
10 Feb 08
-
02 Feb 08
-
30 Jan 08
-
16 Dec 07
-
10 May 07
-
27 Mar 07
-
04 Dec 06
-
14 Aug 06
-
10 Jul 06