Hash table - Wikipedia, the free encyclopedia

25 May 15

yun xia

A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
Instead, most hash table designs assume that hash collisions will occur and must be accommodated in some way.

28 Mar 15

prabhakar ahinave

The name "open addressing" refers to the fact that the location ("address") of the item is not determined by its hash value
Linear probing, in which the interval between probes is fixed (usually 1)
A drawback of all these open addressing schemes is that the number of stored entries cannot exceed the number of slots in the bucket array. In fact, even with good hash functions, their performance dramatically degrades when the load factor grows beyond 0.7 or so. For many applications, these restrictions mandate the use of dynamic resizing, with its attendant costs.
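The two points above (entries live in the slot array itself, and the table must grow before it gets too full) can be illustrated with a minimal sketch. It assumes linear probing with interval 1, Python's built-in `hash`, a doubling resize policy, and a 0.7 load-factor threshold taken from the highlighted text. The class and method names are hypothetical, and deletion is left out.

```python
class LinearProbingTable:
    """Open-addressing hash table with linear probing (interval 1). No deletion."""

    _EMPTY = object()  # sentinel marking an unused slot

    def __init__(self, capacity=8, max_load=0.7):
        # max_load must stay below 1 so that a probe always finds an empty slot
        self._slots = [self._EMPTY] * capacity
        self._count = 0
        self._max_load = max_load

    def _probe(self, key):
        """Return the index holding key, or the first empty slot in its probe sequence."""
        n = len(self._slots)
        i = hash(key) % n
        while self._slots[i] is not self._EMPTY and self._slots[i][0] != key:
            i = (i + 1) % n  # step to the next slot, wrapping around
        return i

    def put(self, key, value):
        i = self._probe(key)
        if self._slots[i] is self._EMPTY:
            # grow before the load factor would exceed the threshold
            if (self._count + 1) / len(self._slots) > self._max_load:
                self._resize(2 * len(self._slots))
                i = self._probe(key)
            self._count += 1
        self._slots[i] = (key, value)

    def get(self, key, default=None):
        entry = self._slots[self._probe(key)]
        return default if entry is self._EMPTY else entry[1]

    def _resize(self, capacity):
        """Allocate a larger slot array and re-insert every stored entry."""
        old = [e for e in self._slots if e is not self._EMPTY]
        self._slots = [self._EMPTY] * capacity
        for k, v in old:
            self._slots[self._probe(k)] = (k, v)

    def load_factor(self):
        return self._count / len(self._slots)
```

Every resize copies all entries, which is the "attendant cost" of dynamic resizing mentioned above.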

03 Feb 15

Gregory Nelson

A basic requirement is that the function should provide a uniform distribution of hash values. A non-uniform distribution increases the number of collisions and the cost of resolving them. Uniformity is sometimes difficult to ensure by design, but may be evaluated empirically using statistical tests, e.g., a Pearson's chi-squared test for discrete uniform distributions.^[5]^[6]

The distribution needs to be uniform only for table sizes that occur in the application. In particular, if one uses dynamic resizing with exact doubling and halving of the table size s, then the hash function needs to be uniform only when s is a power of two. On the other hand, some hashing algorithms provide uniform hashes only when s is a prime number.^[7]
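As a rough illustration of the empirical check described above, the sketch below computes Pearson's chi-squared statistic for bucket counts at a few power-of-two table sizes. It is only a sketch under stated assumptions: `zlib.crc32` stands in for whatever hash function is being evaluated, the keys are synthetic, and the function names are hypothetical. Turning the statistic into a p-value would need a chi-squared table or a statistics library, which is not shown here.

```python
import zlib

def bucket_counts(keys, s):
    """Count how many keys land in each of s buckets (s must be a power of two)."""
    if s <= 0 or s & (s - 1):
        raise ValueError("s must be a power of two")
    counts = [0] * s
    for key in keys:
        h = zlib.crc32(key.encode("utf-8"))  # stand-in hash function (assumption)
        counts[h & (s - 1)] += 1             # bit mask equals h % s for power-of-two s
    return counts

def chi_squared(counts):
    """Pearson's chi-squared statistic against a discrete uniform distribution."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

keys = [f"key{i}" for i in range(10_000)]  # synthetic keys, for illustration only
for s in (16, 64, 256):
    stat = chi_squared(bucket_counts(keys, s))
    print(f"s={s:4d}  chi2={stat:8.1f}  (degrees of freedom: {s - 1})")
```

A statistic far above the degrees of freedom suggests the hash is not uniform for that table size.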

16 Jan 15

hai_ahi

Hash table
Not to be confused with Hash list or Hash tree.
In computing, a hash table (hash map) is a data structure used to implement an associative array
Hash table. Type: unordered associative array. Invented: 1953.
A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
a structure that can map keys to values.
Time complexity
in big O notation
Ideally, the hash function will assign each key to a unique bucket, but this situation is rarely achievable in practice (usually some keys will hash to the same bucket)
In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of key-value pairs, at (amortized^[2]) constant average cost per operation.^[3]^[4]
In many situations, hash tables turn out to be more efficient than search trees or any other table lookup structure
For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets.
Hashing
The idea of hashing is to distribute the entries (key/value pairs) across an array of buckets.
Given a key, the algorithm computes an index that suggests where the entry can be found:
Choosing a good hash function
Perfect hash function
If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash table that has no collision
Perfect hashing allows for constant time lookups in the worst case.
This is in contrast to most chaining and open addressing methods, where the time for lookup is low on average, but may be very large (proportional to the number of entries) for some sets of keys.
Key statistics
Collision resolution
Separate chaining
Separate chaining with linked lists
Separate chaining with list head cells
Separate chaining with other structures
Open addressing
Dynamic resizing
Resizing by copying all entries
Incremental resizing
Monotonic keys
Other solutions
Performance analysis
Features
Advantages
The main advantage of hash tables over other table data structures is speed
This advantage is more apparent when the number of entries is large
Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.
If the set of key-value pairs is fixed and known ahead of time (so insertions and deletions are not allowed), one may reduce the average lookup cost by a careful choice of the hash function, bucket table size, and internal data structures.
In particular, one may be able to devise a hash function that is collision-free, or even perfect (see below). In this case the keys need not be stored in the table.
Drawbacks
Although operations on a hash table take constant time on average, the cost of a good hash function can be significantly higher than the inner loop of the lookup algorithm for a sequential list or search tree.
Thus hash tables are not effective when the number of entries is very small
Uses
Associative arrays
Hash tables are commonly used to implement many types of in-memory tables
They are used to implement associative arrays (arrays whose indices are arbitrary strings or other complicated objects), especially in interpreted programming languages like Perl, Ruby, Python, and PHP.
When a new item is stored in a typical associative array and a hash collision occurs, but the keys themselves are different, the associative array stores both items
However, if the key of the new item exactly matches the key of an old item, the associative array typically erases the old item and overwrites it with the new item, so every item in the table has a unique key.
Database indexing
Hash tables may also be used as disk-based data structures and database indices (such as in dbm) although B-trees are more popular in these applications.
Caches
Hash tables can be used to implement caches, auxiliary data tables that are used to speed up the access to data that is primarily stored in slower media.
In this application, hash collisions can be handled by discarding one of the two colliding entries—usually erasing the old item that is currently stored in the table and overwriting it with the new item, so every item in the table has a unique hash value.
Sets
Besides recovering the entry that has a given key, many hash table implementations can also tell whether such an entry exists or not
Those structures can therefore be used to implement a set data structure, which merely records whether a given key belongs to a specified set of keys
Object representation
Several dynamic languages, such as Perl, Python, JavaScript, and Ruby, use hash tables to implement objects. In this representation, the keys are the names of the members and methods of the object, and the values are pointers to the corresponding member or method.
Unique data representation
Hash tables can be used by some programs to avoid creating multiple character strings with the same contents
For that purpose, all strings in use by the program are stored in a single string pool implemented as a hash table, which is checked whenever a new string has to be created
This technique was introduced in Lisp interpreters under the name hash consing, and can be used with many other kinds of data (expression trees in a symbolic algebra system, records in a database, files in a file system, binary decision diagrams, etc.)
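A string pool of the kind described above can be sketched in a few lines. This is an illustrative sketch only: `StringPool` and its `intern` method are hypothetical names, and a Python `dict` (itself a hash table) stands in for the pool. Python's own `sys.intern` serves a similar purpose.

```python
class StringPool:
    """Keeps one canonical object per distinct string value."""

    def __init__(self):
        self._pool = {}  # maps string contents to the canonical string object

    def intern(self, s):
        # Return the stored object if an equal string is already pooled;
        # otherwise store s and return it.
        return self._pool.setdefault(s, s)

    def __len__(self):
        return len(self._pool)

pool = StringPool()
a = pool.intern("".join(["ha", "sh"]))
b = pool.intern("".join(["ha", "sh"]))
print(a is b, len(pool))
```

After interning, equal strings share one object, so later equality checks can compare identities instead of contents.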
Implementations
In programming languages
In C++11, for example, the unordered_map class provides hash tables for keys and values of arbitrary type.
In PHP 5, the Zend 2 engine uses one of the hash functions from Daniel J. Bernstein to generate the hash values used in managing the mappings of data pointers stored in a hash table
Python's built-in hash table implementation, in the form of the dict type, as well as Perl's hash type (%) are used internally to implement namespaces and therefore need to pay more attention to security, i.e. collision attacks
Python sets also use hashes internally, for fast lookup (though they store only keys, not values).^[24]
The idea of hashing arose independently in different places. In January 1953, H. P. Luhn wrote an internal IBM memorandum that used hashing with chaining.
G. N. Amdahl, E. M. Boehme, N. Rochester, and Arthur Samuel implemented a program using hashing at about the same time. Open addressing with linear probing (relatively prime stepping) is credited to Amdahl, but Ershov (in Russia) had the same idea.^[25]

zyx321zzz

distribute the entries (key/value pairs) across an array of buckets
This is simply the number of entries divided by the number of buckets, that is, n/k where n is the number of entries and k is the number of buckets.

24 Oct 14

kronayne91

algorithm alignment sequencing hash programming

a hash table (also hash map) is a data structure used to implement an associative array, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
Ideally, the hash function will assign each key to a unique bucket, but this situation is rarely achievable in practice
most hash table designs assume that hash collisions—different keys that are assigned by the hash function to the same bucket—will occur and must be accommodated in some way.
the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table
Given a key, the algorithm computes an index that suggests where the entry can be found
The idea of hashing is to distribute the entries (key/value pairs) across an array of buckets.
the hash is independent of the array size, and it is then reduced to an index
the function should provide a uniform distribution of hash values.
If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash table that has no collisions.

21 Mar 14

Benjamin G

In computing, a hash table (also hash map) is a data structure used to implement an associative array, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.

10 Feb 14

datta_

A critical statistic for a hash table is called the load factor. This is simply the number of entries divided by the number of buckets, that is, n/k where n is the number of entries and k is the number of buckets.
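As a small worked example of the formula n/k, the sketch below computes the load factor for two cases. The function name is hypothetical. The second case uses the 10,000 keys in 1000 slots mentioned elsewhere in these highlights.

```python
def load_factor(n_entries, n_buckets):
    """Load factor n/k: number of stored entries divided by number of buckets."""
    if n_buckets <= 0:
        raise ValueError("a table needs at least one bucket")
    return n_entries / n_buckets

print(load_factor(7, 10))          # 0.7
print(load_factor(10_000, 1_000))  # 10.0
```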

15 Oct 13

zzanghwi

a hash table
data structure used to implement an associative array, a structure that can map keys to values.
hash map
A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.

06 Oct 13

Meredith Hitchcock

hash table (also hash map) is a data structure used to implement an associative array, a structure that can map keys to values
In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table.
A critical statistic for a hash table is called the load factor. This is simply the number of entries divided by the number of buckets, that is, n/k where n is the number of entries and k is the number of buckets.
most hash table implementations have some collision resolution strategy to handle such events
Chained hash tables remain effective even when the number of table entries n is much higher than the number of slots. Their performance degrades more gracefully (linearly) with the load factor
For separate-chaining, the worst-case scenario is when all entries are inserted into the same bucket, in which case the hash table is ineffective and the cost is that of searching the bucket data structure
In another strategy, called open addressing, all entry records are stored in the bucket array itself. When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found.
Open addressing only saves memory if the entries are small (less than four times the size of a pointer) and the load factor is not too small.
Ultimately, used sensibly, any kind of hash table algorithm is usually fast enough; and the percentage of a calculation spent in hash table code is low. Memory usage is rarely considered excessive. Therefore, in most cases the differences between these algorithms are marginal, and other considerations typically come into play
The main advantage of hash tables over other table data structures is speed. This advantage is more apparent when the number of entries is large. Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.
Several dynamic languages, such as Perl, Python, JavaScript, and Ruby, use hash tables to implement objects.

18 Jun 13

yingsai dong

In this method, the hash is independent of the array size,
uniform distribution of hash values
basic requirement
A drawback of all these open addressing schemes is that the number of stored entries cannot exceed the number of slots in the bucket array.
In fact, even with good hash functions, their performance dramatically degrades when the load factor grows beyond 0.7 or so.
Open addressing schemes also put more stringent requirements on the hash function: besides distributing the keys more uniformly over the buckets, the function must also minimize the clustering of hash values that are consecutive in the probe order.

11 May 13

mgupta

Chained hash tables also inherit the disadvantages of linked lists. When storing small keys and values, the space overhead of the next pointer in each entry record can be significant. An additional disadvantage is that traversing a linked list has poor cache performance, making the processor cache ineffective.
Open addressing only saves memory if the entries are small (less than four times the size of a pointer) and the load factor is not too small. If the load factor is close to zero (that is, there are far more buckets than stored entries), open addressing is wasteful even if each entry is just two words.
Open addressing avoids the time overhead of allocating each new entry record, and can be implemented even in the absence of a memory allocator. It also avoids the extra indirection required to access the first entry of each bucket (that is, usually the only one). It also has better locality of reference, particularly with linear probing. With small record sizes, these factors can yield better performance than chaining, particularly for lookups.
Hash tables with open addressing are also easier to serialize, because they do not use pointers.
Generally speaking, open addressing is better used for hash tables with small records that can be stored within the table (internal storage) and fit in a cache line.
Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.
Hash tables can be used to implement caches, auxiliary data tables that are used to speed up the access to data that is primarily stored in slower media. In this application, hash collisions can be handled by discarding one of the two colliding entries—usually erasing the old item that is currently stored in the table and overwriting it with the new item, so every item in the table has a unique hash value.
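The cache policy described above, where a colliding entry simply replaces the old one, can be sketched as follows. This is a minimal illustration under assumptions: `DirectMappedCache` and `load_slow` are hypothetical names, with `load_slow` standing in for the access to the slower medium, and Python's built-in `hash` picks the slot.

```python
class DirectMappedCache:
    """Fixed-size cache in which each key maps to exactly one slot."""

    def __init__(self, n_slots, load_slow):
        self._slots = [None] * n_slots
        self._load_slow = load_slow  # fallback used on a miss (hypothetical callback)
        self.misses = 0

    def get(self, key):
        i = hash(key) % len(self._slots)
        entry = self._slots[i]
        if entry is not None and entry[0] == key:
            return entry[1]                # hit: the slot holds this key
        self.misses += 1
        value = self._load_slow(key)       # miss: fetch from the slow source
        self._slots[i] = (key, value)      # overwrite whatever occupied the slot
        return value

cache = DirectMappedCache(4, load_slow=lambda k: k * k)
cache.get(1)   # miss
cache.get(1)   # hit
cache.get(5)   # miss; 5 maps to the same slot as 1 and evicts it
cache.get(1)   # miss again
print(cache.misses)
```

No collision-resolution structure is needed, at the price of extra misses when two frequently used keys share a slot.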

27 Apr 13

Radoslaw Tomaszewski

programming hash_table algorithms data_structure algorithm

11 Apr 13

lquerel

More sophisticated data structures, such as balanced search trees, are worth considering only if the load factor is large (about 10 or more), or if the hash distribution is likely to be very non-uniform, or if one must guarantee good performance even in a worst-case scenario.
Chained hash tables also inherit the disadvantages of linked lists. When storing small keys and values, the space overhead of the next pointer in each entry record can be significant. An additional disadvantage is that traversing a linked list has poor cache performance, making the processor cache ineffective.
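For reference, a chained table of the kind discussed above can be sketched as follows. Each entry record carries a `next` pointer, which is the per-entry space overhead the highlight refers to, and a lookup walks the list node by node. The class names are hypothetical, Python's built-in `hash` is assumed, and the sketch neither resizes nor deletes.

```python
class _Node:
    """One entry record in a bucket's linked list."""
    __slots__ = ("key", "value", "next")

    def __init__(self, key, value, next_node):
        self.key = key
        self.value = value
        self.next = next_node  # the extra pointer stored with every entry

class ChainedTable:
    """Separate chaining: each bucket is the head of a singly linked list."""

    def __init__(self, n_buckets=8):
        self._buckets = [None] * n_buckets
        self._count = 0

    def _index(self, key):
        return hash(key) % len(self._buckets)

    def put(self, key, value):
        i = self._index(key)
        node = self._buckets[i]
        while node is not None:          # scan the chain for an existing key
            if node.key == key:
                node.value = value       # same key: overwrite the old value
                return
            node = node.next
        # key not present: prepend a new node to the chain
        self._buckets[i] = _Node(key, value, self._buckets[i])
        self._count += 1

    def get(self, key, default=None):
        node = self._buckets[self._index(key)]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return default

    def load_factor(self):
        return self._count / len(self._buckets)
```

Unlike open addressing, this table keeps working when the load factor exceeds 1; chains just get longer.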

24 Jan 13

darkpanda89

uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
hash collisions—different keys that are assigned by the hash function to the same bucket
The idea of hashing is to distribute the entries (key/value pairs) across an array of buckets.
the hash is independent of the array size, and it is then reduced to an index
using a remainder operation (%).
A basic requirement is that the function should provide a uniform distribution of hash values.
load factor. This is simply the number of entries divided by the number of buckets
If the load factor is kept reasonable, the hash table should perform well, provided the hashing is good. If the load factor grows too large, the hash table will become slow, or it may fail to work
A low load factor is not especially beneficial. As load factor approaches 0, the proportion of unused areas in the hash table increases, but there is not necessarily any reduction in search cost. This results in wasted memory.
The time for hash table operations is the time to find the bucket (which is constant) plus the time for the list operation
open hashing or closed addressing
by using a self-balancing tree, the theoretical worst-case time of common hash table operations (insertion, deletion, lookup) can be brought down to O(log n) rather than O(n)
this approach is only worth the trouble and extra memory cost if long delays must be avoided at all costs
When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found.
"open addressing" refers to the fact that the location ("address") of the item is not determined by its hash value.

21 Oct 12

qqibrow

In particular, if one uses dynamic resizing with exact doubling and halving of s, the hash function needs to be uniform only when s is a power of two
On the other hand, some hashing algorithms provide uniform hashes only when s is a prime number
Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.
the cost of a good hash function can be significantly higher than the inner loop of the lookup
hash tables are not effective when the number of entries is very small.
there is no efficient way to locate an entry whose key is nearest to a given key
a data structure with better worst-case guarantees may be preferable

08 Aug 12

Jakub Bares

uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number).

12 Jul 12

Shira Rockowitz

bioalgo hash_table hash

The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought.
Ideally, the hash function should map each possible key to a unique slot index, but this ideal is rarely achievable in practice (unless the hash keys are fixed; i.e. new entries are never added to the table after it is created). Instead, most hash table designs assume that hash collisions—different keys that map to the same hash value—will occur and must be accommodated in some way.
In many situations, hash tables turn out to be more efficient than search trees or any other table lookup structure. For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets.
A non-uniform distribution increases the number of collisions, and the cost of resolving them.
For example, by using a self-balancing tree, the theoretical worst-case time of common hash table operations (insertion, deletion, lookup) can be brought down to O(log n) rather than O(n). However, this approach is only worth the trouble and extra memory cost if long delays must be avoided at all costs (e.g. in a real-time application), or if one must guard against many entries hashed to the same slot (e.g. if one expects extremely non-uniform distributions, or in the case of web sites or other publicly accessible services, which are vulnerable to malicious key distributions in requests).
In fact, even with good hash functions, their performance dramatically degrades when the load factor grows beyond 0.7 or so. Thus a more aggressive resize scheme is needed

11 Jul 12

Dan Rahmel

lkg lkg study datastructure algorithm programming hash algorithms

16 May 12

avinashp

widely used in many kinds of computer software, particularly for associative arrays

09 Jan 12

kamrad

DataStructures algorithm hash

non-uniform distribution increases the number of collisions,
A basic requirement is that the function should provide a uniform distribution of hash values
The distribution needs to be uniform only for table sizes s that occur in the application.
Cryptographic hash functions are believed to provide good hash functions for any table size s
If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash table that has no collisions
Fast and Compact Hash Tables for Integer Keys

20 Dec 11

colors huang

a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number).
Ideally, the hash function should map each possible key to a unique slot index, but this ideal is rarely achievable in practice
Hash table algorithms calculate an index from the data item's key and use this index to place the data into the array. The implementation of this calculation is the hash function, f:
At the heart of the hash table algorithm is a simple array of items; this is often simply called the hash table.

24 Sep 11

Abhay Bothra

a hash table implements an associative array.
If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash table that has no collisions. If minimal perfect hashing is used, every location in the hash table can be used as well
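To make the idea concrete, here is a brute-force sketch that searches for a seed under which a fixed key set hashes without collisions. With as many slots as keys, the result is a minimal perfect hash. This is an illustration under assumptions, not a standard construction: the function names are hypothetical, SHA-256 from `hashlib` serves as a convenient seeded hash, and the search is practical only for small key sets.

```python
import hashlib

def slot_for(key, seed, n_slots):
    """Seeded hash of key reduced to a slot index (SHA-256 used as a stand-in)."""
    digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_slots

def find_perfect_seed(keys, n_slots=None, max_tries=100_000):
    """Try seeds until every key lands in a distinct slot."""
    keys = list(keys)
    n_slots = len(keys) if n_slots is None else n_slots  # default: minimal table
    for seed in range(max_tries):
        slots = {slot_for(key, seed, n_slots) for key in keys}
        if len(slots) == len(keys):   # no two keys share a slot
            return seed
    raise RuntimeError("no collision-free seed found within max_tries")

keys = ["red", "green", "blue", "cyan"]   # example key set, known ahead of time
seed = find_perfect_seed(keys)
table = [None] * len(keys)
for key in keys:
    table[slot_for(key, seed, len(keys))] = key
print(seed, table)
```

Once the seed is fixed, every lookup probes exactly one slot, which gives the constant worst-case lookup time that perfect hashing promises.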

08 Sep 11

Simon Cheng

12 Aug 11

karthikroopkumar

drawback of all these open addressing schemes is that the number of stored entries cannot exceed the number of slots in the bucket array
more aggressive resize scheme is needed.
besides distributing the keys more uniformly over the buckets, the function must also minimize the clustering of hash values that are consecutive in the probe order.
avoids the time overhead of allocating each new entry record,
and can be implemented even in the absence of a memory allocator.
poor choice for large elements, because these elements fill entire CPU cache lines (negating the cache advantage), and a large amount of space is wasted on large empty table slots.
Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.
speed
Thus hash tables are not effective when the number of entries is very small.
For certain string processing applications, such as spell-checking, hash tables may be less efficient than tries,
there is no efficient way to locate an entry whose key is nearest to a given key
Hash tables in general exhibit poor locality of reference—that is, the data to be accessed is distributed seemingly at random in memory
Hash tables become quite inefficient when there are many collisions.
Associative arrays

Hash tables are commonly used to implement many types of in-memory tables
Hash tables can be used to implement caches,
Several dynamic languages, such as Perl, Python, JavaScript, and Ruby, use hash tables to implement objects

11 Jul 11

Peter Beens

At the heart of the hash table algorithm is a simple array of items; this is often simply called the hash table. Hash table algorithms calculate an index from the data item's key and use this index to place the data into the array.
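The index calculation described above is often done in two steps: hash the key to an integer that does not depend on the array size, then reduce that integer to a valid index with a remainder operation. The sketch below assumes `zlib.crc32` as a stand-in hash function. The function name and sample keys are hypothetical, and collisions are not handled.

```python
import zlib

def bucket_index(key: str, array_length: int) -> int:
    """Map a string key to an index in range(array_length)."""
    h = zlib.crc32(key.encode("utf-8"))  # step 1: hash, independent of the array size
    return h % array_length              # step 2: reduce to an index by remainder

array_length = 8
for name in ("alice", "bob", "carol"):   # sample keys, for illustration only
    print(name, "->", bucket_index(name, array_length))
```

Because the hash itself does not depend on the array size, resizing the table only changes the reduction step.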

hash table search ICSxx ICS wikipedia wiki OnICS

20 May 11

Shawn Chen

basic requirement is that the function should provide a uniform distribution of hash values
the hash function should also avoid clustering,

20 Apr 11

Tu Phan

Hash table algorithm theory (should be read first)

c++ stl algorithm hash

15 Dec 10

paehdur

Ideally, the hash function should map each possible key to a unique slot index,

08 Dec 10

utsuk srivastava

Many hash table designs also allow arbitrary insertions and deletions of key-value pairs, at constant average

26 Oct 10

Nakul Lahoti

The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought.
In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table.
Thus, a hash table implements an associative array.
For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets.
A basic requirement is that the function should provide a uniform distribution of hash values. A non-uniform distribution increases the number of collisions, and the cost of resolving them.
index = f(key, arrayLength)
Perfect hash function

If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash table that has no collisions
For open addressing schemes, the hash function should also avoid clustering, the mapping of two or more keys to consecutive slots. Such clustering may cause the lookup cost to skyrocket, even if the load factor is low and collisions are infrequent. The popular multiplicative hash^[2] is claimed to have particularly poor clustering behavior.^[5]
Cryptographic hash functions are believed to provide good hash functions for any table size s, either by modulo reduction or by bit masking. They may also be appropriate if there is a risk of malicious users trying to sabotage a network service by submitting requests designed to generate a large number of collisions in the server's hash tables.
Perfect hashing allows for constant time lookups in the worst case.
Hash collisions are practically unavoidable when hashing a random subset of a large set of possible keys
In the strategy known as separate chaining, direct chaining, or simply chaining, each slot of the bucket array is a pointer to a linked list that contains the key-value pairs that hashed to the same location.
The technique is also called open hashing or closed addressing,
Chained hash tables with linked lists are popular because they require only basic data structures with simple algorithms, and can use simple hash functions that are unsuitable for other methods.
For separate-chaining, the worst-case scenario is when all entries were inserted into the same bucket
The cost of a table operation is that of scanning the entries of the selected bucket for the desired key. If the distribution of keys is sufficiently uniform, the average cost of a lookup depends only on the average number of keys per bucket—that is, on the load factor.
Chained hash tables remain effective even when the number of entries n is much higher than the number of slots. Their performance degrades more gracefully (linearly) with the load factor. For example, a chained hash table with 1000 slots and 10,000 stored keys (load factor 10) is five to ten times slower than a 10,000-slot table (load factor 1); but still 1000 times faster than a plain sequential list, and possibly even faster than a balanced search tree.
Chained hash tables also inherit the disadvantages of linked lists. When storing small keys and values, the space overhead of the next pointer in each entry record can be significant.
in which case the hash table is ineffective and the cost is that of searching the bucket data structure.
Some chaining implementations store the first record of each chain in the slot array itself.^[3] The purpose is to increase cache efficiency of hash table access.
The bucket chains are often implemented as ordered lists, sorted by the key field; this choice approximately halves the average cost of unsuccessful lookups, compared to an unordered list. However, if some keys are much more likely to come up than others, an unordered list with move-to-front heuristic may be more effective. More sophisticated data structures, such as balanced search trees, are worth considering only if the load factor is large (about 10 or more), or if the hash distribution is likely to be very non-uniform, or if one must guarantee good performance even in the worst-case.
Instead of a list, one can use any other data structure that supports the required operations. By using a self-balancing tree, for example, the theoretical worst-case time of a hash table can be brought down to O(log n) rather than O(n). However, this approach is only worth the trouble and extra memory cost if long delays must be avoided at all costs (e.g. in a real-time application), or if one expects to have many entries hashed to the same slot (e.g. if one expects extremely non-uniform or even malicious key distributions).
Separate
In another strategy, called open addressing, all entry records are stored in the bucket array itself. When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found.
When searching for an entry, the buckets are scanned in the same sequence, until either the target record is found, or an unused array slot is found, which indicates that there is no such key in the table.^[10] The name "open addressing" refers to the fact that the location ("address") of the item is not determined by its hash value.
A drawback of all these open addressing schemes is that the number of stored entries cannot exceed the number of slots in the bucket array.
Open addressing only saves memory if the entries are small
Hash tables with open addressing are also easier to serialize, because they don't use pointers.
Generally speaking, open addressing is better used for hash tables with small records that can be stored within the table (internal storage) and fit in a cache line. They are particularly suitable for elements of one word or less. In cases where the tables are expected to have high load factors, the records are large, or the data is variable-sized, chained hash tables often perform as well or better.
Coalesced hashing
A hybrid of chaining and open addressing, coalesced hashing links together chains of nodes within the table itself.^[10] Like open addressing, it achieves space usage and (somewhat diminished) cache advantages over chaining. Like chaining, it does not exhibit clustering effects; in fact, the table can be efficiently filled to a high density. Unlike chaining, it cannot have more elements than table slots.
Cuckoo hashing
Another alternative open-addressing solution is cuckoo hashing, which ensures constant lookup time in the worst case, and constant amortized time for insertions and deletions. It uses two or more hash functions, which means any key/value pair could be in two or more locations. For lookup, the first hash function is used; if the key/value is not found, then the second hash function is used, and so on. If a collision happens during insertion, then the key is re-hashed with the second hash function to map it to another bucket. If all hash functions are used and there is still a collision, then the key it collided with is removed to make space for the new key, and the old key is re-hashed with one of the other hash functions, which maps it to another bucket. If that location also results in a collision, then the process repeats until there is no collision or the process traverses all the buckets, at which point the table is resized. By combining multiple hash functions with multiple cells per bucket, very high space utilisation can be achieved.
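The scheme can be sketched as follows. Note that this is the common two-table variant, which evicts the occupant immediately and moves it to its slot in the other table, rather than the exact sequence of steps in the paragraph above. The class name, the SHA-256-based hash functions, the bound on the displacement chain, and the doubling policy are all assumptions made for illustration.

```python
import hashlib

def _h(key, salt, n):
    """One of two hash functions, selected by salt (SHA-256 used as a stand-in)."""
    digest = hashlib.sha256(f"{salt}:{key!r}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n

class CuckooTable:
    """Cuckoo hashing with two tables; a lookup probes at most two slots."""

    def __init__(self, capacity=8):
        self._n = capacity
        self._tables = [[None] * capacity, [None] * capacity]

    def get(self, key, default=None):
        for which in (0, 1):
            entry = self._tables[which][_h(key, which, self._n)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return default

    def put(self, key, value):
        # If the key is already stored, overwrite its value in place.
        for which in (0, 1):
            i = _h(key, which, self._n)
            entry = self._tables[which][i]
            if entry is not None and entry[0] == key:
                self._tables[which][i] = (key, value)
                return
        entry = (key, value)
        which = 0
        for _ in range(2 * self._n):  # bound on the displacement chain (assumption)
            i = _h(entry[0], which, self._n)
            # Place the entry and pick up whatever occupied the slot before.
            entry, self._tables[which][i] = self._tables[which][i], entry
            if entry is None:
                return                # the slot was free: done
            which = 1 - which         # the evicted entry goes to the other table
        self._grow(entry)             # chain too long: resize and re-insert

    def _grow(self, pending):
        """Double both tables and re-insert every entry, plus the displaced one."""
        entries = [e for t in self._tables for e in t if e is not None]
        entries.append(pending)
        self._n *= 2
        self._tables = [[None] * self._n, [None] * self._n]
        for k, v in entries:
            self.put(k, v)
```

Each key can live in only two places, so `get` inspects at most two slots regardless of how many entries are stored.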
Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.
If the set of key-value pairs is fixed and known ahead of time (so insertions and deletions are not allowed), one may reduce the average lookup cost by a careful choice of the hash function, bucket table size, and internal data structures.
Hash tables can be more difficult to implement than self-balancing binary search trees. Choosing an effective hash function for a specific application is more an art than a science. In open-addressed hash tables it is fairly easy to create a poor hash function.
Thus hash tables are not effective when the number of entries is very small.
For certain string processing applications, such as spell-checking, hash tables may be less efficient than tries, finite automata, or Judy arrays.
Also, if each key is represented by a small enough number of bits, then, instead of a hash table, one may use the key directly as the index into an array of values. Note that there are no collisions in this case.
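As a rough illustration of using the key directly as an array index, the sketch below assumes keys are unsigned integers of at most 8 bits; the width `KEY_BITS` and the class name `DirectAddressTable` are hypothetical and chosen only for the example.

```python
KEY_BITS = 8  # assumed key width; the array needs 2**KEY_BITS slots


class DirectAddressTable:
    """Array indexed by the key itself: no hash function and no collisions."""

    def __init__(self):
        self.values = [None] * (1 << KEY_BITS)

    def put(self, key, value):
        if not 0 <= key < len(self.values):
            raise KeyError(key)
        self.values[key] = value

    def get(self, key):
        if not 0 <= key < len(self.values):
            raise KeyError(key)
        return self.values[key]  # None means no value was stored for this key
```

The memory cost grows as 2 to the power of the key width, which is why this approach is only practical when keys are short.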
Although the average cost per operation is constant and fairly small, the cost of a single operation may be quite high. In particular, if the hash table uses dynamic resizing, an insertion or deletion operation may occasionally take time proportional to the number of entries. This may be a serious drawback in real-time or interactive applications.
Hash tables in general exhibit poor locality of reference—that is, the data to be accessed is distributed seemingly at random in memory.
Compact data structures such as arrays, searched with linear search, may be faster if the table is relatively small and keys are integers or short strings.

11 Sep 10

Dante-Gabryell Monson

In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number).

database data ReQuest programming netention automenta

07 Jun 10

elitehackers

hash table wikipedia math vol-policy coding

25 Feb 10

J Spooktalker

computing reference

07 Jul 09

Scott Bower

database

14 May 09

Donghai Ma

hash programming wikipedia algorithms reference datastructures computer algorithm

23 Apr 09

minstrum balance

thesis

06 Feb 09

peisheng wang

hash

15 Jan 09

Capt. Ben Smith

hash

21 Sep 08

Marc Slayton

ly lax strength requirements needed for hash tables, but their slowness and complexity makes them unappealing. However, using cryp

hash table programming delicious

amit papnai

hashtable DS

09 Jun 08

Nguyen Tien Si

hash programming algorithm data-structure algorithms

19 Mar 08

j p

Hash tables support the efficient insertion of new entries, in expected O(1) time.
a black hat with knowledge of the hash function may be able to supply information to a hash which creates worst-case behavior by causing excessive collisions, resulting in very poor performance (i.e., a denial of service attack)
Hash tables in general exhibit poor locality of reference
hash table; both insertion and search approach O(1) time
Compared to other associative array data structures, hash tables are most useful when large numbers of records are to be stored, especially if the size of the data set can be predicted.
Evaluating a good hash function can be a slow operation. In particular, if simple array indexing can be used instead, this is usually faster.
Simple Uniform Hashing Assumption
With a good hash function, a hash table can typically contain about 70%–80% as many elements as it does table slots and still perform well.
1 + 2 + 4 + 8 + ... + n = 2n - 1.
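The identity above holds when n is a power of two. Assuming it refers to the total copying work of a table that doubles its capacity whenever it fills up (the highlight itself does not say so), the sketch below checks the sum and counts the copies directly. Both function names are hypothetical.

```python
def geometric_sum(n):
    """Return 1 + 2 + 4 + ... + n, where n is assumed to be a power of two."""
    total, term = 0, 1
    while term <= n:
        total += term
        term *= 2
    return total


def copies_during_doubling(inserts):
    """Count entries copied while inserting into a table that doubles when full."""
    capacity, size, copies = 1, 0, 0
    for _ in range(inserts):
        if size == capacity:
            copies += size  # every existing entry moves to the new array
            capacity *= 2
        size += 1
    return copies


for n in (1, 2, 4, 8, 1024):
    assert geometric_sum(n) == 2 * n - 1
    # The resize triggered by insertion n + 1 completes the sum 1 + 2 + ... + n.
    assert copies_during_doubling(n + 1) == 2 * n - 1
```

Because the total number of copies stays below twice the number of insertions, the amortized cost per insertion remains constant even though a single resizing insertion takes time proportional to the table size.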