Wednesday, April 28, 2010

Stride of an array

In computer programming, the stride of an array (also increment or step size) refers to the number of locations in memory between successive array elements, measured in bytes or in units of the size of the array's elements.

An array with stride 1 has elements which are contiguous in memory. Such arrays are sometimes said to have unit stride. Unit stride arrays are generally more efficient than non-unit stride arrays, due to the effects of caching; this is attributed to the principle of locality, specifically spatial locality.

Unit stride array

An array whose elements are stored contiguously in memory. Unit stride arrays are more efficient with respect to caches than nonunit stride arrays are.

Structure padding and how to avoid it

Re: What is structure padding ?
Answer
# 1
[See http://www.geocities.com/vijoeyz/faq/c/padding.txt]

All modern CPUs expect that the fundamental types -- ints, floats, and longs -- are stored in memory at their natural boundary; typically, at addresses that are multiples of their length. Some CPUs work efficiently only if the memory is properly aligned, and some can work in either case.

For the following examples, let us assume:

sizeof (int) == 4
sizeof (char) == 1
sizeof (float) == 4

When a C compiler processes a structure, it adds padding bits or bytes, if required, between the members to ensure proper alignment. Consider the following scenario:

> What is the difference between the following structures:
>
> struct pad1
> {
> int a;
> char c;
> float f;
> };
>

"a" and "f" should occur at addresses that are multiples of 4, whereas "c" can take any -- odd or even -- address. So the structure appears in memory as shown:

___________________
| a0 | a1 | a2 | a3 | 4-byte alignment
------------------- P is padding byte
| c0 | P0 | P1 | P2 |
-------------------
| f0 | f1 | f2 | f3 |
-------------------

> and
>
> struct pad2
> {
> float f;
> int a;
> char c;
> };
>
___________________
| f0 | f1 | f2 | f3 | 4-byte alignment
------------------- P is padding byte
| a0 | a1 | a2 | a3 |
-------------------
| c0 | P0 | P1 | P2 |
-------------------

The following points are worth noting:

* The compiler also ensures that the structure as a whole appears at an aligned address.

* Padding does NOT occur at the beginning of a structure.

* The values of padding bytes or bits are implementation-defined.

> What is the use of padding?

* Padding is useful, for example, in conforming to externally imposed layouts of machine registers.

* The obvious advantage is efficient access by the CPU.


> And finally, what is a ring buffer? Where is it used? Someone please explain in detail.

* A buffer of data which is of fixed size; when it fills, further data is placed back at the start of the buffer, overwriting the old data, in a "ring". Commonly used in device drivers.

For more examples, search Google for the keyword "define: ring buffer", excluding the quotes.
Vijoeyz

Re: What is structure padding ?
Answer
# 2
Structure padding is used to pad the data so that it can be sent to external devices. Sometimes the data is padded so that it can be used on both little-endian and big-endian processors. Padding is also done for fast access to data in memory.



Re: How to avoid structure padding in C?
Answer
# 1
By using #pragma you can avoid structure padding; and, if I am not wrong, you can use it on Linux or UNIX as well.
Ravi

Re: How to avoid structure padding in C?
Answer
# 2
Yes, you can use #pragma to change the byte alignment. For example:

typedef struct _s1 {
    unsigned int i;
    unsigned char c;
    unsigned long a;
    unsigned short e;
} s1;

The sum of the member sizes of this structure is 11 bytes, but due to the default byte alignment (which differs between compilers), the size of the structure would be 16 bytes. In order to change the alignment, we have to do something like this:

#pragma pack(push,1)
typedef struct _s1 {
    unsigned int i;
    unsigned char c;
    unsigned long a;
    unsigned short e;
    //unsigned char b;
} s1;
#pragma pack(pop)

This changes the byte alignment to 1 byte, and thus the size of the structure will be exactly 11 bytes.
Lokesh Mogra


Re: How to avoid structure padding in C?
Answer
# 3
Those are 3 different things.

Structure Padding

Most processors require specific memory alignment for variables of certain types. Normally the minimum alignment is the size of the basic type in question; for instance, this is common:

char variables can be byte aligned and appear at any byte
boundary

short (2-byte) variables must be 2-byte aligned; they can appear at any even byte boundary. This means that 0x10004567 is not a valid location for a short variable but 0x10004566 is.

long (4-byte) variables must be 4-byte aligned; they can only appear at byte boundaries that are a multiple of 4 bytes. This means that 0x10004566 is not a valid location for a long variable but 0x10004568 is.

Structure padding occurs because the members of the structure must appear at the correct byte boundary. To achieve this, the compiler puts in padding bytes (or bits if bit fields are in use) so that the structure members appear in the correct location. Additionally, the size of the structure must be such that in an array of the structures all the structures are correctly aligned in memory, so there may be padding bytes at the end of the structure too.

struct example {
    char c1;
    short s1;
    char c2;
    long l1;
    char c3;
};

In this structure, assuming the alignment scheme I have previously stated:

c1 can appear at any byte boundary; however, s1 must appear at a 2-byte boundary, so there is a padding byte between c1 and s1.

c2 can then appear in the available memory location; however, l1 must be at a 4-byte boundary, so there are 3 padding bytes between c2 and l1.

c3 can then appear in the available memory location; however, because the structure contains a long member, the structure must be 4-byte aligned and must be a multiple of 4 bytes in size. Therefore there are 3 padding bytes at the end of the structure. It would appear in memory in this order:

c1
padding byte
s1 byte 1
s1 byte 2
c2
padding byte
padding byte
padding byte
l1 byte 1
l1 byte 2
l1 byte 3
l1 byte 4
c3
padding byte
padding byte
padding byte

The structure would be 16 bytes long.

re-written like this

struct example {
    long l1;
    short s1;
    char c1;
    char c2;
    char c3;
};

Then l1 appears at the correct byte alignment, and s1 will be correctly aligned, so there is no need for padding between l1 and s1. c1, c2, and c3 can appear at any location. The structure must be a multiple of 4 bytes in size since it contains a long, so 3 padding bytes appear after c3.

It appears in memory in the order

l1 byte 1
l1 byte 2
l1 byte 3
l1 byte 4
s1 byte 1
s1 byte 2
c1
c2
c3
padding byte
padding byte
padding byte

and is only 12 bytes long.


I should point out that structure packing is platform and
compiler (and in some cases compiler switch) dependent.




Memory pools are just a section of memory reserved for allocating temporarily to other parts of the application.

A memory leak occurs when you allocate some memory from the heap (or a pool) and then delete all references to that memory without returning it to the pool it was allocated from.
Santhi Perumal

Re: How to avoid structure padding in C?
Answer
# 4
by using #pragma pack



STRUCTURE PADDING

1:Data structure alignment is the way data is arranged and accessed in computer memory. It consists of two separate but related issues: data alignment and data structure padding. When a modern computer reads from or writes to a memory address, it will do this in word sized chunks (e.g. 4 byte chunks on a 32-bit system). Data alignment means putting the data at a memory offset equal to some multiple of the word size, which increases the system's performance due to the way the CPU handles memory. To align the data, it may be necessary to insert some meaningless bytes between the end of the last data structure and the start of the next, which is data structure padding.
For example, when the computer's word size is 4 bytes, the data to be read should be at a memory offset which is some multiple of 4. When this is not the case, e.g. the data starts at the 14th byte instead of the 16th byte, then the computer has to read two 4-byte chunks and do some calculation before the requested data has been read, or it may generate an alignment fault. Even though the previous data structure ends at the 14th byte, the next data structure should start at the 16th byte. Two padding bytes are inserted between the two data structures to align the next data structure to the 16th byte.

2:

Data structure padding

Although the compiler (or interpreter) normally allocates individual data items on aligned boundaries, data structures often have members with different alignment requirements. To maintain proper alignment the translator normally inserts additional unnamed data members so that each member is properly aligned. In addition the data structure as a whole may be padded with a final unnamed member. This allows each member of an array of structures to be properly aligned.

Padding is only inserted when a structure member is followed by a member with a larger alignment requirement or at the end of the structure. By changing the ordering of members in a structure, it is possible to change the amount of padding required to maintain alignment. For example, if members are sorted by ascending or descending alignment requirements a minimal amount of padding is required. The minimal amount of padding required is always less than the largest alignment in the structure. Computing the maximum amount of padding required is more complicated, but is always less than the sum of the alignment requirements for all members minus twice the sum of the alignment requirements for the least aligned half of the structure members.

Although C and C++ do not allow the compiler to reorder structure members to save space, other languages might. It is also possible to tell most C and C++ compilers to "pack" the members of a structure to a certain level of alignment, e.g. "pack(2)" means align data members larger than a byte to a two-byte boundary so that any padding members are at most one byte long.

One use for such "packed" structures is to conserve memory. For example, a structure containing a single byte and a four-byte integer would require three additional bytes of padding. A large array of such structures would use 37.5% less memory if they are packed, although accessing each structure might take longer. This compromise may be considered a form of space-time tradeoff.

Although use of "packed" structures is most frequently used to conserve memory space, it may also be used to format a data structure for transmission using a standard protocol. However in this usage, care must also be taken to ensure that the values of the struct members are stored with the endianness required by the protocol (often network byte order), which may be different from the endianness used natively by the host machine.

Computing padding

The following formulas provide the number of padding bytes required to align the start of a data structure (where mod is the modulo operator):

padding = (align - (offset mod align)) mod align
new offset = offset + padding

For example, the padding to add to offset 0x59d for a structure aligned to every 4 bytes is 3. The structure will then start at 0x5a0, which is a multiple of 4. Note that when offset is already a multiple of align, the outer modulo is required to get a padding of 0.

Or alternatively when the alignment is a power of two, the following formulas provides the new offset (where & is a bitwise AND and ~ a bitwise NOT):

new offset = align + ((offset - 1) & ~(align - 1))
padding = (align + ((offset - 1) & ~(align - 1))) - offset

Typical alignment of C structs on x86

Data structure members are stored sequentially in a memory so that in the structure below the member Data1 will always precede Data2 and Data2 will always precede Data3:

struct MyData
{
    short Data1;
    short Data2;
    short Data3;
};

If the type "short" is stored in two bytes of memory then each member of the data structure depicted above would be 2-byte aligned. Data1 would be at offset 0, Data2 at offset 2 and Data3 at offset 4. The size of this structure would be 6 bytes.

The type of each member of the structure usually has a default alignment, meaning that it will, unless otherwise requested by the programmer, be aligned on a pre-determined boundary. The following typical alignments are valid for compilers from Microsoft, Borland, and GNU when compiling for 32-bit x86:

  • A char (one byte) will be 1-byte aligned.
  • A short (two bytes) will be 2-byte aligned.
  • An int (four bytes) will be 4-byte aligned.
  • A float (four bytes) will be 4-byte aligned.
  • A double (eight bytes) will be 8-byte aligned on Windows and 4-byte aligned on Linux.
  • A long double (twelve bytes) will be 4-byte aligned on Linux.
  • Any pointer (four bytes) will be 4-byte aligned on Linux. (e.g.: char*, int*)

The only notable difference in alignment for a 64-bit Linux system when compared to a 32 bit is:

  • A double (eight bytes) will be 8-byte aligned.
  • A long double (Sixteen bytes) will be 16-byte aligned.
  • Any pointer (eight bytes) will be 8-byte aligned.

Here is a structure with members of various types, totaling 8 bytes before compilation:

struct MixedData
{
    char Data1;
    short Data2;
    int Data3;
    char Data4;
};

After compilation the data structure will be supplemented with padding bytes to ensure a proper alignment for each of its members:

struct MixedData  /* after compilation */
{
    char Data1;
    char Padding1[1]; /* For the following 'short' to be aligned on a 2 byte boundary */
    short Data2;
    int Data3;
    char Data4;
    char Padding2[3];
};

The compiled size of the structure is now 12 bytes. It is important to note that the last member is padded with the number of bytes required to conform to the largest type of the structure. In this case 3 bytes are added to the last member to pad the structure to the size of a long word.

It is possible to change the alignment of structures to reduce the memory they require (or to conform to an existing format) by reordering structure members or changing the compiler’s alignment (or “packing”) of structure members.

struct MixedData  /* after reordering */
{
    char Data1;
    char Data4; /* reordered */
    short Data2;
    int Data3;
};

The compiled size of the structure now matches the pre-compiled size of 8 bytes. Note that Padding1[1] has been replaced (and thus eliminated) by Data4 and Padding2[3] is no longer necessary as the structure is already aligned to the size of a long word.

The alternative method of forcing the MixedData structure to be aligned to a one-byte boundary causes the compiler to discard the pre-determined alignment of the structure members, and thus no padding bytes would be inserted.

While there is no standard way of defining the alignment of structure members, some compilers use #pragma directives to specify packing inside source files. Here is an example:

#pragma pack(push)  /* push current alignment to stack */
#pragma pack(1) /* set alignment to 1 byte boundary */

struct MyPackedData
{
    char Data1;
    long Data2;
    char Data3;
};

#pragma pack(pop) /* restore original alignment from stack */

This structure would have a compiled size of 6 bytes. The above directives are available in compilers from Microsoft, Borland, GNU and many others.

Default packing and #pragma pack

On some MS compilers, particularly for the RISC processor, there is a relationship between project default packing (the /Zp directive) and the #pragma pack directive which is unexpected for most people.

The #pragma pack directive can only be used to reduce the packing size of a structure from the project default packing. This leads to interoperability problems with library headers which use for example #pragma pack(8) if you set a project packing to smaller than this. The MSDN documentation[1] states that if the #pragma pack packing is larger than or equal to the project packing, it will be ignored.

For this reason, one should never set a project packing to any value other than the default of 8 bytes, as it would break the #pragma pack directives used in library headers and result in binary incompatibilities between structures.

In particular, setting /Zp1 breaks all #pragma pack directives other than #pragma pack(1).

However, this limitation is not present when compiling for desktop processors, such as the x86 architecture.

Allocating memory aligned to cache lines

It would be beneficial to allocate memory aligned to cache lines. If an array is partitioned for more than one thread to operate on, having the sub-array boundaries unaligned to cache lines could lead to performance degradation. Here is an example that allocates an array of 8 doubles aligned to a 64-byte cache line:

#include <stdlib.h>

/* Allocate an array of 8 doubles aligned to a 64-byte cache line */
double *foo(void) {
    double *var;
    int ok;

    ok = posix_memalign((void **)&var, 64, 8 * sizeof(double));

    if (ok != 0)
        return NULL;

    return var;
}

Familiar Sight

Oct 3 '06

re: what is structure padding



1.) The simple answer is that compilers pad structures to optimize data transfers. This is a hardware architecture issue. Most modern CPUs perform best when fundamental types, like 'int' or 'float', are aligned on memory boundaries of a particular size (e.g. often a 4-byte word on 32-bit architectures). Many architectures don't allow misaligned access at all, or incur a performance penalty if they do.

When a compiler processes a structure declaration it will add extra bytes between fields to meet alignment needs.

#include <stdio.h>

struct MyStructA {
    char a;
    char b;
    int c;
};

struct MyStructB {
    char a;
    int c;
    char b;
};

int main(void) {
    int sizeA = sizeof(struct MyStructA);
    int sizeB = sizeof(struct MyStructB);

    printf("A = %d\n", sizeA);
    printf("B = %d\n", sizeB);

    return 0;
}

Results of compiling with gcc on a Linux x86 system; they will vary for other CPU/OS/compiler combinations. In this example the gcc compiler is aligning the structure fields on 4-byte boundaries. In MyStructA we have (1 byte + 1 byte + 2 bytes of padding + 4 bytes = 8 bytes). In MyStructB we have (1 byte + 3 bytes of padding + 4 bytes + 1 byte + 3 bytes of padding = 12 bytes).

tyreld@susie:~/Desktop> ./a.out
A = 8
B = 12

This example illustrates a good rule of thumb. Generally, it is good to group structure fields of the same type together to minimize the extra padding.

2.) The following link gives a basic overview of memory pools .

3.) My laziness as a programmer is clearly showing as I'm going to refer you now to the wikipedia page discussing memory leaks.
#4: May 11 '07

re: what is structure padding


I was just brushing up my C skills, especially structure padding, and I have the following doubt:
#include <stdio.h>

#pragma pack(push)
#pragma pack(1)

struct test {
    char a;   // 1 byte; generally padded so that the int can start at 4-byte alignment
    int b;
    char c;   // 1 byte; generally padded so that d can start at 2-byte alignment
    short d;
};

#pragma pack(pop)

typedef struct test testStructure;

int main()
{
    testStructure testData;
    testStructure testDataArray[4];
    int iterator;

    printf("\n Address of test Data is 0x %x \n", &testData);
    printf(" Address of char a in testData is 0x %x \n", &(testData.a));
    printf(" Address of int b in testData is 0x %x \n", &(testData.b));
    printf(" Address of char c in testData is 0x %x \n", &(testData.c));
    printf(" Address of short d in testData is 0x %x \n", &(testData.d));
    printf(" size of struct test is %d\n", sizeof(struct test));

    printf("\n Printing the address in Array\n");

    for(iterator = 0; iterator < 4; iterator++)
    {
        printf("Iterator value is %d \n", iterator);
        printf("\n Address of testDataArray is 0x %x \n", &testDataArray[iterator]);
        printf(" Address of char a in testDataArray is 0x %x \n", &(testDataArray[iterator].a));
        printf(" Address of int b in testDataArray is 0x %x \n", &(testDataArray[iterator].b));
        printf(" Address of char c in testDataArray is 0x %x \n", &(testDataArray[iterator].c));
        printf(" Address of short d in testDataArray is 0x %x \n", &(testDataArray[iterator].d));
    }

    return 0;
}

The output of this program, compiled on Windows XP with Visual Studio 2005, is as follows:


Address of test Data is 0x 12ff50
Address of char a in testData is 0x 12ff50
Address of int b in testData is 0x 12ff51
Address of char c in testData is 0x 12ff55
Address of short d in testData is 0x 12ff56
size of struct test is 8
Printing the address in Array
Iterator value is 0
Address of testDataArray is 0x 12ff58
Address of char a in testDataArray is 0x 12ff58
Address of int b in testDataArray is 0x 12ff59
Address of char c in testDataArray is 0x 12ff5d
Address of short d in testDataArray is 0x 12ff5e
Iterator value is 1
Address of testDataArray is 0x 12ff60
Address of char a in testDataArray is 0x 12ff60
Address of int b in testDataArray is 0x 12ff61
Address of char c in testDataArray is 0x 12ff65
Address of short d in testDataArray is 0x 12ff66
Iterator value is 2
Address of testDataArray is 0x 12ff68
Address of char a in testDataArray is 0x 12ff68
Address of int b in testDataArray is 0x 12ff69
Address of char c in testDataArray is 0x 12ff6d
Address of short d in testDataArray is 0x 12ff6e
Iterator value is 3
Address of testDataArray is 0x 12ff70
Address of char a in testDataArray is 0x 12ff70
Address of int b in testDataArray is 0x 12ff71
Address of char c in testDataArray is 0x 12ff75
Address of short d in testDataArray is 0x 12ff76


My doubt:

When we use #pragma pack, the compiler usually stops padding the structure, so how are these variables accessed? And this seems to contradict "Additionally the size of the structure must be such that in an array of the structures all the structures are correctly aligned in memory so there may be padding bytes at the end of the structure too".

Though we save much space here, why don't compilers have packing as the default?
Banfa, Site Moderator


re: what is structure padding


> Though we save much space here, why don't compilers have packing as the default?
The reason that compilers put in padding is to align members on boundaries that make the data easy to access.

Taking your example

#pragma pack(1)

struct test {
    char a;
    int b;
    char c;
    short d;
};

the int would be aligned on a 4-byte boundary because this allows the compiler to produce machine code using the processor's 32-bit access instructions (which often have to act on a 32-bit boundary). By packing the structure into memory, the machine code produced can no longer use the 32-bit access routines and has to read and load each byte of the int individually and reconstitute the 32-bit value in its registers from the individual bytes.

This takes many more processor instructions.

So the answer to your question is that while not padding the structure saves space, it also makes the code much less efficient, and if memory is not at a premium then the wastage caused by padding is considered worth it for the increased processor performance.

Tuesday, April 27, 2010

hashing 1

A hash table, put simply, is an abstraction of an array that allows any value to be used as an index. While an array requires that indices be integers, a hash table can use a floating-point value, a string, another array, or even a structure as the index. This index is called the key, and the contents of the array element at that index is called the value. So a hash table is a data structure that stores key/value pairs and can be quickly searched by the key. Because insertion and removal are operations dependent on the speed of the search, they tend to be fast as well.

To achieve this magic, a hash table uses a helper function that converts any object into an integral index suitable for subscripting the array. For example, to index an array with strings representing the numbers 0 through 9, a hash function might look like this:

Code:

unsigned hash ( const char key )
{
    return key - '0';
}

The first character is expected to always be '0', '1', '2', '3', '4', '5', '6', '7', '8', or '9', and the trick of subtracting '0' from one of those characters evaluates to the integer value that the character represents. This works because you can subtract the lowest value in a contiguous range from another value in that range and find the zero-based distance between them. So ('2'-'0') results in 2, ('0'-'0') results in 0, and ('9'-'0') results in 9.
For those of you who aren't familiar with this trick, it might help to work with the actual values rather than character constants. Let's take an arbitrary range of [12, 16). You want 12 to represent 0, 13 to represent 1, 14 to represent 2, and so on up to 16 which should represent 4. To do this you can take any value in the range and subtract the lowest value, 12, from it. Try it with a calculator.
The hash function shown above can then be used to index a suitable array. For illustrative purposes, I'll include a complete test program for you to play with. Notice how problems occur when two entries are hashed to the same index. This is called a collision, and properly handling collisions is where most of the effort in implementing hash tables comes in. Here is the example program:
Code:

#include <stdio.h>

/* Find the number of elements in an array */
#define length(a) ( sizeof a / sizeof *a )

static size_t hash ( const char key )
{
    return key - '0';
}

static void jsw_flush ( void );
static int get_digit ( char *ch );

static void fill_table ( char table[] );
static void show_table ( const char table[], size_t size );

int main ( void )
{
    /* This is our hash table */
    char table[10] = {0};

    fill_table ( table );
    show_table ( table, length ( table ) );

    return 0;
}

/*
    Discard remaining characters in stdin
    up to the next newline or end-of-file
*/
void jsw_flush ( void )
{
    int ch;

    do
        ch = getchar();
    while ( ch != '\n' && ch != EOF );

    clearerr ( stdin );
}

/*
    Get a single character digit, '0'-'9'.
    Return 1 on success, 0 on input error,
    end-of-file, or invalid input
*/
int get_digit ( char *ch )
{
    char buf[2]; /* "%1[...]" stores one character plus a null terminator */
    int rc;

    printf ( "Enter a single digit: " );
    fflush ( stdout );

    rc = scanf ( "%1[0123456789]", buf ) == 1;

    /* At least a newline is left over; clear it all */
    jsw_flush();

    if ( rc )
        *ch = buf[0];

    return rc;
}

/* Populate the hash table */
void fill_table ( char table[] )
{
    char ch;

    while ( get_digit ( &ch ) ) {
        /* Create an index from the character */
        unsigned i = hash ( ch );

        if ( table[i] != 0 ) {
            /* Duplicate indexes are called collisions */
            printf ( "That index has been filled\n" );
        }
        else {
            /* The element is empty; fill it in */
            table[i] = ch;
        }
    }
}

/* Display the contents of the hash table */
void show_table ( const char table[], size_t size )
{
    size_t i;

    for ( i = 0; i < size; i++ )
        printf ( "%c\n", table[i] != 0 ? table[i] : '~' );
}

One simple way to handle a collision is to place the colliding key in a separate overflow table, with a cursor marking the end of the used portion:

Code:

struct hash_table {
    void **table;
    size_t size;
};

struct overflow_table {
    void **table;
    size_t size;
    size_t last;
};

int jsw_insert (
    struct hash_table *table,
    struct overflow_table *overflow,
    void *key, size_t size )
{
    /* Get an index for the key */
    size_t h = hash ( key, size ) % table->size;

    if ( table->table[h] != EMPTY ) {
        /* The index is being used, send the key to overflow */
        if ( overflow->last == overflow->size ) {
            /* The overflow is full, throw an error */
            return 0;
        }

        overflow->table[overflow->last++] = key;
    }
    else {
        /* The index is free; fill it in */
        table->table[h] = key;
    }

    return 1;
}

Rather than use a cursor to mark the end of the overflow table, we can also search from the beginning each time. This makes deletions simpler, but also complicates updating the overflow table:

Code:

struct hash_table {
    void **table;
    size_t size;
};

struct overflow_table {
    void **table;
    size_t size;
};

int jsw_insert (
    struct hash_table *table,
    struct overflow_table *overflow,
    void *key, size_t size )
{
    /* Get an index for the key */
    size_t h = hash ( key, size ) % table->size;

    if ( table->table[h] != EMPTY ) {
        size_t i;

        /* Find the first open overflow slot */
        for ( i = 0; i < overflow->size; i++ ) {
            if ( overflow->table[i] == EMPTY ) {
                overflow->table[i] = key;
                break;
            }
        }

        if ( i == overflow->size ) {
            /* There are no open slots; throw an error */
            return 0;
        }
    }
    else {
        /* The index is free; fill it in */
        table->table[h] = key;
    }

    return 1;
}

However, this solution is brittle at best because insertions will be slow if there is a collision with an overflow table that is large and mostly full. Searching will be slow for this same reason. If you allow deletions then what happens when searching for an item that overflowed, but the key in the hash table that caused the collision was deleted? In such a case you would have to search the overflow table even if a search in the hash table showed that that index was empty. So the most obvious method has subtle problems that would be awkward to fix. Fortunately, there are other ways to resolve collisions that are much easier to work with.

Separate chaining
An experienced programmer will probably consider using an array of linked lists as the hash table. When there is a collision, the colliding key can be pushed onto the list, thus preserving both values without a lot of awkwardness. When searching for a key, the index is hashed and then the list is searched, and deletion is as simple as deletion from a linked list. In theory this sounds like a good way to resolve collisions, and in practice, it is. The method is called separate chaining, where each linked list is called a chain. These chains are “separate” because they are not technically a part of the hash table, simply additions that are tacked on as necessary.
The immediate question is how should the list be structured? Since the performance of a search depends on the performance of searching a linked list, it makes sense in theory to keep the lists ordered. That way, for a minor cost during insertion, a successful and an unsuccessful search both look at only half of the list on average. If the lists are unordered, an unsuccessful search must walk the entire chain.
This sounds good in theory, but in practice it is more work than necessary, because with a good hash function the chains are likely to be short enough that the difference between an ordered and an unordered list is negligible. More importantly, if new keys are inserted at the front of the list, insertion is very fast and the chain behaves like a stack, which has desirable properties for frequency of access. In practice, front insertion is the easiest to implement and performs best with a hash function that minimizes collisions.
Another option that has been toyed with is using a binary search tree instead of a linked list for the chains. This is, much like the ordered list, more trouble than it is worth, but you are welcome to experiment. Another variation is a list with a header. My jsw_hlib uses this variation to ease gathering the chain length measurement.
A separately chained hash table is simply an array of linked lists. Once you index the array, it is just a matter of searching, inserting, or removing in a linked list. Assuming an unordered chain, the search is almost trivial...almost:

Code:

struct jsw_node {
    void *key;
    struct jsw_node *next;
};

struct hash_table {
    struct jsw_node **table;
    size_t size;
};

int jsw_find (
    struct hash_table *table,
    void *key, size_t size )
{
    /* Get an index for the key */
    size_t h = hash ( key, size ) % table->size;

    if ( table->table[h] != EMPTY ) {
        /* Find the matching key in the chain, if any */
        struct jsw_node *it = table->table[h];

        while ( it != NULL ) {
            if ( compare ( key, it->key ) == 0 )
                return 1;

            it = it->next;
        }
    }

    return 0;
}

Insertion into a separately chained hash table can be even simpler than the search if duplicates are allowed, as the chain does not need to be searched for a matching key first. Simply create a header if necessary and push a new node with the key and value onto the front. If duplicates are not allowed, an extra search needs to be performed to see if the key is already present in the chain:

Code:

struct jsw_node {
    void *key;
    struct jsw_node *next;
};

struct hash_table {
    struct jsw_node **table;
    size_t size;
};

int jsw_insert (
    struct hash_table *table,
    void *key, size_t size,
    int allow_duplicates )
{
    struct jsw_node *it;

    /* Get an index for the key */
    size_t h = hash ( key, size ) % table->size;

    if ( !allow_duplicates ) {
        /* Search for any duplicate keys */
        it = table->table[h];

        while ( it != NULL ) {
            if ( compare ( key, it->key ) == 0 ) {
                /* Stop the insertion if a duplicate is found */
                return 0;
            }

            it = it->next;
        }
    }

    /* Create a new node for the chain */
    it = malloc ( sizeof *it );

    if ( it == NULL )
        return 0;

    /* Attach the new node to the front of the chain */
    it->key = key;
    it->next = table->table[h];
    table->table[h] = it;

    return 1;
}

When removing from a separately chained hash table, a search is required for the old key. Fortunately, the search is the hardest part of the algorithm, and once the node is unlinked the key is truly gone, which is a welcome change from the other collision resolution strategies. However, keep in mind that if duplicates are allowed, your deletion code might need to remove all occurrences of the key rather than simply the first. For conciseness, I'll leave the remove_node function as an exercise. It's nothing more than a normal linked list removal algorithm:

Code:

struct jsw_node {
    void *key;
    struct jsw_node *next;
};

struct hash_table {
    struct jsw_node **table;
    size_t size;
};

int jsw_remove (
    struct hash_table *table,
    void *key, size_t size,
    int allow_duplicates )
{
    int item_found = 0;

    /* Get an index for the key */
    size_t h = hash ( key, size ) % table->size;

    if ( table->table[h] == NULL ) {
        /* The chain is empty, no more work needed */
        return 0;
    }

    /* Always try to remove the first occurrence */
    item_found = remove_node ( &table->table[h], key );

    if ( item_found && allow_duplicates ) {
        /* Try to remove all subsequent occurrences */
        while ( remove_node ( &table->table[h], key ) )
            ;

        /* At least one was found if we got here */
        item_found = 1;
    }

    return item_found;
}

Also keep in mind that this is a very basic algorithm and some fairly simple optimizations can improve the performance. For example, rather than searching from the beginning each time for removing duplicates, remove_node might be edited to give you a pointer to the node before the one that was deleted, such that it can be re-entrant and the removal can start again where it left off. Whether this is a practical change or not depends heavily on the average length of the chains in your hash table. If the average length is small, the extra code would likely be more of a handicap than a benefit.
Separate chaining is probably the most popular collision resolution strategy because it is so flexible. The concept is simple, and while the implementation may be more complicated than other methods, it has several advantages:

* The table has no hard size limit.
* Performance degrades gracefully with more collisions.
* The table easily handles duplicate keys.
* Deletion is simple and permanent.

Of course, there are disadvantages as well:

* Rebuilding the table if it is resized is slightly harder.
* Potentially more memory is used because of the links.
* Potentially slower because links need to be dereferenced.

More often than not, the advantages outweigh the disadvantages. Because the chains grow and handle collisions gracefully, the need to resize a hash table is greatly diminished. With a good hash function, you probably won't see any severe performance hits until the majority of the chains reach a length where working with them becomes a bottleneck. So lacking any application-specific requirements, a separately chained hash table is a good default when you need a hash table.
Open addressing
While separate chaining is tempting, there are ways to resolve collisions without resorting to any dynamically growing alternate data structure. These solutions resolve collisions by placing the colliding key into another index in a deterministic way. This process is then repeated for searches, so that any key can be found even if the actual index doesn't match the hashed index any longer. These methods are collectively known as “open addressing”.
Linear probing
By far the simplest open addressing strategy is the linear probe. When a collision is detected, the index is incremented (wrapping around to the beginning if it reaches the end) until an empty bucket is found. Typically, the increment is by 1, but other step sizes can be used as well, though more care should be taken because only a step size of 1 guarantees that all buckets will be encountered without extra precautions.
In linear probing, keys tend to cluster. That is, several parts of the table may become full quickly while others remain completely empty. Because linear probing expects a large number of empty buckets uniformly interspersed among the used buckets, clusters will cause a large number of buckets to be walked through before an empty bucket is found. This slows down search significantly, and in turn slows down insertion and removal as well. As the load factor increases, so does the effect of clustering. The following is an example of primary clustering:

Code:

Insert 426 at 8: | | | | | | | | |426| | |
Insert 921 at 8: | | | | | | | | |426|921| |
Insert 714 at 0: |714| | | | | | | |426|921| |
Insert 129 at 8: |714| | | | | | | |426|921|129|
Insert 514 at 8: |714|514| | | | | | |426|921|129|
Insert 287 at 1: |714|514|287| | | | | |426|921|129|

Notice how each insertion at index 8 takes longer and longer to find an empty bucket. This is primary clustering, and when the table becomes more than half full, it becomes a severe problem, drastically slowing down every operation. Primary clustering can be improved with a good hash function, but the best way to deal with it is to change the step size.
Searching with linear probing is, quite literally, trivial:

Code:

struct hash_table {
    void **table;
    size_t size;
};

int jsw_find (
    struct hash_table *table,
    void *key, size_t size )
{
    /* Get the initial index for the key */
    size_t h = hash ( key, size ) % table->size;

    /* Potentially infinite loop if the table is full */
    while ( table->table[h] != EMPTY ) {
        if ( compare ( key, table->table[h] ) == 0 )
            return 1;

        /* Move forward by 1, wrap if necessary */
        if ( ++h == table->size )
            h = 0;
    }

    return 0;
}

The wrap around is performed by incrementing the index and resetting it to 0 if it hits the table size. This is actually a naive implementation, and if you'd like you can use modulo arithmetic to the same effect with h = ( h + 1 ) % table->size. This may or may not be more efficient due to the cost of a division as opposed to the cost of a branch. In general you should refrain from worrying about such micro-optimizations and use whichever is clearer.
Also notice how a test for the table being full is not made. This might be done by saving table->table[h], and comparing each successive probe against it, or maintaining a counter of how many iterations have been made and stopping when that counter hits the table size. Any technique for finding out if you're at an index that you've already visited will suffice.
However, this test should only be present as a stopgap against exceptional failure. Open addressing schemes do not handle high load factors well and you should avoid filling the table completely. In fact, the table should be at most half full for a generally safe open addressing load factor. Some open addressing techniques allow more, but half full is a good time to start worrying.
The step size is almost always 1 with linear probing, but it is acceptable to use other step sizes as long as the step size is relatively prime to the table size so that every index is eventually visited. If this restriction isn't met, all of the indices may not be visited, and the search may fail or loop infinitely even if the key exists in the table.
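The relative primality rule is easy to verify empirically. The sketch below (the helper name is my own, not part of the article's code) counts how many distinct buckets a fixed-step probe visits before the sequence cycles back to its start:

```c
#include <stddef.h>

/* Count the distinct buckets a fixed-step probe visits
   before returning to its starting index */
size_t buckets_visited ( size_t step, size_t table_size )
{
    size_t h = 0;
    size_t count = 0;

    do {
        h = ( h + step ) % table_size;
        ++count;
    } while ( h != 0 );

    return count;
}
```

For a size 10 table, buckets_visited ( 3, 10 ) is 10 because 3 and 10 are relatively prime, while buckets_visited ( 2, 10 ) is only 5, so half of the buckets could never be probed.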
Insertion into a hash table with linear probing is simply a variation of searching, where the key is inserted into the first empty bucket. If the table does not allow duplicates, an early return should be made. This works because the probes will always end up going over a previously entered duplicate before hitting an empty slot. However, that does introduce a problem with deletion that we'll look at shortly:

Code:

struct hash_table {
    void **table;
    size_t size;
};

int jsw_insert (
    struct hash_table *table,
    void *key, size_t size,
    int allow_duplicates )
{
    /* Get the initial index for the key */
    size_t h = hash ( key, size ) % table->size;

    while ( table->table[h] != EMPTY ) {
        if ( !allow_duplicates && compare ( key, table->table[h] ) == 0 )
            return 0;

        if ( ++h == table->size )
            h = 0;
    }

    table->table[h] = key;

    return 1;
}

Removal in any open addressing strategy is where things get interesting. A direct removal is not possible because future probes could rely on the key being removed, and if an empty bucket is created where a full bucket once was, searches will incorrectly fail. All in all, the obvious way to remove from a hash table with open addressing will destroy the data structure and cause it to work improperly. With linear probing, it is possible to remove a key and then re-insert each of the keys in the same cluster, but that solution is somewhat complicated.
A more general solution that works for all of the open addressing schemes is to mark a bucket as deleted in such a way so that searches will not fail. This is a rather hackish solution, but it works well in practice:

Code:

struct hash_table {
    void **table;
    size_t size;
};

int jsw_remove (
    struct hash_table *table,
    void *key, size_t size,
    int allow_duplicates )
{
    int item_found = 0;

    /* Get the initial index for the key */
    size_t h = hash ( key, size ) % table->size;

    while ( table->table[h] != EMPTY ) {
        if ( compare ( key, table->table[h] ) == 0 ) {
            item_found = 1;
            table->table[h] = DELETED;

            if ( !allow_duplicates ) {
                /* No need to keep searching the cluster */
                break;
            }
        }

        if ( ++h == table->size )
            h = 0;
    }

    return item_found;
}

The catch to this is that if duplicate items are allowed, you can't just delete the first occurrence and be satisfied. You have to keep going through the entire cluster and make sure that all occurrences are marked as deleted. Only the current cluster needs to be searched though, because once you hit an empty slot, it's guaranteed that no duplicates could be probed beyond it.
Keep in mind that the insertion algorithm must also be modified to insert into deleted buckets rather than ignoring them for this technique to work properly. The change is quite easy. Note that because the removal algorithm takes care not to leave any duplicates, the insertion algorithm can safely assume that an EMPTY or DELETED slot marks a valid insertion point. Inserting in a DELETED slot won't leave an accidental duplicate further along the probe line.

Code:


struct hash_table {
    void **table;
    size_t size;
};

int jsw_insert (
    struct hash_table *table,
    void *key, size_t size,
    int allow_duplicates )
{
    /* Get the initial index for the key */
    size_t h = hash ( key, size ) % table->size;

    while ( table->table[h] != EMPTY && table->table[h] != DELETED ) {
        if ( !allow_duplicates && compare ( key, table->table[h] ) == 0 )
            return 0;

        if ( ++h == table->size )
            h = 0;
    }

    table->table[h] = key;

    return 1;
}

Occasionally these deleted items will begin to fill the table as they take the place of real keys. In that case, the deleted items must be removed and the entire table rebuilt from scratch. We will cover this operation shortly.
Quadratic probing
In an attempt to alleviate the problem of primary clustering, a non-constant step size can be used. In general, it has been found that a quadratically growing step size works well. Simply start with a step size of 1 and quadratically increase the step size until the desired index is found. This strategy is called quadratic probing, and the search algorithm is only slightly more complicated than linear probing, and insertion and deletion are equally simple changes. Just add a step and use it to increase the index:

Code:

struct hash_table {
    void **table;
    size_t size;
};

int jsw_find (
    struct hash_table *table,
    void *key, size_t size )
{
    size_t step;

    /* Get the initial index for the key */
    size_t h = hash ( key, size ) % table->size;

    for ( step = 1; table->table[h] != EMPTY; step++ ) {
        if ( compare ( key, table->table[h] ) == 0 )
            return 1;

        /* Move forward quadratically, wrap if necessary */
        h = ( h + ( step * step - step ) / 2 ) % table->size;
    }

    return 0;
}

Quadratic probing alleviates primary clustering because successive probes are no longer separated by a small constant; the probe sequence covers buckets that are far apart. However, because the probe sequence is always the same from any given bucket, the effect of secondary clustering can be seen. Secondary clustering is not nearly as big a problem as primary clustering, and the slowdown is hardly noticeable because quadratic probing only works if the load factor is less than .5 and the table size is prime, both of which go a long way toward reducing clustering anyway.
In general, quadratic probing is fast and avoids primary clustering, and it is a marked improvement over linear probing, but quadratic probing is restrictive in basic implementations and overcoming those restrictions can be tricky. For example, basic quadratic probing is only guaranteed to work if the table size is prime and the load factor is less than .5. Beyond that you are on your own. While this is a nice step forward, double hashing is generally a better method for open addressing because it avoids primary clustering without the annoying restrictions.
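To see the sequence the quadratic probe generates, the triangular increments from the search code can be accumulated directly. This sketch (the function name is mine, for illustration only) returns the offset from the home bucket after a given number of increments:

```c
#include <stddef.h>

/* Offset from the home bucket after the first n quadratic
   increments, mirroring the update in the search loop:
   h = ( h + ( step * step - step ) / 2 ) % table_size */
size_t quadratic_offset ( size_t n, size_t table_size )
{
    size_t h = 0;
    size_t step;

    for ( step = 1; step <= n; step++ )
        h = ( h + ( step * step - step ) / 2 ) % table_size;

    return h;
}
```

The offsets grow as 0, 1, 4, 10, and so on, quickly spreading the probes apart. Notice that the very first increment is 0, so the search loop probes the home bucket twice before moving on; a harmless quirk of starting the step at 1.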
Double hashing
Double hashing uses two independent hash functions. The first hash function is used as usual, and the second hash function is used to create a step size. Because the key itself determines the step size, primary clustering is avoided. The algorithm is simple, but two rules must be adhered to for double hashing to work correctly: First, the second hash cannot ever return 0 or there will be an infinite loop. Second, much like with linear probing, the table size must either be prime, or a power of two with the second hash returning an odd number. Beyond that, the implementation is similar to the other open addressing methods discussed so far. So similar, in fact, that I'll show only the search algorithm. The insertion and removal algorithms are identical to linear probing with the step changes, just like quadratic probing:

Code:

struct hash_table {
    void **table;
    size_t size;
};

int jsw_find (
    struct hash_table *table,
    void *key, size_t size )
{
    /* Get the initial index for the key */
    size_t h = hash ( key, size ) % table->size;

    /* Get the step size */
    size_t step = hash2 ( key ) % ( table->size - 1 ) + 1;

    while ( table->table[h] != EMPTY ) {
        if ( compare ( key, table->table[h] ) == 0 )
            return 1;

        /* Move forward by the step size, wrap if necessary */
        h = ( h + step ) % table->size;
    }

    return 0;
}
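When the table size is a power of two rather than a prime, the usual way to satisfy the odd-step rule is to force the low bit of the second hash on, since any odd number is relatively prime to a power of two. A minimal sketch (the helper name is my own, not jsw_hlib's API):

```c
#include <stddef.h>

/* Derive a valid double hashing step for a power-of-two
   table: setting the low bit makes the step odd, so it can
   never be 0 and never shares a factor with the table size */
size_t jsw_step ( size_t h2 )
{
    return h2 | 1;
}
```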

Coalesced chaining
Note: This is a modification of my Wikipedia article on Coalesced Hashing.
Coalesced chaining is a lesser known strategy that forms a hybrid of separate chaining and open addressing. In a separately chained hash table, items that hash to the same index are placed on a list at that index. This can result in a great deal of wasted memory because the table itself must be large enough to maintain a load factor that performs well (typically twice the expected number of items), and extra memory must be used for all but the first item in a chain (unless list headers are used, in which case extra memory must be used for all items in a chain).
Given a sequence of randomly generated three character long strings, the following table might be generated with a table of size 10:

Code:

(null)
"clq"
"qur"
(null)
(null)
"dim"
"aty" --> "qsu"
"rhv"
"qrj" --> "ofu" --> "gcl" --> "ecd"
(null)

This strategy is effective and very easy to implement. However, sometimes the extra memory use might be prohibitive, and the most common alternative using open addressing has uncomfortable disadvantages when it comes to primary clustering. Coalesced hashing uses a similar technique as separate chaining, but instead of allocating new nodes for the linked list, buckets in the table are used, just like open addressing. The first empty bucket in the table is considered a collision bucket. When a collision occurs anywhere in the table, the key is placed in the collision bucket and a link is made between the colliding index and the collision bucket. Then the next empty bucket is searched for to give the next collision bucket.
Because the collision bucket moves in a predictable pattern unrelated to how the keys are hashed, the effect of primary clustering is avoided, and search times remain efficient. The lists are interleaved around one another, so they are said to coalesce, hence the name, coalesced chaining. The code to insert into a coalesced hash table is surprisingly short for how convoluted the concept seems at first.
Code:


struct jsw_node {
    void *key;
    struct jsw_node *next;
};

struct hash_table {
    struct jsw_node **table;
    size_t size;
    size_t cursor;
};

int jsw_insert (
    struct hash_table *table,
    void *key, size_t size )
{
    struct jsw_node *node;

    /* Get an index for the key */
    size_t h = hash ( key, size ) % table->size;

    /* Wrap the key in a node */
    node = malloc ( sizeof *node );

    if ( node == NULL )
        return 0;

    node->key = key;
    node->next = NULL;

    if ( table->table[h] == EMPTY ) {
        /* The slot is empty, no more work needed */
        table->table[h] = node;
    }
    else {
        struct jsw_node *it;

        /* Find the next open slot (the collision bucket) */
        while ( table->cursor < table->size
            && table->table[table->cursor] != EMPTY )
        {
            ++table->cursor;
        }

        if ( table->cursor == table->size ) {
            /* The table is full; throw an error */
            free ( node );
            return 0;
        }

        table->table[table->cursor] = node;

        /* Find the end of the coalesced chain */
        for ( it = table->table[h]; it->next != NULL; it = it->next )
            ;

        it->next = table->table[table->cursor];
    }

    return 1;
}

Search with coalesced chaining is identical to separate chaining, but removal is not as simple. In fact, because coalesced chaining is technically an open addressing strategy, a sentinel marking the bucket as deleted must be used, just as with the other open addressing schemes.
Rehashing
Occasionally, a table needs to be resized, or deleted sentinel items need to be removed to make way for real keys. Unfortunately, the only way to do this is to take every key out of the table and rebuild it completely from scratch. Naturally, this process is very slow compared to other operations, so it should be performed very rarely. An example of rehashing can be found in the jsw_hlib source code.
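As a rough sketch of what a rehash involves, the following rebuilds an open addressed table into a fresh array, dropping deleted sentinels along the way. The EMPTY and DELETED conventions, the hash_index callback, the demo_index toy hash, and the linear probing re-insertion are illustrative assumptions of mine, not the actual jsw_hlib code:

```c
#include <stddef.h>
#include <stdlib.h>

#define EMPTY NULL

/* A unique sentinel address that no real key can share
   (assumed convention for this sketch) */
static void *deleted_sentinel;
#define DELETED ( (void *)&deleted_sentinel )

struct hash_table {
    void **table;
    size_t size;
};

/* A toy index function for demonstration only: the first
   byte of a string key modulo the table size */
size_t demo_index ( void *key, size_t size )
{
    return (size_t)( (unsigned char)*(char *)key ) % size;
}

/* Rebuild the table into a new array of new_size buckets,
   re-inserting every live key and discarding sentinels */
int jsw_rehash ( struct hash_table *table, size_t new_size,
    size_t (*hash_index) ( void *key, size_t size ) )
{
    void **fresh = calloc ( new_size, sizeof *fresh );
    size_t i;

    if ( fresh == NULL )
        return 0;

    for ( i = 0; i < table->size; i++ ) {
        void *key = table->table[i];

        if ( key != EMPTY && key != DELETED ) {
            /* Re-insert with linear probing */
            size_t h = hash_index ( key, new_size );

            while ( fresh[h] != EMPTY ) {
                if ( ++h == new_size )
                    h = 0;
            }

            fresh[h] = key;
        }
    }

    free ( table->table );
    table->table = fresh;
    table->size = new_size;

    return 1;
}
```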
Performance
The performance properties of hash tables are well known. In my quest to avoid complex mathematical calculations in my tutorials, and thus cater to the other 99% of programmers, I will summarize.
A hash table has O(1) performance in the best case. The majority of the work in implementing a hash table is ensuring that this best case is also the average case by choosing a good hash function that minimizes collisions. However, it is more informative to say that the average expected performance of a hash table is somewhere between O(1) and O(log N) with a good hash function, and there is a strong bias toward O(1).
The worst case for a hash table is O(N), and there is no way to avoid the worst case using a basic hash table. If guaranteed good performance is required then a more suitable structure should be used, such as a deterministic skip list or balanced binary search tree.
The biggest factor in the performance of a hash table, if the hash function is good, is the load factor of the table. The load factor is defined as the result of dividing the number of used buckets by the size of the table, or M/N where M is the number of used buckets and N is the table size. All collision resolution strategies perform equally well when the load factor is .5, which is the optimum load factor most of the time, and each begins to degrade as the load factor increases.
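In code, the M/N formula and the half-full guideline amount to very little. A minimal sketch, with helper names of my own:

```c
#include <stddef.h>

/* Load factor M/N: used buckets divided by table size */
double jsw_load_factor ( size_t used, size_t size )
{
    return (double)used / (double)size;
}

/* The half-full guideline: grow once the load
   factor climbs above .5 */
int jsw_should_grow ( size_t used, size_t size )
{
    return jsw_load_factor ( used, size ) > 0.5;
}
```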
The general recommendation is to maintain a .5 load factor when possible, and never to go above a .7 load factor. However, if memory is scarce, separately chained hash tables degrade more slowly and more gracefully than open addressing tables, and you can have higher load factors. Open addressing schemes are limited by this guideline because of the pronounced slowdown with larger load factors, and the implementations of open addressing typically assume that the table is largely empty.
However, separately chained schemes are not bound as tightly to the guideline, and a table size of the expected number of keys will not perform as badly as it would with an open addressing strategy. The search times will simply be a bit slower due to longer chains. For the speed conscious, more empty buckets means shorter chains, so to ensure good performance, a larger table is desirable.

Comparison
Of these five collision resolution strategies, which one is best? Much like everything in computer science, it depends. Typically the choice is between separate chaining and double hashing because those two support the most flexibility and the best performance properties. Linear probing breaks down too easily because of primary clustering, quadratic probing is too restrictive, and coalesced chaining is somewhat of the worst of both worlds, where the complexity of chains is coupled with the disadvantages of open addressing. In the end a choice should be made on a case by case basis.

Disadvantages
Hash tables are not a silver bullet. If the hash function is poor then there will be excessive collisions and performance will quickly approach O(N). On top of this, some hash functions are slower than others, and a slow hash function could increase the constant for O(1) due to expensive operations and the hash table could be slower than other data structures that are theoretically slower.
Of the collision resolution strategies discussed, only separate chaining can easily handle deletion of a key. The open addressing schemes do not lend themselves well to deleting a key because other keys may rely on it as part of a search probe sequence.
Separate chaining is also the only collision resolution strategy that can easily handle duplicate keys. A few moments' thought will show this to be true: to find duplicates in a chain, only that one chain must be searched, whereas in an open addressed table the entire cluster must be searched with linear probing, and potentially the entire table with non-linear probing!
Hash tables are based on arrays, which are notoriously difficult to resize once they have been allocated. As such, a hash table is said to have a constant size unless you want to implement the table as a dynamic array and rehash all of the keys whenever the size changes. This is an incredibly expensive operation if the table is even remotely large and should be avoided as much as possible by increasing the size of the table by a fairly large amount (such as half again the current table size, or N * 1.5). That way the reorganizations are minimal for real-time applications and can be amortized into other operations for other applications.
The keys in a table need to be rehashed when the size is changed because each insertion and subsequent search depends on the table size to give the correct result. If the table size changes, the indices for all of the previously hashed keys are invalidated because future searches will no longer work correctly.
Because a hash table is an unordered data structure, certain operations are difficult and expensive. Range queries, selection, and sorted traversals are possible only if the keys are copied into a sorted data structure. There are hash table implementations that keep the keys in order, but they are far from efficient.
A solution to this problem is a hybrid data structure called a hashlist. The idea is to use a sorted data structure such as a binary search tree or sorted linked list as the primary storage unit. Then the keys in the primary structure are hashed and references to them are stored in the hash table. This has excellent performance properties, but in practice the hashlist can be more work than it is worth.
The keys in a hash table are ideally spread in a uniform, pseudorandom pattern throughout the table. This is a good thing in theory, but in practice, processors will create a cache of frequently accessed memory under the assumption that adjacent memory is more likely to be accessed next than non-adjacent memory. As such, unless a fix like the hashlist is used, searching in a hash table may result in excessive cache misses for medium to large tables because the jumping around performed by a hash table does not fit the usage pattern for efficient caching. For smaller tables that fit completely in a cache line, any performance problems are unlikely to be noticeable.

Conclusion
Hash tables are a powerful and flexible data structure for unordered searching and storage. There are subtle issues when resolving collisions, but all in all the hash table is easy to implement and provides near constant performance in the average case.