mirror of https://git.savannah.gnu.org/git/gperf.git synced 2025-12-02 13:09:22 +00:00

Regenerated for 3.0.

Bruno Haible
2003-05-07 13:35:57 +00:00
parent 7f2d2ba065
commit c4b3453ac3
14 changed files with 2161 additions and 1032 deletions

@@ -1,21 +1,27 @@
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.022. .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.23.
.TH GPERF "1" "September 2000" "GNU gperf 2.7.2" FSF .TH GPERF "1" "May 2003" "GNU gperf 3.0" FSF
.SH NAME .SH NAME
gperf \- generate a perfect hash function from a key set gperf \- generate a perfect hash function from a key set
.SH SYNOPSIS .SH SYNOPSIS
.B gperf .B gperf
[\fIOPTION\fR]... [\fIINPUT-FILE\fR] [\fIOPTION\fR]... [\fIINPUT-FILE\fR]
.SH DESCRIPTION .SH DESCRIPTION
GNU `gperf' generates perfect hash functions. GNU 'gperf' generates perfect hash functions.
.PP .PP
If a long option shows an argument as mandatory, then it is mandatory If a long option shows an argument as mandatory, then it is mandatory
for the equivalent short option also. for the equivalent short option also.
.SS "Output file location:"
.HP
\fB\-\-output\-file\fR=\fIFILE\fR Write output to specified file.
.PP
The results are written to standard output if no output file is specified
or if it is -.
.SS "Input file interpretation:" .SS "Input file interpretation:"
.TP .TP
\fB\-e\fR, \fB\-\-delimiters\fR=\fIDELIMITER\-LIST\fR \fB\-e\fR, \fB\-\-delimiters\fR=\fIDELIMITER\-LIST\fR
Allow user to provide a string containing delimiters Allow user to provide a string containing delimiters
used to separate keywords from their attributes. used to separate keywords from their attributes.
Default is ",\en". Default is ",".
.TP .TP
\fB\-t\fR, \fB\-\-struct\-type\fR \fB\-t\fR, \fB\-\-struct\-type\fR
Allows the user to include a structured type Allows the user to include a structured type
@@ -23,6 +29,11 @@ declaration for generated code. Any text before %%
is considered part of the type declaration. Key is considered part of the type declaration. Key
words and additional fields may follow this, one words and additional fields may follow this, one
group of fields per line. group of fields per line.
.TP
\fB\-\-ignore\-case\fR
Consider upper and lower case ASCII characters as
equivalent. Note that locale dependent case mappings
are ignored.
.SS "Language for the output code:" .SS "Language for the output code:"
.TP .TP
\fB\-L\fR, \fB\-\-language\fR=\fILANGUAGE\-NAME\fR \fB\-L\fR, \fB\-\-language\fR=\fILANGUAGE\-NAME\fR
@@ -39,21 +50,27 @@ structure.
Initializers for additional components in the keyword Initializers for additional components in the keyword
structure. structure.
.TP .TP
\fB\-H\fR, \fB\-\-hash\-fn\-name\fR=\fINAME\fR \fB\-H\fR, \fB\-\-hash\-function\-name\fR=\fINAME\fR
Specify name of generated hash function. Default is Specify name of generated hash function. Default is
`hash'. \&'hash'.
.TP .TP
\fB\-N\fR, \fB\-\-lookup\-fn\-name\fR=\fINAME\fR \fB\-N\fR, \fB\-\-lookup\-function\-name\fR=\fINAME\fR
Specify name of generated lookup function. Default Specify name of generated lookup function. Default
name is `in_word_set'. name is 'in_word_set'.
.TP .TP
\fB\-Z\fR, \fB\-\-class\-name\fR=\fINAME\fR \fB\-Z\fR, \fB\-\-class\-name\fR=\fINAME\fR
Specify name of generated C++ class. Default name is Specify name of generated C++ class. Default name is
`Perfect_Hash'. \&'Perfect_Hash'.
.TP .TP
\fB\-7\fR, \fB\-\-seven\-bit\fR \fB\-7\fR, \fB\-\-seven\-bit\fR
Assume 7-bit characters. Assume 7-bit characters.
.TP .TP
\fB\-l\fR, \fB\-\-compare\-lengths\fR
Compare key lengths before trying a string
comparison. This is necessary if the keywords
contain NUL bytes. It also helps cut down on the
number of string comparisons made during the lookup.
.TP
\fB\-c\fR, \fB\-\-compare\-strncmp\fR \fB\-c\fR, \fB\-\-compare\-strncmp\fR
Generate comparison code using strncmp rather than Generate comparison code using strncmp rather than
strcmp. strcmp.
@@ -70,14 +87,27 @@ lookup function rather than with defines.
Include the necessary system include file <string.h> Include the necessary system include file <string.h>
at the beginning of the code. at the beginning of the code.
.TP .TP
\fB\-G\fR, \fB\-\-global\fR \fB\-G\fR, \fB\-\-global\-table\fR
Generate the static table of keywords as a static Generate the static table of keywords as a static
global variable, rather than hiding it inside of the global variable, rather than hiding it inside of the
lookup function (which is the default behavior). lookup function (which is the default behavior).
.TP .TP
\fB\-P\fR, \fB\-\-pic\fR
Optimize the generated table for inclusion in shared
libraries. This reduces the startup time of programs
using a shared library containing the generated code.
.TP
\fB\-Q\fR, \fB\-\-string\-pool\-name\fR=\fINAME\fR
Specify name of string pool generated by option \fB\-\-pic\fR.
Default name is 'stringpool'.
.TP
\fB\-\-null\-strings\fR
Use NULL strings instead of empty strings for empty
keyword table entries.
.TP
\fB\-W\fR, \fB\-\-word\-array\-name\fR=\fINAME\fR \fB\-W\fR, \fB\-\-word\-array\-name\fR=\fINAME\fR
Specify name of word list array. Default name is Specify name of word list array. Default name is
`wordlist'. \&'wordlist'.
.TP .TP
\fB\-S\fR, \fB\-\-switch\fR=\fICOUNT\fR \fB\-S\fR, \fB\-\-switch\fR=\fICOUNT\fR
Causes the generated C code to use a switch Causes the generated C code to use a switch
@@ -99,30 +129,23 @@ defined elsewhere.
.TP .TP
\fB\-k\fR, \fB\-\-key\-positions\fR=\fIKEYS\fR \fB\-k\fR, \fB\-\-key\-positions\fR=\fIKEYS\fR
Select the key positions used in the hash function. Select the key positions used in the hash function.
The allowable choices range between 1-126, inclusive. The allowable choices range between 1-255, inclusive.
The positions are separated by commas, ranges may be The positions are separated by commas, ranges may be
used, and key positions may occur in any order. used, and key positions may occur in any order.
Also, the meta-character '*' causes the generated Also, the meta-character '*' causes the generated
hash function to consider ALL key positions, and $ hash function to consider ALL key positions, and $
indicates the ``final character'' of a key, e.g., indicates the "final character" of a key, e.g.,
$,1,2,4,6-10. $,1,2,4,6-10.
.TP .TP
\fB\-l\fR, \fB\-\-compare\-strlen\fR
Compare key lengths before trying a string
comparison. This helps cut down on the number of
string comparisons made during the lookup.
.TP
\fB\-D\fR, \fB\-\-duplicates\fR \fB\-D\fR, \fB\-\-duplicates\fR
Handle keywords that hash to duplicate values. This Handle keywords that hash to duplicate values. This
is useful for certain highly redundant keyword sets. is useful for certain highly redundant keyword sets.
.TP .TP
\fB\-f\fR, \fB\-\-fast\fR=\fIITERATIONS\fR \fB\-m\fR, \fB\-\-multiple\-iterations\fR=\fIITERATIONS\fR
Generate the gen-perf.hash function ``fast''. This Perform multiple choices of the \fB\-i\fR and \fB\-j\fR values,
decreases gperf's running time at the cost of and choose the best results. This increases the
minimizing generated table size. The numeric running time by a factor of ITERATIONS but does a
argument represents the number of times to iterate good job minimizing the generated table size.
when resolving a collision. `0' means ``iterate by
the number of keywords''.
.TP .TP
\fB\-i\fR, \fB\-\-initial\-asso\fR=\fIN\fR \fB\-i\fR, \fB\-\-initial\-asso\fR=\fIN\fR
Provide an initial value for the associate values Provide an initial value for the associate values
@@ -130,7 +153,7 @@ array. Default is 0. Setting this value larger helps
inflate the size of the final table. inflate the size of the final table.
.TP .TP
\fB\-j\fR, \fB\-\-jump\fR=\fIJUMP\-VALUE\fR \fB\-j\fR, \fB\-\-jump\fR=\fIJUMP\-VALUE\fR
Affects the ``jump value'', i.e., how far to advance Affects the "jump value", i.e., how far to advance
the associated character value upon collisions. Must the associated character value upon collisions. Must
be an odd number, default is 5. be an odd number, default is 5.
.TP .TP
@@ -138,25 +161,20 @@ be an odd number, default is 5.
Do not include the length of the keyword when Do not include the length of the keyword when
computing the hash function. computing the hash function.
.TP .TP
\fB\-o\fR, \fB\-\-occurrence\-sort\fR
Reorders input keys by frequency of occurrence of
the key sets. This should decrease the search time
dramatically.
.TP
\fB\-r\fR, \fB\-\-random\fR \fB\-r\fR, \fB\-\-random\fR
Utilizes randomness to initialize the associated Utilizes randomness to initialize the associated
values table. values table.
.TP .TP
\fB\-s\fR, \fB\-\-size\-multiple\fR=\fIN\fR \fB\-s\fR, \fB\-\-size\-multiple\fR=\fIN\fR
Affects the size of the generated hash table. The Affects the size of the generated hash table. The
numeric argument N indicates ``how many times larger numeric argument N indicates "how many times larger
or smaller'' the associated value range should be, or smaller" the associated value range should be,
in relationship to the number of keys, e.g. a value in relationship to the number of keys, e.g. a value
of 3 means ``allow the maximum associated value to of 3 means "allow the maximum associated value to
be about 3 times larger than the number of input be about 3 times larger than the number of input
keys.'' Conversely, a value of \fB\-3\fR means ``make the keys". Conversely, a value of 1/3 means "make the
maximum associated value about 3 times smaller than maximum associated value about 3 times smaller than
the number of input keys. A larger table should the number of input keys". A larger table should
decrease the time required for an unsuccessful decrease the time required for an unsuccessful
search, at the expense of extra table space. Default search, at the expense of extra table space. Default
value is 1. value is 1.
@@ -171,8 +189,15 @@ Print the gperf version number.
\fB\-d\fR, \fB\-\-debug\fR \fB\-d\fR, \fB\-\-debug\fR
Enables the debugging option (produces verbose Enables the debugging option (produces verbose
output to the standard error). output to the standard error).
.SH AUTHOR
Written by Douglas C. Schmidt and Bruno Haible.
.SH "REPORTING BUGS" .SH "REPORTING BUGS"
Report bugs to <bug-gnu-utils@gnu.org>. Report bugs to <bug-gnu-gperf@gnu.org>.
.SH COPYRIGHT
Copyright \(co 1989-1998, 2000-2003 Free Software Foundation, Inc.
.br
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
.SH "SEE ALSO" .SH "SEE ALSO"
The full documentation for The full documentation for
.B gperf .B gperf
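
The `--key-positions` syntax described in the hunk above (positions 1-255, comma-separated, ranges allowed, any order, `*` for all positions, `$` for the final character) can be illustrated with a small parser. This is only a sketch in Python, not gperf's own option parser: the names `parse_key_positions` and `selected_chars` are hypothetical, and skipping positions that lie beyond the end of a keyword is an assumption made for the example.

```python
LAST_CHAR = "$"  # marker for "final character of the key", kept as written


def parse_key_positions(spec, max_pos=255):
    """Parse a string such as "$,1,2,4,6-10".

    Returns (use_all, positions). use_all is True when '*' appears.
    positions lists the numeric positions in ascending order, followed
    by "$" if the final character was requested.
    """
    use_all = False
    positions = set()
    for item in spec.split(","):
        item = item.strip()
        if item == "*":
            use_all = True
        elif item == LAST_CHAR:
            positions.add(LAST_CHAR)
        elif "-" in item:
            low_text, high_text = item.split("-", 1)
            low, high = int(low_text), int(high_text)
            if not 1 <= low <= high <= max_pos:
                raise ValueError(f"bad range: {item!r}")
            positions.update(range(low, high + 1))
        else:
            pos = int(item)  # raises ValueError on non-numeric input
            if not 1 <= pos <= max_pos:
                raise ValueError(f"position out of range: {item!r}")
            positions.add(pos)
    numeric = sorted(p for p in positions if p != LAST_CHAR)
    ordered = numeric + ([LAST_CHAR] if LAST_CHAR in positions else [])
    return use_all, ordered


def selected_chars(keyword, positions):
    """Pick the characters of keyword named by positions (1-based).

    Assumption for this sketch: positions past the end are skipped.
    """
    chars = []
    for pos in positions:
        if pos == LAST_CHAR:
            if keyword:
                chars.append(keyword[-1])
        elif pos <= len(keyword):
            chars.append(keyword[pos - 1])
    return chars
```

For the man page's own example, `parse_key_positions("$,1,2,4,6-10")` yields the positions 1, 2, 4, 6 through 10, and then `$`.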

File diff suppressed because it is too large

File diff suppressed because it is too large

@@ -1,12 +1,12 @@
<HTML> <HTML>
<HEAD> <HEAD>
<!-- This HTML file has been created by texi2html 1.51 <!-- This HTML file has been created by texi2html 1.51
from gperf.texi on 26 September 2000 --> from gperf.texi on 7 May 2003 -->
<TITLE>Perfect Hash Function Generator - GNU GENERAL PUBLIC LICENSE</TITLE> <TITLE>Perfect Hash Function Generator - GNU GENERAL PUBLIC LICENSE</TITLE>
</HEAD> </HEAD>
<BODY> <BODY>
Go to the first, previous, <A HREF="gperf_2.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the first, previous, <A HREF="gperf_2.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
<P><HR><P> <P><HR><P>
@@ -455,6 +455,6 @@ Public License instead of this License.
</P> </P>
<P><HR><P> <P><HR><P>
Go to the first, previous, <A HREF="gperf_2.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the first, previous, <A HREF="gperf_2.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
</BODY> </BODY>
</HTML> </HTML>

@@ -1,82 +1,104 @@
<HTML> <HTML>
<HEAD> <HEAD>
<!-- This HTML file has been created by texi2html 1.51 <!-- This HTML file has been created by texi2html 1.51
from gperf.texi on 26 September 2000 --> from gperf.texi on 7 May 2003 -->
<TITLE>Perfect Hash Function Generator - 8 Bibliography</TITLE> <TITLE>Perfect Hash Function Generator - Concept Index</TITLE>
</HEAD> </HEAD>
<BODY> <BODY>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_9.html">previous</A>, <A HREF="gperf_11.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_9.html">previous</A>, next, last section, <A HREF="gperf_toc.html">table of contents</A>.
<P><HR><P> <P><HR><P>
<H1><A NAME="SEC23" HREF="gperf_toc.html#TOC23">8 Bibliography</A></H1> <H1><A NAME="SEC28" HREF="gperf_toc.html#TOC28">Concept Index</A></H1>
<P> <P>
[1] Chang, C.C.: <I>A Scheme for Constructing Ordered Minimal Perfect <H2>%</H2>
Hashing Functions</I> Information Sciences 39(1986), 187-195. <DIR>
<LI><A HREF="gperf_5.html#IDX8"><SAMP>`%%'</SAMP></A>
[2] Cichelli, Richard J. <I>Author's Response to "On Cichelli's Minimal Perfect Hash <LI><A HREF="gperf_5.html#IDX18"><SAMP>`%7bit'</SAMP></A>
Functions Method"</I> Communications of the ACM, 23, 12(December 1980), 729. <LI><A HREF="gperf_5.html#IDX19"><SAMP>`%compare-lengths'</SAMP></A>
<LI><A HREF="gperf_5.html#IDX20"><SAMP>`%compare-strncmp'</SAMP></A>
[3] Cichelli, Richard J. <I>Minimal Perfect Hash Functions Made Simple</I> <LI><A HREF="gperf_5.html#IDX17"><SAMP>`%define class-name'</SAMP></A>
Communications of the ACM, 23, 1(January 1980), 17-19. <LI><A HREF="gperf_5.html#IDX15"><SAMP>`%define hash-function-name'</SAMP></A>
<LI><A HREF="gperf_5.html#IDX14"><SAMP>`%define initializer-suffix'</SAMP></A>
[4] Cook, C. R. and Oldehoeft, R.R. <I>A Letter Oriented Minimal <LI><A HREF="gperf_5.html#IDX16"><SAMP>`%define lookup-function-name'</SAMP></A>
Perfect Hashing Function</I> SIGPLAN Notices, 17, 9(September 1982), 18-27. <LI><A HREF="gperf_5.html#IDX13"><SAMP>`%define slot-name'</SAMP></A>
<LI><A HREF="gperf_5.html#IDX26"><SAMP>`%define string-pool-name'</SAMP></A>
</P> <LI><A HREF="gperf_5.html#IDX28"><SAMP>`%define word-array-name'</SAMP></A>
<P> <LI><A HREF="gperf_5.html#IDX9"><SAMP>`%delimiters'</SAMP></A>
[5] Cormack, G. V. and Horspool, R. N. S. and Kaiserwerth, M. <LI><A HREF="gperf_5.html#IDX22"><SAMP>`%enum'</SAMP></A>
<I>Practical Perfect Hashing</I> Computer Journal, 28, 1(January 1985), 54-58. <LI><A HREF="gperf_5.html#IDX24"><SAMP>`%global-table'</SAMP></A>
<LI><A HREF="gperf_5.html#IDX11"><SAMP>`%ignore-case'</SAMP></A>
[6] Jaeschke, G. <I>Reciprocal Hashing: A Method for Generating Minimal <LI><A HREF="gperf_5.html#IDX23"><SAMP>`%includes'</SAMP></A>
Perfect Hashing Functions</I> Communications of the ACM, 24, 12(December <LI><A HREF="gperf_5.html#IDX12"><SAMP>`%language'</SAMP></A>
1981), 829-833. <LI><A HREF="gperf_5.html#IDX27"><SAMP>`%null-strings'</SAMP></A>
<LI><A HREF="gperf_5.html#IDX30"><SAMP>`%omit-struct-type'</SAMP></A>
</P> <LI><A HREF="gperf_5.html#IDX25"><SAMP>`%pic'</SAMP></A>
<P> <LI><A HREF="gperf_5.html#IDX21"><SAMP>`%readonly-tables'</SAMP></A>
[7] Jaeschke, G. and Osterburg, G. <I>On Cichelli's Minimal Perfect <LI><A HREF="gperf_5.html#IDX10"><SAMP>`%struct-type'</SAMP></A>
Hash Functions Method</I> Communications of the ACM, 23, 12(December 1980), <LI><A HREF="gperf_5.html#IDX29"><SAMP>`%switch'</SAMP></A>
728-729. <LI><A HREF="gperf_5.html#IDX31"><SAMP>`%{'</SAMP></A>
<LI><A HREF="gperf_5.html#IDX32"><SAMP>`%}'</SAMP></A>
</P> </DIR>
<P> <H2>a</H2>
[8] Sager, Thomas J. <I>A Polynomial Time Generator for Minimal Perfect <DIR>
Hash Functions</I> Communications of the ACM, 28, 5(December 1985), 523-532 <LI><A HREF="gperf_6.html#IDX42">Array name</A>
</DIR>
</P> <H2>b</H2>
<P> <DIR>
[9] Schmidt, Douglas C. <I>GPERF: A Perfect Hash Function Generator</I> <LI><A HREF="gperf_2.html#IDX1">Bugs</A>
Second USENIX C++ Conference Proceedings, April 1990. </DIR>
<H2>c</H2>
</P> <DIR>
<P> <LI><A HREF="gperf_6.html#IDX41">Class name</A>
[10] Sebesta, R.W. and Taylor, M.A. <I>Minimal Perfect Hash Functions </DIR>
for Reserved Word Lists</I> SIGPLAN Notices, 20, 12(September 1985), 47-53. <H2>d</H2>
<DIR>
</P> <LI><A HREF="gperf_5.html#IDX5">Declaration section</A>
<P> <LI><A HREF="gperf_6.html#IDX38">Delimiters</A>
[11] Sprugnoli, R. <I>Perfect Hashing Functions: A Single Probe <LI><A HREF="gperf_6.html#IDX44">Duplicates</A>
Retrieving Method for Static Sets</I> Communications of the ACM, 20 </DIR>
11(November 1977), 841-850. <H2>f</H2>
<DIR>
</P> <LI><A HREF="gperf_5.html#IDX4">Format</A>
<P> <LI><A HREF="gperf_5.html#IDX7">Functions section</A>
[12] Stallman, Richard M. <I>Using and Porting GNU CC</I> Free Software Foundation, </DIR>
1988. <H2>h</H2>
<DIR>
</P> <LI><A HREF="gperf_5.html#IDX34">hash</A>
<P> <LI><A HREF="gperf_5.html#IDX33">hash table</A>
[13] Stroustrup, Bjarne <I>The C++ Programming Language.</I> Addison-Wesley, 1986. </DIR>
<H2>i</H2>
</P> <DIR>
<P> <LI><A HREF="gperf_5.html#IDX35">in_word_set</A>
[14] Tiemann, Michael D. <I>User's Guide to GNU C++</I> Free Software <LI><A HREF="gperf_6.html#IDX40">Initializers</A>
Foundation, 1989. </DIR>
<H2>j</H2>
<DIR>
<LI><A HREF="gperf_6.html#IDX45">Jump value</A>
</DIR>
<H2>k</H2>
<DIR>
<LI><A HREF="gperf_5.html#IDX6">Keywords section</A>
</DIR>
<H2>m</H2>
<DIR>
<LI><A HREF="gperf_4.html#IDX3">Minimal perfect hash functions</A>
</DIR>
<H2>n</H2>
<DIR>
<LI><A HREF="gperf_5.html#IDX37">NUL</A>
</DIR>
<H2>s</H2>
<DIR>
<LI><A HREF="gperf_6.html#IDX39">Slot name</A>
<LI><A HREF="gperf_4.html#IDX2">Static search structure</A>
<LI><A HREF="gperf_5.html#IDX36"><CODE>switch</CODE></A>, <A HREF="gperf_6.html#IDX43"><CODE>switch</CODE></A>
</DIR>
</P> </P>
<P><HR><P> <P><HR><P>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_9.html">previous</A>, <A HREF="gperf_11.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_9.html">previous</A>, next, last section, <A HREF="gperf_toc.html">table of contents</A>.
</BODY> </BODY>
</HTML> </HTML>

@@ -1,12 +1,12 @@
<HTML> <HTML>
<HEAD> <HEAD>
<!-- This HTML file has been created by texi2html 1.51 <!-- This HTML file has been created by texi2html 1.51
from gperf.texi on 26 September 2000 --> from gperf.texi on 7 May 2003 -->
<TITLE>Perfect Hash Function Generator - Contributors to GNU gperf Utility</TITLE> <TITLE>Perfect Hash Function Generator - Contributors to GNU gperf Utility</TITLE>
</HEAD> </HEAD>
<BODY> <BODY>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_1.html">previous</A>, <A HREF="gperf_3.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_1.html">previous</A>, <A HREF="gperf_3.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
<P><HR><P> <P><HR><P>
@@ -18,15 +18,13 @@ Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_1.html">previous</A>,
<A NAME="IDX1"></A> <A NAME="IDX1"></A>
The GNU <CODE>gperf</CODE> perfect hash function generator utility was The GNU <CODE>gperf</CODE> perfect hash function generator utility was
originally written in GNU C++ by Douglas C. Schmidt. It is now also written in GNU C++ by Douglas C. Schmidt. The general
available in a highly-portable "old-style" C version. The general
idea for the perfect hash function generator was inspired by Keith idea for the perfect hash function generator was inspired by Keith
Bostic's algorithm written in C, and distributed to net.sources around Bostic's algorithm written in C, and distributed to net.sources around
1984. The current program is a heavily modified, enhanced, and extended 1984. The current program is a heavily modified, enhanced, and extended
implementation of Keith's basic idea, created at the University of implementation of Keith's basic idea, created at the University of
California, Irvine. Bugs, patches, and suggestions should be reported California, Irvine. Bugs, patches, and suggestions should be reported
to both <CODE>&#60;bug-gnu-utils@gnu.org&#62;</CODE> and to <CODE>&#60;bug-gnu-gperf@gnu.org&#62;</CODE>.
<CODE>&#60;gperf-bugs@lists.sourceforge.net&#62;</CODE>.
<LI> <LI>
@@ -39,11 +37,12 @@ that greatly helped improve the quality and functionality of <CODE>gperf</CODE>.
<LI> <LI>
A testsuite was added by Bruno Haible. He also rewrote the output Bruno Haible enhanced and optimized the search algorithm. He also rewrote
routines for better reliability. the input routines and the output routines for better reliability, and
added a testsuite.
</UL> </UL>
<P><HR><P> <P><HR><P>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_1.html">previous</A>, <A HREF="gperf_3.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_1.html">previous</A>, <A HREF="gperf_3.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
</BODY> </BODY>
</HTML> </HTML>

@@ -1,12 +1,12 @@
<HTML> <HTML>
<HEAD> <HEAD>
<!-- This HTML file has been created by texi2html 1.51 <!-- This HTML file has been created by texi2html 1.51
from gperf.texi on 26 September 2000 --> from gperf.texi on 7 May 2003 -->
<TITLE>Perfect Hash Function Generator - 1 Introduction</TITLE> <TITLE>Perfect Hash Function Generator - 1 Introduction</TITLE>
</HEAD> </HEAD>
<BODY> <BODY>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_2.html">previous</A>, <A HREF="gperf_4.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_2.html">previous</A>, <A HREF="gperf_4.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
<P><HR><P> <P><HR><P>
@@ -16,8 +16,8 @@ Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_2.html">previous</A>,
<CODE>gperf</CODE> is a perfect hash function generator written in C++. It <CODE>gperf</CODE> is a perfect hash function generator written in C++. It
transforms an <VAR>n</VAR> element user-specified keyword set <VAR>W</VAR> into a transforms an <VAR>n</VAR> element user-specified keyword set <VAR>W</VAR> into a
perfect hash function <VAR>F</VAR>. <VAR>F</VAR> uniquely maps keywords in perfect hash function <VAR>F</VAR>. <VAR>F</VAR> uniquely maps keywords in
<VAR>W</VAR> onto the range 0..<VAR>k</VAR>, where <VAR>k</VAR> &#62;= <VAR>n</VAR>. If <VAR>k</VAR> <VAR>W</VAR> onto the range 0..<VAR>k</VAR>, where <VAR>k</VAR> &#62;= <VAR>n-1</VAR>. If <VAR>k</VAR>
= <VAR>n</VAR> then <VAR>F</VAR> is a <EM>minimal</EM> perfect hash function. = <VAR>n-1</VAR> then <VAR>F</VAR> is a <EM>minimal</EM> perfect hash function.
<CODE>gperf</CODE> generates a 0..<VAR>k</VAR> element static lookup table and a <CODE>gperf</CODE> generates a 0..<VAR>k</VAR> element static lookup table and a
pair of C functions. These functions determine whether a given pair of C functions. These functions determine whether a given
character string <VAR>s</VAR> occurs in <VAR>W</VAR>, using at most one probe into character string <VAR>s</VAR> occurs in <VAR>W</VAR>, using at most one probe into
@@ -27,14 +27,15 @@ the lookup table.
<P> <P>
<CODE>gperf</CODE> currently generates the reserved keyword recognizer for <CODE>gperf</CODE> currently generates the reserved keyword recognizer for
lexical analyzers in several production and research compilers and lexical analyzers in several production and research compilers and
language processing tools, including GNU C, GNU C++, GNU Pascal, GNU language processing tools, including GNU C, GNU C++, GNU Java, GNU Pascal,
Modula 3, and GNU indent. Complete C++ source code for <CODE>gperf</CODE> is GNU Modula 3, and GNU indent. Complete C++ source code for <CODE>gperf</CODE> is
available via anonymous ftp from <CODE>ftp://ftp.gnu.org/pub/gnu/gperf/</CODE>. available from <CODE>http://ftp.gnu.org/pub/gnu/gperf/</CODE>.
A paper describing <CODE>gperf</CODE>'s design and implementation in greater A paper describing <CODE>gperf</CODE>'s design and implementation in greater
detail is available in the Second USENIX C++ Conference proceedings. detail is available in the Second USENIX C++ Conference proceedings
or from <CODE>http://www.cs.wustl.edu/~schmidt/resume.html</CODE>.
</P> </P>
<P><HR><P> <P><HR><P>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_2.html">previous</A>, <A HREF="gperf_4.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_2.html">previous</A>, <A HREF="gperf_4.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
</BODY> </BODY>
</HTML> </HTML>
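
The corrected definition in the Introduction hunk above (F maps the n keywords of W onto 0..k with k >= n-1, and F is minimal when k = n-1) can be checked mechanically. The following is a hedged Python sketch, not code that gperf generates: `classify`, `build_table` and the toy hash functions are hypothetical, and `in_word_set` merely borrows the default lookup name from the man page to show the "at most one probe" lookup.

```python
def classify(hash_fn, keywords):
    """Classify hash_fn over keywords per the definition above."""
    if not keywords:
        raise ValueError("keyword set must not be empty")
    n = len(keywords)
    values = [hash_fn(w) for w in keywords]
    # A perfect hash function gives every keyword a distinct slot.
    if len(set(values)) != n or min(values) < 0:
        return "not perfect"
    k = max(values)  # the hash values fall in the range 0..k
    return "minimal perfect" if k == n - 1 else "perfect"


def build_table(hash_fn, keywords):
    """Build the 0..k lookup table; unused slots stay None."""
    k = max(hash_fn(w) for w in keywords)
    table = [None] * (k + 1)
    for w in keywords:
        table[hash_fn(w)] = w
    return table


def in_word_set(hash_fn, table, s):
    """Membership test using at most one probe into the table."""
    h = hash_fn(s)
    return 0 <= h < len(table) and table[h] == s


# Toy keyword set whose members all have different lengths.
keywords = ["a", "bb", "ccc"]
by_length = lambda w: len(w) - 1      # values 0, 1, 2
double_length = lambda w: 2 * len(w)  # values 2, 4, 6
table = build_table(by_length, keywords)
```

With n = 3, `by_length` reaches k = 2 = n-1 and is therefore minimal, while `double_length` is perfect but leaves gaps in its table (k = 6).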

@@ -1,12 +1,12 @@
<HTML> <HTML>
<HEAD> <HEAD>
<!-- This HTML file has been created by texi2html 1.51 <!-- This HTML file has been created by texi2html 1.51
from gperf.texi on 26 September 2000 --> from gperf.texi on 7 May 2003 -->
<TITLE>Perfect Hash Function Generator - 2 Static search structures and GNU gperf</TITLE> <TITLE>Perfect Hash Function Generator - 2 Static search structures and GNU gperf</TITLE>
</HEAD> </HEAD>
<BODY> <BODY>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_3.html">previous</A>, <A HREF="gperf_5.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_3.html">previous</A>, <A HREF="gperf_5.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
<P><HR><P> <P><HR><P>
@@ -19,7 +19,7 @@ Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_3.html">previous</A>,
A <STRONG>static search structure</STRONG> is an Abstract Data Type with certain A <STRONG>static search structure</STRONG> is an Abstract Data Type with certain
fundamental operations, e.g., <EM>initialize</EM>, <EM>insert</EM>, fundamental operations, e.g., <EM>initialize</EM>, <EM>insert</EM>,
and <EM>retrieve</EM>. Conceptually, all insertions occur before any and <EM>retrieve</EM>. Conceptually, all insertions occur before any
retrievals. In practice, <CODE>gperf</CODE> generates a <CODE>static</CODE> array retrievals. In practice, <CODE>gperf</CODE> generates a <EM>static</EM> array
containing search set keywords and any associated attributes specified containing search set keywords and any associated attributes specified
by the user. Thus, there is essentially no execution-time cost for the by the user. Thus, there is essentially no execution-time cost for the
insertions. It is a useful data structure for representing <EM>static insertions. It is a useful data structure for representing <EM>static
@@ -86,13 +86,13 @@ the drudgery associated with constructing time- and space-efficient
search structures by hand. It has proven a useful and practical tool search structures by hand. It has proven a useful and practical tool
for serious programming projects. Output from <CODE>gperf</CODE> is currently for serious programming projects. Output from <CODE>gperf</CODE> is currently
used in several production and research compilers, including GNU C, GNU used in several production and research compilers, including GNU C, GNU
C++, GNU Pascal, and GNU Modula 3. The latter two compilers are not yet C++, GNU Java, GNU Pascal, and GNU Modula 3. The latter two compilers are
part of the official GNU distribution. Each compiler utilizes not yet part of the official GNU distribution. Each compiler utilizes
<CODE>gperf</CODE> to automatically generate static search structures that <CODE>gperf</CODE> to automatically generate static search structures that
efficiently identify their respective reserved keywords. efficiently identify their respective reserved keywords.
</P> </P>
<P><HR><P> <P><HR><P>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_3.html">previous</A>, <A HREF="gperf_5.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_3.html">previous</A>, <A HREF="gperf_5.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
</BODY> </BODY>
</HTML> </HTML>

@@ -1,12 +1,12 @@
<HTML> <HTML>
<HEAD> <HEAD>
<!-- This HTML file has been created by texi2html 1.51 <!-- This HTML file has been created by texi2html 1.51
from gperf.texi on 26 September 2000 --> from gperf.texi on 7 May 2003 -->
<TITLE>Perfect Hash Function Generator - 3 High-Level Description of GNU gperf</TITLE> <TITLE>Perfect Hash Function Generator - 3 High-Level Description of GNU gperf</TITLE>
</HEAD> </HEAD>
<BODY> <BODY>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_4.html">previous</A>, <A HREF="gperf_6.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_4.html">previous</A>, <A HREF="gperf_6.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
<P><HR><P> <P><HR><P>
@@ -14,7 +14,7 @@ Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_4.html">previous</A>,
<P> <P>
The perfect hash function generator <CODE>gperf</CODE> reads a set of The perfect hash function generator <CODE>gperf</CODE> reads a set of
"keywords" from a <STRONG>keyfile</STRONG> (or from the standard input by "keywords" from an input file (or from the standard input by
default). It attempts to derive a perfect hashing function that default). It attempts to derive a perfect hashing function that
recognizes a member of the <STRONG>static keyword set</STRONG> with at most a recognizes a member of the <STRONG>static keyword set</STRONG> with at most a
single probe into the lookup table. If <CODE>gperf</CODE> succeeds in single probe into the lookup table. If <CODE>gperf</CODE> succeeds in
@@ -37,7 +37,7 @@ somewhat. Actual results depend on your C compiler, of course.
</P> </P>
<P> <P>
In general, <CODE>gperf</CODE> assigns values to the characters it is using In general, <CODE>gperf</CODE> assigns values to the bytes it is using
for hashing until some set of values gives each keyword a unique value. for hashing until some set of values gives each keyword a unique value.
A helpful heuristic is that the larger the hash value range, the easier A helpful heuristic is that the larger the hash value range, the easier
it is for <CODE>gperf</CODE> to find and generate a perfect hash function. it is for <CODE>gperf</CODE> to find and generate a perfect hash function.
@@ -52,7 +52,7 @@ Experimentation is the key to getting the most from <CODE>gperf</CODE>.
<A NAME="IDX5"></A> <A NAME="IDX5"></A>
<A NAME="IDX6"></A> <A NAME="IDX6"></A>
<A NAME="IDX7"></A> <A NAME="IDX7"></A>
You can control the input keyfile format by varying certain command-line You can control the input file format by varying certain command-line
arguments, in particular the <SAMP>`-t'</SAMP> option. The input's appearance arguments, in particular the <SAMP>`-t'</SAMP> option. The input's appearance
is similar to GNU utilities <CODE>flex</CODE> and <CODE>bison</CODE> (or UNIX is similar to GNU utilities <CODE>flex</CODE> and <CODE>bison</CODE> (or UNIX
utilities <CODE>lex</CODE> and <CODE>yacc</CODE>). Here's an outline of the general utilities <CODE>lex</CODE> and <CODE>yacc</CODE>). Here's an outline of the general
@@ -69,25 +69,53 @@ functions
</PRE> </PRE>
<P> <P>
<EM>Unlike</EM> <CODE>flex</CODE> or <CODE>bison</CODE>, all sections of <EM>Unlike</EM> <CODE>flex</CODE> or <CODE>bison</CODE>, the declarations section and
<CODE>gperf</CODE>'s input are optional. The following sections describe the the functions section are optional. The following sections describe the
input format for each section. input format for each section.
</P> </P>
<P>
It is possible to omit the declaration section entirely, if the <SAMP>`-t'</SAMP>
option is not given. In this case the input file begins directly with the
first keyword line, e.g.:
</P>
<PRE>
january
february
march
april
...
</PRE>
<H3><A NAME="SEC9" HREF="gperf_toc.html#TOC9">3.1.1 Declarations</A></H3>
<P>
The keyword input file optionally contains a section for including
arbitrary C declarations and definitions, <CODE>gperf</CODE> declarations that
act like command-line options, as well as for providing a user-supplied
<CODE>struct</CODE>.
</P>
<H3><A NAME="SEC9" HREF="gperf_toc.html#TOC9">3.1.1 <CODE>struct</CODE> Declarations and C Code Inclusion</A></H3> <H4><A NAME="SEC10" HREF="gperf_toc.html#TOC10">3.1.1.1 User-supplied <CODE>struct</CODE></A></H4>
<P> <P>
The keyword input file optionally contains a section for including If the <SAMP>`-t'</SAMP> option (or, equivalently, the <SAMP>`%struct-type'</SAMP> declaration)
arbitrary C declarations and definitions, as well as provisions for
providing a user-supplied <CODE>struct</CODE>. If the <SAMP>`-t'</SAMP> option
<EM>is</EM> enabled, you <EM>must</EM> provide a C <CODE>struct</CODE> as the last <EM>is</EM> enabled, you <EM>must</EM> provide a C <CODE>struct</CODE> as the last
component in the declaration section from the keyfile file. The first component in the declaration section from the input file. The first
field in this struct must be a <CODE>char *</CODE> or <CODE>const char *</CODE> field in this struct must be of type <CODE>char *</CODE> or <CODE>const char *</CODE>
identifier called <SAMP>`name'</SAMP>, although it is possible to modify this if the <SAMP>`-P'</SAMP> option is not given, or of type <CODE>int</CODE> if the option
field's name with the <SAMP>`-K'</SAMP> option described below. <SAMP>`-P'</SAMP> (or, equivalently, the <SAMP>`%pic'</SAMP> declaration) is enabled.
This first field must be called <SAMP>`name'</SAMP>, although it is possible to modify
its name with the <SAMP>`-K'</SAMP> option (or, equivalently, the
<SAMP>`%define slot-name'</SAMP> declaration) described below.
</P> </P>
<P> <P>
@@ -121,9 +149,260 @@ appearing left justified in the first column, as in the UNIX utility
<CODE>lex</CODE>. <CODE>lex</CODE>.
</P> </P>
<H4><A NAME="SEC11" HREF="gperf_toc.html#TOC11">3.1.1.2 Gperf Declarations</A></H4>
<P> <P>
The declaration section can contain <CODE>gperf</CODE> declarations. They
influence the way <CODE>gperf</CODE> works, like command-line options do.
In fact, every such declaration is equivalent to a command-line option.
There are three forms of declarations:
</P>
<OL>
<LI>
Declarations without argument, like <SAMP>`%compare-lengths'</SAMP>.
<LI>
Declarations with an argument, like <SAMP>`%switch=<VAR>count</VAR>'</SAMP>.
<LI>
Declarations of names of entities in the output file, like
<SAMP>`%define lookup-function-name <VAR>name</VAR>'</SAMP>.
</OL>
<P>
When a declaration is given both in the input file and as a command-line
option, the command-line option's value prevails.
</P>
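<P>
As an illustration, the three forms can appear together in one declaration
section. The following is a minimal sketch rather than a complete
application; the name <SAMP>`lookup_month'</SAMP> is a hypothetical choice
made for this example:
</P>

<PRE>
%compare-lengths
%switch=1
%define lookup-function-name lookup_month
%%
january
february
march
</PRE>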
<P>
The following <CODE>gperf</CODE> declarations are available.
</P>
<DL COMPACT>
<DT><SAMP>`%delimiters=<VAR>delimiter-list</VAR>'</SAMP>
<DD>
<A NAME="IDX9"></A> <A NAME="IDX9"></A>
Allows you to provide a string containing delimiters used to
separate keywords from their attributes. The default is ",". This
option is essential if you want to use keywords that have embedded
commas or newlines.
<DT><SAMP>`%struct-type'</SAMP>
<DD>
<A NAME="IDX10"></A> <A NAME="IDX10"></A>
Allows you to include a <CODE>struct</CODE> type declaration for generated
code; see above for an example.
<DT><SAMP>`%ignore-case'</SAMP>
<DD>
<A NAME="IDX11"></A>
Consider upper and lower case ASCII characters as equivalent. The string
comparison will use a case-insensitive character comparison. Note that
locale-dependent case mappings are ignored.
<DT><SAMP>`%language=<VAR>language-name</VAR>'</SAMP>
<DD>
<A NAME="IDX12"></A>
Instructs <CODE>gperf</CODE> to generate code in the language specified by the
option's argument. Languages handled are currently:
<DL COMPACT>
<DT><SAMP>`KR-C'</SAMP>
<DD>
Old-style K&#38;R C. This language is understood by old-style C compilers and
ANSI C compilers, but ANSI C compilers may flag warnings (or even errors)
because of lacking <SAMP>`const'</SAMP>.
<DT><SAMP>`C'</SAMP>
<DD>
Common C. This language is understood by ANSI C compilers, and also by
old-style C compilers, provided that you <CODE>#define const</CODE> to empty
for compilers which don't know about this keyword.
<DT><SAMP>`ANSI-C'</SAMP>
<DD>
ANSI C. This language is understood by ANSI C compilers and C++ compilers.
<DT><SAMP>`C++'</SAMP>
<DD>
C++. This language is understood by C++ compilers.
</DL>
The default is C.
<DT><SAMP>`%define slot-name <VAR>name</VAR>'</SAMP>
<DD>
<A NAME="IDX13"></A>
This declaration is only useful when option <SAMP>`-t'</SAMP> (or, equivalently, the
<SAMP>`%struct-type'</SAMP> declaration) has been given.
By default, the program assumes the structure component identifier for
the keyword is <SAMP>`name'</SAMP>. This option allows an arbitrary choice of
identifier for this component, although it still must occur as the first
field in your supplied <CODE>struct</CODE>.
<DT><SAMP>`%define initializer-suffix <VAR>initializers</VAR>'</SAMP>
<DD>
<A NAME="IDX14"></A>
This declaration is only useful when option <SAMP>`-t'</SAMP> (or, equivalently, the
<SAMP>`%struct-type'</SAMP> declaration) has been given.
It lets you specify initializers for the structure members following
<VAR>slot-name</VAR> in empty hash table entries. The list of initializers
should start with a comma. By default, the emitted code will
zero-initialize structure members following <VAR>slot-name</VAR>.
<DT><SAMP>`%define hash-function-name <VAR>name</VAR>'</SAMP>
<DD>
<A NAME="IDX15"></A>
Allows you to specify the name for the generated hash function. Default
name is <SAMP>`hash'</SAMP>. This option permits the use of two hash tables in
the same file.
<DT><SAMP>`%define lookup-function-name <VAR>name</VAR>'</SAMP>
<DD>
<A NAME="IDX16"></A>
Allows you to specify the name for the generated lookup function.
Default name is <SAMP>`in_word_set'</SAMP>. This option permits multiple
generated hash functions to be used in the same application.
<DT><SAMP>`%define class-name <VAR>name</VAR>'</SAMP>
<DD>
<A NAME="IDX17"></A>
This option is only useful when option <SAMP>`-L C++'</SAMP> (or, equivalently,
the <SAMP>`%language=C++'</SAMP> declaration) has been given. It
allows you to specify the name of the generated C++ class. Default name is
<CODE>Perfect_Hash</CODE>.
<DT><SAMP>`%7bit'</SAMP>
<DD>
<A NAME="IDX18"></A>
This option specifies that all strings that will be passed as arguments
to the generated hash function and the generated lookup function will
solely consist of 7-bit ASCII characters (bytes in the range 0..127).
(Note that the ANSI C functions <CODE>isalnum</CODE> and <CODE>isgraph</CODE> do
<EM>not</EM> guarantee that a byte is in this range. Only an explicit
test like <SAMP>`c &#62;= 'A' &#38;&#38; c &#60;= 'Z''</SAMP> guarantees this.)
<DT><SAMP>`%compare-lengths'</SAMP>
<DD>
<A NAME="IDX19"></A>
Compare keyword lengths before trying a string comparison. This option
is mandatory for binary comparisons (see section <A HREF="gperf_5.html#SEC17">3.3 Use of NUL bytes</A>). It also might
cut down on the number of string comparisons made during the lookup, since
keywords with different lengths are never compared via <CODE>strcmp</CODE>.
However, using <SAMP>`%compare-lengths'</SAMP> might greatly increase the size of the
generated C code if the lookup table range is large (which implies that
the switch option <SAMP>`-S'</SAMP> or <SAMP>`%switch'</SAMP> is not enabled), since the length
table contains as many elements as there are entries in the lookup table.
<DT><SAMP>`%compare-strncmp'</SAMP>
<DD>
<A NAME="IDX20"></A>
Generates C code that uses the <CODE>strncmp</CODE> function to perform
string comparisons. The default action is to use <CODE>strcmp</CODE>.
<DT><SAMP>`%readonly-tables'</SAMP>
<DD>
<A NAME="IDX21"></A>
Makes the contents of all generated lookup tables constant, i.e.,
"readonly". Many compilers can generate more efficient code for this
by putting the tables in readonly memory.
<DT><SAMP>`%enum'</SAMP>
<DD>
<A NAME="IDX22"></A>
Define constant values using an enum local to the lookup function rather
than with #defines. This also means that different lookup functions can
reside in the same file. Thanks to James Clark <CODE>&#60;jjc@ai.mit.edu&#62;</CODE>.
<DT><SAMP>`%includes'</SAMP>
<DD>
<A NAME="IDX23"></A>
Include the necessary system include file, <CODE>&#60;string.h&#62;</CODE>, at the
beginning of the code. By default, this is not done; you must
include this header file yourself to allow compilation of the code.
<DT><SAMP>`%global-table'</SAMP>
<DD>
<A NAME="IDX24"></A>
Generate the static table of keywords as a static global variable,
rather than hiding it inside of the lookup function (which is the
default behavior).
<DT><SAMP>`%pic'</SAMP>
<DD>
<A NAME="IDX25"></A>
Optimize the generated table for inclusion in shared libraries. This
reduces the startup time of programs using a shared library containing
the generated code. If the <SAMP>`%struct-type'</SAMP> declaration (or,
equivalently, the option <SAMP>`-t'</SAMP>) is also given, the first field of the
user-defined struct must be of type <SAMP>`int'</SAMP>, not <SAMP>`char *'</SAMP>, because
it will contain offsets into the string pool instead of actual strings.
To convert such an offset to a string, you can use the expression
<SAMP>`stringpool + <VAR>o</VAR>'</SAMP>, where <VAR>o</VAR> is the offset. The string pool
name can be changed through the <SAMP>`%define string-pool-name'</SAMP> declaration.
<DT><SAMP>`%define string-pool-name <VAR>name</VAR>'</SAMP>
<DD>
<A NAME="IDX26"></A>
Allows you to specify the name of the generated string pool created by
the declaration <SAMP>`%pic'</SAMP> (or, equivalently, the option <SAMP>`-P'</SAMP>).
The default name is <SAMP>`stringpool'</SAMP>. This declaration permits the use of
two hash tables in the same file, with <SAMP>`%pic'</SAMP> and even when the
<SAMP>`%global-table'</SAMP> declaration (or, equivalently, the option <SAMP>`-G'</SAMP>)
is given.
<DT><SAMP>`%null-strings'</SAMP>
<DD>
<A NAME="IDX27"></A>
Use NULL strings instead of empty strings for empty keyword table entries.
This reduces the startup time of programs using a shared library containing
the generated code (but not as much as the declaration <SAMP>`%pic'</SAMP>), at the
expense of one more test-and-branch instruction at run time.
<DT><SAMP>`%define word-array-name <VAR>name</VAR>'</SAMP>
<DD>
<A NAME="IDX28"></A>
Allows you to specify the name for the generated array containing the
hash table. Default name is <SAMP>`wordlist'</SAMP>. This option permits the
use of two hash tables in the same file, even when the option <SAMP>`-G'</SAMP>
(or, equivalently, the <SAMP>`%global-table'</SAMP> declaration) is given.
<DT><SAMP>`%switch=<VAR>count</VAR>'</SAMP>
<DD>
<A NAME="IDX29"></A>
Causes the generated C code to use a <CODE>switch</CODE> statement scheme,
rather than an array lookup table. This can lead to a reduction in both
time and space requirements for some input files. The argument to this
option determines how many <CODE>switch</CODE> statements are generated. A
value of 1 generates 1 <CODE>switch</CODE> containing all the elements, a
value of 2 generates 2 <CODE>switch</CODE> statements with half of the elements in
each, etc. This is useful since many C compilers cannot
correctly generate code for large <CODE>switch</CODE> statements. This option
was inspired in part by Keith Bostic's original C program.
<DT><SAMP>`%omit-struct-type'</SAMP>
<DD>
<A NAME="IDX30"></A>
Prevents the transfer of the type declaration to the output file. Use
this option if the type is already defined elsewhere.
</DL>
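<P>
To show how several of these declarations combine, here is a sketch of an
input file that uses <SAMP>`%struct-type'</SAMP> together with <SAMP>`%pic'</SAMP>.
The names <SAMP>`month'</SAMP>, <SAMP>`number'</SAMP>, <SAMP>`lookup_month'</SAMP> and
<SAMP>`month_pool'</SAMP> are assumptions made for this example. Note that the
first field is of type <CODE>int</CODE> because <SAMP>`%pic'</SAMP> is given:
</P>

<PRE>
%struct-type
%pic
%define lookup-function-name lookup_month
%define string-pool-name month_pool
struct month { int name; int number; };
%%
january, 1
february, 2
march, 3
</PRE>

<P>
With such an input, the keyword of a matching entry <VAR>m</VAR> would be
recovered as <SAMP>`month_pool + m-&#62;name'</SAMP>, following the rule given
for <SAMP>`%pic'</SAMP> above.
</P>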
<H4><A NAME="SEC12" HREF="gperf_toc.html#TOC12">3.1.1.3 C Code Inclusion</A></H4>
<P>
<A NAME="IDX31"></A>
<A NAME="IDX32"></A>
Using a syntax similar to GNU utilities <CODE>flex</CODE> and <CODE>bison</CODE>, it Using a syntax similar to GNU utilities <CODE>flex</CODE> and <CODE>bison</CODE>, it
is possible to directly include C source text and comments verbatim into is possible to directly include C source text and comments verbatim into
the generated output file. This is accomplished by enclosing the region the generated output file. This is accomplished by enclosing the region
@@ -147,37 +426,25 @@ march, 3, 31, 31
... ...
</PRE> </PRE>
<P>
It is possible to omit the declaration section entirely. In this case
the keyfile begins directly with the first keyword line, e.g.:
</P>
<PRE>
january, 1, 31, 31
february, 2, 28, 29
march, 3, 31, 31
april, 4, 30, 30
...
</PRE>
<H3><A NAME="SEC13" HREF="gperf_toc.html#TOC13">3.1.2 Format for Keyword Entries</A></H3>
<H3><A NAME="SEC10" HREF="gperf_toc.html#TOC10">3.1.2 Format for Keyword Entries</A></H3>
<P> <P>
The second keyfile format section contains lines of keywords and any The second input file format section contains lines of keywords and any
associated attributes you might supply. A line beginning with <SAMP>`#'</SAMP> associated attributes you might supply. A line beginning with <SAMP>`#'</SAMP>
in the first column is considered a comment. Everything following the in the first column is considered a comment. Everything following the
<SAMP>`#'</SAMP> is ignored, up to and including the following newline. <SAMP>`#'</SAMP> is ignored, up to and including the following newline. A line
beginning with <SAMP>`%'</SAMP> in the first column is an option declaration and
must not occur within the keywords section.
</P> </P>
<P> <P>
The first field of each non-comment line is always the key itself. It The first field of each non-comment line is always the keyword itself. It
can be given in two ways: as a simple name, i.e., without surrounding can be given in two ways: as a simple name, i.e., without surrounding
string quotation marks, or as a string enclosed in double-quotes, in string quotation marks, or as a string enclosed in double-quotes, in
C syntax, possibly with backslash escapes like <CODE>\"</CODE> or <CODE>\234</CODE> C syntax, possibly with backslash escapes like <CODE>\"</CODE> or <CODE>\234</CODE>
or <CODE>\xa8</CODE>. In either case, it must start right at the beginning or <CODE>\xa8</CODE>. In either case, it must start right at the beginning
of the line, without leading whitespace. of the line, without leading whitespace.
In this context, a "field" is considered to extend up to, but In this context, a "field" is considered to extend up to, but
not include, the first blank, comma, or newline. Here is a simple not include, the first blank, comma, or newline. Here is a simple
@@ -209,14 +476,15 @@ Additional fields may optionally follow the leading keyword. Fields
should be separated by commas, and terminate at the end of line. What should be separated by commas, and terminate at the end of line. What
these fields mean is entirely up to you; they are used to initialize the these fields mean is entirely up to you; they are used to initialize the
elements of the user-defined <CODE>struct</CODE> provided by you in the elements of the user-defined <CODE>struct</CODE> provided by you in the
declaration section. If the <SAMP>`-t'</SAMP> option is <EM>not</EM> enabled declaration section. If the <SAMP>`-t'</SAMP> option (or, equivalently, the
<SAMP>`%struct-type'</SAMP> declaration) is <EM>not</EM> enabled
these fields are simply ignored. All previous examples except the last these fields are simply ignored. All previous examples except the last
one contain keyword attributes. one contain keyword attributes.
</P> </P>
<H3><A NAME="SEC11" HREF="gperf_toc.html#TOC11">3.1.3 Including Additional C Functions</A></H3> <H3><A NAME="SEC14" HREF="gperf_toc.html#TOC14">3.1.3 Including Additional C Functions</A></H3>
<P> <P>
The optional third section also corresponds closely with conventions The optional third section also corresponds closely with conventions
@@ -229,9 +497,57 @@ section is valid C.
</P> </P>
<H2><A NAME="SEC12" HREF="gperf_toc.html#TOC12">3.2 Output Format for Generated C Code with <CODE>gperf</CODE></A></H2> <H3><A NAME="SEC15" HREF="gperf_toc.html#TOC15">3.1.4 Where to place directives for GNU <CODE>indent</CODE>.</A></H3>
<P> <P>
<A NAME="IDX11"></A> If you want to invoke GNU <CODE>indent</CODE> on a <CODE>gperf</CODE> input file,
you will see that GNU <CODE>indent</CODE> doesn't understand the <SAMP>`%%'</SAMP>,
<SAMP>`%{'</SAMP> and <SAMP>`%}'</SAMP> directives that control <CODE>gperf</CODE>'s
interpretation of the input file. Therefore you have to insert some
directives for GNU <CODE>indent</CODE>. More precisely, assuming the most
general input file structure
</P>
<PRE>
declarations part 1
%{
verbatim code
%}
declarations part 2
%%
keywords
%%
functions
</PRE>
<P>
you would insert <SAMP>`*INDENT-OFF*'</SAMP> and <SAMP>`*INDENT-ON*'</SAMP> comments
as follows:
</P>
<PRE>
/* *INDENT-OFF* */
declarations part 1
%{
/* *INDENT-ON* */
verbatim code
/* *INDENT-OFF* */
%}
declarations part 2
%%
keywords
%%
/* *INDENT-ON* */
functions
</PRE>
<H2><A NAME="SEC16" HREF="gperf_toc.html#TOC16">3.2 Output Format for Generated C Code with <CODE>gperf</CODE></A></H2>
<P>
<A NAME="IDX33"></A>
</P> </P>
<P> <P>
@@ -246,34 +562,36 @@ function prototypes are as follows:
<P> <P>
<DL> <DL>
<DT><U>Function:</U> unsigned int <B>hash</B> <I>(const char * <VAR>str</VAR>, unsigned int <VAR>len</VAR>)</I> <DT><U>Function:</U> unsigned int <B>hash</B> <I>(const char * <VAR>str</VAR>, unsigned int <VAR>len</VAR>)</I>
<DD><A NAME="IDX12"></A> <DD><A NAME="IDX34"></A>
By default, the generated <CODE>hash</CODE> function returns an integer value By default, the generated <CODE>hash</CODE> function returns an integer value
created by adding <VAR>len</VAR> to several user-specified <VAR>str</VAR> key created by adding <VAR>len</VAR> to several user-specified <VAR>str</VAR> byte
positions indexed into an <STRONG>associated values</STRONG> table stored in a positions indexed into an <STRONG>associated values</STRONG> table stored in a
local static array. The associated values table is constructed local static array. The associated values table is constructed
internally by <CODE>gperf</CODE> and later output as a static local C array internally by <CODE>gperf</CODE> and later output as a static local C array
called <SAMP>`hash_table'</SAMP>; its meaning and properties are described below called <SAMP>`hash_table'</SAMP>. The relevant selected positions (i.e. indices
(see section <A HREF="gperf_9.html#SEC22">7 Implementation Details of GNU <CODE>gperf</CODE></A>). The relevant key positions are specified via into <VAR>str</VAR>) are specified via the <SAMP>`-k'</SAMP> option when running
the <SAMP>`-k'</SAMP> option when running <CODE>gperf</CODE>, as detailed in the <CODE>gperf</CODE>, as detailed in the <EM>Options</EM> section below (see section <A HREF="gperf_6.html#SEC18">4 Invoking <CODE>gperf</CODE></A>).
<EM>Options</EM> section below(see section <A HREF="gperf_6.html#SEC14">4 Invoking <CODE>gperf</CODE></A>).
</DL> </DL>
</P> </P>
<P> <P>
<DL> <DL>
<DT><U>Function:</U> <B>in_word_set</B> <I>(const char * <VAR>str</VAR>, unsigned int <VAR>len</VAR>)</I> <DT><U>Function:</U> <B>in_word_set</B> <I>(const char * <VAR>str</VAR>, unsigned int <VAR>len</VAR>)</I>
<DD><A NAME="IDX13"></A> <DD><A NAME="IDX35"></A>
If <VAR>str</VAR> is in the keyword set, returns a pointer to that If <VAR>str</VAR> is in the keyword set, returns a pointer to that
keyword. More exactly, if the option <SAMP>`-t'</SAMP> was given, it returns keyword. More exactly, if the option <SAMP>`-t'</SAMP> (or, equivalently, the
a pointer to the matching keyword's structure. Otherwise it returns <SAMP>`%struct-type'</SAMP> declaration) was given, it returns
a pointer to the matching keyword's structure. Otherwise it returns
<CODE>NULL</CODE>. <CODE>NULL</CODE>.
</DL> </DL>
</P> </P>
<P> <P>
If the option <SAMP>`-c'</SAMP> is not used, <VAR>str</VAR> must be a NUL terminated If the option <SAMP>`-c'</SAMP> (or, equivalently, the <SAMP>`%compare-strncmp'</SAMP>
string of exactly length <VAR>len</VAR>. If <SAMP>`-c'</SAMP> is used, <VAR>str</VAR> must declaration) is not used, <VAR>str</VAR> must be a NUL terminated
simply be an array of <VAR>len</VAR> characters and does not need to be NUL string of exactly length <VAR>len</VAR>. If <SAMP>`-c'</SAMP> (or, equivalently, the
<SAMP>`%compare-strncmp'</SAMP> declaration) is used, <VAR>str</VAR> must
simply be an array of <VAR>len</VAR> bytes and does not need to be NUL
terminated. terminated.
</P> </P>
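<P>
The calling convention can be sketched with a small input file whose
functions section calls the default lookup function. This is only an
illustration: the keywords and the <CODE>main</CODE> function are made up
for this example, and no <SAMP>`-t'</SAMP> or <SAMP>`-c'</SAMP> option is assumed,
so <CODE>in_word_set</CODE> returns a pointer to the keyword and expects a
NUL terminated string of exactly length <VAR>len</VAR>:
</P>

<PRE>
%{
#include &#60;stdio.h&#62;
#include &#60;string.h&#62;
%}
%%
january
february
march
%%
int
main (int argc, char *argv[])
{
  int i;
  for (i = 1; i &#60; argc; i++)
    {
      /* len must be the exact length of the NUL terminated string.  */
      const char *match = in_word_set (argv[i], strlen (argv[i]));
      printf ("%s: %s\n", argv[i],
              match != NULL ? "keyword" : "not a keyword");
    }
  return 0;
}
</PRE>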
@@ -294,7 +612,7 @@ Make use of the user-defined <CODE>struct</CODE>.
<DD> <DD>
<DT><SAMP>`--switch=<VAR>total-switch-statements</VAR>'</SAMP> <DT><SAMP>`--switch=<VAR>total-switch-statements</VAR>'</SAMP>
<DD> <DD>
<A NAME="IDX14"></A> <A NAME="IDX36"></A>
Generate 1 or more C <CODE>switch</CODE> statements rather than use a large, Generate 1 or more C <CODE>switch</CODE> statements rather than use a large,
(and potentially sparse) static array. Although the exact time and (and potentially sparse) static array. Although the exact time and
space savings of this approach vary according to your C compiler's space savings of this approach vary according to your C compiler's
@@ -303,9 +621,11 @@ code.
</DL> </DL>
<P> <P>
If the <SAMP>`-t'</SAMP> and <SAMP>`-S'</SAMP> options are omitted, the default action If the <SAMP>`-t'</SAMP> and <SAMP>`-S'</SAMP> options (or, equivalently, the
is to generate a <CODE>char *</CODE> array containing the keys, together with <SAMP>`%struct-type'</SAMP> and <SAMP>`%switch'</SAMP> declarations) are omitted, the default
additional null strings used for padding the array. By experimenting action
is to generate a <CODE>char *</CODE> array containing the keywords, together with
additional empty strings used for padding the array. By experimenting
with the various input and output options, and timing the resulting C with the various input and output options, and timing the resulting C
code, you can determine the best option choices for different keyword code, you can determine the best option choices for different keyword
set characteristics. set characteristics.
@@ -313,36 +633,39 @@ set characteristics.
</P> </P>
<H2><A NAME="SEC13" HREF="gperf_toc.html#TOC13">3.3 Use of NUL characters</A></H2> <H2><A NAME="SEC17" HREF="gperf_toc.html#TOC17">3.3 Use of NUL bytes</A></H2>
<P> <P>
<A NAME="IDX15"></A> <A NAME="IDX37"></A>
</P> </P>
<P> <P>
By default, the code generated by <CODE>gperf</CODE> operates on zero By default, the code generated by <CODE>gperf</CODE> operates on zero
terminated strings, the usual representation of strings in C. This means terminated strings, the usual representation of strings in C. This means
that the keywords in the input file must not contain NUL characters, that the keywords in the input file must not contain NUL bytes,
and the <VAR>str</VAR> argument passed to <CODE>hash</CODE> or <CODE>in_word_set</CODE> and the <VAR>str</VAR> argument passed to <CODE>hash</CODE> or <CODE>in_word_set</CODE>
must be NUL terminated and have exactly length <VAR>len</VAR>. must be NUL terminated and have exactly length <VAR>len</VAR>.
</P> </P>
<P> <P>
If option <SAMP>`-c'</SAMP> is used, then the <VAR>str</VAR> argument does not need If option <SAMP>`-c'</SAMP> (or, equivalently, the <SAMP>`%compare-strncmp'</SAMP>
to be NUL terminated. The code generated by <CODE>gperf</CODE> will only declaration) is used, then the <VAR>str</VAR> argument does not need
to be NUL terminated. The code generated by <CODE>gperf</CODE> will only
access the first <VAR>len</VAR>, not <VAR>len+1</VAR>, bytes starting at <VAR>str</VAR>. access the first <VAR>len</VAR>, not <VAR>len+1</VAR>, bytes starting at <VAR>str</VAR>.
However, the keywords in the input file still must not contain NUL However, the keywords in the input file still must not contain NUL
characters. bytes.
</P> </P>
<P> <P>
If option <SAMP>`-l'</SAMP> is used, then the hash table performs binary If option <SAMP>`-l'</SAMP> (or, equivalently, the <SAMP>`%compare-lengths'</SAMP>
comparison. The keywords in the input file may contain NUL characters, declaration) is used, then the hash table performs binary
comparison. The keywords in the input file may contain NUL bytes,
written in string syntax as <CODE>\000</CODE> or <CODE>\x00</CODE>, and the code written in string syntax as <CODE>\000</CODE> or <CODE>\x00</CODE>, and the code
generated by <CODE>gperf</CODE> will treat NUL like any other character. generated by <CODE>gperf</CODE> will treat NUL like any other byte.
Also, in this case the <SAMP>`-c'</SAMP> option is ignored. Also, in this case the <SAMP>`-c'</SAMP> option (or, equivalently, the
<SAMP>`%compare-strncmp'</SAMP> declaration) is ignored.
</P> </P>
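<P>
A sketch of an input file with a keyword that contains a NUL byte, assuming
the <SAMP>`%compare-lengths'</SAMP> declaration described above; the keywords
themselves are made up for this example:
</P>

<PRE>
%compare-lengths
%%
"one\000two"
plain
</PRE>

<P>
In this sketch the first keyword is 7 bytes long, so a caller would pass
<VAR>len</VAR> = 7 to <CODE>in_word_set</CODE> instead of relying on
<CODE>strlen</CODE>, which would stop at the embedded NUL byte.
</P>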
<P><HR><P> <P><HR><P>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_4.html">previous</A>, <A HREF="gperf_6.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_4.html">previous</A>, <A HREF="gperf_6.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
</BODY> </BODY>
</HTML> </HTML>

@@ -1,38 +1,59 @@
<HTML> <HTML>
<HEAD> <HEAD>
<!-- This HTML file has been created by texi2html 1.51 <!-- This HTML file has been created by texi2html 1.51
from gperf.texi on 26 September 2000 --> from gperf.texi on 7 May 2003 -->
<TITLE>Perfect Hash Function Generator - 4 Invoking gperf</TITLE> <TITLE>Perfect Hash Function Generator - 4 Invoking gperf</TITLE>
</HEAD> </HEAD>
<BODY> <BODY>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_5.html">previous</A>, <A HREF="gperf_7.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_5.html">previous</A>, <A HREF="gperf_7.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
<P><HR><P> <P><HR><P>
<H1><A NAME="SEC14" HREF="gperf_toc.html#TOC14">4 Invoking <CODE>gperf</CODE></A></H1> <H1><A NAME="SEC18" HREF="gperf_toc.html#TOC18">4 Invoking <CODE>gperf</CODE></A></H1>
<P> <P>
There are <EM>many</EM> options to <CODE>gperf</CODE>. They were added to make There are <EM>many</EM> options to <CODE>gperf</CODE>. They were added to make
the program more convenient for use with real applications. "On-line" the program more convenient for use with real applications. "On-line"
help is readily available via the <SAMP>`-h'</SAMP> option. Here is the help is readily available via the <SAMP>`--help'</SAMP> option. Here is the
complete list of options. complete list of options.
</P> </P>
<H2><A NAME="SEC15" HREF="gperf_toc.html#TOC15">4.1 Options that affect Interpretation of the Input File</A></H2> <H2><A NAME="SEC19" HREF="gperf_toc.html#TOC19">4.1 Specifying the Location of the Output File</A></H2>
<DL COMPACT> <DL COMPACT>
<DT><SAMP>`--output-file=<VAR>file</VAR>'</SAMP>
<DD>
Allows you to specify the name of the file to which the output is written.
</DL>
<P>
The results are written to standard output if no output file is specified
or if it is <SAMP>`-'</SAMP>.
</P>
<H2><A NAME="SEC20" HREF="gperf_toc.html#TOC20">4.2 Options that affect Interpretation of the Input File</A></H2>
<P>
These options are also available as declarations in the input file
(see section <A HREF="gperf_5.html#SEC11">3.1.1.2 Gperf Declarations</A>).
</P>
<DL COMPACT>
<DT><SAMP>`-e <VAR>keyword-delimiter-list</VAR>'</SAMP> <DT><SAMP>`-e <VAR>keyword-delimiter-list</VAR>'</SAMP>
<DD> <DD>
<DT><SAMP>`--delimiters=<VAR>keyword-delimiter-list</VAR>'</SAMP> <DT><SAMP>`--delimiters=<VAR>keyword-delimiter-list</VAR>'</SAMP>
<DD> <DD>
<A NAME="IDX16"></A> <A NAME="IDX38"></A>
Allows the user to provide a string containing delimiters used to Allows you to provide a string containing delimiters used to
separate keywords from their attributes. The default is ",\n". This separate keywords from their attributes. The default is ",". This
option is essential if you want to use keywords that have embedded option is essential if you want to use keywords that have embedded
commas or newlines. One useful trick is to use -e'TAB', where TAB is commas or newlines. One useful trick is to use -e'TAB', where TAB is
the literal tab character. the literal tab character.
@@ -47,12 +68,29 @@ part of the type declaration. Keywords and additional fields may follow
this, one group of fields per line. A set of examples for generating this, one group of fields per line. A set of examples for generating
perfect hash tables and functions for Ada, C, C++, Pascal, Modula 2, perfect hash tables and functions for Ada, C, C++, Pascal, Modula 2,
Modula 3 and JavaScript reserved words are distributed with this release. Modula 3 and JavaScript reserved words are distributed with this release.
<DT><SAMP>`--ignore-case'</SAMP>
<DD>
Consider upper and lower case ASCII characters as equivalent. The string
comparison will use a case-insensitive character comparison. Note that
locale-dependent case mappings are ignored. This option is therefore not
suitable if a properly internationalized or locale-aware case mapping
should be used. (For example, in a Turkish locale, the upper case equivalent
of the lowercase ASCII letter <SAMP>`i'</SAMP> is the non-ASCII character
<SAMP>`capital i with dot above'</SAMP>.) For this case, it is better to apply
an uppercase or lowercase conversion on the string before passing it to
the <CODE>gperf</CODE> generated function.
</DL> </DL>
<H2><A NAME="SEC16" HREF="gperf_toc.html#TOC16">4.2 Options to specify the Language for the Output Code</A></H2> <H2><A NAME="SEC21" HREF="gperf_toc.html#TOC21">4.3 Options to specify the Language for the Output Code</A></H2>
<P>
These options are also available as declarations in the input file
(see section <A HREF="gperf_5.html#SEC11">3.1.1.2 Gperf Declarations</A>).
</P>
<DL COMPACT> <DL COMPACT>
<DT><SAMP>`-L <VAR>generated-language-name</VAR>'</SAMP> <DT><SAMP>`-L <VAR>generated-language-name</VAR>'</SAMP>
@@ -66,23 +104,23 @@ option's argument. Languages handled are currently:
<DT><SAMP>`KR-C'</SAMP> <DT><SAMP>`KR-C'</SAMP>
<DD> <DD>
Old-style K&#38;R C. This language is understood by old-style C compilers and Old-style K&#38;R C. This language is understood by old-style C compilers and
ANSI C compilers, but ANSI C compilers may flag warnings (or even errors) ANSI C compilers, but ANSI C compilers may flag warnings (or even errors)
because of lacking <SAMP>`const'</SAMP>. because of lacking <SAMP>`const'</SAMP>.
<DT><SAMP>`C'</SAMP> <DT><SAMP>`C'</SAMP>
<DD> <DD>
Common C. This language is understood by ANSI C compilers, and also by Common C. This language is understood by ANSI C compilers, and also by
old-style C compilers, provided that you <CODE>#define const</CODE> to empty old-style C compilers, provided that you <CODE>#define const</CODE> to empty
for compilers which don't know about this keyword. for compilers which don't know about this keyword.
<DT><SAMP>`ANSI-C'</SAMP> <DT><SAMP>`ANSI-C'</SAMP>
<DD> <DD>
ANSI C. This language is understood by ANSI C compilers and C++ compilers. ANSI C. This language is understood by ANSI C compilers and C++ compilers.
<DT><SAMP>`C++'</SAMP> <DT><SAMP>`C++'</SAMP>
<DD> <DD>
C++. This language is understood by C++ compilers. C++. This language is understood by C++ compilers.
</DL> </DL>
The default is C. The default is C.
@@ -90,26 +128,32 @@ The default is C.
<DT><SAMP>`-a'</SAMP> <DT><SAMP>`-a'</SAMP>
<DD> <DD>
This option is supported for compatibility with previous releases of This option is supported for compatibility with previous releases of
<CODE>gperf</CODE>. It does not do anything. <CODE>gperf</CODE>. It does not do anything.
<DT><SAMP>`-g'</SAMP> <DT><SAMP>`-g'</SAMP>
<DD> <DD>
This option is supported for compatibility with previous releases of This option is supported for compatibility with previous releases of
<CODE>gperf</CODE>. It does not do anything. <CODE>gperf</CODE>. It does not do anything.
</DL> </DL>
<H2><A NAME="SEC17" HREF="gperf_toc.html#TOC17">4.3 Options for fine tuning Details in the Output Code</A></H2> <H2><A NAME="SEC22" HREF="gperf_toc.html#TOC22">4.4 Options for fine tuning Details in the Output Code</A></H2>
<P>
Most of these options are also available as declarations in the input file
(see section <A HREF="gperf_5.html#SEC11">3.1.1.2 Gperf Declarations</A>).
</P>
<DL COMPACT> <DL COMPACT>
<DT><SAMP>`-K <VAR>key-name</VAR>'</SAMP> <DT><SAMP>`-K <VAR>slot-name</VAR>'</SAMP>
<DD> <DD>
<DT><SAMP>`--slot-name=<VAR>key-name</VAR>'</SAMP> <DT><SAMP>`--slot-name=<VAR>slot-name</VAR>'</SAMP>
<DD> <DD>
<A NAME="IDX17"></A> <A NAME="IDX39"></A>
This option is only useful when option <SAMP>`-t'</SAMP> has been given. This option is only useful when option <SAMP>`-t'</SAMP> (or, equivalently, the
<SAMP>`%struct-type'</SAMP> declaration) has been given.
By default, the program assumes the structure component identifier for By default, the program assumes the structure component identifier for
the keyword is <SAMP>`name'</SAMP>. This option allows an arbitrary choice of the keyword is <SAMP>`name'</SAMP>. This option allows an arbitrary choice of
identifier for this component, although it still must occur as the first identifier for this component, although it still must occur as the first
@@ -119,16 +163,17 @@ field in your supplied <CODE>struct</CODE>.
<DD> <DD>
<DT><SAMP>`--initializer-suffix=<VAR>initializers</VAR>'</SAMP> <DT><SAMP>`--initializer-suffix=<VAR>initializers</VAR>'</SAMP>
<DD> <DD>
<A NAME="IDX18"></A> <A NAME="IDX40"></A>
This option is only useful when option <SAMP>`-t'</SAMP> has been given. This option is only useful when option <SAMP>`-t'</SAMP> (or, equivalently, the
<SAMP>`%struct-type'</SAMP> declaration) has been given.
It permits to specify initializers for the structure members following It permits to specify initializers for the structure members following
<VAR>key name</VAR> in empty hash table entries. The list of initializers <VAR>slot-name</VAR> in empty hash table entries. The list of initializers
should start with a comma. By default, the emitted code will should start with a comma. By default, the emitted code will
zero-initialize structure members following <VAR>key name</VAR>. zero-initialize structure members following <VAR>slot-name</VAR>.
<DT><SAMP>`-H <VAR>hash-function-name</VAR>'</SAMP> <DT><SAMP>`-H <VAR>hash-function-name</VAR>'</SAMP>
<DD> <DD>
<DT><SAMP>`--hash-fn-name=<VAR>hash-function-name</VAR>'</SAMP> <DT><SAMP>`--hash-function-name=<VAR>hash-function-name</VAR>'</SAMP>
<DD> <DD>
Allows you to specify the name for the generated hash function. Default Allows you to specify the name for the generated hash function. Default
name is <SAMP>`hash'</SAMP>. This option permits the use of two hash tables in name is <SAMP>`hash'</SAMP>. This option permits the use of two hash tables in
@@ -136,19 +181,19 @@ the same file.
<DT><SAMP>`-N <VAR>lookup-function-name</VAR>'</SAMP> <DT><SAMP>`-N <VAR>lookup-function-name</VAR>'</SAMP>
<DD> <DD>
<DT><SAMP>`--lookup-fn-name=<VAR>lookup-function-name</VAR>'</SAMP> <DT><SAMP>`--lookup-function-name=<VAR>lookup-function-name</VAR>'</SAMP>
<DD> <DD>
Allows you to specify the name for the generated lookup function. Allows you to specify the name for the generated lookup function.
Default name is <SAMP>`in_word_set'</SAMP>. This option permits completely Default name is <SAMP>`in_word_set'</SAMP>. This option permits multiple
automatic generation of perfect hash functions, especially when multiple generated hash functions to be used in the same application.
generated hash functions are used in the same application.
<DT><SAMP>`-Z <VAR>class-name</VAR>'</SAMP> <DT><SAMP>`-Z <VAR>class-name</VAR>'</SAMP>
<DD> <DD>
<DT><SAMP>`--class-name=<VAR>class-name</VAR>'</SAMP> <DT><SAMP>`--class-name=<VAR>class-name</VAR>'</SAMP>
<DD> <DD>
<A NAME="IDX19"></A> <A NAME="IDX41"></A>
This option is only useful when option <SAMP>`-L C++'</SAMP> has been given. It This option is only useful when option <SAMP>`-L C++'</SAMP> (or, equivalently,
the <SAMP>`%language=C++'</SAMP> declaration) has been given. It
allows you to specify the name of generated C++ class. Default name is allows you to specify the name of generated C++ class. Default name is
<CODE>Perfect_Hash</CODE>. <CODE>Perfect_Hash</CODE>.
@@ -158,12 +203,25 @@ allows you to specify the name of generated C++ class. Default name is
<DD> <DD>
This option specifies that all strings that will be passed as arguments This option specifies that all strings that will be passed as arguments
to the generated hash function and the generated lookup function will to the generated hash function and the generated lookup function will
solely consist of 7-bit ASCII characters (characters in the range 0..127). solely consist of 7-bit ASCII characters (bytes in the range 0..127).
(Note that the ANSI C functions <CODE>isalnum</CODE> and <CODE>isgraph</CODE> do (Note that the ANSI C functions <CODE>isalnum</CODE> and <CODE>isgraph</CODE> do
<EM>not</EM> guarantee that a character is in this range. Only an explicit <EM>not</EM> guarantee that a byte is in this range. Only an explicit
test like <SAMP>`c &#62;= 'A' &#38;&#38; c &#60;= 'Z''</SAMP> guarantees this.) This was the test like <SAMP>`c &#62;= 'A' &#38;&#38; c &#60;= 'Z''</SAMP> guarantees this.) This was the
default in versions of <CODE>gperf</CODE> earlier than 2.7; now the default is default in versions of <CODE>gperf</CODE> earlier than 2.7; now the default is
to assume 8-bit characters. to support 8-bit and multibyte characters.
<DT><SAMP>`-l'</SAMP>
<DD>
<DT><SAMP>`--compare-lengths'</SAMP>
<DD>
Compare keyword lengths before trying a string comparison. This option
is mandatory for binary comparisons (see section <A HREF="gperf_5.html#SEC17">3.3 Use of NUL bytes</A>). It also might
cut down on the number of string comparisons made during the lookup, since
keywords with different lengths are never compared via <CODE>strcmp</CODE>.
However, using <SAMP>`-l'</SAMP> might greatly increase the size of the
generated C code if the lookup table range is large (which implies that
the switch option <SAMP>`-S'</SAMP> or <SAMP>`%switch'</SAMP> is not enabled), since the length
table contains as many elements as there are entries in the lookup table.
<DT><SAMP>`-c'</SAMP> <DT><SAMP>`-c'</SAMP>
<DD> <DD>
@@ -198,35 +256,66 @@ include this header file himself to allow compilation of the code.
<DT><SAMP>`-G'</SAMP> <DT><SAMP>`-G'</SAMP>
<DD> <DD>
<DT><SAMP>`--global'</SAMP> <DT><SAMP>`--global-table'</SAMP>
<DD> <DD>
Generate the static table of keywords as a static global variable, Generate the static table of keywords as a static global variable,
rather than hiding it inside of the lookup function (which is the rather than hiding it inside of the lookup function (which is the
default behavior). default behavior).
<DT><SAMP>`-P'</SAMP>
<DD>
<DT><SAMP>`--pic'</SAMP>
<DD>
Optimize the generated table for inclusion in shared libraries. This
reduces the startup time of programs using a shared library containing
the generated code. If the option <SAMP>`-t'</SAMP> (or, equivalently, the
<SAMP>`%struct-type'</SAMP> declaration) is also given, the first field of the
user-defined struct must be of type <SAMP>`int'</SAMP>, not <SAMP>`char *'</SAMP>, because
it will contain offsets into the string pool instead of actual strings.
To convert such an offset to a string, you can use the expression
<SAMP>`stringpool + <VAR>o</VAR>'</SAMP>, where <VAR>o</VAR> is the offset. The string pool
name can be changed through the option <SAMP>`--string-pool-name'</SAMP>.
<DT><SAMP>`-Q <VAR>string-pool-name</VAR>'</SAMP>
<DD>
<DT><SAMP>`--string-pool-name=<VAR>string-pool-name</VAR>'</SAMP>
<DD>
Allows you to specify the name of the generated string pool created by
option <SAMP>`-P'</SAMP>. The default name is <SAMP>`stringpool'</SAMP>. This option
permits the use of two hash tables in the same file, with <SAMP>`-P'</SAMP> and
even when the option <SAMP>`-G'</SAMP> (or, equivalently, the <SAMP>`%global-table'</SAMP>
declaration) is given.
<DT><SAMP>`--null-strings'</SAMP>
<DD>
Use NULL strings instead of empty strings for empty keyword table entries.
This reduces the startup time of programs using a shared library containing
the generated code (but not as much as option <SAMP>`-P'</SAMP>), at the expense
of one more test-and-branch instruction at run time.
<DT><SAMP>`-W <VAR>hash-table-array-name</VAR>'</SAMP> <DT><SAMP>`-W <VAR>hash-table-array-name</VAR>'</SAMP>
<DD> <DD>
<DT><SAMP>`--word-array-name=<VAR>hash-table-array-name</VAR>'</SAMP> <DT><SAMP>`--word-array-name=<VAR>hash-table-array-name</VAR>'</SAMP>
<DD> <DD>
<A NAME="IDX20"></A> <A NAME="IDX42"></A>
Allows you to specify the name for the generated array containing the Allows you to specify the name for the generated array containing the
hash table. Default name is <SAMP>`wordlist'</SAMP>. This option permits the hash table. Default name is <SAMP>`wordlist'</SAMP>. This option permits the
use of two hash tables in the same file, even when the option <SAMP>`-G'</SAMP> use of two hash tables in the same file, even when the option <SAMP>`-G'</SAMP>
is given. (or, equivalently, the <SAMP>`%global-table'</SAMP> declaration) is given.
<DT><SAMP>`-S <VAR>total-switch-statements</VAR>'</SAMP> <DT><SAMP>`-S <VAR>total-switch-statements</VAR>'</SAMP>
<DD> <DD>
<DT><SAMP>`--switch=<VAR>total-switch-statements</VAR>'</SAMP> <DT><SAMP>`--switch=<VAR>total-switch-statements</VAR>'</SAMP>
<DD> <DD>
<A NAME="IDX21"></A> <A NAME="IDX43"></A>
Causes the generated C code to use a <CODE>switch</CODE> statement scheme, Causes the generated C code to use a <CODE>switch</CODE> statement scheme,
rather than an array lookup table. This can lead to a reduction in both rather than an array lookup table. This can lead to a reduction in both
time and space requirements for some keyfiles. The argument to this time and space requirements for some input files. The argument to this
option determines how many <CODE>switch</CODE> statements are generated. A option determines how many <CODE>switch</CODE> statements are generated. A
value of 1 generates 1 <CODE>switch</CODE> containing all the elements, a value of 1 generates 1 <CODE>switch</CODE> containing all the elements, a
value of 2 generates 2 tables with 1/2 the elements in each value of 2 generates 2 tables with 1/2 the elements in each
<CODE>switch</CODE>, etc. This is useful since many C compilers cannot <CODE>switch</CODE>, etc. This is useful since many C compilers cannot
correctly generate code for large <CODE>switch</CODE> statements. This option correctly generate code for large <CODE>switch</CODE> statements. This option
was inspired in part by Keith Bostic's original C program. was inspired in part by Keith Bostic's original C program.
<DT><SAMP>`-T'</SAMP> <DT><SAMP>`-T'</SAMP>
@@ -239,92 +328,66 @@ this option if the type is already defined elsewhere.
<DT><SAMP>`-p'</SAMP> <DT><SAMP>`-p'</SAMP>
<DD> <DD>
This option is supported for compatibility with previous releases of This option is supported for compatibility with previous releases of
<CODE>gperf</CODE>. It does not do anything. <CODE>gperf</CODE>. It does not do anything.
</DL> </DL>
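As a rough illustration of the offsets that option `-P` stores in place of `char *` pointers, the sketch below models a string pool in Python. It is not gperf output: the helper names `build_string_pool` and `string_at` are made up for this example, and `string_at` merely stands in for the C expression `stringpool + o` quoted in the description of `-P`.

```python
def build_string_pool(keywords):
    """Concatenate NUL-terminated keywords.

    Returns (pool, offsets), where offsets[i] is the position in the pool
    at which keywords[i] starts. Hypothetical helper, for illustration only.
    """
    pool = bytearray()
    offsets = []
    for word in keywords:
        offsets.append(len(pool))          # offset of this keyword in the pool
        pool += word.encode("ascii") + b"\0"
    return bytes(pool), offsets


def string_at(pool, offset):
    """Python stand-in for the C expression `stringpool + o`."""
    end = pool.index(b"\0", offset)        # the string runs up to the next NUL
    return pool[offset:end].decode("ascii")
```

With the keywords `if`, `else` and `while`, the offsets are 0, 3 and 8, and `string_at(pool, 3)` yields `"else"`. Storing small integers rather than pointers is what lets the table avoid per-entry relocations in a shared library, which is the startup-time saving the option description refers to.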
<H2><A NAME="SEC18" HREF="gperf_toc.html#TOC18">4.4 Options for changing the Algorithms employed by <CODE>gperf</CODE></A></H2> <H2><A NAME="SEC23" HREF="gperf_toc.html#TOC23">4.5 Options for changing the Algorithms employed by <CODE>gperf</CODE></A></H2>
<DL COMPACT> <DL COMPACT>
<DT><SAMP>`-k <VAR>keys</VAR>'</SAMP> <DT><SAMP>`-k <VAR>selected-byte-positions</VAR>'</SAMP>
<DD> <DD>
<DT><SAMP>`--key-positions=<VAR>keys</VAR>'</SAMP> <DT><SAMP>`--key-positions=<VAR>selected-byte-positions</VAR>'</SAMP>
<DD> <DD>
Allows selection of the character key positions used in the keywords' Allows selection of the byte positions used in the keywords'
hash function. The allowable choices range between 1-126, inclusive. hash function. The allowable choices range between 1-255, inclusive.
The positions are separated by commas, e.g., <SAMP>`-k 9,4,13,14'</SAMP>; The positions are separated by commas, e.g., <SAMP>`-k 9,4,13,14'</SAMP>;
ranges may be used, e.g., <SAMP>`-k 2-7'</SAMP>; and positions may occur ranges may be used, e.g., <SAMP>`-k 2-7'</SAMP>; and positions may occur
in any order. Furthermore, the meta-character '*' causes the generated in any order. Furthermore, the wildcard '*' causes the generated
hash function to consider <STRONG>all</STRONG> character positions in each key, hash function to consider <STRONG>all</STRONG> byte positions in each keyword,
whereas '$' instructs the hash function to use the "final character" whereas '$' instructs the hash function to use the "final byte"
of a key (this is the only way to use a character position greater than of a keyword (this is the only way to use a byte position greater than
126, incidentally). 255, incidentally).
For instance, the option <SAMP>`-k 1,2,4,6-10,'$''</SAMP> generates a hash For instance, the option <SAMP>`-k 1,2,4,6-10,'$''</SAMP> generates a hash
function that considers positions 1,2,4,6,7,8,9,10, plus the last function that considers positions 1,2,4,6,7,8,9,10, plus the last
character in each key (which may differ for each key, obviously). Keys byte in each keyword (which may be at a different position for each
with length less than the indicated key positions work properly, since keyword, obviously). Keywords
selected key positions exceeding the key length are simply not with length less than the indicated byte positions work properly, since
selected byte positions exceeding the keyword length are simply not
referenced in the hash function. referenced in the hash function.
<DT><SAMP>`-l'</SAMP> This option is not normally needed since version 2.8 of <CODE>gperf</CODE>;
<DD> the default byte positions are computed depending on the keyword set,
<DT><SAMP>`--compare-strlen'</SAMP> through a search that minimizes the number of byte positions.
<DD>
Compare key lengths before trying a string comparison. This might cut
down on the number of string comparisons made during the lookup, since
keys with different lengths are never compared via <CODE>strcmp</CODE>.
However, using <SAMP>`-l'</SAMP> might greatly increase the size of the
generated C code if the lookup table range is large (which implies that
the switch option <SAMP>`-S'</SAMP> is not enabled), since the length table
contains as many elements as there are entries in the lookup table.
This option is mandatory for binary comparisons (see section <A HREF="gperf_5.html#SEC13">3.3 Use of NUL characters</A>).
<DT><SAMP>`-D'</SAMP> <DT><SAMP>`-D'</SAMP>
<DD> <DD>
<DT><SAMP>`--duplicates'</SAMP> <DT><SAMP>`--duplicates'</SAMP>
<DD> <DD>
<A NAME="IDX22"></A> <A NAME="IDX44"></A>
Handle keywords whose key position sets hash to duplicate values. Handle keywords whose selected byte sets hash to duplicate values.
Duplicate hash values occur for two reasons: Duplicate hash values can occur if a set of keywords has the same names, but
possesses different attributes, or if the selected byte positions are not well
chosen. With the -D option <CODE>gperf</CODE> treats all these keywords as
<UL>
<LI>
Since <CODE>gperf</CODE> does not backtrack it is possible for it to process
all your input keywords without finding a unique mapping for each word.
However, frequently only a very small number of duplicates occur, and
the majority of keys still require one probe into the table.
<LI>
Sometimes a set of keys may have the same names, but possess different
attributes. With the -D option <CODE>gperf</CODE> treats all these keys as
part of an equivalence class and generates a perfect hash function with part of an equivalence class and generates a perfect hash function with
multiple comparisons for duplicate keys. It is up to you to completely multiple comparisons for duplicate keywords. It is up to you to completely
disambiguate the keywords by modifying the generated C code. However, disambiguate the keywords by modifying the generated C code. However,
<CODE>gperf</CODE> helps you out by organizing the output. <CODE>gperf</CODE> helps you out by organizing the output.
</UL>
Option <SAMP>`-D'</SAMP> is extremely useful for certain large or highly
redundant keyword sets, e.g., assembler instruction opcodes.
Using this option usually means that the generated hash function is no Using this option usually means that the generated hash function is no
longer perfect. On the other hand, it permits <CODE>gperf</CODE> to work on longer perfect. On the other hand, it permits <CODE>gperf</CODE> to work on
keyword sets that it otherwise could not handle. keyword sets that it otherwise could not handle.
<DT><SAMP>`-f <VAR>iteration-amount</VAR>'</SAMP> <DT><SAMP>`-m <VAR>iterations</VAR>'</SAMP>
<DD> <DD>
<DT><SAMP>`--fast=<VAR>iteration-amount</VAR>'</SAMP> <DT><SAMP>`--multiple-iterations=<VAR>iterations</VAR>'</SAMP>
<DD> <DD>
Generate the perfect hash function "fast". This decreases Perform multiple choices of the <SAMP>`-i'</SAMP> and <SAMP>`-j'</SAMP> values, and
<CODE>gperf</CODE>'s running time at the cost of minimizing generated choose the best results. This increases the running time by a factor of
table-size. The iteration amount represents the number of times to <VAR>iterations</VAR> but does a good job minimizing the generated table size.
iterate when resolving a collision. `0' means iterate by the number of
keywords. This option is probably most useful when used in conjunction
with options <SAMP>`-D'</SAMP> and/or <SAMP>`-S'</SAMP> for <EM>large</EM> keyword sets.
<DT><SAMP>`-i <VAR>initial-value</VAR>'</SAMP> <DT><SAMP>`-i <VAR>initial-value</VAR>'</SAMP>
<DD> <DD>
@@ -333,16 +396,17 @@ with options <SAMP>`-D'</SAMP> and/or <SAMP>`-S'</SAMP> for <EM>large</EM> keywo
Provides an initial <VAR>value</VAR> for the associate values array. Default Provides an initial <VAR>value</VAR> for the associate values array. Default
is 0. Increasing the initial value helps inflate the final table size, is 0. Increasing the initial value helps inflate the final table size,
possibly leading to more time efficient keyword lookups. Note that this possibly leading to more time efficient keyword lookups. Note that this
option is not particularly useful when <SAMP>`-S'</SAMP> is used. Also, option is not particularly useful when <SAMP>`-S'</SAMP> (or, equivalently,
<SAMP>`%switch'</SAMP>) is used. Also,
<SAMP>`-i'</SAMP> is overridden when the <SAMP>`-r'</SAMP> option is used. <SAMP>`-i'</SAMP> is overridden when the <SAMP>`-r'</SAMP> option is used.
<DT><SAMP>`-j <VAR>jump-value</VAR>'</SAMP> <DT><SAMP>`-j <VAR>jump-value</VAR>'</SAMP>
<DD> <DD>
<DT><SAMP>`--jump=<VAR>jump-value</VAR>'</SAMP> <DT><SAMP>`--jump=<VAR>jump-value</VAR>'</SAMP>
<DD> <DD>
<A NAME="IDX23"></A> <A NAME="IDX45"></A>
Affects the "jump value", i.e., how far to advance the associated Affects the "jump value", i.e., how far to advance the associated
character value upon collisions. <VAR>Jump-value</VAR> is rounded up to an byte value upon collisions. <VAR>Jump-value</VAR> is rounded up to an
odd number, the default is 5. If the <VAR>jump-value</VAR> is 0 <CODE>gperf</CODE> odd number, the default is 5. If the <VAR>jump-value</VAR> is 0 <CODE>gperf</CODE>
jumps by random amounts. jumps by random amounts.
@@ -354,24 +418,6 @@ Instructs the generator not to include the length of a keyword when
computing its hash value. This may save a few assembly instructions in computing its hash value. This may save a few assembly instructions in
the generated lookup table. the generated lookup table.
<DT><SAMP>`-o'</SAMP>
<DD>
<DT><SAMP>`--occurrence-sort'</SAMP>
<DD>
Reorders the keywords by sorting the keywords so that frequently
occuring key position set components appear first. A second reordering
pass follows so that keys with "already determined values" are placed
towards the front of the keylist. This may decrease the time required
to generate a perfect hash function for many keyword sets, and also
produce more minimal perfect hash functions. The reason for this is
that the reordering helps prune the search time by handling inevitable
collisions early in the search process. On the other hand, if the
number of keywords is <EM>very</EM> large using <SAMP>`-o'</SAMP> may
<EM>increase</EM> <CODE>gperf</CODE>'s execution time, since collisions will
begin earlier and continue throughout the remainder of keyword
processing. See Cichelli's paper from the January 1980 Communications
of the ACM for details.
<DT><SAMP>`-r'</SAMP> <DT><SAMP>`-r'</SAMP>
<DD> <DD>
<DT><SAMP>`--random'</SAMP> <DT><SAMP>`--random'</SAMP>
@@ -380,8 +426,7 @@ Utilizes randomness to initialize the associated values table. This
frequently generates solutions faster than using deterministic frequently generates solutions faster than using deterministic
initialization (which starts all associated values at 0). Furthermore, initialization (which starts all associated values at 0). Furthermore,
using the randomization option generally increases the size of the using the randomization option generally increases the size of the
table. If <CODE>gperf</CODE> has difficultly with a certain keyword set try using table.
<SAMP>`-r'</SAMP> or <SAMP>`-D'</SAMP>.
<DT><SAMP>`-s <VAR>size-multiple</VAR>'</SAMP> <DT><SAMP>`-s <VAR>size-multiple</VAR>'</SAMP>
<DD> <DD>
@@ -389,36 +434,31 @@ table. If <CODE>gperf</CODE> has difficultly with a certain keyword set try usi
<DD> <DD>
Affects the size of the generated hash table. The numeric argument for Affects the size of the generated hash table. The numeric argument for
this option indicates "how many times larger or smaller" the maximum this option indicates "how many times larger or smaller" the maximum
associated value range should be, in relationship to the number of keys. associated value range should be, in relationship to the number of keywords.
If the <VAR>size-multiple</VAR> is negative the maximum associated value is It can be written as an integer, a floating-point number or a fraction.
calculated by <EM>dividing</EM> it into the total number of keys. For For example, a value of 3 means "allow the maximum associated value to be
example, a value of 3 means "allow the maximum associated value to be about 3 times larger than the number of input keywords".
about 3 times larger than the number of input keys". Conversely, a value of 1/3 means "allow the maximum associated value to
be about 3 times smaller than the number of input keywords". Values
smaller than 1 are useful for limiting the overall size of the generated hash
table, though the option <SAMP>`-m'</SAMP> is better at this purpose.
Conversely, a value of -3 means "allow the maximum associated value to If `generate switch' option <SAMP>`-S'</SAMP> (or, equivalently, <SAMP>`%switch'</SAMP>) is
be about 3 times smaller than the number of input keys". Negative <EM>not</EM> enabled, the maximum
values are useful for limiting the overall size of the generated hash
table, though this usually increases the number of duplicate hash
values.
If `generate switch' option <SAMP>`-S'</SAMP> is <EM>not</EM> enabled, the maximum
associated value influences the static array table size, and a larger associated value influences the static array table size, and a larger
table should decrease the time required for an unsuccessful search, at table should decrease the time required for an unsuccessful search, at
the expense of extra table space. the expense of extra table space.
The default value is 1, thus the default maximum associated value is about The default value is 1, thus the default maximum associated value is about
the same size as the number of keys (for efficiency, the maximum the same size as the number of keywords (for efficiency, the maximum
associated value is always rounded up to a power of 2). The actual associated value is always rounded up to a power of 2). The actual
table size may vary somewhat, since this technique is essentially a table size may vary somewhat, since this technique is essentially a
heuristic. In particular, setting this value too high slows down heuristic.
<CODE>gperf</CODE>'s runtime, since it must search through a much larger range
of values. Judicious use of the <SAMP>`-f'</SAMP> option helps alleviate this
overhead, however.
</DL> </DL>
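The byte selection described for option `-k` can be modeled in a few lines of Python. This is only a sketch of the documented behavior, not gperf's implementation: `parse_positions` and `selected_bytes` are hypothetical names, the `*` wildcard is left out, the 1-255 range is the one documented for version 3.0, and a final byte that is also an explicitly selected position is simply listed twice here.

```python
def parse_positions(spec):
    """Parse a -k style list such as "1,2,4,6-10,$".

    Returns (sorted 1-based positions, whether "$" was given).
    The "*" wildcard is deliberately not modeled.
    """
    positions = set()
    use_last = False
    for item in spec.split(","):
        if item == "$":
            use_last = True
        elif "-" in item:
            low, high = item.split("-")
            positions.update(range(int(low), int(high) + 1))
        else:
            positions.add(int(item))
    for p in positions:
        # Range documented for gperf 3.0; earlier releases allowed 1-126.
        if not 1 <= p <= 255:
            raise ValueError(f"position {p} outside 1-255")
    return sorted(positions), use_last


def selected_bytes(keyword, spec):
    """Return the bytes of `keyword` that the hash function would look at."""
    positions, use_last = parse_positions(spec)
    data = keyword.encode("ascii")
    # Positions beyond the keyword length are simply not referenced.
    chosen = [data[p - 1] for p in positions if p <= len(data)]
    if use_last and data:
        chosen.append(data[-1])            # "$" selects the final byte
    return bytes(chosen)
```

For example, `selected_bytes("keyword", "1,2,4,6-10,$")` returns `b"kewrdd"`: positions 8, 9 and 10 lie beyond the seven-byte keyword and are skipped, and `$` contributes the final `d`.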
<H2><A NAME="SEC19" HREF="gperf_toc.html#TOC19">4.5 Informative Output</A></H2> <H2><A NAME="SEC24" HREF="gperf_toc.html#TOC24">4.6 Informative Output</A></H2>
<DL COMPACT> <DL COMPACT>
@@ -448,6 +488,6 @@ option is enabled.
</DL> </DL>
<P><HR><P> <P><HR><P>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_5.html">previous</A>, <A HREF="gperf_7.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_5.html">previous</A>, <A HREF="gperf_7.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
</BODY> </BODY>
</HTML> </HTML>


@@ -1,16 +1,16 @@
<HTML> <HTML>
<HEAD> <HEAD>
<!-- This HTML file has been created by texi2html 1.51 <!-- This HTML file has been created by texi2html 1.51
from gperf.texi on 26 September 2000 --> from gperf.texi on 7 May 2003 -->
<TITLE>Perfect Hash Function Generator - 5 Known Bugs and Limitations with gperf</TITLE> <TITLE>Perfect Hash Function Generator - 5 Known Bugs and Limitations with gperf</TITLE>
</HEAD> </HEAD>
<BODY> <BODY>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_6.html">previous</A>, <A HREF="gperf_8.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_6.html">previous</A>, <A HREF="gperf_8.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
<P><HR><P> <P><HR><P>
<H1><A NAME="SEC20" HREF="gperf_toc.html#TOC20">5 Known Bugs and Limitations with <CODE>gperf</CODE></A></H1> <H1><A NAME="SEC25" HREF="gperf_toc.html#TOC25">5 Known Bugs and Limitations with <CODE>gperf</CODE></A></H1>
<P> <P>
The following are some limitations with the current release of The following are some limitations with the current release of
@@ -29,16 +29,6 @@ work efficiently on much larger keyword sets (over 15,000 keywords).
When processing large keyword sets it helps greatly to have over 8 megs When processing large keyword sets it helps greatly to have over 8 megs
of RAM. of RAM.
However, since <CODE>gperf</CODE> does not backtrack no guaranteed solution
occurs on every run. On the other hand, it is usually easy to obtain a
solution by varying the option parameters. In particular, try the
<SAMP>`-r'</SAMP> option, and also try changing the default arguments to the
<SAMP>`-s'</SAMP> and <SAMP>`-j'</SAMP> options. To <EM>guarantee</EM> a solution, use
the <SAMP>`-D'</SAMP> and <SAMP>`-S'</SAMP> options, although the final results are not
likely to be a <EM>perfect</EM> hash function anymore! Finally, use the
<SAMP>`-f'</SAMP> option if you want <CODE>gperf</CODE> to generate the perfect hash
function <EM>fast</EM>, with less emphasis on making it minimal.
<LI> <LI>
The size of the generated static keyword array can get <EM>extremely</EM> The size of the generated static keyword array can get <EM>extremely</EM>
@@ -47,20 +37,20 @@ similar. This tends to slow down the compilation of the generated C
code, and <EM>greatly</EM> inflates the object code size. If this code, and <EM>greatly</EM> inflates the object code size. If this
situation occurs, consider using the <SAMP>`-S'</SAMP> option to reduce data situation occurs, consider using the <SAMP>`-S'</SAMP> option to reduce data
size, potentially increasing keyword recognition time a negligible size, potentially increasing keyword recognition time a negligible
amount. Since many C compilers cannot correctly generated code for amount. Since many C compilers cannot correctly generate code for
large switch statements it is important to qualify the <VAR>-S</VAR> option large switch statements it is important to qualify the <VAR>-S</VAR> option
with an appropriate numerical argument that controls the number of with an appropriate numerical argument that controls the number of
switch statements generated. switch statements generated.
<LI> <LI>
The maximum number of key positions selected for a given key has an The maximum number of selected byte positions has an
arbitrary limit of 126. This restriction should be removed, and if arbitrary limit of 255. This restriction should be removed, and if
anyone considers this a problem write me and let me know so I can remove anyone considers this a problem write me and let me know so I can remove
the constraint. the constraint.
</UL> </UL>
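To get a feel for the numerical argument of `-S` mentioned in the list above, the following Python sketch divides a list of entries into the requested number of groups of nearly equal size. It only illustrates the 1/n arithmetic given in the option's description; the function name `split_into_switches` is hypothetical, and the way gperf actually assigns keywords to `switch` statements may differ.

```python
def split_into_switches(entries, total_switch_statements):
    """Split `entries` into that many consecutive groups of nearly equal size."""
    if total_switch_statements < 1:
        raise ValueError("need at least one switch statement")
    # Never create more groups than there are entries (but always at least one).
    count = min(total_switch_statements, len(entries)) or 1
    base, extra = divmod(len(entries), count)
    groups = []
    start = 0
    for i in range(count):
        size = base + (1 if i < extra else 0)   # spread the remainder evenly
        groups.append(entries[start:start + size])
        start += size
    return groups
```

With ten entries, an argument of 1 keeps everything in one group, 2 gives two groups of five, and 3 gives groups of four, three and three. A larger argument therefore keeps each individual `switch` small, which is the property that matters for compilers that mishandle very large `switch` statements.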
<P><HR><P> <P><HR><P>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_6.html">previous</A>, <A HREF="gperf_8.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_6.html">previous</A>, <A HREF="gperf_8.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
</BODY> </BODY>
</HTML> </HTML>


@@ -1,16 +1,16 @@
<HTML> <HTML>
<HEAD> <HEAD>
<!-- This HTML file has been created by texi2html 1.51 <!-- This HTML file has been created by texi2html 1.51
from gperf.texi on 26 September 2000 --> from gperf.texi on 7 May 2003 -->
<TITLE>Perfect Hash Function Generator - 6 Things Still Left to Do</TITLE> <TITLE>Perfect Hash Function Generator - 6 Things Still Left to Do</TITLE>
</HEAD> </HEAD>
<BODY> <BODY>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_7.html">previous</A>, <A HREF="gperf_9.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_7.html">previous</A>, <A HREF="gperf_9.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
<P><HR><P> <P><HR><P>
<H1><A NAME="SEC21" HREF="gperf_toc.html#TOC21">6 Things Still Left to Do</A></H1> <H1><A NAME="SEC26" HREF="gperf_toc.html#TOC26">6 Things Still Left to Do</A></H1>
<P> <P>
It should be "relatively" easy to replace the current perfect hash It should be "relatively" easy to replace the current perfect hash
@@ -23,19 +23,10 @@ worthwhile improvements include:
<UL> <UL>
<LI> <LI>
Make the algorithm more robust. At present, the program halts with an
error diagnostic if it can't find a direct solution and the <SAMP>`-D'</SAMP>
option is not enabled. A more comprehensive, albeit computationally
expensive, approach would employ backtracking or enable alternative
options and retry. It's not clear how helpful this would be, in
general, since most search sets are rather small in practice.
<LI>
Another useful extension involves modifying the program to generate Another useful extension involves modifying the program to generate
"minimal" perfect hash functions (under certain circumstances, the "minimal" perfect hash functions (under certain circumstances, the
current version can be rather extravagant in the generated table size). current version can be rather extravagant in the generated table size).
Again, this is mostly of theoretical interest, since a sparse table This is mostly of theoretical interest, since a sparse table
often produces faster lookups, and use of the <SAMP>`-S'</SAMP> <CODE>switch</CODE> often produces faster lookups, and use of the <SAMP>`-S'</SAMP> <CODE>switch</CODE>
option can minimize the data size, at the expense of slightly longer option can minimize the data size, at the expense of slightly longer
lookups (note that the gcc compiler generally produces good code for lookups (note that the gcc compiler generally produces good code for
@@ -44,11 +35,11 @@ lookups (note that the gcc compiler generally produces good code for
<LI> <LI>
In addition to improving the algorithm, it would also be useful to In addition to improving the algorithm, it would also be useful to
generate a C++ class or Ada package as the code output, in addition to generate an Ada package as the code output, in addition to the current
the current C routines. C and C++ routines.
</UL> </UL>
<P><HR><P> <P><HR><P>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_7.html">previous</A>, <A HREF="gperf_9.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_7.html">previous</A>, <A HREF="gperf_9.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
</BODY> </BODY>
</HTML> </HTML>

@@ -1,30 +1,95 @@
<HTML> <HTML>
<HEAD> <HEAD>
<!-- This HTML file has been created by texi2html 1.51 <!-- This HTML file has been created by texi2html 1.51
from gperf.texi on 26 September 2000 --> from gperf.texi on 7 May 2003 -->
<TITLE>Perfect Hash Function Generator - 7 Implementation Details of GNU gperf</TITLE> <TITLE>Perfect Hash Function Generator - 7 Bibliography</TITLE>
</HEAD> </HEAD>
<BODY> <BODY>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_8.html">previous</A>, <A HREF="gperf_10.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_8.html">previous</A>, <A HREF="gperf_10.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
<P><HR><P> <P><HR><P>
<H1><A NAME="SEC22" HREF="gperf_toc.html#TOC22">7 Implementation Details of GNU <CODE>gperf</CODE></A></H1> <H1><A NAME="SEC27" HREF="gperf_toc.html#TOC27">7 Bibliography</A></H1>
<P> <P>
A paper describing the high-level description of the data structures and [1] Chang, C.C.: <I>A Scheme for Constructing Ordered Minimal Perfect
algorithms used to implement <CODE>gperf</CODE> will soon be available. This Hashing Functions</I> Information Sciences 39(1986), 187-195.
paper is useful not only from a maintenance and enhancement perspective,
but also because they demonstrate several clever and useful programming
techniques, e.g., `Iteration Number' boolean arrays, double
hashing, a "safe" and efficient method for reading arbitrarily long
input from a file, and a provably optimal algorithm for simultaneously
determining both the minimum and maximum elements in a list.
</P> </P>
<P>
[2] Cichelli, Richard J. <I>Author's Response to "On Cichelli's Minimal Perfect Hash
Functions Method"</I> Communications of the ACM, 23, 12(December 1980), 729.
</P>
<P>
[3] Cichelli, Richard J. <I>Minimal Perfect Hash Functions Made Simple</I>
Communications of the ACM, 23, 1(January 1980), 17-19.
</P>
<P>
[4] Cook, C. R. and Oldehoeft, R.R. <I>A Letter Oriented Minimal
Perfect Hashing Function</I> SIGPLAN Notices, 17, 9(September 1982), 18-27.
</P>
<P>
[5] Cormack, G. V. and Horspool, R. N. S. and Kaiserwerth, M.
<I>Practical Perfect Hashing</I> Computer Journal, 28, 1(January 1985), 54-58.
</P>
<P>
[6] Jaeschke, G. <I>Reciprocal Hashing: A Method for Generating Minimal
Perfect Hashing Functions</I> Communications of the ACM, 24, 12(December
1981), 829-833.
</P>
<P>
[7] Jaeschke, G. and Osterburg, G. <I>On Cichelli's Minimal Perfect
Hash Functions Method</I> Communications of the ACM, 23, 12(December 1980),
728-729.
</P>
<P>
[8] Sager, Thomas J. <I>A Polynomial Time Generator for Minimal Perfect
Hash Functions</I> Communications of the ACM, 28, 5(December 1985), 523-532
</P>
<P>
[9] Schmidt, Douglas C. <I>GPERF: A Perfect Hash Function Generator</I>
Second USENIX C++ Conference Proceedings, April 1990.
</P>
<P>
[10] Schmidt, Douglas C. <I>GPERF: A Perfect Hash Function Generator</I>
C++ Report, SIGS 10 10 (November/December 1998).
</P>
<P>
[11] Sebesta, R.W. and Taylor, M.A. <I>Minimal Perfect Hash Functions
for Reserved Word Lists</I> SIGPLAN Notices, 20, 12(September 1985), 47-53.
</P>
<P>
[12] Sprugnoli, R. <I>Perfect Hashing Functions: A Single Probe
Retrieving Method for Static Sets</I> Communications of the ACM, 20
11(November 1977), 841-850.
</P>
<P>
[13] Stallman, Richard M. <I>Using and Porting GNU CC</I> Free Software Foundation,
1988.
</P>
<P>
[14] Stroustrup, Bjarne <I>The C++ Programming Language.</I> Addison-Wesley, 1986.
</P>
<P>
[15] Tiemann, Michael D. <I>User's Guide to GNU C++</I> Free Software
Foundation, 1989.
</P>
<P><HR><P> <P><HR><P>
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_8.html">previous</A>, <A HREF="gperf_10.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>. Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_8.html">previous</A>, <A HREF="gperf_10.html">next</A>, <A HREF="gperf_10.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
</BODY> </BODY>
</HTML> </HTML>

@@ -1,15 +1,16 @@
<HTML> <HTML>
<HEAD> <HEAD>
<!-- This HTML file has been created by texi2html 1.51 <!-- This HTML file has been created by texi2html 1.51
from gperf.texi on 26 September 2000 --> from gperf.texi on 7 May 2003 -->
<TITLE>Perfect Hash Function Generator - Table of Contents</TITLE> <TITLE>Perfect Hash Function Generator - Table of Contents</TITLE>
</HEAD> </HEAD>
<BODY> <BODY>
<H1>User's Guide to <CODE>gperf</CODE> 2.7.2</H1> <H1>User's Guide to <CODE>gperf</CODE> 3.0</H1>
<H2>The GNU Perfect Hash Function Generator</H2> <H2>The GNU Perfect Hash Function Generator</H2>
<H2>Edition 2.7.2, 26 September 2000</H2> <H2>Edition 3.0, 7 May 2003</H2>
<ADDRESS>Douglas C. Schmidt</ADDRESS> <ADDRESS>Douglas C. Schmidt</ADDRESS>
<ADDRESS>Bruno Haible</ADDRESS>
<P> <P>
<P><HR><P> <P><HR><P>
<UL> <UL>
@@ -25,29 +26,35 @@
<UL> <UL>
<LI><A NAME="TOC8" HREF="gperf_5.html#SEC8">3.1 Input Format to <CODE>gperf</CODE></A> <LI><A NAME="TOC8" HREF="gperf_5.html#SEC8">3.1 Input Format to <CODE>gperf</CODE></A>
<UL> <UL>
<LI><A NAME="TOC9" HREF="gperf_5.html#SEC9">3.1.1 <CODE>struct</CODE> Declarations and C Code Inclusion</A> <LI><A NAME="TOC9" HREF="gperf_5.html#SEC9">3.1.1 Declarations</A>
<LI><A NAME="TOC10" HREF="gperf_5.html#SEC10">3.1.2 Format for Keyword Entries</A>
<LI><A NAME="TOC11" HREF="gperf_5.html#SEC11">3.1.3 Including Additional C Functions</A>
</UL>
<LI><A NAME="TOC12" HREF="gperf_5.html#SEC12">3.2 Output Format for Generated C Code with <CODE>gperf</CODE></A>
<LI><A NAME="TOC13" HREF="gperf_5.html#SEC13">3.3 Use of NUL characters</A>
</UL>
<LI><A NAME="TOC14" HREF="gperf_6.html#SEC14">4 Invoking <CODE>gperf</CODE></A>
<UL> <UL>
<LI><A NAME="TOC15" HREF="gperf_6.html#SEC15">4.1 Options that affect Interpretation of the Input File</A> <LI><A NAME="TOC10" HREF="gperf_5.html#SEC10">3.1.1.1 User-supplied <CODE>struct</CODE></A>
<LI><A NAME="TOC16" HREF="gperf_6.html#SEC16">4.2 Options to specify the Language for the Output Code</A> <LI><A NAME="TOC11" HREF="gperf_5.html#SEC11">3.1.1.2 Gperf Declarations</A>
<LI><A NAME="TOC17" HREF="gperf_6.html#SEC17">4.3 Options for fine tuning Details in the Output Code</A> <LI><A NAME="TOC12" HREF="gperf_5.html#SEC12">3.1.1.3 C Code Inclusion</A>
<LI><A NAME="TOC18" HREF="gperf_6.html#SEC18">4.4 Options for changing the Algorithms employed by <CODE>gperf</CODE></A>
<LI><A NAME="TOC19" HREF="gperf_6.html#SEC19">4.5 Informative Output</A>
</UL> </UL>
<LI><A NAME="TOC20" HREF="gperf_7.html#SEC20">5 Known Bugs and Limitations with <CODE>gperf</CODE></A> <LI><A NAME="TOC13" HREF="gperf_5.html#SEC13">3.1.2 Format for Keyword Entries</A>
<LI><A NAME="TOC21" HREF="gperf_8.html#SEC21">6 Things Still Left to Do</A> <LI><A NAME="TOC14" HREF="gperf_5.html#SEC14">3.1.3 Including Additional C Functions</A>
<LI><A NAME="TOC22" HREF="gperf_9.html#SEC22">7 Implementation Details of GNU <CODE>gperf</CODE></A> <LI><A NAME="TOC15" HREF="gperf_5.html#SEC15">3.1.4 Where to place directives for GNU <CODE>indent</CODE>.</A>
<LI><A NAME="TOC23" HREF="gperf_10.html#SEC23">8 Bibliography</A> </UL>
<LI><A NAME="TOC24" HREF="gperf_11.html#SEC24">Concept Index</A> <LI><A NAME="TOC16" HREF="gperf_5.html#SEC16">3.2 Output Format for Generated C Code with <CODE>gperf</CODE></A>
<LI><A NAME="TOC17" HREF="gperf_5.html#SEC17">3.3 Use of NUL bytes</A>
</UL>
<LI><A NAME="TOC18" HREF="gperf_6.html#SEC18">4 Invoking <CODE>gperf</CODE></A>
<UL>
<LI><A NAME="TOC19" HREF="gperf_6.html#SEC19">4.1 Specifying the Location of the Output File</A>
<LI><A NAME="TOC20" HREF="gperf_6.html#SEC20">4.2 Options that affect Interpretation of the Input File</A>
<LI><A NAME="TOC21" HREF="gperf_6.html#SEC21">4.3 Options to specify the Language for the Output Code</A>
<LI><A NAME="TOC22" HREF="gperf_6.html#SEC22">4.4 Options for fine tuning Details in the Output Code</A>
<LI><A NAME="TOC23" HREF="gperf_6.html#SEC23">4.5 Options for changing the Algorithms employed by <CODE>gperf</CODE></A>
<LI><A NAME="TOC24" HREF="gperf_6.html#SEC24">4.6 Informative Output</A>
</UL>
<LI><A NAME="TOC25" HREF="gperf_7.html#SEC25">5 Known Bugs and Limitations with <CODE>gperf</CODE></A>
<LI><A NAME="TOC26" HREF="gperf_8.html#SEC26">6 Things Still Left to Do</A>
<LI><A NAME="TOC27" HREF="gperf_9.html#SEC27">7 Bibliography</A>
<LI><A NAME="TOC28" HREF="gperf_10.html#SEC28">Concept Index</A>
</UL> </UL>
<P><HR><P> <P><HR><P>
This document was generated on 26 September 2000 using the This document was generated on 7 May 2003 using the
<A HREF="http://wwwcn.cern.ch/dci/texi2html/">texi2html</A> <A HREF="http://wwwcn.cern.ch/dci/texi2html/">texi2html</A>
translator version 1.51.</P> translator version 1.51.</P>
</BODY> </BODY>