mirror of https://git.savannah.gnu.org/git/gperf.git

Regenerated.

Bruno Haible
2000-08-20 18:44:23 +00:00
parent 4bdae61027
commit 745d7fc2fb
23 changed files with 4031 additions and 3212 deletions


@@ -1,63 +1,51 @@
 <HTML>
 <HEAD>
 <!-- This HTML file has been created by texi2html 1.51
-from gperf.texi on 15 April 1998 -->
+from gperf.texi on 20 August 2000 -->
-<TITLE>User's Guide to gperf - 5 Known Bugs and Limitations with gperf</TITLE>
+<TITLE>Perfect Hash Function Generator - 6 Things Still Left to Do</TITLE>
 </HEAD>
 <BODY>
 Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_7.html">previous</A>, <A HREF="gperf_9.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
 <P><HR><P>
-<H1><A NAME="SEC20" HREF="gperf_toc.html#TOC20">5 Known Bugs and Limitations with <CODE>gperf</CODE></A></H1>
+<H1><A NAME="SEC21" HREF="gperf_toc.html#TOC21">6 Things Still Left to Do</A></H1>
 <P>
-The following are some limitations with the current release of
-<CODE>gperf</CODE>:
+It should be "relatively" easy to replace the current perfect hash
+function algorithm with a more exhaustive approach; the perfect hash
+module is essential independent from other program modules. Additional
+worthwhile improvements include:
 </P>
 <UL>
 <LI>
-The <CODE>gperf</CODE> utility is tuned to execute quickly, and works quickly
-for small to medium size data sets (around 1000 keywords). It is
-extremely useful for maintaining perfect hash functions for compiler
-keyword sets. Several recent enhancements now enable <CODE>gperf</CODE> to
-work efficiently on much larger keyword sets (over 15,000 keywords).
-When processing large keyword sets it helps greatly to have over 8 megs
-of RAM.
-However, since <CODE>gperf</CODE> does not backtrack no guaranteed solution
-occurs on every run. On the other hand, it is usually easy to obtain a
-solution by varying the option parameters. In particular, try the
-<SAMP>`-r'</SAMP> option, and also try changing the default arguments to the
-<SAMP>`-s'</SAMP> and <SAMP>`-j'</SAMP> options. To <EM>guarantee</EM> a solution, use
-the <SAMP>`-D'</SAMP> and <SAMP>`-S'</SAMP> options, although the final results are not
-likely to be a <EM>perfect</EM> hash function anymore! Finally, use the
-<SAMP>`-f'</SAMP> option if you want <CODE>gperf</CODE> to generate the perfect hash
-function <EM>fast</EM>, with less emphasis on making it minimal.
+Make the algorithm more robust. At present, the program halts with an
+error diagnostic if it can't find a direct solution and the <SAMP>`-D'</SAMP>
+option is not enabled. A more comprehensive, albeit computationally
+expensive, approach would employ backtracking or enable alternative
+options and retry. It's not clear how helpful this would be, in
+general, since most search sets are rather small in practice.
 <LI>
-The size of the generate static keyword array can get <EM>extremely</EM>
-large if the input keyword file is large or if the keywords are quite
-similar. This tends to slow down the compilation of the generated C
-code, and <EM>greatly</EM> inflates the object code size. If this
-situation occurs, consider using the <SAMP>`-S'</SAMP> option to reduce data
-size, potentially increasing keyword recognition time a negligible
-amount. Since many C compilers cannot correctly generated code for
-large switch statements it is important to qualify the <VAR>-S</VAR> option
-with an appropriate numerical argument that controls the number of
-switch statements generated.
+Another useful extension involves modifying the program to generate
+"minimal" perfect hash functions (under certain circumstances, the
+current version can be rather extravagant in the generated table size).
+Again, this is mostly of theoretical interest, since a sparse table
+often produces faster lookups, and use of the <SAMP>`-S'</SAMP> <CODE>switch</CODE>
+option can minimize the data size, at the expense of slightly longer
+lookups (note that the gcc compiler generally produces good code for
+<CODE>switch</CODE> statements, reducing the need for more complex schemes).
 <LI>
-The maximum number of key positions selected for a given key has an
-arbitrary limit of 126. This restriction should be removed, and if
-anyone considers this a problem write me and let me know so I can remove
-the constraint.
+In addition to improving the algorithm, it would also be useful to
+generate a C++ class or Ada package as the code output, in addition to
+the current C routines.
 </UL>
 <P><HR><P>
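
Both the removed and the added text refer to gperf's command-line options (`-r', `-s', `-j', `-D', `-S', `-f') and to the C routines the tool emits, without showing an invocation. The sketch below is illustrative only: the keyword file name (keys.gperf), its contents, the option values, and the default lookup routine name in_word_set are assumptions about typical usage, not anything this diff prescribes.

/*
 * Hypothetical keyword file keys.gperf (assumed contents):
 *
 *     %{
 *     #include <string.h>
 *     %}
 *     %%
 *     break
 *     case
 *     continue
 *     default
 *     %%
 *
 * Possible invocations corresponding to the options discussed above
 * (gperf writes the generated C code to standard output):
 *
 *     gperf keys.gperf > keys.c          # default search for a perfect hash
 *     gperf -j1 -r keys.gperf > keys.c   # vary the jump value, randomize the seed
 *     gperf -D -S1 keys.gperf > keys.c   # guaranteed solution using one switch
 *                                        # statement; may no longer be perfect
 */
#include <stdio.h>
#include <string.h>

/* Lookup routine emitted into keys.c by gperf (default name; with no
 * struct type it returns the matched keyword, or 0 on a miss). */
extern const char *in_word_set(const char *str, unsigned int len);

int main(void)
{
    const char *word = "continue";

    if (in_word_set(word, strlen(word)))
        printf("%s is a keyword\n", word);
    else
        printf("%s is not a keyword\n", word);
    return 0;
}

Compiled together with the generated keys.c, this exercises the trade-off both versions of the text describe: the switch-based output selected by `-S' keeps the static table small at the cost of slightly longer lookups.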