mirror of
https://git.savannah.gnu.org/git/gperf.git
synced 2025-12-02 13:09:22 +00:00
Regenerated.
This commit is contained in:
@@ -1,63 +1,51 @@
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<!-- This HTML file has been created by texi2html 1.51
|
||||
from gperf.texi on 15 April 1998 -->
|
||||
from gperf.texi on 20 August 2000 -->
|
||||
|
||||
<TITLE>User's Guide to gperf - 5 Known Bugs and Limitations with gperf</TITLE>
|
||||
<TITLE>Perfect Hash Function Generator - 6 Things Still Left to Do</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
Go to the <A HREF="gperf_1.html">first</A>, <A HREF="gperf_7.html">previous</A>, <A HREF="gperf_9.html">next</A>, <A HREF="gperf_11.html">last</A> section, <A HREF="gperf_toc.html">table of contents</A>.
|
||||
<P><HR><P>
|
||||
|
||||
|
||||
<H1><A NAME="SEC20" HREF="gperf_toc.html#TOC20">5 Known Bugs and Limitations with <CODE>gperf</CODE></A></H1>
|
||||
<H1><A NAME="SEC21" HREF="gperf_toc.html#TOC21">6 Things Still Left to Do</A></H1>
|
||||
|
||||
<P>
|
||||
The following are some limitations with the current release of
|
||||
<CODE>gperf</CODE>:
|
||||
It should be "relatively" easy to replace the current perfect hash
|
||||
function algorithm with a more exhaustive approach; the perfect hash
|
||||
module is essential independent from other program modules. Additional
|
||||
worthwhile improvements include:
|
||||
|
||||
</P>
|
||||
|
||||
<UL>
|
||||
<LI>
|
||||
|
||||
The <CODE>gperf</CODE> utility is tuned to execute quickly, and works quickly
|
||||
for small to medium size data sets (around 1000 keywords). It is
|
||||
extremely useful for maintaining perfect hash functions for compiler
|
||||
keyword sets. Several recent enhancements now enable <CODE>gperf</CODE> to
|
||||
work efficiently on much larger keyword sets (over 15,000 keywords).
|
||||
When processing large keyword sets it helps greatly to have over 8 megs
|
||||
of RAM.
|
||||
|
||||
However, since <CODE>gperf</CODE> does not backtrack no guaranteed solution
|
||||
occurs on every run. On the other hand, it is usually easy to obtain a
|
||||
solution by varying the option parameters. In particular, try the
|
||||
<SAMP>`-r'</SAMP> option, and also try changing the default arguments to the
|
||||
<SAMP>`-s'</SAMP> and <SAMP>`-j'</SAMP> options. To <EM>guarantee</EM> a solution, use
|
||||
the <SAMP>`-D'</SAMP> and <SAMP>`-S'</SAMP> options, although the final results are not
|
||||
likely to be a <EM>perfect</EM> hash function anymore! Finally, use the
|
||||
<SAMP>`-f'</SAMP> option if you want <CODE>gperf</CODE> to generate the perfect hash
|
||||
function <EM>fast</EM>, with less emphasis on making it minimal.
|
||||
Make the algorithm more robust. At present, the program halts with an
|
||||
error diagnostic if it can't find a direct solution and the <SAMP>`-D'</SAMP>
|
||||
option is not enabled. A more comprehensive, albeit computationally
|
||||
expensive, approach would employ backtracking or enable alternative
|
||||
options and retry. It's not clear how helpful this would be, in
|
||||
general, since most search sets are rather small in practice.
|
||||
|
||||
<LI>
|
||||
|
||||
The size of the generate static keyword array can get <EM>extremely</EM>
|
||||
large if the input keyword file is large or if the keywords are quite
|
||||
similar. This tends to slow down the compilation of the generated C
|
||||
code, and <EM>greatly</EM> inflates the object code size. If this
|
||||
situation occurs, consider using the <SAMP>`-S'</SAMP> option to reduce data
|
||||
size, potentially increasing keyword recognition time a negligible
|
||||
amount. Since many C compilers cannot correctly generated code for
|
||||
large switch statements it is important to qualify the <VAR>-S</VAR> option
|
||||
with an appropriate numerical argument that controls the number of
|
||||
switch statements generated.
|
||||
Another useful extension involves modifying the program to generate
|
||||
"minimal" perfect hash functions (under certain circumstances, the
|
||||
current version can be rather extravagant in the generated table size).
|
||||
Again, this is mostly of theoretical interest, since a sparse table
|
||||
often produces faster lookups, and use of the <SAMP>`-S'</SAMP> <CODE>switch</CODE>
|
||||
option can minimize the data size, at the expense of slightly longer
|
||||
lookups (note that the gcc compiler generally produces good code for
|
||||
<CODE>switch</CODE> statements, reducing the need for more complex schemes).
|
||||
|
||||
<LI>
|
||||
|
||||
The maximum number of key positions selected for a given key has an
|
||||
arbitrary limit of 126. This restriction should be removed, and if
|
||||
anyone considers this a problem write me and let me know so I can remove
|
||||
the constraint.
|
||||
In addition to improving the algorithm, it would also be useful to
|
||||
generate a C++ class or Ada package as the code output, in addition to
|
||||
the current C routines.
|
||||
</UL>
|
||||
|
||||
<P><HR><P>
|
||||
|
||||
Reference in New Issue
Block a user