mirror of https://git.savannah.gnu.org/git/gperf.git synced 2025-12-02 21:19:24 +00:00

Document the use of NULs.

Bruno Haible
2000-08-20 17:20:23 +00:00
parent c0eb520394
commit 1ad4108b34
2 changed files with 39 additions and 10 deletions

@@ -43,6 +43,7 @@
 result.
 (Gen_Perf::hash): Use explicit length of char_set.
 (Gen_Perf::change): Specify explicit length of key.
+* doc/gperf.texi: Document it.
 * doc/help2man: New file, help2man version 1.022.
 * Makefile.devel (all): Add doc/gperf.1.

@@ -115,6 +115,7 @@ High-Level Description of GNU @code{gperf}
 * Input Format:: Input Format to @code{gperf}
 * Output Format:: Output Format for Generated C Code with @code{gperf}
+* Binary Strings:: Use of NUL characters
 Input Format to @code{gperf}
@@ -259,6 +260,7 @@ efficiently identify their respective reserved keywords.
 @menu
 * Input Format:: Input Format to @code{gperf}
 * Output Format:: Output Format for Generated C Code with @code{gperf}
+* Binary Strings:: Use of NUL characters
 @end menu
 The perfect hash function generator @code{gperf} reads a set of
@@ -327,9 +329,9 @@ arbitrary C declarations and definitions, as well as provisions for
 providing a user-supplied @code{struct}. If the @samp{-t} option
 @emph{is} enabled, you @emph{must} provide a C @code{struct} as the last
 component in the declaration section from the keyfile file. The first
-field in this struct must be a @code{char *} identifier called @samp{name},
-although it is possible to modify this field's name with the @samp{-K}
-option described below.
+field in this struct must be a @code{char *} or @code{const char *}
+identifier called @samp{name}, although it is possible to modify this
+field's name with the @samp{-K} option described below.
 Here is a simple example, using months of the year and their attributes as
 input:
@@ -406,15 +408,18 @@ in the first column is considered a comment. Everything following the
 @samp{#} is ignored, up to and including the following newline.
 The first field of each non-comment line is always the key itself. It
-should be given as a simple name, i.e., without surrounding
-string quotation marks, and be left-justified flush against the first
-column. In this context, a ``field'' is considered to extend up to, but
+can be given in two ways: as a simple name, i.e., without surrounding
+string quotation marks, or as a string enclosed in double-quotes, in
+C syntax, possibly with backslash escapes like @code{\"} or @code{\234}
+or @code{\xa8}. In either case, it must start right at the beginning
+of the line, without leading whitespace.
+In this context, a ``field'' is considered to extend up to, but
 not include, the first blank, comma, or newline. Here is a simple
 example taken from a partial list of C reserved words:
 @example
 @group
-# These are a few C reserved words, see the c.@code{gperf} file
+# These are a few C reserved words, see the c.gperf file
 # for a complete list of ANSI C reserved words.
 unsigned
 sizeof
@@ -449,7 +454,7 @@ file, is included verbatim into the generated output file. Naturally,
 it is your responsibility to ensure that the code contained in this
 section is valid C.
-@node Output Format, , Input Format, Description
+@node Output Format, Binary Strings, Input Format, Description
 @section Output Format for Generated C Code with @code{gperf}
 @cindex hash table
@@ -509,6 +514,28 @@ with the various input and output options, and timing the resulting C
 code, you can determine the best option choices for different keyword
 set characteristics.
+@node Binary Strings, , Output Format, Description
+@section Use of NUL characters
+@cindex NUL
+By default, the code generated by @code{gperf} operates on zero
+terminated strings, the usual representation of strings in C. This means
+that the keywords in the input file must not contain NUL characters,
+and the @var{str} argument passed to @code{hash} or @code{in_word_set}
+must be NUL terminated and have exactly length @var{len}.
+If option @samp{-c} is used, then the @var{str} argument does not need
+to be NUL terminated. The code generated by @code{gperf} will only
+access the first @var{len}, not @var{len+1}, bytes starting at @var{str}.
+However, the keywords in the input file still must not contain NUL
+characters.
+If option @samp{-l} is used, then the hash table performs binary
+comparison. The keywords in the input file may contain NUL characters,
+written in string syntax as @code{\000} or @code{\x00}, and the code
+generated by @code{gperf} will treat NUL like any other character.
+Also, in this case the @samp{-c} option is ignored.
 @node Options, Bugs, Description, Top
 @chapter Invoking @code{gperf}
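As a reading aid for the "Use of NUL characters" section added in the hunk above, the three comparison modes it describes (default, @samp{-c}, @samp{-l}) can be sketched in plain C. This is only an illustration under stated assumptions: the functions `match_default`, `match_strncmp`, `match_binary` and `check_examples` are hypothetical names invented here, and the code is not what @code{gperf} actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers, not gperf output: each mirrors one comparison mode. */

/* Default: str and keyword are NUL terminated, and len == strlen(str). */
int match_default(const char *str, size_t len, const char *keyword)
{
  (void) len;                     /* the terminator, not len, ends the scan */
  return strcmp(str, keyword) == 0;
}

/* With -c: read at most len bytes of str; the keyword is still a C string. */
int match_strncmp(const char *str, size_t len, const char *keyword)
{
  return strncmp(str, keyword, len) == 0 && keyword[len] == '\0';
}

/* With -l: compare lengths first, then raw bytes, so NUL is ordinary data. */
int match_binary(const char *str, size_t len,
                 const char *keyword, size_t keyword_len)
{
  return len == keyword_len && memcmp(str, keyword, len) == 0;
}

void check_examples(void)
{
  /* Six bytes with no terminator; only the first five are inspected. */
  const char unterminated[6] = { 'w', 'h', 'i', 'l', 'e', 'X' };

  assert(match_default("while", 5, "while"));
  assert(match_strncmp(unterminated, 5, "while"));
  assert(!match_strncmp(unterminated, 4, "while"));

  /* Keys differing only after an embedded NUL: strcmp cannot tell them apart. */
  assert(match_default("a\0b", 3, "a\0c"));
  assert(!match_binary("a\0b", 3, "a\0c", 3));
  assert(match_binary("a\0b", 3, "a\0b", 3));
}
```

The last three assertions show why keywords containing NUL need a length-based comparison: a terminator-based comparison stops at the first NUL and would treat the two distinct keys as equal.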
@@ -636,8 +663,8 @@ solely consist of 7-bit ASCII characters (characters in the range 0..127).
 (Note that the ANSI C functions @code{isalnum} and @code{isgraph} do
 @emph{not} guarantee that a character is in this range. Only an explicit
 test like @samp{c >= 'A' && c <= 'Z'} guarantees this.) This was the
-default in earlier versions of @code{gperf}; now the default is to assume
-8-bit characters.
+default in versions of @code{gperf} earlier than 2.7; now the default is
+to assume 8-bit characters.
 @item -c
 @itemx --compare-strncmp
@@ -731,6 +758,7 @@ However, using @samp{-l} might greatly increase the size of the
 generated C code if the lookup table range is large (which implies that
 the switch option @samp{-S} is not enabled), since the length table
 contains as many elements as there are entries in the lookup table.
+This option is mandatory for binary comparisons (@pxref{Binary Strings}).
 @item -D
 @itemx --duplicates