GREP — Find Regular Expressions in Files
Revision History for Release 8.0
Program Dated 4 May 2005 / This Document Dated 4 May 2005
Copyright © 1986–2005 Stan Brown, Oak Road Systems
Summary: This document is the complete revision history, starting with the most recent changes. To use GREP, please see the GREP Quick Start and the GREP Manual.
New features:
Compatibility note: Release 8.0 changes the default output format for text lines (not paragraphs). Earlier releases simply dumped the input lines naïvely. Release 8.0 displays control characters in ^letter format and other non-printing characters in <nn> format. In most cases this is a good thing, but if you want the old behavior specify the /o1 option on the command line or in the environment variable.
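The display format described in this note can be sketched roughly as follows. This is only an illustration in Python, not GREP's code. The cutoff for "non-printing" (every byte from 127 up) and the use of two uppercase hex digits are assumptions made for the sketch, and the function name render_line is hypothetical.

```python
def render_line(raw):
    """Render bytes in ^letter and <nn> style (a sketch, not GREP's code)."""
    out = []
    for b in raw:
        if b < 32:
            # Control character: 0 becomes ^@, 1 becomes ^A, 9 (tab) becomes ^I.
            out.append("^" + chr(b + 64))
        elif b < 127:
            # Ordinary printable ASCII is passed through unchanged.
            out.append(chr(b))
        else:
            # Assumption: bytes from 127 up count as non-printing and are
            # shown as two uppercase hex digits in angle brackets.
            out.append("<%02X>" % b)
    return "".join(out)


print(render_line(b"a\tb"))
```

In a real program the set of printable characters would depend on the character set in use, so the fixed cutoff above is a simplification.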
With the /G0 option (fixed-width text lines), earlier releases of GREP would chop the line into chunks of txwid+1 characters as specified in the /W option; release 8.0 chops the line into chunks of exactly txwid characters.
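The difference between the two chunk sizes can be seen in a small Python sketch. The function name chop and the sample values are hypothetical and chosen only for illustration.

```python
def chop(line, width):
    """Split a line into consecutive chunks of at most `width` characters."""
    return [line[i:i + width] for i in range(0, len(line), width)]


line = "ABCDEFGHIJ"
txwid = 4

new_chunks = chop(line, txwid)      # chunks of exactly txwid, as in release 8.0
old_chunks = chop(line, txwid + 1)  # chunks of txwid+1, as in earlier releases

print(new_chunks)  # ['ABCD', 'EFGH', 'IJ']
print(old_chunks)  # ['ABCDE', 'FGHIJ']
```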
Bugs fixed:
Other program changes:
Documentation changes:
The TOURKEY program (see release 7.51) was compiled with a 16-bit compiler to keep it small. It worked fine on my Windows XP Pro, but a user reported that it didn't work on his. I recompiled it with a 32-bit compiler. Also, though TOURKEY isn't intended for standalone use, I added a program identification and help message if you run it with the /? option.
In the tour itself, I simplified the regex used with GREP32 to illustrate paragraph mode.
The GREP executables and the other documentation are unchanged from release 7.5.
Following a report that the tour didn't work in Windows XP, I investigated and found that the default installation of Windows XP doesn't have the CHOICE program, which the tour uses to get instructions from the user. I wrote a workalike, TOURKEY, which is now included with the distribution.
The executables and documentation are unchanged.
Program changes affecting text files:
Program changes in the handling of multiple regexes (/F option) with the /J2 or /J3 option:
The above two changes don't affect single regexes with the /J2 or /J3 option.
Other program changes:
Documentation changes (in addition to descriptions of the program changes):
This release fixes two bugs that were introduced in release 7.4. Both have to do with the /J2 or /J3 option:
Program changes:
/J0 through /J3. When a line, record, or buffer contains one or more matches, you can now tell GREP to display the entire line, only the first matching part, or every matching part with or without overlaps.
GREP /R3 /J /N was displaying the byte number of the start of the buffer. Now it displays the byte number of the start of the match, as it should.
Document changes:
/J2 and /J3 selections there are some changes to how free-format binary files are read (/R3 option) and to the recommended buffer width (/W option).
-o as equivalent to the /J option.
Program change: Add a warning in the help message about regexes that start with - or /, in response to a user's question.
Documentation changes:
Program changes:
\"something\".
Documentation changes:
[ character.
Program changes: The latest changes from PCRE release 3.9 were adapted into the code for the /E2 and /E4 options. Those changes fixed some esoteric bugs but did not add any features. (This affects only GREP32 since those options don't exist in GREP16, but the GREP16 release number was updated for consistency.)
Documentation changes:
Release 7.21, 12 Apr 2004, fixed a bug: when the regex contained an opening [ but no closing ], the command-line parser mistakenly diagnosed it as a bad filename instead of a bad regex.
This release fixed a memory allocation bug that affected the /E4 option. For some regexes, an internal error was generated.
The new /E4 option tells GREP to search for your regex as a word. An optional second argument on the /M option now lets you define what is a "word" character, if you need to.
Release 7.02, 24 Mar 2002, was an internal checkpoint release for a few minor changes:
Per user request, when you specify a list of input files with the /@ option, leading and trailing blanks are now removed from the filespecs.
The new value 4 is returned in ERRORLEVEL if GREP found no files to match any of your input filespecs. A warning message also appears if you specified multiple filespecs on the command line or used the /@ option.
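The two behaviors above (trimming blanks from listed filespecs, and a distinct status when nothing matches) might be sketched like this in Python. It is an illustration under assumptions, not GREP's implementation: Python's glob rules stand in for GREP's own wildcard rules, skipping blank lines is an assumption, and the names clean_filespecs, expand and status_for are hypothetical.

```python
import glob

NO_FILES_STATUS = 4  # the value named in the text; other statuses are not modelled


def clean_filespecs(lines):
    """Remove leading and trailing blanks from each filespec.

    Assumption: lines that are empty after trimming are skipped.
    """
    cleaned = [line.strip(" \t\r\n") for line in lines]
    return [spec for spec in cleaned if spec]


def expand(filespecs):
    """Expand every filespec to the files it matches (Python glob rules)."""
    files = []
    for spec in filespecs:
        files.extend(glob.glob(spec))
    return files


def status_for(filespecs):
    """Return 4 when no file matched any filespec, otherwise None."""
    return NO_FILES_STATUS if not expand(filespecs) else None


print(clean_filespecs(["  a.txt  ", "b*.c\n", "   "]))  # ['a.txt', 'b*.c']
```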
The message "no files at all matched filespec" became "no files exist like filespec". This should reduce confusion with the (possibly normal) situation where the input file exists but none of its lines match the regex.
The warning message "Some matches in the middle of long lines may have been missed. You might want to try the /Wn option" now suggests both text and binary widths when GREP is deciding whether files are text or binary (/R-1 or /R-2 option). This is just a clarification of the warning message; there is no change in how GREP handles files.
The reference manual says that, when an input filespec doesn't match any files, GREP reminds you to check your exclusions. In 7.0 and 7.01 the warning appeared even if you never used the /X option; now it doesn't.
Program changes:
The warning message "the X option is useful only with named input files" is now "the X option is ignored when reading only standard input", and is suppressed when you have only stored /X options in the environment variable.
The warning message "the R option applies only to named files, not standard input" now appears only when you told GREP to read files in binary (/R2 or /R3), not when you told GREP to sense the file type (/R-1 or /R-2).
In reading text files, GREP 7.01 doesn't check for error unless the read function returns a non-normal status. The way it was done in 7.0 was correct but wasted a very few CPU cycles.
Documentation changes:
In the Troubleshooting section of the user guide, "GREP is finding too many matches!" was changed to "How do I search for a word?" New question "How do I find all files that contain 'this' and 'that'?" was added.
The style sheet was developed further, and validated.
As promised, this release was driven by users' responses to the poll distributed with release 6.9.
Program changes:
Filenames have more extensive wildcards. The wildcard rules were added to the help message.
GREP16 users, please note that ABC* now means the same thing as ABC*.*, namely all filenames that begin with ABC.
You can use the new /@ option to give GREP a file list in a file, when a program generates your list of input files or there are simply too many for the command line.
You can use the new /X option to exclude filenames that match certain patterns.
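Excluding filenames by pattern can be illustrated with a short Python sketch. It uses Python's fnmatch wildcard rules, which are not necessarily the same as GREP's, and it assumes case-blind comparison of names. The function name apply_exclusions is hypothetical.

```python
from fnmatch import fnmatchcase


def apply_exclusions(filenames, exclude_patterns):
    """Keep only the filenames that match none of the exclusion patterns."""
    kept = []
    for name in filenames:
        # Assumption: names and patterns are compared without regard to case.
        excluded = any(
            fnmatchcase(name.upper(), pattern.upper())
            for pattern in exclude_patterns
        )
        if not excluded:
            kept.append(name)
    return kept


print(apply_exclusions(["a.txt", "b.bak", "C.BAK"], ["*.bak"]))  # ['a.txt']
```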
The /K count option was added, to tell GREP to stop reading each file after reporting the first count matches.
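The idea of stopping after a fixed number of matches can be sketched as a Python generator. The regex syntax here is Python's, not GREP's, and the name first_matches is hypothetical. The point of the sketch is that no further input is read once the limit is reached.

```python
import re


def first_matches(lines, pattern, count):
    """Yield matching lines and stop reading after `count` of them."""
    if count <= 0:
        return
    rx = re.compile(pattern)
    found = 0
    for line in lines:
        if rx.search(line):
            yield line
            found += 1
            if found == count:
                return  # stop here; the rest of the input is never read


source = iter(["ab", "x", "ac", "ad", "ae"])
print(list(first_matches(source, "a", 2)))  # ['ab', 'ac']
print(list(source))                         # ['ad', 'ae'] were left unread
```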
When no input files are named, and input is from the keyboard (not redirected from file), GREP now prompts you to enter each line. Previously, it just sat there and waited for you to realize your mistake.
The message prompting you for regexes (/F- option) is now written directly to the console, not to standard error (which could be redirected in some operating systems). A prompt also appears for each regex after the first.
The text of several error and warning messages was tweaked slightly for consistency, and the help text was expanded a bit more. Customized text was removed from all "insufficient memory" messages, in favor of automatically generated file and line numbers.
Documentation changes:
A complete list of GREP's warnings and error messages, with explanations, was added to the reference manual.
The description of command line syntax was overhauled. Input redirection was clarified, and a section was added on output redirection. The description of named input files was greatly simplified, with the detailed (and new) rules moved to the reference manual.
The HTML versions now use an experimental style sheet, which should make them slightly more attractive on your screen.
This was a pre-release of 7.0. A poll distributed with this release asked users to indicate which requested features they would like to see in the program.
Program changes:
Binary file modes have been completely overhauled. The old /R toggle has been replaced by a mode selection via the /R2 option for record-oriented binary or the /R3 option for free-format binary. This solves a known but serious problem with the old binary mode, where a match would be missed if it crossed a block boundary.
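The block-boundary problem is a general one when a file is searched one block at a time. The Python sketch below shows one common remedy, carrying the tail of each block over to the next read. It is not taken from GREP and may differ from what the /R2 and /R3 modes actually do. The name find_in_blocks is hypothetical, and the search is for a literal byte string rather than a regex.

```python
def find_in_blocks(blocks, needle):
    """Return the byte offsets of `needle` in the concatenated blocks."""
    if not needle:
        raise ValueError("needle must not be empty")
    keep = len(needle) - 1   # bytes to carry over between blocks
    carry = b""              # tail of the data seen so far
    base = 0                 # file offset of the first byte of `carry`
    offsets = []
    for block in blocks:
        buf = carry + block
        pos = buf.find(needle)
        while pos != -1:
            offsets.append(base + pos)
            pos = buf.find(needle, pos + 1)
        # Keep only the last `keep` bytes, so that a match starting near the
        # end of this block can be completed by the next one. Because the
        # carry is shorter than the needle, no match is reported twice.
        carry = buf[len(buf) - keep:] if keep else b""
        base += len(buf) - len(carry)
    return offsets


blocks = [b"xxHE", b"LLOxx"]
print(find_in_blocks(blocks, b"HELLO"))                  # [2]
print([block.find(b"HELLO") for block in blocks])        # [-1, -1]
```

The second line of output shows the failure mode described above: searching each block separately finds nothing, because the match straddles the boundary.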
The new /R-1 or /R-2 option lets you tell GREP to figure out on its own whether each input file is text or binary. (/R-1 and /R-2 are available only with the registered version.)
You can now specify different widths for binary blocks and text lines with the /W option.
The /W option minimum value, formerly 10 characters, is now 2.
(GREP32 only) The new /M option allows correct case-blind matching and character classes with non-English letters and other 8-bit characters. (A test file is included.)
Binary output now displays the hex value <nn> for every non-printing character, according to the current character set (/M option).
A new demo file, TOUR.BAT, is included.
The new /J option tells GREP to display from each line only the portion that matched the search string, basic regex, or extended regex. In release 6.0, this was added for extended regexes only, and you specified the /E3 option. /E3 is still allowed, but is equivalent to /E2 /J.
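Showing only the matching portion of a line can be sketched in Python. Elsewhere this history mentions displaying only the first matching part, or every matching part with or without overlaps. The three functions below are one interpretation of those choices, using Python's regex engine rather than GREP's, and their names are hypothetical.

```python
import re


def first_part(pattern, line):
    """Return the first matching portion of the line, or None."""
    m = re.search(pattern, line)
    return m.group(0) if m else None


def parts_without_overlaps(pattern, line):
    """Return every match, resuming the scan after the end of each one."""
    return [m.group(0) for m in re.finditer(pattern, line)]


def parts_with_overlaps(pattern, line):
    """Return every match, resuming one character after each match start."""
    rx = re.compile(pattern)
    parts = []
    start = 0
    while start <= len(line):
        m = rx.search(line, start)
        if m is None:
            break
        parts.append(m.group(0))
        start = m.start() + 1
    return parts


line = "abababa"
print(first_part("aba", line))              # aba
print(parts_without_overlaps("aba", line))  # ['aba', 'aba']
print(parts_with_overlaps("aba", line))     # ['aba', 'aba', 'aba']
```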
In record-oriented binary files (/R2 option), the ^ anchor at the start and $ anchor at the end of a basic regex now mean the start and end of block, for better consistency with extended regexes.
GREP is now about 5% faster on text files, and NUL (ASCII 0) will no longer cause the rest of a line to be ignored.
With the /L option, GREP now stops reading a file as soon as it finds a match, unless the /V option is also set.
Debug output now includes additional information:
/W option
Bug fix: Release 6.0 began diagnosing wildcard filespecs that produce no files at all, even when the /S (subdirectory) option is set. Unfortunately, the code to do that broke the previous code that diagnosed missing single files. The code to check the various cases is now collected in one place, and diagnosis should be correct and complete.
Bug fix: The numeric values of the /Q option were added to debug output and the help message, where they should have been for release 6.0.
Documentation changes (in addition to those driven by the above program changes):
The user guide had grown more and more unwieldy. All the details of options and regexes are now split off into a separate reference manual, which is included in the download file but not published on the Web.
A table of options with one-line descriptions and a summary of regexes were added to the user guide; both are hyperlinked to sections in the reference manual. This may make it easier to find the feature you need.
Information on greedy and ungreedy quantifiers was added to the reference manual and to the help message.
The groups of options in the reference manual and the help message were rearranged to input, pattern-matching, output, and general.
The writeups of several options in the help message were expanded.
Note: There were beta test releases numbered 5.95, 5.97, and 5.98. For users who participated in the beta test, a number in parentheses indicates the release where a particular change was made.
Program changes:
(5.95) The big news is the addition of extended regular expressions. GREP can now handle constructs like alternatives |, optionals ?, general quantifiers {...}, subexpressions (...), and more. With extended regexes, you can choose whether GREP reports matching lines as usual, or just the portion of each line that matches the extended regex.
(5.95) You can now bypass regular expressions entirely and just search for literal text.
(5.97) You can now turn the Special Rules for the Command Line on and off.
(5.98) You now have finer-grained control over warning messages with numeric levels for the /Q option.
(5.97) GREP now displays a warning message for any file spec that matches no files, even if the /S option (search subdirectories) is on.
(5.97) You can now put a plus sign after an option to turn it on (as opposed to toggling it). Example: repeated /N will flip the option between on and off, but /N+ turns on the option regardless of any previous settings.
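The difference between toggling an option and forcing it on can be shown with a small Python sketch. It models only single-letter options with no values, starting from "off"; everything else about GREP's option parsing is left out. The name apply_options is hypothetical.

```python
def apply_options(args, initial=None):
    """Apply /X (toggle) and /X+ (force on) options to a dict of settings."""
    state = dict(initial or {})
    for arg in args:
        if len(arg) < 2 or not arg.startswith("/"):
            continue  # not an option in this simplified model
        letter, rest = arg[1], arg[2:]
        if rest == "+":
            state[letter] = True                          # on, whatever it was
        elif rest == "":
            state[letter] = not state.get(letter, False)  # flip the setting
    return state


print(apply_options(["/N", "/N"]))         # {'N': False}
print(apply_options(["/N", "/N", "/N+"]))  # {'N': True}
```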
(5.98) Character types and assertions were added to the help message.
(5.95) When you put a regex on the command line, you may need to enclose it in quotes, and if you do then GREP strips them (as past releases did). But, in agreement with the user guide, it no longer does that with regexes entered in a file or at the keyboard (/F option).
(5.95) Two bugs in parsing oddly constructed character classes were fixed. (1) If you had a regex like []abc], GREP wrongly took the first square bracket as ending the class. GREP now correctly treats a ] at the beginning of a character class as an ordinary character. (2) A minus sign at the end of a class, like [abc-], is now treated correctly as a normal character.
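The two rules described above (a ] right after the opening bracket is an ordinary character, and a minus sign just before the closing bracket is an ordinary character) can be sketched as a tiny parser in Python. This is an illustration only, not GREP's parser: it ignores negation with ^ and backslash escapes, and the name parse_class is hypothetical.

```python
def parse_class(regex, start=0):
    """Parse a character class that begins at regex[start] == '['.

    Returns (set of member characters, index just past the closing ']').
    """
    if start >= len(regex) or regex[start] != "[":
        raise ValueError("no character class at this position")
    i = start + 1
    members = set()
    first = True
    while i < len(regex):
        c = regex[i]
        if c == "]" and not first:
            return members, i + 1
        # A range needs a '-' followed by something other than the closing ']'.
        if i + 2 < len(regex) and regex[i + 1] == "-" and regex[i + 2] != "]":
            low, high = ord(c), ord(regex[i + 2])
            if low > high:
                raise ValueError("range out of order")
            members.update(chr(k) for k in range(low, high + 1))
            i += 3
        else:
            members.add(c)  # includes a leading ']' and a trailing '-'
            i += 1
        first = False
    raise ValueError("opening [ without closing ]")


print(sorted(parse_class("[]abc]")[0]))  # [']', 'a', 'b', 'c']
print(sorted(parse_class("[abc-]")[0]))  # ['-', 'a', 'b', 'c']
```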
(5.95) The user guide told you that when your regex begins with a - or / character, you should prefix it with a \ to keep it from being taken as an option. GREP was considering that leading \ as part of the regex; it doesn't any more.
(5.98) If you specify named input files and also redirected input (<file), GREP has always ignored the redirected input. A warning message now appears in this case.
(5.97) The /Z option used to reset all options except /F; it now resets all options as documented.
User guide changes:
(5.95) The section on regular expressions was pretty much rewritten, rather than try to shoehorn in all the stuff for extended regexes. (Only about 80% of the syntax of extended regular expressions was added to the user guide, though all is supported in the program. For a complete description, the relevant parts of the PCRE man page are also included.)
(5.95) A section on troubleshooting was added.
(5.97) The section on the environment variable now makes clear that the /Z option resets all options including /D and /F.
(5.97) The section on Special Rules for the Command Line was reorganized to make it clearer when you do and don't need the rules, and information was added on the new ability to turn them on and off.
This is a repackaging for Simtel; there are no significant functional changes.
If you specified current directory on another disk, such as "d:*.htm", GREP was taking that as root directory, "d:\*.htm". Apparently no one but the program author ever does such a thing!
Unfortunately, a bug was introduced in release 5.3: under certain circumstances, GREP got confused about whether it was working from standard input or input files. This release corrects that bug, with my apologies to everyone who downloaded the buggy 5.3.
New features:
/Y option lets you search for lines that contain multiple regexes in any order.
/D option now allows the pseudo filename "-", for debugging display on standard output.
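Matching lines that contain several regexes in any order amounts to requiring that every regex match somewhere in the line. A minimal Python sketch of that idea follows. It uses Python's regex syntax rather than GREP's, and the name contains_all is hypothetical.

```python
import re


def contains_all(line, patterns):
    """True when every pattern matches somewhere in the line, in any order."""
    return all(re.search(pattern, line) for pattern in patterns)


print(contains_all("that and this", ["this", "that"]))  # True
print(contains_all("only this", ["this", "that"]))      # False
```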
Other changes:
more to paginate the message.
Program changes:
/R option to read and display files in binary mode
/W option to set the line width (formerly fixed at 255 characters), and warn the user if longer lines were found
/Z option to reset all options
z-a; previously they were silently treated like the three characters "z", "-", "a"
grep with no options or regular expression; instead, suggest using grep /? |more
d:*, the drive was ignored
User guide changes:
Program changes:
[\244-\246\248-\255], with adjacent escaped 8-bit character ranges, were not expanded correctly
\e in regular expressions was expanded to Control-W (octal 27) instead of Escape (decimal 27)
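The arithmetic behind the \e mix-up is easy to check. The short Python sketch below only illustrates the number bases involved; it says nothing about how GREP's code was written.

```python
ESCAPE = 27                         # decimal code of the Escape character
misread = int("27", 8)              # the same digits read as octal
control_letter = chr(misread + 64)  # letter used in ^letter notation

print(misread)          # 23
print(control_letter)   # W, so octal 27 is Control-W
print(ESCAPE == 0o33)   # True: Escape is octal 33
```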
User guide changes:
Clarify and expand "Special rules for the command line" section of the user guide, clarify the descriptions of the /D and /I options, and add quite a few internal hyperlinks
Program changes:
/S option for searching subdirectories
/A option for searching hidden and system files
User guide changes:
/F- option
/I, character classes entered in lower case were expanded incorrectly
/U option (UNIX-style output) and the /Q option (quiet). Distinguish between option errors in the environment variable and on the command line
/P); diagnose conflicts only after the last option has been scanned
/D (debug), which is no longer a toggle
/P (show context lines) to 2,2 if user specifies /P without numbers
/F option (multiple regular expressions in a file)
First shareware release: On 19 Nov 1998 the documents were updated and GREP was packaged for release, with no changes to the program.
/P option, remove the < > widgets around the line numbers of the context lines, so that the lines that actually match are more easily seen
/L option
[] as a user error, not an internal error
/S option (search subdirectories), because GREP's code now uses the C run-time library to interpret file specs on the command line
[...] in regular expressions, the backslash was swallowing an extra character
/0 and /1 options to control exit status
/D now shows the number of matches in each file, total matches, and exit status
\0x9A, \045, and \211 in the regular expression
/P option (show context lines around the actual matches)
/D is on the command line
+ in regular expressions (match one or more occurrences)
/S option (search subdirectories)
/B option: now the default is to show only the names of matching files
/C (count matches) is specified and /B is not
[...] character classes containing the range character - weren't always expanded correctly
/B option (show only the names of files that contain matches)
/D option show the input pattern as well as the decoded version
[...] weren't working with the /I option
/I option (ignore case)
\e (escape) and \q (equal sign)