Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -44,22 +44,34 @@ UniversalCodeGrep (ucg) is an extremely fast grep-like tool specialized for sear | |
|
||
## Introduction | ||
|
||
UniversalCodeGrep (ucg) is an extremely fast grep-like tool specialized for searching large bodies of source code. It is intended to be largely command-line compatible with [`Ack`](http://beyondgrep.com/), to some extent with [`ag`](http://geoff.greer.fm/ag/), and where appropriate with `grep`. Search patterns are specified as PCRE regexes. | ||
UniversalCodeGrep (`ucg`) is an extremely fast grep-like tool specialized for searching large bodies of source code. It is intended to be largely command-line compatible with [`Ack`](http://beyondgrep.com/), to some extent with [`ag`](http://geoff.greer.fm/ag/), and where appropriate with `grep`. Search patterns are specified as PCRE regexes. | ||
|
||
### Speed | ||
`ucg` is intended to address the impatient programmer's code searching needs. `ucg` is written in C++11 and takes advantage of the concurrency (and other) support of the language to increase scanning speed while reducing reliance on third-party libraries and increasing portability. Regex scanning is provided by the [PCRE library](http://www.pcre.org/), with its [JIT compilation feature](http://www.pcre.org/original/doc/html/pcrejit.html) providing a huge performance gain on most platforms. | ||
`ucg` is intended to address the impatient programmer's code searching needs. `ucg` is written in C++11 and takes advantage of the concurrency (and other) support of the language to increase scanning speed while reducing reliance on third-party libraries and increasing portability. Regex scanning is provided by the [PCRE2 library](http://www.pcre.org/), with its [JIT compilation feature](http://www.pcre.org/current/doc/html/pcre2jit.html) providing a huge performance gain on most platforms. | ||
|
||
As a consequence of its use of these facilities and its overall design for maximum concurrency and speed, `ucg` is extremely fast. Under Fedora 23, scanning the Boost 1.58.0 source tree with `ucg` 0.2.2, [`ag`](http://geoff.greer.fm/ag/) 0.31.0, and `ack` 2.14 produces the following results: | ||
As a consequence of its use of these facilities and its overall design for maximum concurrency and speed, `ucg` is extremely fast. Under Fedora 24, scanning the Boost 1.58.0 source tree with `ucg` 0.3.0, [`ag`](http://geoff.greer.fm/ag/) 0.31.0, and `ack` 2.14 produces the following results: | ||
|
||
| Command | Elapsed Real Time, Average of 5 Runs | | ||
| Command | Elapsed Real Time, Average of 10 Runs | | ||
|---------|-----------------------| | ||
| `time ucg --noenv --cpp 'BOOST.*HPP' ~/src/boost_1_58_0` | ~ 0.404 seconds | | ||
| `time ag --cpp 'BOOST.*HPP' ~/src/boost_1_58_0` | ~ 5.8862 seconds | | ||
| `time ack --noenv --cpp 'BOOST.*HPP' ~/src/boost_1_58_0` | ~ 12.0398 seconds | | ||
|
||
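The tables in this section report elapsed real time averaged over several runs. The README does not say how the averages were collected, so the following is only a minimal sketch of one way to reproduce such a measurement. The helper name `avg_time` is hypothetical, and the sketch assumes GNU `date` (for the `%N` nanoseconds field) and `awk` are installed.

```sh
# avg_time RUNS CMD [ARGS...]   (hypothetical helper, not part of ucg)
# Run CMD RUNS times and print the mean elapsed wall-clock time in seconds.
# Assumes GNU date (for %N) and awk are available.
avg_time() (
    runs=$1
    shift
    total=0
    i=0
    # Output goes to a temp file rather than /dev/null, because some tools
    # (GNU grep, for one) can stop early when stdout is /dev/null.
    tmp=$(mktemp)
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s.%N)
        # grep-like tools exit nonzero when nothing matches; ignore that here.
        "$@" > "$tmp" 2>&1 || true
        end=$(date +%s.%N)
        total=$(awk -v t="$total" -v s="$start" -v e="$end" \
            'BEGIN { printf "%.6f", t + e - s }')
        i=$((i + 1))
    done
    rm -f "$tmp"
    awk -v t="$total" -v n="$runs" 'BEGIN { printf "%.4f\n", t / n }'
)
```

For example, `avg_time 10 ucg --noenv --cpp 'BOOST.*HPP' ~/src/boost_1_58_0` would print the mean of ten runs. Because output is redirected to a file instead of a terminal, the numbers may not match the tables exactly.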
#### Benchmark: '#include\s+".*"' on Boost source | ||
|
||
| Command | Program Version | Elapsed Real Time, Average of 10 Runs | Num Matched Lines | Num Diff Chars | | ||
|---------|-----------------|---------------------------------------|-------------------|----------------| | ||
| `ucg --noenv --cpp '#include\s+.*' ../../../../../boost_1_58_0` | 0.3.0 | 0.212767 | 9511 | 189 | | ||
| `/usr/bin/ucg --noenv --cpp '#include\s+.*' ../../../../../boost_1_58_0` | 0.2.2 | 0.262368 | 9511 | 189 | | ||
| `/usr/bin/ag --cpp '#include\s+.*' ../../../../../boost_1_58_0` | 0.32.0 | 1.90161 | 9511 | 189 | | ||
| `/usr/bin/rg -n -t cpp '#include\s+.*' ../../../../../boost_1_58_0` | 0.2.3 | 0.262967 | 9509 | 0 | | ||
|
||
| `/usr/bin/pcre2grep -rn --color '--exclude=^.*(?<!\.cpp|\.hpp|\.h|\.cc|\.cxx)$' '#include\s+.*' ../../../../../boost_1_58_0` | 10.21 2016-01-12 | 0.818627 | 9527 | 1386 | | ||
| `grep -Ern --color --include=\*.cpp --include=\*.hpp --include=\*.h --include=\*.cc --include=\*.cxx '#include\s+.*' ../../../../../boost_1_58_0` | grep (GNU grep) 2.25 | 0.366634 | 9509 | 0 | | ||
|
||
|
||
UniversalCodeGrep is in fact somewhat faster than `grep` itself. Again under Fedora 23 and searching the Boost 1.58.0 source tree, `ucg` bests grep 2.22 not only in ease-of-use but in raw speed: | ||
|
||
| Command | Elapsed Real Time, Average of 5 Runs | | ||
| Command | Elapsed Real Time, Average of 10 Runs | | ||
|---------|--------------------------------------| | ||
| `time grep -Ern --color --include=\*.cpp --include=\*.hpp --include=\*.h --include=\*.cc --include=\*.cxx 'BOOST.*HPP' ~/src/boost_1_58_0` | ~ 0.9852 seconds | | ||
| `time ucg --noenv --cpp 'BOOST.*HPP' ~/src/boost_1_58_0` | ~ 0.404 seconds | | ||
|
@@ -72,6 +84,9 @@ The resulting matches are identical. | |
|
||
## Installation | ||
|
||
UniversalCodeGrep binaries are currently available for Fedora 23/24/25/rawhide and CentOS 7. Binaries for other platforms (Ubuntu, Arch, openSUSE) are coming soon. | ||
|
||
<!-- COMING SOON | ||
### Ubuntu PPA | ||
If you are an Ubuntu user, the easiest way to install UniversalCodeGrep is from the Launchpad PPA [here](https://launchpad.net/~grvs/+archive/ubuntu/ucg). To install from the command line: | ||
|
@@ -84,10 +99,11 @@ sudo apt-get update | |
# Install ucg: | ||
sudo apt-get install universalcodegrep | ||
``` | ||
--> | ||
|
||
### Red Hat/Fedora/CentOS dnf/yum Repository | ||
### Fedora/CentOS Copr Repository | ||
|
||
If you are a Red Hat, Fedora, or CentOS user, the easiest way to install UniversalCodeGrep is from the Fedora Copr-hosted dnf/yum repository [here](https://copr.fedoraproject.org/coprs/grvs/UniversalCodeGrep). Installation is as simple as: | ||
If you are a Fedora or CentOS user, the easiest way to install UniversalCodeGrep is from the Fedora Copr-hosted dnf/yum repository [here](https://copr.fedoraproject.org/coprs/grvs/UniversalCodeGrep). Installation is as simple as: | ||
|
||
```sh | ||
# Add the Copr repo to your system: | ||
|
@@ -113,15 +129,15 @@ makepkg -sri | |
|
||
### openSUSE Binary RPMs | ||
|
||
Binary RPMs for openSUSE are available [here](https://github.com/gvansickle/ucg/releases/tag/0.2.2). | ||
Binary RPMs for openSUSE are available [here](https://github.com/gvansickle/ucg/releases/tag/0.3.0). | ||
|
||
### Building the Source Tarball | ||
|
||
UniversalCodeGrep can be built and installed from the distribution tarball (available [here](https://github.com/gvansickle/ucg/releases/download/0.2.2/universalcodegrep-0.2.2.tar.gz)) in the standard autotools manner: | ||
UniversalCodeGrep can be built and installed from the distribution tarball (available [here](https://github.com/gvansickle/ucg/releases/download/0.3.0/universalcodegrep-0.3.0.tar.gz)) in the standard autotools manner: | ||
|
||
```sh | ||
tar -xaf universalcodegrep-0.2.2.tar.gz | ||
cd universalcodegrep-0.2.2.tar.gz | ||
tar -xaf universalcodegrep-0.3.0.tar.gz | ||
cd universalcodegrep-0.3.0 | ||
./configure | ||
make | ||
make install | ||
|
@@ -173,7 +189,7 @@ If no `FILES OR DIRECTORIES` are specified, searching starts in the current dire | |
|
||
### Command Line Options | ||
|
||
Version 0.2.2 of `ucg` supports a significant subset of the options supported by `ack`. Future releases will have support for more options. | ||
Version 0.3.0 of `ucg` supports a significant subset of the options supported by `ack`. Future releases will have support for more options. | ||
|
||
#### Searching | ||
|
||
|
I suspect you need to pass `-u` here for this to be a fair comparison. (Which is done in my blog post.) Could you also provide instructions on how to get the corpus you're benchmarking with?