How To Install and Use Ack, a Grep Replacement for Developers, on Ubuntu 14.04

Posted May 27, 2014

Introduction

When searching for text in files or in a directory structure, Linux and Unix-like systems have many tools that can assist you. One of the most common of these is grep, which stands for global regular expression print.

Using grep, you can easily search for any pattern that can be expressed with regular expressions within any set of textual input. However, it was created as a general-purpose tool, so it knows nothing about source trees and will search files that a developer rarely cares about.

For searching source code specifically, a tool inspired by grep called ack was invented. It leverages Perl regular expressions to efficiently search source code for patterns, while taking care to not include the results you don't care about.

In this guide, we'll discuss how to use ack as a super-powered grep replacement for picking out patterns from your source code. The ack tool is available on any platform that can use Perl, but we'll be demonstrating the utility on an Ubuntu 14.04 server.

Install Ack

To get started, the first step is to install the ack tool on your machine.

On an Ubuntu or Debian machine, this is as simple as installing the utility from the default repositories. The package is called ack-grep:

sudo apt-get update
sudo apt-get install ack-grep

Since the executable is also installed as ack-grep, we can tell our system to shorten this to ack for our command line use of the tool by typing this command:

sudo dpkg-divert --local --divert /usr/bin/ack --rename --add /usr/bin/ack-grep

Now, the tool will respond to the name ack instead of ack-grep.

If you are planning on using ack on other systems, the installation method may vary. The Perl module in CPAN is called App::Ack. On other Linux distributions, the package names in the repositories may be different.
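
Because the executable may be called ack on one system and ack-grep on another, a script that relies on the tool may want to detect the name first. The following is a minimal sketch; the function name find_ack is only illustrative:

```shell
# Print the name of the first ack executable found on PATH.
# Returns non-zero if neither name is available.
find_ack() {
    for name in ack ack-grep; do
        if command -v "$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1
}

# Example: report the result without failing the shell session.
if ack_cmd=$(find_ack); then
    echo "ack is available as: $ack_cmd"
else
    echo "ack is not installed"
fi
```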

What Ack Pays Attention To

Before we get into the actual usage of ack, let's discuss for a moment how it differs from grep and what files are within the realm of ack.

The ack tool was created specifically for finding text within the source code of programs. Because of this, the tool has been optimized to search certain files and ignore others.

For instance, if you are searching your project's directory structure, you will almost never want to search the version control system's repository hierarchy. This contains information about older versions of files, and would likely result in many duplicates. Ack realizes that this is not where you want to search, so it ignores these directories. This leads to more focused results, as well as fewer false positives.

In a similar vein, it will ignore common backup files created by certain text editors. It will also not attempt to search non-coding files commonly found in source directories, such as "minified" versions of web files, image and PDF files, etc. All of these things lead to better results for almost all searches. You can always override these settings during execution.

Another feature of ack is that it knows about the source files of different languages. You can ask it to find all Python files in the directory structure. It will return all files that end with .py, but it will also return any file whose first line looks like this:

#!/path/to/python

In other words, ack matches files identified by their extension and also scripts whose first line (the "shebang" line) names the interpreter that should run them:

#!/path/to/interpreter/to/run

This creates a powerful way to categorize very different kinds of files as being related. You can also add or modify the groupings to your liking.
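
As a rough illustration of this two-part rule, the function below treats a file as Python if its name ends in .py or if its first line is a shebang that mentions python. This is an approximation written for this guide, not ack's actual implementation, and the name is_python_file is hypothetical:

```shell
# Succeed if FILE looks like a Python file by extension or by shebang.
is_python_file() {
    file=$1
    # Rule 1: match on the file extension.
    case "$file" in
        *.py) return 0 ;;
    esac
    # Rule 2: match on a shebang in the first line, e.g. #!/usr/bin/env python
    first_line=$(head -n 1 "$file" 2>/dev/null) || first_line=
    case "$first_line" in
        '#!'*python*) return 0 ;;
    esac
    return 1
}
```

It could be used as `is_python_file clint.py && echo "looks like Python"`.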

Prepare the Environment

The best way to demonstrate the power of ack is to use it on a source code directory.

Luckily, we can easily pull down a source tree from a public site like GitHub. Install git so that we can pull down a repository:

sudo apt-get install git

Now, we need to grab a project. The neovim project is a good example because it contains many different kinds of files. Let's clone that repository to our home directory:

cd ~
git clone https://github.com/neovim/neovim.git

Now, let's move into that directory to get started:

cd neovim

Check out the different files to get an idea of the variety we have:

ls

BACKERS.md       CMakeLists.txt   Doxyfile   scripts      uncrustify.cfg
clint-files.txt  config           Makefile   src          vim-license.txt
clint.py         contrib          neovim.rb  test
cmake            CONTRIBUTING.md  README.md  third-party

Just in that top-level directory, we see markdown files, plain text files, a Ruby file, and a Python file. The main portion of the project is written in C.

We also want to set a few things to make our lives easier.

We want to pipe the output directly into less if the results are larger than our terminal window. This will prevent the output scrolling uncontrollably off the screen.

Do that by typing:

echo '--pager=less -RFX' >> ~/.ackrc

This will create our ack configuration file and add its first non-default option. We tell it to pipe output to less with options that display colored output correctly (-R), exit immediately when the results fit on a single screen (-F), and leave the results visible in the terminal after less exits (-X).
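
Because >> appends every time it runs, repeating the echo command leaves duplicate lines in ~/.ackrc. One way to avoid that, sketched here with a hypothetical helper named add_ackrc_line, is to append only when the exact line is missing:

```shell
# Append LINE to FILE unless an identical line is already there.
add_ackrc_line() {
    file=$1
    line=$2
    # -x: whole-line match, -F: fixed string, -e: the pattern may start with "--"
    if ! grep -qxF -e "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (commented out so that pasting this block does not edit your config):
# add_ackrc_line ~/.ackrc '--pager=less -RFX'
```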

Simple Searching with Ack

Let's get started. To begin, let me demonstrate the difference between what grep would search and what ack searches.

Grep searches every file in the directory structure for matches. We can see the total number of files in this project by typing:

find . | wc -l

566

At the time of this writing, find reports 566 entries in the neovim project (a figure that counts directories as well as files). To find out how many files ack cares about, we can type:

ack -f | wc -l

497

As you can see, we've already eliminated around 12% of the files to be searched without even doing anything.
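
Keep in mind that find with no tests lists directories as well as files, and that it descends into the hidden .git directory. For a closer estimate of the regular files a recursive grep would read, the count can be limited with -type f, optionally pruning the version control data. A sketch, with illustrative function names:

```shell
# Count regular files under DIR, including anything inside .git.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Count regular files under DIR, skipping the .git directory entirely.
count_files_no_git() {
    find "$1" -name .git -prune -o -type f -print | wc -l | tr -d ' '
}
```

Running `count_files .` and `count_files_no_git .` from the top of the project gives two numbers that can be compared with the output of ack -f.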

Let's say we want to find out all of the instances where the pattern "restrict" is found in this project. We can type:

ack restrict

Doxyfile
1851:# that the size of a graph can be further restricted by MAX_DOT_GRAPH_DEPTH.
1860:# code bases. Also note that the size of a graph can be further restricted by
1861:# DOT_GRAPH_MAX_NODES. Using a depth of 0 means no depth restriction.

vim-license.txt
3:I)  There are no restrictions on distributing unmodified copies of Vim except
5:    unmodified parts of Vim, likewise unrestricted except that they must
. . .

As you can see, ack divides up the instances of "restrict" by the file where the matches were found. Furthermore, it gives the exact line number.

But as you can see, some of the instances (all of the sample portion I copied) are matching variations of "restrict" like "restricted" and "restriction". What if we only want the word "restrict"?

We can use the -w flag to tell it to search for instances of our pattern surrounded by word boundaries. This will eliminate the other forms of the word:

ack -w restrict

vim-license.txt
37:       add.  The changes and their license must not restrict others from

clint.py
107:      Specify a number 0-5 to restrict errors to certain verbosity levels.

src/nvim/fileio.c
6846:   * Allow nesting of autocommands, but restrict the depth, because it's
. . .

As you can see, our results now only show "restrict" without the variations we saw before. The output is much more focused.

You may have noticed above that the results that we have are found in different kinds of file types. One is a plain text file, there is one found in a Python file, and there are multiple cases in C source files.

If we want to tell ack to only show us the results found in Python files, we can do this painlessly by typing:

ack -w --python restrict

clint.py
107:      Specify a number 0-5 to restrict errors to certain verbosity levels.

We haven't had to specify the file patterns that we were looking for. We haven't had to craft special regular expressions to catch the type of files we want without matching others. Ack simply knows the files of many common languages and you can refer to them by name.

We'll go over how you can modify the files ack returns for each language and how to define your own language groups later.

Analyzing our Search Focus

We have gone from a very broad set of results to a single match by adding some very simple flags. Let's see exactly how much we've narrowed down our results.

We can use the flags -ch, which we can think of as a simple idiom meaning "how many matches were returned?". By itself, the -c flag tells ack to return only the count of matching lines in each file, like this:

ack -c restrict

Doxyfile:3
Makefile:0
uncrustify.cfg:0
.travis.yml:0
neovim.rb:0
vim-license.txt:5

This will return a line for every file, even those with no matches.

The -h flag by itself suppresses the filename prefix in the output. When -c and -h are combined, ack prints a single number representing the total number of lines where the search was matched:

ack -ch restrict

101
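
To see how the two forms relate, the per-file counts from -c can be added up; the total should equal what -ch prints. A small sketch using awk, where sum_counts is a hypothetical helper that reads file:count lines on standard input:

```shell
# Sum the trailing ":N" counts from lines shaped like "path/to/file:N".
sum_counts() {
    awk -F: '{ total += $NF } END { print total + 0 }'
}

# Feeding it the six sample lines shown above:
printf '%s\n' \
    'Doxyfile:3' 'Makefile:0' 'uncrustify.cfg:0' \
    '.travis.yml:0' 'neovim.rb:0' 'vim-license.txt:5' | sum_counts
# prints 8
```

Piping the real output through the helper, as in `ack -c restrict | sum_counts`, should give the same number as ack -ch restrict.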

We started with 101 results. When we told it to pay attention to word boundaries, we cut a large chunk of these out:

ack -ch -w restrict

16

And of course, when we specified that we only wanted to see Python files, we narrowed our results to a single match:

ack -ch -w --python restrict

1

Not only have we narrowed our search, but by adding the language restriction, we've actually sped up the search. Ack does not simply filter the results afterwards based on the language you request. It selects the files before searching, which saves it from having to read irrelevant files.

We can see this by timing the searches with the time command:

time ack -ch restrict

101

real    0m0.407s
user    0m0.363s
sys     0m0.041s

Now let's try the language-specific subset search:

time ack -ch -w --python restrict

1

real    0m0.204s
user    0m0.175s
sys     0m0.028s

The second search finishes in roughly half the time of the first (0.204 seconds versus 0.407 seconds).

Modifying the Search Output

We've already talked about modifying the search output a little bit when we went over the -c and -h flags. There are other helpful flags that can help us shape the output that we want.

For instance, as you saw before, the -c flag prints out the number of lines where a match pattern was found in each file. We modified it with -h before, but we could also modify it with -l instead. This will only return counts for files where a match was found:

ack -cl restrict

Doxyfile:3
vim-license.txt:5
clint.py:1
test/unit/formatc.lua:1
src/nvim/main.c:4
src/nvim/ex_cmds.c:5
src/nvim/misc1.c:1
. . .

As you can see, all of the lines that end with "0" have been pruned from the output.

If you want to see the column that a match is found within a line, you can tell ack to print that information as well with the --column option:

ack -w --column --python restrict

clint.py
107:31:      Specify a number 0-5 to restrict errors to certain verbosity levels.

The second number that is given is the column number where the match's first character occurs. Some editors let you go to a specific line and column, which makes this very helpful.

For instance, if you open the clint.py file with the vim text editor, you could go to the exact position of the match by typing 107G to get to the line, and then 31| to get to the column position. This kind of precise positioning can be really helpful, especially if you are searching for a common substring within larger words.
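
When ack's output is piped rather than printed to a terminal, each match is typically written on one line as file:line:column:text. Under that assumption, a line of output can be turned into a vim invocation that opens the file with the cursor on the match. The helper name vim_command_for is hypothetical, and file names that contain colons are not handled:

```shell
# Read one "file:line:column:text" record and print a vim command for it.
vim_command_for() {
    printf '%s\n' "$1" | {
        IFS=: read -r file line column _rest
        printf "vim '+call cursor(%s, %s)' %s\n" "$line" "$column" "$file"
    }
}

vim_command_for 'clint.py:107:31:      Specify a number 0-5 to restrict errors'
# prints: vim '+call cursor(107, 31)' clint.py
```

The printed command uses vim's +{command} argument together with its built-in cursor() function, so it has the same effect as typing 107G and 31| by hand.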

If you need more context for the results, you can tell ack to print out lines before or after the match occurrence. For instance, to print out 5 lines before the "restrict" match in the python file, we can use the -B flag like this:

ack -w --python -B 5 restrict

102-    output=vs7
103-      By default, the output is formatted to ease emacs parsing.  Visual Studio
104-      compatible output (vs7) may also be used.  Other formats are unsupported.
105-
106-    verbose=#
107:      Specify a number 0-5 to restrict errors to certain verbosity levels.

You can specify the number of context lines after the match with the -A flag:

ack -w --python -A 2 restrict

107:      Specify a number 0-5 to restrict errors to certain verbosity levels.
108-
109-    filter=-x,+y,...

You can specify a general purpose context specification that will print a number of lines above and below the matches with the -C flag. For instance, to get 3 lines of context in either direction, type:

ack -w --python -C 3 restrict

104-      compatible output (vs7) may also be used.  Other formats are unsupported.
105-
106-    verbose=#
107:      Specify a number 0-5 to restrict errors to certain verbosity levels.
108-
109-    filter=-x,+y,...
110-      Specify a comma-separated list of category-filters to apply: only

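To print the files that ack would search, without searching for anything, you can use the -f flag. Combined with a file type, it lists every file of that type. (If you want only the files that contain a match, use the -l flag with a pattern instead.)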

ack -f --python

clint.py
contrib/YouCompleteMe/ycm_extra_conf.py

We can do the same thing, but also specify a pattern for the file/directory structure by using the -g flag. For instance, we can search for all of the C language files that have the pattern "log" somewhere in their path by typing:

ack -g log --cc

src/nvim/log.h
src/nvim/log.c

Working with File Types

We've seen the basics of how to filter by file type. We can tell ack to only show us the C language files by typing:

ack -f --cc

test/includes/pre/sys/stat.h
src/nvim/log.h
src/nvim/farsi.h
src/nvim/main.c
src/nvim/ex_cmds.c
src/nvim/os/channel.c
src/nvim/os/server.c
. . .

You can see all of the languages that ack knows about, and which extensions and file properties it associates with each category by typing:

ack --help-types

Usage: ack-grep [OPTION]... PATTERN [FILES OR DIRECTORIES]

The following is the list of filetypes supported by ack-grep.  You can
specify a file type with the --type=TYPE format, or the --TYPE
format.  For example, both --type=perl and --perl work.

Note that some extensions may appear in multiple types.  For example,
.pod files are both Perl and Parrot.

    --[no]actionscript .as .mxml
    --[no]ada          .ada .adb .ads
    --[no]asm          .asm .s
    --[no]asp          .asp
. . .

As you can see, this gives you the matching parameters for each file type. You can also tell ack to exclude files of a certain category by preceding a type with "no".

So we could see the number of C language files we have by typing:

ack -f --cc | wc -l

191

And we can do the reverse to see the number of non-C language files by typing:

ack -f --nocc | wc -l

306
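
Since every file either belongs to the cc type or does not, the two counts should add up to the total that ack -f reported earlier. A quick arithmetic check with the numbers from this guide:

```shell
c_files=191        # from: ack -f --cc | wc -l
non_c_files=306    # from: ack -f --nocc | wc -l
all_files=497      # from: ack -f | wc -l

if [ $((c_files + non_c_files)) -eq "$all_files" ]; then
    echo "counts agree: $((c_files + non_c_files))"
else
    echo "counts differ"
fi
# prints: counts agree: 497
```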

What if we want to modify a type categorization? For instance, what if we want to match .sass, .scss, and .less files when we are looking for CSS files? We can see that these are already matched by the "sass" and "less" types, but we can also add them to the CSS category if we would like.

To do this, we can use this general syntax:

ack --type-add=TYPE:FILTER:ARGS

The --type-add option appends additional match rules for a specified TYPE. The FILTER in this case is ext, which means match by file extension. The ARGS are then the additional extensions that we want to add.

The full command would look like this:

ack --type-add=css:ext:sass,scss,less

This however only applies to the current command (which isn't doing any searching). We could add the searching by typing:

ack --type-add=css:ext:sass,scss,less -f --css

This would return any files that end in .css, .sass, .scss, and .less. There don't happen to be any of these files in our project. Either way, passing the option on the command line is not very useful because the added rule only lasts for that single invocation. You can make this permanent by adding it to your ~/.ackrc file:

echo "--type-add=css:ext:sass,scss,less" >> ~/.ackrc

If we want to create an entirely new type, we would use the --type-set option instead. The syntax is entirely the same, the only difference being that it is used to define a non-existent type.

As you have probably gathered, the TYPE from our initial syntax specification is just the category name. The FILTER we saw was the file extension, but we can use other filters as well.

We can match the file name directly by using the is filter. To create a type called example that matches a file called example.txt, we could add this to our ~/.ackrc file:

--type-set=example:is:example.txt

We can also define matches with normal regular expressions by using the match filter. For instance, if we wanted to create a type called "bashcnf" that matches ".bashrc" and ".bash_profile" files, we could type:

echo '--type-set=bashcnf:match:/\.bash(rc|_profile)/' >> ~/.ackrc
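
In a regular expression an unescaped dot matches any character, so a pattern intended for dotfiles is stricter when the dot is escaped. The difference can be tried out on plain file names with grep -E, which treats this particular pattern the same way Perl-style regular expressions do. This is only a stand-in for experimenting; it does not call ack, and the helper names are hypothetical:

```shell
# Pattern with an unescaped dot: any character may precede "bash".
loose_match() {
    printf '%s\n' "$1" | grep -Eq '.bash(rc|_profile)'
}

# Pattern with the dot escaped: a literal "." must precede "bash".
strict_match() {
    printf '%s\n' "$1" | grep -Eq '\.bash(rc|_profile)'
}

for name in .bashrc .bash_profile mybashrc; do
    if loose_match "$name";  then loose=yes;  else loose=no;  fi
    if strict_match "$name"; then strict=yes; else strict=no; fi
    echo "$name loose=$loose strict=$strict"
done
# .bashrc loose=yes strict=yes
# .bash_profile loose=yes strict=yes
# mybashrc loose=yes strict=no
```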

Conclusion

As you can see, ack is a very flexible tool for working with programming source code. Even if you only use it to find files within your Linux environment, its extra capabilities will often come in handy.

By Justin Ellingwood

Creative Commons License