Nathaniel Knight

Reflections, diversions, and opinions from a progressive ex-physicist programmer dad with a sore back.

Reading Code with Emacs: Finding things to look at

In the previous article about reading code with Emacs we looked at some tools that are useful when you've already found the code you need to work on. If you're trying to read through many hundreds or thousands of lines and figure out what's relevant, those tools probably aren't very helpful; you need to find the important parts of your program first!

This article is about some of the Emacs tools I like to use for digging through mountains of code. We'll start with occur, which helps summarize a single file, then move on to rgrep, which searches through whole directories, and finally finish up with bookmarks, for marking the important parts of code.

Fundamentals

Emacs has solid and deep tools for strings, regular expressions, and symbols right out of the box. You should glance through the search section of the manual to see what's there. You don't need to learn all the details, but you'll know to go look for them when the time is right.

Confession: I discovered several features (like symbol search and lax search!) while reviewing the documentation for this post, so it's worth revisiting even if you're an Emacs veteran.

Summarizing files with occur

Once you've got the hang of the basic string and regex search I suggest checking out occur, which finds all the occurrences of a pattern in your current file and displays them in a temporary buffer (called *Occur*). This is handy for finding "everywhere this function/exception/constant is used" or "all the function definitions" without having to use a heavier-weight parsing tool.

For example, here's a file from the source code of Flask where we've listed every line that mentions import with M-x occur RET import RET:

__version__ = '0.11.2-dev'

# utilities we import from Werkzeug and Jinja2 that are unused
# in the module but are exported as public interface.
from werkzeug.exceptions import abort
from werkzeug.utils import redirect
from jinja2 import Markup, escape

from .app import Flask, Request, Response
 __init__.py
14 matches for "import" in buffer: __init__.py
     15:# utilities we import from Werkzeug and Jinja2 that are unused
     17:from werkzeug.exceptions import abort
     18:from werkzeug.utils import redirect
     19:from jinja2 import Markup, escape
     21:from .app import Flask, Request, Response
     22:from .config import Config
 *Occur*

Clicking or pressing Return on a match in the *Occur* buffer will send you to its location so you can see it in context. You can also have occur display leading or trailing lines of context around each match by giving it a numeric prefix argument. You can read the documentation with C-h f occur RET.
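If you find yourself wanting context lines most of the time, a small wrapper command is one way to get them. This is only a sketch: the name my/occur-with-context and the choice of two context lines are my own assumptions, not something built into Emacs.

```elisp
;; Sketch only: `my/occur-with-context' is a hypothetical helper name.
(defun my/occur-with-context (regexp)
  "Run `occur' for REGEXP, showing two lines of context around each match."
  (interactive "sList lines matching regexp: ")
  ;; The second argument to `occur' is the number of context lines.
  ;; Interactively, a numeric prefix argument does the same thing,
  ;; e.g. C-u 2 M-x occur RET import RET.
  (occur regexp 2))
```

After evaluating this, M-x my/occur-with-context RET import RET should produce the same *Occur* buffer as above, with each match surrounded by its neighbouring lines.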

Search directories with rgrep

If you need to search a whole code-base for a function, class, or exception, turn to rgrep, which will search whole directories for a regular expression and list the results for you.

For example, here are all of the import statements in Flask's source code, found using rgrep. (I've cut out some of the line beginning with find for formatting; it's a giant gnarly shell command that rgrep builds to do the search for you using the GNU findutils.)

__version__ = '0.11.2-dev'

# utilities we import from Werkzeug and Jinja2 that are unused
# in the module but are exported as public interface.
from werkzeug.exceptions import abort
from werkzeug.utils import redirect
from jinja2 import Markup, escape

from .app import Flask, Request, Response
 __init__.py
-*- mode: grep; default-directory: "~/tmp/flask/flask/" -*-
Grep started at Thu Sep 29 21:27:03

find . -type d \( -path \*/SCCS -o -path \*/RCS -o -path \*/CVS ...
./__init__.py:15:# utilities we import from Werkzeug and Jinja2 that are unused
./__init__.py:17:from werkzeug.exceptions import abort
./__init__.py:18:from werkzeug.utils import redirect
./__init__.py:19:from jinja2 import Markup, escape
./__init__.py:21:from .app import Flask, Request, Response
./__init__.py:22:from .config import Config
 *grep*  

You can limit your search by providing a glob pattern, so you can search just *.py files, or test_* files for example.
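If you always search the same kind of file, you can skip the glob prompt by calling rgrep from Lisp. The following is a minimal sketch: the command name my/rgrep-python and the hard-coded *.py glob are assumptions for illustration.

```elisp
(require 'grep)

;; Sketch only: `my/rgrep-python' is a hypothetical helper name.
(defun my/rgrep-python (regexp dir)
  "Search DIR recursively for REGEXP, looking only at *.py files."
  (interactive "sSearch for regexp: \nDIn directory: ")
  ;; Make sure the find/grep command templates are initialised
  ;; before calling `rgrep' from Lisp.
  (grep-compute-defaults)
  ;; Arguments are: the regexp, a file glob, and the starting directory.
  (rgrep regexp "*.py" dir))
```

For the Flask example above, that would be M-x my/rgrep-python RET import RET ~/tmp/flask/flask/ RET, or from Lisp, (my/rgrep-python "import" "~/tmp/flask/flask/").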

A note for Windows users: since rgrep relies on external find and grep programs (such as those from the GNU findutils and GNU grep), it may or may not work for you, depending on your setup.

Jump anywhere with Bookmarks

Finally, if you end up in the dire situation of having to keep track of several points in your code which are far apart and not easy to find, you might have to resort to bookmarks. They're useful if you frequently have to jump back to an important function or a tricky class definition, and don't want to search for its name every time. I don't often use bookmarks, but they can be just the thing in these kinds of cases.

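Bookmarks let you save a point in a file, give it a name, and jump back to it later. The essential commands for navigating in this fashion are:

C-x r m (bookmark-set): save the current position as a named bookmark
C-x r b (bookmark-jump): jump to a named bookmark
C-x r l (bookmark-bmenu-list): list all of your bookmarks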

The Emacs manual has a section on them if you need more advanced features.
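The same save-and-jump workflow can also be driven from Lisp, which is handy if you want to script it or try it out with M-:. This is a rough sketch: the bookmark name "flask-app-class" is made up for the example, and bookmark-set needs to be evaluated in a buffer that is visiting a file.

```elisp
(require 'bookmark)

;; Optional: write the bookmark file every time a bookmark changes,
;; rather than only when Emacs exits.
(setq bookmark-save-flag 1)

;; Save the current position under a name (the name is just an example).
(bookmark-set "flask-app-class")

;; Later, from anywhere, jump back to it.
(bookmark-jump "flask-app-class")

;; Show every saved bookmark in a *Bookmark List* buffer.
(bookmark-bmenu-list)
```

Naming bookmarks after the thing they point at, rather than the file they live in, tends to make them easier to pick out of the list later.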

Onwards!

These tools are what I reach for first when exploring a program, and hopefully you'll find them useful too. If you find you need something more, I would suggest looking into language-specific plug-ins for your project.

Google emacs tools <my language> to see what's available; there are a few different places to find Emacs packages. I try to stick with things in MELPA (they tend to be well supported and tidy to install/uninstall with Emacs's packaging system) but there are sometimes pearls in other places. Unless you're into really esoteric languages, there are probably tools for jumping to function or class definitions, tracking control flow, inferring types, and more!