A rabbit hole full of Lisp
At work I contribute to a moderately-sized monorepo at 70 thousand files,
8-digit lines of code and hundreds of PRs merged every day. One day I
opened a remote buffer at that repository and ran M-x find-file.
find-file is an interactive function that shows a narrowed list of files in the current directory and prompts the user to filter and scroll through candidates and pick a file to open.
Emacs froze for 5 seconds before showing me the
find-file prompt. Which
isn’t great, because when writing software, opening files is actually
something one needs to do all the time.
Luckily, Emacs is “the extensible, customizable, self-documenting real-time
display editor”, and comes with profiling capabilities:
M-x profiler-start starts a profile and
M-x profiler-report displays a call tree showing how
many CPU cycles are spent in each function call after starting the profile.
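The same two commands can also be driven from Lisp. This is a minimal sketch, not something from my configuration: `my/profile-call` is a hypothetical helper name, and it assumes the CPU profiler is available on the platform and not already running.

```elisp
(require 'profiler)

(defun my/profile-call (fn)
  "Profile a call to FN (a function of no arguments), then show the report.
`my/profile-call' is a hypothetical helper, not part of Emacs."
  (profiler-start 'cpu)       ; like M-x profiler-start, choosing "cpu"
  (unwind-protect
      (funcall fn)
    (profiler-report)         ; like M-x profiler-report
    (profiler-stop)))         ; stop sampling once the report is built
```

For an interactive command, something like `(my/profile-call (lambda () (call-interactively #'find-file)))` would profile one invocation.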
Starting a profile and running
M-x find-file showed that all time was being
spent in a function called
ffap-guess-file-name-at-point, which was being called through
file-name-at-point-functions, an abnormal hook.
I checked the documentation for that function with
M-x describe-function ffap-guess-file-name-at-point and it didn’t seem
to be something essential, so I removed it from the hook by running
M-x eval-expression, writing a form that removes it, and pressing RET.
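A minimal sketch of such a form, assuming a plain `remove-hook` on the global value of the hook is enough (the exact form I typed is not reproduced here):

```elisp
;; Drop the ffap guesser from the hook for the rest of this session.
(remove-hook 'file-name-at-point-functions
             #'ffap-guess-file-name-at-point)
```

Evaluated this way the change only lives in the running Emacs. To survive a restart it has to go into the configuration as well.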
This solved the immediate problem of Emacs blocking for 5 seconds every
time I ran
find-file, with no noticeable drawbacks.
A note on file-name-at-point-functions: I can't reproduce the issue anymore. The initial issue might have been:
- caused by having manually mutated the Emacs environment via ad-hoc code evaluation (drifting from the state defined in configuration)
- caused by settings or packages that aren't in my configuration anymore
- fixed by settings or packages that were recently added to my configuration
- fixed by some recent package upgrade
Or some combination of the above. I have no idea exactly what. Which is to say: maintaining Emacs configurations is complicated.
I could now navigate around and open files. The next thing I tried in this
remote git repository was searching through project files. The great
projectile package provides the
projectile-find-file function for that,
but I had previously given up making projectile perform well with remote
buffers; given how things are currently implemented it seems to be
impractical. So I installed the find-file-in-project package for use on
remote projects exclusively:
M-x package-install find-file-in-project.
find-file-in-project (aliased as ffip) will:
- show a narrowed list of all project files in the minibuffer
- prompt the user to filter and scroll through candidates
- open a file when RET is pressed on a candidate.
To disable projectile on remote buffers I had the following form in my configuration.
Which causes the
projectile-project-root function to not run its usual
implementation on remote buffers, but instead return nil.
projectile-project-root is used as a way to either get the project root for
a given buffer (remote or not), or as a boolean predicate to test if the
buffer is in a project (e.g., a git repository directory). Having it return
nil on remote buffers effectively disables projectile on remote buffers.
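A minimal sketch of that behavior using `advice-add`. The advice function name `my/projectile-ignore-remote` is hypothetical, and the form in my configuration may have been written differently:

```elisp
(defun my/projectile-ignore-remote (orig-fn &rest args)
  "Call ORIG-FN with ARGS unless `default-directory' is remote.
Hypothetical advice function: on remote buffers it returns nil
without running the original implementation."
  (unless (file-remote-p default-directory)
    (apply orig-fn args)))

;; Only install the advice once projectile is loaded.
(with-eval-after-load 'projectile
  (advice-add 'projectile-project-root
              :around #'my/projectile-ignore-remote))
```

`file-remote-p` only inspects the file name, so the check itself does not open a connection.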
I then wrote a function that falls back to
ffip when projectile is
disabled and bound it to the keybinding I had for projectile-find-file,
so that I could press the same keybinding whenever I wanted to search for
project files, and not have to think about whether I’m on a remote buffer.
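A sketch of what such a fallback can look like. The function name matches the one used in this post, but the body is an assumption, and the key in the usage note is hypothetical:

```elisp
(defun maybe-projectile-find-file ()
  "Find a project file with projectile, or with ffip when projectile is off.
Sketch only: assumes `projectile-project-p' returns nil wherever
projectile has been disabled (for example on remote buffers)."
  (interactive)
  (call-interactively
   (if (and (fboundp 'projectile-project-p)
            (projectile-project-p))
       #'projectile-find-file
     #'find-file-in-project)))
```

It could then be bound with something like `(global-set-key (kbd "C-c p f") #'maybe-projectile-find-file)`.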
And called it:
Emacs froze for 30 seconds. After that, it showed the prompt with the narrowed list of files in the project. 30 seconds! What was it doing during the whole time? Let’s try out the profiler again.
Start a new profile:
M-x profiler-start
Call the function to be profiled:
M-x maybe-projectile-find-file (it freezes Emacs again for 30 seconds)
And display the report:
M-x profiler-report
This tells us that 98% of the CPU time was spent in the first entry of the call tree.
Pressing TAB on a line expands it to show its child function calls.
Expanding that entry shows that Emacs spent 64% of CPU time in
ivy--insert-minibuffer and 9% of the time (roughly 3 whole seconds!)
garbage collecting. I had
garbage-collection-messages set to
t so I could
already tell that Emacs was GCing a lot; enabling this setting makes a
message be displayed in the echo area whenever Emacs garbage collects. I
could also see the Emacs process consuming 100% of one CPU core while it was
frozen and unresponsive to input.
The profiler package implements a sampling profiler. The
elp package can be used for getting actual wall clock times.
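As a rough illustration of the elp workflow, here is a sketch that instruments a throwaway function. `my/slow-step` is a hypothetical stand-in for whatever function is being measured:

```elisp
(require 'elp)

(defun my/slow-step ()
  "Hypothetical function to measure; it just sleeps briefly."
  (sleep-for 0.01))

(elp-instrument-function 'my/slow-step) ; start recording calls and timings
(dotimes (_ 5) (my/slow-step))          ; exercise the instrumented function
(elp-results)                           ; show call counts and elapsed times
(elp-restore-all)                       ; remove the instrumentation again
```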
Drilling down on
#<compiled 0x131f715d2b6fa0a8> shows that cycles there
(17% of CPU time) were spent on Emacs waiting for user input, so we can
ignore it for now.
As I get deep in drilling down on
ivy--insert-minibuffer, names in the
“Function” column start getting truncated because the column is too narrow.
A quick Google search (via
M-x google-this emacs profiler report width)
shows me how to make it wider:
Describing those variables with
M-x describe-variable shows their default values.
From the profiler report buffer I run
M-x eval-expression, paste the form with
C-y and press
RET. I also persist this form to my configuration.
Pressing c in the profiler report buffer (bound to
profiler-report-render-calltree) redraws it, now with a wider column,
allowing me to see the function names.
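A sketch of that kind of form, with several assumptions: that the two variables are `profiler-report-cpu-line-format` and `profiler-report-memory-line-format`, that the first element of each list describes the function column (true of older Emacs releases, while newer ones lay the report out differently), and that 100 columns is wide enough. The other widths are illustrative, not the defaults quoted in the post.

```elisp
(require 'profiler)

;; Widen the first (function name) column of the CPU report.
(setq profiler-report-cpu-line-format
      '((100 left)
        (24 right ((19 right) (5 right)))))

;; Same idea for the memory report.
(setq profiler-report-memory-line-format
      '((100 left)
        (19 right ((14 right) (5 right)))))
```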
Here is the abbreviated expanded relevant portion of the call stack.
A couple of things to unpack here. From lines 8-11 it can be deduced that
ffip maps a lambda that calls
expand-file-name over all completion
candidates, which in this case are around 70 thousand file names. Running
M-x find-function ffip-project-search and narrowing to the relevant region
in the function shows exactly that.
find-function shows the definition of a given function in its source file.
On line 11 of the profiler report we can see that 60% of 30 seconds (18
seconds) was spent on
expand-file-name calls. By dividing 18 seconds by
70000 we get that
expand-file-name calls took 250µs on average. 250µs is how
long a modern computer takes to read 1MB sequentially from RAM! Why would my
computer need to do that amount of work 70000 times just to display a
narrowed list of files?
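The arithmetic can be checked in a scratch buffer. The helper name below is hypothetical, and the inputs are the rounded figures from the report:

```elisp
(defun my/average-microseconds-per-call (total-seconds share calls)
  "Average time per call in microseconds, rounded to an integer.
TOTAL-SECONDS is the whole run, SHARE the fraction of it spent in
the function, CALLS the number of calls."
  (round (/ (* total-seconds share 1e6) calls)))

(my/average-microseconds-per-call 30.0 0.60 70000) ; => 257
```

That is roughly the 250µs quoted above.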
Let’s see if the function documentation for
expand-file-name provides any clues, via
M-x describe-function expand-file-name.
Ok, so it sounds like
expand-file-name essentially transforms a file path
into an absolute path, based on either the current buffer’s directory or
optionally, a directory passed in as an additional argument. Let’s try
evaluating some forms with
M-x eval-expression both on a local and a
remote buffer to get a sense of what it does.
In a local dired buffer at my local home directory:
*dired /Users/mpereira @ macbook*
In a remote dired buffer at my remote home directory:
*dired /home/mpereira @ remote-host*
The expand-file-name call in
ffip-project-search doesn’t specify a
DEFAULT-DIRECTORY (the optional second parameter to expand-file-name), so
like in the examples above it defaults to the current buffer’s directory,
which in the profiled case is a remote path like in the second example above.
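To make the role of the second argument concrete, here is a small sketch. The file name `foo.txt` is illustrative, and the commented-out remote path assumes TRAMP's `/ssh:` syntax with a hypothetical host name:

```elisp
;; Explicit DEFAULT-DIRECTORY (on a Unix-like system):
(expand-file-name "foo.txt" "/Users/mpereira/")
;; => "/Users/mpereira/foo.txt"

;; Without it, the value of `default-directory' is used instead:
(let ((default-directory "/Users/mpereira/"))
  (expand-file-name "foo.txt"))
;; => "/Users/mpereira/foo.txt"

;; An absolute name ignores the directory altogether:
(expand-file-name "/etc/hosts" "/Users/mpereira/")
;; => "/etc/hosts"

;; With a remote directory the call is dispatched to TRAMP's file
;; name handler, which is where the extra cost comes from:
;; (expand-file-name "foo.txt" "/ssh:remote-host:/home/mpereira/")
```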
With a better understanding of what
expand-file-name does, let’s now try
to understand how it performs. We can benchmark it with benchmark-run on
local and remote buffers, and compare their runtimes.
M-x describe-function benchmark-run
Benchmarking it in a local dired buffer at my local home directory
*dired /Users/mpereira @ macbook*
and in a remote dired buffer at my remote home directory
*dired /home/mpereira @ remote-host*
showed that it took 0.3 seconds to run
expand-file-name 70 thousand times
on a local buffer, and 30 seconds to do so on a remote buffer: two orders
of magnitude slower. 30 seconds is more than what we observed in the
profiler report (18 seconds), and I’ll attribute this discrepancy to
unknowns; maybe the
ffip execution took advantage of byte-compiled code
evaluation, or maybe it’s something else entirely. Nevertheless, this
experiment clearly corroborates the profiler report results.
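A sketch of how such a benchmark can be written. `my/benchmark-expand-file-name` is a hypothetical helper, and `foo.txt` is an arbitrary relative name:

```elisp
(require 'benchmark)

(defun my/benchmark-expand-file-name (directory repetitions)
  "Time REPETITIONS calls of `expand-file-name' relative to DIRECTORY.
Returns what `benchmark-run' returns: a list of total elapsed
seconds, number of garbage collections, and seconds spent in GC."
  (let ((default-directory directory))
    (benchmark-run repetitions
      (expand-file-name "foo.txt"))))

;; Local run. Passing a TRAMP directory instead measures the remote case.
(my/benchmark-expand-file-name "/tmp/" 70000)
```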
So! Back to
ffip. Looking again at the previous screenshot, it seems that
the list of displayed files doesn’t even show absolute file paths. Why is
expand-file-name being called at all? Maybe calling it isn’t essential.
Let’s remove the
expand-file-name call by
- visiting the ffip-project-search function in the library file with M-x find-function ffip-project-search
- replacing the expand-file-name call with plain file in the lambda
- re-evaluating the modified function
and see what happens.
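The shape of the change can be sketched as follows. This is not find-file-in-project's actual code: both helper names are hypothetical, and pairing each displayed name with a path in a cons cell is an assumption made for illustration.

```elisp
(defun my/candidates-expanded (files)
  "Pair each name in FILES with its absolute path.
Calls `expand-file-name' once per file."
  (mapcar (lambda (file) (cons file (expand-file-name file)))
          files))

(defun my/candidates-plain (files)
  "Pair each name in FILES with itself, skipping `expand-file-name'."
  (mapcar (lambda (file) (cons file file))
          files))
```

Opening a relative name still works afterwards as long as it is resolved against the same directory the list was produced in.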
I run my function again:
It’s faster. This change alone reduces the time for
ffip to show the
candidate list from 30 seconds to 8 seconds with no noticeable drawbacks.
Which is better, but still not even close to acceptable.
Profiling the changed function shows that now most of the time is spent in
sorting candidates with
ivy-prescient-sort-function, and garbage
collection. Automatic sorting of candidates based on selection recency
comes from the excellent ivy and ivy-prescient packages, which I had
installed and configured. Disabling it with
M-x ivy-prescient-mode and re-running my function reduces the time further
from 8 seconds to 4 seconds.
Another thing I notice is that
ffip allows fd to be used as a backend
instead of GNU find.
fd claims to have better performance, so I install it
on the remote host and configure
ffip to use it. I evaluate a form to set that option,
like before, but I could also have used the very handy
M-x counsel-set-variable, which shows a narrowed list of candidates of all
variables in Emacs (in my setup there’s around 20 thousand) along with a
snippet of their docstrings, and on selection allows the variable value to
be set. Convenient!
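A sketch of that setting, assuming the option is named `ffip-use-rust-fd` in the installed version of find-file-in-project (worth confirming with M-x describe-variable before relying on it):

```elisp
;; Ask ffip to list files with fd instead of GNU find.
;; Assumption: `ffip-use-rust-fd' is the relevant user option.
(setq ffip-use-rust-fd t)
```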
Which brings my function’s runtime to a little over 2 seconds—a 15x performance improvement overall—achieved via:
- Manually evaluating a modified function from an installed library file
- Disabling useful functionality (prescient sorting)
- Installing a program on the remote host and configuring ffip to use it
The last point is not really an issue, but the whole situation is not ideal. Even putting aside all of the above points, I don’t want to wait for over 2 seconds every time I search for files in this project.
Let’s see if we can do better than that.
So far we’ve been mostly configuring and introspecting Emacs. Let’s now extend it with new functionality that satisfies our needs.
We want a function that:
- Based on a remote buffer’s directory, figures out its remote project root directory
- Runs fd on the remote project root directory
- Presents the output from fd as a narrowed list of candidate files, with it being possible to filter, scroll, and select a candidate from the list
- Has good performance and is responsive even on large, remote projects
Let’s see if there’s anything in find-file-in-project that we could reuse.
I know that
ffip is figuring out project roots and running shell commands
somehow. By checking out its library file with
M-x find-library find-file-in-project (which opens a buffer with the installed
find-file-in-project.el package file) I can see that the
shell-command-to-string function (included with Emacs) is being used for
running shell commands, and that the library has a function whose name
sounds a lot like what we need.
I have a keybinding that shows the documentation for the thing under the cursor. I use it to inspect the two functions:
Perfect. We should be able to reuse them.
I also know that the
ivy-read function provided by ivy should take care
of displaying the narrowed list of files. Looks like we won’t need to write
a lot of code.
To verify that our code will work on remote buffers we’ll need to evaluate
forms in the context of one. The
with-current-buffer macro can be used for that.
M-x describe-function with-current-buffer
For writing our function, instead of evaluating forms ad-hoc with
M-x eval-expression, we’ll open a scratch buffer and write and evaluate forms
directly from there, which should be more convenient.
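As a warm-up, here is a local sketch of with-current-buffer. The buffer and helper names are hypothetical, and a throwaway buffer stands in for a remote one:

```elisp
;; A scratch-like buffer with its own working directory.
(defvar my/demo-buffer (generate-new-buffer " *my-demo*"))

(with-current-buffer my/demo-buffer
  (setq default-directory "/tmp/"))   ; buffer-local once set

(defun my/directory-of (buffer)
  "Return the `default-directory' seen from inside BUFFER."
  (with-current-buffer buffer
    default-directory))

(my/directory-of my/demo-buffer) ; => "/tmp/"
```

The forms inside with-current-buffer run as if that buffer were the current one, and the previously current buffer is restored afterwards.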
I have a clone of the Linux git repository on my remote host. Let’s assign a
remote buffer for the officially funniest file in the Linux kernel
to a variable named
remote-file-buffer.
Notice that the buffer is just a value, and can be passed around to
functions. We’ll use it further ahead to emulate evaluating forms as if we
had that buffer opened, with the with-current-buffer macro.
Let’s start exploring by writing to the
*scratch* buffer and continuing to
evaluate forms one by one.
And now let’s evaluate some forms in the context of a remote buffer. Notice
that running hostname in a shell returns something different.
executable-find requires the second argument to be non-nil to search on remote hosts. Check
M-x describe-function executable-find for more details.
Emacs is not only running shell commands, but also evaluating forms as if it were running on the remote host. That’s pretty sweet!
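A sketch of the kind of forms involved, wrapped in hypothetical helpers that take the buffer as an argument. Passing a buffer that visits a remote file makes both run against the remote host. The REMOTE argument of `executable-find` needs Emacs 27 or later.

```elisp
(require 'subr-x) ; for `string-trim' on older Emacs versions

(defun my/hostname-in (buffer)
  "Return the output of the shell command hostname as run from BUFFER."
  (with-current-buffer buffer
    (string-trim (shell-command-to-string "hostname"))))

(defun my/fd-available-in-p (buffer)
  "Return the path of the fd executable as seen from BUFFER, or nil.
The non-nil second argument makes the search happen on the remote
host when BUFFER's `default-directory' is remote."
  (with-current-buffer buffer
    (executable-find "fd" t)))

;; Local example. Use a remote file buffer to query the remote host.
(my/hostname-in (current-buffer))
```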
Now that we made sure that the executable for
fd is available on the
remote host, let’s try running some fd commands.
fd tells us that there are 28 C files in
/home/mpereira/linux/kernel/time. Let’s see if we can get the project root,
which would be the root directory of the Linux clone.
That seems to work.
Let’s now play with
default-directory. This is a buffer-local variable
that holds a buffer’s working directory. By evaluating forms with a
rebound default-directory it’s possible to emulate being in another
directory, which could even be on a remote host. The code block below is an
example of that: the second form rebinds
default-directory to another directory.
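A local sketch of the idea, with /tmp/ as an arbitrary directory. A TRAMP path in its place would make the same forms run on a remote host.

```elisp
;; First form: runs in whatever directory the current buffer is in.
(shell-command-to-string "pwd")

;; Second form: same command, with `default-directory' rebound for
;; the duration of the `let'. The value should end in a slash.
(let ((default-directory "/tmp/"))
  (shell-command-to-string "pwd"))

;; File name functions see the rebound value too:
(let ((default-directory "/tmp/"))
  (expand-file-name "notes.txt"))
;; => "/tmp/notes.txt"
```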
I wonder how much Assembly and C are currently in the project.
Twenty seven million, eighty eight thousand, one hundred and sixty two lines of C, and almost half a million lines of Assembly. It’s fine.
Alright, at this point it feels like we have all the pieces: let’s put them together.
This is a bit longer than what we’ve been playing with, but even folks new to Emacs Lisp should be able to follow it:
- Set default-directory to be the project root directory (line 5)
- Build, execute, and parse the output of the fd command into a list of file names (lines 6-9)
- Display a file prompt showing a narrowed list of all files in the project (lines 10-13)
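The steps above can be sketched with built-ins only. This is not the original listing, so the line numbers in the list do not apply to it. It swaps in `locate-dominating-file` (looking for a `.git` entry) for the project root lookup and `completing-read` for ivy-read, and both function names are hypothetical.

```elisp
(defun my/parse-lines (output)
  "Split OUTPUT into a list of non-empty lines."
  (split-string output "\n" t))

(defun my/fd-project-find-file ()
  "Prompt for a file among all files fd lists under the project root.
Sketch only: the project root is assumed to be the nearest parent
directory containing .git, and fd is assumed to be installed on
the host the current buffer lives on."
  (interactive)
  (let* ((root (or (locate-dominating-file default-directory ".git")
                   (user-error "Not inside a git repository")))
         ;; Run fd from the project root (remote if the buffer is).
         (default-directory root)
         (files (my/parse-lines
                 (shell-command-to-string "fd --type f")))
         (choice (completing-read "Find file: " files nil t)))
    ;; A single `expand-file-name' call, for the selected file only.
    (find-file (expand-file-name choice root))))
```

With ivy-mode enabled, `completing-read` is displayed through ivy, so the prompt looks the same as before.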
Let’s see if it works.
Since it was declared
(interactive) we can also call it via M-x.
Going back to the large remote project and running it a
few times shows that it now runs in a little over a second, a 30x
improvement compared with what we started with.
This is still not good enough, so I went ahead and evolved the function we were working on so that most of the time it shows something on screen immediately and redraws asynchronously. You can check out the code at fast-project-find-file.el.
Jonathan Blow addresses this situation somewhat entertainingly in “Preventing the Collapse of Civilization”.
* * *
Did you notice how the function implementation came almost naturally from exploration? The immediate feedback from evaluating forms and modifying a live system—even though old news to Lisp programmers—is incredibly powerful. Combine it with an “extensible, customizable, self-documenting” environment and you have a very satisfying and productive means of creation.