Way back, I wrote about benchmarking some sloppy interview code using jsperf.com (now jsperf.app). The gist of it is that I wrote a naive array searching algorithm for finding the two largest numbers in an array, and upon discovering just how bad my solution was, I decided to benchmark it to properly stomp on my self-esteem a little bit. The contrast between the first solution and the second was surprising and incredibly insightful, and since then I've loved leaning on benchmarks to learn more and home in on better solutions.
These days I do most of my benchmarking in Go and Rust because it's implemented in their standard libraries so well; there's no reason not to do it. I'm not as diligent when it comes to TypeScript, so I wanted to revisit that old benchmark and try to devise a convention I'd actually use, regardless of which runtime I'm using.
Vitest offers an experimental bench command (which would be great to use because I use Vitest all the time), but it's still fairly unstable and uses tinybench under the hood, which I'm not as happy with as mitata. Deno and Bun have integrated mitata into their standard library, which is great to see, but I don't typically use them. So, I'm going to set up a bare-bones solution without any special tooling or conventions.
It looks like mitata hasn't been updated for a while now, and while it's stable and works very well, there are a few rough edges some people might want ironed out. Some relatively basic features would be great to see, like a reporter API or a timeout argument for groups or single benchmarks. Even so, it's well-featured for how simple and nice it is to use, and I'm not going to let that get in the way.
Although people tend to be resistant to the idea, each of these features can be implemented in a benchmark running abstraction, and my impression is that this might be what the creator of mitata thinks is the correct solution.
I like to write tests to ensure what I'm benchmarking is working properly and as expected, so I'm including Vitest; feel free to exclude it. I'm also including Prettier out of habit, and it's safe to leave out as well.
To run the benchmarks which will be written in TypeScript, I'm using
The TypeScript configuration is typical except for a ts-node entry. This is used to modify the compilation for running scripts with ts-node, which we don't necessarily have to do for this example, but under real-world circumstances I expect you would. Likely the most modern feature you need is top-level await, making it so you can await the run function exported from mitata. You can learn more about this ts-node convention here.
I've also added a specific secondary config for builds, which avoids compiling any benchmarking or test files and provides an output destination:
The Benchmarking Script
I've created a ./scripts directory with a file called benchmark.ts. This file uses glob to find all of our benchmarks and execute them:
This is rudimentary at this point, but it's loaded with potential. Although mitata always outputs results to stdout for you to view in the terminal, you'll see later that it also provides a way for us to read and, if we choose to, write results from the benchmarks here.
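For illustration, the file-discovery half of such a script could be sketched with only Node's built-in fs and path modules. This swaps glob for a hand-rolled directory walk, and findBenchFiles is a hypothetical helper name rather than anything from the actual script:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: recursively collect files whose names end with
// `suffix` under `root`, skipping node_modules. The result is sorted so
// benchmarks run in a stable order between invocations.
function findBenchFiles(root: string, suffix = ".bench.ts"): string[] {
  const found: string[] = [];

  const walk = (dir: string): void => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        // Dependencies may ship their own benchmarks; leave them out.
        if (entry.name !== "node_modules") walk(full);
      } else if (entry.isFile() && entry.name.endsWith(suffix)) {
        found.push(full);
      }
    }
  };

  walk(root);
  return found.sort();
}
```

Each returned path could then be loaded with a dynamic import() before the benchmarks are run. That step assumes the benchmark files register themselves when imported, which may not match how the real script is wired.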
Writing a Benchmark
This part is satisfyingly easy. In its most basic form, all you need to do is end a file's name with .bench.ts, wrap what you're measuring in bench() and then
Now we can use the bench script from package.json to run this benchmark:
$ yarn bench
benchmark                   time (avg)             (min … max)        p75       p99      p995
--------------------------------------------------------------- -----------------------------
hello (template string)  434.19 ps/iter   (284.9 ps … 902.06 ns)  366.8 ps   1.56 ns   2.27 ns
hello (concatenation)      7.62 ns/iter    (5.85 ns … 190.93 ns)   7.45 ns  20.89 ns  31.52 ns
In the example above I'm using run with no options passed in, but there are several to choose from:
|collect|collect returned values into an array during benchmark|
|colors|enable colors in non-json output|
|json|output results as json|
Note: I haven't figured out how collect does anything meaningfully different, but the rest are fairly straightforward.
run's Return Value
You might have noticed run returns a Promise<Report>; the report object looks like this:
The Report object struck me as the perfect way to store data about benchmarks after they complete. While the run's results will always output to stdout, you can use that as feedback while the actual results are piped into files for later use.
I haven't quite figured out how I want to use the reports yet, but I'd love to create a convention for identifying functions and their changes over time, and tracking their performance along with each diff. This could help point out major performance improvements or degradations during pull requests or other stages of development, but it wouldn't require any special effort from developers — you'd just need to write the benchmarks.
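As a rough sketch of what tracking performance over time might involve, the comparison step could look something like the following. BenchSummary is a hypothetical, pared-down shape (just a name and an average time) rather than mitata's actual Report type, and findRegressions and its tolerance parameter are likewise illustrative assumptions:

```typescript
// Hypothetical, simplified view of a single benchmark result. The real
// report carries more detail; only a name and an average are assumed here.
interface BenchSummary {
  name: string;
  avgNs: number; // average time per iteration, in nanoseconds
}

interface Regression {
  name: string;
  baselineNs: number;
  currentNs: number;
  ratio: number; // currentNs / baselineNs
}

// Compare current results against a stored baseline and return every
// benchmark whose average slowed down by more than `tolerance`
// (0.1 means "more than 10% slower").
function findRegressions(
  baseline: BenchSummary[],
  current: BenchSummary[],
  tolerance = 0.1,
): Regression[] {
  const baselineByName = new Map<string, number>();
  for (const entry of baseline) baselineByName.set(entry.name, entry.avgNs);

  const regressions: Regression[] = [];
  for (const entry of current) {
    const baselineNs = baselineByName.get(entry.name);
    // Skip benchmarks with no history or an unusable baseline.
    if (baselineNs === undefined || baselineNs <= 0) continue;

    const ratio = entry.avgNs / baselineNs;
    if (ratio > 1 + tolerance) {
      regressions.push({ name: entry.name, baselineNs, currentNs: entry.avgNs, ratio });
    }
  }
  return regressions;
}
```

A check like this could run in CI against a baseline file saved from the main branch, failing or commenting on a pull request when the returned list is non-empty. Where the baseline lives and how noisy runs are handled are open design questions this sketch doesn't answer.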
Like the example above, this one is about as easy as it gets. I'm starting with the two implementations I had in my post from 2017:
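For readers without the 2017 post at hand, a sort-based version and a single-pass version of "find the two largest numbers" might look roughly like this. Both function names and bodies are illustrative stand-ins, not the code from that post:

```typescript
// Stand-in for a naive approach: copy the array, sort it in descending
// order, and take the first two entries. Roughly O(n log n).
function twoLargestSorted(values: number[]): [number, number] {
  if (values.length < 2) throw new RangeError("need at least two numbers");
  const sorted = values.slice().sort((a, b) => b - a); // slice() avoids mutating the input
  return [sorted[0], sorted[1]];
}

// Stand-in for an improved approach: one pass over the array, tracking
// the largest and second-largest values seen so far. O(n).
function twoLargestSinglePass(values: number[]): [number, number] {
  if (values.length < 2) throw new RangeError("need at least two numbers");
  let first = -Infinity;
  let second = -Infinity;
  for (const value of values) {
    if (value > first) {
      second = first; // the old maximum becomes the runner-up
      first = value;
    } else if (value > second) {
      second = value;
    }
  }
  return [first, second];
}
```

Both versions treat duplicates as separate entries, so an input containing the maximum twice returns that value in both positions.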
Then to write the benchmarks, I'm grouping each assessment with both implementations run with the same data:
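One way to make sure both implementations really see the same data is to generate the inputs once, deterministically, and reuse them in every group. The sketch below is an assumption about how that could be done; makeInput, the seed, and the chosen sizes are all hypothetical:

```typescript
// Hypothetical helper: produce `size` pseudo-random integers in [0, 1000)
// from a small linear congruential generator. Seeding it means every run,
// and every implementation, gets exactly the same array.
function makeInput(size: number, seed = 1): number[] {
  let state = seed >>> 0;
  const out: number[] = new Array(size);
  for (let i = 0; i < size; i++) {
    // Common 32-bit LCG constants; Math.imul keeps the multiply in 32 bits.
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    out[i] = state % 1000;
  }
  return out;
}

// Build the shared inputs up front so array construction is not
// included in whatever gets measured later.
const inputSizes = [10, 1000, 100000];
const sharedInputs = inputSizes.map((size) => ({
  label: `${size} numbers`,
  data: makeInput(size),
}));
```

Each entry in sharedInputs could then back one group, with a benchmark per implementation inside it that reads from the same data array.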
Now we can run our benchmarking script and see the results:
Incredible, right? What a huge difference between the two. The difference is smaller than it was 6 years ago, and my best guess at this point is that it's due to JIT optimizations.
mitata seems great so far. There are a few quality-of-life features I'd like to see, such as quieting its writing to console.log; sometimes I'd like to take over there, outputting what I want to see instead. Even so, in a pinch it's still possible to either a) ignore default outputs or b) write your outputs to a file and pipe mitata's outputs into the void. Given how simple and effective it is, I'm really happy with it as it is.
In the future I'll hopefully write a bit about building around mitata to create useful tools for tracking performance over time, but I'll need to put some thought into making that useful and portable.
Feel free to take a look at the code on GitHub.