Skip to main content

Command Palette

Search for a command to run...

A journey toward a fast and good minifier

Updated
9 min read

This blog post will tell you about the techniques involved in making the minifier 100x faster, especially the name mangler. For an arbitrary large input, it now takes 2.5s while using one cpu core instead of 13s while using 10 cpu cores.

SWC has a minifier. For now, the SWC minifier shows good performance while having similar or better compression rate than terser. See https://github.com/privatenumber/minification-benchmarks for current scores.

But SWC minifier was quite slow in the past. The main reason is that, no one in the planet, including me, knows every characteristics of hygiene-based variable management system, which is something I invented for SWC. It may sound strange, but at the moment of inventing it, I didn't even know how to calculate big-O. At the moment, I was too young to be a CS student, and more importantly, SWC is my first serious project.

My journey

I had hard time making minifier faster. I learned lots of things while working on the performance issue of the minifier. I'll write them down on this post. Some of them are related to hygiene-based identifier management system, and now I think I should a write a formal assay about it. It allows dropping time complexity by a margin, so I think it can be applied to some passes in other compilers.

For the curious, the title will be

Scopeless scoping with span-hygiene

Reversed hash, but with iteration? (#3633)

I profiled the minifier using cargo profile instruments --release, which is a subcommand added by my cargo-profile. Also, I used three.js benchmark from esbuild for profiling.

Actually, I was working on performance issue of bundler, not the minifier.

In swc_bundler, I ran

MINIFY=1 ./scripts/instrument.sh ~/path/to/three.js/bundle

instrument.sh builds the example named bundle, and runs xcode instrument using the built executable and three.js bundle.

Bundled as 1 modules
Bundler.bundle() took 541ms
   INFO Done in 83ns, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in inline global defs
    in minify

   INFO Done in 77.599583ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in analyze
    in precompress
    in minify

   INFO Done in 97.278541ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in precompress
    in minify

   INFO Running minifier in the normal mode
    at crates/swc_ecma_minifier/src/metadata/mod.rs:183
    in minify

   INFO Done in 73.851541ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in analyze
    in compress ast
    in minify

   INFO compress: Running expression simplifier (pass = 1)
    at crates/swc_ecma_minifier/src/compress/mod.rs:275
    in optimize with , pass: 1
    in compress ast
    in minify

   INFO compress: expr_simplifier took 25.204666ms (pass = 1)
    at crates/swc_ecma_minifier/src/compress/mod.rs:297
    in optimize with , pass: 1
    in compress ast
    in minify

   INFO Done in 27.640625ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in apply pure optimize
    in optimize with , pass: 1
    in compress ast
    in minify

   INFO Done in 77.462958ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in analyze
    in apply full optimizer
    in optimize with , pass: 1
    in compress ast
    in minify

   INFO Done in 267.659041ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in apply full optimizer
    in optimize with , pass: 1
    in compress ast
    in minify

   INFO Done in 320.593083ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in optimize with , pass: 1
    in compress ast
    in minify

   INFO Done in 0ns, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in optimize with , pass: 2
    in compress ast
    in minify

   INFO Done in 444.3625ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in compress ast
    in minify

   INFO Done in 18.426625ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in postcompress
    in minify

   INFO Done in 12.344981416s, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in mangle names
    in minify

   INFO Done in 12.944418041s, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in minify

Minification took 12974ms
Created output.js (5914kb)
Took 13587ms

Uh... So name mangling uses 12 seconds? I was sure that there was some serious problem in the name mangler.

Yeah, instrument is proving it. So I looked at the source code of name mangler.

fn can_rename(&self, id: &Id, symbol: &JsWord, renamed: &AHashMap<Id, JsWord>) -> bool {
    if cfg!(target_arch = "wasm32")
        || self
            .data
            .usages
            .iter()
            .chain(self.data.decls.iter())
            .count()
            <= 64
    {
        for used_id in self.data.usages.iter().chain(self.data.decls.iter()) {
            if *used_id == *id {
                continue;
            }

            if let Some(renamed_id) = renamed.get(used_id) {
                if renamed_id == symbol {
                    return false;
                }
            }
        }
    } else {
        if self
            .data
            .usages
            .par_iter()
            .chain(self.data.decls.par_iter())
            .any(|used_id| {
                if *used_id == *id {
                    return false;
                }

                if let Some(renamed_id) = renamed.get(used_id) {
                    if renamed_id == symbol {
                        return true;
                    }
                }

                false
            })
        {
            return false;
        }
    }

    if cfg!(target_arch = "wasm32") {
        for c in self.children.iter() {
            if !c.can_rename(id, symbol, renamed) {
                return false;
            }
        }

        true
    } else {
        self.children
            .par_iter()
            .all(|c| c.can_rename(id, symbol, renamed))
    }
}

This is the problematic code. While looking at the code, I found that it's doing something crazy. Actually it was doing reversed hash operation, but by the iterating over a vector (in parallel).

Ah... Fix was simple. I created a type named RenameMap, which tracks the mapping between Id and JsWord. As it also trakced the reverse direction, (JsWord -> Vec<Id>), it became much faster.

Result:

   INFO Done in 4.501971666s, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in mangle names
    in minify

   INFO Done in 5.107085375s, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in minify

Minification took 5137ms
Created output.js (5914kb)
Took 5727ms

Now it's much better.

Multiple hashmaps, but with identical role? (#3983)

The performance of the name mangler is quite improved, but it was still too slow. I ran profiling again, with the same command.

MINIFY=1 ./scripts/instrument.sh ~/path/to/three.js/bundle
   INFO  Done in 4.49787s, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in mangle names
    in minify

   INFO  Done in 5.278398666s, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in minify

Minification took 5309ms
Created output.js (5889kb)
Took 5891ms

Okay, checking if a hash map contains an element is O(1), so we are performing doing it too many times. How can we reduce it? While looking at code, I found that usages and decls in ScopeData are always used in identical way. All operations performed on two hashmaps are identical, so I merged it into all.

And I profiled after changing the code.


INFO Done in 2.942127541s, kind: "perf"
at crates/swc_timer/src/lib.rs:32
in mangle names
in minify

INFO Done in 3.730316875s, kind: "perf"
at crates/swc_timer/src/lib.rs:32
in minify

Minification took 3760ms
Created output.js (5889kb)
Took 4344ms

It worked!

Span hygiene is cool (#3988)

In this time, I used instrument.sh in swc_ecma_minifer, not swc_bundler. I used typescript as the input for the minifier.

I ran

./scripts/instrument.sh path/to/typescript.js

Result is

   INFO  Done in 360.108708ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in compress ast
    in minify

   INFO  Done in 304.238125ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in postcompress
    in minify

   INFO  Done in 2.227133208s, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in mangle names
    in minify

   INFO  Done in 2.976443833s, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in minify

Let's look at instrument result.

can_rename is using lots of time. To know why, I expanded it.

The recursion was the problem. Can we avoid it? The answer is surprisingly yes, and more surprisingly it's not hard to do.

Instead of checking hash maps in child scopes, we can clone all Ids in child scopes to the parent scope, before checking.


pub(super) fn prepare_renaming(&mut self) {
    self.children.iter_mut().for_each(|child| {
        child.prepare_renaming();

        self.data.all.extend(child.data.all.iter().cloned());
    });
}

Recursion again

We are going to remove recursion in hot execution path, by cloning all required information recursively in single-execution path.

Let's look at the result.

   INFO  Done in 361.856375ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in compress ast
    in minify

   INFO  Done in 300.18725ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in postcompress
    in minify

   INFO  Done in 841.693541ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in mangle names
    in minify

   INFO  Done in 1.588692708s, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in minify

Again, it worked!

This optimization is power of span-hygiene. I can store all variables within a single hashmap. Those identifiers can be still distinguished by their hygiene.

25M+ calls to HashSet#contains? Why? (#4342)

Let's start with profiling.

./scripts/instrument.sh path/to/typescript.js

Same command this time, from ./crates/swc_ecma_minifier.

Although it became much faster, it's quite strange. HashSet#contains is O(1). But 25% of total runtime? So... It means we are calling it too many times.

To know exact number, I added println!("all:contains") to the above of

if self.data.all.contains(left) {
    return false;
}

and counted it using the search feature of iTerm2. And I stopped it after 25M.

Obviously, this is not a number I expected. Not even close. Why the hell is it called 25M times?

After some thinking, I found that after mification, some symbols are used lots of time. e.g. a is used everywhere, because it's short. Lots of identifiers are mapped to a, because it's short. And we use one global reverse hashmap, As a result, we iterate over all of those identifiers which is mapped to same symbol.

We don't need to check all of them. Checking identifiers in parent scope is enough. So I decided to clone hash map for each scope.

Let's look at the result.

   INFO  Done in 382.930791ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in compress ast
    in minify

   INFO  Done in 280.970916ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in postcompress
    in minify

   INFO  Done in 188.407666ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in mangle names
    in minify

   INFO  Done in 946.077ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in minify

Cool! Finally the name mangler is fast enough.

Wait, 10 passes? (#4373, #4378)

I couldn't reproduce the numbers in https://github.com/privatenumber/minification-benchmarks

Finally, I configured an integration benchmark in the swc crate, and I could reproduce the number. But the benchmark in swc_ecma_minifer shows quite different result. I couldn't reproduce the number in swc/minify from swc_ecma_minifier/full.

While debugging it, I dumped various variables, copied source code, and did some lots of things. But I found one stange thing in swc/minify benchmark.

   INFO  Done in 241.057958ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in optimize with pass: 10
    in compress ast
    in minify
    in Compiler.minify
    in minify

   INFO  Done in 2.593402583s, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in compress ast
    in minify
    in Compiler.minify
    in minify

   INFO  Done in 44.142583ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in postcompress
    in minify
    in Compiler.minify
    in minify

   INFO  Done in 189.090166ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in mangle names
    in minify
    in Compiler.minify
    in minify

   INFO  Done in 2.922542875s, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in minify
    in Compiler.minify
    in minify

   INFO  Done in 49.478041ms, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in Compiler.print
    in Compiler.minify
    in minify

   INFO  Done in 3.210377125s, kind: "perf"
    at crates/swc_timer/src/lib.rs:32
    in Compiler.minify
    in minify

It performs full AST compression 10 times. What? But default pass limit should be 1. Those options are ported from terser... So I think I made 1 default?

It turns out that the default value of passes was 1 in CompressOptions, while the default value of passes in TerserCompressorOptions was 0. passes: 0 means no limitation, and that was the cause.

I changed the default to 3 instead of 1, because I think it's more reasonable considering the performance of swc minifier.

Conclusion

While working on this, I learned the importance of time complexity, and the way to optimize it. Interpreting the result of instrumented data requires considering time complexity. e.g., If an operation which is O(1) takes too much time, it's being called an enormous amount of time.

Also, always profile. You can't guess performance.

Efforts like this are helping accelerate builds for projects like Next.js while making the Web faster for end users.

More from this blog

한국의 학벌에 대한 생각

내 블로그의 제목이 kdy1: The way I think 인만큼 앞으로는 내 생각을 더 자주 올리려고 한다. 한국 기준으로, 학벌은 사람을 볼 때 꽤나 유용한 지표이지만, 절대적이지는 않다. 경험적인 얘기일 뿐이지만, 성균관대학교 자퇴생으로서 느낀 것들이 몇 가지 있다. 대학까지 간 사람의 학벌은 학습 능력 x 성실함 에 대체로 비례한다. 그래서 의미가

Apr 3, 20261 min read

인간 지능에 대한 메모장

최종 업데이트: 2026/03/15 지능의 유전 현재 인류 기준으로, 고지능자는 고지능 유전자가 많이 겹친 사람이다. 지능의 유전엔 X 염색체가 매우 중요한 역할을 한다. 그리고 이게 남자와 여자의 지능 분포 차이를 만든다. 극상위권에 여자가 거의 없는 이유가 이것이다. 고지능 X 염색체가 여자한테서 발현되려면 2개가 있어야 한다. 이는 인간의 생

Mar 15, 20262 min read

Ai 코딩 팁 2 (한국어)

발표 자료: https://gamma.app/docs/AI--2a52e7tk3eb1ch1 AI 활용법 관련해서 간단하게 발표를 했다. 발표 자료 앞쪽은 전에 블로그에 올린 글이랑 같은 내용이다. 이 글에서는 기존 글에서 다루지 않은 내용들을 다루겠다. 에러 메시지 및 로깅 구체적 타입 및 스키마 활용 any 타입은 사람에게도 위험하지만, AI에게는 더 위험하다. 마찬가지로, JSON.parse처럼 아무 제약 없는 파싱 느슨한 인터페이스 ...

Jan 30, 20265 min read

kdy1: The way I think

233 posts