Improve performance of `flush_delayed_lets` for large expressions
We're seeing cases of quadratic behavior in compiling large expressions, particularly in list and array literals (see the generated `uucp_case_map_data.ml` in `uucp` for an example). Profiling suggests it's the fault of `To_cmm_env.flush_delayed_lets`, which uses the unoptimized O(n log n) `Patricia_tree.filter_map`.

This patch implements `filter_map` in O(n) time in the obvious way. It also introduces `filter_map_sharing`, which is asymptotically the same but saves a good deal more work in the case that the map is mostly returned unchanged.

Testing indicates the improved `filter_map` brought `flush_delayed_lets` down from ~70% (!) of execution time to ~60%, but switching to `filter_map_sharing` brought it down to ~30%. (This still seems high, but OTOH this is a file with almost nothing interesting to compile.)
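To illustrate the difference between the two functions, here is a minimal sketch on a simplified Patricia tree. It is not the code from this patch: the type `'a t`, its constructors, and the `branch` helper are assumptions made for the example, and the real `Patricia_tree` module may be structured differently. The idea is that a structural traversal rebuilds the tree in one pass instead of re-inserting each surviving binding, and that the sharing variant returns the original node, by physical equality, whenever nothing underneath it changed.

```ocaml
(* Hypothetical, simplified Patricia tree over int keys.
   Branch carries (prefix, branching bit, left subtree, right subtree). *)
type 'a t =
  | Empty
  | Leaf of int * 'a
  | Branch of int * int * 'a t * 'a t

(* Smart constructor: a branch with an empty side collapses to the other side,
   so no Branch node ever has an Empty child. *)
let branch prefix bit l r =
  match l, r with
  | Empty, t | t, Empty -> t
  | _ -> Branch (prefix, bit, l, r)

(* O(n): visit each node once and rebuild in place. Removing bindings never
   changes which side of a branch the surviving keys belong to, so the
   existing prefix and bit can be reused. *)
let rec filter_map f t =
  match t with
  | Empty -> Empty
  | Leaf (k, v) ->
    (match f k v with
     | None -> Empty
     | Some v' -> Leaf (k, v'))
  | Branch (p, b, l, r) ->
    let l' = filter_map f l in
    let r' = filter_map f r in
    branch p b l' r'

(* Same traversal, but f must return values of the same type, and any node
   whose children come back physically unchanged is returned as is, so
   untouched subtrees allocate nothing. *)
let rec filter_map_sharing f t =
  match t with
  | Empty -> t
  | Leaf (k, v) ->
    (match f k v with
     | None -> Empty
     | Some v' -> if v' == v then t else Leaf (k, v'))
  | Branch (p, b, l, r) ->
    let l' = filter_map_sharing f l in
    let r' = filter_map_sharing f r in
    if l' == l && r' == r then t else branch p b l' r'
```

Both versions still call `f` on every binding, which is why they are asymptotically the same. The saving in `filter_map_sharing` comes from skipping allocation when `f` returns its argument unchanged for most bindings.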