Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libshvcore: Rewrite utils::joinPath #419

Merged
merged 1 commit into from
Feb 19, 2024
Merged

libshvcore: Rewrite utils::joinPath #419

merged 1 commit into from
Feb 19, 2024

Conversation

syyyr
Copy link
Contributor

@syyyr syyyr commented Feb 17, 2024

I rewrote the shv::core::utils::joinPath function to use a fold expression instead of a recursive resolving. The intention was to improve these things:

  1. Remove the unnecessary recursive calls.
    Handling of variadic template arguments is usually handled via recursion, which means that the recursive variant of joinPath has to be called multiple times, until it encounters the single argument joinPath which only returns its argument. Using a fold expression removes recursion. Note: the number of calls to the actual implementation of the joinPath algorithm is the same in both versions.
  2. Disallow calling joinPath with zero and one argument.
    This was impossible with the recursive version, because there had to be a one-arg joinPath overload to stop the recursion. The new version can explicitly disallow joining less than two paths and warn the user about the issue.
  3. Hopefully make the code (size) a bit smaller.
    a) As far as source code size goes, it is only negligibly shorter. The reason is that first of all, I wanted to have a nice error message when using joinPath with just 0/1 argument. This wasn't strictly necessary, but I think it's worth it.
    b) The binary code might theoretically be a tiny bit smaller, because there's no more recursive template instantiation. This hasn't been tested and doesn't really matter anyway. The compile time might improve a tiny bit, but it probably won't be noticeable anyway, so I didn't bother profiling it.
  4. Make the function faster.
    Removing recursive calls to all the template functions should theoretically improve code speed. I have benchmarked most of the common usecases and the newer version is generally faster. Below are the results.

joinPath(std::string("a/cze/some/path"), std::string("some"));

https://quick-bench.com/q/iCybZk00OlP48nFncDTJiPnCd6Q
This is what I would say is the most common usage, a medium-long size path joined by a single path fragment. There's a slight improvement in Clang, and in GCC, the performance is about the same.
Clang:
image
GCC:
image

joinPath(std::string("a/cze/some/path"), std::string("/some/other/path"));

https://quick-bench.com/q/gd-mxd1qCvwndlyvQQ8RZGID8b4
This is an example of joining two longer paths. I'm not sure why exactly this happens, but it could be that the recursive calls work worse when SSO isn't in play. Also, there's quite difference between GCC and Clang, but at least the new algorithm is faster in both cases.
Clang:
image
GCC:
image

joinPath(std::string("a"), std::string("a"));

https://quick-bench.com/q/1jeJ_3LxBjMX2tuX4_Zgm4V_qOE
This is not something that's commonly used, but it shows the performance when all of the strings use SSO. Very slight improvement in both Clang and GCC.
Clang:
image
GCC:
image

joinPath(std::string("a"), std::string("a"), std::string("a"), std::string("a"), std::string("a"), std::string("a"), std::string("a"), std::string("a"), std::string("a"), std::string("a"), std::string("a"), std::string("a"));

https://quick-bench.com/q/OM59lipEuX7IdembWgmycK2_moI
Increasing the count of arguments mostly removes the speed advantage in Clang and in GCC, it actually reduces speed! Fortunately, it is slowed only by a little bit, and this example is quite contrived anyway, so it doesn't matter.
Clang:
image
GCC:
image

joinPath(std::string("some_long_path_xyz"), std::string("some_long_path_xyz"), std::string("some_long_path_xyz"), std::string("some_long_path_xyz"), std::string("some_long_path_xyz"), std::string("some_long_path_xyz"), std::string("some_long_path_xyz"), std::string("some_long_path_xyz"), std::string("some_long_path_xyz"), std::string("some_long_path_xyz"), std::string("some_long_path_xyz"), std::string("some_long_path_xyz"));

https://quick-bench.com/q/nQA0kej-AmjVRJxlzx1ocxn5X8Q
For completeness, this is a benchmark of a lot of arguments with long paths. It seems that when SSO isn't taking place, the improvement is similar in Clang and GCC, although, as in the previous benchmark, the advantage is smaller the more arguments you put in the function.
Clang:
image
GCC:
image

In conclusion, the benchmarks above showed this:

  • In all examples but one, the newer version is faster. The one example that's slower is very contrived and it is slower only by a little bit.
  • The new version's speed improvement is more noticeable with no SSO. In the typical usage, at least one string won't have SSO.
  • Clang generally benefits from the new version more.

Overall, this change:

  • improves runtime code performance
  • disallows calling joinPath with one or zero arguments unnecessarily
  • possibly removes recursive instantiation of template functions, which should theoretically improve compile time speed (untested)

Notes:
The first version of the rewritten function was a bit simpler and didn't need the second template argument:

template <typename HeadStringType, typename... StringTypes>
std::string joinPath(const HeadStringType& head, const StringTypes& ...rest)
{
	static_assert(sizeof...(rest) > 0, "Don't use joinPath with only one fragment.");
	std::string res = head;
	return ((res = joinPath(std::string_view(res), std::string_view(rest))), ...);
}

Unfortunately, it was slower, because it had an unnecessary copy. The final version splits the first assignment/joining and the rest, which means that for two arguments, calling the template function is esentially the same as calling the implementation. The local variable res gets optimized due to NRVO. It does make the code more complicated though.

@fvacek fvacek marked this pull request as ready for review February 19, 2024 14:04
@fvacek fvacek merged commit 486eac8 into master Feb 19, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants