Book Reviews

Optimized C++

The author, Kurt Guntheroth, has been writing software for over 35 years. He never worked at Microsoft, Google, Facebook, Apple or anywhere else famous. But he has spent the last 20 years almost exclusively writing C++ and talking to other very bright developers about C++. So he is qualified to write a book about optimizing C++ code.

At the dawn of the 21st century, C++ was under assault. Fans of C pointed to C++ programs whose performance was inferior to supposedly equivalent code written in C. Big companies made big-money bets on coding web sites and operating systems in Java or C# or PHP. C++ seemed to be on the wane. It was an uncomfortable time for anyone who believed C++ was a powerful, useful tool.

Then a funny thing happened. Processor cores stopped getting faster, but workloads kept growing. Moore’s Law makes it possible to put more cores in a microprocessor each year. But it does little to make the interface to main memory faster. Thus, doubling the number of cores in the future will have little effect on performance. The cores will all be starved for access to memory. This looming limit on performance is called the Memory Wall.

Uniquely among programming languages in wide use in 2015, C++ offers developers a continuum of implementation choices ranging from hands-off automated support to fine manual control. C++ empowers developers to take control of performance tradeoffs. This control makes optimization possible.

The author begins with a case study, optimizing string use, because std::string is among the most heavily used features of the C++ standard library. For instance, a post on Google's chrome-dev forum stated that std::string accounted for half of all calls to the memory manager in Chromium.

Now suppose that profiling a large program reveals that the function remove_ctrl() below consumes a significant share of the run time. This function removes control characters from a string of ASCII characters.

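std::string remove_ctrl(std::string s) {
  std::string result;
  for (size_t i = 0; i < s.length(); ++i)
    if (s[i] >= 0x20)  // keep only characters at or above the space character
      result = result + s[i];  // builds a temporary string on every append

  return result;
}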

I measured this function on my Mac mini, and it took 3 µs. The author then guides us through several rounds of optimization. I also brought some C++11 syntax into the example. The result is the following function.

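std::string remove_ctrl(const std::string& s) {
  std::string result;

  for (auto const ch : s)
    if (ch >= 0x20)
      result += ch;  // appends in place, no temporary string

  return result;  // plain return; std::move here would only inhibit copy elision
}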

This function took only 0.4 µs, about 7.5 times faster than the previous one.
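For readers who want to reproduce the comparison, here is a minimal sketch of how the two versions might be timed side by side. The names remove_ctrl_slow, remove_ctrl_fast, and average_us are my own and purely illustrative; this is not the measurement harness used in the book or for the numbers above, and the results will vary with compiler, optimization flags, and input.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// First version from the review: takes the argument by value and
// creates a temporary string for every character it keeps.
std::string remove_ctrl_slow(std::string s) {
  std::string result;
  for (std::size_t i = 0; i < s.length(); ++i)
    if (s[i] >= 0x20)
      result = result + s[i];
  return result;
}

// Revised version: takes a const reference and appends in place.
std::string remove_ctrl_fast(const std::string& s) {
  std::string result;
  for (auto const ch : s)
    if (ch >= 0x20)
      result += ch;
  return result;
}

// Hypothetical helper (not from the book): average time per call in
// microseconds, measured over `iterations` calls with steady_clock.
template <typename F>
double average_us(F f, const std::string& input, int iterations) {
  assert(iterations > 0);
  volatile std::size_t sink = 0;  // keeps the calls from being optimized away
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i)
    sink = sink + f(input).size();
  auto stop = std::chrono::steady_clock::now();
  std::chrono::duration<double, std::micro> elapsed = stop - start;
  return elapsed.count() / iterations;
}
```

Calling average_us(remove_ctrl_slow, text, 100000) and average_us(remove_ctrl_fast, text, 100000) on the same test string gives two per-call figures that can be compared directly. Averaging over many iterations matters here, because a single call is far too short to time reliably.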