Sunday, February 21, 2016

ACE recursive mutexes vs. STL recursive mutexes

In the MADARA engine, we must use recursive mutexes to protect the underlying dictionary-based shared knowledge. Since 2008 or so, we have been using ACE_Recursive_Thread_Mutex to protect our critical sections, but I have been following the C++11 spec with particular interest. The goal of MADARA is portability and speed across platforms like Windows, Linux, ARM, Intel, Mac, and Android, and ACE was a natural choice not only because of its platform support but also because of its well-tested code base and community development. Over the past five years or so, the community that supports and uses ACE has dwindled, and there has been a push within the C++ community toward libraries like Boost and the STL mutexes, which are essentially Boost facilities that have been standardized.

But for a middleware like MADARA, which is especially concerned with performance on low-powered processors for robotics systems, it's not just about how excited the C++ community is about a particular library; it's also about speed and efficiency. So, to make our own decision on whether the C++11 spec was ready for prime time in portable middleware, I added new options to the MADARA build process that allow, in an extensible way, for null mutexes (essentially no-ops that do not actually protect multi-threaded access), STL recursive mutexes, and our current ACE recursive mutexes. After seeing the results, I retrofitted test_reasoning_throughput (one of our standard tests for performance measurements on a target platform) to include breakdowns of the C++ STL mutex and recursive mutex against the ACE implementations, ACE_Thread_Mutex and ACE_Recursive_Thread_Mutex.
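To give a sense of that extensibility, here is a minimal sketch of the kind of compile-time switch involved; the flag names, the NullMutex class, and the aliases below are illustrative rather than the actual MADARA build options:

#include <mutex>

// Illustrative sketch only: the real MADARA build flags and type names differ.

// A "null" mutex: no-op lock operations for builds without mutex protection.
struct NullMutex
{
  void lock () {}
  void unlock () {}
};

#if defined(USE_STL_MUTEX)                  // hypothetical build flag
  using KnowledgeMutex = std::recursive_mutex;
  using KnowledgeGuard = std::lock_guard<KnowledgeMutex>;
#elif defined(USE_NULL_MUTEX)               // hypothetical build flag
  using KnowledgeMutex = NullMutex;
  using KnowledgeGuard = std::lock_guard<KnowledgeMutex>;
#else
  #include "ace/Recursive_Thread_Mutex.h"
  #include "ace/Guard_T.h"
  using KnowledgeMutex = ACE_Recursive_Thread_Mutex;
  using KnowledgeGuard = ACE_Guard<KnowledgeMutex>;  // ACE guards use acquire()/release()
#endif

// Critical sections are then written once against the selected types.
void update_knowledge (KnowledgeMutex & mutex)
{
  KnowledgeGuard guard (mutex);
  // ... modify the shared knowledge dictionary here ...
}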

First, here are the results of the direct comparison of ACE mutexes and STL mutexes under g++ and Visual Studio 2015.

Settings
CPU: Intel® Core™ i7-4810MQ @ 2.80GHz × 4
Linux: Ubuntu 14.04
g++ -v: Version Info
Windows: 7, SP1
Visual Studio: 2015
Results are reported as average nanoseconds per operation over 100k operations.
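For reference, a per-operation figure like that can be computed with a simple timing loop along these lines (a bare-bones sketch, not the actual test_reasoning_throughput harness):

#include <chrono>
#include <cstdio>
#include <mutex>

int main ()
{
  const long iterations = 100000;
  std::recursive_mutex mutex;

  auto start = std::chrono::steady_clock::now ();

  for (long i = 0; i < iterations; ++i)
  {
    // the lock/unlock pair is the operation being timed
    mutex.lock ();
    mutex.unlock ();
  }

  auto stop = std::chrono::steady_clock::now ();
  auto total_ns =
    std::chrono::duration_cast<std::chrono::nanoseconds> (stop - start).count ();

  std::printf ("recursive_mutex lock/unlock: %f ns per operation\n",
    (double) total_ns / iterations);

  return 0;
}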


As you can see from the above direct comparison, the g++ STL C++11 mutexes are roughly on par with the ACE recursive mutexes. The Visual Studio 2015 performance is supposedly much better than the Visual Studio 2013 performance, but I could not get my installation of Visual Studio 2013 to handle the STL mutex library correctly at runtime (it compiled fine but just seemed to stall for no reason). For completeness, I've included a number of C++ operations with no mutex usage in the breakdown as well. This information is also printed by our test_reasoning_throughput test.

Now, MADARA itself performs knowledge and reasoning operations over shared information in a distributed system. The test_reasoning_throughput test exercises many simple operations on MADARA knowledge bases, often using these recursive mutexes in nested ways. MADARA also enforces quality-of-service policies and performs checks on knowledge consistency, time, and various other attributes. In short, it does useful work inside the critical section.
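To illustrate why the mutexes need to be recursive, consider a toy knowledge base (not the real MADARA class) in which a compound operation takes the lock and then calls public methods that take it again; with a plain std::mutex this would deadlock, while a recursive mutex lets the same thread re-acquire it:

#include <map>
#include <mutex>
#include <string>

// Toy example only; the real MADARA knowledge base is far more involved.
class ToyKnowledgeBase
{
public:
  long get (const std::string & key)
  {
    std::lock_guard<std::recursive_mutex> guard (mutex_);
    return map_[key];
  }

  void set (const std::string & key, long value)
  {
    std::lock_guard<std::recursive_mutex> guard (mutex_);
    map_[key] = value;
  }

  // A compound operation that holds the lock across a read-modify-write and
  // calls get()/set(), which lock again. With std::mutex this would deadlock;
  // with std::recursive_mutex the same thread simply re-acquires the lock.
  void increment (const std::string & key)
  {
    std::lock_guard<std::recursive_mutex> guard (mutex_);
    set (key, get (key) + 1);
  }

private:
  std::recursive_mutex mutex_;
  std::map<std::string, long> map_;
};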

The following table uses the same hardware, operating systems, and compilers to check the performance of basic operations in MADARA.


Most of our current generation of MADARA software uses KaRL containers (the last line in the above table). Another thing we optimize for is large Knowledge and Reasoning Language (KaRL) programs, which fall into the 2nd and 4th rows of the above table. From these metrics, the ACE Recursive Thread Mutex is still the way to go for us. However, the performance of std::recursive_mutex in g++ is promising. Hopefully, the performance of the Visual Studio STL mutex library will catch up. After all, they have 30 years of open source code to look to for inspiration... if they care to open a web browser.
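As a footnote for readers unfamiliar with those two usage styles: containers wrap a single knowledge base variable, while evaluate() runs a KaRL expression over the shared knowledge. The sketch below assumes the header and namespace layout of recent MADARA releases (older releases spelled these names differently), so treat the exact identifiers as illustrative:

#include "madara/knowledge/KnowledgeBase.h"
#include "madara/knowledge/containers/Integer.h"

int main ()
{
  madara::knowledge::KnowledgeBase knowledge;

  // Container style: a typed handle bound to a single variable in the knowledge base.
  madara::knowledge::containers::Integer counter ("agent.counter", knowledge);
  counter = 2;

  // KaRL program style: evaluate an expression over the shared knowledge.
  knowledge.evaluate ("agent.ready = 1; agent.total = agent.counter + 1");

  // Print the contents of the knowledge base.
  knowledge.print ();

  return 0;
}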
