Sunday, November 3, 2013

On MADARA Documentation

If you have followed my blog, you probably know that my dissertation research project MADARA (Multi-Agent Distributed Adaptive Resource Allocation) is a pet project that I still put a lot of my weekends and nights into. The feature set for MADARA has grown radically since I finished my dissertation, and the KaRL engine has become much more convenient, faster, and feature-rich. The result so far is a middleware that provides timing, speed, and quality-of-service like a heavyweight middleware with the ease-of-use of a scripting language (at least I hope so).

In truth, I feel like MADARA has been ready for prime time usage for several months now, but I'm a perfectionist, and I have been working diligently on the middleware layers to provide easier-to-use features, deeper functionality, and faster execution.

But as ridiculous as it may sound, features are not the core of prime time readiness for a middleware. Documentation is the key to wider usage. I can't tell you how many middlewares I've downloaded without documentation, and it's extremely difficult to use any tool to its maximum effectiveness without good guidance. As much as I've tried to focus on feature building in the past several months, I've spent an equal amount of time on commenting, tutorials, tests, and the new Wiki pages. If MADARA ever does impress enough people to find its way into mainstream usage, I think the documentation will be the key.

Just how much documentation have I done with the MADARA middleware? Let's start with the code documentation, which generates the doxygen documentation for the library:


MADARA contains over 66,000 lines of code right now (v1.1.13) and over 22,700 lines of commenting (for every 3 lines of code, there is one line of documentation). There are also over 18,000 blank lines to aid with code legibility, which I feel is just as important as documentation. To be clear, I try to document only what is necessary--nothing as inane as "int i // an integer". The majority of the commenting is done for function headers so doxygen and IDEs like Visual Studio can provide helpful tips on usage, such as precondition/postcondition information, parameter listing and definition, error information, and return value descriptions. And these lines are just as important to the middleware as code lines. To me, MADARA sits at roughly 107,000 lines of code with a healthy proportion of nearly 2/5 dedicated to documentation and readability.
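To make that header-comment style concrete, here is a minimal sketch of a doxygen-commented function that also reproduces the arithmetic above. The `LineCounts` struct and `readability_share` function are hypothetical names I made up for this illustration; they are not part of the MADARA API, and the layout is just one reasonable way to write such a header.

```cpp
#include <cassert>
#include <cstddef>

/**
 * Line totals for a source tree, as a tool like cloc would report them.
 * Hypothetical type for illustration only; not part of MADARA.
 **/
struct LineCounts
{
  std::size_t code;      ///< lines containing source code
  std::size_t comments;  ///< lines containing only comments
  std::size_t blanks;    ///< empty lines
};

/**
 * Computes the share of lines devoted to documentation and readability.
 *
 * @pre    counts.code + counts.comments + counts.blanks > 0
 * @post   the returned value lies in the range [0, 1]
 * @param  counts  line totals for the source tree being measured
 * @return (comments + blanks) / (code + comments + blanks)
 **/
double readability_share (const LineCounts & counts)
{
  const std::size_t total = counts.code + counts.comments + counts.blanks;

  // enforce the documented precondition in debug builds
  assert (total > 0);

  return static_cast<double> (counts.comments + counts.blanks) / total;
}
```

Fed the rounded figures from this post (66,000 code, 22,700 comment, and 18,000 blank lines), the function returns about 0.38, which is where the "nearly 2/5" comes from.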

But code documentation is only 2/5 of the battle when it comes to user presentation. Another important aspect is code examples, which I've worked diligently to provide in the tests and tutorials directories of the code base. Here are the cloc results from these two directories.

The tests directory, the first printed table, is meant to test every feature added to MADARA. It's also documented and uses descriptive variable names so people can use these tests as guides for usage. Consequently, the commenting and blank line usage is similar to the main repo but slightly less, because there isn't much in the tests that needs doxygen comments. The resulting comment/readability ratio is 1 line of readability for roughly every 3 lines of code.

The tutorials, however, are meant as guides for developers, and they are thoroughly documented to discuss the intent of features and proper usage. One user commented to me recently that they are like reading technical papers because of how rich they are. This bears out in the cloc results. For every 1 line of code in the tutorials, there is at least a line of documentation or blank line for readability. In fact, there are more documentation/readability lines than code lines in that directory.

As weird as it may sound, I also know that this readability and documentation of tests and tutorials may not constitute even 1/5 of the battle for prime time readiness of a middleware. After all, only someone who has downloaded the repository (i.e. only someone convinced enough of the power and features of MADARA) would be able to see the tests and tutorials directories. No, I feel the main focus on documentation has to go into external guides that help developers understand just what they're getting into with a new middleware or library. So, presentations and external guides featuring code examples and descriptions have received a great deal of focus as well--though not necessarily enough.

The main point of entry for external guides of MADARA is now the Wiki section of the MADARA project site. The MADARA Wiki now defaults to a set of guides that discuss high level overviews of the MADARA architecture, interactions with the Knowledge Base, interactions with the Transport layer, and what the target audience for the middleware is. There are half a dozen images outlining interactions and overviews to aid developers in visualizing the system, YouTube code tutorials, and video of example usage in a swarm of commercial UAVs--the latter two of which are available at the bottom of each Wiki page in the More Information section. These Wiki guides add another 1,700 lines of effort to make MADARA more user-friendly and accessible.

So, what do you think? When you get a chance, check out the Wiki section of the MADARA project site and maybe the code tutorials on the left-hand pane. Feel free to tell me what you think. Documentation is an ongoing process, and I welcome the feedback!