Mixing Machine and Hand Generated Source

This post is the third in my series discussing automatic source code generation. In this post, I’m going to present a Matlab tool for integrating generated code into your hand written source.

There are many different ways to mix machine and hand generated source code. By far the most common is the C preprocessor. The preprocessor directive #define performs simple token replacement, while #include copies an entire file into your source. Integrating a tool such as the preprocessor into your code generation tool stack is simple for many languages. I use it quite extensively for generating code when the required data manipulation does not call for a more flexible programming language.

Another common method is to use a template system. C++ templates provide compiler-supported code generation. There are lots of HTML related systems that use templates too.

I personally do not like using either of these methods when generating code that requires a lot of mathematical manipulation, as is often the case in control systems development. Both techniques share the same pair of shortcomings. First, the hand and machine generated source live in completely separate files. This separation breaks up the logical flow of the code. Second, the developer never sees the final combined source. While not insurmountable, this does complicate debugging.

A preferable system would inject the machine generated code directly into the hand generated source. I’ve developed two different Matlab tools to do this. The first tool, completely of my own creation, worked, but there was a large disconnect between the hand generated source and the Matlab code used for the generation. This disconnect made it difficult to track the origin of generated code, and left me generally unsatisfied.

A few months ago, I was lucky enough to stumble upon a Python code generation tool by Ned Batchelder called “Cog”. As Ned describes on the Cog web page:

Cog transforms files in a very simple way: it finds chunks of Python code embedded in them, executes the Python code, and inserts its output back into the original file. The file can contain whatever text you like around the Python code.

The beauty of this system is that it directly integrates the source of the generator language into the source of the generated language. This completely solves the build integration and tag-to-generator disconnect problems. If you’re a Python (or Perl, Ruby or PHP) programmer, I highly recommend you check out the Cog web page.
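As a rough illustration of how this looks inside a C file, the sketch below uses the marker syntax described in Cog's documentation (`[[[cog`, `]]]` and `[[[end]]]`). The gain names and values, the `gain_table` array and the `gain_count` helper are all invented for this example, and the lines between the markers were written by hand to show what the embedded Python would print, not captured from a real run.

```c
#include <stddef.h>

/*[[[cog
import cog
# Hypothetical gains; in practice these would come from a parameter file.
gains = [("KP", 120), ("KI", 15), ("KD", 40)]
cog.outl("static const int gain_table[] = {")
for name, value in gains:
    cog.outl("    %d,  // %s" % (value, name))
cog.outl("};")
]]]*/
static const int gain_table[] = {
    120,  // KP
    15,  // KI
    40,  // KD
};
//[[[end]]]

/* Hand-written code can use the generated table directly. */
static size_t gain_count(void)
{
    return sizeof gain_table / sizeof gain_table[0];
}
```

The generator sits in a C comment, so the compiler ignores it, while the generated table sits right below it where the developer can read and debug it. Note that the embedded Python emits `//` comments: a `*/` inside the generator would close the enclosing C comment early.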

If you look on my Matlab Tools page, you’ll find my Matlab knockoff of Cog. I’ve used this tool for both C and LaTeX code generation, and it seems to work quite well. It has a few limitations, but I created it to be independent of the source file language. It is now my tool of choice for integrating Matlab generated code into the source of any of my projects.


Code Generation and the Build Process

In my last post, I discussed my thoughts on using code generation to aid in the development of motion control system software. In this post, I’m going to discuss issues around moving system parameter values from their original source into software.

System Parameters & First Class Design Articles

I believe that a key concept of systems engineering is the “first class design article”. In any design, there is a document that captures a system parameter in the most fundamental way. I call this document “the first class design article”. In a circuit design, this is the schematic. In a mechanical design, it’s the 3D CAD. These are the master documents from which other design documents flow: circuit layouts, mechanical 2D drawings, etc. These are also the documents from which the physical parts are eventually fabricated. In an ideal world, data would be pulled directly from the master document at build time. While I believe that it is possible to do this today, it is difficult, and I’ve never seen it done in practice.

The two biggest issues with this idealized data flow process are:

  • A significant amount of effort and tool specific knowledge is needed to pull the data from all of the tools used by the various engineering disciplines. Without a willing power user for each tool, it is unlikely that getting direct access to the source data is possible.
  • Many of these tools require expensive licenses. Including these in the standard build process could require a license for each software developer.

Given those hurdles, pragmatism requires at least one step of manual data transfer. This necessitates close work with engineering partners to make sure that key parameter changes make it into the database.

I store the master document parameter values in a set of text files. These files are checked in to the revision control system with the rest of the software source. It is important that they are plain text so that the ‘diff’ utility can be used to examine and track changes. With this text based parameter database in place, the build process can be modified as needed.
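To make the idea concrete, here is a minimal C sketch of reading one entry from such a file. The `name = value unit` line format, the `param_t` struct and the `parse_param_line` function are assumptions made for illustration; nothing above fixes a particular file format.

```c
#include <stdio.h>

/* One entry of the hypothetical plain-text parameter database. */
typedef struct {
    char   name[32];
    double value;
    char   unit[16];
} param_t;

/* Parse a line such as "load_mass = 2.5 kg".
 * Returns 1 on success, 0 if the line does not match the assumed format
 * (for example a blank line or a "# comment" line). */
static int parse_param_line(const char *line, param_t *out)
{
    /* Field widths keep sscanf from overflowing the fixed-size buffers. */
    return sscanf(line, " %31[A-Za-z0-9_] = %lf %15s",
                  out->name, &out->value, out->unit) == 3;
}
```

Keeping one parameter per line means that a change to a single value shows up as a single changed line in a diff.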

System Parameters & Software Build

The key to efficiently utilizing the parameter database is integration with the software make system. To do this, some form of code generation is used to convert parameter information into C source code.

The standard C software build process goes roughly like this:

  1. The preprocessor expands preprocessor macros and includes header files.
  2. The C compiler converts the expanded C source to assembly code.
  3. The assembler converts assembly code to machine binary.

The simplest form of code generation to integrate into this build process is to utilize the C preprocessor. The design pattern called “X-macros” can be particularly handy. I’ll cover X-macros in a future post.
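As a brief preview of the pattern, a minimal sketch follows. The parameter names, values and units are invented for the example. The idea is that a single list is expanded several times, each time with a different definition of `X`.

```c
/* Single list of parameters: X(name, value, unit). Hypothetical entries. */
#define SYSTEM_PARAMS                       \
    X(LOAD_MASS_G,   2500, "g")             \
    X(ENCODER_CPR,   4096, "counts/rev")    \
    X(SERVO_RATE_HZ, 8000, "Hz")

/* Expansion 1: an enum of parameter indices, ending with the count. */
#define X(name, value, unit) PARAM_##name,
enum param_id { SYSTEM_PARAMS PARAM_COUNT };
#undef X

/* Expansion 2: the values, in the same order as the enum. */
#define X(name, value, unit) value,
static const long param_value[PARAM_COUNT] = { SYSTEM_PARAMS };
#undef X

/* Expansion 3: printable labels, e.g. for logging or debugging. */
#define X(name, value, unit) #name " [" unit "]",
static const char *const param_label[PARAM_COUNT] = { SYSTEM_PARAMS };
#undef X
```

Adding a parameter means adding one line to `SYSTEM_PARAMS`; the enum, the value table and the label table all stay in sync automatically.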

For more complex code generation utilizing high level design languages, the build process must be augmented. There’s no magic here: a step is simply added before the preprocessor to run the code generation scripts. If the build system uses a standard makefile, this is easily done. From a data coherence and simplicity standpoint, it is ideal to have the full process run at each compile. If build time is an issue, intermediate files can be generated and checked in. The make system is built to keep things in sync in this type of setup.

My code generators usually require Matlab, my tool of choice for mathematical data manipulation. Since Matlab licenses are not cheap, it is rare that every software engineer will have access. If Matlab cannot be included in the standard build process, there are two similar options:

  1. The control systems engineer creates intermediate files via code generation with Matlab. The generated intermediate C source files are then checked in to the revision control system. Matlab is only required to generate the intermediate files, and everyone has access to those files.
  2. The control systems engineer generates the code directly in the normal source files. This keeps all of the related code together in one file.

Both of these solve the software license issue. In either case, as long as the software engineers never touch the parameter files, Matlab will not be invoked. I tend to favor the latter approach, as the final source file is unified and easier to read. In my next post, I’ll describe and share the Matlab tool I use to inject generated code directly into source files.

As a final note, on the “Matlab Tools” page of this blog you can find a Bash script which I use with makefiles to invoke Matlab for code generation.


Code Generation

Code generation is a topic of intense interest for me. If done correctly, code generation can be a tool that delivers on two of the core principles of computer science:

  • One Source of the Truth/Don’t Repeat Yourself. There should be one unambiguous source of data in a system. If you try to hand synchronize your data, at some point you will get it wrong.
  • Code in the most expressive language possible. Many studies have shown that a software developer will write about the same number of lines of code per day, regardless of the programming language. The more expressive the language is, the more work that gets done. This means that you want to code in a language that allows the most direct expression of the concepts of the problem domain, typically a domain specific language.

An example application of the two principles listed above is the handling of physical constants of a system. Let’s assume that we want to deal with the mass of the control system load. This value would typically show up in two places in our control system: the feedback and feedforward gains. During the development phase of the project, the mass is likely to change as the mechanical engineers proceed in their development. To avoid a long update process every time there is a mass change we don’t want to hard code the mass value into our system constants.

This is a case where code generation can be of great benefit. Our first step is to create a place to store the physical parameter values. If you can pull them directly from ME CAD, so much the better; I’ll have more to say on this in a future post. For now, let’s assume that’s not feasible. To keep it simple, we’ll just drop the values in a text file, with some designation of physical units attached. This file will serve as the single source of these values for all uses in the C code, addressing point one from above.

The next task is to automate the conversion of the physical parameters to each of the control gains. My tool of choice for this is a Matlab script. There are plenty of Matlab tools available to read data in from any text format, convert physical units, and scale and quantize for fixed point usage. Again, I’ll have more on these topics in future posts. The final step is to spit out bits of C code for each data use in the code. It is very simple to generate a C header with a #define line for each constant generated. Voilà, you have a code generator!
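The last two steps, quantizing a value for fixed point use and emitting a #define line, can be sketched in a few lines. This sketch is written in C rather than Matlab purely for illustration. The Q15 format, the `to_q15` and `emit_define` functions and the `FF_GAIN_Q15` name are assumptions, not part of any tool described here.

```c
#include <stdio.h>

/* Quantize a value in [-1, 1) to Q15 fixed point (scale factor 2^15),
 * rounding to nearest and saturating at the 16-bit limits. */
static int to_q15(double x)
{
    long q = (long)(x * 32768.0 + (x >= 0.0 ? 0.5 : -0.5));
    if (q >  32767) q =  32767;
    if (q < -32768) q = -32768;
    return (int)q;
}

/* Write one "#define NAME (value)" line into buf.
 * Returns the value returned by snprintf. */
static int emit_define(char *buf, size_t size, const char *name, int value)
{
    return snprintf(buf, size, "#define %s (%d)\n", name, value);
}
```

For example, a normalized gain of 0.25 quantizes to 8192, and `emit_define(buf, sizeof buf, "FF_GAIN_Q15", to_q15(0.25))` fills `buf` with the line `#define FF_GAIN_Q15 (8192)`. A real generator would write one such line per constant into the header.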

Through this simple process you have done the following to help your development process:

  1. You have made your development system much more robust. Storing the constants in the physical units minimizes human error when updates are made because you now get automatic updates of all the places with dependencies on that value.
  2. You have also increased your productivity.  While you had to make an upfront investment in the code generation tools, mass changes are now handled almost effortlessly. Like any tool, upfront effort will pay off in long term productivity.
  3. You have documented a portion of your development process. A key thing to note is that you really haven’t expended much additional effort. You had to work through all of the same steps to do the conversion even once; by recording those steps in a script, you’ve simplified the effort required the next time.

Code generation can be a powerful tool in the creation of robust, maintainable software. Modest upfront investments will pay large dividends in the long term. There are many tools available to aid in the process. In the future I’ll post about the Matlab tool stack I’ve developed over the years to aid in my code generation process.


Introduction

I’ve had a lifelong interest in making things move, starting from the first Lego Technic set that I received as a child. I got hooked when my father made a helicopter with a linkage between the rotors. Seeing the main rotor spin when I twirled the tail rotor was simply magic. From there I went on to study engineering in college with a focus on motion control, and I’ve spent over a decade developing highly optimized, high-precision, low-cost motion control systems.

In this blog I intend to provide various motion control related how-to’s, explanations of topics which I initially found confusing, and my opinions on technologies, tools and best practices related to motion control systems development. The posts will primarily focus on software/firmware (mostly in C/C++), Matlab tools and techniques, and a bit of math related to control systems.

Since 1996 I’ve been living in beautiful Portland, Oregon. Thus the blog’s name: PortlandMotion. I hope that you find this blog of interest from time to time.
