Model-Driven Engineering of Intelligent Robot Architectures

Paper accompanying website

Evgeny Kusmenko, Svetlana Pavlitskaya, Bernhard Rumpe, and Thomas Timmermanns

Contents for Supplementary Material according to Paper Outline:
1 Intro
3 Related Work
4 MontiAnna
5 AI Driven Robot Architectures

Video demonstration:
Detailed instructions on how to use EmbeddedMontiArc and MontiAnna to create deep-learning-based robot software are given here.

1 Intro

The EmbeddedMontiArc Studio is a web-based IDE for EmbeddedMontiArc. It features a sample project defining a neural network in MontiAnna. To produce first results quickly, the network is trained on the CIFAR-10 data set; training takes several minutes on a CPU or less than a minute on a GPU.
Please follow the instructions on the desktop to run the project or watch this video. You are invited to customize the network or retrain it using your own labeled images.

3 Related Work

The running example used in the paper to compare the deep learning frameworks is the so-called ResNet152 deep network. It is a convolutional neural network that learns to extract information from images, for instance to make a robot understand its environment. Residual networks such as ResNet152 tackle the vanishing gradient problem in deep architectures by introducing so-called residual blocks, which add a block's input to its output through a shortcut connection.
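The idea behind a residual block can be sketched in a few lines of plain Python. This is only an illustration and not one of the implementations compared in the paper: it assumes small dense layers instead of the convolutions and normalization used in ResNet152, and the names relu, linear and residual_block are chosen for this example.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(weights, v):
    """Multiply a weight matrix (given as a list of rows) with a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x) with F(x) = w2 * relu(w1 * x).

    The '+ x' term is the shortcut connection: even if F contributes
    nothing, the input still passes through the block unchanged.
    """
    fx = linear(w2, relu(linear(w1, x)))
    return relu([f + xi for f, xi in zip(fx, x)])


# With all-zero weights F(x) vanishes and the block passes a
# non-negative input through unchanged.
zeros = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 2.0]
y_shortcut_only = residual_block(x, zeros, zeros)
y_with_identity_weights = residual_block(x, identity, identity)
```

Because the shortcut passes the input through unchanged, stacking many such blocks does not force the signal through every weight layer, which is the property residual networks rely on.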
The implementations of the ResNet152 we used for our framework comparison are available here.
In addition to the actual code, each implementation contains its origin as well as some remarks. Note that there are always many ways to define a complex architecture such as the ResNet152. Hence, shorter or more elegant examples might exist. However, we may conclude that in most frameworks the ResNet152 needs several hundred lines of code. Caffe even needs over 6000 lines of code.

The MontiAnna description of ResNet152 contains only 33 lines of code! This is one order of magnitude fewer than the other frameworks need.

Further network examples, e.g. AlexNet, can be found in our test model repository.

4 MontiAnna

MontiAnna is the umbrella term for a set of frameworks containing two languages and several code generators. The modules are:
  • CNNArch: the language to define the architecture of a deep artificial neural network. As the framework was originally intended for CNNs, the name still contains the term. However, MontiAnna can handle any kind of deep layered network.
  • CNNTrain: the training language to define the training hyperparameters, loss function, etc.
  • CNNArch2MXNet: the MontiAnna-to-MXNet compiler. MXNet is a deep learning framework widely used in industry applications, e.g. by Amazon. Our compiler produces C++ code for deployment and Python code for training. Furthermore, we provide CMake files to facilitate the final compilation of the generated MXNet code.

5 AI Driven Robot Architectures

We embed MontiAnna into a component and connector (C&C) modeling language called EmbeddedMontiArc. EmbeddedMontiArc allows us to decompose software architectures hierarchically and to implement the components using MontiMath, a Matlab-inspired matrix-based language for math-heavy algorithms such as controllers (PID, MPC, ...). Having composed EmbeddedMontiArc with MontiAnna, we make it possible to use neural networks as an alternative to MontiMath-based component implementations.
The composed language governing the sub-languages is EmbeddedMontiArcDL.
The composed generator is EMADL2CPP. It compiles the architecture model, the MontiMath behavior, and the MontiAnna deep neural networks to C++ code and produces the corresponding CMake files for building the executable.
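The principle that a component keeps its ports while its behavior is supplied either by a math-based implementation or by a neural network can be illustrated with a small Python sketch. This is not EmbeddedMontiArc or EmbeddedMontiArcDL syntax, and it does not reflect what EMADL2CPP generates; the class Component, the function run_pipeline, the port names and the single-neuron "network" are all hypothetical and chosen only for this example.

```python
class Component:
    """An atomic component: named input/output ports plus a behavior function."""

    def __init__(self, name, inputs, outputs, behavior):
        self.name = name
        self.inputs = inputs
        self.outputs = outputs
        self.behavior = behavior

    def step(self, signals):
        # Read only the declared input ports, write only the declared outputs.
        args = {port: signals[port] for port in self.inputs}
        result = self.behavior(args)
        return {port: result[port] for port in self.outputs}


def run_pipeline(components, signals):
    """Execute components in order; connectors are modeled by shared port names."""
    signals = dict(signals)
    for component in components:
        signals.update(component.step(signals))
    return signals


def handwritten_estimator(inp):
    # Stand-in for a math-based implementation of the component behavior.
    return {"distance": 2.0 * inp["pixel_gap"]}


def make_learned_estimator(weight, bias):
    # Stand-in for a trained network: one linear neuron with fixed parameters.
    def predict(inp):
        return {"distance": weight * inp["pixel_gap"] + bias}
    return predict


def brake_controller(inp):
    # Brake fully when the estimated distance drops below a threshold.
    return {"brake": 1.0 if inp["distance"] < 5.0 else 0.0}


math_based = Component("Estimator", ["pixel_gap"], ["distance"], handwritten_estimator)
learned = Component("Estimator", ["pixel_gap"], ["distance"], make_learned_estimator(2.0, 0.5))
controller = Component("Controller", ["distance"], ["brake"], brake_controller)

out_math = run_pipeline([math_based, controller], {"pixel_gap": 2.0})
out_learned = run_pipeline([learned, controller], {"pixel_gap": 2.0})
```

Both estimator variants expose the same ports, so the controller and the surrounding architecture stay untouched when one implementation is exchanged for the other.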

To demonstrate our methodology in action, we developed a self-driving vehicle software based on the direct perception principle.
Detailed instructions on how to use EmbeddedMontiArc and MontiAnna to create deep-learning-based robot software are given here.

The complete project is available here. The model sources can be found under src/main/dp. The main component is stored as plain text in the Mastercomponent.emadl file. Subcomponents can be found in the subdirectory subcomponents. In particular, the deep learning component is stored in Dpnet.emadl and its training is specified in Dpnet.cnnt.
The system uses a deep neural network to extract a dozen affordance indicators from a front camera image. The affordance indicators are scalar quantities, e.g. the distance to the car in front, the distance to the lane marking, etc.
The affordance indicators extracted by the neural network are then fed into Kalman filters and a controller written in MontiMath, which in turn computes the actuator commands, i.e. steering, acceleration, and braking.
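The chain from a noisy affordance indicator to an actuator command can be sketched as follows. The sketch is written in Python rather than MontiMath and is not taken from the project sources: it assumes a one-dimensional Kalman filter with a constant-state model and a simple proportional steering law, and all names, gains and noise variances are made up for illustration.

```python
class ScalarKalmanFilter:
    """One-dimensional Kalman filter assuming a (nearly) constant state."""

    def __init__(self, initial_estimate, initial_variance,
                 process_variance, measurement_variance):
        self.x = initial_estimate      # current state estimate
        self.p = initial_variance      # variance of the estimate
        self.q = process_variance      # how much the state may drift per step
        self.r = measurement_variance  # noise of the incoming measurements

    def update(self, z):
        # Predict: the state is assumed constant, only the uncertainty grows.
        self.p += self.q
        # Correct: blend prediction and measurement using the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= (1.0 - k)
        return self.x


def steering_command(lane_offset, gain=0.5, limit=1.0):
    """Proportional controller: steer against the offset, clamped to [-limit, limit]."""
    return max(-limit, min(limit, -gain * lane_offset))


# Hypothetical noisy lane-offset readings (in meters) as a network might emit them.
measurements = [0.9, 1.1, 1.0, 1.2, 0.8]
lane_filter = ScalarKalmanFilter(initial_estimate=0.0, initial_variance=1.0,
                                 process_variance=0.01, measurement_variance=0.1)
for z in measurements:
    estimate = lane_filter.update(z)

steering = steering_command(estimate)
```

The filter smooths the frame-to-frame jitter of the network output before the controller reacts to it, so a single outlier prediction does not translate directly into an abrupt steering command.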
The graphical component and connector architecture of the autonomous vehicle is depicted below. The deep learning component is highlighted in violet. Other components are either implemented in MontiMath or are a composition of subcomponents. Of course, it is possible to have multiple deep learning components in an architecture:
The workflow of the system design with EmbeddedMontiArc + MontiAnna is depicted in the following diagram.


The generation workflow is depicted next:

Of course, EmbeddedMontiArc + MontiAnna can be used to design a variety of other robotics applications beyond the autonomous driving domain.