Quality in embedded solutions
I want to share experience gained in electronic games of chance. This article may be useful to anyone interested in quality.
Disclaimer: The main goal of this article is to draw attention to non-obvious issues and hidden pitfalls. Many obvious points are therefore skipped, even where they would otherwise be needed. Almost all statements are based on long, extensive and meticulous research and investigation.
In the first version of this article I was detailed and precise, but all reviewers found that version too boring and inaccessible. So the current version is simple, accessible and full of fun ;)
1. Quality definition
There are many definitions of quality. Most of them can be found at wikipedia/Notable_definitions. But in embedded we can assume:
Quality ≈ reliability
and mention some attributes of a quality embedded product:
1. Reliability itself;
2. Stability and predictable behavior;
3. Matching of declared and real behavior;
4. Matching of declared and implemented functions and features.
As a reference for the highest-quality system we can take the sunrise ;).
1. we trust the sunrise every morning;
2. we can predict the moment of sunrise;
3. the sunrise, like every embedded product, fails on very rare occasions.
2. Quality Metrics
To deal with quality, for example to improve or monitor it, we need to measure it using some metrics. As assumed in the previous section, in embedded we can let:
Quality metrics ≈ reliability metrics
I want to mention the most useful of them:
1. MTBC (mean time between crashes)
2. MTBF (mean time between failures)
In my systems I achieved MTBC ≈ 20 years. That means, on average, 20 years without reboots, resets, repairs and the like for every product. This is outstanding on that market, and our customers liked it.
And according to a huge number of legends we can take as a reference:
MTBC ≈ 2000 years
for the highest-quality system from the previous section ;)
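As a rough illustration of how such a figure can be estimated from field data, the sketch below divides the accumulated operating time of a fleet by the number of crashes observed. The function name and the fleet numbers are hypothetical, chosen only to make the arithmetic visible.

```python
HOURS_PER_YEAR = 24 * 365

def mtbc_years(devices, hours_each, crashes):
    """Estimate mean time between crashes, in years, for a fleet.

    MTBC = total accumulated operating time / number of crashes.
    """
    if crashes == 0:
        raise ValueError("no crashes observed: MTBC cannot be estimated this way")
    total_hours = devices * hours_each
    return total_hours / crashes / HOURS_PER_YEAR

# Hypothetical fleet: 1000 devices, each running for one year, 50 crashes in total.
print(mtbc_years(devices=1000, hours_each=HOURS_PER_YEAR, crashes=50))  # 20.0
```

With a small fleet or a short observation period the estimate is very noisy, so it only becomes meaningful once enough device-years have accumulated.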
3. Basics of quality
The quality of the final product starts from the quality of every single component. The key to quality is the setup of the production process.
1. No Chinese connectors, switches, relays and the like;
2. No domestic (CIS) or Chinese PCBs;
3. No Chinese wires.
Assembling — soldering
Some time ago, a set of Pb-free technologies was introduced in place of the mature Pb soldering technology. Quality depends on the soldering technology as follows:
1. only Pb: excellent;
2. only RoHS (Pb-free): bad. Even Swiss manufacturers were not able to provide acceptable quality using this type of technology;
3. mixing Pb and RoHS: very bad. This case produces too much mysterious, strange and unpredictable behavior.
4. Unification and replaceability of HW and SW components
What is bad for people is good for quality:
in software and in hardware ;)
5. SW Architecture
Hardware and System Simulation Layer
So the Simulation Layer was implemented in a DSL, and the DSL interpreter was implemented separately. It was an excellent invention of my friend and one of the keys to the quality of our products because:
1. No memory allocations;
2. No pointers;
3. No distracting details;
4. Only game flow.
6. Key technologies and techniques
The following software technologies and techniques allowed us to provide such high quality:
1. TDD + DSL
2. Automated and non-interactive debugging such as valgrind + strace + logging + ...
3. Code generators (cog). Everything that can be generated must be generated, because this removes the human factor at critical moments;
4. Cross-platform builds. Sometimes compiling the code for another platform reveals hidden bugs on the original platform.
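To make the code generation point concrete, here is a minimal sketch that produces a C enum and a matching name table from a single list, so the two can never drift apart. It uses plain Python string formatting rather than cog itself, and the state names, the `GAME_STATE` prefix and the `generate_c_header` function are hypothetical.

```python
# Single source of truth: hypothetical game states (names are illustrative).
STATES = ["IDLE", "CREDIT", "PLAYING", "PAYOUT"]

def generate_c_header(states, prefix="GAME_STATE"):
    """Return C source text with an enum and a matching name table."""
    lines = ["/* Generated file: do not edit by hand. */", "", "enum game_state {"]
    for index, name in enumerate(states):
        lines.append(f"    {prefix}_{name} = {index},")
    lines.append(f"    {prefix}_COUNT = {len(states)}")
    lines.append("};")
    lines.append("")
    lines.append(f"static const char *const game_state_names[{prefix}_COUNT] = {{")
    for name in states:
        lines.append(f'    "{name}",')
    lines.append("};")
    return "\n".join(lines) + "\n"

# Print the generated header; a build step would write it to a file instead.
print(generate_c_header(STATES), end="")
```

Adding a state then means editing one list and regenerating, instead of keeping an enum and a string table in sync by hand.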
7. Custom Workflow
Quality projects definitely need a dedicated workflow engineer or a dedicated workflow activity. Well-known and popular workflows are usually not effective enough. Our workflow had the following key points:
2. Dedicated QA activity. Not just testing, but research and investigation into quality;
3. QA on every step;
4. Dedicated integration activity;
5. Brainstorming;
6. Automated and non-interactive debugging;
7. Static code analysis;
8. Automated testing.
Specifically, I want to draw attention to the fact that we didn’t need a codebase stabilization phase.
8. Team Members
Not all team members are equally important for achieving good quality of the target product. From my experience I want to mention the following team roles:
1. Dedicated quality engineer.
In embedded, if you want to achieve excellent results, generic testing activity is not enough. You need continuous, explicit research and investigation into quality, specific to your project.
2. Dedicated maintainer.
When individual team members switch from task to task, they lose the picture of the overall project. The role of this person is to watch over the consistency of the overall project, independently of the implementation processes.
3. Wide-area specialists are much more critical than narrow-area gurus.
If you create a team of narrow-area gurus and give them an embedded project to implement, they will fail with a 90% chance, just because they will not understand each other.
9. Bad friends
These guys will move your project in directions opposite to quality:
4. Cont. Int;
5. Other fashionable buzzwords.
Yes, we often need to balance in the triangle above with the help of those guys, but right now I’m talking about as much quality as possible.
10. Inconsiderate decisions
Using these technologies has killed many ideas, startups and companies. It is better not even to try them.
When I saw C++ for the first time, it looked as beautiful, powerful and cute as this animal:
But when I tried to use it, it was no longer so cute, and I was not able to handle it with bare hands:
The most disappointing thing was that it is too often impossible to fit C++ object code into the common platform ABI.
So my final feeling about this language is like this:
Some of the significant C++-related issues:
1. Bad exportability of functionality via “extern C”, so we must limit C++ by
2. Bad exception handling: too slow and unreliable. I think some people know about “lost exceptions”.
3. If one component is written in C++, it usually affects all other components of the product.
4. High cost of support: the average embedded programmer knows about 20% of C++, but everybody knows their own 20%, which will most likely be inaccessible to somebody else.
5. Very bad consistency.
So C++ was an interesting experiment but nothing more. My final advice to everybody is to avoid C++ in new projects unless you have a clear understanding and really significant reasons to get involved with this angry animal.
And if you still need C++-like OOP, I strongly suggest looking at Ada. Ada lacks almost all of the C++ issues, and on the Internet you can find many success stories of using Ada in embedded.
Many good embedded engineers regard USB as no more than a well-known experiment, with all the related consequences. USB is good as an element of HMI or for updating and maintaining an embedded device, but no more. So every good embedded engineer prefers not to use USB as a sensor interface, component interconnect or storage interface.
The main problem of USB is its unsuccessful design: sometimes a host or function can get into a situation like this ass:
and block all USB exchange until a human intervenes. For those who still want USB, there are scientific studies, investigations and theses on how to deal with it, as well as commercial HW/SW solutions, for example from Analog Devices.
But there are also some useful tricks which can significantly improve the situation:
1. Well-known mature optical isolation, or some other kind of galvanic isolation:
Yes, these tricks don’t solve the main USB trouble, but you will feel like that ass much more rarely.
There are still a lot of naive people who believe in MS and Windows. Because this piece of famous art:
can be seen in embedded products up to ten times more often than on a regular desktop, I perceive those naive people like this keen guy:
So, I believe, one of those guys will in the near future be nominated for
14. Native threads
This picture illustrates how I see a generic multi-threaded application:
Really, there is usually no reason to use threading in Linux userspace at all, because threading just increases complexity and nothing more. If your task is heavy enough to use all the resources of a CPU core, distributing the load onto a second CPU core using threads is the worst of the possible solutions. Yes, it can be useful for some exotic tasks in an HPC environment, but if you still want Linux userspace threads in embedded, just recall Alan Cox:
A Computer is a state machine. Threads are for people who can’t program state machines.
And for expanding your horizons read here.
My main points about threads:
1. Threads are usable only as long as they stay very simple.
2. The reasons that caused POSIX threads to be introduced into Linux are no longer relevant today.
3. On x86 hardware, threads make caching, the TLB and the CPU pipeline ineffective. So threads usually make an application slow and laggy.
4. A good multithreaded application, in order to be effective, must know almost everything about the underlying hardware, which lives in kernel platform code and is not exported to userspace in a simple way. Here I send greetings to the Java guys ;).
5. By avoiding threads we avoid several classes of possible errors. The main one is thread synchronization errors.
We can also get some benefits from cooperative or other kinds of manual multithreading like this:
It allows us to handle most of the issues from the list above.
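A minimal sketch of what cooperative multitasking can look like, using Python generators as stand-ins for tasks and a round-robin loop as the scheduler. The task names `blink` and `poll_buttons` and the scheduler `run` are hypothetical. A real embedded system would do the same with an event loop or protothread-style code in its own language.

```python
from collections import deque

def blink(log, times):
    """Hypothetical task: toggles a lamp, yielding after every step."""
    for i in range(times):
        log.append(f"blink {i}")
        yield  # hand control back to the scheduler

def poll_buttons(log, times):
    """Hypothetical task: polls inputs, yielding after every poll."""
    for i in range(times):
        log.append(f"poll {i}")
        yield  # hand control back to the scheduler

def run(tasks):
    """Round-robin scheduler: run each task until its next yield."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, drop it
        queue.append(task)

log = []
run([blink(log, 2), poll_buttons(log, 3)])
print(log)
```

Because a task gives up control only at an explicit `yield`, shared data cannot change underneath it between two yields, so no locks are needed and the execution order is reproducible.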
15. Mixing of frameworks
Mixing different frameworks inside a single application is also a good source of nuances and unpredictable behavior.
Just for illustration: if you use Qt and Gtk+ simultaneously, you can’t safely use any common way to run a child application.
So if you mix different frameworks, you can trust your application about as much as you trust a fortune teller’s predictions.
A developer’s paradise
Having gained so many benefits from the DSL, we spent a lot of time looking for a way to avoid the DSL development effort in small applications. Finally we found that we can use a variant of event-driven development: direct programming in finite state machines. It’s good and flexible because:
1. An FSM can be instantiated in multiple equivalent representations and converted between them.
2. FSMs are very simple to learn and are known to every good engineer.
3. An FSM is high-level enough that you don’t worry about implementation, and at the same time it allows low-level operations in a natural and flexible way.
4. You have a simple and accessible way to explain to non-programmer engineers what your software does and how.
And finally the main benefit: an FSM can have the same graphical and binary machine representation. So you don’t need a debugger to find a problem, because on top of a good FSM engine the FSM works the same way as on paper.
And here I have tried to illustrate how FSMs can change your development workflow or even remove coding from it.
So it is good to start with Ragel as the FSM description language, UML as the design environment and Moore machines as the basis for both of them.
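To show what direct programming in FSMs can look like, here is a small table-driven Moore machine, where the output depends only on the current state. The states, events and outputs describe a made-up game flow and are not taken from a real product. The engine is plain Python and does not use Ragel.

```python
# Hypothetical game flow as a Moore machine: the output depends only on the state.
TRANSITIONS = {
    ("IDLE", "coin"): "CREDIT",
    ("CREDIT", "coin"): "CREDIT",
    ("CREDIT", "start"): "PLAYING",
    ("PLAYING", "win"): "PAYOUT",
    ("PLAYING", "lose"): "IDLE",
    ("PAYOUT", "paid"): "IDLE",
}

OUTPUTS = {
    "IDLE": "show attract screen",
    "CREDIT": "show credit",
    "PLAYING": "spin reels",
    "PAYOUT": "dispense prize",
}

def step(state, event):
    """Return the next state; events with no transition leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

def run(events, state="IDLE"):
    """Feed events to the machine and collect (state, output) after each one."""
    trace = []
    for event in events:
        state = step(state, event)
        trace.append((state, OUTPUTS[state]))
    return trace

for state, output in run(["coin", "start", "win", "paid"]):
    print(state, "->", output)
```

The two tables are the whole program: they can be drawn as a state diagram, reviewed by a non-programmer, or generated from one, while the engine in `step` and `run` stays unchanged.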