Problems
Saving to hard disk
The logged data is saved to disk so that it can be read back at a later
date. The data is sent to disk every time a sample set is received. This is
typically completed before the next sample is required and does not interfere
with the timing. That, at least, was the original thought. In practice, it held
most of the time because the data is buffered before it is sent out to disk.
When the buffer is emptied, however, the time taken to complete the write can
become quite long, especially if the disk needs to be woken up and brought up to
speed. While this is happening, no samples are taken and the integrity of the
sampling rate is broken. Testing during development did not show a problem, but
when the unit was used in real life, a 2 to 3% timing error was seen. If the
logs were kept small, the timing was accurate. If they extended to several
minutes, they lost time. The problem turned out to be caused by two design
problems: accuracy of the timer counter programming and the buffer being
flushed to the hard disk. The timer counter problem will be covered later and
was really caused by an underlying behaviour in the compiler. The hard disk
issue was a more fundamental problem because it meant that the design could not
store data logs to disk without compromising the sampling rate integrity.
There is a simple solution to this. Why use a slow disk when there is
plenty of faster memory available? Use a large data array to hold the data
samples and then copy the data to disk when the logging is stopped. The
disadvantage is that the maximum logging time is limited by the size of the
data array, which will typically be far smaller than the storage, and thus the
time, offered by a hard disk. However, with just a few Mbytes of RAM offering
the equivalent of many hours of logging, this is not necessarily a major
obstacle, and this solution was investigated.
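The approach above can be sketched in C. Everything here is illustrative: the sample size, buffer size, and function names are assumptions, not the original logger code.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative sizes: six bytes per sample (as in the EMU protocol)
   and a buffer large enough for a long log. */
#define SAMPLE_SIZE 6
#define MAX_SAMPLES 20000

static unsigned char log_buf[MAX_SAMPLES * SAMPLE_SIZE];
static long sample_count = 0;

/* Called from the sampling loop: a plain memory copy, so it cannot
   stall waiting for the disk. Returns 0 when the buffer is full. */
int log_sample(const unsigned char *sample)
{
    if (sample_count >= MAX_SAMPLES)
        return 0;
    memcpy(&log_buf[sample_count * SAMPLE_SIZE], sample, SAMPLE_SIZE);
    sample_count++;
    return 1;
}

/* Called once after logging stops: the slow disk write happens here,
   where it can no longer disturb the sampling rate. Returns the
   number of samples written, or -1 on error. */
long flush_log_to_disk(const char *path)
{
    FILE *fp = fopen(path, "wb");
    long written;
    if (fp == NULL)
        return -1;
    written = (long)fwrite(log_buf, SAMPLE_SIZE, (size_t)sample_count, fp);
    fclose(fp);
    return written;
}
```

The key property is that `log_sample` touches only memory; every disk operation is deferred to `flush_log_to_disk`, which runs outside the timed loop.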
Data size restrictions and the use of a RAM disk
The PC architecture started out as a segmented architecture where the
total memory map is split into 64 kbytes segments of memory. Addressing data in
these segments is straightforward, provided the size of the data does not go
beyond 64 kbytes. If it does, the segment register needs to be programmed to
place the address into the next segment. While later PC processors adopted
larger linear address spaces, many compilers were slow to exploit this.
This leads to a dilemma: the hardware supports it but the C compiler does not.
In this case, the Borland TurboC v2.0 compiler was restricted to data arrays of
no larger than 64 kbytes. It would have been possible to use several of these
and implement some housekeeping code that controls when to switch from a
filled array to an empty one, but initial attempts at that were not successful
either. This is a case of a need that looks simple at face value but gets
complex when all the potential conditions and scenarios are considered. This
led to a fairly major decision: either change to a different or later version
of the compiler that supported larger data structures or find some other
memory-based solution.
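The text does not show the unsuccessful attempts, but the housekeeping idea itself is simple to sketch: several arrays, each kept under the 64 kbyte segment limit, with code that switches to the next bank when the current one fills. All sizes and names here are illustrative assumptions.

```c
#include <string.h>

#define BANK_SIZE   60000U  /* stays below a 64 kbyte segment */
#define NUM_BANKS   4
#define SAMPLE_SIZE 6

static unsigned char banks[NUM_BANKS][BANK_SIZE];
static int current_bank = 0;
static unsigned int bank_used = 0;

/* Store one sample, switching banks when the current one is full.
   Returns 1 on success, 0 once every bank is exhausted. */
int store_sample(const unsigned char *sample)
{
    if (bank_used + SAMPLE_SIZE > BANK_SIZE) {
        if (current_bank + 1 >= NUM_BANKS)
            return 0;           /* out of storage */
        current_bank++;         /* move on to the next empty bank */
        bank_used = 0;
    }
    memcpy(&banks[current_bank][bank_used], sample, SAMPLE_SIZE);
    bank_used += SAMPLE_SIZE;
    return 1;
}
```

The complexity the text alludes to comes from everything around this core: making the switch safe against interrupts, handling partial banks at the end of a log, and writing the banks out in order afterwards.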
The decision to change a compiler is not to be taken lightly, especially
as the application code had been written and was working well. It used the
Borland specific functions to access the timer hardware and set up the
interrupt routines. Changing the compiler would mean that this code would
probably need to be rewritten and tested and this process may introduce other
bugs and problems.
A search through the Borland developer archive provided a solution. It
came up with the suggestion to use the existing disk storage method but create
a RAM disk to store the file. When the logging is complete the RAM disk file
can be copied to the hard disk. This copying operation is done outside of the
logging and thus the timing problem goes away. There is no change to the
fundamental code, and as the PC was a laptop with its own battery supply,
there was no real risk of losing the data. One added benefit was that the
hard disk was not used, and thus powered down, during the logging operation,
reducing the power consumption.
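The end-of-logging transfer is an ordinary file copy from the RAM disk drive to the hard disk. A minimal sketch follows; the paths and the function name are illustrative, not taken from the original source.

```c
#include <stdio.h>

/* Copy the finished log from the RAM disk (e.g. drive D: under
   MS-DOS) to the hard disk. This runs only after logging stops, so
   disk spin-up delays no longer affect the sampling. Returns the
   number of bytes copied, or -1 on error. */
long copy_log(const char *ramdisk_path, const char *harddisk_path)
{
    FILE *src, *dst;
    char buf[512];
    size_t n;
    long total = 0;

    src = fopen(ramdisk_path, "rb");
    if (src == NULL)
        return -1;
    dst = fopen(harddisk_path, "wb");
    if (dst == NULL) {
        fclose(src);
        return -1;
    }
    while ((n = fread(buf, 1, sizeof(buf), src)) > 0) {
        if (fwrite(buf, 1, n, dst) != n) {
            total = -1;         /* write error: report failure */
            break;
        }
        total += (long)n;
    }
    fclose(src);
    fclose(dst);
    return total;
}
```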
Timer calculations and the compiler
The software was designed to calculate the timer values using the
parameter passed to the timer set-up routine. All well and good except that the
routine was not accurate and the wrong value would be programmed in. The reason
was rounding and conversion errors in the arithmetic. Although the basic
code looked fine and was syntactically correct, the compiler performed several
behind-the-scenes approximations that led to a significant error. The problem
is further compounded by the need for a final hexadecimal value to be
programmed into the timer register. In the end, calculating the exact figure
using the Microsoft Windows NT calculator accessory, and then converting the
final value to hexadecimal solved this problem. This pre-calculated value is
used if the passed parameter matches the required value.
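The text does not give the original arithmetic, but the PC's timer illustrates the kind of conversion error involved: the counter is programmed with the 1.193 MHz input clock divided by the desired rate, and C integer division truncates rather than rounds. The 100 Hz rate and the pre-computed hexadecimal constant below are assumptions for illustration only.

```c
#define PIT_CLOCK_HZ 1193182L   /* PC timer input clock, ~1.193 MHz */

/* Naive version: integer division truncates, so the programmed
   count can be short and the timer runs slightly fast. */
unsigned int divisor_truncated(long rate_hz)
{
    return (unsigned int)(PIT_CLOCK_HZ / rate_hz);
}

/* Corrected version: adding half the divisor before dividing
   rounds to the nearest count instead of truncating. */
unsigned int divisor_rounded(long rate_hz)
{
    return (unsigned int)((PIT_CLOCK_HZ + rate_hz / 2) / rate_hz);
}

/* The fix described in the text: for the rate the logger actually
   uses (100 Hz is assumed here), return a value calculated by hand
   and converted to hexadecimal, bypassing the arithmetic entirely. */
unsigned int timer_divisor(long rate_hz)
{
    if (rate_hz == 100)
        return 0x2E9C;          /* 1193182 / 100, rounded = 11932 */
    return divisor_rounded(rate_hz);
}
```

For 100 Hz the truncated and rounded counts differ by one (11931 versus 11932); a small per-tick error, but one that accumulates over a long log.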
Data corruption and the need for buffer flushing
While the system worked as expected in the lab using the test harness,
occasional problems were noticed where the data order unexpectedly changed.
The engine RPM data would move to a different location and appear as a wheel
speed and the front wheel speeds would register as the rear wheel speeds and so
on. It was initially thought that this was to do with an error in the way the
data ordering was done. Indeed, a quick workaround was made which changed the
sample data ordering but while it appeared to cure the problem, it too suffered
from the wrong data order at some point.
Further investigation indicated that this problem occurred when the
engine was switched off and re-started while logging or if the logging was
started and the engine then switched on. It was also noticed that when the EMU
was powered up, it sent out a “welcome” message including the version number. It
was this that provided the clue.
The welcome message was stored in the serial port FIFO and when a sample
was requested, the real sample data would be sent and added to the FIFO queue.
The logger would read the first six bytes from the queue that would be the
first six characters of the welcome message. This would then repeat. If the
welcome message had a length that was a multiple of six characters, then the
samples would be correctly picked up, albeit slightly delayed. The periodicity
would still be fine, it's just that sample 1 would be received when sample 2 or
3 would have been. If the message was not a multiple of six, then the remaining
characters would form part of the sample and the rest of the sample would be
appended to this. If there were three characters left, then the first three
characters of the sample would be appended to this and become the last three
characters. The last three characters of the data sample are still in the FIFO
and would become the first three of the next sample and so on. This would give
the impression that the data order had changed but in reality, the samples were
corrupted with each logged sample consisting of the end of the previous sample
and the beginning of the next.
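The misalignment can be simulated in a few lines of C. The welcome text, its length, and the FIFO helpers below are invented for the demonstration; only the six-byte sample size comes from the text.

```c
#include <string.h>

#define SAMPLE_SIZE 6

/* A simulated serial FIFO standing in for the UART's receive queue. */
static unsigned char fifo[256];
static int fifo_len = 0;
static int fifo_pos = 0;

/* Bytes arriving from the EMU are appended to the FIFO... */
static void fifo_put(const unsigned char *data, int len)
{
    memcpy(&fifo[fifo_len], data, len);
    fifo_len += len;
}

/* ...and the logger always reads fixed six-byte frames from it. */
static void fifo_get(unsigned char *out, int len)
{
    memcpy(out, &fifo[fifo_pos], len);
    fifo_pos += len;
}

/* Offset of the first real sample byte within a six-byte frame,
   given a welcome message of the stated length: zero only when the
   message length happens to be a multiple of six. */
int sample_misalignment(int welcome_len)
{
    return welcome_len % SAMPLE_SIZE;
}
```

With an eight-character welcome message, every frame after the first contains the last two bytes of one sample and the first four of the next, which is exactly the "shifted data order" the logger appeared to show.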
If the EMU was started before logging was enabled, the characters were
lost and the problem did not arise. If the engine stalled during the logging or
started during this period, the welcome message would be sent, stored and thus
corrupt the sampling. The test chassis did not simulate this, and this was why
the problem did not appear in the lab. Another problem was also identified,
associated with turning the engine off. If this happened while a data sample
was being sent, the transfer would not complete, leaving the potential for the
system to stall or for partial data to remain in the queue and corrupt the
data sampling.
Having identified the problem, the solution required several fixes. The first
was to clear the buffers prior to enabling the logging, so that any welcome
message or other erroneous data was removed. This approach was extended to
occur whenever a sample is requested, to ensure that a mid-logging engine
restart did not insert the “welcome” message data into the FIFO. During normal
operation, this check has very little overhead: it basically adds a read of
one of the serial port registers and a bit test to the existing instruction
flow. If characters are detected, they are easily removed before the sample is
taken; this should only happen during an engine shutdown and restart, where
timing accuracy is not absolutely essential. Indeed, the data logger marks the
data log to indicate that an error has occurred. The sample data collection
will also time out if no data is received. This is detected and again the
sample is aborted and a marker inserted to indicate that a timeout has
occurred.
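A sketch of the drain-before-sample fix follows. On a PC UART the data-ready indication is bit 0 of the line status register; since the original register-level code is not given, the port reads here are stand-ins backed by a small simulated FIFO so the logic can be shown and exercised without hardware.

```c
#define LSR_DATA_READY 0x01     /* bit 0 of the UART line status register */

/* Simulated FIFO backing the stand-in port reads. */
static unsigned char sim_fifo[64];
static int sim_len = 0;
static int sim_pos = 0;

/* Stand-in for reading the line status register: reports data
   ready while simulated bytes remain. */
static unsigned char read_lsr(void)
{
    return (sim_pos < sim_len) ? LSR_DATA_READY : 0;
}

/* Stand-in for reading the receive buffer register. */
static unsigned char read_rbr(void)
{
    return sim_fifo[sim_pos++];
}

/* Drain any stray characters (e.g. a welcome message from an engine
   restart) before requesting a sample. Returns how many bytes were
   discarded, so the logger can mark the log when this happens. */
int flush_serial_fifo(void)
{
    int discarded = 0;
    while (read_lsr() & LSR_DATA_READY) {
        (void)read_rbr();
        discarded++;
    }
    return discarded;
}
```

In the common case the FIFO is empty and the cost is a single register read and bit test, which matches the "very little overhead" described above.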