Memory management in low-power embedded systems

June 18, 2020

Power consumption has become a first-class concern for designers of mobile systems, alongside performance and design. Typically the lowest-power processors are 4-bit or 32 kHz parts that use extremely little power per cycle. Moving data between memories is expensive, whether between program memory and SRAM on an Arduino or between the processor core and the instruction cache (Icache) on other systems: 50 to 80% of runtime power consumption can come from memory traffic caused by transfers between off-chip and on-chip memory.

To reduce the impact of memory transfers, it’s important to focus from the beginning on strategies that monitor and reduce memory traffic and power consumption. Some such strategies include:

• Cache sizing
• Loop transformations

On Arduino, for example, a strategy to reduce memory transfers should also include proper memory management. The Arduino Uno has 32 KB of Progmem (flash memory, of which 0.5 KB is used for the bootloader), 2 KB of SRAM, and 1 KB of EEPROM. If you run out of memory, the program may fail in unexpected ways or behave strangely. These issues are difficult to diagnose without monitoring. The F macro has been available since Arduino 1.0:

WString.h:#define F(string_literal) (reinterpret_cast<const __FlashStringHelper *>(PSTR(string_literal)))

The F macro keeps C-style string literals in Progmem instead of SRAM; at runtime they are read out of flash only as needed. Here’s a commit I made recently to the Adafruit NFCShield library to convert its Serial C-style strings (char arrays) to Progmem strings:

Rewriting the Adafruit NFCShield library to use the F Macro

The drawback of this approach is that reading from Progmem takes additional cycles, and those cycles consume power. Data is copied to SRAM one byte at a time. A better approach is to offload as many calculations, responses, and display values as possible to a remote server.
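To make the byte-at-a-time cost concrete, here is a minimal sketch of copying a Progmem string into an SRAM buffer. It assumes an AVR board, where PROGMEM and pgm_read_byte come from avr/pgmspace.h. The string kGreeting, the helper copy_from_progmem, and the desktop fallback macros are illustrative assumptions, not part of the Arduino core or the Adafruit library.

```cpp
#include <stddef.h>

#ifdef ARDUINO
#include <avr/pgmspace.h>   // real PROGMEM and pgm_read_byte on AVR boards
#else
// Desktop fallback (an assumption, for testing off the board only):
// there is a single address space, so a plain read stands in for
// pgm_read_byte and PROGMEM expands to nothing.
#define PROGMEM
#define pgm_read_byte(addr) (*(const unsigned char *)(addr))
#endif

// A string kept in program memory instead of SRAM.
const char kGreeting[] PROGMEM = "Hello";

// Copies a flash-resident string into an SRAM buffer one byte at a time.
// Always terminates dst when dst_size > 0. Returns the number of
// characters copied, not counting the terminator.
size_t copy_from_progmem(char *dst, size_t dst_size, const char *src) {
    if (dst == NULL || dst_size == 0) {
        return 0;
    }
    size_t i = 0;
    while (i + 1 < dst_size) {
        const char c = static_cast<char>(pgm_read_byte(src + i));  // one flash read per byte
        if (c == '\0') {
            break;
        }
        dst[i++] = c;
    }
    dst[i] = '\0';
    return i;
}
```

Every character costs a separate flash read inside the loop, which is where the extra cycles come from. The SRAM buffer only has to exist while the string is in use, so it can be a short-lived local.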

Each character takes one byte, and the '\0' terminator appended to every C-style string takes one more. The simple string below takes up 19 bytes of SRAM. With only 2,048 bytes available, it won’t take long to use up all of the SRAM.

char message[] = "This is a message.";
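The 19-byte figure can be checked with sizeof, which counts the terminator, whereas strlen does not. A minimal sketch follows; the two helper function names are made up for illustration.

```cpp
#include <stddef.h>
#include <string.h>

// The same string as above; as a plain char array it lives in SRAM.
char message[] = "This is a message.";

// Bytes the array occupies, including the '\0' terminator.
size_t message_storage_bytes(void) {
    return sizeof(message);
}

// Visible characters only; strlen stops at the terminator.
size_t message_length(void) {
    return strlen(message);
}
```

Here message_storage_bytes() returns 19 and message_length() returns 18; the one-byte difference is the terminator.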

There are important considerations when using Progmem for strings. Progmem and EEPROM are non-volatile (information persists after the power is turned off), while SRAM is volatile (its contents are lost when power is cycled); so if data needs to persist between power cycles, it has to be stored in Progmem or EEPROM. Also, strings in Progmem cannot be modified in place; once copied to SRAM, they can be.

PROGMEM
EEPROM
F Macro

Secondly, knowing how much memory each data type uses, and choosing the smallest type that fits, can reduce memory traffic. For example, an int takes up two bytes while a byte uses only one (but stores a smaller range of values). Trimming oversized variables can offer a number of performance gains. Here are the variable sizes in bytes on 8-bit AVR boards such as the Uno:

boolean 1
char 1
unsigned char, byte, uint8_t 1
int, short 2
unsigned int, word, uint16_t 2
long 4
unsigned long, uint32_t 4
float, double 4

Optimizing SRAM
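As a rough illustration of how type choice scales with array length, the sketch below compares the footprint of the same buffer declared with three fixed-width types. The buffer length SAMPLE_COUNT and the array names are hypothetical. Fixed-width types are used because their sizes are the same on every platform, whereas int is 2 bytes on the Uno but usually 4 on a desktop compiler.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical buffer length, chosen only for illustration.
#define SAMPLE_COUNT 100

uint8_t  samples_8[SAMPLE_COUNT];    // holds values 0..255
uint16_t samples_16[SAMPLE_COUNT];   // holds values 0..65535
uint32_t samples_32[SAMPLE_COUNT];   // holds values 0..4294967295

// Bytes of SRAM each buffer occupies.
size_t bytes_with_uint8(void)  { return sizeof(samples_8); }
size_t bytes_with_uint16(void) { return sizeof(samples_16); }
size_t bytes_with_uint32(void) { return sizeof(samples_32); }
```

The three buffers take 100, 200, and 400 bytes. On a board with 2,048 bytes of SRAM, the 32-bit version alone uses roughly a fifth of the total, so it is worth checking whether the readings really need that range.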

Thirdly, at the circuit level, a mix of transistor types can reduce power consumption without performance loss. Power gating uses sleep transistors to switch off entire blocks of a circuit when they are not in use.

Sleep transistor sizing

Fourthly, loop transformations can allow two arrays to share the same memory space, reducing memory traffic and the power spent on memory allocation and access. Take arrays c[] and w[]: after a loop interchange, the number of memory reads is reduced because c[] and w[] can share memory space.

Loop Transformations
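The sketch below uses loop fusion rather than the interchange described above, because fusion gives the shortest self-contained demonstration of an intermediate array giving up its storage. The arithmetic and the function names are made up for illustration; only the array names c[] and w[] come from the text.

```cpp
#include <stddef.h>
#include <stdint.h>

// Before: two loops and two full-size arrays. Every element is written
// to c[], read back from c[], and then written to w[].
void compute_unfused(const int16_t *in, int16_t *c, int16_t *w, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        c[i] = static_cast<int16_t>(in[i] * 2);
    }
    for (size_t i = 0; i < n; ++i) {
        w[i] = static_cast<int16_t>(c[i] + 1);
    }
}

// After fusion: one loop, and the intermediate value lives in a local
// (typically a register), so c[] needs no buffer of its own.
void compute_fused(const int16_t *in, int16_t *w, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        const int16_t c = static_cast<int16_t>(in[i] * 2);
        w[i] = static_cast<int16_t>(c + 1);
    }
}
```

Both functions produce the same w[], but the fused version skips n writes and n reads of c[] and frees the SRAM that array would have occupied.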

Fifthly, power-aware computing involves reducing the switching activity on the data bus between the processor core and the Icache. Sample registers should be used to record how often transitions occur between the register labels (encodings) of instructions executed in consecutive cycles. On systems that run an operating system, the OS maintains a table of virtual-to-physical address translations.
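One simple way to estimate switching activity in software is to count how many bits differ between consecutive words on the bus, that is, their Hamming distance. The sketch below assumes 32-bit instruction words and a trace already captured into an array. Both are assumptions for illustration, as are the function names, and this is not the hardware sample-register mechanism itself.

```cpp
#include <stddef.h>
#include <stdint.h>

// Number of bit positions that differ between two 32-bit words.
unsigned bit_flips(uint32_t a, uint32_t b) {
    uint32_t diff = a ^ b;   // a 1 marks every bit that changed
    unsigned count = 0;
    while (diff != 0) {
        diff &= diff - 1;    // clear the lowest set bit
        ++count;
    }
    return count;
}

// Total bit transitions across a trace of consecutive words.
// A trace with fewer than two words has no transitions.
unsigned long total_bus_transitions(const uint32_t *words, size_t n) {
    unsigned long total = 0;
    for (size_t i = 1; i < n; ++i) {
        total += bit_flips(words[i - 1], words[i]);
    }
    return total;
}
```

Comparing this total for two candidate instruction encodings of the same trace gives a rough indication of which one toggles the bus less.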

malloc: allocates a given number of bytes and returns a pointer to them. Returns a null pointer if insufficient memory is available.

free: takes a pointer to a segment of memory allocated by malloc, and returns it for later use by the program or the operating system.
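A minimal sketch of the two calls used together: allocate, check for a null pointer, use the memory, then free it. The function name and the -1 error value are choices made for this example only.

```cpp
#include <stddef.h>
#include <stdlib.h>

// Allocates a temporary array of n ints, fills it with 1..n, sums it,
// and frees it. Returns -1 if malloc cannot supply the memory.
// Intended for small n; the size calculation is not overflow-checked.
long sum_first_n(size_t n) {
    if (n == 0) {
        return 0;   // nothing to allocate
    }
    int *values = static_cast<int *>(malloc(n * sizeof(int)));
    if (values == NULL) {
        return -1;  // insufficient memory: report it instead of crashing
    }
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        values[i] = static_cast<int>(i + 1);
        total += values[i];
    }
    free(values);   // hand the block back so it can be reused
    return total;
}
```

On a board with 2 KB of SRAM the null check matters, because a failed allocation is a realistic outcome. Freeing promptly also helps, although repeated malloc and free calls of varying sizes can still fragment a small heap.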

Compiler optimizations for low power systems
Memory design and exploration for low-power embedded systems