Abstract
Most programming languages support a call stack in the programming model and also in the runtime system.We show that for applications targeting low-power embedded microcontrollers (MCUs), RAM usage can be significantly decreased by partially or completely eliminating the runtime callstack. We present flattening, a transformation that absorbs a function into its caller, replacing function invocations and returns with jumps. Unlike inlining, flattening does not duplicate the bodies of functions that have multiple callsites. Applied aggressively, flattening results in stack elimination . Flattening is most useful in conjunction with a lifting transformation that moves global variables into a local scope. Flattening and lifting can save RAM. However, even more benefit can be obtained by adapting the compiler to cope with properties of flattened code. First, we show that flattening adds false paths that confuse a standard live variables analysis. The resulting problems can be mitigated by breaking spurious live-range conflicts between variables using information from the unflattened callgraph. Second, we show that the impact of high register pressure due to flattened and lifted code, and consequent spills out of the register allocator, can be mitigated by improving a compiler's stack layout optimizations. We have implemented both of these improvements in GCC, and have implemented flattening and lifting as source-to-source transformations. On a collection of applications for the AVR family of 8-bit MCUs, we show that total RAM usage can be reduced by 20% by compiling flattened and lifted programs with our improved GCC.
Highlights
On a collection of applications for the AVR family of 8-bit Microcontroller units (MCUs), we show that total RAM usage can be reduced by 20% by compiling flattened and lifted programs with our improved GCC
Microcontroller units (MCUs) are inexpensive systems-on-chip that are equipped with a limited amount of on-chip RAM: typically between hundreds of bytes and tens of kilobytes
We propose replacing the stack with static memory allocation, not replacing stack allocation with heap allocation
Summary
Microcontroller units (MCUs) are inexpensive systems-on-chip that are equipped with a limited amount of on-chip RAM: typically between hundreds of bytes and tens of kilobytes. Regardless of whether an MCU is programmed in assembly, C, C++, or Java, Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. It is likely to use some of its RAM to store one or more call stacks: chains of activation records representing the state of executing functions. The stack is memory-efficient: variables live only as long as they are needed. Compilers can rapidly generate high-quality code by compiling one function at a time; the small size of the typical function makes it possible to run aggressive intraprocedural optimizations while achieving short compile times
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.