Tips, Ideas, Discussion for SOC Virtual Prototypes

Optimizing Linux Boot Time on an ARM Cortex-A9 CPAK

Posted by Pareena Verma on Tue, Jul 24, 2012 @ 08:30 AM

In my last post I wrote about porting bare metal benchmarks to Carbon’s reference platforms. Since then, I’ve been tackling more complex systems such as booting Linux on Carbon’s ARM® Cortex™-A9 Linux CPAK. Our customers have been doing this same task for a long time for multiple reasons:

  • It enables software engineers to develop and validate the boot code in parallel with RTL development.
  • Software engineers can leverage the prototypes for development and debug of device drivers and applications running on the OS.
  • Our virtual platforms make performance analysis and firmware debug possible before silicon is available.

Booting Linux on cycle-accurate models does have one big limitation, though: the amount of time required. Booting a standard Linux distribution can require billions of cycles, so the boot process can take days in some cases, which limits its value. My work has been to streamline that process and minimize the time required while keeping the platform valuable for development.
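To make the "billions of cycles means days" point concrete, here is a minimal back-of-the-envelope sketch. The cycle count and the simulation speed below are hypothetical numbers chosen only for illustration; they are not measurements from the CPAK.

```python
def boot_wall_clock_seconds(total_cycles, sim_cycles_per_second):
    """Wall-clock time needed to simulate `total_cycles` target cycles."""
    if sim_cycles_per_second <= 0:
        raise ValueError("simulation speed must be positive")
    return total_cycles / sim_cycles_per_second


def format_duration(seconds):
    """Render a duration as days, hours and minutes (seconds are dropped)."""
    days, remainder = divmod(int(seconds), 86_400)
    hours, remainder = divmod(remainder, 3_600)
    minutes = remainder // 60
    return f"{days}d {hours}h {minutes}m"


# Hypothetical inputs: a 2-billion-cycle boot simulated at 10,000 cycles/second.
ASSUMED_BOOT_CYCLES = 2_000_000_000
ASSUMED_SIM_SPEED = 10_000

if __name__ == "__main__":
    seconds = boot_wall_clock_seconds(ASSUMED_BOOT_CYCLES, ASSUMED_SIM_SPEED)
    print(format_duration(seconds))  # prints "2d 7h 33m"
```

Under these assumed numbers, every cycle removed from the boot path translates directly into wall-clock time saved, which is why the optimizations in this post target the cycle count itself.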

The Cortex-A9 Linux CPAK consists of an ARM Cortex-A9 processor, a PL301 interconnect, PL011 UARTs, and memories. The Linux kernel is configured to use the internal timers and interrupt controller found in the Cortex-A9. The global timer is used as a free-running clock source, while the watchdog timer is used as the primary Linux timer device. The bootloader uses the local timer in the Cortex-A9 system as its time source. Relying on these built-in peripherals allows a minimal system to be created.


[Figure: ARM Cortex-A9 CPAK block diagram]

Carbon's ARM® Cortex™-A9 Linux CPAK system


One of the ideas behind CPAKs is to give our customers a jump-start solution that considerably reduces the time needed to build a virtual platform complete with an OS running on it. For those interested, there’s a wealth of information out there on Linux boot time optimizations, but in this post I’d like to focus on just a couple that gave the largest boot-time speedup on our CPAKs.

Two major speedups were achieved by creating a smaller kernel image and by using an uncompressed kernel image. The default kernel configuration enables numerous features that our system did not require. Eliminating some of these features and drivers considerably reduced the size of the kernel image. I also found that booting an uncompressed kernel image was much faster than booting a compressed one. Although the uncompressed image is a lot bigger, its big advantage is that the simulated processor doesn’t have to spend cycles decompressing it into RAM. In a real system this image is normally stored in flash memory; since this platform has none, the user must load the image into memory. Using SoC Designer Plus, loading the kernel image to RAM is straightforward: it takes a single click before simulating the system. That is one of the nice advantages of using a virtual prototype.
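As a rough sketch of how one might hunt for features to trim, the snippet below parses the text of a kernel .config file and lists every enabled option that is not on a "required" list. The sample configuration and the required set are hypothetical and chosen only for illustration; they are not the actual CPAK kernel configuration.

```python
def enabled_options(config_text):
    """Return {option: value} for every option set in a kernel .config text."""
    options = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_MTD is not set".
        if not line or line.startswith("#"):
            continue
        name, separator, value = line.partition("=")
        if separator and name.startswith("CONFIG_"):
            options[name] = value
    return options


def removal_candidates(config_text, required):
    """List built-in (y) or modular (m) options that are not in `required`."""
    return sorted(
        name
        for name, value in enabled_options(config_text).items()
        if value in ("y", "m") and name not in required
    )


# Hypothetical excerpt of a default configuration.
SAMPLE_CONFIG = """\
CONFIG_SERIAL_AMBA_PL011=y
CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
CONFIG_USB=y
CONFIG_SOUND=m
# CONFIG_MTD is not set
"""

# Assumption: a UART-only platform needs just the PL011 driver and console.
REQUIRED = {"CONFIG_SERIAL_AMBA_PL011", "CONFIG_SERIAL_AMBA_PL011_CONSOLE"}

if __name__ == "__main__":
    for option in removal_candidates(SAMPLE_CONFIG, REQUIRED):
        print(option)  # prints CONFIG_SOUND, then CONFIG_USB
```

A list like this is only a starting point: each candidate still has to be checked against what the platform and its software actually need before it is disabled.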

I found that another area for optimization was skipping the calibration and Real Time Clock (RTC) synchronization phases. During each boot, the Linux kernel calibrates a timing loop to the system’s processor speed. On a fixed platform this value needs to be measured only once; reusing it avoids the calibration time on subsequent boots. Removing the Real Time Clock synchronization routine saved additional boot time.
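One common way to reuse a measured calibration value in mainline Linux is the `lpj=` kernel command-line parameter, which presets loops_per_jiffy so the kernel skips the calibration loop. Whether the CPAK uses exactly this mechanism is an assumption on my part. The sketch below extracts the value from a boot log line and appends it to a command line; the log line, HZ value and command line are hypothetical examples.

```python
import re


def parse_lpj(boot_log):
    """Extract loops_per_jiffy from a 'Calibrating delay loop' log line."""
    match = re.search(r"\(lpj=(\d+)\)", boot_log)
    return int(match.group(1)) if match else None


def bogomips(lpj, hz):
    """BogoMIPS figure the kernel reports for a given lpj and tick rate HZ."""
    return lpj * hz / 500_000


def with_preset_lpj(cmdline, lpj):
    """Return `cmdline` with exactly one lpj= argument carrying `lpj`."""
    args = [arg for arg in cmdline.split() if not arg.startswith("lpj=")]
    args.append(f"lpj={lpj}")
    return " ".join(args)


# Hypothetical log line from a first, fully calibrated boot.
SAMPLE_LOG = "Calibrating delay loop... 1594.16 BogoMIPS (lpj=7970816)"

if __name__ == "__main__":
    lpj = parse_lpj(SAMPLE_LOG)
    print(round(bogomips(lpj, hz=100), 2))  # prints 1594.16 (assuming HZ=100)
    # Hypothetical command line using the PL011 console.
    print(with_preset_lpj("console=ttyAMA0 mem=128M", lpj))
    # prints "console=ttyAMA0 mem=128M lpj=7970816"
```

The preset value is only valid while the processor clock and kernel tick rate stay the same, so it should be measured again whenever either changes.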

I found that these optimizations not only reduced the boot time but also helped shorten the development and debugging phase. 

The optimizations I’ve discussed here have been incorporated into the Cortex-A9 CPAK on Carbon IP Exchange. We’ll carry these optimizations forward into future CPAKs while continuing to streamline the boot further. Swap & Play is still the fastest way to boot Linux and get to an accurate debug point, but if cycle accuracy during the Linux boot is needed, I urge you to try the newly optimized Linux boot.

To learn more about the new CPAKs we develop and the optimizations we make for significant speedup keep checking back!


Tags: CPAK, SoCDesigner Plus, Swap & Play, Linux, ARM Performance Optimization, ARM Cortex-A9, Firmware Development, Pareena Verma, ARM PL301
