Tips, Ideas, Discussion for SOC Virtual Prototypes

IP Selection: ARM Cortex-A9 or Cortex-A7?

Posted by Andy Meier on Tue, Feb 12, 2013 @ 09:25 AM

In Bill’s recent blog, he spoke about the tough questions customers need to ask as they start any new endeavor. These questions are extremely important when selecting a piece of IP. An architect may ask: for my application, how will the cache subsystem handle our existing code? What about code not yet written? How many instructions per cycle will the CPU be able to execute? What percentage of time will be spent on cache and TLB misses? Bill went through the design challenges our customers are facing today, and for today's blog, I am going to dive a bit into how an architect at one of our customers used Carbon's solutions to answer these questions and drive the IP selection process.


CPU: Cortex-A9 or Cortex-A7?

The first IP selection question our customer faced was whether to use a Cortex™-A9 or to design with the new Cortex-A7. According to ARM®, the Cortex-A7 will enable entry-level smartphone designs below $100 while delivering performance equivalent to that of a $500 high-end smartphone of just a few years ago. Pretty impressive!

Among its many features, the Cortex-A7 has an integrated L1 and L2 cache, which allows lower transaction latencies and ultimately improved memory system performance. The Cortex-A9 architecture, by contrast, supports a 16, 32 or 64KB L1 cache, with L2 cache support provided by the optional PL310 L2 cache controller.

Their intuition told them to choose the Cortex-A7, but they wanted to confirm their choice by benchmarking on a cycle-accurate virtual prototype set up for several experiments that varied cache size, latency configuration, and interconnect options. The benchmarks they chose for their experiments were Dhrystone and CoreMark. To jump-start their effort, they began with Carbon Performance Analysis Kits (CPAKs) developed around the Cortex-A9 and Cortex-A7. Each CPAK contains not only a simple platform but also the bare-metal benchmarks and sample initialization code, which allowed them to get up and running immediately.
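To give a feel for how quickly such an experiment space grows, here is a minimal Python sketch that enumerates one run per combination of settings. It is only an illustration: the parameter names and values (latency cycles, interconnect choices, and so on) are placeholders I made up for the example, not the customer's actual configurations, and nothing here calls into SoC Designer Plus.

```python
from itertools import product

# All values below are illustrative placeholders, not the customer's settings.
CPUS = ["Cortex-A9", "Cortex-A7"]
L1_DCACHE_KB = [16, 32, 64]
L2_LATENCY_CYCLES = [4, 8]            # hypothetical latency configurations
INTERCONNECTS = ["NIC-301", "FlexNoC"]  # hypothetical interconnect choices
BENCHMARKS = ["Dhrystone", "CoreMark"]


def experiment_matrix():
    """Return one dict per simulation run, covering every combination."""
    keys = ("cpu", "l1_dcache_kb", "l2_latency_cycles", "interconnect", "benchmark")
    axes = (CPUS, L1_DCACHE_KB, L2_LATENCY_CYCLES, INTERCONNECTS, BENCHMARKS)
    return [dict(zip(keys, combo)) for combo in product(*axes)]


runs = experiment_matrix()
print(f"{len(runs)} runs to simulate")
print("first run:", runs[0])
```

Even this small set of options produces dozens of runs, which is why starting from a ready-made platform and benchmark software saves so much time.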

They began their analysis by running Dhrystone on a single-CPU Cortex-A9 configuration with a 32KB D-cache and an external L2 cache. They examined the cache behavior by looking at the cache events provided with each component. Using SoC Designer Plus' profiling capability, they were quickly able to see how the benchmark was exercising the cache subsystem.


Figure 1: Cache Activity from Cortex-A9 running Dhrystone

In addition to D-cache characteristics, they gathered I-cache and TLB information by examining the PMU events from the Cortex-A9. They used these to calculate the D-cache, I-cache, and TLB miss rates (as percentages) for both Dhrystone and CoreMark.


Figure 2: Cortex-A9 CPU Profiling Events
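The miss-rate arithmetic itself is simple once the event counts are in hand. The sketch below shows one way to do it in Python; the counter names and the sample numbers are hypothetical, chosen only to make the example run, and are not results from the customer's experiments or actual PMU event names.

```python
def rate_percent(misses, accesses):
    """Miss rate as a percentage; returns 0.0 when there were no accesses."""
    if accesses == 0:
        return 0.0
    return 100.0 * misses / accesses


def summarize(counters):
    """Derive miss rates and IPC from a dict of raw event counts.

    The key names are hypothetical labels for this sketch.
    """
    return {
        "dcache_miss_pct": rate_percent(counters["dcache_misses"], counters["dcache_accesses"]),
        "icache_miss_pct": rate_percent(counters["icache_misses"], counters["icache_accesses"]),
        "tlb_miss_pct": rate_percent(counters["tlb_misses"], counters["tlb_lookups"]),
        "ipc": counters["instructions"] / counters["cycles"],
    }


# Made-up counts for illustration only.
sample = {
    "dcache_accesses": 200_000, "dcache_misses": 5_000,
    "icache_accesses": 400_000, "icache_misses": 2_000,
    "tlb_lookups": 200_000, "tlb_misses": 100,
    "instructions": 1_000_000, "cycles": 1_250_000,
}

summary = summarize(sample)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Running the same calculation on the counts from each benchmark and each CPU configuration gives numbers that can be compared side by side.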

Using Carbon IP Exchange, this customer quickly and easily specified the alternate configurations of the Cortex-A9 they were interested in. Updating their platform to use these new models was as simple as selecting "Replace Component" in the SoC Designer Platform menu.


Figure 3: Carbon IP Selection page for Cortex-A9

Replicating the platform configuration and the experiments set up for the Cortex-A9, they gathered the same profile information, this time for the Cortex-A7. Below you will see the instruction and pipeline profiling information provided with Carbon's Cortex-A7 model.


Figure 4: Cortex-A7 Profiling Events

Ultimately, the experiments and analysis they performed confirmed their initial inclination to use the Cortex-A7 for this project. The real value was that they were able to do all of this in a 100% implementation-accurate environment before finalizing their decision. These initial platforms were also leveraged later in their design cycle for architectural performance optimization.

Other IP selection decisions that can be made with Carbon's solutions and partnerships include fabric/interconnect and GPU selection. Should your design use an ARM NIC-301 or Arteris FlexNoC? Which IP provider are you going to select for your GPU? ARM Mali... Imagination Technologies... Vivante? All are options with Carbon IP Exchange.


Tags: CPAK, Andy Meier, ARM Performance Optimization, ARM Cortex-A9, ARM Cache Optimization, ARM Cortex-A7, Imagination Technologies
