Tips, Ideas, Discussion for SOC Virtual Prototypes

Cortex-A9 Cache Optimization (Part 1 of 3)

Posted by Toshihisa Oishi on Tue, Jun 12, 2012 @ 08:30 AM

Today's state-of-the-art SoCs have an L1 cache inside the processor, and most also include an L2 cache. Cache can certainly help your application run faster. However, it also costs chip area and power, so you need to weigh this trade-off. I recently received a good case study on cache performance analysis from one of our customers, and I would like to share it in this blog. The topic is split into a three-part series; part two will come out next week.

In this week's posting, I will describe how the customer assembled their profiling environment in Carbon's SoC Designer Plus. The customer's next-generation SoC was scheduled to use an ARM™ Cortex®-A9 processor core. To start the project, however, they reproduced their previous SoC configuration, which was based on an ARM1176. As shown in the figure below, the major components were the processor, an L2 cache controller, an AXI bus fabric, and a DDR3 memory controller (Cadence DDR3 Databahn). The remaining IP consisted of their own user logic and some interface IP.

Virtual Prototype Block Diagram


As is often the case, the schedule for this evaluation was very tight. To save time and effort, they used Carbon Performance Analysis Kits (CPAKs), which are available for both the Cortex-A9 and ARM1176. After familiarizing themselves with the bare-metal CPAKs, they generated models of their own configurations on Carbon IP Exchange and swapped them in for the pre-installed models. Instead of spending time re-creating their entire system, they could focus only on the areas they wanted to test.

Parts of the SoC Designer Plus systems created by the customer are shown in the figures below. Note that they created both an ARM1176 system and a Cortex-A9 system.

ARM1176 Block Diagram

Cortex-A9 Block Diagram

To familiarize themselves with the system and build a good understanding of the results, they generated numerous configurations for both their existing part and the design under development. In all, they developed over 30 different model configurations, which are listed in the table below.

 Cache optimization table
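To give a feel for how a configuration matrix like this grows, here is a minimal Python sketch that enumerates a sweep over a few cache parameters. The axes and values (`CORES`, `L1_SIZES_KB`, `L2_SIZES_KB`) and the `ModelConfig` naming scheme are hypothetical illustrations, not the customer's actual configurations, and the script does not call any Carbon tool.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ModelConfig:
    """One point in a hypothetical cache configuration sweep."""
    core: str
    l1_kb: int
    l2_kb: int  # 0 means "no L2 cache" in this sketch

    @property
    def name(self) -> str:
        # Build a readable label such as "cortex-a9_l1-32k_l2-256k".
        l2 = f"l2-{self.l2_kb}k" if self.l2_kb else "no-l2"
        return f"{self.core.lower()}_l1-{self.l1_kb}k_{l2}"


# Assumed sweep axes, chosen only for illustration.
CORES = ("ARM1176", "Cortex-A9")
L1_SIZES_KB = (16, 32, 64)
L2_SIZES_KB = (0, 128, 256, 512)


def build_sweep() -> list[ModelConfig]:
    """Return every combination of the axes above."""
    return [
        ModelConfig(core, l1, l2)
        for core, l1, l2 in product(CORES, L1_SIZES_KB, L2_SIZES_KB)
    ]


if __name__ == "__main__":
    for cfg in build_sweep():
        print(cfg.name)
```

Even three small axes already yield two dozen combinations, which is why a sweep of this kind quickly reaches the scale described above.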

In the next blog, I will discuss how the customer used the model configurations above to obtain and analyze a variety of different performance results.  Be sure to check back next week!


Tags: CPAK, SoCDesigner Plus, ARM Cortex-A9, Carbon Performance Analysis Kit, ARM Performance Analysis, ARM Cache Optimization
