

# Introduction Zynq

Introduction Zynq. Zynq PS vs. PL Data Buses

BTE5380 - Embedded Systems, October 2014

> Andreas Habegger Bern University of Applied Sciences


Introduction Processing System Processor Peripherals AXI Bus Conclusion

### Zynq: A Programmable SoC

The Zynq-7000 family is an All Programmable SoC (AP SoC) from Xilinx

- Complete ARM-based processing system
  - application processor unit (APU)
  - fully integrated memory controllers
  - I/O peripherals
- Tightly integrated programmable logic
  - used to extend the processing system
  - scalable density and performance
- Flexible array of I/O
  - wide range of external multi-standard I/O
  - high-performance integrated serial transceivers
  - analog-to-digital converter inputs

The slides are based on Xilinx Tutorials.




### Zynq SoC Block Diagram






### **Processor and Hardware Logic**

The Zynq-7000 SoC architecture consists of two major sections:

- PS: processing system
  - dual ARM Cortex-A9 processors, 866 MHz to 1 GHz frequency
  - multiple peripherals
  - hard silicon core
- PL: programmable logic
  - shares the same programmable logic as the 7 series FPGAs
  - logic cells: 28k to 444k (430k to 6.6M gates)
  - flip-flops: 35k to 554k
  - DSP/MAC: 80 to 2020
  - peak DSP performance: 100 to 2622 GMACs
  - AD converter: two 12-bit




### **ARM Processor Architecture**

- ARM Cortex-A9 processor implements the ARMv7-A architecture
  - ARMv7 is the ARM instruction set architecture (ISA)
  - ARMv7-A: application set that includes support for a memory management unit (MMU)
  - ARMv7-R: real-time set that includes support for a memory protection unit (MPU)
  - ARMv7-M: microcontroller set that is the smallest set
- ARMv7 ISA includes the following types of instructions (for backward compatibility)
  - Thumb instructions: 16 bits, Thumb-2 instructions: 32 bits
  - NEON: ARM's single instruction, multiple data (SIMD) instructions
- ARM advanced microcontroller bus architecture (AMBA) protocol
  - AXI3: third-generation ARM interface
  - AXI4: extends the existing AXI definitions (extended bursts, subsets)
- Cortex is the new family of processors


### **ARM Cortex-A9 Processor Power**

- dual-core processor cluster
- 2.5 DMIPS/MHz per processor
- Harvard architecture
- self-contained 32 KB L1 caches for instruction and data
- external memory-based 512 KB L2 cache
- automatic cache coherency between processor cores
- 1 GHz operation (fastest speed grade)
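
As a rough worked example, the per-core figure and the top clock rate quoted above can be multiplied out. This is only a back-of-the-envelope sketch: the function name `peak_dmips` is made up for illustration, and the two-core total assumes perfect scaling across cores, which real workloads rarely reach.

```python
# Figures quoted on the slide
DMIPS_PER_MHZ = 2.5   # Dhrystone MIPS per MHz, per processor
CLOCK_MHZ = 1000      # 1 GHz, fastest speed grade
CORES = 2             # dual-core cluster


def peak_dmips(dmips_per_mhz, clock_mhz, cores=1):
    """Upper-bound Dhrystone throughput (assumes ideal scaling across cores)."""
    return dmips_per_mhz * clock_mhz * cores


per_core = peak_dmips(DMIPS_PER_MHZ, CLOCK_MHZ)
cluster = peak_dmips(DMIPS_PER_MHZ, CLOCK_MHZ, CORES)
print(f"per core: {per_core:.0f} DMIPS, cluster upper bound: {cluster:.0f} DMIPS")
```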




### **Cortex-A9 Processor Micro-Architecture (1)**

- instruction pipeline supports out-of-order instruction issue and completion
- register renaming to enable execution speculation
- non-blocking memory system with load-store forwarding
- fast loop mode in instruction pre-fetch to lower power consumption




### **Cortex-A9 Processor Micro-Architecture (2)**

- variable-length, out-of-order, eight-stage, superscalar instruction pipeline
  - advanced pre-fetch with parallel branch pipeline enabling earlier branch prediction and resolution
- speculative execution
  - supports virtual renaming of ARM physical registers to remove pipeline stalls due to data dependencies
  - increased processor utilization and hiding of memory latencies
  - increased performance by hardware unrolling of code loops
  - reduced interrupt latency via speculative entry to the interrupt service routine (ISR)


### **Processor System Components**

- application processing unit (APU)
- I/O peripherals (IOP)
  - multiplexed I/O (MIO), extended multiplexed I/O (EMIO)
- memory interfaces
- PS interconnect
- DMA
- timers
- general interrupt controller (GIC)
- on-chip memory (OCM): RAM
- debug controller: CoreSight




### **Processor System Interconnect (1)**

- programmable logic to memory
  - two ports to DDR
  - one port to OCM SRAM
- central interconnect
  - enables other interconnects to communicate
- peripheral master
  - USB, GigE, and SDIO connect to DDR and PL via the central interconnect
- peripheral slave
  - CPU, DMA, and PL access to IOP






### **Processor System Interconnect (2)**

- processing system master
  - two ports from the processing system to programmable logic
  - connects the CPU block to common peripherals through the central interconnect



- processing system slave
  - two ports from programmable logic to the processing system






### **Memory Map**

- the Cortex-A9 processor uses 32-bit addressing
- all PS and PL peripherals are memory mapped to the Cortex-A9 processor cores
- all slave PL peripherals are located between 4000 0000 and 7FFF FFFF (connected to GP0) or between 8000 0000 and BFFF FFFF (connected to GP1)

| Address range (hex)    | Region                            |
|------------------------|-----------------------------------|
| FFFC 0000 to FFFF FFFF | OCM                               |
| FD00 0000 to FFFB FFFF | Reserved                          |
| FC00 0000 to FCFF FFFF | Quad SPI linear address           |
| F8F0 3000 to FBFF FFFF | Reserved                          |
| F890 0000 to F8F0 2FFF | CPU private registers             |
| F801 0000 to F88F FFFF | Reserved                          |
| F800 1000 to F880 FFFF | PS system registers               |
| F800 0C00 to F800 0FFF | Reserved                          |
| F800 0000 to F800 0BFF | SLCR registers                    |
| E600 0000 to F7FF FFFF | Reserved                          |
| E100 0000 to E5FF FFFF | SMC memory                        |
| E030 0000 to E0FF FFFF | Reserved                          |
| E000 0000 to E02F FFFF | IO peripherals                    |
| C000 0000 to DFFF FFFF | Reserved                          |
| 8000 0000 to BFFF FFFF | PL (MAXI_GP1)                     |
| 4000 0000 to 7FFF FFFF | PL (MAXI_GP0)                     |
| 0010 0000 to 3FFF FFFF | DDR (address not filtered by SCU) |
| 0004 0000 to 000F FFFF | DDR (address filtered by SCU)     |
| 0000 0000 to 0003 FFFF | OCM                               |
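
To make the memory map concrete, the following sketch classifies a 32-bit address by the window it falls into. It covers only a subset of the windows listed above (OCM, DDR, the two PL windows, and IO peripherals). The names `REGIONS` and `decode` and the example address `0x43C0_0000` are illustrative assumptions, not part of any Xilinx API.

```python
# Inclusive address windows copied from the memory map (subset only).
REGIONS = [
    (0x0000_0000, 0x0003_FFFF, "OCM"),
    (0x0004_0000, 0x000F_FFFF, "DDR (address filtered by SCU)"),
    (0x0010_0000, 0x3FFF_FFFF, "DDR (address not filtered by SCU)"),
    (0x4000_0000, 0x7FFF_FFFF, "PL (MAXI_GP0)"),
    (0x8000_0000, 0xBFFF_FFFF, "PL (MAXI_GP1)"),
    (0xE000_0000, 0xE02F_FFFF, "IO peripherals"),
    (0xFFFC_0000, 0xFFFF_FFFF, "OCM"),
]


def decode(addr):
    """Return the name of the window containing a 32-bit address."""
    if not 0 <= addr <= 0xFFFF_FFFF:
        raise ValueError("the Cortex-A9 uses 32-bit addressing")
    for low, high, name in REGIONS:
        if low <= addr <= high:
            return name
    return "not listed in this sketch"


# Hypothetical base address of a slave peripheral placed in the PL
print(decode(0x43C0_0000))
```

A PL peripheral's base address therefore also tells you which general-purpose master port it hangs off: anything in the 4000 0000 window is reached through GP0, anything in the 8000 0000 window through GP1.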


### **Memory Resources**

- on-chip memory (OCM)
  - RAM
  - boot ROM
- DDRx dynamic memory controller
  - supports LPDDR2, DDR2, DDR3
- flash/static memory controller
  - supports SRAM, QSPI, NAND/NOR flash




### **PS Boots First**

- CPU0 boots from OCM ROM; CPU1 goes into a sleep state
- on-chip boot loader in OCM ROM (stage 0 boot)
- processor loads the First Stage Boot Loader (FSBL) from external flash memory
  - NOR
  - NAND
  - Quad SPI
  - SD card
  - JTAG: not a memory device; used for development/debug only
  - boot source selected via package bootstrapping pins
- optional secure boot mode allows the loading of encrypted software from the flash boot memory




### **Configuring the PL**

- the programmable logic is configured after the PS boots
- performed by application software that accesses the hardware device configuration unit
  - bitstream image transferred
  - 100 MHz, 32-bit PCAP stream interface
  - decryption/authentication hardware option for encrypted bitstream (in secure boot mode, this option can be used for software memory load)
  - built-in DMA allows simultaneous PL configuration and OS memory loading
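
The PCAP figures above imply a peak transfer rate, which in turn bounds how quickly a bitstream can be loaded. The sketch below works through that arithmetic. The 4 MiB bitstream size is a made-up example value, not a figure for any specific device, and real configuration times are longer because of DMA setup and memory latency.

```python
PCAP_CLOCK_HZ = 100_000_000   # 100 MHz, from the slide
PCAP_WIDTH_BITS = 32          # 32-bit stream interface, from the slide


def pcap_peak_bytes_per_s(clock_hz=PCAP_CLOCK_HZ, width_bits=PCAP_WIDTH_BITS):
    """Peak throughput if one word is transferred on every clock cycle."""
    return clock_hz * width_bits // 8


def min_config_time_s(bitstream_bytes):
    """Lower bound on PL configuration time at peak PCAP throughput."""
    return bitstream_bytes / pcap_peak_bytes_per_s()


EXAMPLE_BITSTREAM_BYTES = 4 * 1024 * 1024   # hypothetical 4 MiB image

print(pcap_peak_bytes_per_s() / 1e6, "MB/s peak")
print(round(min_config_time_s(EXAMPLE_BITSTREAM_BYTES) * 1e3, 2), "ms minimum")
```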




### Input/Output Peripherals

[Figure: I/O peripherals block diagram. Arrow direction shows control (master to slave); data flows in both directions. Legend: IOP master, IOP slave; AHB 32-bit, APB 32-bit; MIO only, EMIO only, MIO or EMIO.]

- two GigE
- two USB
- two SPI
- two SD/SDIO
- two CAN
- two I2C
- two UART
- four 32-bit GPIO
- static memories
  - NAND, NOR/SRAM, Quad SPI
- trace ports

### Multiplexed I/O (MIO)

- external interface to PS I/O peripheral ports
  - 54 dedicated package pins available
  - software configurable
    - automatically added to bootloader by tools
  - not available for all peripheral ports
    - some ports can only use EMIO




### Extended Multiplexed I/O (EMIO)

- extended interface to PS I/O peripheral ports
  - EMIO: peripheral port to programmable logic
  - alternative to using MIO
  - mandatory for some peripheral ports
  - facilitates
    - connection to peripheral in programmable logic
    - use of general I/O pins to supplement MIO pin usage




### **PS-PL Interfaces (1)**

- AXI high-performance slave ports (HP0-HP3)
  - configurable 32-bit or 64-bit data width
  - access to OCM and DDR only
  - conversion to processing system clock domain
  - AXI FIFO interfaces (AFI) are 1 KB FIFOs that smooth large data transfers
- AXI general-purpose ports (GP0-GP1)
  - two masters from PS to PL
  - two slaves from PL to PS
  - 32-bit data width
  - conversion and sync to processing system clock domain




### **PS-PL Interfaces (2)**

- one 64-bit accelerator coherency port (ACP): AXI slave interface to CPU memory
- DMA, interrupts, event signals
  - processor event bus for signaling event information to the CPU
  - PL peripheral IP interrupts to the PS general interrupt controller (GIC)
  - four DMA channel RDY/ACK signals
- extended multiplexed I/O (EMIO) allows PS peripheral ports access to PL logic and device I/O pins
- clock and resets
  - four PS clock outputs to the PL with enable control
  - four PS reset outputs to the PL
- configuration and miscellaneous


### **PL Clocking Sources**

- PS clocks
  - PS clock source from external package pin
  - PS has three PLLs for clock generation
  - PS has four clock ports to PL
- PL has 7 series clocking resources
  - PL has a different clock source domain compared to the PS
  - the clock to PL can be sourced from external clock-capable pins
  - one of the four PS clock sources can be used for the PL
- PS architecture synchronizes the clock between PL and PS
- PL cannot supply a clock source to PS
- GUI for PL and PS clock definition




### **Clocking the PL**




Rev. - 1.22

### **Zynq Resets**

- internal resets
  - power-on reset (POR)
  - watchdog resets from the three watchdog timers
  - secure violation reset
- PS resets
  - external resets: PS\_SRST\_B
  - warm reset: SRSTB
- PL resets
  - four reset outputs from PS to PL
  - FCLK\_RESET[3:0]




### AXI is Part of ARM's AMBA Bus




### **Basic AXI Signaling - 5 Channels**




### The AXI Interface - AXI4-Lite




- no burst
- data width 32 or 64 bits
  - Xilinx IP only supports 32 bits
- very small footprint
- bridging to AXI4 handled automatically by AXI\_Interconnect

### The AXI Interface - AXI4

- sometimes called "Full AXI" or "memory mapped"
  - not ARM-sanctioned names
- single address multiple data
  - burst up to 256 data beats
- data width parameterizable
  - up to 1024 bits
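
To illustrate what "single address, multiple data" means for a large transfer, the sketch below splits an aligned transfer into AXI4 incrementing bursts. It applies the 256-beat limit quoted above plus one rule that comes from the AMBA AXI specification rather than from these slides: a burst must not cross a 4 KB address boundary. The function name `split_into_bursts` is hypothetical, and the sketch handles only aligned transfers.

```python
MAX_BEATS = 256      # AXI4 burst limit quoted on the slide
BOUNDARY = 4096      # AXI specification: a burst must not cross a 4 KB boundary


def split_into_bursts(addr, nbytes, width_bits):
    """Split an aligned transfer into (start_address, beats) bursts."""
    beat_bytes = width_bits // 8
    if addr % beat_bytes or nbytes % beat_bytes:
        raise ValueError("this sketch handles aligned transfers only")
    bursts = []
    while nbytes > 0:
        to_boundary = BOUNDARY - (addr % BOUNDARY)
        chunk = min(nbytes, MAX_BEATS * beat_bytes, to_boundary)
        bursts.append((addr, chunk // beat_bytes))
        addr += chunk
        nbytes -= chunk
    return bursts


# 4 KB over a 32-bit bus: four bursts of 256 beats (1 KB each)
for start, beats in split_into_bursts(0x1000_0000, 4096, 32):
    print(hex(start), beats)
```

With a 64-bit bus the same 4 KB needs only two bursts, which is one reason wider data paths suit bulk transfers.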




### The AXI Interface - AXI4-Stream

- no address channel, no read and write, always just master to slave
  - effectively an AXI4 "write data" channel
- unlimited burst length
  - AXI4 max 256
  - AXI4-Lite does not burst
- virtually same signaling as AXI data channels
  - protocol allows merging, packing, width conversion
  - supports sparse, continuous, aligned, unaligned streams
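
The data-channel signaling mentioned above rests on a valid/ready handshake. The signal names TVALID and TREADY come from the AMBA AXI4-Stream specification, not from these slides. A data beat moves only in a cycle where both are high. The following is a cycle-level sketch under simplifying assumptions: the master always has data ready, and `stream_transfer` is a made-up helper name.

```python
def stream_transfer(data, tready_per_cycle):
    """Simulate a master streaming `data` to a slave, one cycle per list entry.

    The master keeps TVALID high (and holds the current beat) until the
    slave accepts it by asserting TREADY in the same cycle.
    Returns the beats received and the number of cycles used.
    """
    received = []
    cycles = 0
    index = 0
    for tready in tready_per_cycle:
        if index == len(data):
            break                    # nothing left to send
        tvalid = True                # master always has a beat to offer
        cycles += 1
        if tvalid and tready:        # handshake: the beat is transferred
            received.append(data[index])
            index += 1
    return received, cycles


# The slave stalls in the second cycle, so three beats take four cycles.
print(stream_transfer([1, 2, 3], [True, False, True, True, False]))
```

Because there is no address phase, nothing limits how long this loop can run, which is what "unlimited burst length" amounts to in practice.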






### **The AXI Interface - Streaming Applications**

- may not have packets
  - e.g., digital up converter
  - no concept of address
  - free-running data (in this case)
  - AXI4-Stream would optimize to a very simple interface (in this case)
- may have packets
  - e.g., PCIe
  - their packets may contain different information
  - typically bridge logic of some sort is needed


### Conclusion

- the Zynq-7000 processing platform is a system-on-chip (SoC) processor with embedded programmable logic
- the processing system (PS) is a hard silicon dual core consisting of
  - an application processor unit
    - two ARM Cortex-A9 processors
    - NEON co-processor
    - general interrupt controller
    - general and watchdog timers
  - I/O peripherals
  - external memory interfaces
- the programmable logic (PL) is based on 7 series devices
- high-performance AXI4 point-to-point interface
- tightly coupled AXI4 ports interfacing PS and PL
- PS boots from a selection of external memory devices
- PL is configured by the PS, after the PS boots
- PS provides clocking resources to PL
