Planning

Lectures

Course Presentation

Presentation of the course.

Objectives, program, planning and classes, and evaluation method.

Introduction to Multiprocessors in FPGAs

FPGA-based Multiprocessors

Cores from FPGA Companies

Embedded System Design Flow on Zynq

Zynq Architecture

Zynq All Programmable SoC (AP SoC)
Zynq AP SoC Processing System (PS)
Processor Peripherals
Clock Features
AXI Interfaces

Intro to FPGA

Introduction to programmable logic devices and FPGAs.

Artix-7 architecture and main elements.

Adding PL Hardware to the Embedded System

Communication between PS and PL
Managing IP components
Creating base IP with AXI-Lite Interface
Add IP to the Zynq System
Evaluating resource usage and timing

Basic Hardware Architecture

Register-Transfer Level Methodology
FSMD = control unit + datapath
Algorithm to hardware: direct “dataflow” implementation, datapath with shared operators.

Simple Multiply-Accumulate IP

Designing a Multiply-Accumulate hardware component to execute vector dot products.
VHDL Specification of the Unit: Simple Registers, Accumulator and Arithmetic Operators.
Packages for Numeric Operations.
Component Definition and Instantiation.
Control Signal Generation from the AXI-Lite interface.
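
As a software reference for the multiply-accumulate unit described above, the following is a minimal behavioral sketch of a dot product computed one multiply-accumulate step at a time. It is not the course's VHDL design: the function name `mac_dot_product` is hypothetical, and it assumes unbounded Python integers rather than a fixed-width accumulator register.

```python
def mac_dot_product(a, b):
    """Behavioral model of a multiply-accumulate (MAC) dot product.

    Assumption: operands are plain Python integers, so the accumulator
    never overflows as a fixed-width hardware register would.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    acc = 0  # accumulator, cleared before the first element
    for x, y in zip(a, b):
        acc += x * y  # one multiply-accumulate step per element pair
    return acc


if __name__ == "__main__":
    print(mac_dot_product([1, 2, 3], [4, 5, 6]))
```

A model like this can serve as the expected-value generator when checking the results read back from the hardware unit.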

Custom HW components

Codesign Development
Amdahl's law and Hardware Speedup
HW/SW Communication
Performance Comparisons
Stream-Based Interfaces
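
For the Amdahl's law topic above, a minimal sketch of the usual speedup estimate may help: if a fraction f of the execution time is moved to hardware and that part runs s times faster, the overall speedup is 1 / ((1 - f) + f / s). The function and parameter names below are illustrative assumptions, and communication overhead between HW and SW is ignored.

```python
def amdahl_speedup(accelerated_fraction, hw_speedup):
    """Overall speedup predicted by Amdahl's law.

    accelerated_fraction: share of the original run time moved to hardware (0..1).
    hw_speedup: how many times faster that share runs in hardware (> 0).
    Assumption: HW/SW communication overhead is not modeled.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if hw_speedup <= 0:
        raise ValueError("hw_speedup must be positive")
    remaining = 1.0 - accelerated_fraction          # part still run in software
    accelerated = accelerated_fraction / hw_speedup  # part run in hardware
    return 1.0 / (remaining + accelerated)


if __name__ == "__main__":
    # 90% of the run time accelerated 10x in hardware
    print(amdahl_speedup(0.9, 10.0))
```

Even with a very fast accelerator, the software part that remains bounds the achievable speedup at 1 / (1 - f).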

Stream-Based IP

Stream-Based Datapath Unit.
Adding BRAM IP
AXI-Stream Control Unit.
AXI-Stream FIFO
HW/SW System using GP0.
Software Drivers.

Floating-Point IP

Floating-Point IP cores.
Floating point standard formats and main arithmetic block components.
Multiplier and Accumulator cores. Non-blocking and Blocking modes.

HW IP Simulation

Design and Simulation Flow.
VHDL testbenches for simulating the integer matrix product and real matrix product IPs.
Waveform Visualization.

Hardware Debugging

In-System Logic Design Debugging.
Probing phase steps. Insertion of debug cores. Setting up debug. Integrated Logic Analyzer options.
Test in Hardware.

AXI bus

Advanced eXtensible Interface (AXI) bus.
Basic AXI Transactions. AXI-Full, AXI-Lite and AXI-Stream Signals.
AXI Interconnect block.
AXI-Lite vs. AXI-Stream.

Zynq Memory Resources

On-chip memory (OCM): RAM, Boot ROM.
DDR3 external memory.
APU Memory Hierarchy, L1 and L2 caches.
OCM / DDR Address Map.
Linker Script – Use Example.

Direct Memory Access

Using Direct Memory Access to transfer data shared between SW and HW.
AXI DMA IP.
Connecting the AXI-Stream HW IP.
Example Application.

Matprod IP simple DMA-based Architecture

Interfacing the Matprod AXI-Stream IP with a DMA AXI Component.

Dual Microprocessor System Design

Software Development for the dual-ARM Zynq platform.
Configuration management.
Shared memory management.
Dual Matrix-Product Test Application.

Zynq Cache System

Zynq Application Processing Unit Memory Hierarchy.
Shared Memory and Cache Issues.
Cortex A9 Processor Cache Functions.

Support for First Project Completion

Support for completing the first project (in LSD2).

Support for First Project Completion

Support for completing the first project (in LSD2).

Architectures for PL Accelerators

Multi-processor for sparse matrix applications - design example 1.
Architecture, processing elements and data movements.
Implementation, resource utilization and performance analysis.

A multi-processor for cluster analysis - design example 2.
K-means clustering application.
Architecture, processing elements and data movements.
Implementation, resource utilization and performance analysis.

Not Taught.

Following the government's decision to declare May 12th a day off,
it was decided to extend this measure to the Técnico community.
All academic activities will be suspended on May 12th.

M. Fátima Montemor
Vice-President for Academic Affairs

On-Chip Communication Architectures

Bus based Communication Architectures. Main Topologies, Decoding and Arbitration.
Standard Bus Architectures.
Network-on-Chip (NoC). Definitions and Terminology. Generic Routers. Switching strategies.
Intel 80-Tile Teraflop NoC design.

Summary on HW/SW Codesign

Factors driving codesign.
HW vs. SW Performance trade-offs.
Abstraction Levels for codesign models.
Codesign Methodology.
State of Codesign Technology.
SystemC for Co-Specification.

Hardware (Software) Challenges

C-to-Hardware design tools.
High-Level Synthesis versus OpenCL.
FPGA Overlay Architectures.

Support for Final Project

Support for completing the final project (in LSD2).

Support for Final Project

Support for completing the final project (in LSD2).