Training - PowerPC 970FX (G5 IBM) (reference 003150A)
 
    Partners
  • This training course is approved by IBM Microelectronics
  • Practical exercises are built with the Diab Data compiler and downloaded to a PPC970 target board through the Vision probe
  • The SingleStep debugger is used to control code execution
   
           
      Related trainings
  • The HyperTransport bus training (reference 003153A) may be recommended since most PPC970FX boards are based on this bus
  • MVD also delivers training courses on embedded operating systems that can be useful: Embedded Linux, OSEK
   
           
    Prerequisites
  • Attendees are assumed to be familiar with 32-bit PowerPC architecture
   
             
  Course Objectives
  • The course details the pipeline operation in order to determine code optimisation guidelines
  • Data flows between SDRAM, L1 caches and L2 cache are highlighted
  • The MERSI cache coherency protocol is introduced in increasing depth
  • The operation of the elastic bus is described
  • Using an FFT algorithm, the instructor shows how to vectorize processing and reduce execution time with data streaming
  • The performance monitor is used to optimise the performance of the FFT
   
           
    Duration
  • 5-day course
   
           
    Topics

(The full description of this course can be provided on request)

OVERVIEW

  • Functional units
  • Key features

PPC970 PIPELINE

  • Pipeline basics
  • Deeply pipelined design, superscalar implementation, register renaming
  • Branch prediction mechanism
  • Instruction decode and preprocessing
  • Instruction dispatch, sequencing and completion control, register renaming
  • Dispatch group organization
  • Synchronization-based instruction grouping
  • Instruction latencies and throughputs
  • Software optimisation guidelines

MEMORY MANAGEMENT UNIT

  • MMU goals
  • Data address translation, 128-entry Data ERAT, ERAT Miss Queue
  • Second-level Memory Management Unit consisting of SLB and TLB
  • 1024-entry 4-way set associative TLB, 64-entry fully associative SLB
  • Large page support
  • Real memory limit register
  • Hypervisor vs supervisor
  • Support for 32-bit operating systems

INTERNAL DATA FLOWS

  • Data paths between load / store units, instruction queue, L2 and external bus
  • Out-of-order and speculative issue of load operations
  • 32-entry real address based store queues
  • 32-entry load re-order queue, tracking of the order of loads
  • 8-entry load miss queue
  • GUS subsystem
  • Core Interface Unit
  • L2 cache controller
  • Non Cacheable Unit
  • Storage access ordering
  • Hardware controlled data prefetch
  • Prefetch startup sequence, stream detection
  • Synchronization instructions sync, lwsync, ptesync

L1 AND L2 CACHES

  • Cache basics
  • 64 kB direct-mapped instruction cache
  • 32 kB 2-way set associative data cache, FIFO replacement policy, Store-through policy
  • 512 kB L2 cache, fully inclusive of L1 data caches, MERSI coherency protocol
  • Cache coherency, MERSI cache line state, cache state transition tables

PROGRAMMING

  • Branch instructions
  • The system call communication path between applications and RTOS
  • Integer load / store instructions
  • Integer arithmetic and logic instructions
  • IEEE754 basics
  • FPU operation: FPSCR register
  • Float load / store instructions, floating point exceptions
  • Float arithmetic instructions
  • The EABI
  • Code and data sections, small data areas benefits
  • 970FX specific registers

THE PERFORMANCE MONITOR

  • Objectives
  • Event selection
  • Configuring the performance monitor bus
  • Instruction matching and sampling, the 3 stages of eligibility

EXCEPTION MECHANISM

  • Exception recognition and priorities
  • Focus on soft patch and maintenance exceptions
  • Register updates according to the exception cause
  • Requirements to support exception nesting
  • Precise processing of machine check exceptions

VMX IMPLEMENTATION

  • VMX introduction, SIMD processing
  • Intra vs inter element instructions
  • VMX registers, VSCR initialization
  • ANSI C extension to support vector operators, new C types, new castings, vector declaration and initialization
  • VMX implementation on the PPC970FX
  • Data streams management
  • EABI extension to support VMX

POWER AND THERMAL MANAGEMENT

  • Clocking, PLL design
  • Time Base and decrementer
  • Frequency and voltage scaling
  • Additional dynamic power management

HARDWARE IMPLEMENTATION

  • Unidirectional point-to-point bus segments, source synchronized transfers
  • Packet protocols
  • Snoop response
  • Pipelined transactions
  • Power-on procedure
  • Electrical interface
   
           
    Documentation

Training manuals are given to participants during the course. Precise and easy to use, these notes can serve as a reference afterwards.
   
           
    Other trainings

To learn about our other training courses and their contents, you can consult or download our complete course list on this page: Training courses - General presentation