Course Outline

Performance Concepts and Metrics

  • Latency, throughput, power usage, resource utilization
  • System vs model-level bottlenecks
  • Profiling for inference vs training

Profiling on Huawei Ascend

  • Using CANN Profiler and MindInsight
  • Kernel and operator diagnostics
  • Offload patterns and memory mapping

Profiling on Biren GPU

  • Biren SDK performance monitoring features
  • Kernel fusion, memory alignment, and execution queues
  • Power and temperature-aware profiling

Profiling on Cambricon MLU

  • BANGPy and Neuware performance tools
  • Kernel-level visibility and log interpretation
  • MLU profiler integration with deployment frameworks

Graph and Model-Level Optimization

  • Graph pruning and quantization strategies
  • Operator fusion and computational graph restructuring
  • Input size standardization and batch tuning

Memory and Kernel Optimization

  • Optimizing memory layout and reuse
  • Efficient buffer management across chipsets
  • Kernel-level tuning techniques per platform

Cross-Platform Best Practices

  • Performance portability: abstraction strategies
  • Building shared tuning pipelines for multi-chip environments
  • Example: tuning an object detection model across Ascend, Biren, and MLU

Summary and Next Steps

Requirements

  • Experience working with AI model training or deployment pipelines
  • Understanding of GPU/MLU compute principles and model optimization
  • Basic familiarity with performance profiling tools and metrics

Audience

  • Performance engineers
  • Machine learning infrastructure teams
  • AI system architects

Duration: 21 hours

Delivery Options

Private Group Training

Our identity is rooted in delivering exactly what our clients need.

  • Pre-course call with your trainer
  • Customisation of the learning experience to achieve your goals:
    • Bespoke outlines
    • Practical hands-on exercises containing data / scenarios recognisable to the learners
  • Training scheduled on a date of your choice
  • Delivered online, onsite/classroom or hybrid by experts sharing real-world experience

Private group prices: RRP from €6840 for online delivery, based on a group of 2 delegates, plus €2160 per additional delegate (excludes any certification / exam costs). We recommend a maximum group size of 12 for most learning events.

Contact us for an exact quote and to hear about our latest promotions.


Public Training

Please see our public courses

Provisional Upcoming Courses (Contact Us For More Information)
