Quick-and-dirty knowledge base for ODU RCS.

Notes on Floating Point Precisions in Deep Learning Computations

ECCV 2020 Tutorial on Accelerating Computer Vision with Mixed Precision

https://nvlabs.github.io/eccv2020-mixed-precision-tutorial/

Topics of the tutorial:

  • Training Neural Networks with Tensor Cores
  • PyTorch Performance Tuning Guide
  • Mixed Precision Training for Conditional GANs
  • Mixed Precision Training for FAZE: Few-shot Adaptive Gaze Estimation
  • Mixed Precision Training for Video Synthesis
  • Mixed Precision Training for Convolutional Tensor-Train LSTM
  • Mixed Precision Training for 3D Medical Image Analysis

The tutorial page has PDFs of the slides and the videos.

Q&A:

Q: What's the difference between FP32 and TF32 modes?

A: FP32 cores perform scalar instructions. TF32 is a Tensor Core mode that performs matrix instructions, which are 8-16x faster and more energy efficient. Both modes take FP32 values as inputs; TF32 mode additionally rounds those inputs to TF32 precision.
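
To make the "rounds those inputs to TF32" part concrete, here is a rough pure-Python sketch that emulates the precision loss. It assumes the TF32 layout of 1 sign bit, 8 exponent bits, and 10 mantissa bits (versus 23 mantissa bits in FP32), which is not stated in the answer above. The function name `round_to_tf32` is made up for this note, and round-to-nearest-even is an assumption; the actual hardware rounding behavior may differ.

```python
import struct

def round_to_tf32(x: float) -> float:
    """Emulate rounding an FP32 value to TF32 precision (hypothetical helper).

    Assumes TF32 keeps FP32's 8 exponent bits but only 10 mantissa bits,
    and assumes round-to-nearest-even (the hardware may round differently).
    """
    # Reinterpret the value as its 32-bit FP32 bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]

    # Leave infinities and NaNs untouched (exponent field is all ones).
    if (bits & 0x7F800000) == 0x7F800000:
        return struct.unpack("<f", struct.pack("<I", bits))[0]

    # FP32 has 23 mantissa bits and TF32 keeps 10, so 13 low bits are dropped.
    drop = 23 - 10

    # Round to nearest, ties to even: add just under half a unit of the
    # kept precision, plus one more if the lowest kept bit is currently odd.
    bits += (1 << (drop - 1)) - 1 + ((bits >> drop) & 1)

    # Clear the dropped bits.
    bits &= ~((1 << drop) - 1) & 0xFFFFFFFF

    return struct.unpack("<f", struct.pack("<I", bits))[0]

# Values needing at most 10 mantissa bits pass through unchanged.
print(round_to_tf32(1.0 + 2.0 ** -10))
# Values needing more mantissa bits lose their low-order bits.
print(round_to_tf32(0.1))
```

Under these assumptions the relative rounding error of a single input is at most 2^-11, compared with 2^-24 for FP32 itself. The range of representable magnitudes is unchanged because the exponent width is the same.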