Notes on Floating Point Precisions in Deep Learning Computations
ECCV 2020 Tutorial on Accelerating Computer Vision with Mixed Precision
https://nvlabs.github.io/eccv2020-mixed-precision-tutorial/
Topics of the tutorial:
- Training Neural Networks with Tensor Cores
- PyTorch Performance Tuning Guide
- Mixed Precision Training for Conditional GANs
- Mixed Precision Training for FAZE: Few-shot Adaptive Gaze Estimation
- Mixed Precision Training for Video Synthesis
- Mixed Precision Training for Convolutional Tensor-Train LSTM
- Mixed Precision Training for 3D Medical Image Analysis
Includes PDFs of the slides and videos of the talks. A minimal training sketch is given below.
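As a rough illustration of the tutorial's subject (this sketch is mine, not taken from the slides), mixed precision training in PyTorch is typically done with torch.cuda.amp: autocast picks a lower precision for eligible ops, and GradScaler scales the loss so FP16 gradients don't underflow. The model and data here are placeholder toys.

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(128, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()  # rescales the loss so small FP16 gradients don't underflow

inputs = torch.randn(32, 128, device="cuda")     # dummy batch
targets = torch.randint(0, 10, (32,), device="cuda")

for _ in range(100):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():    # eligible ops (e.g. matmuls) run in FP16, the rest stay FP32
        loss = F.cross_entropy(model(inputs), targets)
    scaler.scale(loss).backward()      # backward pass on the scaled loss
    scaler.step(optimizer)             # unscales gradients; skips the step if inf/nan is found
    scaler.update()                    # adjusts the scale factor adaptively
```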
Q&A:
Q: What's the difference between FP32 and TF32 modes?
A: FP32 cores execute scalar instructions, while TF32 is a Tensor Core mode that executes matrix instructions, which are 8-16x faster and more energy-efficient. Both take FP32 inputs; in TF32 mode the Tensor Cores first round those inputs to TF32 (which keeps FP32's 8-bit exponent but truncates the mantissa to 10 bits) and accumulate the products in FP32.
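For a concrete example (mine, not from the tutorial): on Ampere and newer GPUs, PyTorch exposes TF32 mode through backend flags, so FP32 code can use Tensor Cores without any changes to the tensors themselves.

```python
import torch

# Allow FP32 matmuls and cuDNN convolutions to run in TF32 on Tensor Cores.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

a = torch.randn(1024, 1024, device="cuda")
b = torch.randn(1024, 1024, device="cuda")
c = a @ b  # may execute in TF32 mode; inputs and outputs remain ordinary FP32 tensors
```

Note that these flags trade a small amount of precision (the 10-bit TF32 mantissa) for throughput; setting them to False forces full-precision FP32 math.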