Notes on Floating Point Precisions in Deep Learning Computations
================================================================

ECCV 2020 Tutorial on Accelerating Computer Vision with Mixed Precision
-----------------------------------------------------------------------

https://nvlabs.github.io/eccv2020-mixed-precision-tutorial/

Topics of the tutorial:

* Training Neural Networks with Tensor Cores
* PyTorch Performance Tuning Guide
* Mixed Precision Training for Conditional GANs
* Mixed Precision Training for FAZE: Few-shot Adaptive Gaze Estimation
* Mixed Precision Training for Video Synthesis
* Mixed Precision Training for Convolutional Tensor-Train LSTM
* Mixed Precision Training for 3D Medical Image Analysis

Has PDFs of the slides and the videos.

Q&A:

**What's the difference between FP32 and TF32 modes?**

FP32 cores perform scalar instructions. TF32 is a Tensor Core mode that performs matrix instructions; these are 8-16x faster and more energy efficient. Both take FP32 as inputs. TF32 mode also rounds those inputs to TF32.
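
A minimal sketch of how this distinction surfaces in PyTorch (assuming an Ampere-or-later NVIDIA GPU, where TF32 is available): PyTorch exposes global flags that decide whether FP32 matmuls are dispatched to TF32 Tensor Core instructions or to plain FP32 scalar arithmetic. The comparison at the end is illustrative only; the exact difference depends on the matrices.

```python
import torch

# TF32 keeps FP32's 8-bit exponent but rounds the mantissa to 10 bits,
# so inputs and outputs are still FP32 tensors; only the internal
# multiply precision changes. These flags are real PyTorch APIs
# (available since PyTorch 1.7).
torch.backends.cuda.matmul.allow_tf32 = True   # matmuls on Tensor Cores
torch.backends.cudnn.allow_tf32 = True         # cuDNN convolutions

a = torch.randn(1024, 1024, device="cuda")
b = torch.randn(1024, 1024, device="cuda")

tf32_result = a @ b  # executed as TF32 Tensor Core matrix instructions

# Disable TF32 to force the same matmul through FP32 scalar math.
torch.backends.cuda.matmul.allow_tf32 = False
fp32_result = a @ b

# The two results differ slightly because TF32 rounds the input mantissas.
print((tf32_result - fp32_result).abs().max())
```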