Bfloat16 floating-point format

The bfloat16 (brain floating point)^[1]^[2] floating-point format is a computer number format occupying 16 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. This format is a shortened (16-bit) version of the 32-bit IEEE 754 single-precision floating-point format (binary32), intended to accelerate machine learning and near-sensor computing.^[3] It preserves the approximate dynamic range of 32-bit floating-point numbers by retaining 8 exponent bits, but supports only 8 bits of significand precision rather than the 24 bits of the binary32 format. More so than single-precision 32-bit floating-point numbers, bfloat16 numbers are unsuitable for integer calculations, but this is not their intended use. Bfloat16 is used to reduce the storage requirements and increase the calculation speed of machine learning algorithms.^[4]
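The relationship to binary32 can be illustrated with a minimal Python sketch. It assumes the simplest possible conversion, which keeps the upper 16 bits of the binary32 pattern and discards the rest (truncation); real implementations often round to nearest even instead. The function names are hypothetical and chosen for this illustration only.

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    """Return a 16-bit bfloat16 pattern by truncating the binary32 encoding of x.

    Assumption: plain truncation (round toward zero), not round-to-nearest-even.
    """
    # Encode x as big-endian binary32 and read the 32 bits back as an integer.
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    # Keep the sign bit, the 8 exponent bits, and the top 7 stored significand bits.
    return bits32 >> 16

def bfloat16_bits_to_float(bits16: int) -> float:
    """Widen a bfloat16 pattern to binary32 by appending 16 zero bits (exact)."""
    (value,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return value

for x in (1.0, 3.14159265, 257.0):
    bits = float32_to_bfloat16_bits(x)
    print(f"{x} -> 0x{bits:04X} -> {bfloat16_bits_to_float(bits)}")
# 1.0 -> 0x3F80 -> 1.0
# 3.14159265 -> 0x4049 -> 3.140625
# 257.0 -> 0x4380 -> 256.0
```

The last line shows why the format is a poor fit for integer work: with 8 bits of precision, 257 cannot be represented and collapses to 256.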

The bfloat16 format was developed by Google Brain, an artificial intelligence research group at Google. It is used in many CPUs, GPUs, and AI processors, such as Intel Xeon processors (AVX-512 BF16 extensions), Intel Data Center GPU, Intel Nervana NNP-L1000, Intel FPGAs,^[5]^[6]^[7] AMD Zen, AMD Instinct, NVIDIA GPUs, Google Cloud TPUs,^[8]^[9]^[10] AWS Inferentia, AWS Trainium, ARMv8.6-A,^[11] and Apple's M2^[12] (and therefore A15) chips and later. Many libraries support bfloat16, such as CUDA,^[13] Intel oneAPI Math Kernel Library, AMD ROCm,^[14] AMD Optimizing CPU Libraries, PyTorch, and TensorFlow.^[10]^[15] On these platforms, bfloat16 may also be used in mixed-precision arithmetic, where bfloat16 numbers may be operated on and expanded to wider data types.
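The effect of mixed-precision arithmetic can be sketched in plain Python by emulating the formats with the standard `struct` module. This is an illustration under stated assumptions, not a model of any particular processor or library: conversion to bfloat16 is done by truncation, and the helper names (`to_float32`, `to_bfloat16`, `dot_mixed`, `dot_bf16_only`) are hypothetical.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

def to_bfloat16(x: float) -> float:
    """Truncate x to bfloat16 precision (assumption: truncation, not rounding)."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    # Clear the low 16 bits so only sign, exponent, and 7 significand bits remain.
    return struct.unpack(">f", struct.pack(">I", bits32 & 0xFFFF0000))[0]

def dot_mixed(a, b):
    """Dot product with bfloat16 inputs and a binary32 accumulator."""
    acc = 0.0
    for x, y in zip(a, b):
        # Inputs are narrowed to bfloat16; the running sum is kept in binary32.
        acc = to_float32(acc + to_bfloat16(x) * to_bfloat16(y))
    return acc

def dot_bf16_only(a, b):
    """Dot product where the accumulator is also narrowed to bfloat16."""
    acc = 0.0
    for x, y in zip(a, b):
        product = to_bfloat16(to_bfloat16(x) * to_bfloat16(y))
        acc = to_bfloat16(acc + product)
    return acc

ones = [1.0] * 512
print(dot_mixed(ones, ones))      # 512.0
print(dot_bf16_only(ones, ones))  # 256.0
```

In this emulation the bfloat16-only accumulator stalls at 256, because 257 is not representable with 8 bits of precision, while the wider accumulator reaches the exact result.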

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

Exponent	Significand zero	Significand non-zero	Equation
00_H	zero, −0	subnormal numbers	(−1)^signbit × 2^{−126} × 0.significandbits
01_H, ..., FE_H	normalized value	normalized value	(−1)^signbit × 2^{exponentbits−127} × 1.significandbits
FF_H	±infinity	NaN (quiet, signaling)
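The table can be read as a decoding procedure. The following Python sketch applies it to a raw 16-bit pattern, assuming the usual bfloat16 layout of 1 sign bit, 8 exponent bits, and 7 explicitly stored significand bits. The function name `decode_bfloat16` is hypothetical, and the sketch does not distinguish quiet from signaling NaNs.

```python
def decode_bfloat16(bits: int):
    """Classify a 16-bit bfloat16 pattern and return (kind, value)."""
    sign = (bits >> 15) & 0x1        # 1 sign bit
    exponent = (bits >> 7) & 0xFF    # 8 exponent bits, biased by 127
    significand = bits & 0x7F        # 7 stored significand bits
    s = -1.0 if sign else 1.0

    if exponent == 0x00:
        if significand == 0:
            return ("zero", s * 0.0)  # +0 or -0
        # Subnormal: no implicit leading 1, fixed exponent of -126.
        return ("subnormal", s * 2.0 ** -126 * (significand / 128))
    if exponent == 0xFF:
        if significand == 0:
            return ("infinity", s * float("inf"))
        return ("nan", float("nan"))
    # Normalized value: implicit leading 1 in front of the stored bits.
    return ("normal", s * 2.0 ** (exponent - 127) * (1 + significand / 128))

print(decode_bfloat16(0x3F80))  # ('normal', 1.0)
print(decode_bfloat16(0xC000))  # ('normal', -2.0)
print(decode_bfloat16(0x7F80))  # ('infinity', inf)
print(decode_bfloat16(0x0001))  # smallest positive subnormal, 2**-133
```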


bfloat16 floating-point format

Exponent encoding

Rounding and conversion

Encoding of special values

Positive and negative infinity

Not a Number

Range and precision

Examples

Zeros and infinities

Special values

NaNs

See also

References
