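Brushing up on floating points (also PDF), IEEE-754, and taking part in a discussion on floating-point rounding when converting to string, made me wonder: how can I get the maximum and minimum decimal values for a given floating-point number whose binary representations are equal?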
Disclaimer: for this discussion, I'd like to stick to 32-bit and 64-bit floating point as described by IEEE-754. I'm not interested in extended floating point (80 bits), quads (128 bits, IEEE-754-2008) or any other standard (IEEE-854).
Background: Computers are bad at representing 0.1 in binary representation. In C#, a float represents this as 3DCCCCCD internally (C# uses round-to-nearest) and a double as 3FB999999999999A. The same bit patterns are used for decimal 0.100000005 (float) and 0.1000000000000000124 (double), but not for 0.1000000000000000144 (double).
For convenience, the following C# code gives these internal representations:
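    // Returns the raw IEEE-754 bit pattern of a float as hexadecimal.
    string GetHex(float f)
    {
        return BitConverter.ToUInt32(BitConverter.GetBytes(f), 0).ToString("X");
    }

    // Returns the raw IEEE-754 bit pattern of a double as hexadecimal.
    string GetHex(double d)
    {
        return BitConverter.ToUInt64(BitConverter.GetBytes(d), 0).ToString("X");
    }

    // float: prints 3DCCCCCD
    Console.WriteLine(GetHex(0.1F));
    // double: prints 3FB999999999999A
    Console.WriteLine(GetHex(0.1));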
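To check which decimal strings end up in the same bit pattern, something like the following might help. It is only a sketch: the class and method names (BitPatternCheck, DoubleHex, FloatHex) are made up for illustration, and it assumes a runtime whose float.Parse and double.Parse are correctly rounded (as far as I know, .NET Core 3.0 and later).

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of any library.
static class BitPatternCheck
{
    // Parses a decimal string as a double and returns its raw bit pattern in hex.
    // Assumes double.Parse rounds correctly (round-to-nearest-even).
    public static string DoubleHex(string text) =>
        BitConverter.DoubleToInt64Bits(
            double.Parse(text, CultureInfo.InvariantCulture)).ToString("X");

    // Same idea for float, going through the byte representation.
    public static string FloatHex(string text) =>
        BitConverter.ToUInt32(
            BitConverter.GetBytes(float.Parse(text, CultureInfo.InvariantCulture)), 0)
            .ToString("X");
}
```

Comparing DoubleHex("0.1") with DoubleHex("0.1000000000000000124") and DoubleHex("0.1000000000000000144") then shows directly whether two decimals share a representation. Parsing from a string with the invariant culture avoids depending on the machine's decimal separator.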
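In the case of 0.1, there is no lower decimal number that is represented with the same bit pattern; any 0.99...99 will yield a different bit representation (e.g. the float for 0.999999937 yields 3F7FFFFF internally).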
My question is simple: how can I find the lowest and highest decimal values for a given float (or double) that are internally stored in the same binary representation?
Why: (I know you'll ask) to find the error in rounding in .NET when it converts to a string and when it converts from a string, to find the internal exact value and to understand my own rounding errors better.
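My guess is something like this: take the mantissa, drop the rest, get its exact value, get the value one (mantissa bit) higher, and calculate the mean: anything below that will yield the same bit pattern. My main problem is how to get the fractional part as an integer (bit manipulation is not my strongest asset). Jon Skeet's DoubleConverter class may be helpful.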
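The guess above could be sketched roughly as follows for doubles. This is a minimal sketch, not a definitive implementation: RoundingInterval, Bounds and ToExactDecimal are hypothetical names, only positive finite doubles are handled, and it assumes round-to-nearest-even, so the two bounds themselves belong to the interval only when the mantissa is even. It pulls the mantissa out as an integer m and the exponent e such that the value is exactly m * 2^e, then takes the midpoints to the neighbouring doubles with BigInteger arithmetic.

```csharp
using System;
using System.Numerics;

// Hypothetical helper, not part of any library.
static class RoundingInterval
{
    // Exact decimal expansion of n * 2^exp for n >= 0.
    static string ToExactDecimal(BigInteger n, int exp)
    {
        if (exp >= 0) return (n << exp).ToString();
        int k = -exp;
        // n / 2^k == n * 5^k / 10^k: scale by 5^k, then place the decimal point.
        string digits = (n * BigInteger.Pow(5, k)).ToString().PadLeft(k + 1, '0');
        string s = digits.Substring(0, digits.Length - k) + "."
                 + digits.Substring(digits.Length - k);
        return s.TrimEnd('0').TrimEnd('.');
    }

    // Returns the exact value of d and the midpoints to its two neighbours.
    // Decimals strictly between Lower and Upper round to d.
    public static (string Lower, string Exact, string Upper) Bounds(double d)
    {
        if (double.IsNaN(d) || double.IsInfinity(d) || d <= 0)
            throw new ArgumentException("sketch handles positive finite doubles only");

        long bits = BitConverter.DoubleToInt64Bits(d);
        int biasedExp = (int)((bits >> 52) & 0x7FF);       // 11 exponent bits
        long fraction = bits & 0x000FFFFFFFFFFFFFL;        // 52 stored mantissa bits

        // Normal numbers have an implicit leading 1; subnormals do not.
        long m = biasedExp == 0 ? fraction : fraction | (1L << 52);
        int e = (biasedExp == 0 ? 1 : biasedExp) - 1075;   // d == m * 2^e exactly

        // Just below a power of two the spacing is half as wide.
        bool gapHalvesBelow = fraction == 0 && biasedExp > 1;

        BigInteger big = m;
        string lower = gapHalvesBelow
            ? ToExactDecimal(4 * big - 1, e - 2)
            : ToExactDecimal(2 * big - 1, e - 1);
        string upper = ToExactDecimal(2 * big + 1, e - 1);
        return (lower, ToExactDecimal(big, e), upper);
    }
}
```

For example, RoundingInterval.Bounds(0.1).Exact gives the exact stored value of 0.1, and Lower and Upper give the decimal range around it. The same approach should carry over to float with 23 mantissa bits, 8 exponent bits and a bias of 127, though that variant is not shown here.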