Get the digits of a long double

Question

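I'm trying to implement a simple version of a function similar to dtoa, but without arbitrary precision. I named it ftob, and it also handles number bases other than 10 (2 to 36). It works on long double, which on my machine is x86 extended precision.

The function works fine in general, but for some values, for example 2.56, it gives wrong and unacceptable results.

Here is my code: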

#include<stdio.h>
#include<math.h>

char *ftob(long double ld, char *str, unsigned short int n_digits, unsigned short int base)
{
    long double ftob_tmp;
    short unsigned index = 1;
    short const sign = (ld < 0.0L) ? -1 : 1;
    short int i = 0, j = 0 , k = 0;

    //check base, number.
    if(base < 2 || base > 36)
        return NULL;
    else if(ld == INFINITY) {
        str = "inf";
        return str;
    }
    else if(ld == -INFINITY) {
        str = "-inf";
        return str;
    }
    else if(__isnanl(ld)) {
        str = "nan";
        return str;
    }
    //initialisations
    (sign == -1) ? str[i++] = '-' : 0;
    ftob_tmp = sign * ld;
    while(ftob_tmp > 0.0L && ftob_tmp < 1.0L) {
        ftob_tmp *= base;
        j++;
    }
    while(ftob_tmp >= base) {
        ftob_tmp /= base;
        j--;
    }
    //reinitialise
    ftob_tmp = sign * ld;
    if(ftob_tmp >= 0.0L && ftob_tmp < 1.0L) {
        str[i++] = '0';
        str[i++] = '.';
        for(k = 0; k < j - 1 && k < n_digits - 1; k++)
            str[i++] = '0';
        n_digits -= j;
    }
    else if(ftob_tmp >= base)
        k = i - j + 1;
    else
        k = i + 1;
    ftob_tmp *= powl(base, --j);
//  printf("%0.20Lf\n", ftob_tmp); /*debug message*/

    //main loop
    for(n_digits += i; i < n_digits; i++) {
        if(i == k)
            str[n_digits++, i] = '.';
        else {
//          printf("%0.20Lf * %Lf = %0.20Lf\n", ftob_tmp, powl(base, index), ftob_tmp * powl(base, index)); /* debug message*/
            str[i] = (int)fmodl((ftob_tmp * powl(base, index++)), base);
            str[i] += (str[i] < 10) ? '0' : 'A' - 10;
        }
    }

    //finalise
    str[i] = '\0';

    return str;
}

int main(void)
{
    char ftl[300];

    printf("ftl = \"%s\"\n", ftob(2.56L, ftl, 19, 10));
    return 0;
}

The output of ftob(2.56L, ftl, 19, 10) is:

ftl = "2.550990990990990990"

Uncommenting the debug messages gives:

0.25599999999999999999
0.25599999999999999999 * 10.000000 = 2.55999999999999999995
0.25599999999999999999 * 100.000000 = 25.59999999999999999861
0.25599999999999999999 * 1000.000000 = 255.99999999999999998612
0.25599999999999999999 * 10000.000000 = 2560.00000000000000000000
0.25599999999999999999 * 100000.000000 = 25599.99999999999999822364
0.25599999999999999999 * 1000000.000000 = 255999.99999999999998578915
0.25599999999999999999 * 10000000.000000 = 2560000.00000000000000000000
0.25599999999999999999 * 100000000.000000 = 25599999.99999999999818101060
0.25599999999999999999 * 1000000000.000000 = 255999999.99999999998544808477
0.25599999999999999999 * 10000000000.000000 = 2560000000.00000000000000000000
0.25599999999999999999 * 100000000000.000000 = 25599999999.99999999813735485077
0.25599999999999999999 * 1000000000000.000000 = 255999999999.99999998509883880615
0.25599999999999999999 * 10000000000000.000000 = 2560000000000.00000000000000000000
0.25599999999999999999 * 100000000000000.000000 = 25599999999999.99999809265136718750
0.25599999999999999999 * 1000000000000000.000000 = 255999999999999.99998474121093750000
0.25599999999999999999 * 10000000000000000.000000 = 2560000000000000.00000000000000000000
0.25599999999999999999 * 100000000000000000.000000 = 25599999999999999.99804687500000000000
0.25599999999999999999 * 1000000000000000000.000000 = 255999999999999999.98437500000000000000
0.25599999999999999999 * 10000000000000000000.000000 = 2560000000000000000.00000000000000000000
ftl = "2.550990990990990990"

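The source of the error seems to be that 0.256 cannot be represented exactly in long double; its stored value is approximately 0.255999999999999999989374818710.
But I would be fine with getting the output: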

flt = "2.5599999999999999999"

instead of:

flt = "2.5600000000000000000"

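The problem is that in the fourth round of the "main loop" the product is arbitrarily rounded to 2560.00000, which causes str[i] to be set to 0 instead of 9. This happens because 2559.99999999999999... cannot be represented in long double either.
But I only need '2559' to be representable, so that str[i] can be set to 9 (and likewise in every round of the loop).

I'm asking for advice on how I can achieve this, or whether it is achievable at all.

Thanks in advance,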

Answer 1

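Rounding error magnified by mod

The product ftob_tmp * powl(...) may need to be rounded to the nearest long double, so it is not the exact mathematical result. Taking the modulus of this rounded product then returns sometimes 0 and sometimes 9 for the digits that follow 0.255 in 0.255999999999999999999.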

//                  v- rounding introduced error -v
str[i] = (int)fmodl((ftob_tmp * powl(base, index++)), base);
//            ^-- error magnified -----------------^

With more debug information, one can see that the digit is sometimes 0 and sometimes 9, where only 9 is expected.

printf("bbb %0.20Lf * %Lf = %0.20Lf  %d\n", 
    ftob_tmp, powl(base, index), ftob_tmp * powl(base, index), 
    (int) fmodl((ftob_tmp * powl(base, index++)), base));

bbb 0.25599999999999999999 * 100.000000 = 25.59999999999999999861  2
bbb 0.25599999999999999999 * 10000.000000 = 2560.00000000000000000000  5
bbb 0.25599999999999999999 * 1000000.000000 = 255999.99999999999998578915  9
bbb 0.25599999999999999999 * 100000000.000000 = 25599999.99999999999818101060  0
bbb 0.25599999999999999999 * 10000000000.000000 = 2560000000.00000000000000000000  9
bbb 0.25599999999999999999 * 1000000000000.000000 = 255999999999.99999998509883880615  9
bbb 0.25599999999999999999 * 100000000000000.000000 = 25599999999999.99999809265136718750  0
bbb 0.25599999999999999999 * 10000000000000000.000000 = 2560000000000000.00000000000000000000  9
...

how I can achieve this, or if it is achievable at all (?)

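Yes, it is achievable, but not with OP's approach, because too much error is injected at the various steps. These corner cases are quite difficult and usually call for wide-integer or extended-integer computation rather than floating point.

Sample code that prints a double exactly in base 10 may help.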

Other lesser issues

More rounding error

The loops with ftob_tmp *= base and ftob_tmp /= base each inject up to 0.5 ULP of error. These loops can then produce an off-by-one value of j.
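
One rounding-free starting point, offered as a sketch rather than as part of the original answer: frexpl and ldexpl only change the binary exponent, so they can split a finite nonzero value into an integer significand and a power of two without any rounding. The name split_exact is hypothetical, and the sketch assumes LDBL_MANT_DIG <= 64 (true for x86 extended precision); it reports failure otherwise.

```c
#include <float.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: on success, |x| == sig * 2^exp2 exactly.
   Returns 0 on success, -1 if x or the platform is unsupported. */
int split_exact(long double x, uint64_t *sig, int *exp2)
{
    if (!isfinite(x) || x == 0.0L || LDBL_MANT_DIG > 64)
        return -1;

    int e;
    long double m = frexpl(fabsl(x), &e);       /* m in [0.5, 1), |x| == m * 2^e */
    *sig = (uint64_t)ldexpl(m, LDBL_MANT_DIG);  /* integer below 2^LDBL_MANT_DIG */
    *exp2 = e - LDBL_MANT_DIG;
    return 0;
}
```

From sig and exp2, the number of digits in a given base can then be worked out with integer arithmetic, instead of by repeatedly multiplying or dividing the long double itself.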

-0.0

Test the sign, not the value, otherwise -0.0 will print as 0.0.

// sign = (ld < 0.0L) ? -1 : 1;
sign = signbit(ld) ? -1 : 1;
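
A small side-by-side sketch of the two tests (the function names are hypothetical, chosen only for illustration):

```c
#include <math.h>

/* Sign by value comparison, as in the question: -0.0 is not less than 0. */
int sign_by_value(long double ld)
{
    return (ld < 0.0L) ? -1 : 1;
}

/* Sign by sign bit, as suggested above: -0.0 is reported as negative. */
int sign_by_bit(long double ld)
{
    return signbit(ld) ? -1 : 1;
}
```

For -0.0L, sign_by_value returns 1 and sign_by_bit returns -1; for ordinary negative values such as -2.5L both return -1.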

String size

char ftl[300]; is insufficient for LDBL_MAX in base 2. Look at LDBL_MAX_EXP and LDBL_MIN_EXP to help determine the minimum size of the largest string needed.
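
A rough sketch of how those macros could be turned into a bound, assuming plain fixed notation (sign, integer digits, point, fractional digits) and a format with subnormals. The names ftob_buf_size and ftob_max_frac_base2 are hypothetical; OP's ftob counts significant digits, so its exact requirement differs.

```c
#include <float.h>
#include <stddef.h>

/* Hypothetical helper: upper bound on the buffer size for any base 2..36
   when n_frac fractional digits are requested. */
size_t ftob_buf_size(size_t n_frac)
{
    return 1                        /* optional '-' */
         + (size_t)LDBL_MAX_EXP     /* integer digits: LDBL_MAX < 2^LDBL_MAX_EXP */
         + 1                        /* '.' */
         + n_frac                   /* requested fractional digits */
         + 1;                       /* terminating '\0' */
}

/* Hypothetical helper: fractional base-2 digits needed to print the smallest
   subnormal exactly, whose value is 2^(LDBL_MIN_EXP - LDBL_MANT_DIG). */
size_t ftob_max_frac_base2(void)
{
    return (size_t)(LDBL_MANT_DIG - LDBL_MIN_EXP);
}
```

Base 2 is the worst case for the integer part, since larger bases need fewer digits. On x86 extended precision LDBL_MAX_EXP is 16384, so the bound is far above 300.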

c

floating-point

long-double