Rcpp and int64 NA value

Question

How can I pass an NA value from Rcpp to R in a 64-bit vector?

My first approach was:

// [[Rcpp::export]]
Rcpp::NumericVector foo() {
  Rcpp::NumericVector res(2);

  int64_t val = 1234567890123456789;
  std::memcpy(&(res[0]), &(val), sizeof(double));
  res[1] = NA_REAL;

  res.attr("class") = "integer64";
  return res;
}

But it produces

#> foo()
integer64
[1] 1234567890123456789 9218868437227407266

What I need is

#> foo()
integer64
[1] 1234567890123456789 <NA>

Answer 1

OK, I think I found an answer... (not pretty, but it works).

Short answer:

#include <Rcpp.h>
#include <cstdint>
#include <cstring>

// [[Rcpp::export]]
Rcpp::NumericVector foo() {
  Rcpp::NumericVector res(2);

  int64_t val = 1234567890123456789;
  std::memcpy(&(res[0]), &(val), sizeof(double));

  // This is the magic: bit64's NA is the bit pattern with only the top bit set
  int64_t v = 1ULL << 63;
  std::memcpy(&(res[1]), &(v), sizeof(double));

  res.attr("class") = "integer64";
  return res;
}

This results in

#> foo()
integer64
[1] 1234567890123456789 <NA>

Longer answer

Let's check how bit64 stores an NA:

# the last value is the max value of a 64 bit number
a <- bit64::as.integer64(c(1, 2, NA, 9223372036854775807))
a
#> integer64
#> [1] 1    2    <NA> <NA>
bit64::as.bitstring(a[3])
#> [1] "1000000000000000000000000000000000000000000000000000000000000000"
bit64::as.bitstring(a[4])
#> [1] "1000000000000000000000000000000000000000000000000000000000000000"

Created on 2020-04-23 by the reprex package (v0.3.0)

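We see that it is 10000.... This can be recreated in Rcpp with int64_t val = 1ULL << 63;. Using memcpy() instead of a simple assignment with = ensures that no bits are changed: a plain assignment would convert the integer to the nearest double, which has a completely different bit pattern.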
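To make the memcpy() point concrete, here is a minimal sketch in plain C++ (no Rcpp required). The helper names int64_bits_to_double and double_bits_to_int64 are hypothetical and not part of Rcpp or bit64; the sketch assumes a platform where double and int64_t are both 8 bytes.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(double) == sizeof(std::int64_t),
              "this sketch assumes 8-byte doubles");

// Hypothetical helper: store the 64 bits of x in a double, unchanged.
double int64_bits_to_double(std::int64_t x) {
    double d;
    std::memcpy(&d, &x, sizeof d);
    return d;
}

// Hypothetical helper: read the 64 bits of a double back as an int64_t.
std::int64_t double_bits_to_int64(double d) {
    std::int64_t x;
    std::memcpy(&x, &d, sizeof x);
    return x;
}

// The same bit pattern as 1ULL << 63, written without an
// unsigned-to-signed conversion.
const std::int64_t kTopBitOnly = std::numeric_limits<std::int64_t>::min();
```

Round-tripping kTopBitOnly through int64_bits_to_double and back returns the same value, whereas static_cast<double>(kTopBitOnly) yields the floating-point number -2^63, whose bits differ, so bit64 would not print it as <NA>.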

Answer 2

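It is really much simpler. We have int64 behavior in R provided by (several) add-on packages, the best of which is bit64, which gives us the integer64 S3 class and associated behavior.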

It defines the NA internally as follows:

#define NA_INTEGER64 LLONG_MIN

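And that is all there is. R and its packages are first and foremost C code, and LLONG_MIN exists there and goes (almost) all the way back to the founding fathers.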
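As a rough sketch of how that definition could be used from C++ code that fills an integer64 vector: the helpers set_na_integer64 and is_na_integer64 below are hypothetical names, not bit64 or Rcpp API, and the constant simply mirrors the #define quoted above. It assumes long long and double are both 8 bytes.

```cpp
#include <climits>
#include <cstring>

static_assert(sizeof(long long) == sizeof(double),
              "this sketch assumes 8-byte long long and double");

// Mirrors the definition quoted above; the name is for illustration only.
constexpr long long kNaInteger64 = LLONG_MIN;

// Hypothetical helper: write the integer64 NA bit pattern into a double slot.
void set_na_integer64(double& slot) {
    const long long na = kNaInteger64;
    std::memcpy(&slot, &na, sizeof slot);
}

// Hypothetical helper: test whether a double slot holds the integer64 NA pattern.
bool is_na_integer64(double slot) {
    long long x;
    std::memcpy(&x, &slot, sizeof x);
    return x == kNaInteger64;
}
```

With an Rcpp::NumericVector named res, a call such as set_na_integer64(res[1]) should work, assuming operator[] hands back a reference to the underlying double.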

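There are two lessons here. The first is the extension of IEEE defining NaN and Inf for floating-point values. R actually goes beyond that and adds NA for each of its types, in pretty much the same way as above: by reserving one particular bit pattern. (Which, in one case, is the birthday of one of the two original creators of R.)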
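The reserved bit pattern also explains the odd number in the question's first attempt. R's NA_real_ is a NaN whose low 32-bit word holds 1954; read as a 64-bit integer, those bits are exactly 9218868437227407266. The sketch below rebuilds that pattern in plain C++ rather than using R's own NA_REAL; the function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Build a double with the layout R uses for NA_real_:
// high word 0x7FF00000 (all exponent bits set), low word 1954.
double make_r_style_na_real() {
    const std::uint64_t bits = (0x7FF00000ULL << 32) | 1954ULL;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Read the raw bits of a double as a signed 64-bit integer,
// which is what bit64 does with each element of an integer64 vector.
std::int64_t reinterpret_as_int64(double d) {
    std::int64_t x;
    std::memcpy(&x, &d, sizeof x);
    return x;
}
```

So assigning NA_REAL into the vector stores a perfectly valid, very large integer64 value rather than bit64's own NA.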

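The other is to appreciate the enormous amount of work Jens did with the bit64 package and all the conversion and operator functions it requires. Seamlessly converting all possible values, including NA, NaN, Inf, and so on, is no small task.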

And it is a topic that not too many people know about. I am glad you asked the question, as we now have a record of it here.
