How many functions does Apache POI support, and where are the missing ones?

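I am working on a project that uses spreadsheets containing very complex mathematical models. The official Apache documentation lists 302 supported functions as of July 2021, but WorkbookEvaluator.getSupportedFunctionNames() returns only 202. Why the difference of 100? In particular, what about statistical functions such as NORMDIST and NORMINV? If there is a library that contains the 100 missing functions, where can I find it?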

The Apache POI documentation is somewhat unclear on this point, so I will try to shed more light on it.

First: everything below reflects the state as of 2022-01-26, with Apache POI 5.2.0.

Excel functions that are implemented or known only by name

Developing Formula Evaluation -> What functions are supported? states:

As of July 2021, POI supports 302 built-in functions, see Appendix A for the full list.

But that is not accurate. It would better read: "POI knows the names of 393 built-in functions.", because not all of them have been implemented yet.

  // list of functions that POI can evaluate
  java.util.Collection<String> supportedFuncs = org.apache.poi.ss.formula.WorkbookEvaluator.getSupportedFunctionNames();
  System.out.println("Following functions are implemented by Apache POI:");
  System.out.println("Count: " + supportedFuncs.size());
  System.out.println(supportedFuncs);

lists the 202 functions that are implemented.

  // list of functions that are known but not supported by POI
  System.out.println("Following functions are known by name by Apache POI but not implemented yet:");
  java.util.Collection<String> unsupportedFuncs = org.apache.poi.ss.formula.WorkbookEvaluator.getNotSupportedFunctionNames();
  System.out.println("Count: " + unsupportedFuncs.size());
  System.out.println(unsupportedFuncs);

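lists the 191 functions that are known by name but not yet implemented.

These 191 functions lack a Java implementation. Their names are known, but WorkbookEvaluator cannot evaluate them, so an org.apache.poi.ss.formula.eval.NotImplementedFunctionException is thrown.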
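A minimal sketch of how calling code might guard against this exception. The names SafeEvaluation and evaluateOrExplain are hypothetical. The sketch assumes the exception reaches the caller as org.apache.poi.ss.formula.eval.NotImplementedException (the superclass of NotImplementedFunctionException), possibly with the original exception attached as its cause.

```java
import org.apache.poi.ss.formula.eval.NotImplementedException;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellValue;
import org.apache.poi.ss.usermodel.FormulaEvaluator;

// Hypothetical helper, not part of Apache POI.
final class SafeEvaluation {

    /** Returns the evaluated value as text, or a note describing the unimplemented function. */
    static String evaluateOrExplain(FormulaEvaluator evaluator, Cell cell) {
        try {
            CellValue value = evaluator.evaluate(cell);
            // evaluate() returns null for cells that hold nothing to evaluate
            return value == null ? "" : value.formatAsString();
        } catch (NotImplementedException e) {
            // Assumption: the more specific exception, if any, is attached as the cause
            Throwable detail = e.getCause() != null ? e.getCause() : e;
            return "not implemented: " + detail.getMessage();
        }
    }
}
```

Catching the superclass keeps the guard working whether POI rethrows the exception wrapped with cell information or not.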

How can the known but unimplemented functions be implemented?

The Java code can be written as described in Developing Formula Evaluation. But there is one more hurdle.

Developing Formula Evaluation -> Two base interfaces to start your implementation:

All Excel formula function classes implement either the org.apache.poi.hssf.record.formula.functions.Function or the org.apache.poi.hssf.record.formula.functions.FreeRefFunction interface. Function is a common interface for the functions defined in the Binary Excel File Format (BIFF8): these are "classic" Excel functions like SUM, COUNT, LOOKUP, etc. FreeRefFunction is a common interface for the functions from the Excel Analysis ToolPak, for User Defined Functions that you create, and for Excel built-in functions that have been defined since BIFF8 was defined.

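So to implement them, you need to know which of these 191 functions must implement org.apache.poi.hssf.record.formula.functions.Function and which must implement org.apache.poi.hssf.record.formula.functions.FreeRefFunction. (The documentation quotes an old package name; in POI 5.2.0 both interfaces live in org.apache.poi.ss.formula.functions.)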

  // list of ATP functions that are known but not supported by POI
  System.out.println("Following functions are ATP functions known by name by Apache POI but not implemented yet:");
  java.util.Collection<String> unsupportedFuncsATP = org.apache.poi.ss.formula.atp.AnalysisToolPak.getNotSupportedFunctionNames();
  System.out.println("Count: " + unsupportedFuncsATP.size());
  System.out.println(unsupportedFuncsATP); 

lists the 83 functions that are known as Analysis ToolPak functions but not yet implemented. Implementations of those must implement the FreeRefFunction interface.

  // list of BIFF8 functions that are known but not supported by POI
  System.out.println("Following functions are BIFF8 functions known by name by Apache POI but not implemented yet:");
  java.util.List<String> unsupportedFuncsBIFF8 = unsupportedFuncs.stream().filter(n -> !unsupportedFuncsATP.contains(n)).collect(java.util.stream.Collectors.toList());
  System.out.println("Count: " + unsupportedFuncsBIFF8.size());
  System.out.println(unsupportedFuncsBIFF8);  

lists the 108 "classic" Excel functions that are known but not yet implemented. Implementations of those must implement the Function interface.
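Once an implementation exists, it has to be registered under its Excel name. The following is a sketch only: FunctionRegistration and its two methods are hypothetical helpers. They use the two lists shown above to decide which registration call applies, because POI rejects a registration with an IllegalArgumentException when the name is unknown or already implemented.

```java
import java.util.Collection;
import org.apache.poi.ss.formula.WorkbookEvaluator;
import org.apache.poi.ss.formula.atp.AnalysisToolPak;
import org.apache.poi.ss.formula.functions.FreeRefFunction;
import org.apache.poi.ss.formula.functions.Function;

// Hypothetical helper, not part of Apache POI.
final class FunctionRegistration {

    /** Registers a "classic" (BIFF8) function if POI knows its name but lacks the code. */
    static boolean registerBuiltIn(String name, Function impl) {
        Collection<String> missing = WorkbookEvaluator.getNotSupportedFunctionNames();
        Collection<String> missingAtp = AnalysisToolPak.getNotSupportedFunctionNames();
        if (!missing.contains(name) || missingAtp.contains(name)) {
            return false; // unknown, already implemented, or an Analysis ToolPak name
        }
        WorkbookEvaluator.registerFunction(name, impl);
        return true;
    }

    /** Registers an Analysis ToolPak function if POI knows its name but lacks the code. */
    static boolean registerAtp(String name, FreeRefFunction impl) {
        if (!AnalysisToolPak.getNotSupportedFunctionNames().contains(name)) {
            return false; // unknown or already implemented
        }
        AnalysisToolPak.registerFunction(name, impl);
        return true;
    }
}
```

Registration is static and therefore applies to every workbook evaluated in the same JVM. Names are expected in upper case, exactly as they appear in the lists.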

What about Excel functions that are on neither list?

Excel functions that are neither implemented nor known by name must be implemented as user defined functions. This is described in User Defined Functions. That mechanism is primarily meant for user defined functions that live in VBA or in add-ins, but it can also be used to implement new Excel functions that Apache POI does not even know by name. I have shown an example of implementing such a function elsewhere.
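A minimal sketch of the user defined function route. MYDOUBLE is a made-up function name chosen for illustration, and UdfSketch is a hypothetical class; the finder classes and Workbook.addToolPack are the ones described in the User Defined Functions guide.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.formula.eval.ErrorEval;
import org.apache.poi.ss.formula.eval.EvaluationException;
import org.apache.poi.ss.formula.eval.NumberEval;
import org.apache.poi.ss.formula.eval.OperandResolver;
import org.apache.poi.ss.formula.eval.ValueEval;
import org.apache.poi.ss.formula.functions.FreeRefFunction;
import org.apache.poi.ss.formula.udf.AggregatingUDFFinder;
import org.apache.poi.ss.formula.udf.DefaultUDFFinder;
import org.apache.poi.ss.formula.udf.UDFFinder;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Workbook;

// Hypothetical example class, not part of Apache POI.
final class UdfSketch {

    /** MYDOUBLE(x): an invented function that returns 2 * x. */
    static final FreeRefFunction MYDOUBLE = (args, ec) -> {
        if (args.length != 1) {
            return ErrorEval.VALUE_INVALID;
        }
        try {
            ValueEval v = OperandResolver.getSingleValue(
                    args[0], ec.getRowIndex(), ec.getColumnIndex());
            return new NumberEval(2 * OperandResolver.coerceValueToDouble(v));
        } catch (EvaluationException e) {
            return e.getErrorEval();
        }
    };

    /** Builds a workbook in memory, registers MYDOUBLE and evaluates MYDOUBLE(21). */
    static double demo() {
        try (Workbook wb = new HSSFWorkbook()) {
            UDFFinder udfs = new DefaultUDFFinder(
                    new String[] {"MYDOUBLE"}, new FreeRefFunction[] {MYDOUBLE});
            // The toolpack must be added before a formula using the name is parsed
            wb.addToolPack(new AggregatingUDFFinder(udfs));
            Cell cell = wb.createSheet().createRow(0).createCell(0);
            cell.setCellFormula("MYDOUBLE(21)");
            return wb.getCreationHelper().createFormulaEvaluator()
                    .evaluate(cell).getNumberValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Unlike the static registration of known functions, a toolpack added this way belongs to one workbook instance only.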

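Why does the function code have to be implemented in Java?

Apache POI's approach is to work with Microsoft Office files without requiring an installed Microsoft Office application. Apache POI therefore has no access to Excel's own function implementations, because there is no Excel application. Every function that is needed must be implemented in POI's own code.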

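The main problem here is that the source code of the individual Excel functions is closed source and therefore not publicly available. The only option is to implement the code from the published function descriptions. This often leads to differences between the results of the Java implementation and those of Excel's own code. The problem grows in borderline use cases, for example when parameters are omitted or are given values of unexpected types. So do not be surprised if Apache POI's formula evaluation results differ from those of Microsoft Excel.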
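One way to spot such differences is to compare the value POI computes with the cached result stored in the file, which is normally what Excel calculated when it last saved the workbook. The sketch below is an illustration under that assumption; ResultComparison and differsFromCached are hypothetical names, and only numeric results are compared.

```java
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.CellValue;
import org.apache.poi.ss.usermodel.FormulaEvaluator;

// Hypothetical helper, not part of Apache POI.
final class ResultComparison {

    /** True if POI's result deviates from the cached numeric result by more than tolerance. */
    static boolean differsFromCached(FormulaEvaluator evaluator, Cell cell, double tolerance) {
        if (cell.getCellType() != CellType.FORMULA
                || cell.getCachedFormulaResultType() != CellType.NUMERIC) {
            return false; // this sketch ignores everything but numeric formula results
        }
        double cached = cell.getNumericCellValue(); // value stored in the file
        CellValue fresh = evaluator.evaluate(cell); // recomputed by POI, cell left unchanged
        if (fresh.getCellType() != CellType.NUMERIC) {
            return true; // for example POI produced an error where the file holds a number
        }
        return Math.abs(cached - fresh.getNumberValue()) > tolerance;
    }
}
```

A small tolerance such as 1e-9 avoids flagging harmless floating point noise; the right value depends on the model in the spreadsheet.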

In response to PJ Fanning, in case anyone needs to use NORMDIST:

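// NormalDistribution is assumed to come from Apache Commons Math 3
import org.apache.commons.math3.distribution.NormalDistribution;
import org.apache.poi.ss.formula.eval.ErrorEval;
import org.apache.poi.ss.formula.eval.EvaluationException;
import org.apache.poi.ss.formula.eval.NumberEval;
import org.apache.poi.ss.formula.eval.OperandResolver;
import org.apache.poi.ss.formula.eval.ValueEval;
import org.apache.poi.ss.formula.functions.Function;

// NORMDIST(x, mean, standard_dev, cumulative)
public final class NormDist implements Function {

@Override
public ValueEval evaluate(ValueEval[] args, int srcRowIndex, int srcColumnIndex) {
    if (args.length != 4) {
        return ErrorEval.VALUE_INVALID;
    }
    try {
        final ValueEval x = OperandResolver.getSingleValue(args[0], srcRowIndex, srcColumnIndex);
        final ValueEval mean = OperandResolver.getSingleValue(args[1], srcRowIndex, srcColumnIndex);
        final ValueEval sd = OperandResolver.getSingleValue(args[2], srcRowIndex, srcColumnIndex);
        final ValueEval cumulative = OperandResolver.getSingleValue(args[3], srcRowIndex, srcColumnIndex);

        // coerceValueToBoolean returns null for a blank argument; treat that as FALSE
        final Boolean flag = OperandResolver.coerceValueToBoolean(cumulative, true);
        final double result = this.compute(OperandResolver.coerceValueToDouble(x),
                OperandResolver.coerceValueToDouble(mean),
                OperandResolver.coerceValueToDouble(sd),
                flag != null && flag);
        // compute() reports an invalid standard deviation as NaN; map it to #NUM!
        return Double.isNaN(result) ? ErrorEval.NUM_ERROR : new NumberEval(result);
    } catch (EvaluationException e) {
        return e.getErrorEval();
    }
}

public double compute(double x, double mean, double sd, boolean cumulative) {
    try {
        NormalDistribution normDist = new NormalDistribution(mean, sd);
        if (cumulative) {
            return normDist.cumulativeProbability(x);
        }
        return normDist.density(x);
    } catch (IllegalArgumentException ex) {
        // thrown by the constructor when sd is not strictly positive
        return Double.NaN;
    }
}

}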
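For the non-cumulative case, the result of compute() can be cross-checked against the closed-form normal density without any extra library. This is only a sanity-check sketch; NormalDensityCheck is a hypothetical class and is not meant to replace the implementation above.

```java
// Hypothetical cross-check helper, plain Java only.
final class NormalDensityCheck {

    /** Normal probability density at x; NaN if sd is not strictly positive. */
    static double density(double x, double mean, double sd) {
        if (!(sd > 0)) {
            return Double.NaN; // same NaN convention as compute() above
        }
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2 * Math.PI));
    }
}
```

NormalDensityCheck.density(x, mean, sd) should agree with compute(x, mean, sd, false) up to floating point rounding.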