Reading an SPSS file in Java

Question

    SPSSReader reader = new SPSSReader(args[0], null);
    Iterator it = reader.getVariables().iterator();
    while (it.hasNext()) {
        System.out.println(it.next());
    }

I am using this SPSSReader to read an SPSS file. Every string it prints has some extra junk characters appended.

Result obtained:

StringVariable: nameogr(nulltpc{)(10)
NumericVariable: weightppuo(nullf{nd)
DateVariable: datexsgzj(nulllanck)
DateVariable: timeppzb(null|wt{l)
DateVariable: datetimegulj{(null|ns)
NumericVariable: commissionyrqh(nullohzx)
NumericVariable: priceeub{av(nullvlpl)

Expected result:

 StringVariable: name (10)
 NumericVariable: weight
 DateVariable: date
 DateVariable: time
 DateVariable: datetime
 NumericVariable: commission
 NumericVariable: price

Thanks in advance :)

Answer 1

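I'm not sure, but looking at your code, it.next() returns a Variable object.

There must be some method on the Variable object that returns the name, such as it.next().getLabel() or it.next().getVariableName(). The output of toString() on an object is not always meaningful. Check the toString() method of the Variable class in the SPSSReader library.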
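To illustrate the point about toString(), here is a minimal sketch. The Variable class and its getName() accessor are hypothetical stand-ins written for this example, not the SPSSReader API, so the real accessor name has to be looked up in the library itself.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class VariableNameDemo {

    // Hypothetical stand-in for the library's Variable class; the real
    // class and its accessor names may differ.
    static class Variable {
        private final String name;

        Variable(String name) {
            this.name = name;
        }

        // Assumed accessor that returns only the variable name.
        String getName() {
            return name;
        }

        // toString() is free to append anything, here a debug suffix.
        @Override
        public String toString() {
            return "Variable: " + name + "#debug";
        }
    }

    public static void main(String[] args) {
        List<Variable> variables =
                Arrays.asList(new Variable("name"), new Variable("weight"));

        Iterator<Variable> it = variables.iterator();
        while (it.hasNext()) {
            // Call next() exactly once per iteration and keep the reference.
            Variable v = it.next();
            System.out.println(v.getName());
        }
    }
}
```

Printing through the accessor avoids whatever toString() decides to add. Whether the real library exposes a clean accessor is something only its documentation or source can confirm.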

Answer 2

I tried to recreate the problem and found the same thing.
Given that the library requires a license (see here), I would assume that this might be the developers' way of ensuring that a license is bought, as the regular download only contains a demo version for evaluation (see the licensing notice before the download).

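Since that library is fairly old (the website's copyright is 2003-2008, the library requires Java 1.2, it has no generics, it uses Vectors, and so on), I would recommend a different library, as long as you are not restricted to the one used in your question.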

After a quick search, it turns out there is an open-source SPSS reader here, which is also available through Maven here.

Using the example on the GitHub page, I put this together:

import com.bedatadriven.spss.SpssDataFileReader;
import com.bedatadriven.spss.SpssVariable;

public class SPSSDemo {

    public static void main(String[] args) {
        try {
            SpssDataFileReader reader = new SpssDataFileReader(args[0]);

            for (SpssVariable var : reader.getVariables()) {
                System.out.println(var.getVariableName());
            }

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

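I couldn't find anything that would print NumericVariable or similar, but since those are the class names of the library you used in your question, I assume they are not standardized by SPSS. If they are, you will either find something similar in the library, or you can open an issue on the GitHub page.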

Using the employees.sav file from here, I got this output from the code above with the open-source library:

resp_id
gender
first_name
last_name
date_of_birth
education_type
education_years
job_type
experience_years
monthly_income
job_satisfaction

No more extra characters!

Edit regarding the comment:

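That's right. I read through some SPSS material and, as far as I understand it, there are only string and numeric variables, which are then formatted in different ways. The version published on Maven only gives you access to a variable's type code (honestly, I have no idea what that is), but the GitHub version (which unfortunately does not seem to be published on Maven as 1.3-SNAPSHOT) goes further, now that the write and print formats have been introduced.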

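You can clone or download the library and run mvn clean package (assuming you have Maven installed), then use the generated library (found under target\spss-reader-1.3-SNAPSHOT.jar) in your project to make the methods SpssVariable#getPrintFormat and SpssVariable#getWriteFormat available.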

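Those return an SpssVariableFormat, from which you can get more information. As I have no clue what all of this is about, the best I can do is link you to the source here, where references to the stuff that was implemented there should help you further (I assume that this link, referenced in the documentation of SpssVariableFormat#getType, is probably the most helpful for determining what kind of format you have there).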

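If absolutely nothing works, I guess you could also use the demo version of the library from the question to determine the type via it.next().getClass().getSimpleName(), but I would only resort to that if there is no other way to determine the format.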
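As a rough sketch of that last-resort approach: getClass().getSimpleName() reads the runtime class name and never calls toString(), so it is unaffected by whatever the demo version appends to the printed text. The three variable classes below are empty, hypothetical stand-ins named after the output in the question, and kindsOf is a helper invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class VariableKindDemo {

    // Hypothetical stand-ins for the variable classes the demo library returns.
    static class StringVariable {}
    static class NumericVariable {}
    static class DateVariable {}

    // Collects the simple class name of each object without touching toString().
    static List<String> kindsOf(List<?> variables) {
        List<String> kinds = new ArrayList<>();
        for (Object v : variables) {
            kinds.add(v.getClass().getSimpleName());
        }
        return kinds;
    }

    public static void main(String[] args) {
        List<Object> variables = new ArrayList<>();
        variables.add(new StringVariable());
        variables.add(new NumericVariable());
        variables.add(new DateVariable());

        // prints [StringVariable, NumericVariable, DateVariable]
        System.out.println(kindsOf(variables));
    }
}
```

This only recovers the kind of each variable, not its name, so it complements reading the names with the open-source library rather than replacing it.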
