Export to CSV using SAVE TRANSLATE, but empty values are exported as a single space
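I have a dataset in SPSS; see the example dataset below. It is only an example: the real one is supplied by a separate external process and has many more columns and rows. The empty values in the example are entered as " ", but that is also how the empty values arrive in SPSS, and internally they are treated as null/empty/missing values.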
data list list/FieldNam(a20) FormName(a20) FieldType(a20) Choices(a50) Required(F1) Identifier(a1) Minimum(f8) Maximum(f8).
begin data
"Field 1" "Form abc" "text" " " 1 "y" " " " "
"Field 2" "Form abc" "datetime" " " 1 "y" " " " "
"Field 3" "Form xyz" "radio" "0=never | 1=sometimes | 2=often | 3=always" " " " " " " " "
"Field 4" "Form xyz" "text" " " " " " " "1" "100"
"Field 5" "Form xyz" "radio" "0=no | 1=yes" " " " " " " " "
end data.
I then save it as a CSV text file using the following syntax.
SAVE TRANSLATE
/TYPE = CSV
/FIELDNAMES
/TEXTOPTIONS DELIMITER=',' QUALIFIER='"'
/OUTFILE = 'C:\Temp\my_csv_file.csv'
/ENCODING='Windows-1252'
/REPLACE.
The resulting CSV file has the following contents, where each empty value contains a single space:
FieldNam,FormName,FieldType,Choices,Required,Identifier,Minimum,Maximum
Field 1,Form abc,text, ,1,y, ,
Field 2,Form abc,datetime, ,1,y, ,
Field 3,Form xyz,radio,0=never | 1=sometimes | 2=often | 3=always, , , ,
Field 4,Form xyz,text, , , ,1,100
Field 5,Form xyz,radio,0=no | 1=yes, , , ,
However, I want the empty values to be truly empty, like this:
FieldNam,FormName,FieldType,Choices,Required,Identifier,Minimum,Maximum
Field 1,Form abc,text,,1,y,,
Field 2,Form abc,datetime,,1,y,,
Field 3,Form xyz,radio,0=never | 1=sometimes | 2=often | 3=always,,,,
Field 4,Form xyz,text,,,,1,100
Field 5,Form xyz,radio,0=no | 1=yes,,,,
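So my question is: is it possible to export an SPSS dataset like this?
The exported csv file will be used as input for another system, which cannot handle the , , empty values. I know I could open the file in Notepad and do a search and replace afterwards, but I want to automate as much as possible, because the export will be run quite frequently and that would save a lot of effort.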
The information on this page indicates that you can call a script: https://www.ibm.com/docs/en/spss-statistics/23.0.0?topic=reference-script
SCRIPT
SCRIPT runs a script to customize the program or automate regularly performed tasks. You can run a Basic script or a Python script.
SCRIPT 'filename' [(quoted string)]
This command takes effect immediately. It does not read the active dataset or execute pending transformations. See the topic Command Order for more information.
Release History
Release 16.0
Scripts run from the SCRIPT command now run synchronously with the command syntax stream.
Release 17.0
Ability to run Python scripts introduced.
An example Python script to call after each export (requires release 17.0 or later):
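import fileinput
import os

# Raw string so the backslashes in the Windows path are not read as escapes
filename = r'C:\Temp\my_csv_file.csv'
postfix = '.bak'

# Rewrite the file in place; print() output is redirected into the file
with fileinput.FileInput(filename, inplace=True, backup=postfix) as file:
    for line in file:
        # Drop the space next to each comma (this also affects ", " inside values)
        print(line.replace(', ', ',').replace(' ,', ','), end='')

# Remove the backup copy left next to the CSV file
try:
    os.remove(filename + postfix)
except FileNotFoundError:
    pass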
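The script does a simple search and replace. I have included code that deletes the temporary backup file, even though the Python manual states that it is deleted automatically. For me it currently never is (hence the manual deletion); as far as I can tell, the automatic deletion only applies when no backup extension is passed. You can remove that part of the code if it does not suit you.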
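The plain search and replace above also removes spaces next to commas inside real values (for example "a, b" becomes "a,b"). A narrower variant could touch only fields that consist entirely of spaces. The following is a minimal sketch under the assumption that no quoted field contains a comma; the function name blank_space_only_fields is made up for illustration.

```python
import re

# A field made up only of spaces: it starts at the beginning of the line or
# right after a comma, and ends right before a comma or the end of the line.
_BLANK_FIELD = re.compile(r'(?:(?<=,)|^) +(?=,|$)')


def blank_space_only_fields(line):
    """Return the line with space-only fields emptied (hypothetical helper)."""
    return _BLANK_FIELD.sub('', line)


print(blank_space_only_fields('Field 1,Form abc,text, ,1,y, , '))
print(blank_space_only_fields('a, b,c'))  # spaces inside a value are kept
```

Inside the fileinput loop this would replace the two chained replace() calls, as in print(blank_space_only_fields(line), end='').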
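Of course you could also use Python's csv module, iterate over the rows, and write them back to another csv file. See the documentation for that module here: https://docs.python.org/3/library/csv.html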
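The csv module approach could look roughly like the sketch below. It parses each row, so quoted fields that contain commas are left alone, and it empties any field that holds only whitespace. The function name strip_blank_fields and the in-memory sample data are assumptions for illustration, not part of the original export.

```python
import csv
import io


def strip_blank_fields(src, dst):
    """Copy CSV rows from src to dst, emptying whitespace-only fields."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator='\n')
    for row in reader:
        writer.writerow(['' if field.strip() == '' else field for field in row])


# In-memory demonstration; the second row has a blank field and a quoted comma.
source = io.StringIO('A,B,C\nField 1, ,"x, y"\n')
target = io.StringIO()
strip_blank_fields(source, target)
print(target.getvalue())
```

For real files, both would be opened with newline='' as the csv documentation recommends, and presumably with encoding='cp1252' to match the Windows-1252 encoding used in the SAVE TRANSLATE command, writing to a second file rather than editing in place.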