Reading a text file and dealing with numbers
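I am trying to count the number of letters in a text file, but I get stuck when the file also contains numbers.
So far I can handle letters and symbols, but the ischar function does not help me when it comes to numbers.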
function ok = lets(file_name)
fid = fopen(file_name, 'rt');
if fid < 0
ok = -1;
end
C = [];
D = [];
oneline = fgets(fid);
while ischar(oneline)
C = oneline(isletter(oneline));
W = length(C);
D = [D ; W];
oneline = fgets(fid);
end
total = 0;
for i = 1:length(D)
total = D(i) + total;
end
ok = total;
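How can I count the letters when the text file also contains numbers?
I think you are making this much more complicated than it needs to be. Just use isletter as you already do, and then use length.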
function ok = lets(file_name)
% Open the file as in the original code
fid = fopen(file_name, 'rt');
if fid < 0
    ok = -1;
    return   % stop here, fgets would fail on an invalid fid
end
% Initialize the letter count
ok = 0;
% Get the first line
oneline = fgets(fid);
% fgets returns -1 (not a char array) at end of file
while ischar(oneline)
    % Remove everything that is not a letter
    oneline(~isletter(oneline)) = [];
    % Add the number of letters to the output
    ok = ok + length(oneline);
    % Get the next line
    oneline = fgets(fid);
end
fclose(fid);
end
I used this input file:
Ar,TF,760,2.5e-07,1273.14,4.785688323049946e+24,24.80738364864047,37272905351.7263,37933372595.0276
Ar,TF,760,5e-07,1273.14,4.785688323049946e+24,40.3092219226107,2791140681.70926,2978668073.513113
Ar,TF,760,7.5e-07,1273.14,4.785688323049946e+24,54.80989010679312,738684259.1671219,836079550.0157251
and got 18, which includes the e in the numbers. Do you want to count those?
I solved the problem in the following way:
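If the e in numbers such as 2.5e-07 should not be counted, one option is to strip numeric tokens before counting. The following is only a minimal sketch and not part of the answer above: the function name lets_no_numbers and the regular expression are assumptions made for illustration, and the pattern covers only plain decimal and scientific notation.

```matlab
function ok = lets_no_numbers(file_name)
% Hypothetical helper: count letters, ignoring letters inside numeric tokens.
fid = fopen(file_name, 'rt');
if fid < 0
    ok = -1;
    return
end
txt = fread(fid, [1 Inf], '*char');   % read the whole file as one char row
fclose(fid);
% Assumed pattern: optional sign, digits, optional fraction, optional exponent
num_pattern = '[-+]?\d*\.?\d+([eE][-+]?\d+)?';
txt = regexprep(txt, num_pattern, '');
% Count what is left that isletter recognizes as a letter
ok = nnz(isletter(txt));
end
```

With the three-line input above, only Ar and TF remain on each line after the numbers are removed, so the count drops from 18 to 12.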
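function ok = lets(file_name)
% Map the file into memory as read-only bytes (uint8 by default)
file = memmapfile(file_name, 'writable', false);
uppercase = 65:90;    % ASCII codes for A-Z
lowercase = 97:122;   % ASCII codes for a-z
data = file.Data;
% Count the occurrences of each letter code and sum the frequencies
ok = sum(histc(data, uppercase) + histc(data, lowercase));
end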
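I used the memmapfile function to map the file into memory and compared the data with the character encodings from this ASCII table. Upper case letters are represented by [65:90] and lower case letters by [97:122]. By applying the histc function, I got the frequency with which each letter occurs in the file. The total number of letters is obtained by adding up all the frequencies.
Note that I call histc twice to avoid having a bin from 90 to 97, which would count the characters [\]^_`.
I applied the function to a sample file called sample.txt containing the following lines: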
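The same two-range idea can also be written without histc, using logical comparisons on the byte values. This is a sketch for illustration only: the variable names is_upper and is_lower are assumptions, and data here is built from a literal string rather than read from a memory-mapped file.

```matlab
% Byte values, as they would come from the Data field of memmapfile (uint8)
data = uint8('abc23D&f![');
% A byte is an ASCII letter if it lies in A-Z (65:90) or a-z (97:122)
is_upper = data >= 65 & data <= 90;
is_lower = data >= 97 & data <= 122;
% The codes 91:96 between the two ranges match neither test
ok = nnz(is_upper | is_lower);   % 5 here: a, b, c, D, f
```

Like the histc version, this only recognizes unaccented ASCII letters.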
abc23D&f![
k154&¨&skj
djaljaljds
This is my output:
>> lets('sample.txt')
Elapsed time is 0.017783 seconds.
ans =
19
Edit:
Returning ok=-1 to handle problems reading the file:
function ok = lets(file_name)
try
    file = memmapfile(file_name, 'writable', false);
catch
    % The file could not be mapped (for example, it does not exist)
    file = [];
    ok = -1;
end
if ~isempty(file)
    uppercase = 65:90;    % ASCII codes for A-Z
    lowercase = 97:122;   % ASCII codes for a-z
    data = file.Data;
    ok = sum(histc(data, uppercase) + histc(data, lowercase));
end
end
Using the fopen approach instead, since there you get ok=-1 "by default":
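function ok = lets(file_name)
fid = fopen(file_name, 'rt');
if fid < 0
    ok = -1;
else
    % Read whitespace-separated tokens into a cell array of strings
    celldata = textscan(fid, '%s');
    fclose(fid);
    uppercase = 65:90;    % ASCII codes for A-Z
    lowercase = 97:122;   % ASCII codes for a-z
    % Concatenate all tokens and convert them to character codes
    data = uint8([celldata{1}{:}]);
    ok = sum(histc(data, uppercase) + histc(data, lowercase));
end
end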
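As a possible variation on the fopen approach, the raw bytes can be read with fread, which skips the tokenizing step that textscan performs. This is a hedged sketch rather than part of the answer: the name lets_bytes is hypothetical, and it assumes the letters of interest are plain ASCII.

```matlab
function ok = lets_bytes(file_name)
% Hypothetical variant: read raw bytes with fread instead of textscan.
fid = fopen(file_name, 'r');
if fid < 0
    ok = -1;
    return
end
data = fread(fid, Inf, 'uint8=>uint8');   % column vector of byte values
fclose(fid);
% Count bytes in the ASCII ranges A-Z (65:90) and a-z (97:122)
ok = nnz((data >= 65 & data <= 90) | (data >= 97 & data <= 122));
end
```

Called as lets_bytes('sample.txt'), it should agree with the histc versions on ASCII input.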