Reading text file and dealing with numbers

Question

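I'm trying to count the number of letters in a text file, but I get stuck as soon as numbers are involved.

So far I've been able to handle letters and symbols, but the ischar function doesn't help me when it comes to numbers.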

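function ok = lets(file_name)
fid = fopen(file_name, 'rt');
if fid < 0
    ok = -1;
    return   % stop here: fgets cannot read from an invalid file id
end
C = [];
D = [];
oneline = fgets(fid);

% fgets returns -1 (not a char) at end of file
while ischar(oneline)
    C = oneline(isletter(oneline));   % keep only the letters
    W = length(C);                    % letters on this line
    D = [D ; W];
    oneline = fgets(fid);
end
fclose(fid);
total = 0;
for i = 1:length(D)
    total = D(i) + total;
end
ok = total;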

How do I count the letters when the text file also contains numbers?

Answer 1

I think you're making this more complicated than it needs to be. Just use isletter as you did before, and then length.

function ok = lets(file_name)
% Original code as you had it
fid = fopen(file_name, 'rt');
if fid < 0
    ok = -1;
    return   % nothing to read if the file could not be opened
end
% Initialize length
ok = 0;
% Get first line
oneline = fgets(fid);

% fgets returns -1 at end of file, so loop while we still get text
while ischar(oneline)
    % Remove everything that's not a letter
    oneline(~isletter(oneline)) = [];
    % Add number of letters to output
    ok = ok + length(oneline);
    % Get next line
    oneline = fgets(fid);
end
fclose(fid);
end

I used this input file:

Ar,TF,760,2.5e-07,1273.14,4.785688323049946e+24,24.80738364864047,37272905351.7263,37933372595.0276
Ar,TF,760,5e-07,1273.14,4.785688323049946e+24,40.3092219226107,2791140681.70926,2978668073.513113
Ar,TF,760,7.5e-07,1273.14,4.785688323049946e+24,54.80989010679312,738684259.1671219,836079550.0157251

and got 18. That count includes the e in the numbers written in scientific notation. Do you want those counted?
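
If the e in the numbers should not be counted, one option is to strip numeric tokens before applying isletter. The following is only a minimal sketch under that assumption: the function name countLettersSkippingNumbers and the regular expression are illustrative and not part of the original answer, and the pattern only covers plain decimal and scientific notation.

```matlab
function n = countLettersSkippingNumbers(txt)
% Hypothetical helper (not from the original answer): count the letters in
% a char vector after removing numeric tokens such as 760 or 2.5e-07.

% Optional sign, digits with an optional decimal point, optional exponent
numberPattern = '[-+]?\d*\.?\d+([eE][-+]?\d+)?';

% Delete every numeric token, including the e of an exponent
stripped = regexprep(txt, numberPattern, '');

% Count whatever is left that is a letter
n = nnz(isletter(stripped));
end
```

It could be called as countLettersSkippingNumbers(fileread(file_name)). On the three input lines above, only Ar and TF remain on each line, so the count is 12 instead of 18.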

Answer 2

I solved the problem in the following way:

function ok = lets(file_name)

file    = memmapfile( file_name, 'writable', false );
uppercase = [65:90];    % ASCII codes for A-Z
lowercase = [97:122];   % ASCII codes for a-z
data = file.Data;
ok = sum(histc(data,lowercase)+histc(data,uppercase));

end

I used the memmapfile function to map the file into memory and compared the data with the character codes from the ASCII table. Uppercase letters are represented by [65:90] and lowercase letters by [97:122]. Applying the histc function gives the frequency with which each letter occurs in the file. The total number of letters is the sum of all these frequencies.

Note that I called histc twice to avoid having a bin that runs from 90 to 97, which would also count the characters [\]^_`.

I applied the function to a sample file called sample.txt containing the following lines:

abc23D&f![
k154&¨&skj
djaljaljds

This is my output:

>> lets('sample.txt')
Elapsed time is 0.017783 seconds.

ans =

    19
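
The same two ASCII ranges can also be tested with plain comparisons instead of histc, which the MATLAB documentation lists as not recommended in favor of histcounts. This is a rough sketch rather than the method used in this answer: the name letsByteRanges is made up for illustration, and reading the file with fread instead of memmapfile is an assumption.

```matlab
function ok = letsByteRanges(file_name)
% Hypothetical variant (not from the original answer): count ASCII letters
% by comparing the raw bytes against the two code ranges directly.
fid = fopen(file_name, 'r');
if fid < 0
    ok = -1;          % same error convention as in the question
    return
end
data = fread(fid, Inf, 'uint8=>uint8');   % read the whole file as bytes
fclose(fid);

isUpper = data >= 65 & data <= 90;    % A-Z
isLower = data >= 97 & data <= 122;   % a-z
ok = nnz(isUpper | isLower);          % number of bytes in either range
end
```

Because each byte is tested against both ranges explicitly, there is no bin between 90 and 97 to worry about.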

Edit:

To output ok = -1 when there is a problem reading the file:

function ok = lets(file_name)
try
    file    = memmapfile( file_name, 'writable', false );
catch
    % Mapping failed (for example, the file does not exist)
    file=[];
    ok=-1;
end
if ~isempty(file)
    uppercase = [65:90];    % ASCII codes for A-Z
    lowercase = [97:122];   % ASCII codes for a-z
    data = file.Data;
    ok = sum(histc(data,lowercase)+histc(data,uppercase));
end

end

With the fopen approach, where you get ok = -1 "by default":

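function ok = lets(file_name)
fid = fopen(file_name, 'rt');
if fid < 0
    ok = -1;
else
    % Read whitespace-separated tokens, then close the file
    celldata=textscan(fid,'%s');
    fclose(fid);
    uppercase = [65:90];    % ASCII codes for A-Z
    lowercase = [97:122];   % ASCII codes for a-z
    % Concatenate all tokens into one vector of character codes
    data = uint8([celldata{1}{:}]);
    ok = sum(histc(data,lowercase)+histc(data,uppercase));
end

end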
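
For comparison, the error handling shown here can be combined with the isletter idea from Answer 1 in a few lines. This is only a sketch under stated assumptions: the name letsFileread is hypothetical, and because isletter follows MATLAB's own definition of a letter, its count may differ from the ASCII-range count if the file contains non-ASCII letters.

```matlab
function ok = letsFileread(file_name)
% Hypothetical compact variant (not from the original answers): read the
% whole file as text and count the characters that isletter accepts.
try
    txt = fileread(file_name);   % throws an error if the file cannot be read
catch
    ok = -1;                     % same error convention as in the question
    return
end
ok = nnz(isletter(txt));         % digits and symbols are not letters
end
```

Digits never pass isletter, so numbers in the file need no special handling apart from letters that appear inside them, such as the e in 2.5e-07.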
