Subtle type error

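I'm new to programming, and F# is my first .NET language.

I'm attempting this problem on Rosalind.info. Basically, given a DNA string, I should return four integers counting the number of times the symbols 'A', 'C', 'G', and 'T' occur in the string.

Here is the code I've written so far: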

open System.IO
open System

type DNANucleobases = {A: int; C: int; G: int; T: int}

let initialLetterCount = {A = 0; C = 0; G = 0; T = 0}

let countEachNucleobase (accumulator: DNANucleobases)(dnaString: string) =
    let dnaCharArray = dnaString.ToCharArray()
    dnaCharArray
    |> Array.map (fun eachLetter -> match eachLetter with
                                    | 'A' -> {accumulator with A = accumulator.A + 1}
                                    | 'C' -> {accumulator with C = accumulator.C + 1}
                                    | 'G' -> {accumulator with G = accumulator.G + 1}
                                    | 'T' -> {accumulator with T = accumulator.T + 1}
                                    | _ -> accumulator)

let readDataset (filePath: string) =
    let datasetArray = File.ReadAllLines filePath 
    String.Join("", datasetArray)

let dataset = readDataset @"C:\Users\Unnamed\Desktop\Documents\Throwaway Documents\rosalind_dna.txt"
Seq.fold countEachNucleobase initialLetterCount dataset

However, I get the following error message:

CountingDNANucleotides.fsx(23,10): error FS0001: Type mismatch. Expecting a DNANucleobases -> string -> DNANucleobases but given a DNANucleobases -> string -> DNANucleobases [] The type 'DNANucleobases' does not match the type 'DNANucleobases []'

What went wrong? What changes should I make to fix my error?

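countEachNucleobase returns an array of the accumulator type, rather than just the accumulator it received as its first argument. As a result, Seq.fold cannot find a valid solution for its 'State parameter: it is a plain record on the input, but an array of records on the output. The function used for folding must take the accumulator type as its first input and return that same type as its output.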
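To make that constraint concrete, here is a minimal sketch of the shape Seq.fold expects, kept separate from the DNA code. The names addLength and words are made up for illustration and do not come from the question.

```fsharp
// Seq.fold : ('State -> 'T -> 'State) -> 'State -> seq<'T> -> 'State
// The folder receives the current state and one element,
// and must return a new state of the SAME type as the one it received.
let addLength (total: int) (word: string) : int =
    total + word.Length

// Hypothetical sample data, only for this illustration.
let words = ["GAT"; "TA"; "CA"]

// 'State is int and 'T is string, so the result is an int.
let totalLength = Seq.fold addLength 0 words

printfn "%d" totalLength // prints 7
```

If addLength returned an int [] instead of an int, the compiler would report the same kind of mismatch as in the question.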

Instead of the Array.map in the question's code, you can use Array.fold directly:

let countEachNucleobase (accumulator: DNANucleobases) (dnaString: string) =
    let dnaCharArray = dnaString.ToCharArray()
    dnaCharArray
    |> Array.fold (fun (accumulator : DNANucleobases) eachLetter ->
        match eachLetter with
        | 'A' -> {accumulator with A = accumulator.A + 1}
        | 'C' -> {accumulator with C = accumulator.C + 1}
        | 'G' -> {accumulator with G = accumulator.G + 1}
        | 'T' -> {accumulator with T = accumulator.T + 1}
        | _ -> accumulator) accumulator

The call on the last line then becomes:

countEachNucleobase initialLetterCount dataset

A shorter version:

let readChar accumulator = function
    | 'A' -> {accumulator with A = accumulator.A + 1}
    | 'C' -> {accumulator with C = accumulator.C + 1}
    | 'G' -> {accumulator with G = accumulator.G + 1}
    | 'T' -> {accumulator with T = accumulator.T + 1}
    | _ -> accumulator

let countEachNucleobase acc input = Seq.fold readChar acc input

Since a string is a sequence of characters, input will accept a string as well as a character array or any other sequence of characters.
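As a quick check of that last point, the sketch below repeats the definitions so that it runs on its own, then feeds the same letters in as a string, a char array, and a char list. The sample string "GATTACA" and the names fromString, fromArray, and fromList are assumptions for illustration; the type annotation on accumulator is added only to make the record type explicit.

```fsharp
type DNANucleobases = {A: int; C: int; G: int; T: int}

let initialLetterCount = {A = 0; C = 0; G = 0; T = 0}

// Folder: takes the running counts and one character, returns updated counts.
let readChar (accumulator: DNANucleobases) = function
    | 'A' -> {accumulator with A = accumulator.A + 1}
    | 'C' -> {accumulator with C = accumulator.C + 1}
    | 'G' -> {accumulator with G = accumulator.G + 1}
    | 'T' -> {accumulator with T = accumulator.T + 1}
    | _ -> accumulator

let countEachNucleobase acc input = Seq.fold readChar acc input

// The same letters supplied as three different kinds of seq<char>.
let fromString = countEachNucleobase initialLetterCount "GATTACA"
let fromArray = countEachNucleobase initialLetterCount ("GATTACA".ToCharArray())
let fromList = countEachNucleobase initialLetterCount ['G'; 'A'; 'T'; 'T'; 'A'; 'C'; 'A']

printfn "%d %d %d %d" fromString.A fromString.C fromString.G fromString.T // prints 3 1 1 2
```

All three calls produce the same record, because Seq.fold only needs something it can enumerate character by character.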