Papa Parse: Parsed JSON key doesn't match expected string
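Background: I'm uploading a CSV file and a mapping file to my server using FormData, and parsing the CSV with Papa Parse.
For some reason, the objects Papa Parse outputs (which render correctly with console.log) can't be indexed with ordinary strings. I even tried running JSON.parse(JSON.stringify(...)) on both my string and the object to see whether that would normalize them somehow.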
import Papa from 'papaparse'
import formidable from 'formidable'
import fs from 'fs'
...
const { files, fields } = await parseRequestForm(req)
// Maps SoftLead field names to CSV column names
let parsedMapping: Record<string, string> = JSON.parse(fields.mapping as string)
const f = files.file as formidable.File
const output = await new Promise<{ loadedCount: number; totalCount: number }>(
  (resolve) => {
    const filecontent = fs.createReadStream(f.path)
    filecontent.setEncoding('utf8')
    let loadedCount = 0
    let totalCount = 0
    Papa.parse<Record<string, any>>(filecontent, {
      header: true,
      skipEmptyLines: true,
      dynamicTyping: true,
      chunkSize: 25,
      encoding: 'utf8',
      chunk: async (out) => {
        // Rename CSV columns to SoftLead fields before inserting
        const data = out.data.map((r) => applyMapping(r, parsedMapping))
        totalCount += data.length
        try {
          await prisma.softLead.createMany({ data }).then((x) => {
            loadedCount += x.count
          })
        } catch (e) { }
      },
      complete: () => resolve({ loadedCount, totalCount }),
    })
  }
)
type ParsedForm = {
  error: Error | string
  fields: formidable.Fields
  files: formidable.Files
}
function parseRequestForm(req: NextApiRequest): Promise<ParsedForm> {
  const form = formidable({ encoding: 'utf8' })
  return new Promise((resolve, reject) => {
    form.parse(req, (err, fields, files) => {
      // Return after rejecting so resolve is not also called
      if (err) return reject({ err })
      resolve({ error: err, fields, files })
    })
  })
}
function applyMapping(
  data: Record<string, any>,
  mapping: Record<keyof SoftLead, string>
): Partial<SoftLead> {
  return Object.fromEntries(
    Object.entries(mapping).map(([leadField, csvField]) => {
      // Struggling to access field here
      console.log('Field', `"${csvField}"`)
      console.log('Data', data)
      const parsed = JSON.parse(JSON.stringify(data))
      console.log(Buffer.from(Object.keys(parsed)[0]))
      console.log(Buffer.from(Buffer.from(csvField).toString('utf8')))
      console.log(parsed[csvField]) // undefined
      return [leadField, data[csvField]]
    })
  )
}
The Buffer lines also show that the strings are not identical, even though they print the same on the console.
Papa Parse's key
Buffer.from(Object.keys(parsed)[0])
=> <Buffer ef bb bf 45 6d 61 69 6c 73>
Mapping object key
Buffer.from(Buffer.from(csvField).toString('utf8'))
=> <Buffer 45 6d 61 69 6c 73>
Plain string
Buffer.from(Buffer.from('Emails').toString('utf-8'))
=> <Buffer 45 6d 61 69 6c 73>
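The extra ef bb bf bytes are the UTF-8 encoding of U+FEFF, which ends up as a single invisible character at the start of the decoded key. A minimal sketch of why the lookup fails, using a hypothetical row and a made-up sample value rather than real Papa Parse output:

```typescript
// A header key as it might look when the file starts with a UTF-8 BOM (assumed sample).
const keyFromCsv = '\uFEFFEmails'
const keyFromMapping = 'Emails'

// Hypothetical parsed row keyed by the BOM-prefixed header.
const row: Record<string, unknown> = { [keyFromCsv]: 'a@example.com' }

// The two keys print the same but differ by one leading code unit.
const lengths = [keyFromCsv.length, keyFromMapping.length] // [7, 6]
const firstCodeUnit = keyFromCsv.charCodeAt(0).toString(16) // 'feff'

// Indexing with the plain string misses the BOM-prefixed key.
const lookup = row[keyFromMapping] // undefined
const lookupWithBom = row[keyFromCsv] // 'a@example.com'
```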
Update
- 1: I also tried setting the encoding to utf16le, but then it fails to parse at all, I think because FormData is apparently entirely utf8.
ef bb bf is probably the byte order mark (BOM), which is not allowed in JSON (see JSON Specification and usage of BOM/charset-encoding).
If your string has a BOM, you can try clearing it with Replace("\u00EF\u00BB\u00BF", null) before passing it to json.parse or json.stringify.
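In JavaScript, a BOM that has been decoded as UTF-8 shows up as the single code unit \uFEFF; the three characters \u00EF\u00BB\u00BF appear only if the bytes were decoded as Latin-1. A hedged sketch that covers both cases before calling JSON.parse (the helper name removeLeadingBom and the sample text are made up for illustration):

```typescript
// Hypothetical helper: drop a leading BOM in either decoded form.
function removeLeadingBom(text: string): string {
  // BOM decoded correctly as UTF-8: one code unit, U+FEFF.
  if (text.charCodeAt(0) === 0xfeff) return text.slice(1)
  // BOM bytes mis-decoded as Latin-1: three separate characters.
  if (text.slice(0, 3) === '\u00EF\u00BB\u00BF') return text.slice(3)
  return text
}

// Assumed sample input: JSON text with a leading BOM.
const rawJson = '\uFEFF{"Emails":"a@example.com"}'

// JSON.parse rejects the BOM because U+FEFF is not JSON whitespace.
let rawParseFailed = false
try {
  JSON.parse(rawJson)
} catch (e) {
  rawParseFailed = true
}

// After stripping, the same text parses normally.
const cleaned = JSON.parse(removeLeadingBom(rawJson)) as { Emails: string }
```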
I was able to solve this by stripping the BOM as described here. Simply put,
const parsed = Object.fromEntries(
  Object.entries(data).map(([k, v]) => [stripBom(k), v])
)
export default function stripBom(str: string) {
  // U+FEFF is how the UTF-8 BOM bytes (ef bb bf) appear in a decoded string
  if (str.charCodeAt(0) === 0xfeff) {
    return str.slice(1)
  }
  return str
}
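Another option may be to strip the BOM from header names during parsing instead of rebuilding every row afterwards. The sketch below assumes Papa Parse 5 or later, where the config accepts a transformHeader callback when header is true; the parseConfig name is made up for illustration, and the object is meant to be merged into the options passed to Papa.parse:

```typescript
// Same idea as the stripBom helper above, repeated so this sketch stands alone.
function stripBom(str: string): string {
  return str.charCodeAt(0) === 0xfeff ? str.slice(1) : str
}

// Hypothetical config fragment; assumes Papa Parse's transformHeader option.
const parseConfig = {
  header: true,
  transformHeader: (header: string): string => stripBom(header),
}

// What the callback does to a BOM-prefixed first header and to a normal one.
const headers = ['\uFEFFEmails', 'Name'].map((h) => parseConfig.transformHeader(h))
```

With clean header names, applyMapping could index rows by the plain CSV column names without any per-row key rewriting.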