Compare two filenames and extract the differences
I have two files whose names are almost identical:
/home/104800-001-001/H27VNDSX3_104800-001-001_GCCTATCA-CGACCATT_L002_R1.extracted.fastq.gz
/home/104800-001-001/H27VNDSX3_104800-001-001_GCCTATCA-CGACCATT_L002_R3.extracted.fastq.gz
How can I extract only the characters that differ, in bash?
Expected output:
1 3
Edit:
- The two names are always the same length
- Only differences in _R[0-9] need to be considered
Comparing only the interesting subset
(answers the question as edited)
#!/usr/bin/env bash
s1='/home/104800-001-001/H27VNDSX3_104800-001-001_GCCTATCA-CGACCATT_L002_R1.extracted.fastq.gz'
s2='/home/104800-001-001/H27VNDSX3_104800-001-001_GCCTATCA-CGACCATT_L002_R3.extracted.fastq.gz'
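
# Capture the digits that follow "_R", up to the next "." or "_".
revision_re='_R([[:digit:]]+)[._]'
rev1=; rev2=
[[ $s1 =~ $revision_re ]] && rev1=${BASH_REMATCH[1]}
[[ $s2 =~ $revision_re ]] && rev2=${BASH_REMATCH[1]}

# Print only when both names matched and the captured numbers differ.
if [[ $rev1 && $rev2 ]] && [[ $rev1 != "$rev2" ]]; then
  printf '%s %s\n' "$rev1" "$rev2"
fi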
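The same regex approach can be wrapped in a function so it works on any pair of names. This is only a minimal sketch: the function name `read_number_diff` and the shortened sample names are made up for illustration, and it assumes the first `_R<digits>` marker followed by `.` or `_` is the one of interest.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the original answer): print the _R numbers
# of two filenames, separated by a space, only when both are found and differ.
# Returns non-zero when either name has no match or the numbers are equal.
read_number_diff() {
  local re='_R([[:digit:]]+)[._]' n1='' n2=''
  [[ $1 =~ $re ]] && n1=${BASH_REMATCH[1]}
  [[ $2 =~ $re ]] && n2=${BASH_REMATCH[1]}
  if [[ $n1 && $n2 && $n1 != "$n2" ]]; then
    printf '%s %s\n' "$n1" "$n2"
  else
    return 1
  fi
}

# Shortened, made-up sample names in the same shape as the question's files.
read_number_diff 'sample_L002_R1.extracted.fastq.gz' 'sample_L002_R3.extracted.fastq.gz'
```

Returning a non-zero status in the "no difference" case lets a caller write `if read_number_diff "$a" "$b"; then ...` instead of testing for empty output.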
Comparing the whole string
(answers the question as originally asked)
#!/usr/bin/env bash
s1='/home/104800-001-001/H27VNDSX3_104800-001-001_GCCTATCA-CGACCATT_L002_R1.extracted.fastq.gz'
s2='/home/104800-001-001/H27VNDSX3_104800-001-001_GCCTATCA-CGACCATT_L002_R3.extracted.fastq.gz'
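
# Walk up to the length of the longer string, one character at a time.
max_len=$(( ${#s1} > ${#s2} ? ${#s1} : ${#s2} ))
for (( idx=0; idx<max_len; idx++ )); do
  # Print both characters wherever the two strings disagree.
  if [[ ${s1:idx:1} != "${s2:idx:1}" ]]; then
    printf '%s ' "${s1:idx:1}" "${s2:idx:1}"
  fi
done
printf '\n'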
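The loop above leaves a trailing space before the newline. If that matters, one possible variation is to collect the differing characters in an array and join them at the end. This is a sketch only: the function name `char_diff` and the shortened sample names are hypothetical, and it assumes both strings have the same length, as stated in the question's edit.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the original answer): collect every
# position where the two strings differ, then print the character pairs
# separated by single spaces, with no trailing space.
char_diff() {
  local a=$1 b=$2 idx
  local -a out=()
  local max_len=$(( ${#a} > ${#b} ? ${#a} : ${#b} ))
  for (( idx = 0; idx < max_len; idx++ )); do
    if [[ ${a:idx:1} != "${b:idx:1}" ]]; then
      out+=( "${a:idx:1}" "${b:idx:1}" )
    fi
  done
  # "${out[*]}" joins the elements with the first character of IFS (a space by default).
  printf '%s\n' "${out[*]}"
}

# Shortened, made-up sample names in the same shape as the question's files.
char_diff 'L002_R1.extracted.fastq.gz' 'L002_R3.extracted.fastq.gz'
```

When the strings are identical the function prints an empty line, which matches the behavior of the original loop.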