How to move files into directories based on filename, and count files
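I've seen other threads that are somewhat similar to my question, but I'm still a beginner and can't really follow some of the code posted in them.
I have a list of directories that each follow one of the formats below. The numbers at the end vary, and the abbreviation (UVM here) can be any entry from a list of abbreviations.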
jhu-usc.edu_UVM.HumanMethylation450.Level_1.1.0.0/
jhu-usc.edu_UVM.HumanMethylation450.aux.1.0.0/
jhu-usc.edu_UVM.HumanMethylation450.mage-tab.1.0.0/
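I'd like to write a script that recursively moves each directory and its files from the current directory into a new directory chosen by the abbreviation (e.g. UVM), creating that directory if it doesn't exist.
Then I'd like to count the files ending in *idat in the directory and output a .txt file that says "For <abbreviation> there are <this many> idat files."
I haven't had much time to work on this lately and my deadline is coming up soon. I'd really appreciate it if someone could help me with this.
Please forgive me if the question is badly worded or formatted; this is my first post, so I did my best.
Thanks!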
Something like this?
#!/bin/bash
# Root directory that the folders will be moved under
destination_root="/tmp/stuff"
# Sets $destination from the abbreviation at the start of $1
abbreviation() {
    # Default
    destination="${destination_root}/UNKNOWN"
    if [[ $1 == UVM* ]]; then
        destination="${destination_root}/UVMFOLDER"
    fi
    if [[ $1 == BRCA* ]]; then
        destination="${destination_root}/BRCAFOLDER"
    fi
    # Ensure the folder is present
    mkdir -p "${destination}"
}
# Loop through all folders matching the prefix
for instance in jhu-usc.edu_*
do
    # Count the *idat files and append the result to a.txt
    echo "${instance} - $(find "${instance}" -type f -name "*idat" | wc -l)" | tee -a a.txt
    # Strip the jhu-usc.edu_ prefix so the name starts with the abbreviation
    abbreviation "${instance#jhu-usc.edu_}"
    # Move the folder to the destination
    mv "${instance}" "${destination}"
done
This will create a.txt with content like the following:
jhu-usc.edu_UVM.HumanMethylation450.Level_1.1.0.0 - 3
jhu-usc.edu_UVM.HumanMethylation450.aux.1.0.0 - 2
where 3 and 2 are the number of files ending in idat in each folder.
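The question asked for one line per abbreviation rather than one per directory. Below is a minimal sketch of that variant, not part of the original answer. The function name `count_idat_by_abbreviation` is hypothetical, and it assumes the abbreviation is everything between the `jhu-usc.edu_` prefix and the first dot, and that abbreviations contain no spaces.

```bash
#!/bin/sh
# Hypothetical helper: prints "For <abbr> there are <n> idat files." for every
# abbreviation found among the jhu-usc.edu_* directories under $1 (default: .).
count_idat_by_abbreviation() {
    root="${1:-.}"
    for dir in "$root"/jhu-usc.edu_*/; do
        [ -d "$dir" ] || continue            # skip if the glob matched nothing
        name=$(basename "$dir")
        abbr=${name#jhu-usc.edu_}            # drop the fixed prefix
        abbr=${abbr%%.*}                     # keep the text before the first dot
        count=$(find "$dir" -type f -name '*idat' | wc -l)
        printf '%s %s\n' "$abbr" "$count"    # one "abbr count" line per directory
    done |
    # Sum the per-directory counts for each abbreviation
    awk '{ total[$1] += $2 }
         END { for (a in total) printf "For %s there are %d idat files.\n", a, total[a] }' |
    sort
}
```

It would have to run before the directories are moved, for example as `count_idat_by_abbreviation . > idat_summary.txt`, where the output file name is again only an illustration.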
Edit
Changed the output folder that the files are moved to: I removed the prefix jhu-usc.edu_. So, for example, jhu-usc.edu_UVM.HumanMethylation450.Level_1.1.0.0 will be moved to /tmp/stuff/UVM.HumanMethylation450.Level_1.1.0.0
#!/bin/bash
#Definition of where you want the directory to be moved
destination_root="/data/nrnb01_nobackup/agross/TCGA_methylation"
#Delete tar.gz files
find . -type f -name '*tar.gz' -exec rm {} +
#Move all files within cancersubtype folders into current directory to allow ease of moving to specific directories
find . -mindepth 2 -type f -print -exec mv {} . \;
# Sets $destination from the abbreviation at the start of $1
abbreviation() {
    # Default
    destination="${destination_root}/UNKNOWN"
    if [[ $1 == UVM* ]]; then
        destination="${destination_root}/UVMFOLDER"
    fi
    if [[ $1 == BRCA* ]]; then
        destination="${destination_root}/BRCAFOLDER"
    fi
    if [[ $1 == ACC* ]]; then
        destination="${destination_root}/ACCFOLDER"
    fi
    if [[ $1 == BLCA* ]]; then
        destination="${destination_root}/BLCAFOLDER"
    fi
    if [[ $1 == CESC* ]]; then
        destination="${destination_root}/CESCFOLDER"
    fi
    if [[ $1 == CHOL* ]]; then
        destination="${destination_root}/CHOLFOLDER"
    fi
    if [[ $1 == COAD* ]]; then
        destination="${destination_root}/COADFOLDER"
    fi
    if [[ $1 == DLBC* ]]; then
        destination="${destination_root}/DLBCFOLDER"
    fi
    if [[ $1 == ESCA* ]]; then
        destination="${destination_root}/ESCAFOLDER"
    fi
    if [[ $1 == GBM* ]]; then
        destination="${destination_root}/GBMFOLDER"
    fi
    if [[ $1 == HNSC* ]]; then
        destination="${destination_root}/HNSCFOLDER"
    fi
    if [[ $1 == KICH* ]]; then
        destination="${destination_root}/KICHFOLDER"
    fi
    if [[ $1 == KIRC* ]]; then
        destination="${destination_root}/KIRCFOLDER"
    fi
    if [[ $1 == KIRP* ]]; then
        destination="${destination_root}/KIRPFOLDER"
    fi
    if [[ $1 == LAML* ]]; then
        destination="${destination_root}/LAMLFOLDER"
    fi
    if [[ $1 == LGG* ]]; then
        destination="${destination_root}/LGGFOLDER"
    fi
    if [[ $1 == LIHC* ]]; then
        destination="${destination_root}/LIHCFOLDER"
    fi
    if [[ $1 == LUAD* ]]; then
        destination="${destination_root}/LUADFOLDER"
    fi
    if [[ $1 == LUSC* ]]; then
        destination="${destination_root}/LUSCFOLDER"
    fi
    if [[ $1 == MESO* ]]; then
        destination="${destination_root}/MESOFOLDER"
    fi
    if [[ $1 == OV* ]]; then
        destination="${destination_root}/OVFOLDER"
    fi
    if [[ $1 == PAAD* ]]; then
        destination="${destination_root}/PAADFOLDER"
    fi
    if [[ $1 == PCPG* ]]; then
        destination="${destination_root}/PCPGFOLDER"
    fi
    if [[ $1 == PRAD* ]]; then
        destination="${destination_root}/PRADFOLDER"
    fi
    if [[ $1 == READ* ]]; then
        destination="${destination_root}/READFOLDER"
    fi
    if [[ $1 == SARC* ]]; then
        destination="${destination_root}/SARCFOLDER"
    fi
    if [[ $1 == SKCM* ]]; then
        destination="${destination_root}/SKCMFOLDER"
    fi
    if [[ $1 == STAD* ]]; then
        destination="${destination_root}/STADFOLDER"
    fi
    if [[ $1 == TGCT* ]]; then
        destination="${destination_root}/TGCTFOLDER"
    fi
    if [[ $1 == THCA* ]]; then
        destination="${destination_root}/THCAFOLDER"
    fi
    if [[ $1 == THYM* ]]; then
        destination="${destination_root}/THYMFOLDER"
    fi
    if [[ $1 == UCEC* ]]; then
        destination="${destination_root}/UCECFOLDER"
    fi
    if [[ $1 == UCS* ]]; then
        destination="${destination_root}/UCSFOLDER"
    fi
    # Ensure the folder is present
    mkdir -p "${destination}"
}
# Loop through all folders matching the jhu prefix
for instance in jhu-usc.edu_*
do
    # Count the *idat files and append the result to idat_count.txt
    echo "${instance} - $(find "${instance}" -type f -name "*idat" | wc -l)" | tee -a idat_count.txt
    # Strip the jhu-usc.edu_ prefix so the name starts with the abbreviation
    abbreviation "${instance#jhu-usc.edu_}"
    # Move the folder to the destination
    mv "${instance}" "${destination}"
done
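The long chain of if blocks repeats one pattern: the destination is always the abbreviation followed by FOLDER. A more compact sketch, under stated assumptions, derives the abbreviation from the directory name and checks it against a list. The names `known_abbreviations`, `destination_for` and `move_by_abbreviation` are hypothetical. Unlike the prefix test above (`UVM*`), this version requires the text before the first dot to match a list entry exactly.

```bash
#!/bin/sh
# Hypothetical list of accepted abbreviations, taken from the script above.
known_abbreviations="ACC BLCA BRCA CESC CHOL COAD DLBC ESCA GBM HNSC KICH KIRC KIRP LAML LGG LIHC LUAD LUSC MESO OV PAAD PCPG PRAD READ SARC SKCM STAD TGCT THCA THYM UCEC UCS UVM"

# Prints the destination folder for a directory name.
#   $1: name such as jhu-usc.edu_UVM.HumanMethylation450.aux.1.0.0
#   $2: destination root
destination_for() {
    abbr=${1#jhu-usc.edu_}    # drop the fixed prefix
    abbr=${abbr%%.*}          # keep the text before the first dot
    case " $known_abbreviations " in
        *" $abbr "*) printf '%s/%sFOLDER\n' "$2" "$abbr" ;;
        *)           printf '%s/UNKNOWN\n' "$2" ;;
    esac
}

# Moves every jhu-usc.edu_* directory under $1 into its folder under $2.
move_by_abbreviation() {
    for dir in "$1"/jhu-usc.edu_*/; do
        [ -d "$dir" ] || continue    # skip if the glob matched nothing
        dir=${dir%/}
        dest=$(destination_for "$(basename "$dir")" "$2")
        mkdir -p "$dest" && mv -- "$dir" "$dest"/
    done
}
```

With this shape, supporting a new abbreviation means adding one word to the list instead of another if block. A call such as `move_by_abbreviation . "$destination_root"` would take the place of the move step in the loop; the idat counting would still need to happen before the move.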