Modifying an R script to use command line arguments for use in Snakemake
I wrote this small R script to plot DNA sequence coverage data. It takes all the files in a directory as input.
coverage.files <- list.files("~/coverage_plotting", full.names = TRUE, pattern = ".txt")
coverage.names <- list.files("~/coverage_plotting", full.names = F, pattern = ".txt")
pdf.files <- gsub("txt", "pdf", coverage.files)
plot.colors <- c("red","blue","green","yellow","purple")
# one PDF per coverage file
for(i in 1:length(coverage.names)) {
  coverage <- read.delim(coverage.files[i])
  pdf(pdf.files[i], width = 5, height= 4)
  colnames(coverage) <- c("contig", "position", "coverage")
  contigs <- unique(coverage[,1])
  # empty canvas; the points for each contig are added below
  plot(-100,-100, xlim=c(0,800), ylim=c(0,500000), xlab="Coverage", ylab="Number of basepairs")
  for(j in contigs) {
    contig.cov <- subset(coverage,contig==j)
    cov.hist <- hist(contig.cov$coverage, breaks=seq(0,5000, by = 2), plot=F)
    points(cov.hist$mids, cov.hist$counts, type="p", col=plot.colors[j], pch=19, cex=0.5)
  }
  dev.off()
}
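I now want to include the script in a Snakemake file, so I want to change it to take a single file as input from the command line. I found commandArgs() and tried using it, and I also got rid of the outer loop, since only one file is passed in at a time now. I ended up with something like this: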
coverage.file <- commandArgs()
pdf.file <- gsub("txt","pdf", coverage.file)
plot.colors <- c("red","blue","green","yellow","purple")
coverage <- read.delim(coverage.file)
pdf(pdf.file, width = 5, height= 4)
colnames(coverage) <- c("contig", "position", "coverage")
contigs <- unique(coverage[,1])
plot(-100,-100, xlim=c(0,800), ylim=c(0,500000), xlab="Coverage", ylab="Number of basepairs")
for(j in contigs) {
  contig.cov <- subset(coverage,contig==j)
  cov.hist <- hist(contig.cov$coverage, breaks=seq(0,5000, by = 2), plot=F)
  points(cov.hist$mids, cov.hist$counts, type="p", col=plot.colors[j], pch=19, cex=0.5)
}
dev.off()
When I run it, I get the following error:
Error in file(file, "rt") : cannot open the connection
Calls: read.delim -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
cannot open file 'coverage.file': No such file or directory
Execution halted
Does anyone have suggestions for how I should modify it so that it takes a single input file from the command line?
Thanks
The R documentation for commandArgs() says:
Value
A character vector containing the name of the executable and the user-supplied command line arguments. The first element is the name of the executable by which R was invoked. The exact form of this element is platform dependent: it may be the fully qualified name, or simply the last component (or basename) of the application, or for an embedded R it can be anything the programmer supplied.
If trailingOnly = TRUE, a character vector of those arguments (if any) supplied after --args.
See https://www.rdocumentation.org/packages/base/versions/3.0.3/topics/commandArgs
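To make the quoted description concrete, here is a small sketch that prints both forms side by side. The variable names full.args and user.args are just illustrative, and the exact contents of the full vector depend on how R was started.

```r
# Full vector: executable name first, then R's own options and any user arguments.
full.args <- commandArgs()

# Only the arguments supplied after --args (Rscript inserts --args for you).
user.args <- commandArgs(trailingOnly = TRUE)

cat("commandArgs() has", length(full.args), "element(s)\n")
cat("commandArgs(trailingOnly = TRUE) has", length(user.args), "element(s)\n")
```

Passing the full vector straight to read.delim() hands it the executable name and R's own options as well, not just the file name.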
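So your object coverage.file is a character vector with several elements, not a single file name. You should access the argument by specifying its position in the vector. For example: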
args <- commandArgs(trailingOnly=TRUE)
# access the i'th argument, depending on how you write your shell command in the Snakemake rule, e.g.:
coverage.file <- args[1]
...
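Putting this together, a complete single-file version might look like the sketch below. It makes several assumptions that are not in the original post: the script is saved under a hypothetical name such as plot_coverage.R and called as Rscript plot_coverage.R sample.txt sample.pdf, the second argument (the output PDF) is optional, and the plotting code is wrapped in a hypothetical helper function plot_coverage(). It also picks colours by contig position rather than by contig name, which is a change from the original loop.

```r
# Sketch only: plot per-contig coverage histograms for ONE input file.
# Hypothetical usage: Rscript plot_coverage.R sample.txt sample.pdf

plot_coverage <- function(coverage.file, pdf.file) {
  plot.colors <- c("red", "blue", "green", "yellow", "purple")

  # As in the original script, the first line of the file is read as a header.
  coverage <- read.delim(coverage.file)
  colnames(coverage) <- c("contig", "position", "coverage")
  contigs <- unique(coverage$contig)

  pdf(pdf.file, width = 5, height = 4)
  on.exit(dev.off())  # close the PDF device even if plotting fails

  # Empty canvas; the points for each contig are added in the loop.
  plot(-100, -100, xlim = c(0, 800), ylim = c(0, 500000),
       xlab = "Coverage", ylab = "Number of basepairs")

  for (k in seq_along(contigs)) {
    contig.cov <- subset(coverage, contig == contigs[k])
    cov.hist <- hist(contig.cov$coverage, breaks = seq(0, 5000, by = 2), plot = FALSE)
    # Assumption: recycle the colours when there are more contigs than colours.
    point.col <- plot.colors[(k - 1) %% length(plot.colors) + 1]
    points(cov.hist$mids, cov.hist$counts, type = "p",
           col = point.col, pch = 19, cex = 0.5)
  }
  invisible(pdf.file)
}

args <- commandArgs(trailingOnly = TRUE)
if (length(args) >= 1) {
  coverage.file <- args[1]
  # Use the second argument as the output path if given; otherwise swap the extension.
  pdf.file <- if (length(args) >= 2) {
    args[2]
  } else {
    paste0(tools::file_path_sans_ext(coverage.file), ".pdf")
  }
  plot_coverage(coverage.file, pdf.file)
}
```

Passing the output path explicitly tends to suit Snakemake, because the rule can hand over its declared input and output paths and the script does not have to guess file names.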