How to submit a job (sequence file) to a web server and retrieve the result using curl

I have a FASTA sequence:

>seq1
UUUAAAAUCUGUGUAGCUGUCGCUCGGCUGCAUGCCUAGUGCACCUACGCAGUAUAAA

I want to submit it to the web server at http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF/index.jsp

and then retrieve the result (only the result table) from http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF/showresult.jsp into an output text file.

I tried the following code, which does not work.

Post the job:

curl -X POST -d 'seq1\nUUUAAAAUCUGUGUAGCUGUCGCUCGGCUGCAUGCCUAGUGCACCUACGCAGUAUAAA' http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF/ -H "Content-Type: application/json"

Get the result:

curl -X POST http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF/showresult.jsp/response -H "Content-Type: text/plain";echo

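Can you help? I have 1000 sequences like this, and I need to automate the whole thing from a Linux terminal.

I have attached a Perl script that does not fully work. Any suggestions or edits?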

#!/usr/bin/perl

use LWP::Simple;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;

# Script parameters


# Script hidden parameters
$idCon="12345";

# Sequences source file
# IMPORTANT! use standard fasta file format

$inputFile="file.fa";

# Maximum number of sequences per request
$maxNumOfSequences=1;  

# If you want to skip the N first requests
$skipRequests=0;

# Output files prefix
$outputFile="result_ssf";

# Promoter script URL 
$URL = "http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF/";

# Header and bottom line
$header = "sequenceName; primaryStru; secondStru; Pvalue; Classification\n";

#$URL2 = "http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF/showresult.jsp";

##################################################################################

# The browser

printf "Creating the browser...\n";
$browser = LWP::UserAgent->new();
$browser->timeout(30);

printf "Opening input file...\n";
open(SEQUENCES, "<".$inputFile) or die $!;

printf "Opening output file...\n";
open OUTPUTFILE, ">".$outputFile or die $!;
printf OUTPUTFILE $header;

$sequences = "";
$sequenceName="";
$currentSec=0;
$currentRequest=0;

printf "Sending request...\n";
while(<SEQUENCES>) {

        if ($sequenceName eq "") {
            $sequenceName = $_;
        } else {
            $sequences = $sequences.$sequenceName.$_;
            $currentSec = $currentSec+1;
            $sequenceName = "";
        }

        if ($currentSec == $maxNumOfSequences) {
           $currentRequest=$currentRequest+1;

           if ($currentRequest > $skipRequests ) {
            printf " # Request num. ".$currentRequest."\n";

                my $response = $browser->post($URL, 
                [   "Predict" => $sequences,
                    "uploadFile" => ""
                ], 
                "Content_Type" => "form-data"  );

                if ($response->is_error()) {
                    printf "%s\n", $response->status_line;
                    exit 1;
                }

                $response = $browser->post($URL, ["showresult.jsp"]);


                if ($response->is_error()) {
                    printf "%s\n", $response->status_line;
                    exit 1;
                }

                $contents = $response->content();
                #$contents =~ s/(<BR>\n|<BODY>|<\/BODY>|<HEAD>|<\/HEAD>|<HTML>|<\/HTML>|<META(.*)>|<TITLE>(.*)<\/TITLE>)//ig;
                $contents =~ s/(<BR>\n|<BODY>|<\/BODY>|<HEAD>|<\/HEAD>|<HTML>|<\/HTML>|<META(.*)>|<table>(.*)<\/table>)//ig;

                if ($contents =~ m/$header(.*)\n\n-/s) {
                    print OUTPUTFILE ;
                    print OUTPUTFILE "\n";
                }

            }
            $currentSec = 0;
            $sequences = "";
        }
}



close OUTPUTFILE;
close SEQUENCES;

You need to use -F to send multipart/form-data. Because the testdata parameter holds a multiline string, you will need to store the data in a file before running the command.

You also need to keep the cookies between the two calls, because that is how the server tracks the job (that is, which result to show):

echo -ne ">seq1\nUUUAAAAUCUGUGUAGCUGUCGCUCGGCUGCAUGCCUAGUGCACCUACGCAGUAUAAA" > test.txt

curl -v -c cookie.txt 'http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF/Receive.jsp' \
     -F "testdata=<test.txt" -F "Predict=Predict" -F "uploadFile="

curl -b cookie.txt 'http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF/showresult.jsp'
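Since only the results table is wanted, the HTML returned by showresult.jsp can be filtered before it is saved. The following is a minimal sketch, not a tested parser for this site: it assumes the page contains a single table whose `<table ...>` and `</table>` tags can be found by a line-based scan, and the helper names extract_table and strip_tags are made up for this example.

```shell
#!/usr/bin/env bash

# Print the lines from the first one containing "<table" up to and including
# the first one containing "</table>" (case-insensitive). Reads HTML on stdin.
# Assumption: one results table per page; any text that shares a line with
# the table tags is printed as well.
extract_table() {
    awk '
        !inside && tolower($0) ~ /<table/ { inside = 1 }
        inside { print }
        inside && tolower($0) ~ /<\/table>/ { exit }
    '
}

# Replace every HTML tag with a space, squeeze whitespace, and drop lines
# that end up empty. This is crude: HTML entities are not decoded.
strip_tags() {
    sed -e 's/<[^>]*>/ /g' \
        -e 's/[[:space:]]\{1,\}/ /g' \
        -e 's/^ //' -e 's/ $//' | awk 'NF'
}

# Usage (same cookie file as in the commands above):
#   curl -s -b cookie.txt 'http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF/showresult.jsp' \
#       | extract_table | strip_tags > result.txt
```

If the real page nests tables or puts the whole document on one line, a proper HTML parser would be a safer choice than this line-based filter.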

You can also drop the testdata parameter and use only uploadFile:

echo -ne ">seq1\nUUUAAAAUCUGUGUAGCUGUCGCUCGGCUGCAUGCCUAGUGCACCUACGCAGUAUAAAC" > test.txt

curl -v -c cookie.txt 'http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF/Receive.jsp' \
     -F "Predict=Predict" -F "uploadFile=@test.txt"

curl -b cookie.txt 'http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF/showresult.jsp'
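For the 1000 sequences mentioned in the question, the two calls can be wrapped in a loop. What follows is only a sketch under some assumptions: the result is ready as soon as Receive.jsp returns (as the commands above imply), the server accepts one record per upload, and a one-second pause between jobs is acceptable. The function names, the rec_NNNN.fa file naming, and the pause length are all made up for this example.

```shell
#!/usr/bin/env bash

BASE_URL='http://bioinformatics.hitsz.edu.cn/iMiRNA-SSF'

# Split a multi-FASTA file into one file per record:
# OUTDIR/rec_0001.fa, OUTDIR/rec_0002.fa, ...
split_fasta() {
    local input=$1 outdir=$2
    mkdir -p "$outdir" || return 1
    awk -v dir="$outdir" '
        /^>/ { if (out) close(out); n++; out = sprintf("%s/rec_%04d.fa", dir, n) }
        out  { print > out }
    ' "$input"
}

# Submit one single-record FASTA file and save the raw result page.
# A fresh cookie jar is used per job so that results cannot get mixed up.
submit_one() {
    local fasta=$1 result=$2 jar status=0
    jar=$(mktemp) || return 1
    # Step 1: upload the sequence and store the session cookie.
    curl -sS -f -c "$jar" "$BASE_URL/Receive.jsp" \
         -F "Predict=Predict" -F "uploadFile=@$fasta" -o /dev/null || status=$?
    # Step 2: fetch the result page with the same cookie.
    if [ "$status" -eq 0 ]; then
        curl -sS -f -b "$jar" "$BASE_URL/showresult.jsp" -o "$result" || status=$?
    fi
    rm -f "$jar"
    return "$status"
}

# Split INPUT into WORKDIR, then submit the records one by one.
run_batch() {
    local input=$1 workdir=$2 f
    split_fasta "$input" "$workdir" || return 1
    for f in "$workdir"/rec_*.fa; do
        [ -e "$f" ] || continue
        if ! submit_one "$f" "${f%.fa}.html"; then
            echo "failed: $f" >&2
        fi
        sleep 1   # arbitrary pause to avoid hammering the server
    done
}

# Usage:
#   run_batch file.fa work
```

Each record ends up with its own work/rec_NNNN.html page, so a failed job can be rerun on its own without resubmitting the rest.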