Skipping a term in a string (Ruby)

My code produces the following output. I basically want to skip the first label and print the second one:

<results>

    <status>OK</status>

    <usage>By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>

    <totalTransactions>1</totalTransactions>
    <language>english</language>
    <taxonomy>
        <element>
            <label>/food and drink/desserts and baking</label>
            <score>0.995261</score>
        </element>
        <element>
            <confident>no</confident>
            <label>/food and drink/food/candy and sweets</label>
            <score>0.0748896</score>
        </element>
        <element>
            <confident>no</confident>
            <label>/food and drink/vegan</label>
            <score>0.0267116</score>
        </element>
    </taxonomy>
</results>
<?xml version="1.0" encoding="UTF-8"?>

<results>

    <status>ERROR</status>

    <statusInfo>unsupported-text-language</statusInfo>

    <usage>By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>

    <totalTransactions>1</totalTransactions>
    <language>spanish</language>
</results>
science/weather<?xml version="1.0" encoding="UTF-8"?>

<results>

    <status>OK</status>

    <usage>By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>

    <totalTransactions>1</totalTransactions>
    <language>english</language>
    <taxonomy>
        <element>
            <label>/science/weather</label>
            <score>1</score>
        </element>
        <element>
            <confident>no</confident>
            <label>/shopping/toys/dolls</label>
            <score>2.22317e-05</score>
        </element>
        <element>
            <confident>no</confident>
            <label>/shopping/toys/puppets</label>
            <score>2.22317e-05</score>
        </element>
    </taxonomy>
</results>
style and fashion/clothing/shirts<?xml version="1.0" encoding="UTF-8"?>

<results>

So I would end up with "/shopping/toys/puppets". Does anyone know how I can ignore the first label to get the second one? Thanks!

This is my code so far:

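 require 'open-uri'   # provides URI.open for fetching URLs

 file = 'C:\Users\USERNAME\Desktop\cloudsight.txt'
 f = File.open(file, "r")
 f.each_line {|line|

 # Pull the search term out of each line (the text after name"=>")
 tstart = 'name"=>"'
 tstop = '"'
 term = line[/#{tstart}(.*?)#{tstop}/m, 1]

 url = 'http://access.alchemyapi.com/calls'
 service = '/text/TextGetRankedTaxonomy'
 apikey = '?apikey= ENTER API KEY'
 thething = '&text='
 # URI::encode was removed in Ruby 3.0; encode_www_form_component is the stdlib replacement
 termencoded = URI.encode_www_form_component(term.to_s)
 fullurl = url + service + apikey + thething + termencoded

 sleep 1

 # Kernel#open no longer accepts URLs in Ruby 3.0+, so call URI.open explicitly
 opener = URI.open(fullurl, 'Accept-Encoding' => '') {|io| io.read }
 # This regex only ever matches the FIRST <label> in the response
 lstart = '<label>/'
 lstop = '</label>'
 label = opener[/#{lstart}(.*?)#{lstop}/m, 1]
 print label
 }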
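For what it's worth, `String#[]` with a regex only ever returns the first match, which is why the first label keeps coming back. A minimal sketch of one way around that, assuming the response body is already in a string: `String#scan` collects every match, and you can then index into the result. The `xml` sample below is copied from the output above (trimmed to the taxonomy part); the variable names are only illustrative.

```ruby
# Sample response body, copied from the output shown in the question (trimmed).
xml = <<~XML
  <taxonomy>
      <element>
          <label>/science/weather</label>
          <score>1</score>
      </element>
      <element>
          <confident>no</confident>
          <label>/shopping/toys/dolls</label>
          <score>2.22317e-05</score>
      </element>
      <element>
          <confident>no</confident>
          <label>/shopping/toys/puppets</label>
          <score>2.22317e-05</score>
      </element>
  </taxonomy>
XML

# scan returns every match; with one capture group each match is a
# one-element array, so flatten gives a plain list of label strings.
labels = xml.scan(%r{<label>/(.*?)</label>}m).flatten

second_label = labels[1]    # nil if the response has fewer than two labels
last_label   = labels.last  # nil if the response has no labels at all

puts second_label  # shopping/toys/dolls
puts last_label    # shopping/toys/puppets
```

Because `labels[1]` and `labels.last` return `nil` rather than raising when a label is missing, this also copes with the ERROR responses, which have no taxonomy at all.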

I solved it another way. I manually printed all the strings I needed:

Alchemy

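 require 'open-uri'   # provides URI.open for fetching URLs

 file = 'C:\Users\USERNAME\Desktop\cloudsight.txt'
 f = File.open(file, "r")
 f.each_line {|line|

 # Pull the search term out of each line (the text after name"=>")
 tstart = 'name"=>"'
 tstop = '"'
 term = line[/#{tstart}(.*?)#{tstop}/m, 1]

 url = 'http://access.alchemyapi.com/calls'
 service = '/text/TextGetRankedTaxonomy'
 apikey = '?apikey= ENTER API KEY'
 thething = '&text='
 # URI::encode was removed in Ruby 3.0; encode_www_form_component is the stdlib replacement
 termencoded = URI.encode_www_form_component(term.to_s)
 fullurl = url + service + apikey + thething + termencoded

 sleep 1

 # Kernel#open no longer accepts URLs in Ruby 3.0+, so call URI.open explicitly
 opener = URI.open(fullurl, 'Accept-Encoding' => '') {|io| io.read }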
  
 lstart = '<label>/'
 lstop = '</label>'
 label = opener[/#{lstart}(.*?)#{lstop}/m, 1]
 print label

 print ","


 cstart = '<score>'
 cstop = '</score>'
 confidence = opener[/#{cstart}(.*?)#{cstop}/m, 1]  
 print confidence

 lstart2 = '</label>'
 lstop2 = '</taxonomy>'
 label2 = opener[/#{lstart2}(.*?)#{lstop2}/m, 1]
 label22 = label2.to_s
 
 print ","

 lstart22 = '<label>/'
 lstop22 = '</label>'
 label23 = label22[/#{lstart22}(.*?)#{lstop22}/m, 1]
 print label23

 cstart22 = '</score>'
 cstop22 = '</taxonomy>'
 # to_s avoids a NoMethodError on nil when the response has no <taxonomy> (e.g. status ERROR)
 confidence22 = opener[/#{cstart22}(.*?)#{cstop22}/m, 1].to_s

 print ","

 cstart23 = '<score>'
 cstop23 = '</score>'
 confidence23 = confidence22[/#{cstart23}(.*?)#{cstop23}/m, 1] 
 print confidence23

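 # Everything after the first </element>, i.e. the second and third elements
 lstart30 = '</element>'
 lstop30 = '</taxonomy>'
 label30 = opener[/#{lstart30}(.*?)#{lstop30}/m, 1].to_s

 # Text between the next two </element> tags, i.e. the third element (to_s guards against nil)
 lstart31 = '</element>'
 lstop31 = '</element>'
 label31 = label30[/#{lstart31}(.*?)#{lstop31}/m, 1].to_s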
 

 lstart32 = '<label>/'
 lstop32 = '</label>'
 label32 = label31[/#{lstart32}(.*?)#{lstop32}/m, 1]
 print ","   # separator between the second score and the third label
 print label32

 print ","

 cstart32 = '<score>'
 cstop32 = '</score>'
 confidence32 = label31[/#{cstart32}(.*?)#{cstop32}/m, 1]
 print confidence32
 }
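The manual approach above repeats the same start/stop pattern once per element. As a rough sketch of a shorter alternative (not what I originally ran), the same label,score output can be produced by scanning for each `<element>` block and pulling the label and score out of it. `taxonomy_pairs` is a hypothetical helper name, and the sample strings stand in for the real API response.

```ruby
# Hypothetical helper: returns [label, score] pairs for every <element>
# block found in the response text. Returns [] when there are none.
def taxonomy_pairs(xml)
  xml.to_s.scan(%r{<element>(.*?)</element>}m).flatten.map do |element|
    label = element[%r{<label>/(.*?)</label>}m, 1]
    score = element[%r{<score>(.*?)</score>}m, 1]
    [label, score]
  end
end

# Sample bodies based on the output shown in the question (trimmed).
ok_response = <<~XML
  <taxonomy>
      <element>
          <label>/food and drink/desserts and baking</label>
          <score>0.995261</score>
      </element>
      <element>
          <confident>no</confident>
          <label>/food and drink/food/candy and sweets</label>
          <score>0.0748896</score>
      </element>
      <element>
          <confident>no</confident>
          <label>/food and drink/vegan</label>
          <score>0.0267116</score>
      </element>
  </taxonomy>
XML
error_response = "<results><status>ERROR</status></results>"

# One comma-separated line per response: label,score,label,score,...
puts taxonomy_pairs(ok_response).flatten.join(",")
puts taxonomy_pairs(error_response).flatten.join(",")  # empty line, no exception
```

This handles any number of elements and does not raise on responses without a taxonomy. For anything beyond this simple response shape, a real XML parser would be more robust than regexes.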