Skipping a term in a string (Ruby)
Given the following output from my code, I basically want to skip the first label and print the second label:
<results>
<status>OK</status>
<usage>By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>
<totalTransactions>1</totalTransactions>
<language>english</language>
<taxonomy>
<element>
<label>/food and drink/desserts and baking</label>
<score>0.995261</score>
</element>
<element>
<confident>no</confident>
<label>/food and drink/food/candy and sweets</label>
<score>0.0748896</score>
</element>
<element>
<confident>no</confident>
<label>/food and drink/vegan</label>
<score>0.0267116</score>
</element>
</taxonomy>
</results>
<?xml version="1.0" encoding="UTF-8"?>
<results>
<status>ERROR</status>
<statusInfo>unsupported-text-language</statusInfo>
<usage>By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>
<totalTransactions>1</totalTransactions>
<language>spanish</language>
</results>
science/weather<?xml version="1.0" encoding="UTF-8"?>
<results>
<status>OK</status>
<usage>By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>
<totalTransactions>1</totalTransactions>
<language>english</language>
<taxonomy>
<element>
<label>/science/weather</label>
<score>1</score>
</element>
<element>
<confident>no</confident>
<label>/shopping/toys/dolls</label>
<score>2.22317e-05</score>
</element>
<element>
<confident>no</confident>
<label>/shopping/toys/puppets</label>
<score>2.22317e-05</score>
</element>
</taxonomy>
</results>
style and fashion/clothing/shirts<?xml version="1.0" encoding="UTF-8"?>
<results>
So I would end up with "/shopping/toys/puppets". Does anyone know how I can ignore the first label to get the second one? Thanks!
Here is my code so far:
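require 'open-uri'  # provides URI.open; plain open(url) no longer fetches URLs on Ruby 3.0+

file = 'C:\Users\USERNAME\Desktop\cloudsight.txt'
f = File.open(file, "r")
f.each_line {|line|
  # Pull the search term out of a line containing name"=>"..."
  tstart = 'name"=>"'
  tstop = '"'
  term = line[/#{tstart}(.*?)#{tstop}/m, 1]
  url = 'http://access.alchemyapi.com/calls'
  service = '/text/TextGetRankedTaxonomy'
  apikey = '?apikey= ENTER API KEY'
  thething = '&text='
  # term.to_s guards against lines with no match (nil).
  # URI::encode was removed in Ruby 3.0, so use encode_www_form_component.
  termencoded = URI.encode_www_form_component(term.to_s)
  fullurl = url + service + apikey + thething + termencoded
  sleep 1
  opener = URI.open(fullurl, 'Accept-Encoding' => '') {|res| res.read }
  # This only ever returns the first <label> in the response
  lstart = '<label>/'
  lstop = '</label>'
  label = opener[/#{lstart}(.*?)#{lstop}/m, 1]
  print label
}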
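For what it's worth, one way to reach the later labels is String#scan, which returns every match instead of only the first, so a label can then be picked by index. The following is only a rough sketch: the XML is trimmed from the output shown above, and the variable names (`xml`, `labels`, `second_label`, `last_label`) are illustrative, not part of the original code.

```ruby
# Sample response body, trimmed from the output shown in the question.
xml = <<~XML
  <taxonomy>
  <element>
  <label>/science/weather</label>
  <score>1</score>
  </element>
  <element>
  <confident>no</confident>
  <label>/shopping/toys/dolls</label>
  <score>2.22317e-05</score>
  </element>
  <element>
  <confident>no</confident>
  <label>/shopping/toys/puppets</label>
  <score>2.22317e-05</score>
  </element>
  </taxonomy>
XML

# scan returns one array per match (one entry per capture group),
# so flatten turns [["a"], ["b"]] into ["a", "b"].
labels = xml.scan(%r{<label>/(.*?)</label>}m).flatten

second_label = labels[1]  # nil when there are fewer than two labels
last_label = labels.last  # nil when there are no labels at all

puts second_label
puts last_label
```

Because the pattern consumes the leading slash (as the original regex does), the values come back without it. An ERROR response has no labels, so both variables would be nil there.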
I solved it another way. I manually printed all the strings I needed:
require 'open-uri'  # provides URI.open; plain open(url) no longer fetches URLs on Ruby 3.0+

file = 'C:\Users\USERNAME\Desktop\cloudsight.txt'
f = File.open(file, "r")
f.each_line {|line|
  # Pull the search term out of a line containing name"=>"..."
  tstart = 'name"=>"'
  tstop = '"'
  term = line[/#{tstart}(.*?)#{tstop}/m, 1]
  url = 'http://access.alchemyapi.com/calls'
  service = '/text/TextGetRankedTaxonomy'
  apikey = '?apikey= ENTER API KEY'
  thething = '&text='
  # term.to_s guards against lines with no match (nil).
  # URI::encode was removed in Ruby 3.0, so use encode_www_form_component.
  termencoded = URI.encode_www_form_component(term.to_s)
  fullurl = url + service + apikey + thething + termencoded
  sleep 1
  opener = URI.open(fullurl, 'Accept-Encoding' => '') {|res| res.read }

  # First element: label and score
  lstart = '<label>/'
  lstop = '</label>'
  label = opener[/#{lstart}(.*?)#{lstop}/m, 1]
  print label
  print ","
  cstart = '<score>'
  cstop = '</score>'
  confidence = opener[/#{cstart}(.*?)#{cstop}/m, 1]
  print confidence

  # Second element: take everything after the first </label>,
  # then match the next label inside that remainder
  lstart2 = '</label>'
  lstop2 = '</taxonomy>'
  label2 = opener[/#{lstart2}(.*?)#{lstop2}/m, 1]
  label22 = label2.to_s
  print ","
  lstart22 = '<label>/'
  lstop22 = '</label>'
  label23 = label22[/#{lstart22}(.*?)#{lstop22}/m, 1]
  print label23
  # Same trick for the second score (.to_s avoids a NoMethodError on ERROR responses)
  cstart22 = '</score>'
  cstop22 = '</taxonomy>'
  confidence22 = opener[/#{cstart22}(.*?)#{cstop22}/m, 1].to_s
  print ","
  cstart23 = '<score>'
  cstop23 = '</score>'
  confidence23 = confidence22[/#{cstart23}(.*?)#{cstop23}/m, 1]
  print confidence23

  # Third element: skip past the first </element>, then isolate the text
  # between the second and third </element>
  lstart30 = '</element>'
  lstop30 = '</taxonomy>'
  label30 = opener[/#{lstart30}(.*?)#{lstop30}/m, 1].to_s
  lstart31 = '</element>'
  lstop31 = '</element>'
  label31 = label30[/#{lstart31}(.*?)#{lstop31}/m, 1].to_s
  lstart32 = '<label>/'
  lstop32 = '</label>'
  label32 = label31[/#{lstart32}(.*?)#{lstop32}/m, 1]
  print label32
  print ","
  cstart32 = '<score>'
  cstop32 = '</score>'
  confidence32 = label31[/#{cstart32}(.*?)#{cstop32}/m, 1]
  print confidence32
}
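The same comma-separated row can probably be produced with far less repetition by scanning each element once and reading its label and score from there. This is a minimal sketch under stated assumptions, not the original solution: `taxonomy_pairs` is a hypothetical helper name, and the sample XML is copied from the first response shown in the question rather than fetched from the API.

```ruby
# Hypothetical helper (not from the original post): returns [label, score]
# pairs, one per <element>, in document order.
def taxonomy_pairs(xml)
  xml.scan(%r{<element>(.*?)</element>}m).flatten.map do |element|
    label = element[%r{<label>/(.*?)</label>}m, 1]
    score = element[%r{<score>(.*?)</score>}m, 1]
    [label, score]
  end
end

# Sample response body, copied from the first response in the question.
sample = <<~XML
  <results>
  <status>OK</status>
  <taxonomy>
  <element>
  <label>/food and drink/desserts and baking</label>
  <score>0.995261</score>
  </element>
  <element>
  <confident>no</confident>
  <label>/food and drink/food/candy and sweets</label>
  <score>0.0748896</score>
  </element>
  <element>
  <confident>no</confident>
  <label>/food and drink/vegan</label>
  <score>0.0267116</score>
  </element>
  </taxonomy>
  </results>
XML

pairs = taxonomy_pairs(sample)

# One comma-separated row: label,score,label,score,...
puts pairs.flatten.join(",")
```

Inside the loop this would be called as `taxonomy_pairs(opener)`. A response with no elements (such as the ERROR one) yields an empty array, so nothing needs a nil check, and `pairs[1]` gives the second label and score directly.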