使用 R 抓取数据
Webscraping the data using R
目标: 我正在尝试从网页 http://www.nepalstock.com/datanepse/previous.php 中抓取所有公司的历史每日股价。以下代码有效;但是,它始终只生成最近(2015 年 2 月 5 日)的每日股票价格。换句话说,无论我输入的日期如何,输出都是一样的。如果您能在这方面提供帮助,我将不胜感激。
library(RHTMLForms)
library(RCurl)
library(XML)
url <- "http://www.nepalstock.com/datanepse/previous.php"
forms <- getHTMLFormDescription(url)
# we are interested in the second list with date forms
# forms[[2]]
# HTML Form: http://www.nepalstock.com/datanepse/
# Date: [ ]
get_stock<-createFunction(forms[[2]])
#create sequence of dates from start to end and store it as a list
date_daily<-as.list(seq(as.Date("2011-08-24"), as.Date("2011-08-30"), "days"))
# determine the number of elements in the list
num<-length(date_daily)
daily_1<-lapply(date_daily,function(x){
show(x) #displays the particular date
readHTMLTable(htmlParse(get_stock(Date = x)), which = 7)
})
#18 tables out of which 7 is one what we desired
# change the colnames
col_name<-c("SN","Traded_Companies","No_of_Transactions","Max_Price","Min_Price","Closing_Price","Total_Share","Amount","Previous_Closing","Difference_Rs.")
daily_2<-lapply(daily_1,setNames,nm=col_name)
Output:
> head(daily_2[[1]],5)
SN Traded_Companies No_of_Transactions Max_Price Min_Price Closing_Price Total_Share Amount
1 1 Agricultural Development Bank Ltd 24 489 471 473 2,868 1,359,038
2 2 Arun Valley Hydropower Development Company Limited 40 365 360 362 8,844 3,199,605
3 3 Alpine Development Bank Limited 11 297 295 295 150 44,350
4 4 Asian Life Insurance Co. Limited 10 1,230 1,215 1,225 898 1,098,452
5 5 Apex Development Bank Ltd. 23 131 125 131 6,033 769,893
Previous_Closing Difference_Rs.
1 480 -7
2 363 -1
3 303 -8
4 1,242 -17
5 132 -1
> tail(daily_2[[1]],5)
SN Traded_Companies No_of_Transactions Max_Price Min_Price Closing_Price Total_Share Amount Previous_Closing
140 140 United Finance Ltd 4 255 242 242 464 115,128 255
141 141 United Insurance Co.(Nepal)Ltd. 3 905 905 905 234 211,770 915
142 142 Vibor Bikas Bank Limited 7 158 152 156 710 109,510 161
143 143 Western Development Bank Limited 35 320 311 313 7,631 2,402,497 318
144 144 Yeti Development Bank Limited 22 139 132 139 14,355 1,921,511 134
Difference_Rs.
140 -13
141 -10
142 -5
143 -5
144 5
这是一种快速方法。请注意,该站点使用 POST 请求将日期发送到服务器。
library(rvest)
library(httr)
page <- "http://www.nepalstock.com/datanepse/previous.php" %>%
POST(body = list(Date = "2015-02-01")) %>%
html()
page %>%
html_node(".dataTable") %>%
html_table(header = TRUE)
目标: 我正在尝试从网页 http://www.nepalstock.com/datanepse/previous.php 中抓取所有公司的历史每日股价。以下代码有效;但是,它始终只生成最近(2015 年 2 月 5 日)的每日股票价格。换句话说,无论我输入的日期如何,输出都是一样的。如果您能在这方面提供帮助,我将不胜感激。
library(RHTMLForms)
library(RCurl)
library(XML)
url <- "http://www.nepalstock.com/datanepse/previous.php"
forms <- getHTMLFormDescription(url)
# we are interested in the second list with date forms
# forms[[2]]
# HTML Form: http://www.nepalstock.com/datanepse/
# Date: [ ]
get_stock<-createFunction(forms[[2]])
#create sequence of dates from start to end and store it as a list
date_daily<-as.list(seq(as.Date("2011-08-24"), as.Date("2011-08-30"), "days"))
# determine the number of elements in the list
num<-length(date_daily)
daily_1<-lapply(date_daily,function(x){
show(x) #displays the particular date
readHTMLTable(htmlParse(get_stock(Date = x)), which = 7)
})
#18 tables out of which 7 is one what we desired
# change the colnames
col_name<-c("SN","Traded_Companies","No_of_Transactions","Max_Price","Min_Price","Closing_Price","Total_Share","Amount","Previous_Closing","Difference_Rs.")
daily_2<-lapply(daily_1,setNames,nm=col_name)
Output:
> head(daily_2[[1]],5)
SN Traded_Companies No_of_Transactions Max_Price Min_Price Closing_Price Total_Share Amount
1 1 Agricultural Development Bank Ltd 24 489 471 473 2,868 1,359,038
2 2 Arun Valley Hydropower Development Company Limited 40 365 360 362 8,844 3,199,605
3 3 Alpine Development Bank Limited 11 297 295 295 150 44,350
4 4 Asian Life Insurance Co. Limited 10 1,230 1,215 1,225 898 1,098,452
5 5 Apex Development Bank Ltd. 23 131 125 131 6,033 769,893
Previous_Closing Difference_Rs.
1 480 -7
2 363 -1
3 303 -8
4 1,242 -17
5 132 -1
> tail(daily_2[[1]],5)
SN Traded_Companies No_of_Transactions Max_Price Min_Price Closing_Price Total_Share Amount Previous_Closing
140 140 United Finance Ltd 4 255 242 242 464 115,128 255
141 141 United Insurance Co.(Nepal)Ltd. 3 905 905 905 234 211,770 915
142 142 Vibor Bikas Bank Limited 7 158 152 156 710 109,510 161
143 143 Western Development Bank Limited 35 320 311 313 7,631 2,402,497 318
144 144 Yeti Development Bank Limited 22 139 132 139 14,355 1,921,511 134
Difference_Rs.
140 -13
141 -10
142 -5
143 -5
144 5
这是一种快速方法。请注意,该站点使用 POST 请求将日期发送到服务器。
library(rvest)
library(httr)
page <- "http://www.nepalstock.com/datanepse/previous.php" %>%
POST(body = list(Date = "2015-02-01")) %>%
html()
page %>%
html_node(".dataTable") %>%
html_table(header = TRUE)