Download an Excel file from a URL and read it with `read_xlsx`
I am trying to download a particularly messy .xlsx file from a URL into a local directory and then read it with `read_xlsx`.
# Download file into directory
my_url <- 'https://docs.google.com/spreadsheets/d/0Bw4a10rhk2QqaTZkUmQwaXU4aEE/edit?resourcekey=0-RQa9gRpFX0x3z5bSJGn0Dg#gid=1944035140'
download.file(url=my_url, destfile='./dat/df.xlsx')
# Load file
df <- read_xlsx('./dat/df.xlsx')
The last line throws the following error:
Error: Evaluation error: zip file '/Users/... some path .../dat/df.xlsx' cannot be opened.
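I believe this is because `download.file()` somehow corrupts the format. Several other questions have addressed this, but the solution suggested there (`mode='wb'`) did not help.
Can you help me download the file without corrupting its format, so that I can read it with `read_xlsx` afterwards?
As an additional requirement, I would like to use as few external dependencies as possible (which is why I tried `download.file()`).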
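One way to test the suspicion that the saved file is not a real workbook is to look at its first bytes. A valid .xlsx file is a ZIP archive, and a ZIP archive normally starts with the bytes `50 4b 03 04` ("PK" followed by 0x03 0x04). The sketch below uses only base R; the helper name `looks_like_xlsx` is hypothetical and not part of readxl or any other package.

```r
# Return TRUE if the file at `path` starts with the ZIP signature
# 50 4b 03 04, which a valid .xlsx file is expected to have.
# `looks_like_xlsx` is a hypothetical helper name used for illustration.
looks_like_xlsx <- function(path) {
  if (!file.exists(path)) return(FALSE)
  header <- readBin(path, what = "raw", n = 4)
  identical(header, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Demonstration with two throwaway files
html_file <- tempfile(fileext = ".xlsx")
writeLines("<!DOCTYPE html><html></html>", html_file)
looks_like_xlsx(html_file)  # FALSE: the content is HTML text

zip_like <- tempfile(fileext = ".xlsx")
writeBin(as.raw(c(0x50, 0x4b, 0x03, 0x04, 0x00)), zip_like)
looks_like_xlsx(zip_like)   # TRUE: the file starts with the ZIP signature
```

If the check returns FALSE for `./dat/df.xlsx`, opening the file in a text editor should show what the server actually sent, for example an HTML page. The check only inspects the signature, so TRUE does not guarantee that the workbook is readable.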
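Indeed, the link takes you to the Google Docs editing page, which is not a downloadable file. You cannot download the file this way. Just save it to your hard drive manually. However, I wrote a function that reads the data from the file once it is on disk. Maybe it will be useful to you.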
library(tidyverse)
library(readxl)

# Note: this is an /edit link, so download.file() may save the HTML page
# instead of the workbook; if so, save the file manually as xlsFile.
urlFile = "https://docs.google.com/spreadsheets/d/1SF0PkBz9BR4yqiQ27Bt5OsD33Y8Rt5lh/edit?usp=sharing&ouid=107152468748636733235&rtpof=true&sd=true"
xlsFile = "refugios_nayarit.xlsx"
download.file(url=urlFile, destfile=xlsFile, mode="wb")

# Read one sheet: skip the first 6 rows, supply the column names,
# and drop the last row.
fReadXls = function(xlsFile, sheet) {
  data = read_excel(
    xlsFile, sheet = sheet, skip = 6,
    col_names = c("No.", "REFUGIO", "MUNICIPIO", "DIRECCIÓN", "USO DEL INMUEBLE",
                  "SERVICIOS", "CAPACIDAD DE PERSONAS", "COORD. LATITUD L",
                  "COORD. LATITUD W", "COORD. ALTITUD MSNM", "RESPONSABLE",
                  "TELÉFONO"))
  data %>% slice_head(n=nrow(.)-1)
}

# One row per sheet, with that sheet's data stored in a list-column
df = tibble(sheet = excel_sheets(xlsFile)) %>%
  mutate(data = map(sheet, ~fReadXls(xlsFile, .x)))
df$data[[1]]
Output
# A tibble: 20 x 12
No. REFUGIO MUNICIPIO DIRECCIÓN `USO DEL INMUEB~ SERVICIOS `CAPACIDAD DE PE~ `COORD. LATITUD~ `COORD. LATITUD~
<dbl> <chr> <chr> <chr> <chr> <chr> <dbl> <chr> <chr>
1 1 PRIMARIA LAB~ ACAPONETA LOPEZ RAYON EDUCACION AGUA, SANITA~ 200 "22°29'56.06\"" "105°21'37.27\""
2 2 JARDIN DE NI~ ACAPONETA ALDAMA ESQ C~ EDUCACION AGUA, SANITA~ 100 "22°29'53.14\"" "105°21'29.48\""
3 3 PRIMARIA CAR~ ACAPONETA E. CARRANZA EDUCACION AGUA, SANITA~ 200 "22o30'00.43\"" "105°21'37.46\""
4 4 PRIMARIA LAZ~ ACAPONETA AMADO NERVO EDUCACION AGUA, SANITA~ 100 "22°29'27.17\"" "105°21'39.68"
5 5 PRIMARIA H. ~ ACAPONETA VERACRUZ No.~ EDUCACION AGUA, SANITA~ 150 "22o29'40.21\"" "105a21'40.23\""
6 6 PRIMARIA MIG~ ACAPONETA MORELOS Y OA~ EDUCACION AGUA, SANITA~ 200 "22o29'23.26\"" "105a21'41.99\""
7 7 PRIMARIA CEN~ ACAPONETA MATAMOROS Y ~ EDUCACION AGUA, SANITA~ 250 "22o29'37.31\"" "105a21'33.33\""
8 8 SINDICATO CTM ACAPONETA QUERETARO Y ~ GREMIO SINDICAL AGUA, SANITA~ 100 "22°29'39.32\"" "105°21'46.60"
9 9 ESTADIO MUNI~ ACAPONETA JUAN ESCUTIA DEPORTE AGUA, SANITA~ 300 "22a29'55.10\"" "105a21'52.29\""
10 10 CASA DE LA C~ ACAPONETA MORELOS CULTURAL AGUA, SANITA~ 300 "22a29'20.78\"" "105a21'46.46\""
11 11 CENTRO RECRE~ ACAPONETA México ENTRE~ RECREATIVO AGUA, SANITA~ 400 "22a29'39.76\"" "105a21'37.87\""
12 12 CENTRO RECRE~ ACAPONETA VERACRUZ No15 RECREATIVO AGUA, SANITA~ 400 "22a29'30.03\"" "105a21'39.47\""
13 13 IGLESIA CRIS~ ACAPONETA VERACRUZ No ~ RELIGIOSO AGUA, SANITA~ 50 "22a29'41.60\"" "105a21'40.35\""
14 14 ESCUELA FRAY~ AHUACATLAN 20 DE NOVIEM~ EDUCACION AGUA, SANITA~ 80 "21a03'06.07\"" "104a29'03.50\""
15 15 SECUNDARIA F~ AHUACATLAN 20 DE NOVIEM~ EDUCACION AGUA, SANITA~ 250 "21a03'18.33\"" "104a28'56.26\""
16 16 ESCUELA JOSE~ AHUACATLAN MORELOS Y MA~ EDUCACION AGUA, SANITA~ 200 "21a03'04.55\"" "104a29'12.67\""
17 17 ESCUELA PREP~ AHUACATLAN CALLE EL SAL~ EDUCACION AGUA, SANITA~ 200 "21a02'57.01\"" "104a29'16.71\""
18 18 ESCUELA PLAN~ AHUACATLAN OAXACA E HID~ EDUCACION AGUA, SANITA~ 200 "21a03'02.43\"" "104a28'58.82\""
19 19 UNIDAD ACADE~ AHUACATLAN CARR A GUADA~ EDUCACION AGUA, SANITA~ 200 "21a03'28.20\"" "104a29'06.67\""
20 20 CLUB SOCIAL ~ AHUACATLAN 20 DE NOVIEM~ DEPORTE AGUA, SANITA~ 400 "21a03'07.37\"" "104a29'01\"57\~
# ... with 3 more variables: COORD. ALTITUD MSNM <dbl>, RESPONSABLE <chr>, TELÉFONO <chr>
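The two coordinate columns in the output above are character strings, and the degree symbol appears as `°`, `o`, or `a` depending on the row. As a rough sketch of one possible clean-up step, the base R function below extracts the three numbers from each string and converts them to decimal degrees. The name `dms_to_decimal` is hypothetical, and the function assumes each value holds degrees, minutes, and seconds in that order; anything else becomes `NA`. It does not apply a sign for western longitudes.

```r
# Convert degree-minute-second strings such as "22o30'00.43\"" to decimal
# degrees. `dms_to_decimal` is a hypothetical helper for illustration.
# Assumption: each value contains exactly three numbers (degrees, minutes,
# seconds); the separator characters between them are ignored.
dms_to_decimal <- function(x) {
  # Extract every run of digits with an optional decimal part
  parts <- regmatches(x, gregexpr("[0-9]+(\\.[0-9]+)?", x))
  vapply(parts, function(p) {
    if (length(p) != 3) return(NA_real_)  # unexpected shape: leave as NA
    p <- as.numeric(p)
    p[1] + p[2] / 60 + p[3] / 3600
  }, numeric(1))
}

# Values copied from the output above, with different degree markers
dms_to_decimal(c("22o30'00.43\"", "105a21'40.23\""))
```

Values that do not match the expected shape, such as the truncated last entry in row 20, come back as `NA` so they can be inspected by hand.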