R for loop help to trim based on condition

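I'm new to R and struggling with a for loop: I want to split some strings in a df based on a condition. My df:

I want to split wherever a value begins with "X". To identify those values I'm using grepl("X.",df1[,1]), and to split them, str_split_fixed(df1[,1],"X",2)[,2]. I'm not sure how to combine these in a loop...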

for (i in df1[,1]){
  # if (begins with X) then split
}

So the goal here is to strip the "X" from df rows 11 and 12.

Thanks in advance!

R is a vectorized language, so you can simply replace the leading "X" with "" in a single line:

df1[,1] <- sub("^X", "", df1[,1])
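
To see why the `^` anchor matters, here is a minimal sketch on a small made-up vector (`x` and its values are illustrative, not taken from the question's data). The question's pattern `"X."` matches an X anywhere in the string as long as some character follows it, whereas `"^X"` matches only a leading X.

```r
# Illustrative values only; not from the question's data.
x <- c("X2B", "X3B", "AXE", "MAX", "HR")

# Unanchored: TRUE for any string containing an X followed by another character.
grepl("X.", x)
# Anchored: TRUE only where the string begins with X.
grepl("^X", x)

# sub() replaces only the first match; with "^X" that can only be a leading X.
anchored <- sub("^X", "", x)
# Without the anchor, the first X anywhere is removed, which also alters "AXE" and "MAX".
unanchored <- sub("X", "", x)
```

With the anchored pattern, values such as "AXE" and "MAX" are left untouched, which is presumably what you want for column names.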

Using a for loop in this case would be very inefficient, but if you insist on it, then:

for (i in seq_along(df1[,1])) {
  if (substr(df1[i,1],1,1) == "X")
    df1[i,1] <- substring(df1[i,1],2)
}
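
If the "only touch rows that begin with X" logic of the loop is what you want to keep visible, one way to express it without a loop is logical indexing. This is a sketch, assuming a small stand-in data frame (`df_demo` is a hypothetical name) whose first column is character rather than factor; `startsWith()` is available in base R from version 3.3.0.

```r
# Hypothetical stand-in for df1; the first column must be character, not factor.
df_demo <- data.frame(header1 = c("H", "X2B", "X3B", "HR"),
                      stringsAsFactors = FALSE)

# Logical vector: TRUE where the value begins with "X".
has_x <- startsWith(df_demo[, 1], "X")

# Drop the first character only in the flagged rows.
df_demo[has_x, 1] <- substring(df_demo[has_x, 1], 2)

df_demo$header1
```

This performs the same test and the same `substring()` call as the loop, but on all rows at once.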

Data

df1 <- structure(list(header1 = c("PLAYERID", "YEARID", "STINT", "TEAMID", 
"LGID", "G", "G_BATTING", "AB", "R", "H", "X2B", "X3B", "HR", 
"RBI", "SB", "CS", "BB", "SO", "IBB", "HBP", "SH", "SF", "GIDP", 
"G_OLD")), class = "data.frame", row.names = c(NA, -24L))

for loop

for (i in seq_along(df$header1)) {
  df$header1[i] <- sub("^X", "", df$header1[i])
}
df
      header1
1    DGIGACAE
2  DFGGFAEHBD
3        BIBH
4          EB
5      DHBDFC
6         2BD
7        3GDE
8     DEAEFGE
9           I
10    FFBGDBD

Better, vectorized

df$header1 <- sub("^X","",df$header1)
df
      header1
1    DGIGACAE
2  DFGGFAEHBD
3        BIBH
4          EB
5      DHBDFC
6         2BD
7        3GDE
8     DEAEFGE
9           I
10    FFBGDBD

Data

df <- structure(list(header1 = c("DGIGACAE", "DFGGFAEHBD", "BIBH", 
"EB", "DHBDFC", "X2BD", "X3GDE", "DEAEFGE", "I", "FFBGDBD")), row.names = c(NA, 
-10L), class = "data.frame")

stringr solution

I know you wanted to use a loop, and the other 2 answers gave you that, plus a vectorized solution. If you are using the tidyverse, you can also do this with str_replace:

df1 <- structure(list(header1 = c("PLAYERID", "YEARID", "STINT", "TEAMID", 
                                  "LGID", "G", "G_BATTING", "AB", "R", "H", "X2B", "X3B", "HR", 
                                  "RBI", "SB", "CS", "BB", "SO", "IBB", "HBP", "SH", "SF", "GIDP", 
                                  "G_OLD")), class = "data.frame", row.names = c(NA, -24L))

df1

library(tidyverse)

df1 %>% mutate(header1 = str_replace(header1, "^X", ""))

Output:

     header1
1   PLAYERID
2     YEARID
3      STINT
4     TEAMID
5       LGID
6          G
7  G_BATTING
8         AB
9          R
10         H
11        2B
12        3B
13        HR
14       RBI
15        SB
16        CS
17        BB
18        SO
19       IBB
20       HBP
21        SH
22        SF
23      GIDP
24     G_OLD
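
Note that the `mutate()` call above only prints the result; to keep it, you would assign it back to `df1`. As a sketch, assuming the stringr package is installed, `str_remove()` does the same job as `str_replace()` with an empty replacement (`df_demo` is a hypothetical stand-in for `df1`):

```r
library(stringr)

# Hypothetical stand-in for df1.
df_demo <- data.frame(header1 = c("H", "X2B", "X3B", "HR"),
                      stringsAsFactors = FALSE)

# str_remove(string, pattern) removes the first match of the pattern;
# the result is assigned back so the change is kept.
df_demo$header1 <- str_remove(df_demo$header1, "^X")

df_demo$header1
```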

So... you have lots of options. I think the other answers address your direct question better, though.