How to read a CSV file for text mining

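I am going to use tm for text mining. However, my CSV file is odd. Below is the dput output I get after using the read.table function in R. There are three columns: lie, sentiment, and review. However, the fourth column contains the reviews and has no column name. I am new to R and text mining. If I use read.csv, it gives me an error. Please suggest a better way to read the CSV file.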

Update:

  > dput(head(df))

structure(list(V1 = c("lie,sentiment,review", "f,n,'Mike\'s", 
"f,n,'i", "f,n,'After", "f,n,'Olive", "f,n,'I"), V2 = c("", "Pizza", 
"really", "I", "Oil", "went"), V3 = c("", "High", "like", "went", 
"Garden", "to"), V4 = c("", "Point,", "this", "shopping", "was", 
"the"), V5 = c("", "NY", "buffet", "with", "very", "Chilis"), 
    V6 = c("", "Service", "restaurant", "some", "disappointing.", 
    "on"), V7 = c("", "was", "in", "of", "I", "Erie"), V8 = c("", 
    "very", "Marshall", "my", "expect", "Blvd"), V9 = c("", "slow", 
    "street.", "friend,", "good", "and"), V10 = c("", "and", 
    "they", "we", "food", "had"), V11 = c("", "the", "have", 
    "went", "and", "the"), V12 = c("", "quality", "a", "to", 
    "good", "worst"), V13 = c("", "was", "lot", "DODO", "service", 
    "meal"), V14 = c("", "low.", "of", "restaurant", "(at", "of"
    ), V15 = c("", "You", "selection", "for", "least!!)", "my"
    ), V16 = c("", "would", "of", "dinner.", "when", "life."), 
    V17 = c("", "think", "american,", "I", "I", "We"), V18 = c("", 
    "they", "japanese,", "found", "go", "arrived"), V19 = c("", 
    "would", "and", "worm", "out", "and"), V20 = c("", "know", 
    "chinese", "in", "to", "waited"), V21 = c("", "at", "dishes.", 
    "one", "eat.", "5"), V22 = c("", "least", "we", "of", "The", 
    "minutes"), V23 = c("", "how", "also", "the", "meal", "for"
    ), V24 = c("", "to", "got", "dishes", "was", "a"), V25 = c("", 
    "make", "a", ".'", "cold", "hostess,"), V26 = c("", "good", 
    "free", "", "when", "and"), V27 = c("", "pizza,", "drink", 
    "", "we", "then"), V28 = c("", "not.", "and", "", "got", 
    "were"), V29 = c("", "Stick", "free", "", "it,", "seated"
    ), V30 = c("", "to", "refill.", "", "and", "by"), V31 = c("", 
    "pre-made", "there", "", "the", "a"), V32 = c("", "dishes", 
    "are", "", "waitor", "waiter"), V33 = c("", "like", "also", 
    "", "had", "who"), V34 = c("", "stuffed", "different", "", 
    "no", "was"), V35 = c("", "pasta", "kinds", "", "manners", 
    "obviously"), V36 = c("", "or", "of", "", "whatsoever.", 
    "in"), V37 = c("", "a", "dessert.", "", "Don\'t", "a"), 
    V38 = c("", "salad.", "the", "", "go", "terrible"), V39 = c("", 
    "You", "staff", "", "to", "mood."), V40 = c("", "should", 
    "is", "", "the", "We"), V41 = c("", "consider", "very", "", 
    "Olive", "order"), V42 = c("", "dining", "friendly.", "", 
    "Oil", "drinks"), V43 = c("", "else", "it", "", "Garden.", 
    "and"), V44 = c("", "where.'", "is", "", "\nf,n,", "it"), 
    V45 = c("", "", "also", "", "The", "took"), V46 = c("", "", 
    "quite", "", "Seven", "them"), V47 = c("", "", "cheap", "", 
    "Heaven", "15"), V48 = c("", "", "compared", "", "restaurant", 
    "minutes"), V49 = c("", "", "with", "", "was", "to"), V50 = c("", 
    "", "the", "", "never", "bring"), V51 = c("", "", "other", 
    "", "known", "us"), V52 = c("", "", "restaurant", "", "for", 
    "both"), V53 = c("", "", "in", "", "a", "the"), V54 = c("", 
    "", "syracuse", "", "superior", "wrong"), V55 = c("", "", 
    "area.", "", "service", "beers"), V56 = c("", "", "i", "", 
    "but", "which"), V57 = c("", "", "will", "", "what", "were"
    ), V58 = c("", "", "definitely", "", "we", "barely"), V59 = c("", 
    "", "coming", "", "experienced", "cold."), V60 = c("", "", 
    "back", "", "last", "Then"), V61 = c("", "", "here.'", "", 
    "week", "we"), V62 = c("", "", "", "", "was", "order"), V63 = c("", 
    "", "", "", "a", "an"), V64 = c("", "", "", "", "disaster.", 
    "appetizer"), V65 = c("", "", "", "", "The", "and"), V66 = c("", 
    "", "", "", "waiter", "wait"), V67 = c("", "", "", "", "would", 
    "25"), V68 = c("", "", "", "", "not", "minutes"), V69 = c("", 
    "", "", "", "notice", "for"), V70 = c("", "", "", "", "us", 
    "cold"), V71 = c("", "", "", "", "until", "southwest"), V72 = c("", 
    "", "", "", "we", "egg"), V73 = c("", "", "", "", "asked", 
    "rolls,"), V74 = c("", "", "", "", "him", "at"), V75 = c("", 
    "", "", "", "4", "which"), V76 = c("", "", "", "", "times", 
    "point"), V77 = c("", "", "", "", "to", "we"), V78 = c("", 
    "", "", "", "bring", "just"), V79 = c("", "", "", "", "us", 
    "paid"), V80 = c("", "", "", "", "the", "and"), V81 = c("", 
    "", "", "", "menu.", "left."), V82 = c("", "", "", "", "The", 
    "Don\'t"), V83 = c("", "", "", "", "food", "go.'"), V84 = c("", 
    "", "", "", "was", ""), V85 = c("", "", "", "", "not", ""
    ), V86 = c("", "", "", "", "exceptional", ""), V87 = c("", 
    "", "", "", "either.", ""), V88 = c("", "", "", "", "It", 
    ""), V89 = c("", "", "", "", "took", ""), V90 = c("", "", 
    "", "", "them", ""), V91 = c("", "", "", "", "though", ""
    ), V92 = c("", "", "", "", "2", ""), V93 = c("", "", "", 
    "", "minutes", ""), V94 = c("", "", "", "", "to", ""), V95 = c("", 
    "", "", "", "bring", ""), V96 = c("", "", "", "", "us", ""
    ), V97 = c("", "", "", "", "a", ""), V98 = c("", "", "", 
    "", "check", ""), V99 = c("", "", "", "", "after", ""), V100 = c("", 
    "", "", "", "they", ""), V101 = c("", "", "", "", "spotted", 
    ""), V102 = c("", "", "", "", "we", ""), V103 = c("", "", 
    "", "", "finished", ""), V104 = c("", "", "", "", "eating", 
    ""), V105 = c("", "", "", "", "and", ""), V106 = c("", "", 
    "", "", "are", ""), V107 = c("", "", "", "", "not", ""), 
    V108 = c("", "", "", "", "ordering", ""), V109 = c("", "", 
    "", "", "more.", ""), V110 = c("", "", "", "", "Well,", ""
    ), V111 = c("", "", "", "", "never", ""), V112 = c("", "", 
    "", "", "more.", ""), V113 = c("", "", "", "", "\nf,n,", 
    ""), V114 = c("", "", "", "", "I", ""), V115 = c("", "", 
    "", "", "went", ""), V116 = c("", "", "", "", "to", ""), 
    V117 = c("", "", "", "", "XYZ", ""), V118 = c("", "", "", 
    "", "restaurant", ""), V119 = c("", "", "", "", "and", ""
    ), V120 = c("", "", "", "", "had", ""), V121 = c("", "", 
    "", "", "a", ""), V122 = c("", "", "", "", "terrible", ""
    ), V123 = c("", "", "", "", "experience.", ""), V124 = c("", 
    "", "", "", "I", ""), V125 = c("", "", "", "", "had", ""), 
    V126 = c("", "", "", "", "a", ""), V127 = c("", "", "", "", 
    "YELP", ""), V128 = c("", "", "", "", "Free", ""), V129 = c("", 
    "", "", "", "Appetizer", ""), V130 = c("", "", "", "", "coupon", 
    ""), V131 = c("", "", "", "", "which", ""), V132 = c("", 
    "", "", "", "could", ""), V133 = c("", "", "", "", "be", 
    ""), V134 = c("", "", "", "", "applied", ""), V135 = c("", 
    "", "", "", "upon", ""), V136 = c("", "", "", "", "checking", 
    ""), V137 = c("", "", "", "", "in", ""), V138 = c("", "", 
    "", "", "to", ""), V139 = c("", "", "", "", "the", ""), V140 = c("", 
    "", "", "", "restaurant.", ""), V141 = c("", "", "", "", 
    "The", ""), V142 = c("", "", "", "", "person", ""), V143 = c("", 
    "", "", "", "serving", ""), V144 = c("", "", "", "", "us", 
    ""), V145 = c("", "", "", "", "was", ""), V146 = c("", "", 
    "", "", "very", ""), V147 = c("", "", "", "", "rude", ""), 
    V148 = c("", "", "", "", "and", ""), V149 = c("", "", "", 
    "", "didn\'t", ""), V150 = c("", "", "", "", "acknowledge", 
    ""), V151 = c("", "", "", "", "the", ""), V152 = c("", "", 
    "", "", "coupon.", ""), V153 = c("", "", "", "", "When", 
    ""), V154 = c("", "", "", "", "I", ""), V155 = c("", "", 
    "", "", "asked", ""), V156 = c("", "", "", "", "her", ""), 
    V157 = c("", "", "", "", "about", ""), V158 = c("", "", "", 
    "", "it,", ""), V159 = c("", "", "", "", "she", ""), V160 = c("", 
    "", "", "", "rudely", ""), V161 = c("", "", "", "", "replied", 
    ""), V162 = c("", "", "", "", "back", ""), V163 = c("", "", 
    "", "", "saying", ""), V164 = c("", "", "", "", "she", ""
    ), V165 = c("", "", "", "", "had", ""), V166 = c("", "", 
    "", "", "already", ""), V167 = c("", "", "", "", "applied", 
    ""), V168 = c("", "", "", "", "it.", ""), V169 = c("", "", 
    "", "", "Then", ""), V170 = c("", "", "", "", "I", ""), V171 = c("", 
    "", "", "", "inquired", ""), V172 = c("", "", "", "", "about", 
    ""), V173 = c("", "", "", "", "the", ""), V174 = c("", "", 
    "", "", "free", ""), V175 = c("", "", "", "", "salad", ""
    ), V176 = c("", "", "", "", "that", ""), V177 = c("", "", 
    "", "", "they", ""), V178 = c("", "", "", "", "serve.", ""
    ), V179 = c("", "", "", "", "She", ""), V180 = c("", "", 
    "", "", "rudely", ""), V181 = c("", "", "", "", "said", ""
    ), V182 = c("", "", "", "", "that", ""), V183 = c("", "", 
    "", "", "you", ""), V184 = c("", "", "", "", "have", ""), 
    V185 = c("", "", "", "", "to", ""), V186 = c("", "", "", 
    "", "order", ""), V187 = c("", "", "", "", "the", ""), V188 = c("", 
    "", "", "", "main", ""), V189 = c("", "", "", "", "course", 
    ""), V190 = c("", "", "", "", "to", ""), V191 = c("", "", 
    "", "", "get", ""), V192 = c("", "", "", "", "that.", ""), 
    V193 = c("", "", "", "", "Overall,", ""), V194 = c("", "", 
    "", "", "I", ""), V195 = c("", "", "", "", "had", ""), V196 = c("", 
    "", "", "", "a", ""), V197 = c("", "", "", "", "bad", ""), 
    V198 = c("", "", "", "", "experience", ""), V199 = c("", 
    "", "", "", "as", ""), V200 = c("", "", "", "", "I", ""), 
    V201 = c("", "", "", "", "had", ""), V202 = c("", "", "", 
    "", "taken", ""), V203 = c("", "", "", "", "my", ""), V204 = c("", 
    "", "", "", "family", ""), V205 = c("", "", "", "", "to", 
    ""), V206 = c("", "", "", "", "that", ""), V207 = c("", "", 
    "", "", "restaurant", ""), V208 = c("", "", "", "", "for", 
    ""), V209 = c("", "", "", "", "the", ""), V210 = c("", "", 
    "", "", "first", ""), V211 = c("", "", "", "", "time", ""
    ), V212 = c("", "", "", "", "and", ""), V213 = c("", "", 
    "", "", "I", ""), V214 = c("", "", "", "", "had", ""), V215 = c("", 
    "", "", "", "high", ""), V216 = c("", "", "", "", "hopes", 
    ""), V217 = c("", "", "", "", "from", ""), V218 = c("", "", 
    "", "", "the", ""), V219 = c("", "", "", "", "restaurant", 
    ""), V220 = c("", "", "", "", "which", ""), V221 = c("", 
    "", "", "", "is,", ""), V222 = c("", "", "", "", "otherwise,", 
    ""), V223 = c("", "", "", "", "my", ""), V224 = c("", "", 
    "", "", "favorite", ""), V225 = c("", "", "", "", "place", 
    ""), V226 = c("", "", "", "", "to", ""), V227 = c("", "", 
    "", "", "dine.", ""), V228 = c("", "", "", "", "\nf,n,", 
    ""), V229 = c("", "", "", "", "I", ""), V230 = c("", "", 
    "", "", "went", ""), V231 = c("", "", "", "", "to", ""), 
    V232 = c("", "", "", "", "ABC", ""), V233 = c("", "", "", 
    "", "restaurant", ""), V234 = c("", "", "", "", "two", ""
    ), V235 = c("", "", "", "", "days", ""), V236 = c("", "", 
    "", "", "ago", ""), V237 = c("", "", "", "", "and", ""), 
    V238 = c("", "", "", "", "I", ""), V239 = c("", "", "", "", 
    "hated", ""), V240 = c("", "", "", "", "the", ""), V241 = c("", 
    "", "", "", "food", ""), V242 = c("", "", "", "", "and", 
    ""), V243 = c("", "", "", "", "the", ""), V244 = c("", "", 
    "", "", "service.", ""), V245 = c("", "", "", "", "We", ""
    ), V246 = c("", "", "", "", "were", ""), V247 = c("", "", 
    "", "", "kept", ""), V248 = c("", "", "", "", "waiting", 
    ""), V249 = c("", "", "", "", "for", ""), V250 = c("", "", 
    "", "", "over", ""), V251 = c("", "", "", "", "an", ""), 
    V252 = c("", "", "", "", "hour", ""), V253 = c("", "", "", 
    "", "just", ""), V254 = c("", "", "", "", "to", ""), V255 = c("", 
    "", "", "", "get", ""), V256 = c("", "", "", "", "seated", 
    ""), V257 = c("", "", "", "", "and", ""), V258 = c("", "", 
    "", "", "once", ""), V259 = c("", "", "", "", "we", ""), 
    V260 = c("", "", "", "", "ordered,", ""), V261 = c("", "", 
    "", "", "our", ""), V262 = c("", "", "", "", "food", ""), 
    V263 = c("", "", "", "", "came", ""), V264 = c("", "", "", 
    "", "out", ""), V265 = c("", "", "", "", "cold.", ""), V266 = c("", 
    "", "", "", "I", ""), V267 = c("", "", "", "", "ordered", 
    ""), V268 = c("", "", "", "", "the", ""), V269 = c("", "", 
    "", "", "pasta", ""), V270 = c("", "", "", "", "and", ""), 
    V271 = c("", "", "", "", "it", ""), V272 = c("", "", "", 
    "", "was", ""), V273 = c("", "", "", "", "terrible", ""), 
    V274 = c("", "", "", "", "-", ""), V275 = c("", "", "", "", 
    "completely", ""), V276 = c("", "", "", "", "bland", ""), 
    V277 = c("", "", "", "", "and", ""), V278 = c("", "", "", 
    "", "very", ""), V279 = c("", "", "", "", "unappatizing.", 
    ""), V280 = c("", "", "", "", "I", ""), V281 = c("", "", 
    "", "", "definitely", ""), V282 = c("", "", "", "", "would", 
    ""), V283 = c("", "", "", "", "not", ""), V284 = c("", "", 
    "", "", "recommend", ""), V285 = c("", "", "", "", "going", 
    ""), V286 = c("", "", "", "", "there,", ""), V287 = c("", 
    "", "", "", "especially", ""), V288 = c("", "", "", "", "if", 
    ""), V289 = c("", "", "", "", "you\'re", ""), V290 = c("", 
    "", "", "", "in", ""), V291 = c("", "", "", "", "a", ""), 
    V292 = c("", "", "", "", "hurry!'", "")), .Names = c("V1", 
"V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10", "V11", 
"V12", "V13", "V14", "V15", "V16", "V17", "V18", "V19", "V20", 
"V21", "V22", "V23", "V24", "V25", "V26", "V27", "V28", "V29", 
"V30", "V31", "V32", "V33", "V34", "V35", "V36", "V37", "V38", 
"V39", "V40", "V41", "V42", "V43", "V44", "V45", "V46", "V47", 
"V48", "V49", "V50", "V51", "V52", "V53", "V54", "V55", "V56", 
"V57", "V58", "V59", "V60", "V61", "V62", "V63", "V64", "V65", 
"V66", "V67", "V68", "V69", "V70", "V71", "V72", "V73", "V74", 
"V75", "V76", "V77", "V78", "V79", "V80", "V81", "V82", "V83", 
"V84", "V85", "V86", "V87", "V88", "V89", "V90", "V91", "V92", 
"V93", "V94", "V95", "V96", "V97", "V98", "V99", "V100", "V101", 
"V102", "V103", "V104", "V105", "V106", "V107", "V108", "V109", 
"V110", "V111", "V112", "V113", "V114", "V115", "V116", "V117", 
"V118", "V119", "V120", "V121", "V122", "V123", "V124", "V125", 
"V126", "V127", "V128", "V129", "V130", "V131", "V132", "V133", 
"V134", "V135", "V136", "V137", "V138", "V139", "V140", "V141", 
"V142", "V143", "V144", "V145", "V146", "V147", "V148", "V149", 
"V150", "V151", "V152", "V153", "V154", "V155", "V156", "V157", 
"V158", "V159", "V160", "V161", "V162", "V163", "V164", "V165", 
"V166", "V167", "V168", "V169", "V170", "V171", "V172", "V173", 
"V174", "V175", "V176", "V177", "V178", "V179", "V180", "V181", 
"V182", "V183", "V184", "V185", "V186", "V187", "V188", "V189", 
"V190", "V191", "V192", "V193", "V194", "V195", "V196", "V197", 
"V198", "V199", "V200", "V201", "V202", "V203", "V204", "V205", 
"V206", "V207", "V208", "V209", "V210", "V211", "V212", "V213", 
"V214", "V215", "V216", "V217", "V218", "V219", "V220", "V221", 
"V222", "V223", "V224", "V225", "V226", "V227", "V228", "V229", 
"V230", "V231", "V232", "V233", "V234", "V235", "V236", "V237", 
"V238", "V239", "V240", "V241", "V242", "V243", "V244", "V245", 
"V246", "V247", "V248", "V249", "V250", "V251", "V252", "V253", 
"V254", "V255", "V256", "V257", "V258", "V259", "V260", "V261", 
"V262", "V263", "V264", "V265", "V266", "V267", "V268", "V269", 
"V270", "V271", "V272", "V273", "V274", "V275", "V276", "V277", 
"V278", "V279", "V280", "V281", "V282", "V283", "V284", "V285", 
"V286", "V287", "V288", "V289", "V290", "V291", "V292"), row.names = c(NA, 
6L), class = "data.frame")

Dataset:

lie sentiment   review                                                                                  
f   n   'Mike\'s Pizza High Point    NY Service was very slow and the quality was low. You would think they would know at least how to make good pizza   not. Stick to pre-made dishes like stuffed pasta or a salad. You should consider dining else where.'                                                                           
f   n   'i really like this buffet restaurant in Marshall street. they have a lot of selection of american   japanese    and chinese dishes. we also got a free drink and free refill. there are also different kinds of dessert. the staff is very friendly. it is also quite cheap compared with the other restaurant in syracuse area. i will definitely coming back here.'                                                                          
f   n   'After I went shopping with some of my friend    we went to DODO restaurant for dinner. I found worm in one of the dishes .'                                                                                
f   n   'Olive Oil Garden was very disappointing. I expect good food and good service (at least!!) when I go out to eat. The meal was cold when we got it    and the waitor had no manners whatsoever. Don\'t go to the Olive Oil Garden. '                                                                             
f   n   'The Seven Heaven restaurant was never known for a superior service but what we experienced last week was a disaster. The waiter would not notice us until we asked him 4 times to bring us the menu. The food was not exceptional either. It took them though 2 minutes to bring us a check after they spotted we finished eating and are not ordering more. Well   never more. '                                                                              
f   n   'I went to XYZ restaurant and had a terrible experience. I had a YELP Free Appetizer coupon which could be applied upon checking in to the restaurant. The person serving us was very rude and didn\'t acknowledge the coupon. When I asked her about it     she rudely replied back saying she had already applied it. Then I inquired about the free salad that they serve. She rudely said that you have to order the main course to get that. Overall    I had a bad experience as I had taken my family to that restaurant for the first time and I had high hopes from the restaurant which is     otherwise   my favorite place to dine. '                                                                   
f   n   'I went to ABC restaurant two days ago and I hated the food and the service. We were kept waiting for over an hour just to get seated and once we ordered    our food came out cold. I ordered the pasta and it was terrible - completely bland and very unappatizing. I definitely would not recommend going there  especially if you\'re in a hurry!'                                                                         
f   n   'I went to the Chilis on Erie Blvd and had the worst meal of my life. We arrived and waited 5 minutes for a hostess  and then were seated by a waiter who was obviously in a terrible mood. We order drinks and it took them 15 minutes to bring us both the wrong beers which were barely cold. Then we order an appetizer and wait 25 minutes for cold southwest egg rolls     at which point we just paid and left. Don\'t go.'                                                                          
f   n   'OMG. This restaurant is horrible. The receptionist did not greet us     we just stood there and waited for five minutes. The food came late and served not warm. Me and my pet ordered a bowl of salad and a cheese pizza. The salad was not fresh  the crust of a pizza was so hard like plastics. My dog didn\'t even eat that pizza. I hate this place!!!!!!!!!!'                                       

Thanks in advance,

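For this particular text file, you need to look at the quote argument. In read.table(), the default quote argument treats both the single quote and the double quote as quote characters. Here you need to make it just the single quote: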

df <- read.table("filename", header = TRUE, quote = "\'")

str(df)
# 'data.frame': 9 obs. of  3 variables:
#  $ lie      : Factor w/ 1 level "f": 1 1 1 1 1 1 1 1 1
#  $ sentiment: Factor w/ 1 level "n": 1 1 1 1 1 1 1 1 1
#  $ review   : Factor w/ 9 levels "After I went shopping with some of my friend    we went to DODO restaurant for dinner. I found worm in one of the dishes .",..: 6 2 1 7 9 5 3 4 8

This should work for you.
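To see what the quote argument changes, here is a minimal sketch in base R. It assumes a comma-separated layout (as the dput output in the question suggests) and uses two made-up rows abbreviated from the question's reviews; the names sample_lines and count_with are only for illustration. count.fields() reports how many fields each line is split into under a given quote setting.

```r
# Abbreviated sample rows in the shape of the question's file (illustrative only;
# the comma separator is an assumption based on the dput output).
sample_lines <- c(
  "lie,sentiment,review",
  "f,n,'Service was very slow, and the quality was low.'",
  "f,n,'After I went shopping, we went to DODO restaurant for dinner.'"
)

# Hypothetical helper: count the fields on each line for a given quote setting.
count_with <- function(quote_chars) {
  con <- textConnection(sample_lines)
  on.exit(close(con))
  count.fields(con, sep = ",", quote = quote_chars)
}

# Only the double quote acts as a quote character (the read.csv() default),
# so commas inside the reviews are counted as separators.
fields_default <- count_with("\"")

# The single quote acts as the quote character, so each review stays one field.
fields_single <- count_with("'")

print(fields_default)
print(fields_single)

# Reading with quote = "'" keeps each review in a single column.
reviews <- read.csv(text = sample_lines, quote = "'", stringsAsFactors = FALSE)
str(reviews)
```

If the counts differ from line to line under a given quote setting, that setting will not give a clean three-column result for the file.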

I recommend reading the help file for read.table() from start to finish. There are a lot of things to consider.

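I don't know why you removed the file from the original post, @是的老板, but this answer is based on that file, not on your dput output. The file basically has two problems that prevent you from reading it: (1) your quote character is ' rather than the more common "; (2) ' is also used inside the review column, which is a bit too much for base R (in these cases it tries to split the text into new columns). Fortunately, the data.table package is a bit smarter and can deal with problem 2: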

library(data.table)

df <- fread(file = "deception.csv", quote="\'")

The resulting object will be a data.table rather than a data.frame:

> str(df)
Classes ‘data.table’ and 'data.frame':  92 obs. of  3 variables:
 $ lie      : chr  "f" "f" "f" "f" ...
 $ sentiment: chr  "n" "n" "n" "n" ...
 $ review   : chr  "Mike\'s Pizza High Point, NY Service was very slow and the quality was low. You would think they would know at"| __truncated__ "i really like this buffet restaurant in Marshall street. they have a lot of selection of american, japanese, an"| __truncated__ "After I went shopping with some of my friend, we went to DODO restaurant for dinner. I found worm in one of the dishes ." "Olive Oil Garden was very disappointing. I expect good food and good service (at least!!) when I go out to eat."| __truncated__ ...
 - attr(*, ".internal.selfref")=<externalptr> 

You can turn this behavior off by setting data.table = FALSE in fread() (although, if you are willing, I would recommend learning how to use data.table).
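As a rough sketch of that switch, the example below reads two made-up rows (abbreviated from the question's reviews, not the real deception.csv) through fread()'s text argument and compares the class of the result with and without data.table = FALSE. The name sample_lines is only for illustration, and the data.table package is assumed to be installed.

```r
library(data.table)

# Abbreviated sample rows in the shape of the question's file (illustrative only).
sample_lines <- c(
  "lie,sentiment,review",
  "f,n,'Service was very slow, and the quality was low.'",
  "f,n,'After I went shopping, we went to DODO restaurant for dinner.'"
)

# data.table = FALSE asks fread() to return a plain data.frame.
df_plain <- fread(text = sample_lines, quote = "'", data.table = FALSE)
print(class(df_plain))

# The default returns a data.table, which also inherits from data.frame.
df_dt <- fread(text = sample_lines, quote = "'")
print(class(df_dt))
```

A plain data.frame can be handy when later code, such as functions from other packages, expects base data.frame semantics for subsetting.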

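Personal opinion: if you want to get into text mining, have a look at the quanteda package instead of tm. It is faster and takes a more modern approach to many tasks.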