Counting different characters in a large dataframe

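I want to count how many times each distinct word appears in a data frame, and then reshape the result into a new data frame that shows the count for each word.

For example, I have a data table like this: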

Col1          Col2     Col3     Col4    Col5     Continues...
Passwords1    GHSME12  POWDER2  JOHNC   PLOW01   PLANE
Usercode20    HUNG1    GHSME12  PLOW01  GORGE09  JOHNC
Usercode15    PLOW01   GORGE09  JOHNC   POWDER2  SYRUP9
Continues...  ...      ...      ...     ...      ...

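I would like to count how many times each word in the data appears for each value of Col1. I could do something like WordX = number of items equal to wordX, but there are hundreds of passwords, which makes counting by hand impractical. So I am wondering whether I have to use a for loop and an empty data frame to build something like this: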

Passwords  Passwords1  Usercode20  Usercode15  Continues...
GHSME12    1           1           0           ...
POWDER2    1           0           1           ...
JOHNC      1           1           1           ...
PLOW01     1           1           1           ...
PLANE      1           0           0           ...
HUNG1      0           1           0           ...
GORGE09    0           1           1           ...
SYRUP9     0           0           1           ...

If anyone has a good idea for solving this, I would greatly appreciate it. Thanks!

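# Base R: stack every column except Col1 into a single 'values' column,
# pair it with Col1 (recycled by cbind), then cross-tabulate the two.
table(cbind(stack(df, -Col1)['values'], df['Col1']))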

         Col1
values    Passwords1 Usercode15 Usercode20
  GHSME12          1          0          1
  GORGE09          0          1          1
  HUNG1            0          0          1
  JOHNC            1          1          1
  PLANE            1          0          0
  PLOW01           1          1          1
  POWDER2          1          1          0
  SYRUP9           0          1          0
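
To try this on the sample rows, and to end up with an ordinary data frame rather than a table object, something like the following sketch may help. It is not the answer's exact code: it spells out the stacking with unlist() and rep() instead of stack() and cbind(), so the recycling of Col1 is visible. The column name Col6 is an assumption, since the question's header only says "Continues..." for the sixth column, and the names long and counts are made up for illustration.

```r
# Sample data typed in from the three rows shown in the question.
# "Col6" is an assumed name; the question labels that column "Continues...".
df <- data.frame(
  Col1 = c("Passwords1", "Usercode20", "Usercode15"),
  Col2 = c("GHSME12", "HUNG1", "PLOW01"),
  Col3 = c("POWDER2", "GHSME12", "GORGE09"),
  Col4 = c("JOHNC", "PLOW01", "JOHNC"),
  Col5 = c("PLOW01", "GORGE09", "POWDER2"),
  Col6 = c("PLANE", "JOHNC", "SYRUP9"),
  stringsAsFactors = FALSE
)

# Long format: one row per (word, Col1) pair. unlist() appends the word
# columns one after another, so Col1 is repeated once per word column.
long <- data.frame(
  values = unlist(df[-1], use.names = FALSE),
  Col1   = rep(df$Col1, times = ncol(df) - 1)
)

# Cross-tabulate, then convert the 2-d table into a regular data frame
# with the words as row names and one column per Col1 value.
counts <- as.data.frame.matrix(table(long))
print(counts)
```

No for loop or pre-allocated empty data frame is needed here; table() does the counting in one step, and as.data.frame.matrix() keeps the wide layout the question asks for.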

With the tidyverse:

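library(tidyverse)
df %>%
   # long format: one row per (Col1, column name, word)
   pivot_longer(-Col1) %>%
   # one column per Col1 value; count the rows for each word, 0 where absent
   pivot_wider(names_from = Col1, values_from = name,
               values_fn = length, values_fill = 0)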

# A tibble: 8 x 4
  value   Passwords1 Usercode20 Usercode15
  <chr>        <int>      <int>      <int>
1 GHSME12          1          1          0
2 POWDER2          1          0          1
3 JOHNC            1          1          1
4 PLOW01           1          1          1
5 PLANE            1          0          0
6 HUNG1            0          1          0
7 GORGE09          0          1          1
8 SYRUP9           0          0          1