Segment bin string with 'which' statement in R

Question

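I have already tried many clustering algorithms on my dataset, and now I would like to apply a managerial segmentation to my data using "which" statements. I am wondering which basis makes more sense for the segmentation: customer_math, or the years the customer lasted (X1-X8). Doing the managerial segmentation on X1-X8 is clear to me, but I do not know how to do it on the string.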

Here is my df:

   customer_id customer_math X1 X2 X3 X4 X5 X6 X7 X8
1   15251       10001010      1  0  0  0  1  0  1  0
2   10101       11111111      1  1  1  1  1  1  1  1
3   84787       10101010      1  0  1  0  1  0  1  0

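For example, I would like to answer questions such as:

Customers who had a "zero" at some point
Customers who had a "zero" twice in a row
Customers who left and came back, i.e. at least one zero in the string and a 1 at the end of the string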

Thank you very much for your feedback!

Answer 1

If I understand correctly:

library(stringr)
q1 <- df[str_count(df$customer_math, "0") == 1, ]               # exactly one "0" in the string (use >= 1 for "at least one")
q2 <- df[grepl("00", df$customer_math), ]                       # two or more zeros in a row; this simple pattern also matches longer runs such as "000", so tighten it if you need exactly two
q3 <- df[str_count(df$customer_math, "0") >= 1 & df$X8 == 1, ]  # at least one zero in the string and a 1 at the end
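Since the question asks about "which" statements, the same three filters can also be sketched in base R with which(), which returns row indices instead of a logical mask. This is only a sketch under some assumptions: customer_math is stored as character (a numeric column would lose leading zeros and may print in scientific notation), the names s, demo and idx_* are made up for illustration, and the lookaround pattern is one possible way to require a run of exactly two zeros.

```r
# Rebuild the example data; customer_math is kept as character (assumption).
df <- data.frame(
  customer_id   = c(15251, 10101, 84787),
  customer_math = c("10001010", "11111111", "10101010"),
  stringsAsFactors = FALSE
)
s <- as.character(df$customer_math)

# Row indices with at least one "0" anywhere in the string.
idx_any_zero <- which(grepl("0", s, fixed = TRUE))

# Row indices with a run of exactly two zeros: "00" with no "0" directly
# before or after it. Lookarounds require perl = TRUE.
idx_exact_two <- which(grepl("(?<!0)00(?!0)", s, perl = TRUE))

# Row indices of customers who left and came back:
# at least one "0" and a "1" as the last character.
idx_returned <- which(grepl("0", s, fixed = TRUE) & grepl("1$", s))

# Hypothetical strings (not from the question) to show the difference
# between a plain "00" match and the exact-two pattern.
demo <- c("11001101", "10001011")
demo_plain <- grepl("00", demo, fixed = TRUE)
demo_exact <- grepl("(?<!0)00(?!0)", demo, perl = TRUE)

df[idx_any_zero, ]
```

On the three example rows, only idx_any_zero is non-empty (rows 1 and 3). Row 1 contains "000", so a plain "00" match picks it up while the exact-two pattern does not, and none of the three strings ends in 1.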


Tags: r, hierarchical-clustering, segment