Clustering in R using K-means

I'm trying to cluster my dataset with k-means, but column 9 contains categorical data, so when I run the clustering I get this error:

res<-NbClust(mi[2:9],min.nc=2,max.nc=15,method="ward.D2")

Error in t(jeu) %*% jeu : requires numeric/complex matrix/vector arguments

So I can only run k-means on columns 2 through 8. Is there another way to cluster the data that also includes column 9?

Data:

df <- structure(list(Name = structure(c(58L, 188L, 40L, 155L, 32L, 88L, 92L, 55L, 135L, 31L, 139L, 26L, 126L, 10L, 166L, 104L, 75L, 180L, 35L, 175L, 77L, 99L, 4L, 71L, 141L, 176L, 53L, 39L, 172L, 196L, 123L, 107L, 16L, 96L, 82L, 185L, 30L, 15L, 94L, 129L, 187L, 151L, 33L, 23L, 28L, 44L, 157L, 69L, 132L, 83L, 131L, 11L, 182L, 181L, 54L, 115L, 116L, 183L, 150L, 195L, 45L, 144L, 1L, 110L, 17L, 114L, 9L, 117L, 112L, 70L, 34L, 169L, 27L, 66L, 3L, 73L, 133L, 91L, 154L, 130L, 160L, 105L, 90L, 165L, 67L, 100L, 162L, 98L, 29L, 68L, 189L, 192L, 102L, 190L, 134L, 136L, 52L, 12L, 81L, 59L, 63L, 122L, 93L, 109L, 178L, 138L, 5L, 43L, 140L, 95L, 2L, 174L, 76L, 51L, 156L, 60L, 149L, 128L, 177L, 142L, 103L, 7L, 8L, 14L, 164L, 74L, 145L, 148L, 113L, 86L, 108L, 48L, 163L, 6L, 186L, 89L, 36L, 191L, 125L, 120L, 62L, 65L, 124L, 168L, 147L, 79L, 173L, 84L, 193L, 25L, 146L, 121L, 127L, 153L, 13L, 106L, 119L, 161L, 49L, 97L, 101L, 61L, 137L, 24L, 85L, 194L, 78L, 41L, 170L, 47L, 118L, 184L, 179L, 72L, 42L, 111L, 87L, 57L, 38L, 37L, 171L, 22L, 50L, 80L, 159L, 18L, 152L, 64L, 56L, 158L, 167L, 46L, 19L, 21L, 20L, 143L), .Label = c("#Mashtag 2013", "#Mashtag 2014", "#Mashtag 2015", "10 Heads High", "5am Saint", "77 Lager", "AB:02", "AB:03", "AB:04", "AB:06", "AB:08", "AB:10", "AB:11", "AB:13", "AB:15", "AB:17", "AB:18", "AB:20", "Ace Of Chinook", "Ace Of Citra", "Ace Of Equinox", "Ace Of Simcoe", "Albino Squid Assasin", "Alice Porter", "All Day Long - Prototype Challenge", "Alpha Dog", "Alpha Pop", "Amarillo - IPA Is Dead", "American Ale", "Anarchist Alchemist", "Arcade Nation", "Avery Brown Dredge", "Baby Dogma", "Baby Saison - B-Sides", "Bad Pixie", "Barley Wine - Russian Doll", "Barrel Aged Albino Squid Assassin", "Barrel Aged Hinterland", "Berliner Weisse With Raspberries And Rhubarb - B-Sides", "Berliner Weisse With Yuzu - B-Sides", "Bitch Please (w/ 3 Floyds)", "Black Dog", "Black Eye Joe (w/ Stone Brewing Co)", "Black Eyed King Imp", "Black Eyed King Imp - Vietnamese Coffee Edition", 
"Black Hammer", "Black Jacques", "Black Tokyo Horizon (w/Nøgne Ã0 & Mikkeller)", "Blitz Berliner Weisse", "Blitz Series", "Born To Die", "Bounty Hunter - Shareholder Brew", "Bourbon Baby", "Bracken's Porter", "Bramling X", "Brewdog Vs Beavertown", "Brixton Porter", "Buzz", "Candy Kaiser", "Cap Dog (w/ Cap Brewery)", "Catherine's Pony (w/ Beavertown)", "Challenger", "Chaos Theory", "Chili Hammer", "Chinook - IPA Is Dead", "Citra", "Clown King", "Cocoa Psycho", "Coffee Imperial Stout", "Comet", "Dana - IPA Is Dead", "Dead Metaphor", "Dead Pony Club", "Deaf Mermaid - B-Sides", "Devine Rebel (w/ Mikkeller)", "Dog A", "Dog B", "Dog C", "Dog D", "Dog E", "Dog Fight (w/ Flying Dog)", "Dog Wired (w/8 Wired)", "Dogma", "Doodlebug", "Double IPA - Russian Doll", "Edge", "El Dorado - IPA Is Dead", "Electric India", "Ella - IPA Is Dead", "Elvis Juice V2.0 - Prototype Challenge", "Everday Anarchy", "Fake Lager", "Galaxy", "Goldings - IPA Is Dead", "Growler", "Hardcore IPA", "Hardkogt IPA", "HBC 366 - IPA Is Dead", "HBC 369", "Hello My Name Is Beastie", "Hello My Name Is Holy Moose", "Hello My Name Is Ingrid", "Hello My Name Is Little Ingrid", "Hello My Name Is Mette-Marit", "Hello My Name Is PaÌ0ivi", "Hello My Name is Sonja (w/ Evil Twin)", "Hello My Name is Vladimir", "Hello My Name Is ZeÌ1 (w/ 2Cabeças)", "Hinterland", "Hobo Pop", "Hop Fiction - Prototype Challenge", "Hopped-Up Brown Ale - Prototype Challenge", "Hoppy Christmas", "Hops Kill Nazis", "Hunter Foundation Pale Ale", "Hype", "India Session Lager - Prototype Challenge", "International Arms Race (w/ Flying Dog)", "Interstellar", "Jack Hammer", "Jasmine IPA", "Jet Black Heart", "Kohatu - IPA Is Dead", "Konnichiwa Kitsune", "Libertine Black Ale", "Libertine Porter", "Lichtenstein Pale Ale", "Lizard Bride - Prototype Challenge", "Lost Dog (w/Lost Abbey)", "Lumberjack Stout", "Magic Stone Dog (w/Magic Rock & Stone Brewing Co.)", "Mandarina Bavaria - IPA Is Dead", "Mango Gose - B-Sides", "Melon And Cucumber IPA - 
B-Sides", "Misspent Youth", "Morag's Mojito - B-Sides", "Moshi Moshi 15", "Motueka", "Movember", "Mr.Miyagi's Wasabi Stout", "Nanny State", "Nelson Sauvin", "Neon Overlord", "Never Mind The Anabolics", "No Label", "Nuns With Guns", "Old World India Pale Ale", "Old World Russian Imperial Stout", "Orange Blossom - B-Sides", "Pale - Russian Doll", "Paradox Islay", "Paradox Islay 2.0", "Paradox Jura", "Peroxide Punk", "Pilsen Lager", "Pioneer - IPA Is Dead", "Prototype 27", "Prototype Helles", "Prototype Pils 2.0", "Pumpkin King", "Punk IPA 2007 - 2010", "Punk IPA 2010 - Current", "Restorative Beverage For Invalids And Convalescents", "Rhubarb Saison - B-Sides", "Riptide", "Russian Doll â0“ India Pale Ale", "Rye Hammer", "San Diego Scotch Ale (w/Ballast Point)", "Santa Paws", "Shareholder Black IPA 2011", "Ship Wreck", "Shipwrecker Circus (w/ Oskar Blues)", "Simcoe", "Sink The Bismarck!", "Skull Candy", "Sorachi Ace", "Sorachi Bitter - B-Sides", "Spiced Cherry Sour - B-Sides", "Stereo Wolf Stout - Prototype Challenge", "Storm", "Sub Hop", "Sunk Punk", "Sunmaid Stout", "Sunshine On Rye - B-Sides", "The Physics", "This. Is. 
Lager", "TM10", "Trashy Blonde", "Truffle and Chocolate Stout - B-Sides", "U-Boat (w/ Victory Brewing)", "Vagabond Pale ALe - Prototype Challenge", "Vagabond Pilsner", "Vic Secret", "Waimea - IPA Is Dead", "Whisky Sour - B-Sides", "Zephyr"), class = "factor"), ABV = c(4.5, 4.1, 4.2, 6.3, 7.2, NA, 4.7, 7.5, 7.3, 5.3, 4.5, 4.5, 6.1, 11.2, 6, 8.2, 12.5, 8, 4.7, 3.5, 15, 6.7, 7.8, 6.7, 0.5, 7.5, 5.8, 3.6, 10.5, 12.5, 7.2, 8.2, 10.7, 9.2, 7.1, 5, 16.5, 12.8, 6.7, 10, NA, 10, 4.5, 7.4, 7.2, 9.5, 9.2, 9, 7.2, 7.5, NA, 10.43, 7.1, 8, 5, 5.4, 4.1, 10.2, 4, 7, 12.7, 6.5, 7.5, 4.2, 11.8, 7.6, 15, 4.4, 6.3, 7.2, NA, 4.5, 4.5, 7.5, 10, 3.8, 6.4, NA, 4, 15.2, 5.4, 8.3, 6.5, 8, 12, 8.2, 5.6, 7.2, 6.3, 10, 5.6, 4.5, 8.2, 8.4, 6, 6.7, 6.5, 11.5, 8.5, 5.2, 7.1, 4.7, 6.7, 9, 6.5, 6.7, 5, 5.8, 7.5, 4.5, 9, 41, 15, 8.5, 7.2, 9, 3.8, 5.7, 6.3, 7.5, 4.4, 18, 10.5, 11.3, NA, 5.2, 4.5, 9.5, 7.2, 2.7, 6.4, 17.2, 8.5, 4.9, 4.7, 7.2, 10, 4.5, 7.2, 7.2, 6.7, 7.2, 4.4, 9, 7.5, 16.1, 6.7, 2.5, 7.4, 2.8, 4.2, 5.8, 5.2, 10, 12.8, 8.3, 6.5, 6, 3, 7.6, 5.5, 8.8, 5.2, 5.2, 8, 6.7, 15, 11.5, 7.1, NA, 7.5, 7.2, 5.2, 6.8, 5.5, 5.2, 6.7, 5, 9, 9.2, 13.8, 4.5, 3.2, 16.1, 4.7, 14.2, 13, 7.2, 9.2, 4.9, 7.2, 7.2, 4.5, 4.5, 4.5, 7.6), IBU = c(60, 41.5, 8, 55, 59, 38, 40, 75, 30, 60, 50, 42, 45, 150, 70, 70, 100, 60, 45, 33, 90, 67, 70, 70, 55, 75, 35, 8, 85, 125, 70, 70, 100, 125, 65, 47, 20.5, 50, 70, 35, 20, 55, 35, 65, 70, 85, 149, 65, 100, 30, 30, 65, 68, 35, 50, 35, 65, 50, 35, 20, 85, 35, 50, 50, 80, 70, 80, 35, 85, 70, 9, 35, 30, 70, 85, 35, 40, 45, 40, 20, 20, 70, 60, 45, 85, 42, 40, 70, 55, 85, 30, 55, 42, 50, 50, 40, 35, 80, 65, 45, 90, 45, 67, 85, 20, 67, 30, 40, 90, 38, 50, 1085, 90, 85, 100, 80, 20, 35, 130, 75, 35, 70, 14, 50, 25, 65, 25, 80, 70, 36, 50, 75, 100, 30, 37, 100, 80, 55, 50, 250, 67, 100, 70, 70, 80, 85, 70, 35, 70, 30, 25, 40, 50, 55, 70, 70, 55, 60, 8, 175, 35, 40, 45, 55, 85, 70, 90, 50, 80, 45, 0, 130, 55, 30, 60, 40, 70, 50, 85, 65, 60, 40, 8, 100, 25, 20, 100, 250, 50, 18, 
250, 250, 40, 40, 40, 70), OG = c(1044, 1041.7, 1040, 1060, 1069, 1045, 1046, 1068, 1079, 1052, 1047, 1046, 1067, 1098, 1058, 1076, 1093, 1082, 1047, 1038, 1120, 1013, 1074, 1066, 1007, 1068, 1049, 1040, 1102, 1087, 1067, 1076, 1105, 1085, 1065, 1048.5, 1112, 1096, 1066, 1080, 1048, 1090, 1048, 1069, 1067, 1095, 1083, 1080, 1064, 1080, 1043, 1095, 1056, 1077, 1049, 1050, 1042, 1026, 1041, 1081, 1113.5, 1050, 1070, 1042, 1096, 1073, 1113, 1040, 1063, 1067, 1032, 1048, 1045, 1068, 1098, 1040, 1057, 1081, 1039, 1110, 1055, 1076, 1060, 1075, 1130, 1078, 1055, 1067, 1060, 1098, 1058, 1046, 1078, 1080, 1050, 1063, 1068, 1096, 1078, 1049, 1067, 1055, 1013, 1094, 1060, 1013, 1050, 1053, 1072, 1042.9, 1084, 1085, 1120, 1072, 1064, 1083, 1039, 1053, 1060, 1068, 1045, 1150, 1093, 1098, 1052, 1048, 1043, 1075, 1067, 1033, 1061, 1156, 1068, 1047, 1043, 1064, 1097, 1045, 1068, 1065, 1064, 1064, 1045, 1090, 1069, 1125, 1063, 1027, 1069, 1032.5, 1044, 1060, 1050, 1128, 1108, 1076, 1059, 1056, 1007, 1072, 1053, 1084, 1048, 1053, 1074, 1066, 1120, 1104, 1067, 1089, 1069, 1065, 1052, 1068, 1062, 1048, 1066, 1053, 1094, 1069, 1088, 1045, 1007, 1015, 1008, 1025, 1015, 1065, 1016, 1010, 1065, 1065, 1045, 1045, 1045, 1067), EBC = c(20, 15, 8, 30, 10, 15, 12, 22, 120, 200, 140, 62, 219, 70, 25, NA, 36, 12, 8, 50, 100, 19, 90, 30, 30, 30, 44, NA, 64, 40, 30, 16, 300, 40, 13, 65, 20, 111, 30, 80, 14, 300, 40, 60, 30, 250, 19.5, 97, 12, 46, 15, 23, 14, 15, 110, 11.5, 17, 197, 45, 12, 250, 23, 40, 30, 115, 59, 400, 12, 24, 30, 2, 44, 25, 30, 130, 25, 10, 15, 18, 158, 30, 30, 25, 240, 24, 90, 15, 30, 30, 30, 54, 25, 70, 200, 8, 15, 250, 115, 31.2, 45, 15, 200, 19, 400, NA, 19, 60, 177.3, 200, 18, 20, 40, 100, 15, 12, 180, 6, 25, 14, 30, 30, 57, NA, 164, 10, 16, 10, 195, 30, 57, 20, 128, 15, 12, 10, 12, 65, 20, 150, 15, 19, 12, 30, 190, 50, 400, 30, 10, 30, 42, 19, 35, 17, 300, 79, 30, 50, 17, 9, 40, 25, 190, 35, 165, 35, 30, 100, 38, 71, 15, 50, 14, 200, 86, 230, 13, 30, 200, 400, 60, 25, 18, 
8, 500, 25, 67, 300, 15, 78.8, 13, 17, 104, 18, 18, 18, 20), PH = c(4.4, 4.4, 3.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.2, 5.2, 4.4, 4.4, 4.4, 5.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 3.2, 4.4, 4.4, 4.4, 4.4, 4.3, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.2, 4.4, 4.4, 4.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.5, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 3.2, 5.2, 4.4, 4.4, 4.4, 5.2, 4.4, 4, 4.4, 4.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 3.5, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 5.2, 4.2, 4.4, 4.4, 4.2, 4.4, 4.4, 4.4, 4.3, 3.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 5.2, 4.4, 5.2, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 5.2, 4.2, 4.5, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.2, 4.4, 4.4, 4.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.3, 4.4, 4.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 3.2, 4.4, 4.4, 4.5, 4.4, 5.2, 5.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.4, 5.2, 5.2, 4.4, 4.4, 4.4, 4.4, 4.4, 4.3, 4.2, 4.4, 4.2, 3.2, 4.4, 4.2, 4, 4.4, 4.4, 4.2, 4.2, 4.4, 4.4, 4.2, 4.2, 4.2, 4.4), AttenuationLevel = c(75, 76, 83, 80, 67, 88.9, 78, 80.9, 74.7, 77, 74.5, 72.8, 70.1, 87, 79.3, 83, 68, 86, 79, 68.4, 98, 79.7, 79.7, 77.3, 28.6, 82.1, 90, 83, 102, 81.2, 82.1, 83, 76.2, 81.2, 85, 79.4, 100, 79.17, 77.3, 85, 89.6, 84.4, 72.9, 82.6, 82.1, 76.8, 83, 76, 84, 70, 81.4, 83.2, 82.1, 79.2, 79, 84, 76.2, 74.5, 75.6, 74, 76.8, 76, 81.4, 76.2, 79.2, 79.5, 84.1, 79.5, 82.6, 82.1, 88, 72.9, 75.6, 82.1, 79.6, 70, 87, 93.8, 76.9, 82, 74.6, 82.9, 83.3, 81.3, 102.3, 83.3, 78, 82.1, 80, 70, 74, 73.9, 83.3, 81.3, 87, 84, 70.6, 79.2, 84.6, 81.6, 80.6, 70, 79.7, 73.4, 87, 79.7, 76, 84.9, 79.2, 81, 82.1, 81.2, 98, 90.3, 84, 83.1, 87, 79.3, 83, 82.1, 73.3, 93.3, 80, 79.6, 87, 79, 79.1, 81.3, 82.1, 70.8, 80.3, 80.8, 95.6, 80.7, 83.7, 84, 79.4, 73.9, 78.6, 84.6, 79.7, 84, 82.9, 80, 82.6, 84, 81, 70.4, 82.6, 63.1, 72.7, 76.7, 80, 89, 81.5, 82.9, 81.4, 82.14, 82.5, 80.6, 79.3, 79.8, 77.1, 75.5, 82.4, 77.3, 98, 85, 79, 94.4, 81.1, 87, 73.1, 76.5, 
67.7, 79.2, 77.3, 73.6, 73.4, 82.6, 83, 75.6, 78, 84, 75.6, 75.6, 84.4, 84.6, 81, 78.7, 84.6, 84.6, 75.6, 75.6, 75.6, 82), FermentationTempCelsius = c(19L, 18L, 21L, 9L, 10L, 22L, 10L, 19L, 19L, 19L, 19L, 22L, 18L, 17L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 18L, 19L, 19L, 19L, 19L, 21L, 21L, 21L, 19L, 21L, 21L, 21L, 9L, 19L, 20L, 21L, 19L, 19L, 22L, 21L, 19L, 18L, 19L, 18L, 19L, 19L, 19L, 12L, 23L, 21L, 10L, 9L, 19L, 19L, 19L, 21L, 19L, 19L, 18L, 18L, 21L, 19L, 20L, 20L, 21L, 10L, 19L, 19L, 21L, 19L, 19L, 19L, 21L, 19L, 20L, 23L, 19L, 21L, 19L, 21L, 19L, 20L, 21L, 21L, 19L, 19L, 19L, 21L, 19L, 9L, 22L, 14L, 20L, 19L, 19L, 20L, 18L, 14L, 19L, 19L, 19L, 21L, 20L, 19L, 19L, 19L, 21L, 10L, 21L, 21L, 19L, 18L, 19L, 21L, 20L, 17L, 20L, 19L, 19L, 22L, 19L, 20L, 20L, 19L, 15L, 19L, 19L, 19L, 19L, 21L, 21L, 10L, 12L, 19L, 21L, 19L, 19L, 21L, 19L, 19L, 20L, 21L, 22L, 21L, 99L, 19L, 19L, 22L, 16L, 19L, 19L, 21L, 18L, 21L, 19L, 19L, 19L, 21L, 17L, 21L, 19L, 19L, 19L, 19L, 19L, 21L, 19L, 23L, 19L, 20L, 19L, 19L, 19L, 19L, 19L, 19L, 21L, 18L, 21L, 19L, 21L, 21L, 12L, 21L, 21L, 21L, 21L, 12L, 21L, 21L, 19L, 19L, 19L, 21L), Yeast = structure(c(1L, 1L, 1L, 3L, 3L, 4L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 1L, 2L, 4L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 3L, 4L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 3L, 1L, 1L, 4L, 1L, 1L, 1L, 2L, 1L, 1L, 4L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 3L, 2L, 3L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 4L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 1L, 1L, 1L, 2L), .Label = c("Wyeast 1056 - American Ale", "Wyeast 1272 - American 
Ale II", "Wyeast 2007 - Pilsen Lager", "Wyeast 3711 - French Saison"), class = "factor")), class = "data.frame", row.names = c(NA, -196L))
df

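To solve your specific problem, you can convert the categorical column into dummy (0/1) variables so that the clustering you want can run on numeric data only.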
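The error message suggests that the data frame is turned into a matrix before the product `t(jeu) %*% jeu` is computed. When one column is a factor, that matrix is a character matrix, and matrix multiplication fails. Here is a minimal sketch of that behaviour, using a hypothetical toy data frame (`toy` and its values are made up for illustration, not taken from your data):

```r
# Hypothetical toy data: one numeric column and one factor column
toy <- data.frame(
  ABV   = c(4.5, 7.2, 6.3),
  Yeast = factor(c("A", "B", "A"))
)

# Converting a mixed data frame to a matrix yields a character matrix
m <- as.matrix(toy)
is.numeric(m)

# The matrix product then fails; try() captures the error instead of stopping
res <- try(t(m) %*% m, silent = TRUE)
inherits(res, "try-error")
```

This is why every column passed to the clustering call has to be numeric.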

One way to do this is with the dummy_columns() function from the fastDummies package.

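library(fastDummies)
library(NbClust)

# Replace the Yeast factor with one 0/1 column per yeast strain
df_dummy <- dummy_columns(df, select_columns = "Yeast", remove_selected_columns = TRUE)

# Drop column 1 (Name) and keep all remaining columns, including every Yeast dummy.
# If NbClust reports a singular matrix, try remove_first_dummy = TRUE in the call above.
res <- NbClust(df_dummy[-1], min.nc = 2, max.nc = 15, method = "ward.D2")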
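If you would rather not add a package, the same encoding can be sketched in base R with `model.matrix()`. The example below uses a hypothetical toy data frame (the values and the short yeast names are invented for illustration). Scaling the numeric columns is an extra step that is not in the code above; it is included here because k-means is distance-based, so a column such as OG (values around 1000) would otherwise dominate the 0/1 dummies.

```r
# Hypothetical toy data standing in for the real data frame
toy <- data.frame(
  ABV   = c(4.5, 7.2, 6.3, 5.0),
  IBU   = c(60, 59, 55, 47),
  Yeast = factor(c("Ale", "Lager", "Ale", "Saison"))
)

# "~ Yeast - 1" drops the intercept, so each factor level gets its own 0/1 column
dummies <- model.matrix(~ Yeast - 1, data = toy)

# Standardise the numeric columns (assumption: added here, not in the original answer)
num_cols <- scale(toy[c("ABV", "IBU")])

# Combine into one all-numeric feature matrix
features <- cbind(num_cols, dummies)

# Plain k-means on the combined matrix; the seed only makes the run repeatable
set.seed(1)
km <- kmeans(features, centers = 2)
km$cluster
```

On your real data you would also need to decide how to handle the NA values (for example with `na.omit()`), since `kmeans()` does not accept missing values.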

As mentioned in the comments, broader questions about good practice in cluster analysis are better suited to Cross Validated.