Data Hiding in R
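I have a dataset 'location' with 1000 locations, each holding lat/long values. I want to randomly hide a certain percentage of the locations and then estimate their lat/long values with my algorithm. Say I want to randomly hide 10% of these 1000 locations and treat them as unknown: how can I hide those values in my dataset in R? Is there a package in R that can help me do this?
So if this is the complete dataset location: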
print(location)
Longitude Latitude
74.858863999999997 31.327629000000002
74.224755999999999 31.309773000000000
74.216177999999999 31.463429000000001
74.321051999999995 31.575917000000000
74.349832000000006 31.582062000000001
74.319663000000006 31.573923000000001
74.349384000000001 31.527654999999999
74.410433999999995 31.521415999999999
74.349609000000001 31.527670000000001
74.426238999999995 31.522907000000000
74.309755999999993 31.561537999999999
74.426238999999995 31.522907000000000
74.282814000000002 31.456077000000001
74.224754000000004 31.309773000000000
74.426238999999995 31.522907000000000
74.365804999999995 31.470144000000001
74.311349000000007 31.483550999999999
74.312512999999996 31.472501999999999
74.426238999999995 31.522907000000000
74.319362999999996 31.484127000000001
74.370300000000000 31.537609000000000
74.879557000000005 32.104958000000003
74.426238999999995 31.522907000000000
73.463269999999994 30.815715999999998
74.412903999999997 31.470146000000000
74.319362999999996 31.484127999999998
74.412891999999999 31.470144999999999
74.313017000000002 31.484044999999998
74.412890000000004 31.470147999999998
74.328925999999996 31.536244000000000
74.336599000000007 31.528677999999999
I only want to print the following:
print(location)
Longitude Latitude
74.858863999999997 31.327629000000002
74.224755999999999 31.309773000000000
74.216177999999999 31.463429000000001
74.321051999999995 31.575917000000000
74.349832000000006 31.582062000000001
74.319663000000006 31.573923000000001
74.349384000000001 31.527654999999999
74.410433999999995 31.521415999999999
74.349609000000001 31.527670000000001
74.426238999999995 31.522907000000000
74.309755999999993 31.561537999999999
74.426238999999995 31.522907000000000
74.282814000000002 31.456077000000001
74.224754000000004 31.309773000000000
74.426238999999995 31.522907000000000
74.365804999999995 31.470144000000001
74.311349000000007 31.483550999999999
74.312512999999996 31.472501999999999
74.426238999999995 31.522907000000000
74.319362999999996 31.484127000000001
74.370300000000000 31.537609000000000
74.879557000000005 32.104958000000003
74.426238999999995 31.522907000000000
73.463269999999994 30.815715999999998
74.412903999999997 31.470146000000000
74.319362999999996 31.484127999999998
74.412891999999999 31.470144999999999
74.313017000000002 31.484044999999998
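But the dataset should still contain the values that are not printed, i.e. the "hidden" ones.
I would just define a vector (either a column of the dataset or a separate object) that indicates whether each row is hidden or shown. For example: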
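# to hide about 20% of your data (each row is hidden independently with probability 0.2):
hide_row <- which(rbinom(n = nrow(location), size = 1, prob = 0.2) == 1)
# to hide exactly 20% of your data (rounded to a whole number of rows):
hide_row <- sample(seq_len(nrow(location)), size = round(0.2 * nrow(location)))
# print all but the hidden rows
# (note: if hide_row is empty, negative indexing returns zero rows)
location[-hide_row, ]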
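The indicator can also live in the dataset itself as a logical column, as suggested above. Here is a minimal sketch in base R, assuming location is a data frame. The small location built here (the first five rows from the question) and the column name hidden are only for illustration:

```r
set.seed(42)  # make the random selection reproducible

# Illustrative stand-in for the real data (first five rows from the question)
location <- data.frame(
  Longitude = c(74.858864, 74.224756, 74.216178, 74.321052, 74.349832),
  Latitude  = c(31.327629, 31.309773, 31.463429, 31.575917, 31.582062)
)

# Mark exactly 20% of the rows as hidden (the column name `hidden` is an assumption)
n_hide <- round(0.2 * nrow(location))
location$hidden <- seq_len(nrow(location)) %in% sample(nrow(location), n_hide)

# Show only the visible rows; the hidden ones stay in `location`
visible <- location[!location$hidden, c("Longitude", "Latitude")]
print(visible)
```

Logical indexing with !location$hidden also behaves sensibly when no row happens to be hidden, which negative integer indexing does not.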
You may not want this (I'm not sure of your use case), but a more natural approach is to make a copy of the data that omits the hidden rows:
partial_location = location[-hide_row, ]
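Since the goal is to estimate the hidden coordinates and compare them with the real ones, the copy can be paired with a held-back set of true values. The following is only a sketch under stated assumptions: the data are synthetic, estimate_hidden is a hypothetical placeholder (it predicts the mean of the visible points) standing in for the asker's algorithm, and RMSE in degrees is just one possible score. Only base R is used.

```r
set.seed(1)  # reproducible selection

# Synthetic stand-in for the 1000 locations (values are made up)
location <- data.frame(
  Longitude = runif(1000, 73.4, 74.9),
  Latitude  = runif(1000, 30.8, 32.2)
)

# Hide exactly 10% of the rows
hide_row <- sample(seq_len(nrow(location)), size = round(0.1 * nrow(location)))
partial_location <- location[-hide_row, ]  # what the algorithm is allowed to see
truth <- location[hide_row, ]              # held back for scoring only

# Hypothetical placeholder estimator: predicts the mean of the visible
# points for every hidden row. Replace it with the real algorithm.
estimate_hidden <- function(visible, n) {
  data.frame(
    Longitude = rep(mean(visible$Longitude), n),
    Latitude  = rep(mean(visible$Latitude), n)
  )
}

estimated <- estimate_hidden(partial_location, nrow(truth))

# Root-mean-square error per coordinate, in degrees
rmse <- sqrt(colMeans((as.matrix(estimated) - as.matrix(truth))^2))
print(rmse)
```

Keeping hide_row around means the same split can be reused to compare several estimators on identical hidden rows.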