How to deal with shifted data
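My sensor calibration was somehow off, and I ended up with some heavily shifted data.
measured is, you guessed it, what I measured, and baseline is the level where the data should be.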
data <- data.frame(hour = c(1:244),
measured = c(1151.43, 1151.19, 1150.39, 1149.38,
1149.01, 1148.3, 1147.61, 1146.68, 1145.13, 1144.23, 1151.17,
1145.58, 1140.4, 1139.47, 1138.38, 1137.11, 1136.24, 1135.55,
1134.84, 1134.18, 1133.82, 1135.74, 1159.47, 1180.34, 1169.46,
1136.52, 1131.85, 1132.28, 1132.84, 1134.29, 1135.86, 1136.97,
1142.12, 1188.96, 1231.69, 1254.89, 1246.7, 1202.24, 1156.71,
1146.82, 1148.99, 1150.41, 1151.31, 1151.59, 1151.87, 1157.17,
1190.79, 1210.93, 1209.53, 1179.72, 1153.43, 1153.28, 1153.23,
1153.2, 1152.95, 1152.55, 1152.33, 1152.67, 1152.58, 1154.27,
1163.28, 1153.28, 1150.61, 1149.78, 1148.39, 1147.02, 1146.01,
1144.79, 1143.43, 1142.81, 1142.02, 1141.34, 1140.36, 1139.73,
1139.22, 1138.59, 1137.79, 1137.44, 1136.92, 1136.15, 1135.93,
1136, 1137, 1138.15, 1138.51, 1149.14, 1155.07, 1138.72, 1138.04,
1138.58, 1138.65, 1138.91, 1139.59, 1139.73, 1139.11, 1138.66,
1138.57, 1148.24, 1166.46, 1157.07, 1140.83, 1140.43, 1140.7,
1140.79, 1140.64, 1141.63, 1142.87, 1143.84, 1144.57, 1144.89,
1147.34, 1156.13, 1147.45, 1146.44, 1146.93, 1147.68, 1148.14,
1148.27, 1147.62, 1146.77, 1146.25, 1146.47, 1147.69, 1164.92,
1164.16, 1148.28, 1147.49, 1147.27, 1147.66, 1147.94, 1148.97,
1150.35, 1151.25, 1152.39, 1153.05, 1154.46, 1166.86, 1160.59,
1154.12, 1154.55, 1155.08, 1155.64, 1156.19, 1156.38, 1156.46,
1156.38, 1155.96, 1163.76, 1189.55, 1191.38, 1162.85, 1157.35,
1157.28, 1158, 1158.6, 1159.6, 1160.03, 1160.16, 1160.78, 1161.24,
1161.72, 1164.73, 1161.89, 1162.13, 1162.35, 1162.61, 1162.25,
1161.42, 1160.78, 1160.35, 1159.98, 1159.83, 1165.63, 1186.16,
1182.38, 1159.98, 1158.49, 1158.33, 1159.3, 1160.39, 1160.97,
1161.17, 1161.25, 1161.36, 1161.31, 1162.32, 1169.11, 1160.85,
1160.19, 1160.06, 1159.86, 1158.93, 1158.65, 1158.49, 1158.52,
1157.93, 1157.94, 1179.24, 1195.79, 1179.21, 1156.38, 1156.31,
1157.05, 1158.47, 1159.08, 1159.28, 1159.73, 1160, 1160.1, 1160.04,
1160.12, 1159.18, 1159.05, 1159.07, 1158, 1157.06, 1156.52, 1156.22,
1156.91, 1157.18, 1156.54, 1160.11, 1183.55, 1188.34, 1162.84,
1154.78, 1154.72, 1154.6, 1154.61, 1154.63, 1154.66, 1154.76,
1155.2, 1160.27, 1188.68, 1205.58, 1192.46, 1158.55, 1157.47,
1157.73, 1158.1, 1158.37, 1158.3, 1158.4),
baseline = c(1010.1, 1009.2, 1008.8, 1007.8, 1007.1, 1005.5,
1004.2, 1002.9, 1001.9, 1000.8, 999.8, 998.7, 997.8,
996.8, 996, 995.5, 995.1, 994.4, 993.5, 992.8, 992.4,
992.2, 992.2, 992.2, 992.8, 993.6, 995.3, 997.2, 998.4,
999.7, 1001, 1002.1, 1003.1, 1004.1, 1004.7, 1005.2,
1006.7, 1008.6, 1009.7, 1010.5, 1010.9, 1011, 1011.2,
1011.8, 1012.1, 1012.3, 1012.9, 1013.2, 1013.4, 1013.3,
1013.4, 1013.1, 1013, 1012.5, 1012.3, 1011.8, 1010.9,
1010.6, 1010, 1009.8, 1008.9, 1008, 1006.4, 1005.3, 1004.8, 1003.1, 1002.2, 1001.3, 1000.7,
1000, 999.8, 999.5, 999.3, 999, 998.9, 998.4, 998.2, 998, 998.2,
998.3, 998.2, 998.6, 998.5, 998.4, 998.3, 998.3, 998.7, 998.4,
998.5, 998.6, 998.7, 998.8, 999, 999.4, 999.6, 1000.1, 1000.6,
1001.2, 1001.1, 1001.2, 1001.5, 1001.8, 1002.1, 1002.6, 1003.2,
1003.8, 1004.3, 1004.6, 1004.8, 1005, 1005.4, 1005.5, 1005.7,
1006.2, 1006.3, 1006.5, 1006.9, 1007.2, 1007.6, 1008.1, 1008.2,
1008.8, 1009, 1009.3, 1009.2, 1009.4, 1009.6, 1009.6, 1010.1,
1010.8, 1011.4, 1011.9, 1012.1, 1012.6, 1012.7, 1013, 1013.4,
1013.5, 1013.9, 1014.3, 1014.9, 1015.3, 1015.9, 1016.1, 1016.6,
1017.2, 1017.3, 1017.5, 1017.6, 1017.6, 1017.4, 1017.5, 1017.8,
1018.2, 1018.4, 1018.5, 1018.5, 1018.5, 1018.7, 1018.6, 1018.9,
1018.7, 1018.7, 1019, 1019.5, 1019.6, 1019.5, 1019.3, 1019.1,
1018.9, 1018.6, 1018.6, 1018.4, 1018.3, 1018, 1017.6, 1017.8,
1018.1, 1018.3, 1018.3, 1018.3, 1018.3, 1018.4, 1018.1, 1017.8,
1017.6, 1017.4, 1017.5, 1017.5, 1017.5, 1017.6, 1017.7, 1017.8,
1017.5, 1017.4, 1017.1, 1016.9, 1016.8, 1016.8, 1016.9, 1017.1,
1017.3, 1017.5, 1017.4, 1017.3, 1017, 1016.8, 1016.7, 1016.4,
1016, 1015.5, 1015.4, 1015.4, 1015.3, 1015.1, 1015.3, 1015.6,
1015.5, 1014.9, 1014.6, 1013.4, 1012.9, 1012.3, 1012.1, 1012,
1012.1, 1012.4, 1012.7, 1012.7, 1013.3, 1013.6, 1014, 1014.1,
1014.5, 1014.8, 1015.2, 1015.7, 1016.3, 1016.9, 1017.4, 1017.5,
1017.3, 1017.3, 1016.9))
A quick plot shows this:
plot(data$hour, data$measured, type = "l", ylim = c(950, 1250))
lines(data$hour, data$baseline, col = "red")
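The black line is the measured data, and the red line is where it should actually be.
Since the distance between the data and the baseline looks roughly constant, I thought I could take the difference between their means and subtract it.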
correction <- mean(data$measured) - mean(data$baseline)
plot(data$hour, data$measured, type = "l", ylim = c(950, 1250))
lines(data$hour, data$baseline, col = "red")
lines(data$hour, data$measured-correction, col = "green")
This almost works, but as you can see, the green line ends up a bit too low.
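One likely reason the green line sits too low: the measured signal contains positive peaks that the baseline does not, so the difference of the means overestimates the offset in the flat stretches. The sketch below illustrates this on made-up toy data (not the data from the question) and compares it with the median of the pointwise differences, which is less sensitive to peaks.

```r
# Toy stand-in for the question's data; all values are made up for illustration.
baseline <- rep(1000, 20)
measured <- baseline + 140                      # true constant offset: 140
peaks <- c(5, 6, 12)
measured[peaks] <- measured[peaks] + c(40, 60, 25)  # positive peaks

# Offset from the difference of means: pulled upwards by the peaks.
offset_mean <- mean(measured) - mean(baseline)

# Offset from the median of the pointwise differences: unaffected by the
# peaks as long as they cover fewer than half of the samples.
offset_median <- median(measured - baseline)

offset_mean    # 146.25, larger than the true offset
offset_median  # 140
```

On the data frame from the question, the corresponding call would be median(data$measured - data$baseline). Whether a single constant is good enough depends on how stable the offset really is over the 244 hours.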
I also thought about fitting a line through the data. Like this:
fit <- lm(measured ~ poly(hour, degree = 7), data = data)$fitted.values
Does anyone know how to shift the measured values down to the baseline?
Any help is much appreciated.
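I tend to agree with Peace about only applying an offset without changing the shape of the signal.
You can improve the correction by estimating it from a locally stable segment without peaks. For this signal, tail() should work: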
correction <- mean(tail(data$measured)) - mean(tail(data$baseline))
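Note that tail() returns the last 6 elements by default, so the correction above is estimated from only the final six hours. As a minimal sketch, the same idea can be wrapped in a helper with an explicit window length. The name tail_offset and the toy data frame are hypothetical and only serve to illustrate; they are not part of the original post.

```r
# Hypothetical helper: estimate a constant offset from the last n samples,
# assuming that stretch of the measured signal is flat (no peaks).
tail_offset <- function(measured, baseline, n = 6) {
  mean(tail(measured, n)) - mean(tail(baseline, n))
}

# Toy stand-in data (made up): offset of 140 with a peak early in the series.
toy <- data.frame(hour = 1:12,
                  measured = c(1140, 1141, 1185, 1160, 1140, 1140,
                               1141, 1140, 1140, 1141, 1140, 1140),
                  baseline = c(1000, 1001, 1001, 1000, 1000, 1000,
                               1001, 1000, 1000, 1001, 1000, 1000))

correction <- tail_offset(toy$measured, toy$baseline)
toy$corrected <- toy$measured - correction

# In the flat tail the corrected signal should lie on the baseline.
max(abs(tail(toy$corrected) - tail(toy$baseline)))
```

With the data from the question, data$measured - correction can then be drawn with lines() exactly as in the earlier plot. A larger n averages out more noise, but only helps if the chosen window stays free of peaks.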