How to replace column values in dataframe-js?
I have two dataframe-js DataFrames:
const df1 = new DataFrame([
[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
], ['c1', 'c2', 'c3', 'c4', 'c5']);
and
const df2 = new DataFrame([
[11, 22, 33, 44, 55],
[11, 22, 33, 44, 55],
[11, 22, 33, 44, 55],
], ['c1', 'c2', 'c3', 'c4', 'c5']);
df1.show(df1.count())
gives:
| c1 | c2 | c3 | c4 | c5 |
------------------------------------------------------------
| 1 | 2 | 3 | 4 | 5 |
| 1 | 2 | 3 | 4 | 5 |
| 1 | 2 | 3 | 4 | 5 |
df2.show(df2.count())
gives:
| c1 | c2 | c3 | c4 | c5 |
------------------------------------------------------------
| 11 | 22 | 33 | 44 | 55 |
| 11 | 22 | 33 | 44 | 55 |
| 11 | 22 | 33 | 44 | 55 |
What is the best way to replace all values in columns c2 and c3 of df1 with the corresponding column values from df2?
In the end I want to get:
| c1 | c2 | c3 | c4 | c5 |
------------------------------------------------------------
| 1 | 22 | 33 | 4 | 5 |
| 1 | 22 | 33 | 4 | 5 |
| 1 | 22 | 33 | 4 | 5 |
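The way I do it (fast):
const cols = ['c2', 'c3']
const values = df2.select(...cols).toArray()
// df1 is declared with const above, so build the result in a separate variable
let result = df1
cols.forEach((col, i) => {
  // replace column `col` row by row with the value at the same position in df2
  result = result.withColumn(col, (row, j) => values[j][i])
})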
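Or (equally fast):
const cols = ['c2', 'c3']
const values = df2.select(...cols).toArray()
// df1 is declared with const above, so build the result in a separate variable
let result = df1
cols.forEach((col, i) => {
  // set column `col` on each row to the value at the same position in df2
  result = result.chain((row, j) => row.set(col, values[j][i]))
})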
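Or even shorter (but about 10 times slower):
const cols = ['c2', 'c3']
let result = df1
for (const col of cols) {
  // df2.select(col).toArray() is re-evaluated for every row, which is the likely cause of the slowdown
  result = result.withColumn(col, (row, j) => df2.select(col).toArray()[j][0])
}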
Is there a simpler way to achieve the same result?
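For reference, the positional replacement described above can be sketched in a single pass over plain row objects, without going through the library at all. This is only an illustration under assumptions: `replaceColumns`, `rows1` and `rows2` are hypothetical names, not part of dataframe-js, and the row objects mimic the shape I would expect from exporting each DataFrame as a collection of objects.

```javascript
// Hypothetical helper (NOT a dataframe-js API): returns new row objects in which
// the listed columns are taken from `source`, matched by row position.
function replaceColumns(target, source, cols) {
  if (target.length !== source.length) {
    throw new Error('row counts differ');
  }
  return target.map((row, j) => {
    const patched = { ...row }; // copy so the input rows stay untouched
    for (const col of cols) {
      patched[col] = source[j][col];
    }
    return patched;
  });
}

// Plain-object stand-ins for df1 and df2 (assumed shape, one object per row).
const rows1 = [
  { c1: 1, c2: 2, c3: 3, c4: 4, c5: 5 },
  { c1: 1, c2: 2, c3: 3, c4: 4, c5: 5 },
  { c1: 1, c2: 2, c3: 3, c4: 4, c5: 5 },
];
const rows2 = [
  { c1: 11, c2: 22, c3: 33, c4: 44, c5: 55 },
  { c1: 11, c2: 22, c3: 33, c4: 44, c5: 55 },
  { c1: 11, c2: 22, c3: 33, c4: 44, c5: 55 },
];

const replaced = replaceColumns(rows1, rows2, ['c2', 'c3']);
console.log(replaced);
```

Each row is visited once regardless of how many columns are replaced, whereas the loops above rebuild the DataFrame once per column. Whether that matters in practice depends on the data size.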