Efficient way of padding an array

I would like to know whether there is an efficient way to pad an array in Python without using numpy.pad().

I know one way that uses nested for loops, but is there a faster way?

Input:

row padding on top: 2
column padding from left: 1
1 2 3
4 5 6
7 8 9

Output:

0 0 0 0
0 0 0 0
0 1 2 3
0 4 5 6
0 7 8 9

What I did:

y = [[1,2,3],[4,5,6],[7,8,9]]

topPadding = 2
leftPadding = 1
noOfRows = len(y)+topPadding
noOfCols = len(y[0])+leftPadding  # width comes from a row, not from the row count

# Start from an all-zero grid of the padded size
x = [[0 for i in range(noOfCols)] for j in range(noOfRows)]

# Copy each original element to its shifted position
for i in range(topPadding,noOfRows):
    for j in range(leftPadding,noOfCols):
        x[i][j] = y[i-topPadding][j-leftPadding]

print(x)

Output:

[[0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 2, 3], [0, 4, 5, 6], [0, 7, 8, 9]]

A solution using the list concatenation and repetition operators:

def concat(x, top, left):
    n = len(x[0])
    return [[0]*(n + left - len(row)) + row for row in [[]]*top + x]
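
As a quick check, here is the same function applied to the 3x3 input from the question. The variable names `y` and `padded` are only illustrative. The `top` empty lists prepended to `x` each get expanded to a full row of zeros, while the real rows only receive `left` zeros in front.

```python
def concat(x, top, left):
    n = len(x[0])  # width of the original matrix
    # [[]] * top adds `top` empty rows; every row is then left-filled
    # with zeros up to the final width n + left.
    return [[0] * (n + left - len(row)) + row for row in [[]] * top + x]


y = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
padded = concat(y, top=2, left=1)
for row in padded:
    print(row)
```

This should print two rows of four zeros followed by the original rows, each with one leading zero, matching the output requested in the question.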

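Here are some very basic timing results comparing your nested for-loop solution with my concatenation solution on a 10000x10000 matrix of random digits (total time for 10 calls each):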

nested: 122.26 s
concat: 5.66 s

Test code:

import timeit
from random import randint


def concat(x, top, left):
    n = len(x[0])
    return [[0]*(n + left - len(row)) + row for row in [[]]*top + x]


def nested(x, topPadding, leftPadding):
    noOfRows = len(x)+topPadding
    noOfCols = len(x[0])+leftPadding

    z = [[0 for i in range(noOfCols)] for j in range(noOfRows)]

    for i in range(topPadding,noOfRows):
        for j in range(leftPadding,noOfCols):
            z[i][j] = x[i-topPadding][j-leftPadding]

    return z


test = [[randint(0, 9) for _ in range(10000)] for _ in range(10000)]

t1 = timeit.timeit(
    "nested(test, 4, 2)",
    number=10,
    globals=globals()
)

t2 = timeit.timeit(
    "concat(test, 4, 2)",
    number=10,
    globals=globals()
)

print(nested(test, 4, 2) == concat(test, 4, 2))
print(f"nested: {t1:.2f} s")
print(f"concat: {t2:.2f} s")

Full output:

True
nested: 122.26 s
concat: 5.66 s

A modified version where you pass in the desired final height and width:

def concat(x, h, w):
    H = h - len(x)
    return [[0]*(w - len(row)) + row for row in [[]]*H + x]
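
A small usage sketch of the height/width variant on a non-square input may help here. The sample matrix `m` is made up for illustration. Note that `h` and `w` are the target dimensions, not the amount of padding.

```python
def concat(x, h, w):
    H = h - len(x)  # number of zero rows to add on top
    # Each row (including the H empty ones) is left-filled to width w.
    return [[0] * (w - len(row)) + row for row in [[]] * H + x]


m = [[1, 2], [3, 4], [5, 6]]   # 3 rows, 2 columns
result = concat(m, 4, 5)       # pad to 4 rows, 5 columns
for row in result:
    print(row)
```

If `h` or `w` is smaller than the current size, repeating a list a negative number of times gives an empty list, so this version adds no padding in that direction and does not crop.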

Another version that allows padding to the north, south, east, and west:

def nsew_concat(x, N, S, E, W):
    """Pad x with zeros to the north, south, east, and west."""
    k = len(x[0])
    stack = [[]]*N + x + [[]]*S
    return [([0]*W + [0]*(k - len(row)) + row + [0]*E) for row in stack]
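
To make the argument order concrete, here is a minimal example call, assuming a made-up 2x2 input named `small`. `N` and `S` count rows added above and below; `W` and `E` count columns added on the left and right.

```python
def nsew_concat(x, N, S, E, W):
    """Pad x with zeros to the north, south, east, and west."""
    k = len(x[0])
    stack = [[]] * N + x + [[]] * S
    # Empty placeholder rows get k zeros in the middle; real rows get none.
    return [([0] * W + [0] * (k - len(row)) + row + [0] * E) for row in stack]


small = [[1, 2], [3, 4]]
out = nsew_concat(small, N=1, S=2, E=1, W=2)
for row in out:
    print(row)
```

With these arguments the result has 5 rows of width 5, with the original values sitting in columns 2 and 3 of rows 1 and 2.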

This works for any (non-empty) rectangular matrix, but not for jagged arrays (where the rows have different lengths).

def pad_matrix(matrix, element=0, *, left=0, top=0):
    full_width = left + len(matrix[0])
    e = [element]
    return [
        *(e * full_width for i in range(top)),
        *(e * left + row for row in matrix),
    ]

Here is a version that allows padding on all four sides:

def pad_matrix(matrix, element=0, *, left=0, top=0, right=0, bottom=0):
    full_width = left + len(matrix[0]) + right
    e = [element]
    return [
        *(e * full_width for i in range(top)),
        *(e * left + row + e * right for row in matrix),
        *(e * full_width for i in range(bottom)),
    ]

Usage looks like pad_matrix(matrix, left=1, top=2).
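
For the four-sided version, a slightly fuller call might look like the sketch below. The input `grid` and the fill value -1 are arbitrary choices for illustration. The last lines check that the padding rows are separate list objects, since each one is built by its own `e * full_width` expression.

```python
def pad_matrix(matrix, element=0, *, left=0, top=0, right=0, bottom=0):
    full_width = left + len(matrix[0]) + right
    e = [element]
    return [
        *(e * full_width for _ in range(top)),             # rows above
        *(e * left + row + e * right for row in matrix),   # widened originals
        *(e * full_width for _ in range(bottom)),          # rows below
    ]


grid = [[1, 2], [3, 4]]
framed = pad_matrix(grid, element=-1, left=1, top=1, right=2, bottom=1)
for row in framed:
    print(row)

# Editing one padding row leaves the other padding rows and the input alone.
framed[0][0] = 99
print(framed[-1][0])
print(grid)
```

Because the directional arguments are keyword-only, a call such as pad_matrix(grid, 0, 1, 2) raises a TypeError rather than silently guessing which side is meant.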