Pixel coordinates vs drawing coordinates

Question

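In the code snippet below, passing x and y values places the pixel at (y, x), while the drawing is done at (x, y). What is the correct way to set up the drawing buffer so that pixels and drawings are placed in the same coordinate system?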

import numpy as np
from PIL import Image, ImageDraw

def visual_test(x, y):
    grid = np.zeros((100, 100, 3), dtype=np.uint8)
    grid[:] = [0, 0, 0]
    grid[x, y] = [255, 0, 0]
    img = Image.fromarray(grid, 'RGB')
    draw = ImageDraw.Draw(img)
    draw.line((x, y, x, y-5), fill=(255,255,255), width=1)
    img.show()

Answer 1

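Note: by "axis" I mean image coordinates, not NumPy array dimensions.

The problem lies in how the dimensions of an ndarray ("The N-dimensional array") are interpreted, or in how the coordinate system is defined in that context.

For Pillow, it is clear: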

Coordinate System

The Python Imaging Library uses a Cartesian pixel coordinate system, with (0,0) in the upper left corner. Note that the coordinates refer to the implied pixel corners; the centre of a pixel addressed as (0, 0) actually lies at (0.5, 0.5).

Coordinates are usually passed to the library as 2-tuples (x, y). Rectangles are represented as 4-tuples, with the upper left corner given first. For example, a rectangle covering all of an 800x600 pixel image is written as (0, 0, 800, 600).

It looks like this (image: public domain):

Your code, modified to create a 2x2 pixel image:

import numpy as np
from PIL import Image # Pillow

w, h, d = 2,2,3
x,y = 0,1

grid = np.zeros((w, h, d), dtype=np.uint8) # NumPy array for image data
#test = np.zeros(w*h*d, dtype=np.uint8).reshape(w, h, d)
#print(np.array_equal(grid,test)) # => True

# red pixel with NumPy
grid[x, y] = [255, 0, 0]

print(grid[::])

# green pixel with Pillow
img = Image.fromarray(grid, 'RGB')
pixels = img.load()
pixels[x,y] = (0, 255, 0)

# display temporary image file with default application
scale = 100
img.resize((w*scale,h*scale)).show()

This shows the problem (pixel drawn at (0,1); green: Pillow, red: ndarray):

X and Y are indeed swapped:

Is this because of NumPy or Pillow?

The ndarray is printed as

[[[  0   0   0]
  [255   0   0]]

 [[  0   0   0]
  [  0   0   0]]]

which is easily reformatted to correspond visually to the image pixels

[
 [ [  0   0   0] [255   0   0] ]
 [ [  0   0   0] [  0   0   0] ]
]

This shows that Pillow interprets the array as one would expect.

But why does NumPy's ndarray seem to swap the axes?

Let's break it down a bit further

[ # grid
 [ # grid[0]
   [  0   0   0]  #grid[0,0]
                  [255   0   0] #grid[0,1]
 ]
 [ #grid[1]
   [  0   0   0]  #grid[1,0]
                  [  0   0   0] #grid[1,1]
 ]
]

Let's test it (-i makes Python stay in interactive mode once the script has finished):

>py -i t.py
[[[  0   0   0]
  [255   0   0]]

 [[  0   0   0]
  [  0   0   0]]]
>>> grid[0,1]
array([255,   0,   0], dtype=uint8)
>>> grid[0]
array([[  0,   0,   0],
       [255,   0,   0]], dtype=uint8)
>>> ^Z

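This confirms the indices assumed above.

It is now clear that the first dimension of the ndarray corresponds to the image rows, or the Y axis, and the second dimension to the image columns, or the X axis (the third obviously corresponds to the RGB pixel values).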
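The row/column order is easier to see with a non-square image. The following is only a minimal sketch; the sizes 4 and 2 are arbitrary values chosen for illustration and are not part of the original code. NumPy reports the shape as (height, width, channels), while Pillow reports the size as (width, height).

```python
import numpy as np
from PIL import Image

width, height = 4, 2  # arbitrary, deliberately non-square (assumption for this sketch)

# NumPy: first dimension = rows (Y), second = columns (X), third = RGB
grid = np.zeros((height, width, 3), dtype=np.uint8)
img = Image.fromarray(grid)  # uint8 array of shape (h, w, 3) becomes an RGB image

print(grid.shape)  # (2, 4, 3)
print(img.size)    # (4, 2)
```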

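So, to match the "coordinate systems", either...

...the axes need to be "swapped"
...the data needs to be "swapped"
...the interpretation of the axes needs to be "swapped"

Let's see:

1. Simply swap the index variables when writing to the ndarray: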

# red pixel with NumPy
grid[y, x] = [255, 0, 0]

The result is as expected

[[[  0   0   0]
  [  0   0   0]]

 [[255   0   0]
  [  0   0   0]]]

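and in the image (not reproduced here) the NumPy pixel now lands at the same position as the Pillow pixel.

Of course, a wrapper function could do this.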
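Such a wrapper might look like the sketch below. The name set_pixel is hypothetical and not taken from NumPy or Pillow; it only hides the index swap so that callers can think in image coordinates.

```python
import numpy as np
from PIL import Image

def set_pixel(grid, x, y, color):
    """Hypothetical helper: write a pixel using image coordinates (x = column, y = row)."""
    grid[y, x] = color  # the ndarray is indexed as [row, column]

grid = np.zeros((2, 2, 3), dtype=np.uint8)
set_pixel(grid, 0, 1, (255, 0, 0))  # red pixel at image position (0, 1)

img = Image.fromarray(grid)
print(img.getpixel((0, 1)))  # (255, 0, 0)
```

Pillow now finds the red pixel at the same (x, y) that was passed to the helper.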

2. Transposing the array does not work quite so easily on a 3-dimensional array, because by default np.transpose reverses the order of all dimensions:

grid = np.transpose(grid)
print("transposed\n", grid)
print("shape:", grid.shape)

which results in

[[[  0   0]
  [255   0]]

 [[  0   0]
  [  0   0]]

 [[  0   0]
  [  0   0]]]
shape: (3, 2, 2)

and, since the Pillow RGB image mode was specified, an exception is thrown:

ValueError: not enough image data

But np.transpose has an additional parameter, axes:

...permute the axes according to the values given.

We only want to swap 0 and 1, not 2, so:

grid = np.transpose(grid, (1,0,2))

There are other functions that do the same, e.g.

grid = np.swapaxes(grid,0,1)
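As a quick check that the two calls above do the same thing, here is a small sketch on a non-square array (the shape (2, 4, 3) and the names a and b are arbitrary choices for illustration):

```python
import numpy as np

grid = np.zeros((2, 4, 3), dtype=np.uint8)
grid[0, 1] = [255, 0, 0]

a = np.transpose(grid, (1, 0, 2))  # swap axes 0 and 1, keep axis 2 (RGB)
b = np.swapaxes(grid, 0, 1)        # same permutation, different spelling

print(a.shape)               # (4, 2, 3)
print(np.array_equal(a, b))  # True
print(a[1, 0])               # [255   0   0]
```

Both calls return views onto the same data rather than copies, so writing through a also changes grid.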

3. Changing the interpretation?

Can Pillow's PIL.Image.fromarray be brought to interpret the ndarray with swapped axes? It does not have any other arguments than mode for color (really, see the source code).

Creates an image memory from an object exporting the array interface (using the buffer protocol). If obj is not contiguous, then the tobytes method is called and frombuffer() is used.

The function figures out how to call PIL.Image.frombuffer() (source), which offers more options for the "decoder".

Array interface? Buffer protocol? Both are a bit too low-level for now...

TL;DR
Just swap the index variables (either way)!
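Applied to the function from the question, this could look like the sketch below. It deviates from the original in three labeled ways, all of them assumptions made for illustration: the image is returned instead of shown, the line starts one pixel above the red pixel so that it does not overwrite it, and the coordinates (20, 60) are arbitrary.

```python
import numpy as np
from PIL import Image, ImageDraw

def visual_test(x, y):
    grid = np.zeros((100, 100, 3), dtype=np.uint8)
    grid[y, x] = [255, 0, 0]  # swapped: row (y) first, then column (x)
    img = Image.fromarray(grid)
    draw = ImageDraw.Draw(img)
    # start one pixel above the red pixel so the line does not overwrite it
    draw.line((x, y - 1, x, y - 5), fill=(255, 255, 255), width=1)
    return img

img = visual_test(20, 60)
print(img.getpixel((20, 60)))  # (255, 0, 0)
# img.show()  # uncomment to view the result
```

The white line now sits directly above the red pixel, so both use the same (x, y) coordinates.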

Further reading: https://docs.scipy.org/doc/numpy-dev/user/quickstart.html
