Data alignment inside a structure in Intel Fortran

Question

I am trying to align the data of the following type in memory:

type foo
   real, allocatable, dimension(:) :: bar1, bar2
   !dir$ attributes align:64 :: bar1
   !dir$ attributes align:64 :: bar2
end type foo

type(foo), allocatable, dimension(:) :: my_foo
allocate(my_foo(1))
allocate(my_foo(1)%bar1(100))
allocate(my_foo(1)%bar2(100))

! somewhere here I need to tell the compiler that data is aligned
!    for a simple array with name `bar` I would just do:
!dir$ assume_aligned bar1: 64
!dir$ assume_aligned bar2: 64
!    but what do I do for the data type I have, something like this?
!dir$ assume_aligned my_foo(1)%bar1: 64
!dir$ assume_aligned my_foo(1)%bar2: 64

do i = 1, 100
   my_foo(1)%bar1(i) = 10.
   my_foo(1)%bar2(i) = 10.
end do

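As you can see, it is an array of structures of type foo, each with two large arrays, bar1 and bar2, as components. I need both arrays aligned in memory on cache-line boundaries.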

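I know how to do that for simple arrays (link), but I have no idea how to do it for this kind of nested data structure. And what if my_foo is not of size 1 but of size, say, 100? Do I loop over its elements?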

Answer 1

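OK, case semi-closed. The solution turned out to be quite simple: point contiguous pointers at the components and apply assume_aligned to the pointers. That should take care of it.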

type foo
   real, allocatable, dimension(:) :: bar1, bar2
   !dir$ attributes align:64 :: bar1
   !dir$ attributes align:64 :: bar2
end type foo

type(foo), target, allocatable, dimension(:) :: my_foo
real, pointer, contiguous :: pt_bar1(:)
real, pointer, contiguous :: pt_bar2(:)
allocate(my_foo(1))
allocate(my_foo(1)%bar1(100))
allocate(my_foo(1)%bar2(100))

pt_bar1 => my_foo(1)%bar1
pt_bar2 => my_foo(1)%bar2
!dir$ assume_aligned pt_bar1:64, pt_bar2:64

pt_bar1 = 10.
pt_bar2 = 10.
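Since assume_aligned is a promise to the compiler rather than a check, it may be worth confirming at run time that the components really do start on a 64-byte boundary. The following is only a sketch: the helper name byte_offset is hypothetical, and it assumes a compiler that supports Fortran 2008 c_loc on contiguous arrays. A compiler that ignores the !dir$ lines will still build it, but may report a nonzero offset.

```fortran
program check_alignment
   use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
   implicit none

   type foo
      real, allocatable, dimension(:) :: bar1, bar2
      !dir$ attributes align:64 :: bar1
      !dir$ attributes align:64 :: bar2
   end type foo

   type(foo), target, allocatable, dimension(:) :: my_foo
   real, pointer, contiguous :: pt_bar1(:)
   real, pointer, contiguous :: pt_bar2(:)

   allocate(my_foo(1))
   allocate(my_foo(1)%bar1(100))
   allocate(my_foo(1)%bar2(100))

   pt_bar1 => my_foo(1)%bar1
   pt_bar2 => my_foo(1)%bar2

   ! An offset of 0 means the array starts on a 64-byte boundary.
   print *, 'bar1 offset (bytes):', byte_offset(pt_bar1, 64)
   print *, 'bar2 offset (bytes):', byte_offset(pt_bar2, 64)

contains

   ! Hypothetical helper: distance in bytes from the start of a
   ! to the preceding multiple of boundary.
   function byte_offset(a, boundary) result(off)
      real, intent(in), target, contiguous :: a(:)
      integer, intent(in) :: boundary
      integer(c_intptr_t) :: off
      integer(c_intptr_t) :: addr

      ! Reinterpret the C address of the first element as an integer.
      addr = transfer(c_loc(a), addr)
      off = modulo(addr, int(boundary, c_intptr_t))
   end function byte_offset

end program check_alignment
```

If either offset is nonzero, the assume_aligned directive above would be telling the compiler something untrue, so the check is best done before relying on it.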

However, an explicit do loop still does not vectorize. That is, if I write the same thing as

do i = 1, 100
   pt_bar1(i) = 10.
   pt_bar2(i) = 10.
end do

it does not get vectorized.

UPD. OK, this does the job (you also need to pass the -qopenmp-simd flag to the compiler):

!$omp simd
!dir$ vector aligned
do i = 1, 100
   pt_bar1(i) = 10.
   pt_bar2(i) = 10.
end do

Also, if you are looping over my_foo(j)%..., make sure to disassociate the pointers after each iteration with pt_bar1 => null() and so on.
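For the case raised in the question, where my_foo has 100 elements, the pieces above might be combined roughly as follows. This is a sketch rather than a tested recipe: the sizes n_foo and n are illustrative names that do not appear in the original code, and whether the inner loop actually vectorizes depends on the compiler and flags, so it is worth checking the compiler's vectorization report.

```fortran
program loop_over_structs
   implicit none

   type foo
      real, allocatable, dimension(:) :: bar1, bar2
      !dir$ attributes align:64 :: bar1
      !dir$ attributes align:64 :: bar2
   end type foo

   ! Illustrative sizes (assumptions, not from the original post).
   integer, parameter :: n_foo = 100, n = 100

   type(foo), target, allocatable, dimension(:) :: my_foo
   real, pointer, contiguous :: pt_bar1(:)
   real, pointer, contiguous :: pt_bar2(:)
   integer :: i, j

   allocate(my_foo(n_foo))
   do j = 1, n_foo
      allocate(my_foo(j)%bar1(n))
      allocate(my_foo(j)%bar2(n))
   end do

   do j = 1, n_foo
      ! Re-point the pointers at the components of the current element.
      pt_bar1 => my_foo(j)%bar1
      pt_bar2 => my_foo(j)%bar2
      !dir$ assume_aligned pt_bar1:64, pt_bar2:64

      !$omp simd
      !dir$ vector aligned
      do i = 1, n
         pt_bar1(i) = 10.
         pt_bar2(i) = 10.
      end do

      ! Disassociate before moving on to the next element.
      pt_bar1 => null()
      pt_bar2 => null()
   end do

   ! Sanity check that every element was written.
   do j = 1, n_foo
      if (any(my_foo(j)%bar1 /= 10.) .or. any(my_foo(j)%bar2 /= 10.)) then
         error stop 'unexpected value in my_foo'
      end if
   end do
   print *, 'all elements set'

end program loop_over_structs
```

The pointers are re-associated inside the outer loop so that the alignment assertion always refers to the element currently being processed.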

PS. Thanks to BW from our department for the help. :) Sometimes personal communication > Stack Overflow (not always, just sometimes).

Tags: fortran, vectorization, memory-alignment, intel-fortran