如何矢量化matlab中的天线arrayfactor表达式
How to vectorize the antenna arrayfactor expression in matlab
我这里有天线阵列因子表达式:
我已将数组因子表达式编码如下:
lambda = 1;
M = 100;N = 200; %an M x N array
dx = 0.3*lambda; %inter-element spacing in x direction
m = 1:M;
xm = (m - 0.5*(M+1))*dx; %element positions in x direction
dy = 0.4*lambda;
n = 1:N;
yn = (n - 0.5*(N+1))*dy;
thetaCount = 360; % no of theta values
thetaRes = 2*pi/thetaCount; % theta resolution
thetas = 0:thetaRes:2*pi-thetaRes; % theta values
phiCount = 180;
phiRes = pi/phiCount;
phis = -pi/2:phiRes:pi/2-phiRes;
cmpWeights = rand(N,M); %complex Weights
AF = zeros(phiCount,thetaCount); %Array factor
tic
for i = 1:phiCount
for j = 1:thetaCount
for p = 1:M
for q = 1:N
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))));
end
end
end
end
如何向量化计算数组因子 (AF) 的代码。
我想要这条线:
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))));
以矢量形式编写(通过修改 for 循环)。
方法 #1:全速
最内层的嵌套循环每次迭代都会生成这个 - cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))))
,它们将迭代求和以在 AF
中为我们提供最终输出。
让我们将 exp(....
部分称为 B
。现在,B
基本上有两部分,一部分是标量 (2*pi*1j/lambda)
另一部分
(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i)))
由依赖于的变量组成
原始循环版本中使用的四个迭代器 - i,j,p,q
。我们将此另一部分称为 C
,以便稍后参考。
让我们全面了解一下:
Loopy 版本有 AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))))
,现在相当于 AF(i,j) = AF(i,j) + cmpWeights(q,p)*B
,其中 B = exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))))
.
B
可以简化为B = exp((2*pi*1j/lambda)* C)
,其中C = (xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i)))
.
C
将取决于迭代器 - i,j,p,q
.
所以,在移植到矢量化方式之后,它最终会变成这样 -
%// 1) Define vectors corresponding to iterators used in the loopy version
I = 1:phiCount;
J = 1:thetaCount;
P = 1:M;
Q = 1:N;
%// 2) Create vectorized version of C using all four vector iterators
mult1 = bsxfun(@times,sin(thetas(J)),cos(phis(I)).'); %//'
mult2 = bsxfun(@times,sin(thetas(J)),sin(phis(I)).'); %//'
mult1_xm = bsxfun(@times,mult1(:),permute(xm,[1 3 2]));
mult2_yn = bsxfun(@times,mult2(:),yn);
C_vect = bsxfun(@plus,mult1_xm,mult2_yn);
%// 3) Create vectorized version of B using vectorized C
B_vect = reshape(exp((2*pi*1j/lambda)*C_vect),phiCount*thetaCount,[]);
%// 4) Final output as matrix multiplication between vectorized versions of B and C
AF_vect = reshape(B_vect*cmpWeights(:),phiCount,thetaCount);
方法 #2:较少内存占用
第二种方法会减少内存流量,它使用指数分布 属性 - exp(A+B) = exp(A)*exp(B)
.
现在,原来的loopy版本是这样的-
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*...
(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))))
所以,在使用分配式 属性 之后,我们最终会得到这样的结果 -
K = (2*pi*1j/lambda)
part1 = K*xm(p)*sin(thetas(j))*cos(phis(i));
part2 = K*yn(q)*sin(thetas(j))*sin(phis(i));
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp(part1)*exp(part2);
因此,相关的矢量化方法将变成这样-
%// 1) Define vectors corresponding to iterators used in the loopy version
I = 1:phiCount;
J = 1:thetaCount;
P = 1:M;
Q = 1:N;
%// 2) Define the constant used at the start of EXP() call
K = (2*pi*1j/lambda);
%// 3) Perform the sine-cosine operations part1 & part2 in vectorized manners
mult1 = K*bsxfun(@times,sin(thetas(J)),cos(phis(I)).'); %//'
mult2 = K*bsxfun(@times,sin(thetas(J)),sin(phis(I)).'); %//'
%// Perform exp(part1) & exp(part2) in vectorized manners
part1_vect = exp(bsxfun(@times,mult1(:),xm));
part2_vect = exp(bsxfun(@times,mult2(:),yn));
%// Perform multiplications with cmpWeights for final output
AF = reshape(sum((part1_vect*cmpWeights.').*part2_vect,2),phiCount,[])
快速基准测试
这是原始循环方法和建议方法 #2 的问题中列出的输入数据的运行时 -
---------------------------- With Original Approach
Elapsed time is 358.081507 seconds.
---------------------------- With Proposed Approach #2
Elapsed time is 0.405038 seconds.
运行时建议使用 Approach #2
!
进行疯狂的性能改进
基本技巧是弄清楚哪些事物是常数,哪些事物取决于下标项 - 因此是矩阵项。
内总和:
C(n,m)
是一个矩阵
2π/λ
是常数
sin(θ)cos(φ)
是常数
x(m)
和 y(n)
是向量
所以我要做的两件事是:
- 使用
meshgrid()
将 xm
和 ym
展开为矩阵
- 将所有常数项的东西都放在循环之外。
像这样:
...
piFactor = 2 * pi * 1j / lambda;
[xgrid, ygrid] = meshgrid(xm, ym); % xgrid and ygrid will be size (N, M)
for i = 1:phiCount
for j = 1:thetaCount
xFactor = sin(thetas(j)) * cos(phis(i));
yFactor = sin(thetas(j)) * sin(phis(i));
expFactor = exp(piFactor * (xgrid * xFactor + ygrid * yFactor)); % expFactor is size (N, M)
elements = cmpWeights .* expFactor; % elements of sum, size (N, M)
AF(i, j) = AF(i, j) + sum(elements(:)); % sum and then integrate.
end
end
您可能也想出如何矢量化外循环,但希望这能为您提供一个起点。
我这里有天线阵列因子表达式:
我已将数组因子表达式编码如下:
lambda = 1;
M = 100;N = 200; %an M x N array
dx = 0.3*lambda; %inter-element spacing in x direction
m = 1:M;
xm = (m - 0.5*(M+1))*dx; %element positions in x direction
dy = 0.4*lambda;
n = 1:N;
yn = (n - 0.5*(N+1))*dy;
thetaCount = 360; % no of theta values
thetaRes = 2*pi/thetaCount; % theta resolution
thetas = 0:thetaRes:2*pi-thetaRes; % theta values
phiCount = 180;
phiRes = pi/phiCount;
phis = -pi/2:phiRes:pi/2-phiRes;
cmpWeights = rand(N,M); %complex Weights
AF = zeros(phiCount,thetaCount); %Array factor
tic
for i = 1:phiCount
for j = 1:thetaCount
for p = 1:M
for q = 1:N
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))));
end
end
end
end
如何向量化计算数组因子 (AF) 的代码。
我想要这条线:
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))));
以矢量形式编写(通过修改 for 循环)。
方法 #1:全速
最内层的嵌套循环每次迭代都会生成这个 - cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))))
,它们将迭代求和以在 AF
中为我们提供最终输出。
让我们将 exp(....
部分称为 B
。现在,B
基本上有两部分,一部分是标量 (2*pi*1j/lambda)
另一部分
(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i)))
由依赖于的变量组成
原始循环版本中使用的四个迭代器 - i,j,p,q
。我们将此另一部分称为 C
,以便稍后参考。
让我们全面了解一下:
Loopy 版本有
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))))
,现在相当于AF(i,j) = AF(i,j) + cmpWeights(q,p)*B
,其中B = exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))))
.B
可以简化为B = exp((2*pi*1j/lambda)* C)
,其中C = (xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i)))
.C
将取决于迭代器 -i,j,p,q
.
所以,在移植到矢量化方式之后,它最终会变成这样 -
%// 1) Define vectors corresponding to iterators used in the loopy version
I = 1:phiCount;
J = 1:thetaCount;
P = 1:M;
Q = 1:N;
%// 2) Create vectorized version of C using all four vector iterators
mult1 = bsxfun(@times,sin(thetas(J)),cos(phis(I)).'); %//'
mult2 = bsxfun(@times,sin(thetas(J)),sin(phis(I)).'); %//'
mult1_xm = bsxfun(@times,mult1(:),permute(xm,[1 3 2]));
mult2_yn = bsxfun(@times,mult2(:),yn);
C_vect = bsxfun(@plus,mult1_xm,mult2_yn);
%// 3) Create vectorized version of B using vectorized C
B_vect = reshape(exp((2*pi*1j/lambda)*C_vect),phiCount*thetaCount,[]);
%// 4) Final output as matrix multiplication between vectorized versions of B and C
AF_vect = reshape(B_vect*cmpWeights(:),phiCount,thetaCount);
方法 #2:较少内存占用
第二种方法会减少内存流量,它使用指数分布 属性 - exp(A+B) = exp(A)*exp(B)
.
现在,原来的loopy版本是这样的-
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*...
(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))))
所以,在使用分配式 属性 之后,我们最终会得到这样的结果 -
K = (2*pi*1j/lambda)
part1 = K*xm(p)*sin(thetas(j))*cos(phis(i));
part2 = K*yn(q)*sin(thetas(j))*sin(phis(i));
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp(part1)*exp(part2);
因此,相关的矢量化方法将变成这样-
%// 1) Define vectors corresponding to iterators used in the loopy version
I = 1:phiCount;
J = 1:thetaCount;
P = 1:M;
Q = 1:N;
%// 2) Define the constant used at the start of EXP() call
K = (2*pi*1j/lambda);
%// 3) Perform the sine-cosine operations part1 & part2 in vectorized manners
mult1 = K*bsxfun(@times,sin(thetas(J)),cos(phis(I)).'); %//'
mult2 = K*bsxfun(@times,sin(thetas(J)),sin(phis(I)).'); %//'
%// Perform exp(part1) & exp(part2) in vectorized manners
part1_vect = exp(bsxfun(@times,mult1(:),xm));
part2_vect = exp(bsxfun(@times,mult2(:),yn));
%// Perform multiplications with cmpWeights for final output
AF = reshape(sum((part1_vect*cmpWeights.').*part2_vect,2),phiCount,[])
快速基准测试
这是原始循环方法和建议方法 #2 的问题中列出的输入数据的运行时 -
---------------------------- With Original Approach
Elapsed time is 358.081507 seconds.
---------------------------- With Proposed Approach #2
Elapsed time is 0.405038 seconds.
运行时建议使用 Approach #2
!
基本技巧是弄清楚哪些事物是常数,哪些事物取决于下标项 - 因此是矩阵项。
内总和:
C(n,m)
是一个矩阵2π/λ
是常数sin(θ)cos(φ)
是常数x(m)
和y(n)
是向量
所以我要做的两件事是:
- 使用
meshgrid()
将 - 将所有常数项的东西都放在循环之外。
xm
和 ym
展开为矩阵
像这样:
...
piFactor = 2 * pi * 1j / lambda;
[xgrid, ygrid] = meshgrid(xm, ym); % xgrid and ygrid will be size (N, M)
for i = 1:phiCount
for j = 1:thetaCount
xFactor = sin(thetas(j)) * cos(phis(i));
yFactor = sin(thetas(j)) * sin(phis(i));
expFactor = exp(piFactor * (xgrid * xFactor + ygrid * yFactor)); % expFactor is size (N, M)
elements = cmpWeights .* expFactor; % elements of sum, size (N, M)
AF(i, j) = AF(i, j) + sum(elements(:)); % sum and then integrate.
end
end
您可能也想出如何矢量化外循环,但希望这能为您提供一个起点。