Timing Experiment - Matrices

(a) Determine a matrix size that fits comfortably into your available RAM. For example, on a machine with 4 GB of RAM you should be able to comfortably store a matrix that occupies about 800 MB. Store this value in a variable Mb. Use the following information to compute the maximum matrix dimension N that you can store in Mb megabytes of memory.

  • A megabyte has 1024 kilobytes

  • A kilobyte is 1024 bytes

  • A floating point number is 8 bytes.

  • An N × N matrix contains N^2 floating point numbers.

Call the N you compute nmax.

(b) Create two random matrices A and B, each of size nmax × nmax. Using the MATLAB functions tic and toc, determine how much time (in seconds) it takes to compute the product AB. Determine the number of floating point operations (additions and multiplications) it takes to compute the nmax × nmax matrix-matrix product, taken here to be (2/3)n^3 with n = nmax. Use this number to estimate the number of floating point operations per second ('flops') your computer can carry out. Call this flop rate flops.

% Part A
nmax = sqrt((1600*1024*1024)/8); % 8GB of RAM

% Part B
A = (nmax:nmax);
B = (nmax:nmax);

tic 
prod = A*B;
prod_time = toc

flops = (2/3)*(prod).^3

Everything runs without errors, but I don't think I am actually creating matrices for A and B. What am I doing wrong?

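There are two main problems. First, you messed up your matrix allocation: c:c, where c is a constant, just returns the constant. The colon operator : creates arrays, e.g.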

c = 5;
N = 1:c
    1  2  3  4  5

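Giving the colon operator the same start and end point just returns that single point, so A and B in your code are scalars, not nmax × nmax matrices.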
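A small sketch of the difference may help here. It contrasts the colon operator with rand, which is what the exercise asks for when it says "random matrices". The variable names v, w and M are only illustrative.

```matlab
c = 5;

v = c:c;        % colon with equal endpoints: a single element, the scalar 5
w = 1:c;        % colon with distinct endpoints: the row vector 1 2 3 4 5
M = rand(c);    % c-by-c matrix of uniformly distributed random numbers

size_v = size(v);   % [1 1]
size_w = size(w);   % [1 5]
size_M = size(M);   % [5 5]
```

With nmax in place of c, rand(nmax) gives the nmax × nmax matrix the exercise needs, whereas nmax:nmax is still a single number.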

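Second: the total number of operations is determined by the size of the matrices (nmax), not by the actual result of the matrix product (its values are irrelevant here; we are only timing it). So first calculate the total number of FLoating point OPerations.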

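Remember that we used tic/toc? The total time is stored in prod_time: that is the number of seconds it took to perform the matrix multiplication. Divide Totflops by prod_time to get the FLoating point Operations Per Second, i.e. FLOPS.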
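The arithmetic can be checked by hand before anything is timed. The sketch below assumes the 800 MB budget from the exercise's example and a purely hypothetical elapsed time of 60 seconds; neither number is a measurement. The (2/3)n^3 count is used as the exercise gives it. A textbook count for a dense n × n product is closer to 2n^3, so the constant is best read as the exercise's convention.

```matlab
Mb = 800;                             % assumed budget in megabytes (exercise's example)
bytes_total = Mb*1024*1024;           % megabytes -> kilobytes -> bytes
nmax = floor(sqrt(bytes_total/8));    % 8 bytes per double; gives 10240 for Mb = 800

Totflops = (2/3)*nmax^3;              % operation count as given in the exercise
prod_time = 60;                       % hypothetical elapsed time in seconds
flops = Totflops/prod_time;           % operations per second under that assumption
```

Note that the rate uses nmax, the matrix dimension, and never the entries of the product itself.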


[~,systemview] = memory; % Get RAM info (the memory function is only available on Windows)
tmp = systemview.PhysicalMemory;
% tmp.Total stores the total system RAM

Mb = 0.2*((tmp.Total/(1024^2))); % 20% of the total RAM

% floor your nmax to force it to be integer
nmax = floor(sqrt((Mb*1024^2/8))); % create your nmax

A = rand(nmax); % random nmax x nmax matrix
B = rand(nmax); % random nmax x nmax matrix

tic
prod = A*B;
prod_time = toc;

% Total flops
Totflops = (2/3)*(nmax).^3;

flops = Totflops/prod_time; % flops/sec

My system (8 GB RAM and an i5 750 at 2.66 GHz) gives flops = 1.0617e+10
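For a quick trial, or on a system where the memory function is not available, the same procedure can be run with a hand-picked budget. This is only a sketch: the 8 MB budget, the repetition count nreps and the choice of the fastest run are assumptions made for illustration, not part of the exercise.

```matlab
Mb = 8;                               % assumed small budget (MB) for a quick trial
nmax = floor(sqrt(Mb*1024^2/8));      % 1024 for Mb = 8

A = rand(nmax);                       % random nmax x nmax matrix
B = rand(nmax);                       % random nmax x nmax matrix

nreps = 3;                            % hypothetical number of repetitions
t = zeros(1, nreps);
for k = 1:nreps
    tic
    C = A*B;                          % named C to avoid shadowing the built-in prod
    t(k) = toc;
end

prod_time = min(t);                   % fastest of the nreps runs, in seconds
Totflops = (2/3)*nmax^3;              % operation count as given in the exercise
flops = Totflops/prod_time;           % estimated operations per second
```

Taking the fastest of a few runs reduces the influence of other processes on the measurement. The resulting rate will still vary between machines and MATLAB versions.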