Using f2py for faster calculation on Python

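I am working on a sequence alignment project in Python, and Python's for loops are too slow.

So I decided to use f2py. I don't know much about Fortran, so I am stuck at the point described below.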

There are two sequences named 'column' and 'row', both of type np.array.

For example:

column = ['A', 'T', 'G', 'C']
row = ['A', 'A', 'C', 'C'] 

I created a matrix for the Needleman-Wunsch algorithm and scored the two sequences (column, row).

    import numpy as np
    column = np.array(list('ATGC'))
    row = np.array(list('AACC'))
    matrix = np.zeros((len(column) + 1, len(row) + 1), dtype='int')

    # Boundary conditions: each leading gap costs -1.
    for i in range(1, len(column)+1):
        matrix[i][0] = -1 * i

    for j in range(1, len(row)+1):
        matrix[0][j] = -1 * j

    matchCheck = 0

    for i in range(1, len(column) + 1):
        for j in range(1, len(row) + 1):
            if column[i-1] == row[j-1]:
                matchCheck = 1 
            else:
                matchCheck = -1 
            top = matrix[i-1][j] + -1
            left = matrix[i][j-1] + -1
            top_left = matrix[i-1][j-1] + matchCheck
            matrix[i][j] = max(top, left, top_left)
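
To have something to compare a compiled version against, the loops above can be wrapped in a small function. This is only a sketch: the name `nw_matrix` and the keyword parameters `match`, `mismatch` and `gap` are introduced here for illustration, with defaults taken from the scoring used above, and plain lists are used so it runs without NumPy.

```python
def nw_matrix(column, row, match=1, mismatch=-1, gap=-1):
    """Fill the Needleman-Wunsch score matrix for two sequences."""
    n, m = len(column), len(row)
    matrix = [[0] * (m + 1) for _ in range(n + 1)]

    # Boundary conditions: leading gaps along the first column and row.
    for i in range(1, n + 1):
        matrix[i][0] = gap * i
    for j in range(1, m + 1):
        matrix[0][j] = gap * j

    # Each cell takes the best of: gap from above, gap from the left,
    # or a match/mismatch step along the diagonal.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score = match if column[i - 1] == row[j - 1] else mismatch
            matrix[i][j] = max(
                matrix[i - 1][j] + gap,
                matrix[i][j - 1] + gap,
                matrix[i - 1][j - 1] + score,
            )
    return matrix


reference = nw_matrix("ATGC", "AACC")
for line in reference:
    print(line)
```

Any faster implementation should reproduce this matrix cell for cell on the same input.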

I wanted some help from Fortran to speed up the calculation, so I wrote the code in Fortran.

    subroutine needlemanWunsch(matrix, column, row, cc, rr, new_matrix)
    integer, intent(in) :: cc, rr
    character, intent(in) :: column(0:cc-1), row(0:rr-1)
    integer, intent(in) :: matrix(0:cc, 0:rr)
    integer, intent(out) :: new_matrix(0:cc, 0:rr)
    integer :: matchcheck, top, left, top_left

    do i = 1, cc
        new_matrix(i, 0) = -1 * i
    end do

    do j = 1, rr
        new_matrix(i, 0) = -1 * j
    end do

    do k = 1, cc
        do l = 1, rr
            if (column(i-1).EQ.row(j-1)) then
                matchcheck = 1
            else
                matchcheck = -1

            top = matrix(i-1, j) + inDel
            left = matrix(i, j-1) + inDel
            top_left = matrix(i-1, j-1) + matchCheck
            new_matrix(i, j) = max(top, left, top_left)
            end if
        end do
    end do
    return
    end subroutine

Then I converted this Fortran code with f2py and imported it in Python with this code.

    import numpy as np
    column = np.array(list('ATGC'))
    row = np.array(list('AACC'))
    matrix = np.zeros((len(column) + 1, len(row) + 1), dtype='int')
    
    # import my fortran code 
    matrix = algorithm.needlemanwunsch(matrix, column, row, cc, rr)

Whenever I try to import the Fortran code, it crashes...

The following works in my case.

File needlemanWunsch.f90

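    subroutine needlemanWunsch(matrix, column, row, cc, rr, new_matrix)
      integer, intent(in) :: cc, rr
      character, intent(in) :: column(0:cc-1), row(0:rr-1)
      integer, intent(in) :: matrix(0:cc, 0:rr)
      integer, intent(out) :: new_matrix(0:cc, 0:rr)
      integer :: matchcheck, top, left, top_left

      do i = 1, cc
          new_matrix(i, 0) = -1 * i
      end do

      ! NOTE: i is cc+1 after the loop above, so this writes new_matrix(cc+1, 0);
      ! the Python version fills matrix[0][j] here.
      do j = 1, rr
          new_matrix(i, 0) = -1 * j
      end do

      ! NOTE: the loops run over k and l, but the body indexes with i and j.
      do k = 1, cc
          do l = 1, rr
              if (column(i-1).EQ.row(j-1)) then
                  matchcheck = 1
              else
                  matchcheck = -1

              ! NOTE: inDel is never assigned (the Python version uses -1), and
              ! these lines run only in the else branch because end if follows them.
              ! They also read the input matrix rather than new_matrix.
              top = matrix(i-1, j) + inDel
              left = matrix(i, j-1) + inDel
              top_left = matrix(i-1, j-1) + matchCheck
              new_matrix(i, j) = max(top, left, top_left)
              end if
          end do
      end do
      return
    end subroutine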

Compile with f2py:

    $ f2py -c needlemanWunsch.f90 -m needlemanWunsch
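
If it is unclear which Python the `f2py` command belongs to, the same build can be launched through the interpreter that will later import the module, followed by a check that the extension can be found. This is a sketch under assumptions: the helper names `f2py_command` and `module_available` are made up for illustration, and it assumes it is run from the directory that contains needlemanWunsch.f90.

```python
import importlib
import importlib.util
import subprocess
import sys
from pathlib import Path


def f2py_command(source, module):
    """Build the f2py call for the interpreter running this script."""
    return [sys.executable, "-m", "numpy.f2py", "-c", str(source), "-m", module]


def module_available(name):
    """Return True if `name` can be located on the import path."""
    return importlib.util.find_spec(name) is not None


source = Path("needlemanWunsch.f90")
cmd = f2py_command(source, "needlemanWunsch")
print(" ".join(cmd))

# Only build when the source is present and nothing importable exists yet.
if source.exists() and not module_available("needlemanWunsch"):
    subprocess.run(cmd, check=True)
    importlib.invalidate_caches()

print("importable:", module_available("needlemanWunsch"))
```

Once the module imports, `print(needlemanWunsch.needlemanwunsch.__doc__)` shows the call signature that f2py generated, which is a quick way to see which arguments it expects.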

Import it in the Python file needlemanWunsch.py. This is where your error comes from! You need to import the compiled module; see the example below.

    import needlemanWunsch         # THIS IS MISSING IN YOUR CODE!!
    import numpy as np

    # create matrix
    _column = ['A', 'T', 'G', 'C']
    _row = ['A', 'A', 'C', 'C']
    column = np.array(list(_column))
    row = np.array(list(_row))
    cc = len(column)
    rr = len(row)
    matrix = np.zeros((len(column) + 1, len(row) + 1), dtype='int')

    # call the compiled Fortran routine (f2py lowercases the name)
    matrix = needlemanWunsch.needlemanwunsch(matrix, column, row, cc, rr)

    print(matrix)

The output is

    $ python needlemanWunsch.py
    [[ 0 -4  0  0  0]
     [-1  0  0  0  0]
     [-2  0  0  0  0]
     [-3  0  0  0  0]
     [-4  0  0  0  0]]
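
The inputs can also be prepared so that they line up with the Fortran declarations. As far as I know, f2py maps a default Fortran `integer` to a 32-bit C int and `character` to a one-byte string, and it copies arrays that are not in column-major order. The sketch below assumes that default mapping; it has not been checked against every NumPy version.

```python
import numpy as np

# One-byte strings ('S1') instead of NumPy's default unicode '<U1',
# to match the Fortran `character` arguments.
column = np.array(list("ATGC"), dtype="S1")
row = np.array(list("AACC"), dtype="S1")

# 32-bit integers in Fortran (column-major) order, to match `integer`
# and avoid a conversion copy at the call boundary.
matrix = np.zeros((len(column) + 1, len(row) + 1), dtype=np.int32, order="F")

cc, rr = len(column), len(row)
print(column.dtype, matrix.dtype, matrix.flags.f_contiguous)
```

These arrays would then be passed to `needlemanWunsch.needlemanwunsch(matrix, column, row, cc, rr)` in place of the ones built in the script above.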