How to avoid redundancy between C++ and boost::python docs?

Question

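I am adding a Python module to a C++ codebase using boost::python. The C++ project is documented with Doxygen. I want to create documentation for the Python module, but I don't know how to avoid being redundant like this: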

#include <boost/python.hpp>
using namespace boost::python;

/** @brief Sum two integers
  * @param a an integer
  * @param b another integer
  * @return sum of integers
  */
int sum(int a, int b)
{
    return a+b;
}

BOOST_PYTHON_MODULE(pymodule)
{ 
    def("sum",&sum,args("a","b"),
        "Sum two integers.\n\n:param a: an integer\n:param b: another integer\n:returns: sum of integers");
};

Here I say the same thing in the docstring and in the Doxygen comment. Any ideas?

Edit: The C++ documentation isn't public, and the Python interface is a subset of the C++ one.

Answer 1

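I'm a fan of code generation, and I believe this is a reasonable situation in which to deploy it.

If you are a bit disciplined in writing your Doxygen DocStrings and refrain from complicated markup in them, it is not hard to write a small parser that extracts them and substitutes them back into the Python DocStrings.

Here is a small example. It isn't powerful enough to handle any realistic use case, but I believe extending it would not be hard and would be worth the effort, unless you only have a handful of functions to document.

Before each Doxygen DocString, place a special comment that gives the following comment block a name. Here, I'm using the syntax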

// DocString: sum
/**
 * @brief Sum two integers
 * @param a an integer
 * @param b another integer
 * @return sum of integers
 *
 */
int sum(int a, int b);

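to associate the name sum with the following DocString.

Then place another special string inside the Python bindings that refers to that name. I'm using the following syntax here.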

BOOST_PYTHON_MODULE(pymodule)
{ 
  def("sum",&sum,args("a","b"), "@DocString(sum)");
};

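Now we need a tool that extracts the Doxygen DocStrings and substitutes them into the Python bindings.

As I've said, this example is contrived, but it should show the idea and demonstrate that it is not too hard to do.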

import re
import sys

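def parse_doc_string(istr):
    """Collect (tag, text) pairs from the comment block that follows in istr."""
    pattern = re.compile(r'@(\w+)\s+(.*)')
    docstring = list()
    for line in map(lambda s : s.strip(), istr):
        if line == '/**':
            continue
        if line == '*/':
            return docstring
        line = line.lstrip('* ')
        match = pattern.match(line)
        if match:
            docstring.append((match.group(1), match.group(2)))
    # Reached end of input without a closing '*/'; return what was collected.
    return docstring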

def extract(istr, docstrings):
    pattern = re.compile(r'^//\s*DocString:\s*(\w+)$')
    for line in map(lambda s : s.strip(), istr):
        match = pattern.match(line)
        if match:
            token = match.group(1)
            docstrings[token] = parse_doc_string(istr)

def format_doc_string(docstring):
    return '\n'.join('{}: {}'.format(k, v) for (k, v) in docstring)

def escape(string):
    return string.replace('\n', r'\n')

def substitute(istr, ostr, docstrings):
    pattern = re.compile(r'@DocString\((\w+)\)')
    for line in map(lambda s : s.rstrip(), istr):
        for match in pattern.finditer(line):
            token = match.group(1)
            docstring = format_doc_string(docstrings[token])
            line = line.replace(match.group(0), escape(docstring))
        print(line, file=ostr)

if __name__ == '__main__':
    sourcefile = sys.argv[1]
    docstrings = dict()
    with open(sourcefile) as istr:
        extract(istr, docstrings)
    with open(sourcefile) as istr:
        # Write to stdout directly; wrapping it in 'with' would close it.
        substitute(istr, sys.stdout, docstrings)

Running this script over the source file produces the following output.

#include <boost/python.hpp>
using namespace boost::python;

// DocString: sum
/**
 * @brief Sum two integers
 * @param a an integer
 * @param b another integer
 * @return sum of integers
 *
 */
int sum(int a, int b)
{
  return a+b;
}

BOOST_PYTHON_MODULE(pymodule)
{
  def("sum",&sum,args("a","b"), "brief: Sum two integers\nparam: a an integer\nparam: b another integer\nreturn: sum of integers");
};

Add two hours of polishing to the script and you're good to go.
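One piece of that polishing could be emitting the Sphinx-style fields used in the question (`:param a: ...`, `:returns: ...`) instead of the plain `tag: text` lines. The following is only a sketch under assumptions: `to_sphinx` is a hypothetical helper, not part of the script above, and it assumes the input is a list of `(tag, text)` pairs in the shape `parse_doc_string` returns.

```python
def to_sphinx(docstring):
    """Render (tag, text) pairs as a summary followed by Sphinx-style fields."""
    summary = []
    fields = []
    for tag, text in docstring:
        if tag == 'brief':
            summary.append(text)
        elif tag == 'param':
            # The first word after @param is taken to be the parameter name.
            name, _, description = text.partition(' ')
            fields.append(':param {}: {}'.format(name, description.strip()))
        elif tag in ('return', 'returns'):
            fields.append(':returns: {}'.format(text))
        else:
            # Unknown tags keep the "tag: text" form of the original script.
            fields.append('{}: {}'.format(tag, text))
    blocks = []
    if summary:
        blocks.append('\n'.join(summary))
    if fields:
        blocks.append('\n'.join(fields))
    # A blank line separates the summary from the field list.
    return '\n\n'.join(blocks)


pairs = [
    ('brief', 'Sum two integers'),
    ('param', 'a an integer'),
    ('param', 'b another integer'),
    ('return', 'sum of integers'),
]
print(to_sphinx(pairs))
```

If this were dropped into the script, `format_doc_string` could simply delegate to it; the existing `escape` step would still be needed so the result fits into a single C++ string literal.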

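Since this is likely to be of interest to other people as well, I wouldn't be surprised if somebody had already written such a script. If not, publishing yours as free software would certainly be welcomed by others.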
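To give an idea of what extending the example might involve: the parser above only recognizes `/**` and `*/` when they stand on their own lines, and it drops continuation lines that do not start with a tag. The comment style in the question puts `@brief` on the same line as `/**`, so a more tolerant variant could look roughly like this. It is a sketch only; `parse_comment_block` and `TAG` are hypothetical names, and the continuation-line behavior is an assumption about what one would want.

```python
import re

TAG = re.compile(r'@(\w+)\s+(.*)')


def parse_comment_block(lines):
    """Collect (tag, text) pairs from one /** ... */ comment block."""
    pairs = []
    for raw in lines:
        line = raw.strip()
        closing = line.endswith('*/')
        if closing:
            line = line[:-2]
        # Drop the opener, then the leading-asterisk decoration.
        if line.startswith('/**'):
            line = line[3:]
        line = line.lstrip('* ').rstrip()
        match = TAG.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
        elif line and pairs:
            # A non-empty line without a tag continues the previous entry.
            tag, text = pairs[-1]
            pairs[-1] = (tag, text + ' ' + line)
        if closing:
            break
    return pairs


comment = [
    '/** @brief Sum two integers',
    '  * @param a an integer',
    '  *        that may be negative',
    '  * @return sum of integers',
    '  */',
]
print(parse_comment_block(comment))
```

Like the original, this still ignores Doxygen markup inside the text, so the advice about keeping the comments simple applies unchanged.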

Answer 2

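5gon12eder's idea is to extract the Doxygen comments and substitute them into the Python docstrings. He proposed a solution using a Python script.

Here is another one, using a CMake script, because that is what I use to build my project. I hope it helps people with the same problem: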

set(FUNCTION "sum")
file(READ "pymodule.cpp.in" CONTENTS)

# To find the line with the flag
string(REGEX REPLACE "\n" ";" CONTENTS "${CONTENTS}")
list(FIND CONTENTS "// Docstring_${FUNCTION}" INDEX)

# To extract doxygen comments
math(EXPR INDEX "${INDEX}+1")
list(GET CONTENTS ${INDEX} LINE)
while(${LINE} MATCHES "@([a-z]+) (.*)")
  string(REGEX MATCH "@([a-z]+) (.*)" LINE "${LINE}")
  set(DOXY_COMMENTS ${DOXY_COMMENTS} ${LINE})
  math(EXPR INDEX "${INDEX}+1")
  list(GET CONTENTS ${INDEX} LINE)
endwhile()

# To convert doxygen comments into docstrings
foreach(LINE ${DOXY_COMMENTS})
  string(REGEX REPLACE "@brief " "" LINE "${LINE}")
  if("${LINE}" MATCHES "@param ([a-zA-Z0-9_]+) (.*)")
    set(LINE ":param ${CMAKE_MATCH_1}: ${CMAKE_MATCH_2}")
  endif()
  if ("${LINE}" MATCHES "@return (.+)")
    set(LINE ":returns: ${CMAKE_MATCH_1}")
  endif()
  set(DOCSTRING ${DOCSTRING} ${LINE})
endforeach()
# Join with a literal backslash-n so the generated C++ string literal stays on one line
string(REPLACE ";" "\\n" DOCSTRING "${DOCSTRING}")

# To insert docstrings in cpp file
set(Docstring_${FUNCTION} ${DOCSTRING})
configure_file("pymodule.cpp.in" "pymodule.cpp" @ONLY)

pymodule.cpp.in:

/**
 * @file pymodule.cpp
 */

#include<boost/python.hpp>
using namespace boost::python;

// Docstring_sum
/** @brief Sum two integers
  * @param a an integer
  * @param b another integer
  * @return sum of integers
  */
int sum(int a, int b) {
  return a+b;
}

BOOST_PYTHON_MODULE(pymodule){ 
  def("sum",&sum,args("a","b"),
      "@Docstring_sum@");
};

In this case, the script generates pymodule.cpp with the proper docstring.
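For readers less familiar with CMake, the last step relies on `configure_file(... @ONLY)` replacing every `@VAR@` reference in the template with the value of the CMake variable `VAR` (an undefined variable becomes an empty string) while leaving `${VAR}` text alone. A rough Python emulation of just that substitution is sketched below. `configure_at_only` is a hypothetical name, `\w+` only approximates the variable names CMake accepts, and the sample value assumes the docstring lines were joined with a literal backslash-n.

```python
import re


def configure_at_only(template, variables):
    """Replace each @NAME@ with variables[NAME]; unknown names become ''."""
    return re.sub(r'@(\w+)@',
                  lambda match: variables.get(match.group(1), ''),
                  template)


template = 'def("sum",&sum,args("a","b"),\n      "@Docstring_sum@");'
variables = {
    # Literal backslash-n, so the C++ string literal stays on one line.
    'Docstring_sum': r'Sum two integers\n:param a: an integer',
}
generated = configure_at_only(template, variables)
print(generated)
```

Doxygen tags such as `@brief` in the template are not touched, because they are not followed by a closing `@` directly after the name.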

