Parsing with libclang; unable to parse certain tokens (Python on Windows)
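I have some code (taken and adapted from other examples) that uses libclang to parse C++ source files in Python (Windows) and get all of their declaration statements, as shown here: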
import clang.cindex

def parse_decl(node):
    reference_node = node.get_definition()
    if node.kind.is_declaration():
        print(node.kind, node.kind.name,
              node.location.line, ',', node.location.column,
              reference_node.displayname)

    for ch in node.get_children():
        parse_decl(ch)

# configure path
clang.cindex.Config.set_library_file('C:/Program Files (x86)/LLVM/bin/libclang.dll')

index = clang.cindex.Index.create()
trans_unit = index.parse(r'C:\path\to\sourcefile\test.cpp', args=['-std=c++11'])

parse_decl(trans_unit.cursor)
For the following C++ source file (test_ok.cpp):
/* test_ok.cpp
*/
#include <iostream>
#include <fstream>
#include <string>
#include <algorithm>
#include <cmath>
#include <iomanip>

using namespace std;

int main (int argc, char *argv[]) {
  int linecount = 0;
  double array[1000], sum=0, median=0, add=0;
  string filename;
  if (argc <= 1)
  {
    cout << "Error: no filename specified" << endl;
    return 0;
  }
  //program checks if a filename is specified
  filename = argv[1];
  ifstream myfile (filename.c_str());
  if (myfile.is_open())
  {
    myfile >> array[linecount];
    while ( myfile.good() )
    {
      linecount++;
      myfile >> array[linecount];
    }
    myfile.close();
  }
the parse method parses it as it should and outputs:
CursorKind.USING_DIRECTIVE USING_DIRECTIVE 10 , 17 std
CursorKind.FUNCTION_DECL FUNCTION_DECL 12 , 5 main(int, char **)
CursorKind.PARM_DECL PARM_DECL 12 , 15 argc
CursorKind.PARM_DECL PARM_DECL 12 , 27 argv
CursorKind.VAR_DECL VAR_DECL 13 , 7 linecount
CursorKind.VAR_DECL VAR_DECL 14 , 10 array
CursorKind.VAR_DECL VAR_DECL 14 , 23 sum
CursorKind.VAR_DECL VAR_DECL 14 , 30 median
CursorKind.VAR_DECL VAR_DECL 14 , 40 add
CursorKind.VAR_DECL VAR_DECL 15 , 10 filename
CursorKind.VAR_DECL VAR_DECL 23 , 12 myfile
Process finished with exit code 0
However, for the following C++ source file (test.cpp):
/* test.cpp
*/
#include <iostream>
#include <vector>
#include <fstream>
#include <cmath>
#include <algorithm>
#include <iomanip>

using namespace std;

void readfunction(vector<double>& numbers, ifstream& myfile)
{
  double number;
  while (myfile >> number) {
    numbers.push_back(number);}
}

double meanfunction(vector<double>& numbers)
{
  double total=0;
  vector<double>::const_iterator i;
  for (i=numbers.begin(); i!=numbers.end(); ++i) {
    total +=*i; }
  return total/numbers.size();
}
the parse is incomplete:
CursorKind.USING_DIRECTIVE USING_DIRECTIVE 8 , 17 std
CursorKind.VAR_DECL VAR_DECL 10 , 6 readfunction
Process finished with exit code 0
The parse cannot handle lines such as vector<double>& numbers and stops parsing that part of the code.
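I think the problem is similar to the one described in another SO question. I have tried to explicitly use the -std=c++11 parse argument, with no success. An answer to that question (even though it did not solve the problem there) also suggests using -x c++, but I have no idea how to add it to the code above.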
Can anyone point to a way to get libclang to parse C++ statements like the ones in test.cpp?
Also, can I make it continue parsing even when it reaches a token it cannot parse?
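By default, libclang does not add the compiler's system include paths.
Always make sure you check the diagnostics: like compiler error messages, they tend to indicate how to fix a problem. In this case, it is clear that there is an include problem: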
<Diagnostic severity 4, location <SourceLocation file 'test.cpp', line 3, column 10>, spelling "'iostream' file not found">
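A minimal sketch of checking the diagnostics from Python. It relies on the `diagnostics` list of the translation unit and on the `severity`, `location` and `spelling` attributes of each entry; the helper names `format_diagnostics` and `report` are made up for illustration, and the stand-in object at the end only mimics the diagnostic shown above so that the sketch runs without libclang installed.

```python
from types import SimpleNamespace


def format_diagnostics(diagnostics, min_severity=3):
    """Return one line per diagnostic at or above min_severity.

    libclang severities: 0 Ignored, 1 Note, 2 Warning, 3 Error, 4 Fatal.
    """
    lines = []
    for diag in diagnostics:
        if diag.severity < min_severity:
            continue
        loc = diag.location
        filename = loc.file.name if loc.file is not None else '<unknown>'
        lines.append('%s:%d:%d: severity %d: %s' % (
            filename, loc.line, loc.column, diag.severity, diag.spelling))
    return lines


def report(path, args):
    """Parse path with libclang and print its errors (needs clang.cindex)."""
    import clang.cindex  # imported here so the helper above works without libclang
    tu = clang.cindex.Index.create().parse(path, args=args)
    for line in format_diagnostics(tu.diagnostics):
        print(line)


# Stand-in for the diagnostic shown above (not a real libclang object).
fake = SimpleNamespace(
    severity=4,
    spelling="'iostream' file not found",
    location=SimpleNamespace(file=SimpleNamespace(name='test.cpp'),
                             line=3, column=10))
print(format_diagnostics([fake]))
```

Calling `report('test.cpp', ['-x', 'c++', '-std=c++11'])` right after a parse that looks incomplete is usually the quickest way to see whether a missing header is the cause.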
If you make sure libclang is given those paths, it should start working.
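One common way to find those paths, sketched here under the assumption that a `clang++` (or `g++`) executable is on the PATH, is to run the compiler's preprocessor in verbose mode and read the search list it prints to stderr. This is only an illustration of the idea, not the implementation of any particular library; the function names and the sample paths are made up.

```python
import subprocess


def parse_search_list(verbose_output):
    """Extract include directories from `clang++ -E -x c++ - -v` stderr text."""
    paths = []
    collecting = False
    for line in verbose_output.splitlines():
        if line.startswith('#include <...> search starts here:'):
            collecting = True
        elif line.startswith('End of search list.'):
            break
        elif collecting:
            path = line.strip()
            # macOS marks framework directories; they are not plain -I paths.
            if path and not path.endswith('(framework directory)'):
                paths.append(path)
    return paths


def query_include_paths(compiler='clang++'):
    """Ask the compiler for its search list (assumes it is on PATH)."""
    proc = subprocess.run(
        [compiler, '-E', '-x', 'c++', '-', '-v'],
        input=b'', stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return parse_search_list(proc.stderr.decode(errors='replace'))


# Made-up sample of the verbose output, so the parser can be tried offline.
sample = '''ignoring nonexistent directory "/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/c++/v1
 /usr/lib/clang/3.8.0/include
 /System/Library/Frameworks (framework directory)
End of search list.
'''
print(parse_search_list(sample))
```

Each returned directory can then be passed to `index.parse` as a `-I<dir>` argument.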
This question includes an approach to solving this problem. This seems to be a recurring theme on Stack Overflow, so I wrote ccsyspath to help find those paths on OSX, Linux and Windows. Simplifying your code a little:
import clang.cindex
clang.cindex.Config.set_library_file('C:/Program Files (x86)/LLVM/bin/libclang.dll')
import ccsyspath

index = clang.cindex.Index.create()

args    = '-x c++ --std=c++11'.split()
syspath = ccsyspath.system_include_paths('clang++')
incargs = [ b'-I' + inc for inc in syspath ]
args    = args + incargs

trans_unit = index.parse('test.cpp', args=args)
for node in trans_unit.cursor.walk_preorder():
    if node.location.file is None:
        continue
    if node.location.file.name != 'test.cpp':
        continue
    if node.kind.is_declaration():
        print(node.kind, node.location)
My args ended up being:
['-x',
'c++',
'--std=c++11',
'-IC:\Program Files (x86)\LLVM\bin\..\lib\clang\3.8.0\include',
'-IC:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include',
'-IC:\Program Files (x86)\Windows Kits\8.1\include\shared',
'-IC:\Program Files (x86)\Windows Kits\8.1\include\um',
'-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt']
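On the question of how to add -x c++: as the list above shows, the flag and its value are separate list elements, which is what the `split()` call produces. If the include directories are already known, a list of the same shape can be built by hand. A minimal sketch; `build_args` is a hypothetical helper, not part of clang.cindex or ccsyspath.

```python
def build_args(include_dirs, std='c++11'):
    """Build a libclang argument list; every list element is one argv entry."""
    # '-x' and 'c++' are two separate entries, as after '-x c++'.split().
    args = ['-x', 'c++', '-std=' + std]
    # A directory containing spaces needs no quoting inside a single element.
    args.extend('-I' + d for d in include_dirs)
    return args


print(build_args([r'C:\Program Files (x86)\Windows Kits\8.1\include\um']))
```

The result can be passed directly as `index.parse('test.cpp', args=build_args(dirs))`.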
The output is:
(CursorKind.USING_DIRECTIVE, <SourceLocation file 'test.cpp', line 10, column 17>)
(CursorKind.FUNCTION_DECL, <SourceLocation file 'test.cpp', line 12, column 6>)
(CursorKind.PARM_DECL, <SourceLocation file 'test.cpp', line 12, column 35>)
(CursorKind.PARM_DECL, <SourceLocation file 'test.cpp', line 12, column 54>)
(CursorKind.VAR_DECL, <SourceLocation file 'test.cpp', line 15, column 14>)
(CursorKind.FUNCTION_DECL, <SourceLocation file 'test.cpp', line 21, column 8>)
(CursorKind.PARM_DECL, <SourceLocation file 'test.cpp', line 21, column 37>)
(CursorKind.VAR_DECL, <SourceLocation file 'test.cpp', line 24, column 14>)
(CursorKind.VAR_DECL, <SourceLocation file 'test.cpp', line 25, column 40>)
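If the name of each declaration is wanted as well, as in the question's parse_decl, note that `get_definition()` can return None when no definition is available, so `reference_node.displayname` may raise an AttributeError. A hedged sketch that falls back to the declaration's own name; `describe_declarations` and `stub` are made-up names, and the stub objects only imitate the few cursor attributes used here so that the loop can be tried without libclang.

```python
from types import SimpleNamespace


def describe_declarations(root, filename):
    """Yield (kind name, line, column, display name) for declarations in filename."""
    for node in root.walk_preorder():
        source = node.location.file
        if source is None or source.name != filename:
            continue
        if not node.kind.is_declaration():
            continue
        definition = node.get_definition()
        # Fall back to the declaration itself when no definition is available.
        target = definition if definition is not None else node
        yield (node.kind.name, node.location.line,
               node.location.column, target.displayname)


def stub(kind_name, is_decl, file_name, line, column, displayname):
    """Hypothetical stand-in for a clang.cindex.Cursor with no definition."""
    source = SimpleNamespace(name=file_name) if file_name else None
    return SimpleNamespace(
        kind=SimpleNamespace(name=kind_name, is_declaration=lambda: is_decl),
        location=SimpleNamespace(file=source, line=line, column=column),
        displayname=displayname,
        get_definition=lambda: None)


nodes = [
    stub('TRANSLATION_UNIT', False, None, 0, 0, 'test.cpp'),
    stub('FUNCTION_DECL', True, 'vector', 1, 1, 'push_back'),
    stub('PARM_DECL', True, 'test.cpp', 12, 35, 'numbers'),
]
root = SimpleNamespace(walk_preorder=lambda: nodes)
found = list(describe_declarations(root, 'test.cpp'))
print(found)
```

With a real translation unit, `describe_declarations(trans_unit.cursor, 'test.cpp')` would take the place of the stub tree.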