ctypes wintypes WCHAR String Additional White Spaces

Why is there a white space after every character in the output below?

C++ DLL

test.h:

#ifndef TEST_DLL_H
#define TEST_DLL_H
#define EXPORT __declspec(dllexport) __stdcall 

#include <iostream>
#include <Windows.h>

namespace Test_DLL
{
    struct Simple
    {
        TCHAR a[1024];
    };

    extern "C"
    {
        int EXPORT simple(Simple* a);
    }
};

#endif

test.cpp:

#include "test.h"

int EXPORT Test_DLL::simple(Simple* a)
{
    std::wcout << a->a << std::endl;

    return 0;
}

Python

test.py:

import ctypes
from ctypes import wintypes


class MyStructure(ctypes.Structure):
    _fields_ = [("a", wintypes.WCHAR * 1024)]


a = "Hello, world!"
hDLL = ctypes.LibraryLoader(ctypes.WinDLL)
hDLL_Test = hDLL.LoadLibrary(r"...\test.dll")
simple = hDLL_Test.simple
mystruct = MyStructure(a=a)
ret = simple(ctypes.byref(mystruct))

Result:

H e l l o ,   w o r l d ! 

Is the problem on the C++ DLL side, or am I missing something on the Python side?

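At first I thought this was a minor problem in your code. While debugging, I found out that it wasn't. Starting from your example, I developed another one that illustrates some key points.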

test.h:

#if !defined(TEST_DLL_H)
#define TEST_DLL_H


#if defined(_WIN32)
#  if defined(TEST_EXPORTS)
#    define TEST_API __declspec(dllexport)
#  else
#    define TEST_API __declspec(dllimport)
#  endif
#  define CALLING_CONVENTION __cdecl
#else
#  define __TEXT(X) L##X
#  define TEXT(X) __TEXT(X)
#  define TEST_API
#  define CALLING_CONVENTION
#endif


namespace TestDll {
    typedef struct Simple_ {
        wchar_t a[1024];
    } Simple;

    extern "C" {
        TEST_API int CALLING_CONVENTION simple(Simple *pSimple);
        TEST_API int CALLING_CONVENTION printStr(char *pStr);
        TEST_API int CALLING_CONVENTION wprintWstr(wchar_t *pWstr);
        TEST_API wchar_t* CALLING_CONVENTION wstr();
        TEST_API void CALLING_CONVENTION clearWstr(wchar_t *pWstr);
    }
};

#endif  // TEST_DLL_H

test.cpp:

#define TEST_EXPORTS
#include "test.h"
#if defined(_WIN32)
#  include <Windows.h>
#else
#  include <wchar.h>
#  define __FUNCTION__ "function"
#endif
#include <stdio.h>
//#include <iostream>

#define PRINT_MSG_0() printf("From C: - [%s] (%d) - [%s]\n", __FILE__, __LINE__, __FUNCTION__)
#define WPRINT_MSG_0() wprintf(L"From C: - [%s] (%d) - [%s]\n", TEXT(__FILE__), __LINE__, TEXT(__FUNCTION__))

#define DUMMY_TEXT_W L"Dummy text."


//using namespace std;


int TestDll::simple(Simple *pSimple) {
    //std::wcout << pSimple->a << std::endl;
    WPRINT_MSG_0();
    int ret = wprintf(L"%s", pSimple->a);
    wprintf(L"\n");
    return ret;
}


int TestDll::printStr(char *pStr) {
    PRINT_MSG_0();
    int ret = printf("%s", pStr);
    printf("\n");
    return ret;
}


int TestDll::wprintWstr(wchar_t *pWstr) {
    WPRINT_MSG_0();
    int ret = wprintf(L"%s", pWstr);
    wprintf(L"\n");
    int len = (int)wcslen(pWstr);
    unsigned char *buf = (unsigned char*)pWstr;  // unsigned, so bytes >= 0x80 are not sign-extended by %02X
    wprintf(L"Hex (%d): ", len);
    for (int i = 0; i < len * (int)sizeof(wchar_t); i++)
        wprintf(L"%02X ", buf[i]);
    wprintf(L"\n");
    return ret;
}


wchar_t *TestDll::wstr() {
    wchar_t *ret = (wchar_t*)malloc((wcslen(DUMMY_TEXT_W) + 1) * sizeof(wchar_t));
    wcscpy(ret, DUMMY_TEXT_W);
    return ret;
}


void TestDll::clearWstr(wchar_t *pWstr) {
    free(pWstr);
}

main.cpp:

#include "test.h"
#include <stdio.h>
#if defined(_WIN32)
#  include <Windows.h>
#endif


int main() {
    char *text = "Hello, world!";
    TestDll::Simple s = { TEXT("Hello, world!") };
    int ret = simple(&s);  // ??? Compiles even if namespace not specified here !!!
    printf("\"simple\" returned %d\n", ret);
    ret = TestDll::printStr("Hello, world!");
    printf("\"printStr\" returned %d\n", ret);
    ret = TestDll::wprintWstr(s.a);
    printf("\"wprintWstr\" returned %d\n", ret);
    return 0;
}

code.py:

#!/usr/bin/env python3

import sys
import ctypes


DLL_NMAME = "./test.dll"
DUMMY_TEXT = "Hello, world!"


WCharArr1024 = ctypes.c_wchar * 1024

class SimpleStruct(ctypes.Structure):
    _fields_ = [
        ("a", WCharArr1024),
    ]


def main():

    test_dll = ctypes.CDLL(DLL_NMAME)

    simple_func = test_dll.simple
    simple_func.argtypes = [ctypes.POINTER(SimpleStruct)]
    simple_func.restype = ctypes.c_int
    stuct_obj = SimpleStruct(a=DUMMY_TEXT)

    print_str_func = test_dll.printStr
    print_str_func.argtypes = [ctypes.c_char_p]
    print_str_func.restype = ctypes.c_int

    wprint_wstr_func = test_dll.wprintWstr
    wprint_wstr_func.argtypes = [ctypes.c_wchar_p]
    wprint_wstr_func.restype = ctypes.c_int

    wstr_func = test_dll.wstr
    wstr_func.argtypes = []
    wstr_func.restype = ctypes.c_wchar_p

    clear_wstr_func = test_dll.clearWstr
    clear_wstr_func.argtypes = [ctypes.c_wchar_p]
    clear_wstr_func.restype = None

    #print("From PY: [{:s}]".format(stuct_obj.a))
    ret = simple_func(ctypes.byref(stuct_obj))
    print("\"{:s}\" returned {:d}".format(simple_func.__name__, ret))
    ret = print_str_func(DUMMY_TEXT.encode())
    print("\"{:s}\" returned {:d}".format(print_str_func.__name__, ret))
    #ret = wprint_wstr_func(ctypes.cast(DUMMY_TEXT.encode(), ctypes.c_wchar_p))
    ret = wprint_wstr_func(DUMMY_TEXT)
    print("\"{:s}\" returned {:d}".format(wprint_wstr_func.__name__, ret))
    s = wstr_func()
    print("\"{:s}\" returned \"{:s}\"".format(wstr_func.__name__, s))
    #clear_wstr_func(s)


if __name__ == "__main__":
    #print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    main()

Changes:

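  • Removed the C++ layer (to rule out as many variables as possible) and relied on C only
  • Adapted the code to be Nix compatible (I ran it on Ubuntu, but ran into other issues that I'm not going to discuss)
  • Added more functions (this was a debugging process), to gather as much intel as possible
  • Did some renames, refactoring and other non-important changes
  • While investigating, I came across a funny problem (the comment from main.cpp). Apparently the simple function compiles even if I don't prepend the namespace in which it's declared. This doesn't apply to the other functions. After some quick tries, I realized that it's because of the Simple argument, which is part of the same namespace. I didn't spend much time on it, but this is not Undefined Behavior: it is argument-dependent lookup (ADL), where an unqualified call also searches the namespaces of its argument types. That is why simple(&s) finds TestDll::simple, while printStr and wprintWstr (which only take char / wchar_t pointers) need the TestDll:: prefix
  • Mixing narrow and wide functions is a NO-NO; it is done here only for debugging / demonstration purposes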

Output:

e:\Work\Dev\Whosebug\q054269984>"c:\Install\x86\Microsoft\Visual Studio Community15\vc\vcvarsall.bat" x64

e:\Work\Dev\Whosebug\q054269984>dir /b
code.py
main.cpp
test.cpp
test.h

e:\Work\Dev\Whosebug\q054269984>cl /nologo /DDLL /DUNICODE /MD /EHsc test.cpp  /link /NOLOGO /DLL /OUT:test.dll
test.cpp
   Creating library test.lib and object test.exp

e:\Work\Dev\Whosebug\q054269984>cl /nologo /DUNICODE /MD /EHsc main.cpp  /link /NOLOGO /OUT:main.exe test.lib
main.cpp

e:\Work\Dev\Whosebug\q054269984>dir /b
code.py
main.cpp
main.exe
main.obj
test.cpp
test.dll
test.exp
test.h
test.lib
test.obj

e:\Work\Dev\Whosebug\q054269984>main.exe
From C: - [test.cpp] (23) - [TestDll::simple]
Hello, world!
"simple" returned 13
From C: - [test.cpp] (31) - [TestDll::printStr]
Hello, world!
"printStr" returned 13
From C: - [test.cpp] (39) - [TestDll::wprintWstr]
Hello, world!
Hex (13): 48 00 65 00 6C 00 6C 00 6F 00 2C 00 20 00 77 00 6F 00 72 00 6C 00 64 00 21 00
"wprintWstr" returned 13

e:\Work\Dev\Whosebug\q054269984>"e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" code.py
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] on win32

F r o m   C :   -   [ t e s t . c p p ]   ( 2 3 )   -   [ T e s t D l l : : s i m p l e ]
 H e l l o ,   w o r l d !
 "simple" returned 13
From C: - [test.cpp] (31) - [TestDll::printStr]
Hello, world!
"printStr" returned 13
F r o m   C :   -   [ t e s t . c p p ]   ( 3 9 )   -   [ T e s t D l l : : w p r i n t W s t r ]
 H e l l o ,   w o r l d !
 H e x   ( 1 3 ) :   4 8   0 0   6 5   0 0   6 C   0 0   6 C   0 0   6 F   0 0   2 C   0 0   2 0   0 0   7 7   0 0   6 F   0 0   7 2   0 0   6 C   0 0   6 4   0 0   2 1   0 0
 "wprintWstr" returned 13
"wstr" returned "Dummy text."
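  • It seems to be Python related
  • The strings themselves are not messed up (their lengths and the wprintf return values are correct). It looks more like stdout is the culprit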
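
One way to picture what the console might be receiving (a rough sketch, not a trace of what the CRT actually does): if the wide characters reach stdout as raw UTF-16-LE bytes, every ASCII character is followed by a 0x00 byte, exactly as in the hex dump above. A console that draws 0x00 as a blank would then show the spaced-out text. The helper name render_raw_wide is hypothetical, and replacing NUL with a space is an assumption about how the console displays that byte.

```python
def render_raw_wide(text):
    """Mimic a console that receives raw UTF-16-LE bytes and shows NUL as a blank."""
    raw = text.encode("utf-16-le")        # 2 bytes per ASCII character: value, then 0x00
    shown = raw.replace(b"\x00", b" ")    # assumption: the console draws 0x00 as a space
    return raw, shown.decode("ascii")


raw, shown = render_raw_wide("Hello, world!")
print(" ".join("{:02X}".format(b) for b in raw))  # same layout as the "Hex (13)" line
print(repr(shown))                                # 'H e l l o ,   w o r l d ! '
```

The string is 13 characters long either way, which fits the observation that the lengths and the wprintf return values are correct while only the displayed text looks wrong.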

Then, I went further:

e:\Work\Dev\Whosebug\q054269984>for /f %f in ('dir /b "e:\Work\Dev\VEnvs\py_064*"') do ("e:\Work\Dev\VEnvs\%f\Scripts\python.exe" code.py)

e:\Work\Dev\Whosebug\q054269984>("e:\Work\Dev\VEnvs\py_064_02.07.15_test0\Scripts\python.exe" code.py )
Python 2.7.15 (v2.7.15:ca079a3ea3, Apr 30 2018, 16:30:26) [MSC v.1500 64 bit (AMD64)] on win32

From C: - [test.cpp] (23) - [TestDll::simple]
Hello, world!
"simple" returned 13
From C: - [test.cpp] (31) - [TestDll::printStr]
Hello, world!
"printStr" returned 13
From C: - [test.cpp] (39) - [TestDll::wprintWstr]
Hello, world!
Hex (13): 48 00 65 00 6C 00 6C 00 6F 00 2C 00 20 00 77 00 6F 00 72 00 6C 00 64 00 21 00
"wprintWstr" returned 13
"wstr" returned "Dummy text."

e:\Work\Dev\Whosebug\q054269984>("e:\Work\Dev\VEnvs\py_064_03.04.04_test0\Scripts\python.exe" code.py )
Python 3.4.4 (v3.4.4:737efcadf5a6, Dec 20 2015, 20:20:57) [MSC v.1600 64 bit (AMD64)] on win32

From C: - [test.cpp] (23) - [TestDll::simple]
Hello, world!
"simple" returned 13
From C: - [test.cpp] (31) - [TestDll::printStr]
Hello, world!
"printStr" returned 13
From C: - [test.cpp] (39) - [TestDll::wprintWstr]
Hello, world!
Hex (13): 48 00 65 00 6C 00 6C 00 6F 00 2C 00 20 00 77 00 6F 00 72 00 6C 00 64 00 21 00
"wprintWstr" returned 13
"wstr" returned "Dummy text."

e:\Work\Dev\Whosebug\q054269984>("e:\Work\Dev\VEnvs\py_064_03.05.04_test0\Scripts\python.exe" code.py )
Python 3.5.4 (v3.5.4:3f56838, Aug  8 2017, 02:17:05) [MSC v.1900 64 bit (AMD64)] on win32

F r o m   C :   -   [ t e s t . c p p ]   ( 2 3 )   -   [ T e s t D l l : : s i m p l e ]
 H e l l o ,   w o r l d !
 "simple" returned 13
From C: - [test.cpp] (31) - [TestDll::printStr]
Hello, world!
"printStr" returned 13
F r o m   C :   -   [ t e s t . c p p ]   ( 3 9 )   -   [ T e s t D l l : : w p r i n t W s t r ]
 H e l l o ,   w o r l d !
 H e x   ( 1 3 ) :   4 8   0 0   6 5   0 0   6 C   0 0   6 C   0 0   6 F   0 0   2 C   0 0   2 0   0 0   7 7   0 0   6 F   0 0   7 2   0 0   6 C   0 0   6 4   0 0   2 1   0 0
 "wprintWstr" returned 13
"wstr" returned "Dummy text."

e:\Work\Dev\Whosebug\q054269984>("e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" code.py )
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] on win32

F r o m   C :   -   [ t e s t . c p p ]   ( 2 3 )   -   [ T e s t D l l : : s i m p l e ]
 H e l l o ,   w o r l d !
 "simple" returned 13
From C: - [test.cpp] (31) - [TestDll::printStr]
Hello, world!
"printStr" returned 13
F r o m   C :   -   [ t e s t . c p p ]   ( 3 9 )   -   [ T e s t D l l : : w p r i n t W s t r ]
 H e l l o ,   w o r l d !
 H e x   ( 1 3 ) :   4 8   0 0   6 5   0 0   6 C   0 0   6 C   0 0   6 F   0 0   2 C   0 0   2 0   0 0   7 7   0 0   6 F   0 0   7 2   0 0   6 C   0 0   6 4   0 0   2 1   0 0
 "wprintWstr" returned 13
"wstr" returned "Dummy text."

e:\Work\Dev\Whosebug\q054269984>("e:\Work\Dev\VEnvs\py_064_03.07.02_test0\Scripts\python.exe" code.py )
Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)] on win32

F r o m   C :   -   [ t e s t . c p p ]   ( 2 3 )   -   [ T e s t D l l : : s i m p l e ]
 H e l l o ,   w o r l d !
 "simple" returned 13
From C: - [test.cpp] (31) - [TestDll::printStr]
Hello, world!
"printStr" returned 13
F r o m   C :   -   [ t e s t . c p p ]   ( 3 9 )   -   [ T e s t D l l : : w p r i n t W s t r ]
 H e l l o ,   w o r l d !
 H e x   ( 1 3 ) :   4 8   0 0   6 5   0 0   6 C   0 0   6 C   0 0   6 F   0 0   2 C   0 0   2 0   0 0   7 7   0 0   6 F   0 0   7 2   0 0   6 C   0 0   6 4   0 0   2 1   0 0
 "wprintWstr" returned 13
"wstr" returned "Dummy text."

As seen, the behavior is reproducible starting with Python 3.5.

I thought it was because of [Python]: PEP 529 -- Change Windows filesystem encoding to UTF-8, but that is only available starting with version 3.6.

Then I started reading (I even tried to do a diff between Python 3.4 and Python 3.5), but without much success. I also went through a few articles along the way.

Then I noticed [SO]: Output unicode strings in Windows console app (@DuckMaestro's answer) and started to play with [MS.Docs]: _setmode.

Adding:

#include <io.h>
#include <fcntl.h>


static int set_stdout_mode(int mode) {
    fflush(stdout);
    int ret = _setmode(_fileno(stdout), mode);
    return ret;
}

and calling it in test.cpp like int stdout_mode = set_stdout_mode(_O_TEXT); before outputting anything from C (and from C++, with the std::wcout line uncommented), yields:

e:\Work\Dev\Whosebug\q054269984>"e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" code.py
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] on win32

Hello, world!
From C: - [test.cpp] (32) - [TestDll::simple]
Hello, world!
"simple" returned 13
From C: - [test.cpp] (40) - [TestDll::printStr]
Hello, world!
"printStr" returned 13
From C: - [test.cpp] (48) - [TestDll::wprintWstr]
Hello, world!
Hex (13): 48 00 65 00 6C 00 6C 00 6F 00 2C 00 20 00 77 00 6F 00 72 00 6C 00 64 00 21 00
"wprintWstr" returned 13
"wstr" returned "Dummy text."
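  • Although it works, I don't know why. It could be Undefined Behavior
    • Printing _setmode's return value revealed that Python 3.4 and also main.exe automatically set the mode to _O_TEXT (0x4000), while the newer Python versions (the ones that don't work) set it to _O_BINARY (0x8000), which apparently is the cause (might be related: [Python]: Issue #16587 - Py_Initialize breaks wprintf on Windows)
    • Attempting to set it to any of the wide-related constants (_O_U16TEXT, _O_U8TEXT, _O_WTEXT) crashes the program when calling printf or std::cout (even if the mode is switched back once the wide functions are done, before the narrow ones are called)
  • Trying to output real Unicode characters will (most likely) not work
  • You could achieve the same thing on the Python side: msvcrt.setmode(sys.stdout.fileno(), 0x4000)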
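
The Python-side variant from the last bullet could be wrapped as below. This is a sketch under assumptions: the function name force_crt_text_mode is made up for illustration, and it only does anything on Windows. Whether it helps depends on the DLL sharing the C runtime (and hence the stdout mode) with the Python process. os.O_TEXT is the named form of 0x4000.

```python
import os
import sys


def force_crt_text_mode(stream=None):
    """Put the stream's file descriptor into text mode on Windows.

    Returns the value reported by msvcrt.setmode (the previous mode),
    or None on platforms that have no msvcrt module.
    """
    stream = sys.stdout if stream is None else stream
    if sys.platform != "win32":
        return None                   # nothing to do outside Windows
    import msvcrt                     # Windows-only module
    stream.flush()                    # don't leave buffered text behind in the old mode
    return msvcrt.setmode(stream.fileno(), os.O_TEXT)  # os.O_TEXT == 0x4000


previous_mode = force_crt_text_mode()  # call before the first DLL function that prints
```

In code.py the natural place for the call would be at the top of main(), before simple_func is invoked. The same caveat as for the C version applies: it works around the symptom, and real non-ASCII output will most likely still not display correctly.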