Grab console output from a Python subprocess call and transmogrify it into valid CSV (and/or JSON)

I am processing a lot of numbers, similar to the one you can see here:

6060604052361561006c5760e060020a60003504630121b93f81146100e15780636637b882146101615780636dbf2fa0146101935780638da5cb5b1461026a578063a6f9dae11461027c578063beabacc8146102ae578063d979f5aa14610322578063e1fa763814610354575b61050b600060006000600460005054111561051d576004805460001901905560015460035460055460e260020a6320998771026060908152606492909252600160a060020a03908116608452909116906382661dc49060a49060209060448187876161da5a03f11561000257506105c3915050565b6105cb60043560005433600160a060020a039081169116141561015e57600180547fc9d27afe0000000000000000000000000000000000000000000000000000000060609081526064849052608492909252600160a060020a03169063c9d27afe9060a4906020906044816000876161da5a03f115610002575050505b50565b6105cb60043560005433600160a060020a039081169116141561015e5760018054600160a060020a0319168217905550565b60806020604435600481810135601f8101849004909302840160405260608381526105cb9482359460248035956064949391019190819083828082843750949650505050505050600054600160a060020a039081163390911614156102655782600160a060020a03168282604051808280519060200190808383829060006004602084601f0104600f02600301f150905090810190601f16801561024b5780820380516001836020036101000a031916815260200191505b5091505060006040518083038185876185025a03f1505050505b505050565b6105cd600054600160a060020a031681565b6105cb60043560005433600160a060020a039081169116141561015e5760008054600160a060020a0319168217905550565b6105cb6004356024356044356000805433600160a060020a039081169116141561031c5760e060020a63a9059cbb026060908152600160a060020a03848116606452608484905285929083169163a9059cbb9160a4916020916044908290876161da5a03f115610002575050505b50505050565b6105cb60043560005433600160a060020a039081169116141561015e5760028054600160a060020a0319168217905550565b6105cb60043560243560005433600160a060020a03908116911614156105075760015460e060020a6370a0823102606090815230600160a060020a0390811660645291909116906370a08231906084906020906024816000876161da5a03f1156100025750506040805180516006556002546001547f1a69523000000000000000000000000000000000000000000000000000
0000008352600160a060020a039081166004840152925192169250631a695230916024828101926000929190829003018183876161da5a03f1156100025750505060048181556003839055600154604080517f013cf08b00000000000000000000000000000000000000000000000000000000815292830185905251600160a060020a03919091169163013cf08b91602482810192602092919082900301816000876161da5a03f11561000257505060408051805160058054600160a060020a0319169091179081905560015460035460e260020a63209987710284526004840152600160a060020a0391821660248401529251921692506382661dc491604482810192602092919082900301816000876161da5a03f115610002575050505b5050565b60408051918252519081900360200190f35b60015460e060020a6370a0823102606090815230600160a060020a0390811660645291909116906370a082319060849060209060248187876161da5a03f11561000257505060408051805160015460025460e060020a63a9059cbb028452600160a060020a039081166004850152602484018390529351919550909216925063a9059cbb916044828101926020929190829003018188876161da5a03f115610002575050505b600191505090565b005b6060908152602090f3

They are processed like this:

echo "INPUT_DATA" >> file_name && evm disasm file_name

The output data looks like this:

000000: PUSH1 0x60
000002: PUSH1 0x40
000004: MSTORE
000005: CALLDATASIZE
000006: ISZERO
000007: PUSH2 0x006c
000010: JUMPI
000011: PUSH1 0xe0
000013: PUSH1 0x02
000015: EXP
000016: PUSH1 0x00
000018: CALLDATALOAD
000019: DIV

What I ultimately want to do is render that output as CSV (or possibly JSON). Like this:

PUSH1 0x60, PUSH1 0x40, MSTORE, CALLDATASIZE, ISZERO, PUSH2 0x006c, JUMPI, PUSH1 0xe0, PUSH1 0x02, EXP, PUSH1 0x00, CALLDATALOAD, DIV

However, at the moment I am only logging it to the console, using this script:

import sys
import subprocess

def my_test_func(filename, data):
    with open(filename, 'w') as fd:
        fd.write(data)
        fd.write('\n')
    return subprocess.check_output(['evm', 'disasm', filename])


if __name__ == '__main__':
    # Usage: python opcode-farmer.py 'tst2' '6005600401'
    file_name = sys.argv[1]
    byte_code = sys.argv[2]
    status = my_test_func(file_name, byte_code)
    print(status)

^ The script is fairly neat in that it creates a subprocess from within the Python script, as you can see.
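One detail worth noting before parsing: on Python 3, `subprocess.check_output` returns `bytes` unless it is asked for text. The sketch below assumes Python 3 and uses a hypothetical helper name, `run_and_capture`; the small Python command in the usage line is only a stand-in so the example runs without `evm` installed.

```python
import subprocess
import sys

def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return its stdout as a str.

    universal_newlines=True makes check_output decode the bytes for us.
    """
    return subprocess.check_output(cmd, universal_newlines=True)

# Stand-in command that prints one line in the disassembler's format.
# With evm installed, the call would be run_and_capture(['evm', 'disasm', filename]).
demo_cmd = [sys.executable, '-c', 'print("000000: PUSH1 0x60")']
captured = run_and_capture(demo_cmd)
print(captured.splitlines())
```

With a `str` in hand, `splitlines()` and `split()` behave as expected, with no `b'...'` prefixes in the results.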

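What I'd like to know is: what is the best way to take the output, instead of just writing it to the console, and feed it into a process that can convert it to CSV? I certainly have some ideas about how to do this, but almost without fail my ideas tend to be the most computationally expensive and least elegant way possible, so I wanted to see what kind of suggestions the SO community comes up with.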

This sort of works:

edits = csv.reader(status.splitlines(), delimiter=",")
for row in edits:
    print(row)

But also, not really. It gives this output:

['6005600401']
['000000: PUSH1 0x05']
['000002: PUSH1 0x04']
['000004: ADD']

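This is suboptimal in two ways. First, a pile of lists (or similar dictionaries) is hard to work with later; second, it does not strip out all the extraneous information. Do I need to use a regular expression for this?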
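A regular expression is one option, though not the only one. The sketch below assumes every instruction line looks like `offset: OPCODE [operand]`, as in the sample output above, and that any other line (such as the echoed bytecode) can be dropped. The names `LINE_RE` and `extract_instructions` are made up for illustration.

```python
import re

# Assumed line shape: digits (or hex digits), a colon, then the instruction text.
LINE_RE = re.compile(r'^[0-9a-fA-F]+:\s+(.*\S)\s*$')

def extract_instructions(output):
    """Return the instruction part of each disassembly line as a flat list."""
    if isinstance(output, bytes):
        # check_output returns bytes on Python 3 unless told otherwise
        output = output.decode('utf-8')
    instructions = []
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if match:  # lines without an "offset:" prefix are skipped
            instructions.append(match.group(1))
    return instructions

sample = "6005600401\n000000: PUSH1 0x05\n000002: PUSH1 0x04\n000004: ADD\n"
print(', '.join(extract_instructions(sample)))
# prints: PUSH1 0x05, PUSH1 0x04, ADD
```

This yields a single flat list rather than one list per line, and the leading bytecode line is gone.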

What I really want is:

PUSH1 0x60, PUSH1 0x40, MSTORE, CALLDATASIZE, ISZERO, PUSH2 0x006c, JUMPI, PUSH1 0xe0, PUSH1 0x02, EXP, PUSH1 0x00, CALLDATALOAD, DIV

How about this?

', '.join([' '.join(line.split()[1:]) for line in status.splitlines()])
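One caveat with the one-liner: a line with a single token, such as the echoed bytecode `6005600401` in the output shown earlier, becomes an empty string, so the result starts with a stray `, `. The sketch below keeps the same `split()` approach but filters on the `offset:` prefix, then serializes with the standard `csv` and `json` modules. It assumes Python 3; the helper names and the `op`/`arg` keys are invented for illustration.

```python
import csv
import io
import json

def parse_disasm(output):
    """Return a list like [['PUSH1', '0x05'], ['ADD']] from disassembler output."""
    if isinstance(output, bytes):
        output = output.decode('utf-8')
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # Keep only lines whose first token is an "offset:" marker
        if len(parts) >= 2 and parts[0].endswith(':'):
            rows.append(parts[1:])
    return rows

def to_csv_line(rows):
    """Serialize all instructions as one CSV record."""
    buf = io.StringIO()
    csv.writer(buf).writerow([' '.join(row) for row in rows])
    return buf.getvalue().strip()

def to_json(rows):
    """Serialize as a JSON array of objects; 'arg' is null for bare opcodes."""
    return json.dumps(
        [{'op': row[0], 'arg': row[1] if len(row) > 1 else None} for row in rows]
    )

sample = "6005600401\n000000: PUSH1 0x05\n000002: PUSH1 0x04\n000004: ADD\n"
rows = parse_disasm(sample)
print(to_csv_line(rows))
print(to_json(rows))
```

Note that `csv.writer` separates fields with a bare comma; if the `, ` separator from the desired output matters, `', '.join(...)` on the same rows gives that instead.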