Error while converting DICOM tags to Excel using Python

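I am trying to read the DICOM tags from a .dcm file and list them in an Excel sheet (using pydicom), but some tags show errors during the conversion (Patient's Name, Patient ID, etc.).

Other tags show 'None' in the Excel file even though they contain data in the DICOM file (SOP Class UID, SOP Instance UID, etc.). How can I fix this?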

import xlsxwriter 
import sys 
import pydicom 
import os.path
from pydicom.valuerep import PersonName
keywords = ("Patient's Name",
            "Patient ID",
            "Patient's Birth Date",
            "Patient's Sex",
            "SOP Class UID",
            "SOP Instance UID",
            "Group Length",
            "Referring Physician's Name",
            "Study ID",
            "Patient Orientation",
            "Series Number",
            "Pixel Data",
            "Group Length",
            # ...
            )

dcm_files = [r"C:\Users\akhil\Downloads\Sample_Dataset\Sample_Dataset\PRASANNA_KUMARI_12_2013_11_13_46_AM\IMG-0001-00001.dcm"]   

for dcm_file in dcm_files:
    ds = pydicom.filereader.dcmread(dcm_file)
    workbook = xlsxwriter.Workbook(os.path.basename(dcm_file) + '.xlsx')
    worksheet = workbook.add_worksheet()

    row = 0
    col = 0

    for keyword in keywords:
        value = ds.get(keyword, "None")
        if isinstance(value, list):
            value = ", ".join([str(x) for x in value])
        elif isinstance(value, PersonName):
            value = str(value)
        worksheet.write(row, col, keyword)
        worksheet.write(row + 1, col, value)
        col += 1


Some of the tags in the DICOM file:

(0008, 0005) Specific Character Set              CS: 'ISO_IR 100'
(0008, 0016) SOP Class UID                       UI: Secondary Capture Image Storage
(0008, 0018) SOP Instance UID                    UI: 1.2.300.0.7230010.
(0008, 0020) Study Date                          DA: '20200908'
(0008, 0021) Series Date                         DA: '20200908'
(0008, 0022) Acquisition Date                    DA: '20200908'
(0008, 0023) Content Date                        DA: '20200908'
(0008, 0030) Study Time                          TM: '155900'
(0008, 0031) Series Time                         TM: '155900'
(0008, 0032) Acquisition Time                    TM: '155900'
(0008, 0033) Content Time                        TM: '155900'
(0008, 0050) Accession Number                    SH: ''
(0008, 0060) Modality                            CS: 'OT'
(0008, 0064) Conversion Type                     CS: ''
(0008, 0070) Manufacturer                        LO: 'SANTESOFT'
(0008, 0090) Referring Physician's Name          PN: ''
(0010, 0000) Group Length                        UL: 48
(0010, 0010) Patient's Name                      PN: 'NO^NAME'
(0010, 0020) Patient ID                          LO: '00000001'
(0010, 0030) Patient's Birth Date                DA: ''
(0010, 0040) Patient's Sex                       CS: ''
(0018, 0000) Group Length                        UL: 14
(0018, 1063) Frame Time                          DS: "33.0"

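The keywords you are using here are not correct. First, DICOM keywords do not have the 's part: for example, it is called "Patient Name", not "Patient's Name" (this was changed in the DICOM standard about 15 years ago). Second, the keyword used for the lookup contains no spaces (PatientName), so the spaces have to be removed before calling ds.get():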

keywords = ("Patient Name",
            "Patient ID",
            "Patient Birth Date",
            "Patient Sex",
            "SOP Class UID",
            "SOP Instance UID",
            "Group Length",
            "Referring Physician Name",
            "Study ID",
            "Patient Orientation",
            "Series Number",
            "Group Length")

for dcm_file in dcm_files:
    ds = pydicom.filereader.dcmread(dcm_file)
    for keyword in keywords:
        dcm_keyword = keyword.replace(' ', '')  # remove the spaces for the lookup
        value = ds.get(dcm_keyword, "None")

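Note that I have removed all apostrophes from the tag names, and I have also removed Pixel Data: converting binary data to a string will not work properly, and you certainly don't want to display it in an Excel table.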
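
To make the two points above concrete, here is a minimal sketch of the lookup step, not a definitive implementation. The helper names `name_to_keyword`, `cell_text` and `collect_rows` are hypothetical and not part of pydicom. The sketch assumes that stripping "'s" and spaces from a tag name yields its keyword (true for the tags listed here, but not guaranteed for every DICOM attribute) and that multi-valued elements behave like sequences.

```python
import re
from collections.abc import Sequence


def name_to_keyword(name):
    """Turn a tag name such as "Patient's Name" into "PatientName".

    Assumption: dropping the possessive and all non-alphanumeric
    characters gives the keyword. This holds for the tags in this
    question but is not a general rule.
    """
    return re.sub(r"[^0-9A-Za-z]", "", name.replace("'s", ""))


def cell_text(value):
    """Convert a looked-up value into plain text for a spreadsheet cell."""
    if value is None:
        return ""
    if isinstance(value, (bytes, bytearray)):
        # Binary data (for example Pixel Data) is summarized, not dumped.
        return f"<{len(value)} bytes>"
    if isinstance(value, str):
        return str(value)
    if isinstance(value, Sequence):
        # Multi-valued elements: join the individual items.
        return ", ".join(str(item) for item in value)
    # Anything else (numbers, person names, ...) is converted with str().
    return str(value)


def collect_rows(ds, names, default="None"):
    """Return (name, text) pairs for the requested tag names.

    `ds` is any object with a dict-style get(keyword, default) method,
    such as the dataset returned by pydicom.dcmread().
    """
    return [(name, cell_text(ds.get(name_to_keyword(name), default)))
            for name in names]
```

With pydicom, `collect_rows(pydicom.dcmread(dcm_file), keywords)` should give pairs that can be passed to `worksheet.write()` one column at a time. Remember to call `workbook.close()` afterwards, since xlsxwriter only writes the file when the workbook is closed.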
