Error reading MIME types using Python

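I am writing a Python script that reads file extensions, MIME types, and file signatures, so that I can determine whether any of them are missing or corrupted and identify the file types of the files in a given directory.

So far I have: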

import magic, os

def get_ext(dirPath):
    foldercount = 0
    filecount = 0
    while True:
        if os.path.exists(dirPath):
            break
        else:
            print "Directory doesn't exist!"
            continue
    includePath = raw_input("Do you want to include the complete path to the files in the output?: Y/N\n")

    if includePath.upper() == "Y":
        for rootfolder, subfolders, files in os.walk(dirPath):
            foldercount += len(subfolders)
            filecount += len(files)
            for f in files:
                name = f
                path = os.path.join(rootfolder, f)
                ext = os.path.splitext(f)[1]
                if ext != "":
                    print "Filename: " + str(path) + "\t\tExtension: " + str(ext) + "\tMIME: "
                else:
                    print "Filename: " + str(path) + "\t\tExtension: no extension found"
        print "Found {0} files in {1} folders".format(filecount, foldercount)

    elif includePath.upper() == "N":
        for rootfolder, subfolders, files in os.walk(dirPath):
            foldercount += len(subfolders)
            for f in files:
                name = f
                path = os.path.join(rootfolder, f)
                ext = os.path.splitext(f)[1]
                if ext != "":
                    print "Filename: " + str(name) + "\t\tExtension: " + str(ext)
                else:
                    print "Filename: " + str(name) + "\t\tExtension: no extension found"
        print "Found in {0} folders".format(foldercount) 

    else:
        print "Wrong input, try again"


def getMagic(dirPath):
    while True:
        if os.path.exists(dirPath):
            break
        else:
            print "Directory doesn't exist!"
            continue
    for rootfolder, subfolders, files in os.walk(dirPath):
        for f in files:
            bestand = f 
            mymagic = magic.Magic(mime=True)
            mytype = mymagic.from_file(bestand)
            print mytype
            print ("The MIME type of the file %s is %s" %(bestand, mytype))

dirPath = raw_input("Directory to check files in: ")        
get_ext(dirPath)       
getMagic(dirPath)   

get_ext() works correctly and gives me each file name and extension. However, when I try to get the MIME type, it throws the following error:

Traceback (most recent call last):
  File "/home/nick/workspace/Proto/asdfasdf.py", line 80, in <module>
    getMagic(dirPath)     
  File "/home/nick/workspace/Proto/asdfasdf.py", line 74, in getMagic
    mytype = mymagic.from_file(bestand)
  File "/usr/local/lib/python2.7/dist-packages/magic.py", line 75, in from_file
    raise IOError("File does not exist: " + filename)
IOError: File does not exist: 2

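I know the file "2" does exist and is a plain text document. If I hard-code the path to the file in the script, it does give me the MIME type, but I want the script to walk a directory and give me the MIME types of all the files in it.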

Can someone explain why this error is thrown and how to fix it? I am using the python-magic module, installed via pip install python-magic.

Thanks

From the documentation for os.walk we can see:

filenames is a list of the names of the non-directory files in dirpath. Note that the names in the lists contain no path components. To get a full path (which begins with top) to a file or directory in dirpath, do os.path.join(dirpath, name).

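Because getMagic passes the bare name f to from_file, the file is looked up relative to the current working directory rather than the folder being walked, which is why "2" is reported as missing. You need to build the full path: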

bestand = os.path.join(rootfolder, f)
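
For completeness, here is a minimal sketch of the whole loop with that fix applied. It is written for Python 3, whereas the question uses Python 2 print statements. The helper names walk_file_paths, get_mime_types and print_mime_types are my own and not part of python-magic; the only python-magic calls used are magic.Magic(mime=True) and from_file, exactly as in the question. It also creates the Magic object once instead of once per file, which is a design choice of mine rather than a requirement.

```python
import os


def walk_file_paths(top):
    """Yield the full path of every file found under `top`."""
    for rootfolder, subfolders, files in os.walk(top):
        for f in files:
            # os.walk yields bare names, so join each one with the
            # folder currently being visited to get a usable path.
            yield os.path.join(rootfolder, f)


def get_mime_types(top, detect):
    """Return (path, mime) pairs; `detect` maps a path to a MIME string."""
    return [(path, detect(path)) for path in walk_file_paths(top)]


def print_mime_types(top):
    """Print the MIME type of every file under `top` using python-magic."""
    import magic  # python-magic, assumed installed as in the question

    mymagic = magic.Magic(mime=True)  # created once, reused for every file
    for path, mytype in get_mime_types(top, mymagic.from_file):
        print("The MIME type of the file %s is %s" % (path, mytype))
```

Calling print_mime_types(dirPath) would then replace the getMagic(dirPath) call at the bottom of the script. Passing the detector in as an argument is only there so the directory walk can be checked without python-magic installed.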