如何编写搜索函数来获取所需文件的路径?

How to write a search function to get the path of the desired file?

我有一个看起来像图像的目录结构。

这个结构的组织形式是objectsFile 个对象都采用以下形式:例如 {type:'file', name:'file a1'}Folder对象类似于 File 对象,但它们有一个额外的键 children,它是一个子数组 folders/files。

这是之前图像中 rootFolder 对象的控制台日志:

目标: 我想要的是编写一个搜索函数,returns 数组形式的特定文件的路径。例如,如果我要搜索 file c3,结果将是:['folder a3', 'folder b4'].

我尝试了很多东西,但递归不是我的强项。

代码可以在javascriptpython;我关心的是算法本身。

这里是用来生成rootFolder的代码:

// defining constants
const TYPES = {FILE: 'file', FOLDER: 'folder'}
const FILE_NAMES = {
  FILE_A1:'file a1', 
  FILE_A2:'file a2', 
  FILE_A4:'file a4',
  FILE_B2:'file b2',
  FILE_B3:'file b3',
  FILE_C1:'file c1',
  FILE_C2:'file c2',
  FILE_C3:'file c3',
  FILE_C4:'file c4',
  FILE_C5:'file c5',
}
const FOLDER_NAMES = {
  ROOT_FOLDER:'root folder',
  FOLDER_A3:'folder a3',
  FOLDER_B1:'folder b1',
  FOLDER_B4:'folder b4'
}
// defining classes
class File {
  constructor(type, name) {
    this.name = name
    this.type = type
  }
}
class Folder {
  constructor (type, name, children) {
    this.name = name
    this.type = type
    this.children = children
  }
}
// defining file objects
const fileA1 = new File(TYPES.FILE, FILE_NAMES.FILE_A1)
const fileA2 = new File(TYPES.FILE, FILE_NAMES.FILE_A2)
const fileA4 = new File(TYPES.FILE, FILE_NAMES.FILE_A4)
const fileB2 = new File(TYPES.FILE, FILE_NAMES.FILE_B2)
const fileB3 = new File(TYPES.FILE, FILE_NAMES.FILE_B3)
const fileC1 = new File(TYPES.FILE, FILE_NAMES.FILE_C1)
const fileC2 = new File(TYPES.FILE, FILE_NAMES.FILE_C2)
const fileC3 = new File(TYPES.FILE, FILE_NAMES.FILE_C3)
const fileC4 = new File(TYPES.FILE, FILE_NAMES.FILE_C4)
const fileC5 = new File(TYPES.FILE, FILE_NAMES.FILE_C5)
// defining folder objects
const folderB1 = new Folder(TYPES.FOLDER, FOLDER_NAMES.FOLDER_B1, [fileC1, fileC2])
const folderB4 = new Folder(TYPES.FOLDER, FOLDER_NAMES.FOLDER_B4, [fileC3, fileC4, fileC5])
const folderA3 = new Folder(TYPES.FOLDER, FOLDER_NAMES.FOLDER_A3, [folderB1, fileB2, fileB3, folderB4])
const rootFolder = new Folder(TYPES.FOLDER, FOLDER_NAMES.ROOT_FOLDER, [fileA1, fileA2, folderA3, fileA4])

console.log(rootFolder);

删除指示

在开始之前,先去掉由20多个变量造成的无数indirection。我们可以只使用两 (2) 个 类 和一个 root -

class File {
  constructor(type, name) {
    this.name = name
    this.type = type
  }
}

class Folder {
  constructor(type, name, children) {
    this.name = name
    this.type = type
    this.children = children
  }
}

const root =
  new Folder("folder", "root folder", [
    new File("file", "file a1"),
    new File("file", "file a2"),
    new Folder("folder", "folder a3", [
      new Folder("folder", "folder b1", [
        new File("file", "file c1"),
        new File("file", "file c2")
      ]),
      new File("file", "file b2"),
      new File("file", "file b3"),
      new Folder("folder", "folder b4", [
        new File("file", "file c3"),
        new File("file", "file c4"),
        new File("file", "file c5")
      ])
    ]),
    new File("file", "file a4")
  ])

删除冗余

如果您有单独的 FileFolder 类,则不需要

TYPES.FILETYPES.FOLDER。每个实例都已经有一个关联的类型 -

class File {
  constructor(name) { // ✅ type not necessary
    this.name = name
  }
}

class Folder {
  constructor (name, children) { // ✅ type not necessary
    this.name = name
    this.children = children
  }
}

const root =
  new Folder("root folder", [
    new File("file a1"),
    new File("file a2"),
    new Folder("folder a3", [
      new Folder("folder b1", [
        new File("file c1"),
        new File("file c2")
      ]),
      new File("file b2"),
      new File("file b3"),
      new Folder("folder b4", [
        new File("file c3"),
        new File("file c4"),
        new File("file c5")
      ])
    ]),
    new File("file a4")
  ])

多个 return 值

因为文件系统支持具有相同名称的文件夹或文件(在不同的层次结构),重要的是我们的 search 算法可以扫描 所有 可能的匹配项询问。举个完整的例子,让我们在 folder b1 folder b4 中包含 file c3。搜索此文件时,我们希望看到两个路径 returned -

const root =
  new Folder("root folder", [
    new File("file a1"),
    new File("file a2"),
    new Folder("folder a3", [
      new Folder("folder b1", [
        new File("file c1"),
        new File("file c2"),
        new File("file c3")  // ✅ could have the same name
      ]),
      new File("file b2"),
      new File("file b3"),
      new Folder("folder b4", [
        new File("file c3"), // ✅ could have the same name
        new File("file c4"),
        new File("file c5")
      ])
    ]),
    new File("file a4")
  ])

算法

现在我们已经解决了代码的其他问题,让我们编写 search(t, query)。我们可以使用 inductive reasoning -

  1. 如果 t 是一个文件并且其 namequery 匹配,则产生单例结果 [t]
  2. (归纳)t 不是文件。如果 t 是一个文件夹并且它的 name 匹配 query 输入单例结果 [t]。对于 t.children 的所有 child,对于递归 sub-problem search(child, query) 的所有 path,将 t 添加到 path 和产量。
  3. (inductive) t 既不是文件也不是文件夹。抛出类型错误通知调用者。
function* search(t, query) {
  switch(t?.constructor) {
    case File:
      if (t.name == query)
        yield [t]
      break
    case Folder:
      if (t.name == query)
        yield [t]
      for (const child of t.children)
        for (const path of search(child, query))
          yield [t, ...path]
      break
    default:
      throw Error(`cannot search unsupported type ${t}`)
  }
}

这是一些搜索示例 -

for (const path of search(root, "root folder"))
  console.log([".", ...path.map(f => f.name)].join("/"))
./root folder
for (const path of search(root, "file a1"))
  console.log([".", ...path.map(f => f.name)].join("/"))
./root folder/file a1

多次找到文件或文件夹时,return编辑多个路径 -

for (const path of search(root, "file c3"))
  console.log([".", ...path.map(f => f.name)].join("/"))
./root folder/folder a3/folder b1/file c3
./root folder/folder a3/folder b4/file c3

当找不到文件或文件夹时,不给出输出-

for (const path of search(root, "file Q"))
  console.log([".", ...path.map(f => f.name)].join("/"))
⚠️ no output

来电决定效果

通过生成一系列 FolderFile 对象,我们的 search 算法保持通用和可重用。默认情况下不会组装字符串,并且不会向控制台记录任何内容。如果调用者希望创建一个 / 分隔的路径,如上所示。不同效果的选项留给来电者决定。

只有一个结果

使用生成器非常适合 search,因为调用者可以选择使用所有匹配项或任意数量的匹配项。在这个例子中,我们使用通用的 first 到 return 只有 first 匹配。请注意,在找到匹配项后生成器会立即取消,并且 cpu -

不会执行任何其他工作
function first(iter) {
  for (const v of iter)
    return v
}

const match = first(search(root, "file c3"))

if (match)
  console.log("found:", match.map(f => f.name).join("/"))
else
  console.error("no match found")
found: root folderfolder a3/folder b1/file c3

如果以这种方式使用 first,请确保你做了正确的 null-check,因为如果找不到匹配项,search 可能不会产生任何结果 -

const match = first(search(root, "ZZZ"))

if (match)
  console.log("found:", match.map(f => f.name).join("/"))
else
  console.error("no match found")
no match found

higher-order 搜索

在原始算法中,匹配是使用 == 确定的。如果我们允许调用者指定什么构成匹配,该算法可能会更有用 -

function* search(t, query) {
  switch(t?.constructor) {
    case File:
      if (Boolean(query(t.name))) // ✅ query is a func
        yield [t]
      break
    case Folder:
      if (Boolean(query(t.name))) // ✅ query is a func
        yield [t]
      for (const child of t.children)
        for (const path of search(child, query))
          yield [t, ...path]
      break
    default:
      throw Error(`cannot search unsupported type ${t}`)
  }
}

让我们找到所有以 file c -

开头的文件
for (const path of search(root, filename => filename.startsWith("file c")))
  console.log([".", ...path.map(f => f.name)].join("/"))
./root folder/folder a3/folder b1/file c1
./root folder/folder a3/folder b1/file c2
./root folder/folder a3/folder b1/file c3
./root folder/folder a3/folder b4/file c3
./root folder/folder a3/folder b4/file c4
./root folder/folder a3/folder b4/file c5

专业化

使用higher-order search 函数,您可以轻松实现searchByString 等功能。这被称为 专业化 -

function searchByString(t, query) {
  return search(t, filename => filename == query)
}
for (const path of searchByString(root, "file c3"))
  console.log([".", ...path.map(f => f.name)].join("/"))
./root folder/folder a3/folder b1/file c3
./root folder/folder a3/folder b4/file c3

代码演示

上面的代码已经过压缩,因此您可以在一个地方看到所有内容。 运行 片段并验证输出 -

class File { constructor(name) { this.name = name } }

class Folder { constructor(name, children) { this.name = name;  this.children = children } }

function* search(t, query) {
  switch(t?.constructor) {
    case File: if (Boolean(query(t.name))) yield [t];  break
    case Folder: if (Boolean(query(t.name))) yield [t]; for (const child of t.children) for (const path of search(child, query)) yield [t, ...path]; break
    default: throw Error(`cannot search unsupported type ${t}`)
  }
}

function searchByString(t, query) {
  return search(t, filename => filename == query)
}

const relative = (path) => [".", ...path.map(f => f.name)].join("/")

const root = new Folder("root folder", [new File("file a1"), new File("file a2"), new Folder("folder a3", [new Folder("folder b1", [new File("file c1"), new File("file c2"), new File("file c3")]), new File("file b2"), new File("file b3"), new Folder("folder b4", [new File("file c3"), new File("file c4"), new File("file c5")])]), new File("file a4")])

console.log("== exact match ==")
for (const path of search(root, filename => filename == "root folder"))
  console.log(relative(path))

console.log("== starts with 'file c' ==")
for (const path of search(root, filename => filename.startsWith("file c")))
  console.log(relative(path))

console.log("== searchByString ==")
for (const path of searchByString(root, "file c3"))
  console.log(relative(path))
.as-console-wrapper { min-height: 100%; top: 0; }

== exact match ==
./root folder
== starts with 'file c' ==
./root folder/folder a3/folder b1/file c1
./root folder/folder a3/folder b1/file c2
./root folder/folder a3/folder b1/file c3
./root folder/folder a3/folder b4/file c3
./root folder/folder a3/folder b4/file c4
./root folder/folder a3/folder b4/file c5
== searchByString ==
./root folder/folder a3/folder b1/file c3
./root folder/folder a3/folder b4/file c3

python

search和Python写的一模一样-

class file:
  def __init__(self, name):
    self.name = name

class folder:
  def __init__(self, name, children):
    self.name = name
    self.children = children

def search(t, query):
  if isinstance(t, file):
    if t.name == query:
      yield [t]
  elif isinstance(t, folder):
    if t.name == query:
      yield [t]
    for child in t.children:
      for path in search(child, query):
        yield [t, *path]
  else:
    raise TypeError("unsupported", t)

如果您使用的是 Python 3.10,您可以使用一些新功能。调整 search 以利用新的结构 match..case 语法 -

from dataclasses import dataclass
from typing import Generator, Union

Ftree = Union["file", "folder"]

@dataclass
class file:
  name: str

@dataclass
class folder:
  name: str
  children: list[Ftree]

def search(t: Ftree, query: str) -> Generator[list[Ftree], None, None]:
  match t:
    case file(name):
      if name == query: yield [t]
    case folder(name, children):
      if name == query: yield [t]
      for child in t.children:
        for path in search(child, query):
          yield [t, *path]
    case _:
      raise TypeError("unsupported", t)

给定一棵树 -

root = \
  folder("root folder", [
    file("file a1"),
    file("file a2"),
    folder("folder a3", [
      folder("folder b1", [
        file("file c1"),
        file("file c2"),
        file("file c3")
      ]),
      file("file b2"),
      file("file b3"),
      folder("folder b4", [
        file("file c3"),
        file("file c4"),
        file("file c5")
      ])
    ]),
    file("file a4")
  ])

两种实现的行为相同 -

for path in search(root, "root folder"):
  print("/".join([".", *map(lambda f: f.name, path)]))
./root folder
for path in search(root, "file c3"):
  print("/".join([".", *map(lambda f: f.name, path)]))
./root folder/folder a3/folder b1/file c3
./root folder/folder a3/folder b4/file c3