获取 DOM 树中所有根到叶路径的列表
Get a list of all root-to-leaf paths in a DOM tree
到目前为止,我了解到当我将 XML 解析为 DOM 对象时,它表示为一棵树。我正在尝试使用 TreeWalker 获取从根到我的 XML 文档(n 叉树)的任何叶子的所有路径的列表,但是我想知道我是否应该自己实现或者是否有任何现有实施已定义。我在官方文档中找到了none
给定 xml 文档:
<node id="A">
<node id = "AA">
<node id = "AAA"></node>
</node>
<node id = "AB">
</node>
<node id = "AC">
</node>
</node>
预期列表应包含:
A, AA, AAA
A, AB
A, AC
我会使用这样的算法:
在只有一个函数的树中往下走,如果没有 child,往上走,输出每个 parent 节点。
function deep(node n){
if(n has childs)
foreach(child c) {
deep(c);
}
else goup(n);
}
function goup(node n){
if(node has no parent) echo n.id
else echo goup(n.parent()) . ", " . n.id
}
Non-recursive TreeWalker
这很简单。找到所有叶子元素,然后通过遍历父链显示路径。
NodeFilter leafElements = new NodeFilter() {
@Override
public short acceptNode(Node node) {
for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling())
if (child.getNodeType() == Node.ELEMENT_NODE)
return NodeFilter.FILTER_SKIP;
return NodeFilter.FILTER_ACCEPT;
}
};
TreeWalker walker = ((DocumentTraversal)document).createTreeWalker(document.getDocumentElement(),
NodeFilter.SHOW_ELEMENT,
leafElements,
false);
for (Element leaf; (leaf = (Element)walker.nextNode()) != null; ) {
Deque<String> path = new ArrayDeque<>();
for (Node node = leaf; node.getNodeType() == Node.ELEMENT_NODE; node = node.getParentNode())
path.addFirst(((Element)node).getAttribute("id"));
System.out.println(path);
}
输出
[A, AA, AAA]
[A, AB]
[A, AC]
简单递归
也比较简单。递归后代,维护到目前为止的路径,并打印叶元素。
showLeafPaths(document.getDocumentElement(), new StringBuilder());
private static void showLeafPaths(Element elem, StringBuilder path) {
final int pathLen = path.length();
if (pathLen != 0)
path.append(", ");
path.append(elem.getAttribute("id"));
boolean hasChild = false;
for (Node child = elem.getFirstChild(); child != null; child = child.getNextSibling())
if (child.getNodeType() == Node.ELEMENT_NODE) {
hasChild = true;
showLeafPaths((Element)child, path);
}
if (! hasChild)
System.out.println(path);
path.setLength(pathLen);
}
输出
A, AA, AAA
A, AB
A, AC
到目前为止,我了解到当我将 XML 解析为 DOM 对象时,它表示为一棵树。我正在尝试使用 TreeWalker 获取从根到我的 XML 文档(n 叉树)的任何叶子的所有路径的列表,但是我想知道我是否应该自己实现或者是否有任何现有实施已定义。我在官方文档中找到了none
给定 xml 文档:
<node id="A">
<node id = "AA">
<node id = "AAA"></node>
</node>
<node id = "AB">
</node>
<node id = "AC">
</node>
</node>
预期列表应包含:
A, AA, AAA
A, AB
A, AC
我会使用这样的算法:
在只有一个函数的树中往下走,如果没有 child,往上走,输出每个 parent 节点。
function deep(node n){
if(n has childs)
foreach(child c) {
deep(c);
}
else goup(n);
}
function goup(node n){
if(node has no parent) echo n.id
else echo goup(n.parent()) . ", " . n.id
}
Non-recursive TreeWalker
这很简单。找到所有叶子元素,然后通过遍历父链显示路径。
NodeFilter leafElements = new NodeFilter() {
@Override
public short acceptNode(Node node) {
for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling())
if (child.getNodeType() == Node.ELEMENT_NODE)
return NodeFilter.FILTER_SKIP;
return NodeFilter.FILTER_ACCEPT;
}
};
TreeWalker walker = ((DocumentTraversal)document).createTreeWalker(document.getDocumentElement(),
NodeFilter.SHOW_ELEMENT,
leafElements,
false);
for (Element leaf; (leaf = (Element)walker.nextNode()) != null; ) {
Deque<String> path = new ArrayDeque<>();
for (Node node = leaf; node.getNodeType() == Node.ELEMENT_NODE; node = node.getParentNode())
path.addFirst(((Element)node).getAttribute("id"));
System.out.println(path);
}
输出
[A, AA, AAA]
[A, AB]
[A, AC]
简单递归
也比较简单。递归后代,维护到目前为止的路径,并打印叶元素。
showLeafPaths(document.getDocumentElement(), new StringBuilder());
private static void showLeafPaths(Element elem, StringBuilder path) {
final int pathLen = path.length();
if (pathLen != 0)
path.append(", ");
path.append(elem.getAttribute("id"));
boolean hasChild = false;
for (Node child = elem.getFirstChild(); child != null; child = child.getNextSibling())
if (child.getNodeType() == Node.ELEMENT_NODE) {
hasChild = true;
showLeafPaths((Element)child, path);
}
if (! hasChild)
System.out.println(path);
path.setLength(pathLen);
}
输出
A, AA, AAA
A, AB
A, AC