Phylo 树没有分支长度使用 phylosig () R

Phylo tree has no branch lengths using phylosig () R

我很难理解为什么我的系统树(作为 phyloseq 对象从 QIIME2 导入)在 phylosig() 中使用时似乎没有分支长度。我正在尝试计算我的 16S 数据集与单个连续元数据变量相比的系统发育信号。所有示例数据集都包含在这个问题的底部。

    glacialpath <- metadata[,c(7)]
    
    phylosig(phylo, glacialpath, method = "K", test = TRUE)

Error in vcv.phylo(tree) : the tree has no branch lengths
    In addition: Warning message:
    In drop.tip(tree, setdiff(tree$tip.label, names(x))) :
      drop all tips of the tree: returning NULL

我的树 phylo 从完整的 phyloseq 对象中分离出来 phylo <- phy_tree(physeq = physeq) 并且已经确认分支长度:

Phylogenetic tree with 1290 tips and 1289 internal nodes.

Tip labels:
  e54924083c02bd088c69537d02406eb8, 3112899fc7a2adb7f74a081a82c7cde4, db5d745b02355b6fed513c4953b62326, 2faf2046aa9ab2f6598df79cd52e9c7b, bec8db81cc8ec453136c82ede8327a9f, 601e47b8adcbd21d159f74421710e1b5, ...
Node labels:
  0.942, 0.563, 0.711, 0.999, 0.000, 0.528, ...

Rooted; includes branch lengths.

dput() of phylo 太大了,无法包含,所以希望我提供的内容足够了。感谢任何帮助。

> dput (metadata)
structure(list(sample = structure(1:48, .Label = c("sample-10", 
"sample-11", "sample-12", "sample-14", "sample-16", "sample-18", 
"sample-19", "sample-20", "sample-21", "sample-22", "sample-23", 
"sample-24", "sample-25", "sample-26", "sample-27", "sample-28", 
"sample-29", "sample-30", "sample-31", "sample-32", "sample-33", 
"sample-34", "sample-35", "sample-36", "sample-37", "sample-40", 
"sample-41", "sample-43", "sample-44", "sample-45", "sample-46", 
"sample-50", "sample-55", "sample-56", "sample-57", "sample-58", 
"sample-59", "sample-61", "sample-62", "sample-63", "sample-64", 
"sample-65", "sample-67", "sample-68", "sample-69", "sample-70", 
"sample-71", "sample-8"), class = "factor"), pond = structure(c(1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L), .Label = c("Lilly", 
"RHM1", "RHM2", "SS", "TS"), class = "factor"), meh = structure(c(16L, 
17L, 19L, 4L, 1L, 16L, 18L, 20L, 3L, 5L, 7L, 10L, 11L, 22L, 1L, 
2L, 14L, 16L, 18L, 20L, 3L, 5L, 7L, 10L, 1L, 15L, 16L, 18L, 19L, 
20L, 21L, 8L, 1L, 2L, 14L, 15L, 16L, 18L, 19L, 20L, 21L, 3L, 
5L, 6L, 9L, 12L, 13L, 1L), .Label = c("0-1", "1-2.", "10-11.", 
"11-12.", "12-13.", "13-14.5", "14-15", "14-16.", "14.5-17.5", 
"16-17", "17-19", "17.5-18.5", "18.5-20", "2-3.", "3-4.", "4-5.", 
"5-6.", "6-7.", "7-8.", "8-9.", "9-10.", "9-11."), class = "factor"), 
    depth = c(4, 5, 7, 11, 0, 4, 6, 8, 10, 12, 14, 16, 17, 9, 
    0, 1, 2, 4, 6, 8, 10, 12, 14, 16, 0, 3, 4, 6, 7, 8, 9, 14, 
    0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 12, 13, 14, 19, 18, 0), om = structure(c(29L, 
    13L, 9L, 9L, 27L, 27L, 26L, 26L, 23L, 23L, 12L, 12L, 20L, 
    34L, 24L, 24L, 22L, 22L, 25L, 25L, 21L, 21L, 11L, 11L, 8L, 
    7L, 5L, 3L, 3L, 2L, 1L, 4L, 33L, 30L, 31L, 32L, 28L, 18L, 
    19L, 17L, 15L, 17L, 16L, 14L, 18L, 19L, 10L, 6L), .Label = c("1.2", 
    "1.3", "1.4", "1.5", "1.7", "19.9", "2.1", "2.6", "2.8", 
    "2.9", "27.9", "29.8", "3.2", "3.3", "3.4", "3.5", "3.6", 
    "3.7", "3.8", "30.2", "30.6", "32.1", "32.5", "32.6", "32.9", 
    "35.7", "36.2", "4.2", "4.3", "5.4", "5.8", "6", "6.4", "na"
    ), class = "factor"), water = structure(c(19L, 16L, 12L, 
    13L, 28L, 28L, 35L, 35L, 30L, 30L, 31L, 31L, 28L, 36L, 27L, 
    27L, 33L, 33L, 34L, 34L, 31L, 31L, 29L, 29L, 20L, 21L, 15L, 
    5L, 3L, 2L, 1L, 4L, 26L, 23L, 24L, 25L, 22L, 13L, 14L, 7L, 
    8L, 9L, 11L, 10L, 17L, 18L, 6L, 32L), .Label = c("16.1", 
    "17", "17.5", "18.8", "21.6", "21.8", "22.3", "22.4", "22.6", 
    "22.7", "23.1", "23.6", "24.4", "24.5", "24.6", "25.5", "26.9", 
    "28", "29.1", "31.8", "32.8", "34.9", "40.5", "42.9", "43.2", 
    "50.9", "59", "60.1", "61.4", "61.5", "62.5", "62.9", "63.2", 
    "63.8", "66.4", "na"), class = "factor"), glacialpath = c(19.4, 
    19.4, 19.4, 19.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 
    6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 
    18.7, 18.7, 18.7, 18.7, 18.7, 18.7, 18.7, 18.7, 16.8, 16.8, 
    16.8, 16.8, 16.8, 16.8, 16.8, 16.8, 16.8, 16.8, 16.8, 16.8, 
    16.8, 16.8, 16.8, 19.4), elevation = c(8.2, 8.2, 8.2, 8.2, 
    18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 
    18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 
    13.4, 13.4, 13.4, 13.4, 13.4, 13.4, 13.4, 13.4, 5.5, 5.5, 
    5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 
    5.5, 8.2), area = c(20536L, 20536L, 20536L, 20536L, 3571L, 
    3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 
    3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 
    3571L, 10369L, 10369L, 10369L, 10369L, 10369L, 10369L, 10369L, 
    10369L, 57003L, 57003L, 57003L, 57003L, 57003L, 57003L, 57003L, 
    57003L, 57003L, 57003L, 57003L, 57003L, 57003L, 57003L, 57003L, 
    20536L), tectonics = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L), .Label = c("outwash", 
    "uplift"), class = "factor"), lat = c(60.52, 60.52, 60.52, 
    60.52, 60.478, 60.478, 60.478, 60.478, 60.478, 60.478, 60.478, 
    60.478, 60.478, 60.478, 60.478, 60.478, 60.478, 60.478, 60.478, 
    60.478, 60.478, 60.478, 60.478, 60.478, 60.491, 60.491, 60.491, 
    60.491, 60.491, 60.491, 60.491, 60.491, 60.425, 60.425, 60.425, 
    60.425, 60.425, 60.425, 60.425, 60.425, 60.425, 60.425, 60.425, 
    60.425, 60.425, 60.425, 60.425, 60.52), long = c(145.601, 
    145.601, 145.601, 145.601, 145.382, 145.382, 145.382, 145.382, 
    145.382, 145.382, 145.382, 145.382, 145.382, 145.382, 145.382, 
    145.382, 145.382, 145.382, 145.382, 145.382, 145.382, 145.382, 
    145.382, 145.382, 145.547, 145.547, 145.547, 145.547, 145.547, 
    145.547, 145.547, 145.547, 145.473, 145.473, 145.473, 145.473, 
    145.473, 145.473, 145.473, 145.473, 145.473, 145.473, 145.473, 
    145.473, 145.473, 145.473, 145.473, 145.601), x.depth = c(1L, 
    1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
    3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
    4L, 1L), y.depth = c(2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 
    6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 
    19L, 20L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 
    5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 1L)), class = "data.frame", row.names = c(NA, 
-48L))

问题不在于树中缺少分支长度。错误消息的意思是树没有与特征变量中的值名称相对应的提示。 glacialpath 必须是命名向量,并且名称必须存在于系统发育中。

一般来说,一个好的做法是检查这三个方面可能存在的问题。

  1. phylo 不是正确的 phylo 格式。
  2. phylo 应包含名称与 metadata.
  3. 中的提示相对应的提示
  4. 与#2 相关,glacialpath 缺少名称。

phylo 不是正确的 phylo 格式

phytools::phylosig 要求树的格式正确 phylo。尝试用 str(phylo) 探索树,看看 phylo$edge.length 中的所有值是否都是数字并且没有缺失值。

phylo 应包含名称与 metadata

中的提示相对应的提示

应该计算系统发育信号的样本是什么?假设 metadata 中的列 sample 包含名称,将树减少到可用数据的大小:

tr2 <- ape::keep.tip(phy = phylo, tip = metadata$sample)
plot(tr2) # to check that the tree was trimmed correctly

虽然 phytools::phylosig 可以在处理过程中 trim 树,但最好事先准备树的子集,因为用户可以检查 trimmed 树是否按预期绘制,包括分支长度。在给定的示例中,树提示不是特征向量名称中存在的名称。

glacialpath 缺少名字

glacialpath 不是命名向量。要设置向量的名称,请使用:

glacialpath <- setNames(object = metadata[, 7], nm = metadata$sample)

最后,一个好的做法是将变量命名为与函数名称不同的名称,或者在这种情况下,类。