Calculate the greatest difference in a nested list

Question

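I used TextBlob to create a list that contains one inner list per text document. Now I want to build a list holding the greatest difference within each inner list; I'm only interested in the polarity. My idea was to append the highest and lowest numbers to another list and then subtract one from the other. But how do I refer to the numbers in 'polarity'? Here is my nested list: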

[[Sentiment(polarity=0.35, subjectivity=0.65),
  Sentiment(polarity=0.0, subjectivity=0.0),
  Sentiment(polarity=0.0, subjectivity=0.0),
  Sentiment(polarity=0.6, subjectivity=0.87),
  Sentiment(polarity=0.0, subjectivity=0.0),
  Sentiment(polarity=0.0, subjectivity=0.0)],
 [Sentiment(polarity=0.0, subjectivity=0.0),
  Sentiment(polarity=0.5, subjectivity=0.8),
  Sentiment(polarity=0.0, subjectivity=0.0),
  Sentiment(polarity=-0.29, subjectivity=0.54),
  Sentiment(polarity=0.0, subjectivity=0.0),
  Sentiment(polarity=0.25, subjectivity=1.0)],
 [Sentiment(polarity=0.5, subjectivity=0.8),
  Sentiment(polarity=0.0, subjectivity=0.0)]]

Does anyone have an idea? Thanks for your help.

Answer 1

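You can use Python's built-in functions min and max with their key argument to find the smallest and biggest item in a list according to a given key criterion. Written as a function, it might look like this: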

def polarity_diffs(sentiments):
    diffs = []
    for row in sentiments:
        smallest = min(row, key=lambda s: s.polarity).polarity
        biggest = max(row, key=lambda s: s.polarity).polarity
        diffs.append(biggest - smallest)
    return diffs

Given a dummy object and some test data -

class Sentiment:  # Example class
    def __init__(self, polarity, subjectivity):
        self.polarity = polarity
        self.subjectivity = subjectivity

test_data = [
    # normal values
    [Sentiment(polarity=0.35, subjectivity=0.65),
     Sentiment(polarity=0.0, subjectivity=0.0),
     Sentiment(polarity=0.0, subjectivity=0.0),
     Sentiment(polarity=0.6, subjectivity=0.87),
     Sentiment(polarity=0.0, subjectivity=0.0),
     Sentiment(polarity=0.0, subjectivity=0.0)],
    # more normal values
    [Sentiment(polarity=0.0, subjectivity=0.0),
     Sentiment(polarity=0.5, subjectivity=0.8),
     Sentiment(polarity=0.0, subjectivity=0.0),
     Sentiment(polarity=-0.29, subjectivity=0.54),
     Sentiment(polarity=0.0, subjectivity=0.0),
     Sentiment(polarity=0.25, subjectivity=1.0)],
    # only a single entry
    [Sentiment(polarity=0.35, subjectivity=0.65)],
    # multiple entries, but identical
    [Sentiment(polarity=0.0, subjectivity=0.0),
     Sentiment(polarity=0.0, subjectivity=0.0)]
]

- these are the results:

for diff in polarity_diffs(test_data):
    print(diff)

0.6   # normal values
0.79  # more normal values
0.0   # only a single entry
0.0   # multiple entries, but identical
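
One case the test data above does not cover is an empty inner list, for which min and max raise a ValueError. Below is a minimal sketch of one way to handle that, assuming it is acceptable to report None for an empty row. The name polarity_diffs_safe and the namedtuple stand-in for Sentiment are illustrative choices, not something taken from the question or from TextBlob.

```python
from collections import namedtuple

# Stand-in for the sentiment objects; only the .polarity attribute is used.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])


def polarity_diffs_safe(sentiments):
    """Return max - min polarity per inner list, or None for an empty one."""
    diffs = []
    for row in sentiments:
        # Pull the polarity values out once instead of using key= twice.
        polarities = [s.polarity for s in row]
        if not polarities:
            diffs.append(None)  # assumption: None marks "no data"
            continue
        diffs.append(max(polarities) - min(polarities))
    return diffs


rows = [
    [Sentiment(0.35, 0.65), Sentiment(0.6, 0.87), Sentiment(0.0, 0.0)],
    [],  # an empty document
]
print(polarity_diffs_safe(rows))
```

Extracting the polarities into a plain list first also means the remaining logic works on ordinary floats, which may be easier to follow than passing key= to both min and max.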

Answer 2

Given an example class, this is how you would access the required elements in your case:

class Sentiment:  # Example class
    def __init__(self, polarity, subjectivity):
        self.polarity = polarity
        self.subjectivity = subjectivity


ar = [[Sentiment(polarity=0.35, subjectivity=0.65),
      Sentiment(polarity=0.0, subjectivity=0.0),
      Sentiment(polarity=0.0, subjectivity=0.0),
      Sentiment(polarity=0.6, subjectivity=0.87),
      Sentiment(polarity=0.0, subjectivity=0.0),
      Sentiment(polarity=0.0, subjectivity=0.0)],
     [Sentiment(polarity=0.0, subjectivity=0.0),
      Sentiment(polarity=0.5, subjectivity=0.8),
      Sentiment(polarity=0.0, subjectivity=0.0),
      Sentiment(polarity=-0.29, subjectivity=0.54),
      Sentiment(polarity=0.0, subjectivity=0.0),
      Sentiment(polarity=0.25, subjectivity=1.0)]]

print(ar[0][0].polarity)  # this is the first polarity value
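
Building on that attribute access, the approach described in the question (collect the highest and lowest polarity of each inner list, then subtract) could be sketched roughly as follows. The namedtuple is a stand-in for the example class, the data is shortened, and the names highest, lowest, and differences are illustrative.

```python
from collections import namedtuple

# Stand-in for the example class; only .polarity is read below.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])

ar = [
    [Sentiment(0.35, 0.65), Sentiment(0.0, 0.0), Sentiment(0.6, 0.87)],
    [Sentiment(0.5, 0.8), Sentiment(-0.29, 0.54), Sentiment(0.25, 1.0)],
]

# One highest and one lowest polarity per inner list.
highest = [max(s.polarity for s in row) for row in ar]
lowest = [min(s.polarity for s in row) for row in ar]

# Subtract pairwise to get the greatest difference per inner list.
differences = [hi - lo for hi, lo in zip(highest, lowest)]
print(differences)
```

This sketch assumes every inner list has at least one entry; max and min raise a ValueError on an empty sequence.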

Tags: python, nested-lists, textblob