Reading Exponents of Scientific Notation in Python

Question

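I'm trying to produce some summary data, so I don't care about the numbers themselves, only their exponents. The goal is to count how many values have 7 digits (e.g. phone numbers). My current approach is very simple.

I have a dataset in CSV format that looks like this:

"1.108941100000000000e+07, 4.867837000000000000e+06, ..."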

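import numpy as np

# numlist is the dataset
x = np.trunc(np.log10(numlist))  # integer part of log10, i.e. the decimal exponent
total = (x == 6).sum()           # an exponent of 6 means a 7-digit number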

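This gives me the number of 7-digit values. When I chose this approach I assumed the input was a list of integers, but I now see that the data may actually be given/stored in scientific notation. If it is, is there a faster way to get the same result? Is there a way to load only the exponents from the CSV file and skip the log10 step entirely?

Also, I'm not restricted to NumPy arrays, but after some experimentation they turned out to be the fastest implementation for my purposes.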

Answer 1

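You may want to write a custom parser to use while reading the file, rather than reading in all the data only to throw most of it away later.

Count of exponents of size `n`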

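def count_exponents(path, n):
    # Exponent text as it appears in the file, e.g. 6 -> 'e+06', 12 -> 'e+12'
    n_str = f'e{n:+03d}'
    out = 0
    with open(path) as fp:
        for line in fp:
            # Plain substring count; no float parsing needed
            out += line.count(n_str)
    return out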
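
One caveat with a plain substring count is that `e+10` also occurs inside `e+100`. A minimal sketch of an exact-match variant follows. The name `count_exponent_exact` and the sample values are hypothetical, and the sketch assumes every value is written in normalized scientific notation (one nonzero digit before the decimal point), so that an exponent of 6 corresponds to a 7-digit integer part.

```python
import os
import re
import tempfile


def count_exponent_exact(path, n):
    """Count values in a text file whose printed exponent is exactly n."""
    # re.escape is needed because '+' is a regex metacharacter;
    # (?!\d) keeps 'e+10' from matching inside 'e+100'.
    pattern = re.compile(re.escape(f"e{n:+03d}") + r"(?!\d)")
    total = 0
    with open(path) as fp:
        for line in fp:
            total += len(pattern.findall(line))
    return total


# Hypothetical sample data, written to a temporary file for the demo.
sample = "1.108941100000000000e+07, 4.867837000000000000e+06, 2.5e+06, 1.0e+100\n"
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

seven_digit = count_exponent_exact(sample_path, 6)  # values in [1e6, 1e7)
ten = count_exponent_exact(sample_path, 10)         # 'e+100' must not count
os.remove(sample_path)
print(seven_digit, ten)
```

If the file could contain mantissas that are not normalized (for example `12.5e+05`), the exponent alone no longer determines the digit count and the values would need to be parsed instead.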

Return the exponents

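import re
# Capture the signed exponent that follows 'e', e.g. '+07' or '-03'
pattern = re.compile(r'e([+-]\d+)')

def get_exponents(path):
    # One list of exponent strings per line of the file
    with open(path) as fp:
        out = [pattern.findall(line) for line in fp]
    return out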
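
If counts for several exponents are needed at once, the captured strings can be converted to integers and tallied in a single pass. The following is a sketch only: `tally_exponents` and the sample lines are hypothetical and not part of the code above.

```python
import re
from collections import Counter

# Same idea as the pattern above: capture the signed exponent after 'e'.
exponent_pattern = re.compile(r"e([+-]\d+)")


def tally_exponents(lines):
    """Return a Counter mapping each integer exponent to how often it occurs."""
    counts = Counter()
    for line in lines:
        # int() accepts signed, zero-padded strings such as '+07' or '-02'.
        counts.update(int(exp) for exp in exponent_pattern.findall(line))
    return counts


# Hypothetical sample lines; any iterable of strings works, including an open file.
sample_lines = [
    "1.108941100000000000e+07, 4.867837000000000000e+06, 9.9e+06\n",
    "3.0e-02, 1.2e+07\n",
]
histogram = tally_exponents(sample_lines)
print(histogram[6])  # number of values with exponent 6
```

Passing an open file object instead of a list keeps memory use low, since the file is then read one line at a time.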

Tags: python, numpy, scientific-notation, pandas