Extending current code to include both median and mode

I have this code for an assignment, but I don't know how to add the median and mode to it so that it runs without errors.

def main():
    filename = input('File name: ')
    num = 0
    try:
        infile = open(filename, 'r')
        count = 0
        total = 0.0
        average = 0.0
        maximum = 0
        minimum = 0
        range1 = 0

        for line in infile:
            num = int(line)
            count = count + 1
            total = total + num

            if count == 1:
                maximum = num
                minimum = num
            else:
                if num > maximum:
                    maximum = num
                if num < minimum:
                    minimum = num

        infile.close()

        if count > 0:
            average = total / count
            range1 = maximum - minimum
    except (OSError, ValueError) as err:
        # Minimal handler so the try block is complete: report a missing
        # file or a line that is not an integer.
        print(err)

I'll jump right in and show you the code. This is a very simple and fairly pythonic solution.

Solution

import statistics


def open_file(filename):
    try:
        return open(filename, 'r')
    except OSError as e:
        print(e)
        return None


def main():
    # Read file. Note that we are trusting the user input here without sanitizing.
    fd = open_file(input('File name: '))

    if fd is None:  # Ensure we actually got a file object
        return

    with fd:  # Close the file once we are done reading
        data = fd.read()  # Read whole file
    if data.strip() == '':
        print("No data in file")
        return

    # Convert each non-blank line to an integer. A list comprehension is the
    # idiomatic way to do this; skipping blank lines keeps int('') from
    # failing on a trailing newline.
    lines = [int(item) for item in data.splitlines() if item.strip()]

    total_lines = len(lines)
    total_sum = sum(lines)
    maximum = max(lines)
    minimum = min(lines)

    # Here is the python magic, no need to reinvent the wheel!
    mean = statistics.mean(lines)  # mean == average
    median = statistics.median(lines)
    mode = "No mode!"
    try:
        mode = statistics.mode(lines)
    except statistics.StatisticsError:
        # Before Python 3.8, mode() raises when two or more values tie for
        # most common. From 3.8 on it returns the first such value instead.
        pass

    print("Total lines: " + str(total_lines))
    print("Sum: " + str(total_sum))
    print("Max: " + str(maximum))
    print("Min: " + str(minimum))
    print("Mean: " + str(mean))
    print("Median: " + str(median))
    print("Mode: " + str(mode))


if __name__ == '__main__':
    main()

Explanation

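In general, in Python it is safe to assume that if you want to compute some common value using a well-known algorithm, a function for it has already been written. No need to reinvent the wheel!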

These functions are usually not hard to find online either. For example, googling "python calculate the median" turns up suggestions to use the statistics library.
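
To make the point concrete, here is a minimal sketch that keeps the calculations apart from the file handling, so they can be tried on any list. The helper name summarize and the sample values are my own placeholders, not part of the assignment.

```python
import statistics


def summarize(numbers):
    """Return basic statistics for a non-empty list of numbers."""
    return {
        "count": len(numbers),
        "mean": statistics.mean(numbers),
        "median": statistics.median(numbers),
        "mode": statistics.mode(numbers),
        "range": max(numbers) - min(numbers),
    }


# Hypothetical sample values standing in for the file contents.
print(summarize([4, 8, 15, 16, 23, 42, 8]))
```

Once the file has been converted to a list of integers, the same function can be called on that list.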

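Although you now have a solution, I strongly recommend looking at the source code of the statistics library (posted below) and working out for yourself how these functions work. It will help you grow as a developer and as a mathematician.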

statistics.py

Mean

def mean(data):
    """Return the sample arithmetic mean of data.

    >>> mean([1, 2, 3, 4, 4])
    2.8

    >>> from fractions import Fraction as F
    >>> mean([F(3, 7), F(1, 21), F(5, 3), F(1, 3)])
    Fraction(13, 21)

    >>> from decimal import Decimal as D
    >>> mean([D("0.5"), D("0.75"), D("0.625"), D("0.375")])
    Decimal('0.5625')

    If ``data`` is empty, StatisticsError will be raised.
    """
    if iter(data) is data:
        data = list(data)
    n = len(data)
    if n < 1:
        raise StatisticsError('mean requires at least one data point')
    T, total, count = _sum(data)
    assert count == n
    return _convert(total/n, T)
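
The private helpers _sum and _convert are not shown here. As far as I can tell, their purpose is to add the values up exactly and then convert the result back to the input type. The sketch below is not the library's code; it imitates that idea with Fraction to show why mean() can differ from a plain sum(values) / len(values) on floats.

```python
import statistics
from fractions import Fraction

values = [0.1] * 10

# Plain float accumulation picks up a small rounding error along the way.
naive = sum(values) / len(values)

# Exact rational arithmetic: convert each float to a Fraction before adding.
exact = sum(Fraction(v) for v in values) / len(values)

library = statistics.mean(values)

print(naive == 0.1)         # False
print(float(exact) == 0.1)  # True
print(library == 0.1)       # True
```

For the integers in the assignment file the two approaches agree; the difference only shows up with floats.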

Median

def median(data):
    """Return the median (middle value) of numeric data.

    When the number of data points is odd, return the middle data point.
    When the number of data points is even, the median is interpolated by
    taking the average of the two middle values:

    >>> median([1, 3, 5])
    3
    >>> median([1, 3, 5, 7])
    4.0

    """
    data = sorted(data)
    n = len(data)
    if n == 0:
        raise StatisticsError("no median for empty data")
    if n%2 == 1:
        return data[n//2]
    else:
        i = n//2
        return (data[i - 1] + data[i])/2
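
Note that with an even number of values the result is an average, so it may be a float that does not appear in the data. If the assignment wants a value that is actually in the file, the same module also has median_low and median_high. A short comparison, using the list from the docstring above:

```python
import statistics

values = [1, 3, 5, 7]

print(statistics.median(values))       # 4.0, average of the two middle values
print(statistics.median_low(values))   # 3, the lower of the two middle values
print(statistics.median_high(values))  # 5, the higher of the two middle values
```

With an odd number of values all three return the same middle element.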

Mode

def mode(data):
    """Return the most common data point from discrete or nominal data.

    ``mode`` assumes discrete data, and returns a single value. This is the
    standard treatment of the mode as commonly taught in schools:

    >>> mode([1, 1, 2, 3, 3, 3, 3, 4])
    3

    This also works with nominal (non-numeric) data:

    >>> mode(["red", "blue", "blue", "red", "green", "red", "red"])
    'red'

    If there is not exactly one most common value, ``mode`` will raise
    StatisticsError.
    """
    # Generate a table of sorted (value, frequency) pairs.
    table = _counts(data)
    if len(table) == 1:
        return table[0][0]
    elif table:
        raise StatisticsError(
                'no unique mode; found %d equally common values' % len(table)
                )
    else:
        raise StatisticsError('no mode for empty data')
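
The _counts helper is private, so here is a rough sketch of the same counting idea using collections.Counter. The function name modes is my own, and unlike the version above it returns every tied value instead of raising, which may or may not be what the assignment expects.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest count, in first-seen order."""
    counts = Counter(values)
    if not counts:
        return []  # no data, so no mode
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 1, 2, 3, 3, 3, 3, 4]))  # [3]
print(modes([1, 1, 2, 2, 5]))           # [1, 2]
```

On Python 3.8 or newer, statistics.multimode gives the same kind of list without any extra code.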