Converting a log analysis script from Python to Nim

Nim looks (very) close to Python, but I am still struggling to translate the following script:

import sys

months = { "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
           "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12 }

months_r = { v:k for k,v in months.items() }

totals = {}

for line in sys.stdin:
    if "redis" in line and "Partial" in line:
        f1, f2 = line.split()[:2]
        w = (months[f1], int(f2))
        totals[w] = totals.get(w, 0) + 1

for k in sorted(totals.keys()):
    print(months_r[k[0]], k[1], totals[k])

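Even after a few hours of reading the manual, I am still unsure about tuples and about how to convert month names back and forth (my attempts with a table failed; I could not manage to access a table the way I would in Python).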

Any help would be greatly appreciated.

Thanks

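I have no good news. The Nim code is much longer than the Python code, but here is my implementation anyway, although I do not understand what your program is supposed to do.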

import tables, strutils, algorithm

var months = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
        "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}.toTable

var invertedMonths: Table[int, string]

for k, v in months:
    invertedMonths[v] = k

var totals: Table[(int, int), int]

var line = ""
# readLine returns false at end of input, so a blank line no longer stops the loop
while stdin.readLine(line):
    if "redis" in line and "Partial" in line:
        let args = line.split()
        let w = (months[args[0]], parseInt(args[1]))
        totals[w] = totals.getOrDefault(w, 0) + 1

# reserve capacity up front, then collect the keys for sorting
var keys = newSeqOfCap[(int, int)](totals.len)

for k in totals.keys():
    keys.add(k)

# Python sorts tuples lexicographically: by month first, then by day
keys.sort(proc(a, b: (int, int)): int =
    result = cmp(a[0], b[0])
    if result == 0: result = cmp(a[1], b[1])
)

for k in keys:
    echo invertedMonths[k[0]], " ", k[1], " ", totals[k]
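
On the question of how Python sorts these keys: tuples are compared element by element, so `sorted` orders by month first and by day second. The sketch below uses made-up keys (not data from the question) and also shows that comparing by the sum of the fields gives a different order:

```python
# Hypothetical (month, day) keys, chosen only to illustrate the ordering.
keys = [(3, 1), (1, 15), (12, 2)]

# Default tuple ordering: compare the first field, then the second on ties.
lexicographic = sorted(keys)

# Ordering by the sum of the fields, shown for contrast.
by_sum = sorted(keys, key=lambda k: k[0] + k[1])

print(lexicographic)  # [(1, 15), (3, 1), (12, 2)]
print(by_sum)         # [(3, 1), (12, 2), (1, 15)]
```

A Nim comparator therefore needs to compare the month fields first and fall back to the day only when the months are equal.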

Edit

After some suggestions I restructured the code as follows; the length now looks much better.

import tables, strutils, algorithm, sequtils

var months = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
        "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}.toTable

var invertedMonths: Table[int, string]

for k, v in months: invertedMonths[v] = k

var totals: Table[(int, int), int]

for line in stdin.lines:
    # no early exit on blank lines: like the Python loop, read until end of input
    if "redis" in line and "Partial" in line:
        let args = line.split()
        let w = (months[args[0]], parseInt(args[1]))
        totals[w] = totals.getOrDefault(w, 0) + 1

for k in toSeq(totals.keys).sorted:
    echo invertedMonths[k[0]], " ", k[1], " ", totals[k]

This is just getting silly now, but I could not leave it alone without refactoring it to use an enum instead of the months table:

import tables, strutils, algorithm,sequtils

type Month = enum Jan=1,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec

var totals: CountTable[(Month, int)]

for line in stdin.lines:
    if "redis" in line and "Partial" in line:
        let flds = line.split()
        inc totals, (flds[0].parseEnum[:Month], flds[1].parseInt)

for k in toSeq(totals.keys).sorted:
    echo k[0], " ", k[1], " ", totals[k]

Edit: credit to @xbello's solution for the use of CountTable, which is great.

Is the problem converting month names between numbers and three-letter abbreviations? Nim has this:

import times

echo Month(2)                        # February
echo ($Month(2))[.. 2]               # Feb
echo parse("Feb", "MMM").month       # February
echo ord(parse("Feb", "MMM").month)  # 2
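
For comparison, the Python standard library offers the same conversions without a hand-written table. This is only a sketch: `abbr_to_number` and `number_to_abbr` are hypothetical helper names, and both rely on the default (English) locale for the month abbreviations:

```python
import calendar
from datetime import datetime


def abbr_to_number(abbr):
    """Turn a three-letter month abbreviation such as 'Feb' into 2."""
    return datetime.strptime(abbr, "%b").month


def number_to_abbr(number):
    """Turn a month number such as 2 into 'Feb'."""
    return calendar.month_abbr[number]


print(abbr_to_number("Feb"))   # 2
print(number_to_abbr(2))       # Feb
print(calendar.month_name[2])  # February
```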

Then for the counting you can do this:

import tables, times

var totals = initCountTable[(Month, int)]()

# The following is some sample data; in practice you get each
# (Month, int) tuple from your parsed file
var sample = @[
  (m: 1.Month, i: 3),
  (m: 2.Month, i: 3),
  (m: 1.Month, i: 1),
  (m: 1.Month, i: 3),
  (m: 12.Month, i: 2)]

for item in sample:
  totals.inc(item)

echo totals
# {(January, 3): 2, (February, 3): 1, (January, 1): 1, (December, 2): 1}
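
For readers coming from the Python side, a rough counterpart of the CountTable above is `collections.Counter`. A minimal sketch with the same sample data, assuming months are kept as plain integers to keep the example short:

```python
from collections import Counter

# Same sample data as the Nim snippet, as (month number, int) tuples.
sample = [(1, 3), (2, 3), (1, 1), (1, 3), (12, 2)]

totals = Counter()
for item in sample:
    totals[item] += 1  # missing keys start at 0, like CountTable.inc

print(dict(totals))
# {(1, 3): 2, (2, 3): 1, (1, 1): 1, (12, 2): 1}
```

This replaces the `totals.get(w, 0) + 1` idiom from the question with a single increment.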

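Finally, the sorting. As far as I understand, sorting on the names would order the entries alphabetically by month name, with ties broken by the int. My impression is that you wanted to sort by month rather than by name, hence all the table gymnastics. If you use Month as the table key, Nim will sort by month: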

import algorithm

echo sorted(@[12.Month, 1.Month, 6.Month])
#@[January, June, December]

Putting it all together, and assuming you have a log file that follows the pattern "month day message", for example:

Jan 1 Log Message Partial redis
Mar 31 A Partial redis in the same date
Jan 02 More Log Messages but not captured
Mar 31 Even More Log Messages Partial redis
Jan 15 An out of order message Partial redis

This can be done in 5 clean lines, or 9 if you count the variable declarations and the import:

import algorithm, sequtils, strscans, strutils, tables, times

var totals = initCountTable[(Month, int)]()
var month, msg: string
var day: int

for l in stdin.lines:
  if scanf(l, "$w $i $*", month, day, msg) and "redis" in l and "Partial" in l:
    totals.inc (parse(month, "MMM").month, day)

for k in sorted(toSeq(totals.keys())):
  echo k, ": ", totals[k]

Result:

(January, 1): 1
(January, 15): 1
(March, 31): 2
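
As a cross-check, the question's Python logic can be run against the same five sample lines. The sketch below wraps it in a hypothetical helper, `count_partial_redis` (a name that does not appear in the original), and reads from a list instead of stdin:

```python
from collections import Counter

MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}
MONTHS_R = {number: name for name, number in MONTHS.items()}

# The sample log lines from the answer above.
SAMPLE_LOG = [
    "Jan 1 Log Message Partial redis",
    "Mar 31 A Partial redis in the same date",
    "Jan 02 More Log Messages but not captured",
    "Mar 31 Even More Log Messages Partial redis",
    "Jan 15 An out of order message Partial redis",
]


def count_partial_redis(lines):
    """Count matching lines per (month, day), sorted by month, then day."""
    totals = Counter()
    for line in lines:
        if "redis" in line and "Partial" in line:
            month, day = line.split()[:2]
            totals[(MONTHS[month], int(day))] += 1
    return [(MONTHS_R[m], d, totals[(m, d)]) for m, d in sorted(totals)]


for name, day, count in count_partial_redis(SAMPLE_LOG):
    print(name, day, count)
# Jan 1 1
# Jan 15 1
# Mar 31 2
```

The counts agree with the Nim output above; only the month is printed in its abbreviated form.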