Converting a log analysis script from Python to Nim
Nim looks (very) close to Python, but I am still having a hard time translating the following script:
import sys

months = { "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
           "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12 }
months_r = { v:k for k,v in months.items() }
totals = {}
for line in sys.stdin:
    if "redis" in line and "Partial" in line:
        f1, f2 = line.split()[:2]
        w = (months[f1], int(f2))
        totals[w] = totals.get(w, 0) + 1
for k in sorted(totals.keys()):
    print(months_r[k[0]], k[1], totals[k])
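Even after reading the manual for hours, I am still unsure about tuples and about how to convert month names back and forth (I tried a table and failed; I could not manage to access it the way I would in Python).
Any help would be appreciated.
Thanks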
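I don't have good news. The Nim code is quite a bit longer than the Python code, but here is my implementation anyway, even though I don't fully understand what your program is supposed to do.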
import tables, strutils, os, parseutils, algorithm

var months = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
              "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}.toTable
var invertedMonths: Table[int, string]
for k, v in months:
  invertedMonths[v] = k
var totals: Table[(int, int), int]
var line: string
# this readLine overload returns false at end of input instead of raising EOFError
while stdin.readLine(line):
  if line.strip == "": break
  if "redis" in line and "Partial" in line:
    let args = line.split()
    let w = (months[args[0]], parseInt(args[1]))
    totals[w] = totals.getOrDefault(w, 0) + 1
var keys = newSeq[(int, int)](totals.len)
keys.setLen(0)
for k in totals.keys():
  keys.add(k)
# Python sorts tuples lexicographically (month first, then day);
# cmp on tuples gives the same order
keys.sort(proc(a, b: (int, int)): int = cmp(a, b))
for k in keys:
  echo invertedMonths[k[0]], " ", k[1], " ", totals[k]
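For reference on the sorting step: Python's `sorted()` compares tuples element by element, so the `(month, day)` keys from the question come out ordered by month number first and by day second. Here is a minimal sketch with made-up keys (the values are assumptions, chosen only for illustration). It also shows that ordering by the sum of the two fields is a different relation and does not reproduce the Python output.

```python
# Hypothetical (month, day) keys, chosen only for illustration.
keys = [(3, 31), (1, 15), (12, 2), (1, 1), (2, 20)]

# Python compares tuples element by element: month first, then day.
lexicographic = sorted(keys)

# Ordering by the sum of the fields is NOT the same thing:
# (12, 2) has a small sum, so December jumps ahead of mid-January.
by_sum = sorted(keys, key=lambda k: k[0] + k[1])

print(lexicographic)  # [(1, 1), (1, 15), (2, 20), (3, 31), (12, 2)]
print(by_sum)         # [(1, 1), (12, 2), (1, 15), (2, 20), (3, 31)]
```

A Nim comparator that wants to match Python therefore has to compare the first field and fall back to the second only on a tie.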
Edit
After some suggestions I restructured the code as follows; the length looks much better now.
import tables, strutils, algorithm, sequtils

var months = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
              "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}.toTable
var invertedMonths: Table[int, string]
for k, v in months: invertedMonths[v] = k
var totals: Table[(int, int), int]
for line in stdin.lines:
  if line.strip == "": break
  if "redis" in line and "Partial" in line:
    let args = line.split()
    let w = (months[args[0]], parseInt(args[1]))
    totals[w] = totals.getOrDefault(w, 0) + 1
for k in toSeq(totals.keys).sorted:
  echo invertedMonths[k[0]], " ", k[1], " ", totals[k]
This is just getting silly now, but I couldn't leave it alone without refactoring it to use an enum instead of the months table.
import tables, strutils, algorithm, sequtils

type Month = enum Jan=1, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec
var totals: CountTable[(Month, int)]
for line in stdin.lines:
  if "redis" in line and "Partial" in line:
    let flds = line.split()
    inc totals, (flds[0].parseEnum[:Month], flds[1].parseInt)
for k in toSeq(totals.keys).sorted:
  echo k[0], " ", k[1], " ", totals[k]
Edit: I had to use CountTable, with credit to @xbello's solution. Very nice.
Is the problem converting month names between numbers and three-letter abbreviations? Nim has this:
import times
echo Month(2) # February
echo ($Month(2))[.. 2] # Feb
echo parse("Feb", "MMM").month # February
echo ord(parse("Feb", "MMM").month) # 2
Then, for the counting, you can do this:
import tables, times

var totals = initCountTable[(Month, int)]()
# The following is some sample data; you just get each
# tuple[Month, int] from your parsed file
var sample = @[
  (m: 1.Month, i: 3),
  (m: 2.Month, i: 3),
  (m: 1.Month, i: 1),
  (m: 1.Month, i: 3),
  (m: 12.Month, i: 2)]
for item in sample:
  totals.inc(item)
echo totals
# {(January, 3): 2, (February, 3): 1, (January, 1): 1, (December, 2): 1}
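Finally, the sorting. As far as I understand, sorting on the month names would order them alphabetically, with the int breaking ties. My impression is that you want to sort by month rather than by name, hence all the table gymnastics. If you use Month as the table key, Nim will sort by month: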
import algorithm, times

echo sorted(@[12.Month, 1.Month, 6.Month])
# @[January, June, December]
Putting it all together, and assuming you have a log file that follows a "month day message" pattern, for example:
Jan 1 Log Message Partial redis
Mar 31 A Partial redis in the same date
Jan 02 More Log Messages but not captured
Mar 31 Even More Log Messages Partial redis
Jan 15 An out of order message Partial redis
This can be done in 5 clean lines, or 9 if you count the variable declarations and imports:
import algorithm, sequtils, strscans, strutils, tables, times

var totals = initCountTable[(Month, int)]()
var month, msg: string
var day: int
for l in stdin.lines:
  if scanf(l, "$w $i $*", month, day, msg) and "redis" in l and "Partial" in l:
    totals.inc (parse(month, "MMM").month, day)
for k in sorted(toSeq(totals.keys())):
  echo k, ": ", totals[k]
Result:
(January, 1): 1
(January, 15): 1
(March, 31): 2
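As a cross-check, the logic of the original Python script can be run over the same five sample lines to confirm that both versions agree on the counts and on the order. This is only a sketch: `SAMPLE_LOG`, `MONTHS` and `count_partial_redis` are hypothetical names introduced here, and `collections.Counter` is swapped in for the `dict` plus `get` pattern from the question (it plays roughly the role that `CountTable` plays in Nim).

```python
from collections import Counter

# Sample log lines copied from the example above.
SAMPLE_LOG = """\
Jan 1 Log Message Partial redis
Mar 31 A Partial redis in the same date
Jan 02 More Log Messages but not captured
Mar 31 Even More Log Messages Partial redis
Jan 15 An out of order message Partial redis
"""

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def count_partial_redis(lines):
    """Count matching lines per (month number, day), sorted by date."""
    totals = Counter()
    for line in lines:
        if "redis" in line and "Partial" in line:
            month, day = line.split()[:2]
            totals[(MONTHS.index(month) + 1, int(day))] += 1
    # Sorting the tuple keys orders by month number, then by day.
    return [(MONTHS[m - 1], d, totals[(m, d)]) for m, d in sorted(totals)]


rows = count_partial_redis(SAMPLE_LOG.splitlines())
for row in rows:
    print(*row)
```

With the sample data this prints `Jan 1 1`, `Jan 15 1` and `Mar 31 2`, matching the Nim result apart from the month being shown in its abbreviated form.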