Parse log files with F#
I'm trying to parse data from an IIS log file.
Each line has a date that I need:
u_ex15090503.log:3040:2015-09-05 03:57:45
And a name and email address that I need:
&actor=%7B%22name%22%3A%5B%22James%2C%20Smith%22%5D%2C%22mbox%22%3A%5B%22mailto%3AJames.Smith%40student.colled.edu%22%5D%7D&
I start by getting the correct column like this. This part works fine.
// get the correct column
let getCol =
    let line = fileReader inputFile
    line
    |> Seq.filter (fun line -> not (line.StartsWith("#")))
    |> Seq.map (fun line -> line.Split())
    |> Seq.map (fun line -> line.[7],1)
    |> Seq.toArray
getCol
Now I need to parse the above and get the date, name, and email, but I'm having a hard time figuring out how.
So far I have this, which gives me 2 errors (below):
// split the above column at every "&"
let getDataInCol =
    let line = getCol
    line
    |> Seq.map (fun line -> line.Split('&'))
    |> Seq.map (fun line -> line.[5], 1)
    |> Seq.toArray
getDataInCol
Error:
Seq.map (fun line -> line.Split('&'))
the field constructor 'Split' is not defined
Error:
Seq.map (fun line -> line.[5], 1)
the operator 'expr.[idx]' has been used on an object of indeterminate type based on information prior to this program point.
Maybe I'm going about this all wrong. I'm new to F#, so apologies for the sloppy code.
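Both errors most likely trace back to the same line in getCol: `line.[7],1` builds a tuple of type string * int, so getCol is an array of tuples rather than an array of strings. A tuple has no Split member, which would explain the first error, and because the result of that lambda cannot be typed, the indexer on the next line fails as well. The sketch below shows one possible way around it. It rests on assumptions: sampleColumn is made-up data standing in for the real column, and the names asTuples, splitFields, and actorFields are hypothetical.

```fsharp
open System

// Hypothetical sample rows standing in for the query-string column; real IIS data will differ.
let sampleColumn =
    [| "a=1&b=2&actor=%7B%22name%22%3A%5B%22James%2C%20Smith%22%5D%7D&c=3"
       "x=9&actor=%7B%22name%22%3A%5B%22Ann%2C%20Lee%22%5D%7D" |]

// Pairing each value with 1, as in `line.[7],1`, yields string * int tuples.
let asTuples : (string * int)[] = sampleColumn |> Array.map (fun s -> s, 1)

// Annotate the lambda parameter and take the string out of the tuple before splitting.
let splitFields : string[][] =
    asTuples
    |> Array.map (fun ((query: string), _) -> query.Split('&'))

// Look the field up by name instead of by a fixed position such as .[5];
// rows without an actor field are skipped.
let actorFields : string[] =
    splitFields
    |> Array.choose (Array.tryFind (fun (field: string) ->
        field.StartsWith("actor=", StringComparison.Ordinal)))

actorFields |> Array.iter (printfn "%s")
```

Finding the field by its `actor=` prefix avoids depending on the order of query parameters, which may vary from request to request. If the count of 1 is not needed downstream, dropping `,1` from getCol would remove the tuple altogether.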
This will get you the name and email. You still need to parse the date.
#r "Newtonsoft.Json.dll"
open System
open System.Text.RegularExpressions
open Newtonsoft.Json.Linq

// Active pattern: on a match, yields the captured groups (group 0 is dropped)
let (|Regex|_|) pattern input =
    let m = Regex.Match(input, pattern)
    if m.Success then Some(List.tail [ for g in m.Groups -> g.Value ])
    else None

type ActorDetails =
    {
        Date: DateTime
        Name: string
        Email: string
    }

let parseActorDetails queryString =
    match queryString with
    | Regex @"[?&]actor=([^&]+)" [json] ->
        // URL-decode the captured value, then parse it as JSON
        let jsonValue = JValue.Parse(Uri.UnescapeDataString(json))
        {
            Date = DateTime.UtcNow (* replace with parsed date *)
            Name = jsonValue.Value<JArray>("name").[0].Value<string>()
            // [7..] drops the leading "mailto:" prefix
            Email = jsonValue.Value<JArray>("mbox").[0].Value<string>().[7..]
        }
    | _ -> invalidArg "queryString" "Invalid format"

parseActorDetails "&actor=%7B%22name%22%3A%5B%22James%2C%20Smith%22%5D%2C%22mbox%22%3A%5B%22mailto%3AJames.Smith%40student.colled.edu%22%5D%7D&"
val it : ActorDetails = {Date = 11/10/2015 9:14:25 PM;
                         Name = "James, Smith";
                         Email = "James.Smith@student.colled.edu";}
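For the date that is still left to parse, here is a minimal sketch. It assumes the timestamp always appears as "yyyy-MM-dd HH:mm:ss" somewhere in the line, as in the sample `u_ex15090503.log:3040:2015-09-05 03:57:45`, and that the log was written in UTC (the usual setting for W3C-format IIS logs, but worth checking for your server). The function name tryParseLogDate is hypothetical.

```fsharp
open System
open System.Globalization
open System.Text.RegularExpressions

// Finds the first "yyyy-MM-dd HH:mm:ss" timestamp in a line and parses it.
// Returns None when no well-formed timestamp is present.
// Assumption: timestamps are UTC; adjust the styles if your logs use local time.
let tryParseLogDate (line: string) : DateTime option =
    let m = Regex.Match(line, @"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")
    if not m.Success then None
    else
        let styles = DateTimeStyles.AssumeUniversal ||| DateTimeStyles.AdjustToUniversal
        match DateTime.TryParseExact(m.Value, "yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture, styles) with
        | true, date -> Some date
        | false, _ -> None

// Usage with the sample line from the question.
let sample = "u_ex15090503.log:3040:2015-09-05 03:57:45"

match tryParseLogDate sample with
| Some date -> printfn "%s" (date.ToString("o"))
| None -> printfn "no timestamp found"
```

Returning an option rather than throwing lets malformed lines be skipped with Seq.choose. The parsed value could then be passed into parseActorDetails in place of the DateTime.UtcNow placeholder.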