Extract text from an indented text file using Ruta
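I am trying to extract data from a text file using Ruta. I have tried several approaches but cannot get the exact information I need. I need to get the borrower name from an indented text file.
Example: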
Borrower Name: Alice SSN: 000-00-000
Co-Borrower Name: SSN:
I have annotated the Borrower Name keyword and the SSN keyword, but I cannot work out the rule that captures the name itself.
Document{->RETAINTYPE(SPACE)};
DECLARE BorrowerKeyword, NameKeyword, BorrowerNameKeyword;
W{REGEXP("Borrower")->BorrowerKeyword};
W{REGEXP("Name")->NameKeyword};
(SPACE BorrowerKeyword SPACE NameKeyword){-> BorrowerNameKeyword};
DECLARE SSNKeyword;
W{REGEXP("SSN")->SSNKeyword};
DECLARE BorrowerNameLine;
Line{CONTAINS(BorrowerNameKeyword,10,100),
CONTAINS(SSNKeyword,10,50)-> MARK(BorrowerNameLine)}; // Not able to annotate BorrowerNameLine
// other way but that also didn't work.
DECLARE BorrowerName;
RETAINTYPE(SPACE);
CW.ct=="Borrower" CW.ct=="Name" COLON n:W{-> CREATE(BorrowerName, "label"="Borrower Name", "value"=n.ct)};
RETAINTYPE;
Please point out what I am missing and what needs to be corrected.
Try filtering SPACE out again once you are done using it,
to avoid unwanted side effects and to simplify rule development.
REMOVERETAINTYPE(SPACE);
DECLARE Borrower, Name;
CW{REGEXP("\bBorrower") -> Borrower} CW{REGEXP("Name") -> Name};
Borrower Name COLON n:W{-> CREATE(BorrowerName, "label"="Borrower Name", "value"=n.ct)};
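The reasoning behind this suggestion, as far as I understand it: while SPACE is retained, whitespace tokens are visible to the matcher, so a sequence such as CW CW COLON W cannot match text whose tokens are separated by spaces unless each SPACE is written into the rule. A minimal sketch of the whole script under that reading follows. The feature declaration for BorrowerName is an assumption (the post only shows `DECLARE BorrowerName;`), and wrapping the action in `Document{-> ...}` is my choice of form, not something the answer states.

```ruta
// Assumption: BorrowerName needs "label" and "value" features for the
// CREATE action below; the original post does not show this declaration.
DECLARE Annotation BorrowerName (STRING label, STRING value);
DECLARE Borrower, Name;

// Stop retaining SPACE so whitespace is skipped again during matching.
Document{-> REMOVERETAINTYPE(SPACE)};

// Mark the two keyword tokens.
CW{REGEXP("Borrower") -> Borrower};
CW{REGEXP("Name") -> Name};

// Borrower, Name, a colon, then the next word is taken as the value.
Borrower Name COLON n:W{-> CREATE(BorrowerName, "label" = "Borrower Name", "value" = n.ct)};
```

One thing to check against real data: on the second sample line, `Co-Borrower Name: SSN:`, the last rule may also fire and take `SSN` as the value, since `Borrower` is likely a separate token there. An extra condition on the captured word would be needed to rule that out.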