Awk: generating an iCalendar file. How to print some consecutive lines?

Thanks to Ed Morton's answer, I could do some testing with Thunderbird and an iCalendar validator. I have edited my question to add entries without a description, plus the expected result with precise requirements.

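I am writing a script to generate an iCalendar file from a planning text file. I want to pick up, as the description, the lines that follow each date line. Suppose I have this planning file: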

lun 06 05 2019 08 15 09 00 F206
    A descritpion text.
ven 10 05 2019 11 00 11 45 G202
    Another description text
    - on multiple; 
    - lines.
lun 13 05 2019 08 15 09 00 F206
ven 17 05 2019 11 00 11 45 G202
    A long description with more than 75 characters.
    This happen often when multiple lines are
    joined in one. So the program shoud split every lines
    To 75 characters including the word description.
lun 20 05 2019 08 15 09 00 F206
    A description text.

My script looks like this (I am new to awk):

#!/bin/bash
awk ' BEGIN { print "BEGIN:VCALENDAR\r\n\
... some entries here ...\r\n\
END:VTIMEZONE\r" ;}
$1 ~ /^(lun|mar|mer|jeu|ven)$/ { print "BEGIN:VEVENT\r\n\
... some entries here ...\r\n\
DTSTART;TZID=Europe/Zurich:"$4""$3""$2"T"$5""$6"00\r\n\
DTEND;TZID=Europe/Zurich:"$4""$3""$2"T"$7""$8"00\r\n\
TRANSP:OPAQUE\r\n\
DESCRIPTION: >>>HERE I NEED THE DESCRIPTIVE LINES<<<< \r\n\
END:VEVENT\r"}
END { print "END:VCALENDAR" } ' < "$1" > "$1.ics"

Expected result:

BEGIN:VCALENDAR
BEGIN:VEVENT
DTSTART;TZID=Europe/Zurich:20190506T081500
DTEND;TZID=Europe/Zurich:20190506T090000
DESCRIPTION:A descritpion text.
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Zurich:20190510T110000
DTEND;TZID=Europe/Zurich:20190510T114500
DESCRIPTION:Another description text\n- on multiple;\n- lines.
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Zurich:20190513T081500
DTEND;TZID=Europe/Zurich:20190513T090000
END:VEVENT                  
BEGIN:VEVENT
DTSTART;TZID=Europe/Zurich:20190517T110000
DTEND;TZID=Europe/Zurich:20190517T114500
DESCRIPTION:A long description with more than 75 characters.\nThis happen
 often when multiple lines are\njoined in one. So the program shoud split 
 every lines\nTo 75 characters including the word description.
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Zurich:20190520T081500
DTEND;TZID=Europe/Zurich:20190520T090000
DESCRIPTION: A description text.
END:VEVENT
END:VCALENDAR

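So the exact requirements are:

  1. Entries without a description should not print DESCRIPTION:.
  2. Multi-line descriptions should be joined and separated with a literal \n. This is done using printf "%s%s", $0, "\\n"
  3. Lines should be split to less than 75 characters and end with \r\n
  4. Additional description lines should begin with a space.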
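
Requirements 1 and 2 can be tried in isolation before worrying about line folding. The snippet below is only a minimal sketch and is not part of the original script: `join_desc` is a hypothetical helper name, and it assumes that the description lines of a single event arrive on standard input. It trims each line, joins the lines with a literal `\n` (written `"\\n"` inside awk so that a backslash and an `n` are produced instead of a newline), and prints nothing at all when there is no description.

```bash
# join_desc: hypothetical helper, for illustration only.
# Reads the description lines of ONE event on stdin.
join_desc() {
    awk '
    {
        gsub(/^[ \t]+|[ \t]+$/, "")            # trim leading/trailing blanks
        desc = (NR == 1 ? "" : desc "\\n") $0  # "\\n" is a literal backslash + n
    }
    END {
        if (desc != "")                        # requirement 1: never print an empty DESCRIPTION:
            print "DESCRIPTION:" desc
    }'
}
```

For example, `printf '    first\n    second\n' | join_desc` should print a single `DESCRIPTION:` line with the two parts separated by a literal `\n`, while empty input prints nothing.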

You are really on the right track. Here is your script with the flag logic integrated:

#!/bin/bash
awk 'BEGIN {print "BEGIN:VCALENDAR\r\n\
... some entries here ...\r\n\
END:VTIMEZONE\r" ;}
$1 ~ /^(lun|mar|mer|jeu|ven)$/ && flag {flag = !flag; print "END:VEVENT\r"}
$1 ~ /^(lun|mar|mer|jeu|ven)$/ && !flag {flag = !flag; print "BEGIN:VEVENT\r\n\
... some entries here ...\r\n\
DTSTART;TZID=Europe/Zurich:"$4""$3""$2"T"$5""$6"00\r\n\
DTEND;TZID=Europe/Zurich:"$4""$3""$2"T"$7""$8"00\r\n\
TRANSP:OPAQUE\r\n\
DESCRIPTION: "; next}
flag {print $0}
END { print "END:VCALENDAR" } ' < "$1"

Output:

BEGIN:VCALENDAR
... some entries here ...
END:VTIMEZONE
BEGIN:VEVENT
... some entries here ...
DTSTART;TZID=Europe/Zurich:20190506T081500
DTEND;TZID=Europe/Zurich:20190506T090000
TRANSP:OPAQUE
DESCRIPTION:
    Some descriptive lines here.
    Lorem ipsumi dolor sit amet, consectetur adipiscing elitr.
END:VEVENT
BEGIN:VEVENT
... some entries here ...
DTSTART;TZID=Europe/Zurich:20190510T110000
DTEND;TZID=Europe/Zurich:20190510T114500
TRANSP:OPAQUE
DESCRIPTION:
    sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
    - enim
    - ad minim veniam.
END:VEVENT
BEGIN:VEVENT
... some entries here ...
DTSTART;TZID=Europe/Zurich:20190513T081500
DTEND;TZID=Europe/Zurich:20190513T090000
TRANSP:OPAQUE
DESCRIPTION:
    exercitation ullamco
END:VEVENT
BEGIN:VEVENT
... some entries here ...
DTSTART;TZID=Europe/Zurich:20190517T110000
DTEND;TZID=Europe/Zurich:20190517T114500
TRANSP:OPAQUE
DESCRIPTION:
    quis nostrud
END:VCALENDAR
$ cat tst.awk
BEGIN {
    ORS="\r\n"

    print "BEGIN:VCALENDAR"
    print "... some entries here ..."
    print "END:VTIMEZONE"
}
/^[^[:space:]]/ {
    prtEndVevent()

    print "BEGIN:VEVENT"
    print "... some entries here ..."

    date = $4 $3 $2
    begt = $5 $6 "00"
    endt = $7 $8 "00"

    print "DTSTART;TZID=Europe/Zurich:" date "T" begt
    print "DTEND;TZID=Europe/Zurich:"   date "T" endt
    next
}
{
    gsub(/^[[:space:]]+|[[:space:]]+$/,"")
    desc = (desc == "" ? "DESCRIPTION:" : desc RS) $0
}
END {
    prtEndVevent()
    print "END:VCALENDAR"
}

function prtEndVevent(       wid) {
    if ( desc != "" ) {
        wid = 74
        gsub(RS,"\\n",desc)
        while ( desc !~ /^ ?$/ ) {
            print substr(desc,1,wid)
            desc = " " substr(desc,wid+1)
        }
        desc = ""
    }
    if ( endVevent != "" ) {
        print endVevent
    }
    endVevent = "END:VEVENT"
}

.

$ awk -f tst.awk file
BEGIN:VCALENDAR
... some entries here ...
END:VTIMEZONE
BEGIN:VEVENT
... some entries here ...
DTSTART;TZID=Europe/Zurich:20190506T081500
DTEND;TZID=Europe/Zurich:20190506T090000
DESCRIPTION:A descritpion text.
END:VEVENT
BEGIN:VEVENT
... some entries here ...
DTSTART;TZID=Europe/Zurich:20190510T110000
DTEND;TZID=Europe/Zurich:20190510T114500
DESCRIPTION:Another description text\n- on multiple;\n- lines.
END:VEVENT
BEGIN:VEVENT
... some entries here ...
DTSTART;TZID=Europe/Zurich:20190513T081500
DTEND;TZID=Europe/Zurich:20190513T090000
END:VEVENT
BEGIN:VEVENT
... some entries here ...
DTSTART;TZID=Europe/Zurich:20190517T110000
DTEND;TZID=Europe/Zurich:20190517T114500
DESCRIPTION:A long description with more than 75 characters.\nThis happen
 often when multiple lines are\njoined in one. So the program shoud split
 every lines\nTo 75 characters including the word description.
END:VEVENT
BEGIN:VEVENT
... some entries here ...
DTSTART;TZID=Europe/Zurich:20190520T081500
DTEND;TZID=Europe/Zurich:20190520T090000
DESCRIPTION:A description text.
END:VEVENT
END:VCALENDAR
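
One way to check generated output against requirements 3 and 4 is a small filter such as the sketch below. It is not part of the answer above: `check_ics` is a hypothetical helper name, and the default limit of 74 simply mirrors the `wid = 74` used in tst.awk. Note that `length()` counts characters in most locales, not bytes, so non-ASCII descriptions may need a stricter limit.

```bash
# check_ics: hypothetical helper, for illustration only.
# Exits non-zero if any line lacks a trailing CR or is longer than max characters.
# Optional first argument: the maximum line length (default 74).
check_ics() {
    awk -v max="${1:-74}" '
    !/\r$/ { printf "line %d: no CR before the newline\n", NR; bad = 1 }
    {
        sub(/\r$/, "")                    # do not count the CR itself
        if (length($0) > max) {
            printf "line %d: %d characters\n", NR, length($0)
            bad = 1
        }
    }
    END { exit bad + 0 }'
}
```

It could be used as `awk -f tst.awk file | check_ics`, which prints one message per offending line and stays silent when every line passes.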

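Note that the above wraps at character positions, not at word boundaries, so a word that straddles the 75th character position will be split. If that is not what you want, you could update prtEndVevent() to print one word at a time, checking that the total length of the words and blanks printed so far plus the next word is less than 75 (and decide how to handle description strings longer than 75 characters that contain no blanks!), or call the UNIX command fold to do the wrapping for you.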
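
The word-at-a-time idea could look roughly like the sketch below. It is an illustration under stated assumptions, not the answer's own code: `fold_words` is a hypothetical helper name, it expects one already-joined content line (for example a complete `DESCRIPTION:...` string) per input line, and it does not split a single word that is longer than the limit, so that case still has to be decided separately. The blank at each break is kept at the end of the line so that a reader who removes the line break and the single leading space gets the original text back.

```bash
# fold_words: hypothetical helper, for illustration only.
# Reads one unfolded content line per input line and folds it at blanks.
# Optional first argument: the maximum line length (default 74).
fold_words() {
    awk -v max="${1:-74}" '
    {
        n = split($0, w, /[ ]/)           # split on every single blank
        cur = w[1]
        for (i = 2; i <= n; i++) {
            if (length(cur) + 1 + length(w[i]) < max) {
                cur = cur " " w[i]        # the word still fits on this line
            } else {
                print cur " "             # keep the blank so unfolding restores it
                cur = " " w[i]            # continuation lines start with a space
            }
        }
        print cur
    }'
}
```

Inside tst.awk the same loop could replace the `while` loop in prtEndVevent(); there `ORS="\r\n"` is already set, whereas this standalone sketch prints plain newlines.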

If you are considering using getline, be sure to first read and fully understand http://awk.freeshell.org/AllAboutGetline