How To: Convert Text Following Title Case Rules in Bash

How can I convert a string to title case while following a set of rules, rather than simply capitalizing the first letter of every word?

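Sample rules:

  • Capitalize all words, except:
  • Lowercase all articles (a, the), prepositions (to, at, in, with), and coordinating conjunctions (and, but, or)
  • Capitalize the first and last words of the title, regardless of part of speech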

Is there an easy way to do this in bash? One-liners appreciated.

(As a side note, this will be used in a parcellite action.)

$ cat titles.txt
purple haze
Somebody To Love
fire on the mountain
THE SONG REMAINS THE SAME
Watch the NorthWind rise
eight miles high
just dropped in
strawberry letter 23

$ cat cap.awk
BEGIN { split("a the to at in on with and but or", w)
        for (i in w) nocap[w[i]] }

function cap(word) {
    return toupper(substr(word,1,1)) tolower(substr(word,2))
}

{
  for (i=1; i<=NF; ++i) {
      printf "%s%s", (i==1||i==NF||!(tolower($i) in nocap)?cap($i):tolower($i)),
                     (i==NF?"\n":" ")
  }
}

$ awk -f cap.awk titles.txt
Purple Haze
Somebody to Love
Fire on the Mountain
The Song Remains the Same
Watch the Northwind Rise
Eight Miles High
Just Dropped In
Strawberry Letter 23

Edit (as a one-liner):

$ echo "the sun also rises" | awk 'BEGIN{split("a the to at in on with and but or",w); for(i in w)nocap[w[i]]}function cap(word){return toupper(substr(word,1,1)) tolower(substr(word,2))}{for(i=1;i<=NF;++i){printf "%s%s",(i==1||i==NF||!(tolower($i) in nocap)?cap($i):tolower($i)),(i==NF?"\n":" ")}}'
The Sun Also Rises
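
Since the question asked about bash specifically, here is a rough pure-bash sketch of the same logic as cap.awk, with no external commands. It assumes bash 4 or later (for the ${var,,} and ${var^} expansions), and the names title_case and small are made up for this illustration. It takes the title as an argument rather than on stdin.

```shell
#!/usr/bin/env bash
# Sketch only: needs bash 4+ for ${var,,} (lowercase) and ${var^} (capitalize first letter).
# title_case and small are illustrative names, not part of the awk answer.
title_case() {
    # Same small-word list as cap.awk, padded with spaces for whole-word matching.
    local small=" a the to at in on with and but or "
    local -a words out=()
    local i n w
    read -r -a words <<< "$1"      # split the argument on whitespace
    n=${#words[@]}
    for ((i = 0; i < n; i++)); do
        w=${words[i],,}            # lowercase the whole word first
        # Capitalize the first word, the last word, and any word not in the list.
        if (( i == 0 || i == n - 1 )) || [[ $small != *" $w "* ]]; then
            w=${w^}
        fi
        out+=("$w")
    done
    printf '%s\n' "${out[*]}"      # join with single spaces
}

title_case "THE SONG REMAINS THE SAME"
title_case "just dropped in"
```

Like the awk version, this collapses runs of whitespace into single spaces and lowercases everything after the first letter of each word, so mixed-case names such as "NorthWind" come out as "Northwind".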

Thanks to @jas for the great answer. In the end, what I needed for parcellite was this one-long-liner in the shell (for the love of pipes!):

echo '%s' | sed 's/\<./\u&/g' | sed 's/\ The\ /\ the\ /' | sed 's/\ A\ /\ a\ /' | sed 's/\ An\ /\ an\ /' | sed 's/\ As\ /\ as\ /' | sed 's/\ At\ /\ at\ /' | sed 's/\ But\ /\ but\ /' | sed 's/\ By\ /\ by\ /' | sed 's/\ For\ /\ for\ /' | sed 's/\ In\ /\ in\ /' | sed 's/\ Of\ /\ of\ /' | sed 's/\ Off\ /\ off\ /' | sed 's/\ On\ /\ on\ /' | sed 's/\ Per\ /\ per\ /' | sed 's/\ To\ /\ to\ /' | sed 's/\ Up\ /\ up\ /' | sed 's/\ Via\ /\ via\ /' | sed 's/\ And\ /\ and\ /' | sed 's/\ Nor\ /\ nor\ /' | sed 's/\ Or\ /\ or\ /' | sed 's/\ So\ /\ so\ /' | sed 's/\ Yet\ /\ yet\ /' | parcellite

The sed stages were, of course, generated with a loop:

for word in {The,A,An,As,At,But,By,For,In,Of,Off,On,Per,To,Up,Via,And,Nor,Or,So,Yet}
do
    low=`echo "$word" | tr '[A-Z]' '[a-z]'`
    printf "sed 's/\ $word\ /\ $low\ /' | "
done
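
For what it's worth, the chain of sed stages could probably be folded into a single sed call. The following is only a sketch under some assumptions: it needs GNU sed (the \u, \L, \< and \> escapes are GNU extensions, which the pipeline above already relies on), and title_sed and small are names invented here. It also behaves slightly differently from the pipeline: it lowercases every occurrence of a small word by matching on word boundaries, then re-capitalizes the first and last words explicitly.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU sed. title_sed and small are illustrative names.
# Same word list as the generated pipeline, joined for ERE alternation.
small='A|An|As|At|But|By|For|In|Of|Off|On|Per|The|To|Up|Via|And|Nor|Or|So|Yet'

title_sed() {
    # 1. Uppercase the first character of every word.
    # 2. Lowercase every small word (whole-word match, all occurrences).
    # 3. Re-capitalize the first character of the line.
    # 4. Re-capitalize the last word of the line.
    sed -E \
        -e 's/\<./\u&/g' \
        -e "s/\<($small)\>/\L&/g" \
        -e 's/^./\u&/' \
        -e 's/[^ ]+$/\u&/'
}

echo 'the sun also rises in the east' | title_sed
```

As with the original pipeline, this does not lowercase the rest of each word, so all-caps input stays all-caps, and words containing apostrophes or hyphens are treated as several words by \<.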

Thanks to everyone who gave it a try. :-)