Parsing a song title from /r/listenToThis for an IFTTT applet
I have a set of song names from this subreddit (/r/listenToThis), like the following:
[
"Lophelia -- MYTCH [Acoustic Prog-Rock/Jazz] (2019)",
"Julia Jacklin - Pressure to Party [Rock] (2019)",
"The Homeless Gospel Choir - I'm Going Home [Folk-Punk] (2019) cover of Pat the Bunny | A Fistful of Vinyl",
"Lea Salonga and Simon Bowman - The last night of the world [musical] (1990)",
"$uicideboy$ - Death",
"SNFU -- Joni Mitchell Tapes [Punk/Alternative] (1993)",
"Blab - afdosafhsd (2000)",
"Something strange and badly formatted without any artist [Classical]",
"シロとクロ「ミッドナイトにグッドナイト」(Goodnight to Midnight - Shirotokuro) - (Official Music Video) [Indie/Alternative]",
"Victor Love - Irrationality (feat. Spiritual Front) [Industrial Rock/Cyberpunk]"
...
]
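I'm trying to parse the title and the artist out of them, but I'm really struggling with the regex.
I tried splitting on "-", but getting only the artist afterwards is really annoying.
I also tried using regexes, but I couldn't get anything to work properly.
This is what I had for the artist: /(?<= -{1,2} )[\S ]*(?= \[|\( )/i
And this for the title: /[\S ]*(?= -{1,2} )/i
Each entry is the name of a song. The song name may be preceded by the song's artist, followed by one or two (or maybe three?) dashes. After that, the genre may be added in square brackets and/or the release date in parentheses. I don't expect perfect accuracy; some formats may be weird, and in those cases I'd rather have artist undefined than some weird parsing.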
For example:
[
{ title: "MYTCH", artist: "Lophelia" },
{ title: "Pressure to Party", artist: "Julia Jacklin" },
{ title: "I'm Going Home", artist: "The Homeless Gospel Choir" },
{ title: "The last night of the world", artist: "Lea Salonga and Simon Bowman" },
{ title: "Death", artist: "$uicideboy$" },
{ title: "Joni Mitchell Tapes", artist: "SNFU" },
{ title: "afdosafhsd", artist: "Blab" },
{ title: "Something strange and badly formatted without any artist" },
{ title: "Goodnight to midnight", artist: "shirotokuro" }, // Probably impossible without some kind of AI
{ title: "Irrationality", artist: "Victor Love" }
]
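To get the expected result, use the approach below:
1. For the title, use substring starting from the position of indexOf('- ') (note the extra space) and check for ' ['; if there is no ' [' in the string, fall back to the length of the string
v.substring(v.indexOf('- ')+1, v.indexOf(' [') !== -1? v.indexOf(' [') : v.length).trim()
2. For the artist, use substr from position 0 up to the indexOf of '-'
v.substr(0, v.indexOf('-')).trim()
Working code for reference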
let arr = [
"Lophelia -- MYTCH [Acoustic Prog-Rock/Jazz] (2019)",
"Julia Jacklin - Pressure to Party [Rock] (2019)",
"The Homeless Gospel Choir - I'm Going Home [Folk-Punk] (2019) cover of Pat the Bunny | A Fistful of Vinyl",
"Lea Salonga and Simon Bowman - The last night of the world [musical] (1990)",
"$uicideboy$ - Death",
"SNFU -- Joni Mitchell Tapes [Punk/Alternative] (1993)",
"Blab - afdosafhsd (2000)",
"Something strange and badly formatted without any artist [Classical]"
]
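let result = arr.reduce((acc, v) => {
  acc.push({
    // Title: from just after the first "- " up to " [", else " (", else the end of the string
    title: v.substring(v.indexOf('- ')+1, v.indexOf(' [') !== -1? v.indexOf(' [') : (v.indexOf(' (') !== -1? v.indexOf(' (') : v.length)).trim(),
    // Artist: everything before the first "-" (an empty string when there is no dash)
    artist: v.substr(0, v.indexOf('-')).trim()
  })
  return acc
}, [])
console.log(result)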
codepen - https://codepen.io/nagasai/pen/zQKRXj?editors=1010
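The index-based code above yields an empty string for the artist when an entry has no dash, and it looks for ' [' before ' (', so a title such as "Irrationality (feat. Spiritual Front)" keeps its parenthesised part. The following is only a sketch of how the same indexOf idea could leave artist undefined and cut at whichever bracket comes first. The function name parseByIndex is hypothetical, and the sketch assumes the separator is the first "- " in the entry.

```javascript
// Hypothetical helper: index-based parsing with a guard for entries without a dash.
function parseByIndex(entry) {
  const sep = entry.indexOf('- ');
  // Without a separator the whole entry is treated as the title part.
  const rest = sep === -1 ? entry : entry.slice(sep + 2);
  // Cut the title at the first " [" or " (", whichever comes first.
  const cuts = [rest.indexOf(' ['), rest.indexOf(' (')].filter(i => i !== -1);
  const end = cuts.length ? Math.min(...cuts) : rest.length;
  const result = { title: rest.slice(0, end).trim() };
  if (sep !== -1) {
    // Drop any extra dashes left at the end of the artist ("Lophelia -" -> "Lophelia").
    result.artist = entry.slice(0, sep).replace(/-+$/, '').trim();
  }
  return result;
}

console.log(parseByIndex("Lophelia -- MYTCH [Acoustic Prog-Rock/Jazz] (2019)"));
console.log(parseByIndex("Something strange and badly formatted without any artist [Classical]"));
```

Entries with a dash inside parentheses, like the シロとクロ one from the question, still come out wrong with this approach.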
You could do it like this:
const songs = [
"Lophelia -- MYTCH [Acoustic Prog-Rock/Jazz] (2019)",
"Julia Jacklin - Pressure to Party [Rock] (2019)",
"The Homeless Gospel Choir - I'm Going Home [Folk-Punk] (2019) cover of Pat the Bunny | A Fistful of Vinyl",
"Lea Salonga and Simon Bowman - The last night of the world [musical] (1990)",
"Lophelia -- MYTCH [Acoustic Prog-Rock/Jazz]",
"Death - $uicideboy$",
"SNFU -- Joni Mitchell Tapes [Punk/Alternative] (1993)",
"Title - Artist (2000)",
"Something strange and badly formatted without any artist [Classical]",
];
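// Matches the first "[genre]" or "(year)" tag and everything after it
const trailingRgx = /\s*((\[[^\]]+\])|(\(\d+\))).*$/;
const details = songs.map(song => {
  // Split on a run of dashes surrounded by whitespace
  const splitted = song.split(/\s+\-+\s+/);
  // The part before the dashes is treated as the title, the part after as the artist
  let title = splitted[0];
  let artist = splitted[1];
  if (splitted.length >= 2) {
    artist = artist.replace(trailingRgx, '');
  } else {
    // No separator: artist stays undefined and the tags are stripped from the title
    title = title.replace(trailingRgx, '');
  }
  return {
    title,
    artist
  }
});
console.log(details);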
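The snippet above treats the text before the dashes as the title. In the question's format the artist comes first, so here is a minimal sketch of the same split idea with the order swapped. The names parseBySplit, SEPARATOR and TRAILING are hypothetical. The sketch strips everything from the first "[" or "(" before splitting, and assumes anything other than exactly two parts is too ambiguous to assign an artist.

```javascript
// Assumed separator: one or more dashes surrounded by whitespace.
const SEPARATOR = /\s+-+\s+/;
// Assumed trailing part: everything from the first "[" or "(" onwards.
const TRAILING = /\s*[\[(].*$/;

// Hypothetical helper: split-based parsing, artist first, title second.
function parseBySplit(entry) {
  // Strip tags first so dashes inside "[...]" or "(...)" cannot act as separators.
  const head = entry.replace(TRAILING, '').trim();
  const parts = head.split(SEPARATOR);
  if (parts.length === 2) {
    return { title: parts[1], artist: parts[0] };
  }
  // Zero or several separators: keep the text as the title, leave artist undefined.
  return { title: head };
}

console.log(parseBySplit("SNFU -- Joni Mitchell Tapes [Punk/Alternative] (1993)"));
console.log(parseBySplit("Something strange and badly formatted without any artist [Classical]"));
```

Stripping before splitting is a design choice: it keeps a hyphen or dash inside a genre tag from producing a third part.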
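You can use this regex to capture the title and artist parts as you described in your post.
^([^-[\]()\n]+)-* *([^[\]()\n]*)
Regex Demo (intentionally shown in the PCRE flavor to keep the group colors for visual appeal, but it also works in the JavaScript flavor)
JS code demo: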
const songs = ["Lophelia -- MYTCH [Acoustic Prog-Rock/Jazz] (2019)",
"Julia Jacklin - Pressure to Party [Rock] (2019)",
"The Homeless Gospel Choir - I'm Going Home [Folk-Punk] (2019) cover of Pat the Bunny | A Fistful of Vinyl",
"Lea Salonga and Simon Bowman - The last night of the world [musical] (1990)",
"Lophelia -- MYTCH [Acoustic Prog-Rock/Jazz]",
"Death - $uicideboy$",
"SNFU -- Joni Mitchell Tapes [Punk/Alternative] (1993)",
"Title - Artist (2000)",
"Something strange and badly formatted without any artist [Classical]"]
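songs.forEach(song => {
  // Group 1: text before the dashes; group 2: text after them, up to the first bracket
  const m = /^([^-[\]()\n]+)-* *([^[\]()\n]*)/.exec(song)
  console.log("Title: " + m[1].trim() + ", Artist: " + m[2].trim())
})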
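With the regex above, an entry without dashes gets an empty second group rather than an undefined artist. As a sketch only, a single pattern with named groups and an optional artist part could give the shape asked for in the question. The names ENTRY and parseWithRegex are hypothetical, and the pattern assumes the artist-first order and a dash run surrounded by whitespace as the separator.

```javascript
// Optional "artist <dashes> " prefix, then the title, then an optional "[...]" or "(...)" tail.
const ENTRY = /^(?:(?<artist>[^\[\]()]+?)\s+-+\s+)?(?<title>[^\[\]()]+?)\s*(?:[\[(].*)?$/;

// Hypothetical helper: regex-based parsing that omits artist when no separator is found.
function parseWithRegex(entry) {
  const m = ENTRY.exec(entry);
  // Entries the pattern cannot handle (for example ones starting with a bracket) are kept whole.
  if (!m) return { title: entry.trim() };
  const { artist, title } = m.groups;
  return artist === undefined ? { title } : { title, artist };
}

console.log(parseWithRegex("Julia Jacklin - Pressure to Party [Rock] (2019)"));
console.log(parseWithRegex("Something strange and badly formatted without any artist [Classical]"));
```

Because the separator requires whitespace on both sides, a hyphen inside a name is not treated as a split point.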