Extract Google Drive folder ID from URLs

I'm just trying to extract the Google Drive folder ID from a bunch of different Google Drive URLs.

cat links.txt

https://drive.google.com/drive/mobile/folders/1mzr8lgf50p9z6p-7RyHn4XjnyKSvyyuE?usp=sharing
https://drive.google.com/open?id=1_7vwy0-y0BqvPOtG2Or4pvoChnZHrHAx
https://drive.google.com/folderview?id=1rOLhig0g3DdgB9YfvW8HiqRA6o6LxAFF
https://drive.google.com/file/d/1o2J_NwHS3l1-fM71HaDN-xxres1jHkb_/view?usp=drivesdk
https://drive.google.com/drive/folders/0AKzaqn_X7nxiUk9PVA
https://drive.google.com/drive/mobile/folders/0AKzaqn_X7nxiUk9PVA
https://drive.google.com/drive/mobile/folders/0AKzaqn_X7nxiUk9PVA/1re_-YAGfTuyE1Gt848vzTu4ZDC6j23sG/1Ye90fM5qYMYkXp4QMAcQftsJCFVHswWj/149W7xNROO33zaPvIYTNwvtVGAXFxCg_b?sort=13&direction=a
https://drive.google.com/drive/mobile/folders/1nY48t6MATb0XM-iEdeWzEs70qXW2N4Y9?sort=13&direction=a
https://drive.google.com/drive/folders/1M3Xp3xz44NS8QJO5XJT5DK55MohwN6tF?sort=13&direction=a

Expected output

1mzr8lgf50p9z6p-7RyHn4XjnyKSvyyuE
1_7vwy0-y0BqvPOtG2Or4pvoChnZHrHAx
1rOLhig0g3DdgB9YfvW8HiqRA6o6LxAFF
1o2J_NwHS3l1-fM71HaDN-xxres1jHkb_
0AKzaqn_X7nxiUk9PVA
0AKzaqn_X7nxiUk9PVA
149W7xNROO33zaPvIYTNwvtVGAXFxCg_b
1nY48t6MATb0XM-iEdeWzEs70qXW2N4Y9
1M3Xp3xz44NS8QJO5XJT5DK55MohwN6tF

After an hour of trial and error, I came up with this regex: ([01A-Z])(?=[\w-]*[A-Za-z])[\w-]+

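It seems to work well, except that it doesn't handle the third-to-last link correctly. If a URL contains multiple nested folder IDs, I need the innermost one in the output. Can someone help me fix this bug, and possibly improve the regex if there is a more efficient way to do this than mine?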

You can try this sed:

sed -E 's~.*[/=]([01A-Z][-_[:alnum:]]+)([?/].*|$)~\1~' links.txt

1mzr8lgf50p9z6p-7RyHn4XjnyKSvyyuE
1_7vwy0-y0BqvPOtG2Or4pvoChnZHrHAx
1rOLhig0g3DdgB9YfvW8HiqRA6o6LxAFF
1o2J_NwHS3l1-fM71HaDN-xxres1jHkb_
0AKzaqn_X7nxiUk9PVA
0AKzaqn_X7nxiUk9PVA
149W7xNROO33zaPvIYTNwvtVGAXFxCg_b
1nY48t6MATb0XM-iEdeWzEs70qXW2N4Y9
1M3Xp3xz44NS8QJO5XJT5DK55MohwN6tF
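
To reuse the sed approach on one URL at a time, it can be wrapped in a small shell function. This is only a sketch: the function name extract_drive_id is made up for illustration, and it assumes a sed that accepts -E (GNU and BSD sed both do). LC_ALL=C is an addition not present in the command above; it is there so that [A-Z] stays limited to ASCII uppercase letters regardless of the current locale. The greedy .* at the front pushes the match as far right as it can go, which is why the innermost ID is the one kept for nested folder paths.

```shell
#!/usr/bin/env bash
# extract_drive_id is a hypothetical helper name, not part of any tool.
# It prints the right-most ID-like token found in the URL passed as $1.
extract_drive_id() {
  # .*            greedy: consume as much of the URL as possible
  # [/=]          the ID must directly follow a "/" or "="
  # ([01A-Z]...)  capture a token that starts with 0, 1 or A-Z
  # ([?/].*|$)    the token must be followed by "?", "/" or end of line
  printf '%s\n' "$1" |
    LC_ALL=C sed -E 's~.*[/=]([01A-Z][-_[:alnum:]]+)([?/].*|$)~\1~'
}

# A file link with a trailing /view segment:
extract_drive_id 'https://drive.google.com/file/d/1o2J_NwHS3l1-fM71HaDN-xxres1jHkb_/view?usp=drivesdk'
# prints 1o2J_NwHS3l1-fM71HaDN-xxres1jHkb_

# Nested folders: only the innermost ID is printed.
extract_drive_id 'https://drive.google.com/drive/mobile/folders/0AKzaqn_X7nxiUk9PVA/1re_-YAGfTuyE1Gt848vzTu4ZDC6j23sG/1Ye90fM5qYMYkXp4QMAcQftsJCFVHswWj/149W7xNROO33zaPvIYTNwvtVGAXFxCg_b?sort=13&direction=a'
# prints 149W7xNROO33zaPvIYTNwvtVGAXFxCg_b
```

Note that a line with no matching token passes through sed unchanged, so a URL that does not fit the pattern is printed as-is rather than dropped.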

Using GNU awk:

awk '{print $NF}' FPAT='[a-zA-Z0-9_-]{19,34}' links.txt

$NF: the last field of the current record

FPAT: A regular expression describing the contents of the fields in a record. When set, gawk parses the input into fields, where the fields match the regular expression, instead of using the value of FS as the field separator.

Output:

1mzr8lgf50p9z6p-7RyHn4XjnyKSvyyuE
1_7vwy0-y0BqvPOtG2Or4pvoChnZHrHAx
1rOLhig0g3DdgB9YfvW8HiqRA6o6LxAFF
1o2J_NwHS3l1-fM71HaDN-xxres1jHkb_
0AKzaqn_X7nxiUk9PVA
0AKzaqn_X7nxiUk9PVA
149W7xNROO33zaPvIYTNwvtVGAXFxCg_b
1nY48t6MATb0XM-iEdeWzEs70qXW2N4Y9
1M3Xp3xz44NS8QJO5XJT5DK55MohwN6tF
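
FPAT is a gawk extension, so the awk command above will not work with other awk implementations such as mawk. The same idea (take the last run of 19 to 34 ID characters on each line) can be sketched with grep -o and tail instead. This assumes a grep that supports -o (GNU and BSD grep do); the helper name last_id_token is invented here, and LC_ALL=C is added so the character ranges are ASCII only.

```shell
#!/usr/bin/env bash
# last_id_token is a hypothetical helper: print the last run of 19 to 34
# characters from [A-Za-z0-9_-] found in the URL passed as $1.
last_id_token() {
  printf '%s\n' "$1" |
    LC_ALL=C grep -oE '[A-Za-z0-9_-]{19,34}' |
    tail -n 1
}

# Read URLs line by line, skipping blank lines.
while IFS= read -r url; do
  if [ -n "$url" ]; then
    last_id_token "$url"
  fi
done <<'EOF'
https://drive.google.com/folderview?id=1rOLhig0g3DdgB9YfvW8HiqRA6o6LxAFF
https://drive.google.com/drive/mobile/folders/0AKzaqn_X7nxiUk9PVA/1re_-YAGfTuyE1Gt848vzTu4ZDC6j23sG/1Ye90fM5qYMYkXp4QMAcQftsJCFVHswWj/149W7xNROO33zaPvIYTNwvtVGAXFxCg_b?sort=13&direction=a
EOF
```

Because this selects tokens by length alone, any other run of 19 to 34 such characters later in the URL (a long query value, for example) would be picked instead, and an ID longer than 34 characters would be split. The 19 and 34 bounds come from the answer above and fit the sample links; they are not taken from any documented ID format.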