How to clean up troublesome characters in clipboard data so I can paste into a python script in IDLE?

Question

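I want to copy data tables displayed on websites and, using IDLE, paste the text directly into a script as a string variable. This sometimes fails because something in the copied material is not accepted by IDLE as saveable. The result is not an error message; IDLE simply ignores the save request. It just sits there until I close the window without saving.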

I'm fine with this behavior: I certainly don't want to save a python script that contains troublesome characters.

Is there some way to remove the offending characters from my computer's clipboard so I can carry on writing the script?

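If I only needed to do this once, I could go look at the site's HTML and possibly extract the data from there, or, in the case of the table of satellites on this page, I could go into the Google app and get it there.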

But for the purposes of this question, I'd like a way to "fix" the data in my clipboard so that I can paste it as a string into a script using IDLE and run it.

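I first tried "Paste and Match Style" into a .txt file to clean it up, but that didn't work. I have Sublime Text 2 but am not very familiar with it; it would be great if it has a feature that handles this.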

Trying to paste between triple quotes, thing = """ """, at the prompt produces the following error message: Unsupported characters in input:

Note: I'm using Python and IDLE version "2.7.11" and Tk version "8.5.9" on OSX (I know, these are a year old).

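Edit: Here is a chunk of the data from my clipboard, as suggested in the comments. Copying it from here (as shown) results in a failed save attempt in IDLE, so at least some of the offending symbols are present. I paste it between a pair of triple quotes, e.g. thing = """ """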

1   2/6/2000    PICOSAT 1&2 (TETHERED)  Aerospace Corporation   mil Opal    Opal    T   5   N   Minotaur-1
2   2/10/2000   PICOSAT 3 (JAK) Santa Clara University  uni Opal    Opal    E   2   N   Minotaur-1
3   2/10/2000   PICOSAT 6 (StenSat) Stensat Group. LLC  civ Opal    Opal    C   2   N   Minotaur-1
4   2/12/2000   PICOSAT 4 (Thelma)  Santa Clara University  uni Opal    Opal    S   2   N   Minotaur-1
5   2/12/2000   PICOSAT 5 (Louise)  Santa Clara University  uni Opal    Opal    S   2   N   Minotaur-1
6   9/6/2001    PICOSAT 7&8 (TETHERED)  Aerospace Corporation   mil Opal    Opal    T   2   D   Minotaur-1
7   12/2/2002   MEPSI   Aerospace Corporation   mil 2U  SSPL    T   2   D   Shuttle
8   6/30/2003   DTUSAT 1    Technical University of Denmark uni 1U  PPOD    E   2   N   Rokot-KM
9   6/30/2003   CUTE-1 (CO-55)  Tokyo Institute of Technology   uni 1U  PPOD    E   3   N   Rokot-KM
10  6/30/2003   QUAKESAT 1  Stanford University uni 3U  PPOD    S   5   N   Rokot-KM
11  6/30/2003   AAU CUBESAT 1   Aalborg University  uni 1U  PPOD    E   2   N   Rokot-KM
12  6/30/2003   CANX-1  UTIAS (University of Toronto)   uni 1U  PPOD    E   2   N   Rokot-KM
13  6/30/2003   CUBESAT XI-IV (CO-57)   University of Tokyo uni 1U  PPOD    E   4   S   Rokot-KM
14  10/27/2005  UWE-1   University of Würzburg  uni 1U  TPOD    E   3   N   Kosmos-3M
15  10/27/2005  CUBESAT XI-V (CO-58)    University of Tokyo uni 1U  TPOD    E   5   N   Kosmos-3M
16  10/27/2005  Ncube 2 Norweigan Universities  uni 1U  TPOD    E   2   N   Kosmos-3M
17  2/21/2006   CUTE 1.7    Tokyo Institute of Technology   uni 2U  JPOD    C   2   D   M-5 (2)
18  7/26/2006   AeroCube 1  Aerospace Corporation   mil 1U  PPOD    T   1   D   Dnepr-1
19  7/26/2006   SEEDS   Nihon University    uni 1U  PPOD    E   1   D   Dnepr-1
20  7/26/2006   SACRED  University of Arizona   uni 1U  PPOD    E   1   D   Dnepr-1

Answer 1

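I would try scanning the string to find the characters that fall outside the normal printable range. The strange characters may then be easier to identify.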

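text = """ <here comes your pasted text> """

def normal(c):
    # printable ASCII is 32..126 (127 is the DEL control character),
    # plus newline, carriage return and tab
    return (32 <= ord(c) <= 126) or (c in '\n\r\t')

# character codes of everything outside that range
strange = set(ord(c) for c in text if not normal(c))

print(sorted(strange))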
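
The scan above gives bare character codes only. As a rough sketch of how one might also see where each odd character sits and what it is called, the variant below reports line, column, code point and Unicode name. The names is_normal, find_strange and sample are made up for this illustration, and the sample string is hypothetical, not taken from the actual clipboard. It assumes the text is a unicode string; under Python 2.7 a pasted byte string would probably first need something like text.decode('utf-8').

```python
import unicodedata


def is_normal(ch):
    """True for printable ASCII (32..126) plus newline, carriage return, tab."""
    return 32 <= ord(ch) <= 126 or ch in u"\n\r\t"


def find_strange(text):
    """Return (line, column, code point, Unicode name) for each odd character.

    `text` must be a unicode string. Lines and columns are 1-based.
    """
    hits = []
    for line_no, line in enumerate(text.split(u"\n"), 1):
        for col_no, ch in enumerate(line, 1):
            if not is_normal(ch):
                # Fall back to a placeholder for code points without a name.
                name = unicodedata.name(ch, u"<unnamed>")
                hits.append((line_no, col_no, ord(ch), name))
    return hits


# Hypothetical sample: a no-break space (U+00A0) and a u-umlaut (U+00FC).
sample = u"14\u00a0UWE-1\nUniversity of W\u00fcrzburg"
for hit in find_strange(sample):
    print(hit)
```

Knowing the position and name makes it easier to decide whether a character should be replaced (a no-break space) or kept (the umlaut in a university name).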

I'm curious which character codes show up in strange.
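
Once the offending codes are known, one possible way to "fix" the clipboard itself is sketched below. It is not a tested recipe: it assumes the macOS command-line tools pbpaste and pbcopy are available and that pbpaste emits UTF-8 (this may depend on locale settings). The REPLACEMENTS table and the function names to_plain_ascii and clean_clipboard are invented for this example. Note that this approach is lossy: accented letters are reduced to their base letter, so "Würzburg" becomes "Wurzburg".

```python
import subprocess
import unicodedata

# Hypothetical replacement table for characters that often ride along when
# copying from web pages; extend it with whatever the scan reports.
REPLACEMENTS = {
    u"\u00a0": u" ",   # no-break space
    u"\u2018": u"'",   # left single quotation mark
    u"\u2019": u"'",   # right single quotation mark
    u"\u201c": u'"',   # left double quotation mark
    u"\u201d": u'"',   # right double quotation mark
    u"\u2013": u"-",   # en dash
    u"\u2014": u"-",   # em dash
}


def to_plain_ascii(text):
    """Return `text` (a unicode string) reduced to plain ASCII."""
    for bad, good in REPLACEMENTS.items():
        text = text.replace(bad, good)
    # NFKD splits an accented letter into base letter + combining mark;
    # encoding with "ignore" then drops the mark and anything else non-ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def clean_clipboard():
    """Rewrite the macOS clipboard in place (assumes pbpaste/pbcopy exist)."""
    raw = subprocess.check_output(["pbpaste"]).decode("utf-8", "replace")
    cleaned = to_plain_ascii(raw)
    proc = subprocess.Popen(["pbcopy"], stdin=subprocess.PIPE)
    proc.communicate(cleaned.encode("ascii"))
```

The intended workflow would be: copy the table in the browser, call clean_clipboard() from a Python session in Terminal, then paste into IDLE. The pure function to_plain_ascii can also be tried on its own against a string, without touching the clipboard.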

