Python's accessible class variables, sensitive data, and malicious coders (black-hat hackers)

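I was trying to make a variable inaccessible in a project I'm working on, and I ran across the SO post Does Python have “private” variables in classes? For me, it raised some interesting questions, which I will label Q1, Q2, etc. to try to make this answerable. I've looked around, but I didn't find answers to the questions I'm asking, especially those about sensitive data.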

I found useful material in that post, but the general consensus seemed to be something like: if you see a variable with a _ before it, act like an adult and realize you shouldn't be messing with it. The same kind of idea was put forward for variables preceded by __. There, I got the general idea that you trust people not to use tricks like those described here and (in more detail) here. I also found some good information elsewhere.

This is all very good advice when you're talking about good coding practices.

I posted some thoughts in comments on the posts I've shared. My main question was posted as a comment.

I'm surprised there hasn't been more discussion of those who want to introduce malicious code. This is a real question: Is there no way in Python to prevent a black-hat hacker from accessing your variables and methods and inserting code/data that could deny service or reveal personal (or proprietary company) information (Q1)? If Python doesn't allow this type of security, should it ever be used for sensitive data (Q2)?

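Am I completely missing something: could a malicious coder even access variables and methods in order to insert code/data that could deny service or reveal sensitive data (Q3)?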

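I imagine I could be misunderstanding a concept, missing something, putting a problem in a place where it doesn't belong, or just being completely ignorant of what computer security is. However, I want to understand what's going on. If I'm totally off the mark, I want an answer that tells me so, but I would also like to know how I'm off the mark and how to get back on track.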

Another part of the question I'm asking here comes from another comment I made on those posts/answers. @SLott said (somewhat paraphrased)

... I've found that private and protected are very, very important design concepts. But as a practical matter, in tens of thousands of lines of Java and Python, I've never actually used private or protected. ... Here's my question "protected [or private] from whom?"

To find out whether my concern is worth worrying about, I commented on that post. Here it is, edited.

Q: "protected from whom?" A: "From malicious, black-hat hackers who would want to access variables and functions so as to be able to deny service, to access sensitive info, ..." It seems the A._no_touch = 5 approach would cause such a malicious coder to laugh at my "please don't touch this". My A.__get_SSN(self) seems to be just wishful hoping that B.H. (Black Hat) doesn't know the x = A(); x._A__get_SSN() trick (trick by @Zorf).

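I might have put the problem in the wrong place, and, if so, I'd like someone to tell me that, but also to explain why. Are there secure ways of doing things with a class-based approach (Q4)? What other non-class-and-variable solutions are there for handling sensitive data in Python (Q5)?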

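Here is some code that shows why the answers to these questions make me wonder whether Python should ever be used for sensitive data (Q2). It's not complete code (why would I put these private values and methods down without using them anywhere?), but I hope it shows the type of thing I'm trying to ask about. I typed and ran all of this at the Python interactive console.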

## Type this into the interpreter to define the class.
class A():
  def __init__(self):
    self.name = "Nice guy."
    self.just_a_4 = 4
    self.my_number = 4
    self._this_needs_to_be_pi = 3.14
    self.__SSN = "I hope you do not hack this..."
    self.__bank_acct_num = 123
  def get_info(self):
    print("Name, SSN, bank account.")
  def change_my_number(self, another_num):
    self.my_number = another_num
  def _get_more_info(self):
    print("Address, health problems.")
  def send_private_info(self):
    print(self.name, self.__SSN, self.__bank_acct_num)
  def __give_20_bucks_to(self, ssn):
    self.__SSN += " has "
  def say_my_name(self):
    print("my name")
  def say_my_real_name(self):
    print(self.name)
  def __say_my_bank(self):
    print(str(self.__bank_acct_num))
>>> my_a = A()
>>> my_a._this_needs_to_be_pi
3.14
>>> my_a._this_needs_to_be_pi=4 # I just ignored begins-with-`_` 'rule'.
>>> my_a._this_needs_to_be_pi
4

## This next method could actually be setting up some kind of secure connection,  
## I guess, which could send the private data. I just print it, here.
>>> my_a.send_private_info()
Nice guy. I hope you do not hack this... 123

## Easy access and change a "private" variable
>>> my_a.__SSN
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'A' object has no attribute '__SSN'
>>> my_a.__dict__
{'name': 'Nice guy.', 'just_a_4': 4, 'my_number': 4, '_this_needs_to_be_pi': 4, 
'_A__SSN': 'I hope you do not hack this...', '_A__bank_acct_num': 123}
>>> my_a._A__SSN
'I hope you do not hack this...'

# (maybe) potentially more dangerous
>>> def give_me_your_money(self, bank_num):
      print("I don't know how to inject code, but I can")
      print("access your bank account number:")
      print(my_a._A__bank_acct_num)
      print("and use my bank account number:")
      print(bank_num)
>>> give_me_your_money(my_a,345)
I don't know how to inject code, but I can
access your bank account number:
123
and use my bank account number:
345

At this point, I re-entered the class definition, which probably wasn't necessary.

>>> this_a = A()
>>> this_a.__give_20_bucks_to('unnecessary param')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'A' object has no attribute '__give_20_bucks_to'
>>> this_a._A__give_20_bucks_to('unnecessary param')
>>> this_a._A__SSN
'I hope you do not hack this... has '

## Adding a fake "private" variable, `this_a.__SSN`
>>> this_a.__SSN = "B.H.'s SSN"
>>> this_a.__dict__
{'name': 'Nice guy.', 'just_a_4': 4, 'my_number': 4, '_this_needs_to_be_pi': 3.14, 
'_A__SSN': 'I hope you do not hack this... has ', '_A__bank_acct_num': 123, 
'__SSN': "B.H.'s SSN"}
>>> this_a.__SSN
"B.H.'s SSN"

## Now, changing the real one and "sending/stealing the money"
>>> this_a._A__SSN = "B.H.'s SSN"
>>> this_a._A__give_20_bucks_to('unnecessary param')
>>> this_a._A__SSN
"B.H.'s SSN has "

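I've actually done some work with sensitive data at a previous contracting job: not SSNs and bank account numbers, but things like people's ages, addresses, phone numbers, personal history, marital and other relationship history, criminal records, etc. I wasn't involved in the programming to secure this data; I was helping to extract useful information by helping to ground-truth the data in preparation for machine learning. We had permission and legal go-aheads to work with such data. Another main question is this: How, in Python, does one collect, manage, analyze, and draw useful conclusions from this kind of sensitive data (Q6)? From what I've discussed here, it doesn't seem that classes (or any other data structures, which I didn't go into here but which seem to have the same problems) allow this to be done securely (privately or in a protected manner). I imagine that a class-based solution probably has something to do with compilation. Is this true (Q7)?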

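Finally, since it wasn't security but code reliability that brought me here, I'll share another post I found, and a comment I made on it, to complete my questions.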

@Marcin wrote:

[In response to the OP's words,] "The problem is simple. I want private variables to be accessed and changed only inside the class." [Marcin responded] So, don't write code outside the class that accesses variables starting with __. Use pylint or the like to catch style mistakes like that.

The goal of my comment was to see whether my thoughts represent an actual coding concern. I hope it didn't come across as rude.

It seems this answer would be nice if you wrote code only for your own personal enjoyment and never had to hand it on to someone else to maintain it. Any time you're in a collaborative coding environment (any post-secondary education and/or work experience), the code will be used by many. Someone down the line will want to use an easy way to change your __you_really_should_not_touch_this variable. They may have a good reason for doing so, but it's possible you set up your code such that their "easy way" is going to break things.

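Is my point a valid one, or do most coders respect the double underscore (Q8)? Is there a better way, using Python, to protect the integrity of the code, something better than the __ strategy (Q9)?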

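private and protected do not exist for security. They exist to enforce contracts within your code, namely logical encapsulation. If you mark a piece as protected or private, it means that it is a logical implementation detail of the implementing class, and that no other code should touch it directly, since other code may not [be able to] use it correctly and may mess up state.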
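
As a rough illustration of the "contract, not security" point, here is a minimal sketch of what Python's double-underscore name mangling actually does. The classes `Base` and `Child` and the attribute `__token` are hypothetical names made up for this example. Mangling simply rewrites the name with the class name as a prefix, which keeps a subclass from accidentally clobbering a parent's attribute; it hides nothing from anyone who looks.

```python
# Hypothetical classes, for illustration only.
class Base:
    def __init__(self):
        self.__token = "base"      # stored on the instance as _Base__token

    def base_token(self):
        return self.__token        # compiled as self._Base__token


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = "child"     # stored as _Child__token, so no clash

    def child_token(self):
        return self.__token        # compiled as self._Child__token


c = Child()
print(sorted(vars(c)))   # ['_Base__token', '_Child__token']
print(c.base_token())    # base
print(c.child_token())   # child
```

Both attributes sit in plain view in the instance dictionary, which is why the `x._A__get_SSN()` trick from the question works.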

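For example, if your logical rule is that whenever you change self._a you must also update self._b with a certain value, then you don't want external code to modify those variables, because your internal state may get messed up if the external code does not follow this rule. You want only your one class to handle this internally, since that localizes the potential points of failure.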
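
A minimal sketch of that rule, assuming a made-up invariant that `_b` must always be twice `_a` (the class name `Pair` and the doubling rule are hypothetical, chosen only to make the example concrete). The class exposes a property so that every sanctioned change goes through one place:

```python
class Pair:
    """Hypothetical class whose invariant is: _b == 2 * _a."""

    def __init__(self, a):
        self._a = a
        self._b = 2 * a

    @property
    def a(self):
        return self._a

    @a.setter
    def a(self, value):
        # The only sanctioned way to change _a: it keeps _b in sync.
        self._a = value
        self._b = 2 * value

    @property
    def b(self):
        return self._b


p = Pair(1)
p.a = 5
print(p.a, p.b)    # 5 10  (invariant holds)

p._a = 7           # outside code ignoring the underscore convention
print(p.a, p.b)    # 7 10  (invariant silently broken)
```

Nothing stops the last assignment, and nothing needs to: the underscore tells collaborators which door to use, and a linter can flag code that ignores it.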

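In the end, all of this gets compiled into one big ball of bytes anyway, and all the data is stored in memory at runtime. At that point there is no protection of individual memory offsets within the application's scope; it's all just byte soup. protected and private are constraints that programmers impose on their own code to keep their own logic straight. For that purpose, more or less informal conventions like _ are perfectly adequate.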
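
To make the "byte soup" point concrete in Python terms, here is a small sketch showing that even a value kept out of every instance dictionary is still reachable by any code running in the same process. The function `make_account` and the number used are hypothetical; `__closure__` and `cell_contents` are standard attributes of Python function objects and closure cells.

```python
def make_account(number):
    # `number` is held only in the closure, not in any object's __dict__.
    def last_digits():
        return str(number)[-2:]
    return last_digits


get_last = make_account(123456)
print(get_last())                               # 56

# Code in the same process can still dig the full value out.
hidden = get_last.__closure__[0].cell_contents
print(hidden)                                   # 123456
```

So no amount of in-language hiding changes the picture: if hostile code is already running inside your interpreter, the data it can reach is not limited by naming conventions or scoping tricks.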

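An attacker cannot attack at the level of individual properties. The running software is a black box to them; whatever goes on inside doesn't matter. If an attacker is in a position to actually access individual memory offsets, or to actually inject code, then it's pretty much game over either way. protected and private don't matter at that point.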