String concatenation with + vs. f-string
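Let's say I have two variables:
>>> a = "hello"
>>> b = "world"
I can concatenate them in two ways; using +:
>>> a + b
'helloworld'
Or using an f-string:
>>> f"{a}{b}"
'helloworld'
Which method is better, or the better practice? I've been told that f-strings are the better practice in terms of both performance and robustness, and I'd like to understand why in detail.
Well, method 2 is a clearer way to express that you want the strings joined together. That's why I'd suggest using it in the case you presented.
The problem with method 1 shows up when you start using different types.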
# integers
a = 1
b = 2
a + b      # 3
f'{a}{b}'  # '12'
# lists
a = [2, 3]
b = [5, 6]
a + b      # [2, 3, 5, 6]
f'{a}{b}'  # '[2, 3][5, 6]'
# dicts
a = {'a': 3}
b = {'a': 5}
a + b      # TypeError
f'{a}{b}'  # "{'a': 3}{'a': 5}"
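When the values aren't strings, the two methods give different results. I can't speak to performance, but unless the line is executed a very large number of times, the difference is negligible.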
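The type examples above can be extended to the mixed case of a str and an int, which is where + most often trips people up. This is a minimal sketch; the variable names and values are made up for illustration.

```python
# Hypothetical example values, not taken from the question.
name = "item"
count = 3

try:
    label = name + count      # str + int is not defined, so this raises
except TypeError as exc:
    label = None
    error = str(exc)

formatted = f"{name}{count}"  # the f-string formats each value on its own
explicit = name + str(count)  # with +, the conversion has to be written out

print(error)
print(formatted, explicit)
```

Both `formatted` and `explicit` end up as the same string; the f-string simply does the conversion for you.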
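There are two aspects to this: performance and convenience.
Using timeit in Python 3.8.0, I found that concatenation with an f-string is consistently slower than +, but the percentage difference is small for longer strings: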
>>> from timeit import timeit
>>> timeit('a + b', setup='a, b = "hello", "world"')
0.059246900000289315
>>> timeit('f"{a}{b}"', setup='a, b = "hello", "world"')
0.06997206999949412
>>> timeit('a + b', setup='a, b = "hello"*100, "world"*100')
0.10218418099975679
>>> timeit('f"{a}{b}"', setup='a, b = "hello"*100, "world"*100')
0.1108272269993904
>>> timeit('a + b', setup='a, b = "hello"*10000, "world"*10000')
2.6094200410007033
>>> timeit('f"{a}{b}"', setup='a, b = "hello"*10000, "world"*10000')
2.7300010479993944
However, f-strings can be a bit more convenient when your inputs aren't already strings:
>>> a, b = [1, 2, 3], True
>>> str(a) + str(b)
'[1, 2, 3]True'
>>> f'{a}{b}'
'[1, 2, 3]True'
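Single timeit calls like the ones above can be noisy from run to run. One common way to steady them, sketched here, is to repeat the measurement with timeit.repeat and keep the minimum. The helper name best_time and the iteration counts are assumptions made for this example.

```python
from timeit import repeat

SETUP = 'a, b = "hello", "world"'

def best_time(stmt, setup=SETUP, number=100_000, repeats=5):
    """Return the fastest total time over `repeats` runs of `number` loops."""
    return min(repeat(stmt, setup=setup, number=number, repeat=repeats))

concat_time = best_time('a + b')
fstring_time = best_time('f"{a}{b}"')

print(f"a + b     : {concat_time:.4f} s")
print(f'f"{{a}}{{b}}" : {fstring_time:.4f} s')
```

The absolute numbers depend on the machine and the Python version, so only the relative ordering is worth comparing.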
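Performance-wise, I expected format string literals to be significantly faster than string concatenation, but I was shocked to find that this isn't the case.
I used the timeit module to test how long format string literals take compared with string concatenation. I tested strings of length 10 to 1 million characters.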
from timeit import timeit
import matplotlib.pyplot as plt
n = 1000000000
setup = """\
a = 'a'*{str_len}
b = 'b'*{str_len}
"""
fstr_stmt = """\
f'{a}{b}'
"""
concat_stmt = """\
a+b
"""
str_lens = [10, 100, 1000, 10000, 100000, 1000000]
fstr_t = []
concat_t = []
for str_len in str_lens:
    n_iters = n//str_len
    fstr_t.append(timeit(setup=setup.format(str_len=str_len), stmt=fstr_stmt, number=n_iters)/n_iters)
    concat_t.append(timeit(setup=setup.format(str_len=str_len), stmt=concat_stmt, number=n_iters)/n_iters)
    ratio = fstr_t[-1]/concat_t[-1]
    print(f"For two strings of length {str_len:7d}, concatenation is {ratio:.5f} times faster than f-strings")
plt.plot(str_lens, fstr_t, "r*-")
plt.plot(str_lens, concat_t, "c*-")
plt.xscale("log")
plt.yscale("log")
plt.xlabel("String length (log scale)")
plt.ylabel("Seconds per iteration (log scale)")
plt.grid()
plt.show()
Console output:
For two strings of length 10, concatenation is 1.06938 times faster than f-strings
For two strings of length 100, concatenation is 1.14887 times faster than f-strings
For two strings of length 1000, concatenation is 1.13994 times faster than f-strings
For two strings of length 10000, concatenation is 1.26934 times faster than f-strings
For two strings of length 100000, concatenation is 1.21585 times faster than f-strings
For two strings of length 1000000, concatenation is 1.01816 times faster than f-strings
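Plot: [figure: seconds per iteration vs. string length, both on log scales, for f-strings and concatenation]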
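Summary:
Using the string concatenation operator is slightly faster than using format string literals. Unless you are performing many hundreds of thousands of string concatenations and need them done very quickly, the implementation you choose is unlikely to make a difference.
From a readability standpoint, f-string literals are more aesthetically pleasing and easier to read than string concatenation. Also, as Daniel's answer points out, f-strings can handle inputs of different types, whereas + requires both objects to be strings (or to overload the __add__ and __radd__ methods).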
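To illustrate the parenthetical about __add__ and __radd__, here is a minimal sketch of a class that can be joined to a str with + from either side. The class Tag is hypothetical and exists only for this example.

```python
class Tag:
    """Hypothetical wrapper type that supports + with strings."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return f"<{self.text}>"

    def __add__(self, other):
        # Handles Tag + other, e.g. tag + "bold"
        return str(self) + str(other)

    def __radd__(self, other):
        # Handles other + Tag, e.g. "bold" + tag
        return str(other) + str(self)


tag = Tag("b")
left = tag + "bold"      # uses Tag.__add__
right = "bold" + tag     # uses Tag.__radd__
both = f"{tag}bold"      # the f-string only needs __str__ (or __format__)

print(left, right, both)
```

Without those two methods, both + expressions would raise TypeError, while the f-string would still work.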
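Edit: As chepner pointed out in their comment, using f-strings is more efficient when more than two strings are involved. For example, adding another variable, c, to the setup and timeit statements yields the following console output: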
For three strings of length 10, concatenation is 0.77931 times faster than f-strings
For three strings of length 100, concatenation is 0.67699 times faster than f-strings
For three strings of length 1000, concatenation is 0.60220 times faster than f-strings
For three strings of length 10000, concatenation is 1.27484 times faster than f-strings
For three strings of length 100000, concatenation is 0.98911 times faster than f-strings
For three strings of length 1000000, concatenation is 0.60201 times faster than f-strings
When there are more strings to concatenate (more than 2, each 1 character long), f-strings perform better:
>>> from timeit import timeit
>>> timeit('a+b', setup='a,b = "h", "e"')
0.05678774899979544
>>> timeit('f"{a}{b}"', setup='a,b = "h", "e"')
0.09656870200024059
>>> timeit('a+b+c', setup='a,b,c = "h", "e", "l"')
0.09475198700010878
>>> timeit('f"{a}{b}{c}"', setup='a,b,c = "h", "e", "l"')
0.08498188300018228
>>> timeit('a+b+c+d', setup='a,b,c,d = "h", "e", "l", "l"')
0.13406166100003247
>>> timeit('f"{a}{b}{c}{d}"', setup='a,b,c,d = "h", "e", "l", "l"')
0.09481844199990519
>>> timeit('a+b+c+d+e', setup='a,b,c,d,e = "h", "e", "l", "l","o"')
0.21804361799991057
>>> timeit('f"{a}{b}{c}{d}{e}"', setup='a,b,c,d,e = "h", "e", "l", "l","o"')
0.11850353900013033
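One plausible explanation for these timings, at least in CPython, is visible in the bytecode: a chain of + runs one binary add per operator and creates an intermediate string each time, while an f-string assembles all its pieces with a single BUILD_STRING. The sketch below counts those instructions; the helper opnames is made up for this example, and exact opcode names vary between CPython versions.

```python
import dis

def opnames(expr):
    """Return the opcode names CPython emits for a single expression."""
    code = compile(expr, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]

fstring_ops = opnames('f"{a}{b}{c}{d}"')
concat_ops = opnames('a + b + c + d')

# One BUILD_STRING joins every piece of the f-string at once.
build_count = fstring_ops.count("BUILD_STRING")
# Each + is its own instruction (BINARY_ADD or BINARY_OP, depending on version).
add_count = sum(name.startswith("BINARY_") for name in concat_ops)

print("f-string BUILD_STRING instructions:", build_count)
print("concatenation binary instructions: ", add_count)
```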
It depends on your needs, but in most cases f-strings are better.
Reasons:
1. Concise and clear
# if you have 2 to join:
a + b
f"{a}{b}"
# if you have 4 to join:
f"this is {a} and {b}, also {c} and {d}"
"this is " + a + " and " + b + ", also " + c + " and " + d
2. Faster and more convenient to use
# example: it allows multi-line
f"""
this{a}, this{b}
and this{c}
"""
# example: no type conversion needed
"this is: " + str(a)
f"this is: {a}"
But, as I said, it depends on your needs.
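As a small illustration of the "concise and clear" point: with +, every space and comma between values has to be written into the neighbouring literal by hand, which is easy to get wrong. The values below are placeholders chosen for this sketch.

```python
# Placeholder values for illustration only.
a, b, c, d = "w", "x", "y", "z"

template = f"this is {a} and {b}, also {c} and {d}"

# Equivalent concatenation: separators must be embedded in each fragment.
concatenated = "this is " + a + " and " + b + ", also " + c + " and " + d

# Forgetting the separators still runs, but the words fuse together.
no_spaces = "this is" + a + "and" + b + "also" + c + "and" + d

print(template)
print(concatenated)
print(no_spaces)
```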