pandas multiindex(层次索引)减列追加结果
pandas multiindex (hierarchical index) subtract columns and append result
我有这段代码运行得很好...
from pandas_datareader import data as pdr
import pandas as pd
myStartDate = '2020-12-15' # (Format: Year-Month-Day)
myEndDate = '2020-12-31' # (Format: Year-Month-Day)
myTickers = 'TSLA'
myData = pdr.get_data_yahoo(myTickers, myStartDate, myEndDate)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 200)
print(myData.head())
结果是:
High Low Open Close Volume Adj Close
Date
20-12-15 646.90 623.79 643.28 633.25 45223600 633.250000
20-12-16 632.50 605.00 628.22 622.77 42095800 622.770020
20-12-17 658.82 619.50 628.19 655.90 56270100 655.900024
20-12-18 695.00 628.53 668.90 695.00 22212620 695.000000
20-12-21 668.50 646.07 666.23 649.85 58045300 649.859985
那么,我有这个:
myData['Some New Column Name'] = myData['High'] - myData['Low']
print(myData.head())
而且效果很好,结果是这样的:
High Low Open Close Volume Adj Close Some New Column Name
Date
20-12-15 646.90 623.79 643.28 633.25 45223600 633.25 23.10
20-12-16 632.50 605.00 628.22 622.77 42095800 622.77 27.50
20-12-17 658.82 619.50 628.19 655.90 56270100 655.90 39.32
20-12-18 695.00 628.53 668.90 695.00 22212620 695.00 66.46
20-12-21 668.50 646.07 666.23 649.85 58045300 649.85 22.42
我想修改代码以像这样执行多个“myTickers”:
from pandas_datareader import data as pdr
import pandas as pd
myStartDate = '2020-12-15' # (Format: Year-Month-Day)
myEndDate = '2020-12-31' # (Format: Year-Month-Day)
myTickers = ['TSLA', 'SQ']
myData = pdr.get_data_yahoo(myTickers, myStartDate, myEndDate)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 200)
print(myData.head())
效果很好,结果是:
Attributes Adj Close Close High Low Open Volume
Symbols TSLA SQ TSLA SQ TSLA SQ TSLA SQ TSLA SQ TSLA SQ
Date
12/15/20 633.25 219.99 633.25 219.99 646.90 221.72 623.80 216.73 643.28 218.41 45223600 5713600
12/16/20 622.77 227.08 622.77 227.08 632.50 227.96 605.00 220.03 628.23 223.86 42095800 8216500
12/17/20 655.90 230.74 655.90 230.74 658.82 237.09 619.50 227.70 628.19 230.00 56270100 10440100
12/18/20 695.00 235.45 695.00 235.45 695.00 236.37 628.54 231.10 668.90 235.00 22212620 8099500
12/21/20 649.86 233.50 649.86 233.50 668.50 241.85 646.07 232.26 666.24 236.09 58045300 1115540
与第一个例子类似,我想做的是:(特斯拉的高点)-(特斯拉的低点)=(一些新的列名)和(SQ 的高点)-(SQ 的低点)=(一些新的列名)但是我对如何执行此操作 pandas 不是很有经验。我的理解是这是一个多索引或(分层索引)pandas 数据框。阅读一些文档,我认为这样的事情可能会起作用:
myData['Some New Column Name']['TSLA', 'SQ'] = myData['High'] - myData['Low']
或
myData[['Some New Column Name'],['TSLA', 'SQ']] = myData['High'] - myData['Low']
但这行不通...我觉得可能很接近?
明确地说,我想要这样的东西:
Attributes Adj Close Close ... Open Volume Some New Col
Symbols TSLA SQ TSLA SQ ... TSLA SQ TSLA SQ TSLA SQ
Date ...
12/15/20 633.25 219.99 633.25 219.99 ... 643.28 218.41 45223600 5713600 23.10 4.99
12/16/20 622.77 227.08 622.77 227.08 ... 628.23 223.86 42095800 8216500 27.50 7.93
12/17/20 655.90 230.74 655.90 230.74 ... 628.19 230.00 56270100 1044010 39.32 9.39
12/18/20 695.00 235.45 695.00 235.45 ... 668.90 235.00 22212620 8099500 66.46 5.27
12/21/20 649.86 233.50 649.86 233.50 ... 666.24 236.09 58045300 1115540 22.43 9.59
我刚试过:
myNewData = myData['High'] - myData['Low']
结果是这样的:
Symbols TSLA SQ
Date
2020-12-15 23.100037 4.990005
2020-12-16 27.500000 7.930008
2020-12-17 39.320007 9.389999
2020-12-18 66.460022 5.269989
2020-12-21 22.429993 9.590012
我可能可以使用这个...但是...最好以某种干净的方式将其附加到 myData
我觉得你们很接近...
myData[[('Some new column', c) for c in myData['High'].columns]] = myData['High'] - myData['Low']
我有这段代码运行得很好...
from pandas_datareader import data as pdr
import pandas as pd
myStartDate = '2020-12-15' # (Format: Year-Month-Day)
myEndDate = '2020-12-31' # (Format: Year-Month-Day)
myTickers = 'TSLA'
myData = pdr.get_data_yahoo(myTickers, myStartDate, myEndDate)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 200)
print(myData.head())
结果是:
High Low Open Close Volume Adj Close
Date
20-12-15 646.90 623.79 643.28 633.25 45223600 633.250000
20-12-16 632.50 605.00 628.22 622.77 42095800 622.770020
20-12-17 658.82 619.50 628.19 655.90 56270100 655.900024
20-12-18 695.00 628.53 668.90 695.00 22212620 695.000000
20-12-21 668.50 646.07 666.23 649.85 58045300 649.859985
那么,我有这个:
myData['Some New Column Name'] = myData['High'] - myData['Low']
print(myData.head())
而且效果很好,结果是这样的:
High Low Open Close Volume Adj Close Some New Column Name
Date
20-12-15 646.90 623.79 643.28 633.25 45223600 633.25 23.10
20-12-16 632.50 605.00 628.22 622.77 42095800 622.77 27.50
20-12-17 658.82 619.50 628.19 655.90 56270100 655.90 39.32
20-12-18 695.00 628.53 668.90 695.00 22212620 695.00 66.46
20-12-21 668.50 646.07 666.23 649.85 58045300 649.85 22.42
我想修改代码以像这样执行多个“myTickers”:
from pandas_datareader import data as pdr
import pandas as pd
myStartDate = '2020-12-15' # (Format: Year-Month-Day)
myEndDate = '2020-12-31' # (Format: Year-Month-Day)
myTickers = ['TSLA', 'SQ']
myData = pdr.get_data_yahoo(myTickers, myStartDate, myEndDate)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 200)
print(myData.head())
效果很好,结果是:
Attributes Adj Close Close High Low Open Volume
Symbols TSLA SQ TSLA SQ TSLA SQ TSLA SQ TSLA SQ TSLA SQ
Date
12/15/20 633.25 219.99 633.25 219.99 646.90 221.72 623.80 216.73 643.28 218.41 45223600 5713600
12/16/20 622.77 227.08 622.77 227.08 632.50 227.96 605.00 220.03 628.23 223.86 42095800 8216500
12/17/20 655.90 230.74 655.90 230.74 658.82 237.09 619.50 227.70 628.19 230.00 56270100 10440100
12/18/20 695.00 235.45 695.00 235.45 695.00 236.37 628.54 231.10 668.90 235.00 22212620 8099500
12/21/20 649.86 233.50 649.86 233.50 668.50 241.85 646.07 232.26 666.24 236.09 58045300 1115540
与第一个例子类似,我想做的是:(特斯拉的高点)-(特斯拉的低点)=(一些新的列名)和(SQ 的高点)-(SQ 的低点)=(一些新的列名)但是我对如何执行此操作 pandas 不是很有经验。我的理解是这是一个多索引或(分层索引)pandas 数据框。阅读一些文档,我认为这样的事情可能会起作用:
myData['Some New Column Name']['TSLA', 'SQ'] = myData['High'] - myData['Low']
或
myData[['Some New Column Name'],['TSLA', 'SQ']] = myData['High'] - myData['Low']
但这行不通...我觉得可能很接近?
明确地说,我想要这样的东西:
Attributes Adj Close Close ... Open Volume Some New Col
Symbols TSLA SQ TSLA SQ ... TSLA SQ TSLA SQ TSLA SQ
Date ...
12/15/20 633.25 219.99 633.25 219.99 ... 643.28 218.41 45223600 5713600 23.10 4.99
12/16/20 622.77 227.08 622.77 227.08 ... 628.23 223.86 42095800 8216500 27.50 7.93
12/17/20 655.90 230.74 655.90 230.74 ... 628.19 230.00 56270100 1044010 39.32 9.39
12/18/20 695.00 235.45 695.00 235.45 ... 668.90 235.00 22212620 8099500 66.46 5.27
12/21/20 649.86 233.50 649.86 233.50 ... 666.24 236.09 58045300 1115540 22.43 9.59
我刚试过:
myNewData = myData['High'] - myData['Low']
结果是这样的:
Symbols TSLA SQ
Date
2020-12-15 23.100037 4.990005
2020-12-16 27.500000 7.930008
2020-12-17 39.320007 9.389999
2020-12-18 66.460022 5.269989
2020-12-21 22.429993 9.590012
我可能可以使用这个...但是...最好以某种干净的方式将其附加到 myData
我觉得你们很接近...
myData[[('Some new column', c) for c in myData['High'].columns]] = myData['High'] - myData['Low']