How can I understand the semantic meaning of the different values?
I want to get Apple's financial data. From https://www.sec.gov/dera/data/financial-statement-and-notes-data-set.html, download https://www.sec.gov/files/dera/data/financial-statement-and-notes-data-sets/2022_01_notes.zip, unzip it, and put the contents in /tmp/2022_01_notes. That gives you the tables sub and num; the field definitions are in https://www.sec.gov/files/aqfsn_1.pdf.
I computed the MD5 message digest of the zip file:
md5sum 2022_01_notes.zip
b1cdf638200991e1bbe260489093bf67 2022_01_notes.zip
You can download it either from the official site or from my Dropbox:
https://www.dropbox.com/s/5ntwasipze8vr29/2022_01_notes.zip?dl=0
Wherever you download it from, please check the md5sum value. The SEC may have uploaded a faulty file and may replace the zip file later.
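The same check can be done from Python if md5sum is not available. This is a minimal sketch using only the standard library; the function names md5_of_file and matches_expected are made up for illustration, and the expected digest is simply the one reported above.

```python
import hashlib

# Digest reported above for 2022_01_notes.zip
EXPECTED_MD5 = "b1cdf638200991e1bbe260489093bf67"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

Calling matches_expected('2022_01_notes.zip') on the downloaded archive should then tell you whether you have the same file I used.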
import pandas as pd
df_sub = pd.read_csv('/tmp/2022_01_notes/sub.tsv', sep='\t')
# Apple's CIK is 320193
df_sub[df_sub['cik'] == 320193]
adsh cik name sic countryba stprba cityba ... instance nciks aciks pubfloatusd floatdate floataxis floatmems
4329 0000320193-22-000006 320193 APPLE INC 3571.0 US CA CUPERTINO ... aapl-20220127_htm.xml 1 NaN NaN NaN NaN NaN
4731 0000320193-22-000007 320193 APPLE INC 3571.0 US CA CUPERTINO ... aapl-20211225_htm.xml 1 NaN NaN NaN NaN NaN
0000320193-22-000007 is the accession number of its 10-Q for the quarter ended December 25, 2021 (Apple's fiscal 2022 Q1).
df_num = pd.read_csv('/tmp/2022_01_notes/num.tsv', sep='\t')
# Get all of Apple's financial data for this filing, expressed as XBRL concepts
df_apple = df_num[df_num['adsh'] == '0000320193-22-000007']
# Extract a single concept: RevenueFromContractWithCustomerExcludingAssessedTax.
# This is the XBRL taxonomy concept that revenue is mapped to.
df_apple_revenue = df_apple[df_apple['tag'] == 'RevenueFromContractWithCustomerExcludingAssessedTax']
# Keep only the rows dated 20201231
df_apple_revenue_2021 = df_apple_revenue[df_apple_revenue['ddate'] == 20201231]
df_apple_revenue_2021
The dataframe is too wide to display in my terminal, so I wrote it to a CSV file:
df_apple_revenue_2021.to_csv('/tmp/apple_revenue_2021.csv')
I then opened it in Excel and pasted the contents here.
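As an aside, a wide frame can also be read in the terminal without going through Excel by transposing a few rows so that each field gets its own line. The sketch below is only one way to do it; render_rows is a made-up helper name, and the sample frame contains just four of the num columns, filled with the values from the two rows in question.

```python
import pandas as pd


def render_rows(df, n=2):
    """Render the first n rows transposed: one field per line, no column elision."""
    return df.head(n).T.to_string()


# Small stand-in for df_apple_revenue_2021 (only a subset of the real columns)
sample = pd.DataFrame({
    "adsh": ["0000320193-22-000007"] * 2,
    "tag": ["RevenueFromContractWithCustomerExcludingAssessedTax"] * 2,
    "ddate": [20201231, 20201231],
    "value": [8285000000, 15761000000],
})

print(render_rows(sample))
```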
What do the values in the first two rows, 8285000000 and 15761000000, mean? Please give a reasonable description of 8285000000 and 15761000000.
0000320193-22-000007 RevenueFromContractWithCustomerExcludingAssessedTax us-gaap/2021 20201231 1 USD 0xf159835fd3644f228d15724ad9d1837c 0 8285000000 0 1 0.013698995 5 -6
0000320193-22-000007 RevenueFromContractWithCustomerExcludingAssessedTax us-gaap/2021 20201231 1 USD 0x58c22680ab8dbbfb662ff4e14055c1bd 1 15761000000 0 1 0.013698995 5 -6
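To interpret these numbers, you have to go back to the filing they were extracted from. In this case, the filing with accession number 0000320193-22-000007 is Form 10-Q For the Fiscal Quarter Ended December 25, 2021. If you look at that filing, you will find seven of the value numbers from your dataframe in the table Net sales by reportable segment, specifically in the column Three Months Ended December 26, 2020.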
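So, for example, 8285000000 refers to the Japan segment for that period, while 15761000000 refers to the Services category in the table Net sales by category for the same reporting period. That table contains another six of the value numbers in your dataframe.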
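If you would rather not match the numbers against the filing by hand, the dimh column is probably the hook. As far as I can tell from the field definitions, the same archive ships a dim.tsv whose dimhash column corresponds to dimh in num.tsv and whose segments column names the axis members. Treat those column names as an assumption to check against aqfsn_1.pdf. The sketch below joins the two tables and also rescales value to millions, which is how the 10-Q tables present it. The helper name attach_dimensions is made up, and the segments strings are placeholders standing in for whatever dim.tsv really contains.

```python
import pandas as pd


def attach_dimensions(df_num, df_dim):
    """Left-join num rows to dim rows on the dimension hash.

    Assumes df_dim has columns `dimhash` and `segments` (unverified here).
    """
    merged = df_num.merge(df_dim, how="left", left_on="dimh", right_on="dimhash")
    # The 10-Q reports dollars in millions, so rescale for easier comparison
    merged["value_millions"] = merged["value"] / 1_000_000
    return merged[["tag", "ddate", "qtrs", "segments", "value_millions"]]


# The two rows from the question (subset of columns)
num_rows = pd.DataFrame({
    "tag": ["RevenueFromContractWithCustomerExcludingAssessedTax"] * 2,
    "ddate": [20201231, 20201231],
    "qtrs": [1, 1],
    "dimh": ["0xf159835fd3644f228d15724ad9d1837c",
             "0x58c22680ab8dbbfb662ff4e14055c1bd"],
    "value": [8285000000, 15761000000],
})

# Placeholder dim rows: the real `segments` text must come from dim.tsv
dim_rows = pd.DataFrame({
    "dimhash": ["0xf159835fd3644f228d15724ad9d1837c",
                "0x58c22680ab8dbbfb662ff4e14055c1bd"],
    "segments": ["<placeholder: Japan segment>",
                 "<placeholder: Services category>"],
})

print(attach_dimensions(num_rows, dim_rows))
```

With the real files you would pass df_apple_revenue_2021 and a frame read from dim.tsv instead of the two stand-ins. A left join keeps any num row whose hash has no match, so missing dimensions show up as NaN rather than silently disappearing.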