将不一致的数据集从列转置到行 csv 或 excel
Transpose Inconsistent data set from columns to rows csv or excel
您好,我有一个包含大约 380k 行和树列的 csv 文件。
A 列 - 我有密钥,它针对每个 trim 版本重复,并且每个 trim 版本每次都以品牌一词开头
B 列 - 规格
C 列 - trim ID,每个 trim 版本
的编号相同
我的问题是我的数据范围不一致,一些trim版本最多有55行数据,而其他版本只有5-6
(A)KEY (B)VALUE (C)TRIM ID
====== ======== ==========
1. Brand Mitsubishi 20001
2. Model ASX 20001
3. Trim ASX (facelift 2012) 20001
4. Engine 1.8 DI-D (114 Hp) 20001
5. Doors 5 20001
6. Power 114 hp 20001
7. Maximum speed 189 km/h 20001
8. From 0 to 100 km/h 10.2 sec 20001
9. Fuel tank volume 63 l 20001
10. Year into production 2012 20001
11. Seats 5 20001
1. Brand BMW 20015
2. Model M4 20015
3. Trim M4 (F83) 20015
4. Engine 3.0 (431 Hp) DCT 20015
5. Power 431 hp 20015
1. Brand AUDI 25003
2. Model A4 25003
3. Trim 1.9TDI AVANT SLINE 25003
4. Power 131 hp 25003
我想将数据转换为每个 TRIM 版本的一行并匹配数据。例如,每次找到品牌时,都会用数据开始新行,其余数据与列名品牌、型号……座位等相匹配。
像这样:
Brand Model Generation Engine Doors Power Maximum speed Seats Length
===== ===== ========= ===== ===== ===== ============= ===== ======
AUDI A4 2.0T SLINE 2.0T 5 210 220 4 4520
BMW M3 330 280 4
HONDA CIVIC 1.6i 4 160 4
我试图用函数解决这个问题,但我想我需要 vba 脚本,但我不擅长。请帮助我。
我认为这可以通过一个支点轻松完成 table。只需将您的 csv 数据导入 excel 并将其转换为数据透视表 table.
好吧,我做到了(以及您提供的数据样本)
Sub createDataTable()
Dim r
Dim c
Dim i
Dim rng As Range
Dim newSht As Worksheet
Dim dataSht As Worksheet
Dim j 'the counter for the rows of the table
Dim colName As Range
Dim theAddress
Set dataSht = Sheets("Data")
dataSht.Activate
r = Range("A1").End(xlDown).Row 'take the last row of the data
c = Range("A1").End(xlToRight).Column 'Take the last columns of the data
Set rng = Range(Cells(2, 1), Cells(r, 1)) 'Store the column 1=A of the data
Sheets.Add After:=Sheets(Sheets.Count) 'Add a new sheet
Set newSht = ActiveSheet 'Store the new sheet int the var
newSht.Name = myTime 'Rename the new sheet with the function
j = 1
dataSht.Activate
rng.Copy
Range("H1").PasteSpecial xlPasteAll
Application.CutCopyMode = False
ActiveSheet.Range("$H:$H").RemoveDuplicates Columns:=1, Header:=xlNo
Range("H1", Selection.End(xlDown)).Copy
newSht.Activate
Range("A1").PasteSpecial Paste:=xlPasteAll, Transpose:=True
Rows("1:1").Font.Bold = True
dataSht.Activate 'Go to the new sheet (just in case)
Range("H1", Selection.End(xlDown)).ClearContents
For Each i In rng 'Here comes the magic
If i.Value = "Brand" Then 'If is a Brand set a new row
j = j + 1
End If
newSht.Activate 'Lets go to the new sheet
With newSht.Range("A1:BZ1") 'With the headers...
Set colName = .Find(i.Value, LookIn:=xlValues) '...Find the header of the column in that range
If Not colName Is Nothing Then 'If colName has something then
theAddress = colName.Address 'Put the address of the address just for reference
Do 'and inner loop
Range(Cells(j, colName.Column), Cells(j, colName.Column)).Value = i.Offset(0, 1).Value
'put the value of the field inside the cell below the right header in the right row
'col header = colName.column
'right row = j
'Set colName = .FindNext(colName) 'this is not necesary, because the header are unique
Loop While Not colName Is Nothing And colName.Address <> theAddress
End If
End With
dataSht.Activate
Next i
End Sub
编辑 #1
此代码创建 headers
Sub createDataTable()
Dim r
Dim c
Dim i
Dim rng As Range
Dim newSht As Worksheet
Dim dataSht As Worksheet
Dim j 'the counter for the rows of the table
Dim colName As Range
Dim theAddress
Set dataSht = Sheets("Data")
dataSht.Activate
'to create headers
Rows("1:1").Insert Shift:=xlDown
Range("A1").FormulaR1C1 = "Key"
Range("B1").FormulaR1C1 = "Value"
Range("C1").FormulaR1C1 = "Trim"
Rows("1:1").Font.Bold = True
r = Range("A1").End(xlDown).Row 'take the last row of the data
c = Range("A1").End(xlToRight).Column 'Take the last columns of the data
Set rng = Range(Cells(2, 1), Cells(r, 1)) 'Store the column 1=A of the data
Sheets.Add After:=Sheets(Sheets.Count) 'Add a new sheet
Set newSht = ActiveSheet 'Store the new sheet int the var
newSht.Name = myTime 'Rename the new sheet with the function
j = 1
dataSht.Activate
rng.Copy
Range("H1").PasteSpecial xlPasteAll
Application.CutCopyMode = False
ActiveSheet.Range("$H:$H").RemoveDuplicates Columns:=1, Header:=xlNo
Range("H1", Selection.End(xlDown)).Copy
newSht.Activate
Range("A1").PasteSpecial Paste:=xlPasteAll, Transpose:=True
Rows("1:1").Font.Bold = True
dataSht.Activate 'Go to the new sheet (just in case)
Range("H1", Selection.End(xlDown)).ClearContents
For Each i In rng 'Here comes the magic
If i.Value = "Brand" Then 'If is a Brand set a new row
j = j + 1
End If
newSht.Activate 'Lets go to the new sheet
With newSht.Range("A1:BZ1") 'With the headers...
Set colName = .Find(i.Value, LookIn:=xlValues) '...Find the header of the column in that range
If Not colName Is Nothing Then 'If colName has something then
theAddress = colName.Address 'Put the address of the address just for reference
Do 'and inner loop
Range(Cells(j, colName.Column), Cells(j, colName.Column)).Value = i.Offset(0, 1).Value
'put the value of the field inside the cell below the right header in the right row
'col header = colName.column
'right row = j
'Set colName = .FindNext(colName) 'this is not necesary, because the header are unique
Loop While Not colName Is Nothing And colName.Address <> theAddress
End If
End With
dataSht.Activate
Next i
End Sub
正如我在屏幕截图中看到的那样,您将代码放入工作sheet,这将return 错误 1004
因为你不能 "manipulate" 来自 sheet 的另一个 sheet。如果您 need/want 这样做,您需要在一个模块内进行,然后从该模块调用该过程。
在这种情况下,您需要添加一个新模块
在 VBA
中选择了工作簿
Insert >>> Module
一个新模块将在您的项目中,并在该模块中添加来自 Edit #2 的过程,然后 运行 它, F5
.
如果您需要改进,请告诉我。
编辑 #2
很高兴能为您提供帮助...您遇到的错误是因为我确实向您发送了自定义函数...抱歉...开始!
Function myTime() As String
Dim HH
Dim MM
Dim SS
Dim TT
HH = Hour(Now)
MM = Minute(Now)
SS = Second(Now)
myTime = Format(HH, "00") & Format(MM, "00") & Format(SS, "00")
End Function
将此函数放在与所有代码相同的模块中。
我会回答我的问题,因为我找到了一个名为 OpenRefine 的完美而强大的解决方案,它是一个前 google 项目(Google Refine)。
由于我的数据集现在超过一百万行,这是最快和最好的解决方案(比 excel 好得多)。
您好,我有一个包含大约 380k 行和树列的 csv 文件。 A 列 - 我有密钥,它针对每个 trim 版本重复,并且每个 trim 版本每次都以品牌一词开头 B 列 - 规格 C 列 - trim ID,每个 trim 版本
的编号相同我的问题是我的数据范围不一致,一些trim版本最多有55行数据,而其他版本只有5-6
(A)KEY (B)VALUE (C)TRIM ID
====== ======== ==========
1. Brand Mitsubishi 20001
2. Model ASX 20001
3. Trim ASX (facelift 2012) 20001
4. Engine 1.8 DI-D (114 Hp) 20001
5. Doors 5 20001
6. Power 114 hp 20001
7. Maximum speed 189 km/h 20001
8. From 0 to 100 km/h 10.2 sec 20001
9. Fuel tank volume 63 l 20001
10. Year into production 2012 20001
11. Seats 5 20001
1. Brand BMW 20015
2. Model M4 20015
3. Trim M4 (F83) 20015
4. Engine 3.0 (431 Hp) DCT 20015
5. Power 431 hp 20015
1. Brand AUDI 25003
2. Model A4 25003
3. Trim 1.9TDI AVANT SLINE 25003
4. Power 131 hp 25003
我想将数据转换为每个 TRIM 版本的一行并匹配数据。例如,每次找到品牌时,都会用数据开始新行,其余数据与列名品牌、型号……座位等相匹配。
像这样:
Brand Model Generation Engine Doors Power Maximum speed Seats Length
===== ===== ========= ===== ===== ===== ============= ===== ======
AUDI A4 2.0T SLINE 2.0T 5 210 220 4 4520
BMW M3 330 280 4
HONDA CIVIC 1.6i 4 160 4
我试图用函数解决这个问题,但我想我需要 vba 脚本,但我不擅长。请帮助我。
我认为这可以通过一个支点轻松完成 table。只需将您的 csv 数据导入 excel 并将其转换为数据透视表 table.
好吧,我做到了(以及您提供的数据样本)
Sub createDataTable()
Dim r
Dim c
Dim i
Dim rng As Range
Dim newSht As Worksheet
Dim dataSht As Worksheet
Dim j 'the counter for the rows of the table
Dim colName As Range
Dim theAddress
Set dataSht = Sheets("Data")
dataSht.Activate
r = Range("A1").End(xlDown).Row 'take the last row of the data
c = Range("A1").End(xlToRight).Column 'Take the last columns of the data
Set rng = Range(Cells(2, 1), Cells(r, 1)) 'Store the column 1=A of the data
Sheets.Add After:=Sheets(Sheets.Count) 'Add a new sheet
Set newSht = ActiveSheet 'Store the new sheet int the var
newSht.Name = myTime 'Rename the new sheet with the function
j = 1
dataSht.Activate
rng.Copy
Range("H1").PasteSpecial xlPasteAll
Application.CutCopyMode = False
ActiveSheet.Range("$H:$H").RemoveDuplicates Columns:=1, Header:=xlNo
Range("H1", Selection.End(xlDown)).Copy
newSht.Activate
Range("A1").PasteSpecial Paste:=xlPasteAll, Transpose:=True
Rows("1:1").Font.Bold = True
dataSht.Activate 'Go to the new sheet (just in case)
Range("H1", Selection.End(xlDown)).ClearContents
For Each i In rng 'Here comes the magic
If i.Value = "Brand" Then 'If is a Brand set a new row
j = j + 1
End If
newSht.Activate 'Lets go to the new sheet
With newSht.Range("A1:BZ1") 'With the headers...
Set colName = .Find(i.Value, LookIn:=xlValues) '...Find the header of the column in that range
If Not colName Is Nothing Then 'If colName has something then
theAddress = colName.Address 'Put the address of the address just for reference
Do 'and inner loop
Range(Cells(j, colName.Column), Cells(j, colName.Column)).Value = i.Offset(0, 1).Value
'put the value of the field inside the cell below the right header in the right row
'col header = colName.column
'right row = j
'Set colName = .FindNext(colName) 'this is not necesary, because the header are unique
Loop While Not colName Is Nothing And colName.Address <> theAddress
End If
End With
dataSht.Activate
Next i
End Sub
编辑 #1
此代码创建 headers
Sub createDataTable()
Dim r
Dim c
Dim i
Dim rng As Range
Dim newSht As Worksheet
Dim dataSht As Worksheet
Dim j 'the counter for the rows of the table
Dim colName As Range
Dim theAddress
Set dataSht = Sheets("Data")
dataSht.Activate
'to create headers
Rows("1:1").Insert Shift:=xlDown
Range("A1").FormulaR1C1 = "Key"
Range("B1").FormulaR1C1 = "Value"
Range("C1").FormulaR1C1 = "Trim"
Rows("1:1").Font.Bold = True
r = Range("A1").End(xlDown).Row 'take the last row of the data
c = Range("A1").End(xlToRight).Column 'Take the last columns of the data
Set rng = Range(Cells(2, 1), Cells(r, 1)) 'Store the column 1=A of the data
Sheets.Add After:=Sheets(Sheets.Count) 'Add a new sheet
Set newSht = ActiveSheet 'Store the new sheet int the var
newSht.Name = myTime 'Rename the new sheet with the function
j = 1
dataSht.Activate
rng.Copy
Range("H1").PasteSpecial xlPasteAll
Application.CutCopyMode = False
ActiveSheet.Range("$H:$H").RemoveDuplicates Columns:=1, Header:=xlNo
Range("H1", Selection.End(xlDown)).Copy
newSht.Activate
Range("A1").PasteSpecial Paste:=xlPasteAll, Transpose:=True
Rows("1:1").Font.Bold = True
dataSht.Activate 'Go to the new sheet (just in case)
Range("H1", Selection.End(xlDown)).ClearContents
For Each i In rng 'Here comes the magic
If i.Value = "Brand" Then 'If is a Brand set a new row
j = j + 1
End If
newSht.Activate 'Lets go to the new sheet
With newSht.Range("A1:BZ1") 'With the headers...
Set colName = .Find(i.Value, LookIn:=xlValues) '...Find the header of the column in that range
If Not colName Is Nothing Then 'If colName has something then
theAddress = colName.Address 'Put the address of the address just for reference
Do 'and inner loop
Range(Cells(j, colName.Column), Cells(j, colName.Column)).Value = i.Offset(0, 1).Value
'put the value of the field inside the cell below the right header in the right row
'col header = colName.column
'right row = j
'Set colName = .FindNext(colName) 'this is not necesary, because the header are unique
Loop While Not colName Is Nothing And colName.Address <> theAddress
End If
End With
dataSht.Activate
Next i
End Sub
正如我在屏幕截图中看到的那样,您将代码放入工作sheet,这将return 错误 1004
因为你不能 "manipulate" 来自 sheet 的另一个 sheet。如果您 need/want 这样做,您需要在一个模块内进行,然后从该模块调用该过程。
在这种情况下,您需要添加一个新模块 在 VBA
中选择了工作簿Insert >>> Module
一个新模块将在您的项目中,并在该模块中添加来自 Edit #2 的过程,然后 运行 它, F5
.
如果您需要改进,请告诉我。
编辑 #2
很高兴能为您提供帮助...您遇到的错误是因为我确实向您发送了自定义函数...抱歉...开始!
Function myTime() As String
Dim HH
Dim MM
Dim SS
Dim TT
HH = Hour(Now)
MM = Minute(Now)
SS = Second(Now)
myTime = Format(HH, "00") & Format(MM, "00") & Format(SS, "00")
End Function
将此函数放在与所有代码相同的模块中。
我会回答我的问题,因为我找到了一个名为 OpenRefine 的完美而强大的解决方案,它是一个前 google 项目(Google Refine)。
由于我的数据集现在超过一百万行,这是最快和最好的解决方案(比 excel 好得多)。