Removing commas from a CSV so it can be stored in a SQL DB correctly
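I am importing a CSV file in which one of the address cells contains commas. The import naturally tries to put each comma-separated value into a new cell in the row, which of course fails. I have looked for a solution but have not found one that fits my current import method.
CSV import code: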
protected void Upload(object sender, EventArgs e)
{
    //Upload and save the file
    string csvPath = Server.MapPath("~/Files/") + Path.GetFileName(FileUpload1.PostedFile.FileName);
    FileUpload1.SaveAs(csvPath);
    DataTable dt = new DataTable();
    dt.Columns.AddRange(new DataColumn[13] { new DataColumn("ID", typeof(int)),
        new DataColumn("ZooplaURL", typeof(string)),
        new DataColumn("Branch", typeof(string)),
        new DataColumn("HouseNumber", typeof(string)),
        new DataColumn("PropAddress", typeof(string)),
        new DataColumn("Town", typeof(string)),
        new DataColumn("County", typeof(string)),
        new DataColumn("Postcode", typeof(string)),
        new DataColumn("Price", typeof(string)),
        new DataColumn("PropType", typeof(string)),
        new DataColumn("Beds", typeof(string)),
        new DataColumn("PropStatus", typeof(string)),
        new DataColumn("Weeks", typeof(string)) });
    string csvData = File.ReadAllText(csvPath);
    foreach (string row in csvData.Split('\n'))
    {
        if (!string.IsNullOrEmpty(row))
        {
            dt.Rows.Add();
            int i = 0;
            foreach (string cell in row.Split(','))
            {
                dt.Rows[dt.Rows.Count - 1][i] = cell;
                i++;
            }
        }
    }
    string consString = ConfigurationManager.ConnectionStrings["TortoiseDBConnectionString"].ConnectionString;
    using (SqlConnection con = new SqlConnection(consString))
    {
        using (SqlBulkCopy sqlBulkCopy = new SqlBulkCopy(con))
        {
            //Set the database table name
            sqlBulkCopy.DestinationTableName = "dbo.Zoopla";
            con.Open();
            sqlBulkCopy.WriteToServer(dt);
            con.Close();
        }
    }
}
The CSV looks like this:
38041001,http://www.zoopla.co.uk/for-sale/details/38041001,Connells – Hampton, PE7,,New Lakeside,Peterborough,,PE7 8HU,215000,Detached,4,For Sale,0
38040800,http://www.zoopla.co.uk/for-sale/details/38040800,Peter Lane, PE1,,Rothbart Way,Peterborough,,PE7 8DZ,300000,Detached,5,For Sale,0
38025706,http://www.zoopla.co.uk/for-sale/details/38025706,Connells – Hampton, PE7,,Hornbeam Road,Peterborough,,PE7 8FY,190000,Semi-detached,3,For Sale,0
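You need to escape the commas inside fields in the CSV file with some character (for example "\").
Then, when you split each line, use a regex that splits only on commas that are not preceded by the escape character ("\").
Edit: this regex, written as a C# verbatim string, matches any comma that is not preceded by a "\" character:
@"(?<!\\),"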
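A minimal sketch of this escape-and-split approach in C#, assuming whoever produces the file writes "\," for a comma inside a field. It uses a negative lookbehind so that the character before the comma is not consumed by the match. The names EscapedCsv and SplitLine are hypothetical, and a field that ends in a literal backslash is not handled.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapedCsv
{
    // Matches a comma that is NOT immediately preceded by a backslash.
    private static readonly Regex UnescapedComma = new Regex(@"(?<!\\),");

    // Splits one line on unescaped commas, then turns each "\," back into ",".
    public static string[] SplitLine(string line)
    {
        return UnescapedComma
            .Split(line.TrimEnd('\r'))                   // drop a trailing CR left by Split('\n')
            .Select(field => field.Replace(@"\,", ","))  // unescape commas inside the field
            .ToArray();
    }
}
```

In the import loop, row.Split(',') would be replaced by EscapedCsv.SplitLine(row), so a branch written as "Peter Lane\, PE1" stays in a single cell.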
I highly recommend using CsvHelper (NuGet / GitHub) and letting it do the parsing for you. Then you can do something like this:
using (StreamReader sr = new StreamReader(File.OpenRead(csvPath)))
{
    // Newer CsvHelper versions also require a CultureInfo argument here.
    using (CsvReader csv = new CsvReader(sr))
    {
        while (csv.Read()) // read a CSV line
        {
            string town = null;
            if (csv.TryGetField<string>(7, out town) &&
                !string.IsNullOrWhiteSpace(town))
            {
                // This replaces the comma with a space.
                town = town.Replace(",", " ");
            }
        }
    }
}
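A CSV parser can only keep a comma inside a field if that field is quoted in the file, and the sample rows in the question are not quoted. The following is a hand-written sketch of the quoting rule such a parser applies, not CsvHelper's own code; QuotedCsv and ParseLine are hypothetical names, and it assumes the export can be changed to wrap comma-containing fields in double quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuotedCsv
{
    // Splits one CSV line, treating commas inside double quotes as data.
    // A doubled quote ("") inside a quoted field stands for one literal quote.
    public static List<string> ParseLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // escaped quote inside a quoted field
                    i++;                 // skip the second quote of the pair
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);   // ordinary character, commas included
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // field boundary
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field on the line
        return fields;
    }
}
```

With a row such as 38040800,"Peter Lane, PE1",,Rothbart Way the branch comes back as one field, so each row yields the 13 values the DataTable expects.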