使用 exceldatareader 从 excel 文件中获取数据并将其粘贴到 xml 文件中?
Using exceldatareader to take data from excel file and paste it in a xml file?
我有一些 xml 格式的文件
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD with OASIS Tables v1.0 20120330//EN" "JATS-journalpublishing-oasis-article1.dtd">
<article article-type="proceedings" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://www.niso.org/standards/z39-96/ns/oasis-exchange/table">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id"/>
<journal-title-group>
<journal-title>Eleventh International Conference on Correlation Optics</journal-title>
</journal-title-group>
<issn pub-type="epub">0277-786X</issn>
<publisher>
<publisher-name>SPIE</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">@@</article-id>
<title-group>
<article-title>Comparison of absorption spectra of adenocarcinoma and squamous cell carcinoma cervical tissue</article-title>
</title-group>
<pub-date>
<year>2013</year>
</pub-date>
<volume>9066</volume>
<issue>90661T</issue>
<isbn content-type="conf-proc">978-1-5106-0374-5</isbn>
<fpage>90661T-1</fpage>
<lpage>90661T-7</lpage>
</article-meta>
</front>
我有如下所示的 xlsx 文件
Filename Pages DOI Publisher Year Price
01022018_1020 14 10.1111/12.2033558 Springer 2017 .13
01022018_1021 7 10.1111/12.2033559 Elsevier 2018 .25
01022018_1130 30 10.1113/20.3023658 Elsevier 2017 .25
01022018_1301 10 10.1113/20.3023777 Wiley 2015 .50
我想为每个文件(按文件名搜索)获取字段 DOI 和 Publisher 的相应值并更新它们到 xml 文件,即如果文件名为 01022018_1021.xml,则获取其各自的 DOI 和 Publisher 即 10.1111/12.2033559 和 Elsevier 并将它们放在 <publisher-name>Elsevier</publisher-name>
和 <article-id pub-id-type="doi">10.1111/12.2033559</article-id>
[ 这样的节点中=17=]
我以前从未使用过 exceldatareader 但我试过这样
var xmlFiles=Directory.EnumerateFiles(@"D:\test\testobject","*.xml",SearchOption.AllDirectories);
Parallel.ForEach(xmlFiles,xmlFile=>{
var name=Path.GetFileName(xmlFile);
FileStream stream = File.Open(@"C:\Don\Downloads\Download-Sample-File-xlsx.xlsx", FileMode.Open, FileAccess.Read);
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
DataSet result = excelReader.AsDataSet(new ExcelDataSetConfiguration() {
ConfigureDataTable = (_) => new ExcelDataTableConfiguration() {
UseHeaderRow = true
}
});
DataTable dt = result.Tables[0];
foreach (DataRow row in dt.Rows)
{
var x = (string)row["Publisher"];
var y = (string)row["DOI"];
/*how do I get the required values
var pub_name="...";
var doi="...";
*/
XDocument doc=XDocument.Load(xmlFile);
doc.Descendants("pub-name").First().Value=pub_name;
doc.Descendants("article-id").First().Value=doi;
doc.Save(xmlFile);
}
excelReader.Close();
});
Console.WriteLine("Done");
Console.ReadLine();
有人能帮忙吗?
看起来你已经有了这些值。
doc.Descendants("pub-name").First().Value=x;
doc.Descendants("article-id").First().Value=y;
编辑:我没听懂需要找到正确的行。
DataTable dt = result.Tables[0];
var filename = Path.GetFileNameWithoutExtension(xmlFile);
// Get the row for the current file
var rowForFile = dt.Rows.Cast<DataRow>()
.SingleOrDefault(r => (string)r["Filename"] == filename);
if (rowForFile != null)
{
XDocument doc = XDocument.Load(xmlFile);
doc.Descendants("publisher-name").First().Value = (string)rowForFile["Publisher"];
doc.Descendants("article-id").First().Value = (string)rowForFile["DOI"];
doc.Save(xmlFile);
}
我有一些 xml 格式的文件
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD with OASIS Tables v1.0 20120330//EN" "JATS-journalpublishing-oasis-article1.dtd">
<article article-type="proceedings" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://www.niso.org/standards/z39-96/ns/oasis-exchange/table">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id"/>
<journal-title-group>
<journal-title>Eleventh International Conference on Correlation Optics</journal-title>
</journal-title-group>
<issn pub-type="epub">0277-786X</issn>
<publisher>
<publisher-name>SPIE</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">@@</article-id>
<title-group>
<article-title>Comparison of absorption spectra of adenocarcinoma and squamous cell carcinoma cervical tissue</article-title>
</title-group>
<pub-date>
<year>2013</year>
</pub-date>
<volume>9066</volume>
<issue>90661T</issue>
<isbn content-type="conf-proc">978-1-5106-0374-5</isbn>
<fpage>90661T-1</fpage>
<lpage>90661T-7</lpage>
</article-meta>
</front>
我有如下所示的 xlsx 文件
Filename Pages DOI Publisher Year Price
01022018_1020 14 10.1111/12.2033558 Springer 2017 .13
01022018_1021 7 10.1111/12.2033559 Elsevier 2018 .25
01022018_1130 30 10.1113/20.3023658 Elsevier 2017 .25
01022018_1301 10 10.1113/20.3023777 Wiley 2015 .50
我想为每个文件(按文件名搜索)获取字段 DOI 和 Publisher 的相应值并更新它们到 xml 文件,即如果文件名为 01022018_1021.xml,则获取其各自的 DOI 和 Publisher 即 10.1111/12.2033559 和 Elsevier 并将它们放在 <publisher-name>Elsevier</publisher-name>
和 <article-id pub-id-type="doi">10.1111/12.2033559</article-id>
[ 这样的节点中=17=]
我以前从未使用过 exceldatareader 但我试过这样
var xmlFiles=Directory.EnumerateFiles(@"D:\test\testobject","*.xml",SearchOption.AllDirectories);
Parallel.ForEach(xmlFiles,xmlFile=>{
var name=Path.GetFileName(xmlFile);
FileStream stream = File.Open(@"C:\Don\Downloads\Download-Sample-File-xlsx.xlsx", FileMode.Open, FileAccess.Read);
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
DataSet result = excelReader.AsDataSet(new ExcelDataSetConfiguration() {
ConfigureDataTable = (_) => new ExcelDataTableConfiguration() {
UseHeaderRow = true
}
});
DataTable dt = result.Tables[0];
foreach (DataRow row in dt.Rows)
{
var x = (string)row["Publisher"];
var y = (string)row["DOI"];
/*how do I get the required values
var pub_name="...";
var doi="...";
*/
XDocument doc=XDocument.Load(xmlFile);
doc.Descendants("pub-name").First().Value=pub_name;
doc.Descendants("article-id").First().Value=doi;
doc.Save(xmlFile);
}
excelReader.Close();
});
Console.WriteLine("Done");
Console.ReadLine();
有人能帮忙吗?
看起来你已经有了这些值。
doc.Descendants("pub-name").First().Value=x;
doc.Descendants("article-id").First().Value=y;
编辑:我没听懂需要找到正确的行。
DataTable dt = result.Tables[0];
var filename = Path.GetFileNameWithoutExtension(xmlFile);
// Get the row for the current file
var rowForFile = dt.Rows.Cast<DataRow>()
.SingleOrDefault(r => (string)r["Filename"] == filename);
if (rowForFile != null)
{
XDocument doc = XDocument.Load(xmlFile);
doc.Descendants("publisher-name").First().Value = (string)rowForFile["Publisher"];
doc.Descendants("article-id").First().Value = (string)rowForFile["DOI"];
doc.Save(xmlFile);
}