Flat file to JSON conversion using BeanIO

I am trying to parse a fixed-length flat file into JSON using BeanIO.

Code:

import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.Map;

import org.beanio.BeanIOConfigurationException;
import org.beanio.BeanReader;
import org.beanio.StreamFactory;
import org.junit.Test;

import com.google.gson.Gson;

public class EmployeeBeanIOHandlerTest {

    @Test
    public void testHandleEmployee() {

        // mapping pattern file
        String mappingPatternFile = "pattern-mapping.xml";

        // data file (csv)
        String objectFile = "employee.csv";

        // stream name defined in pattern mapping file
        String streamName = "empData";

        Gson gson = new Gson();

        BeanReader beanReader = null;
        Reader reader = null;
        StreamFactory factory = null;
        InputStream in = null;

        try {

            System.out.println("## RESULT FOR " + objectFile + " ##");

            // create a StreamFactory
            factory = StreamFactory.newInstance();

            // load the setting file
            in = this.getClass().getClassLoader()
                    .getResourceAsStream(mappingPatternFile);

            // get input stream reader of object file (data file)
            reader = new InputStreamReader(this.getClass().getClassLoader()
                    .getResourceAsStream(objectFile));

            // load input stream to stream factory
            factory.load(in);

            beanReader = factory.createReader(streamName, reader);
            Map<?, ?> record = null;
            while ((record = (Map<?, ?>) beanReader.read()) != null) {
                System.out.println(beanReader.getRecordName() + ": "
                        + gson.toJson(record));
            }

        } catch (BeanIOConfigurationException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
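        } finally {
            // close each resource only if it was actually opened,
            // so a failed lookup does not cause a NullPointerException here
            try {
                if (beanReader != null) {
                    beanReader.close();
                }
                if (reader != null) {
                    reader.close();
                }
                if (in != null) {
                    in.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }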
    }

}

But the output I see is:

header: {"id":"Header","date":"01012013"}

emp: {"lastName":"Lilik","title":"Senior Developer","hireDate":"Oct 1, 2009 12:00:00 AM","salary":7500000,"firstName":"Robertus"}

emp: {"lastName":"Doe","title":"Architect","hireDate":"Jan 15, 2008 12:00:00 AM","salary":8000000,"firstName":"Jane"}

emp: {"lastName":"Anderson","title":"Manager","hireDate":"Mar 18, 2006 12:00:00 AM","salary":9000000,"firstName":"Jon"}

trailer: {"id":"Trailer","count":"3"}

It generates a separate JSON object for each record it finds.

Reference site: http://www.sourcefreak.com/2013/06/painless-flat-file-parsing-with-beanio/

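These are my requirements:

  1. I want a single, merged JSON document.
  2. If a record type is repeated, its records should form a JSON array.

Any help would be appreciated.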

This answer is based on the data and the pattern-mapping.xml file from the link provided by the OP.

Data:

Header,01012013
Robertus,Lilik,Senior Developer,"75,000,00",10012009
Jane,Doe,Architect,"80,000,00",01152008
Jon,Anderson,Manager,"90,000,00",03182006
Footer,3

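Mapping file:
This is the modified pattern-mapping.xml file. Note the <group> element (myGroup) that wraps everything into one group, which forces the BeanReader to read everything in one go. I also changed maxOccurs to 1 (one) for the header and trailer records. In addition, I added the collection="list" attribute to the emp record, so that the repeated emp records are collected into a list.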

<?xml version='1.0' encoding='UTF-8' ?>
<beanio xmlns="http://www.beanio.org/2012/03" 
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">

  <stream name="empData" format="csv">
    <group name="myGroup" class="map">
      <record name="header" class="map" ridLength="0-2" maxOccurs="1">
        <field name="id" rid="true" maxOccurs="1" literal="Header" />
        <field name="date" />
      </record>

      <record name="emp" class="map" ridLength="4-5" collection="list">
        <field name="firstName" />
        <field name="lastName" />
        <field name="title" />
        <field name="salary" type="java.math.BigDecimal" format="#,###,###,00" />
        <field name="hireDate" type="java.util.Date" format="MMddyyyy" minOccurs="0" />
      </record>

      <record name="trailer" class="map" ridLength="2" maxOccurs="1">
        <field name="id" />
        <field name="count" />
      </record>
    </group>
  </stream>
</beanio>
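The salary and hireDate fields in the mapping carry type and format attributes, which is why "75,000,00" ends up as 7500000 and 10012009 as a full date. The sketch below shows what those two patterns do when applied with the JDK's own formatters. It assumes BeanIO hands the format attribute to java.text.DecimalFormat and java.text.SimpleDateFormat, which the post does not state; FieldFormatSketch and its methods are hypothetical names used only for illustration.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper, not part of BeanIO: applies the two patterns
// from the mapping file with the plain JDK formatters.
final class FieldFormatSketch {

    // Pattern copied from the salary field; the commas are grouping
    // separators, so they are skipped while parsing.
    static BigDecimal parseSalary(String text) {
        DecimalFormat format = new DecimalFormat("#,###,###,00",
                DecimalFormatSymbols.getInstance(Locale.US));
        format.setParseBigDecimal(true); // return BigDecimal instead of Long/Double
        try {
            return (BigDecimal) format.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a salary: " + text, e);
        }
    }

    // Pattern copied from the hireDate field: month, day, four-digit year.
    static Date parseHireDate(String text) {
        SimpleDateFormat format = new SimpleDateFormat("MMddyyyy");
        format.setLenient(false); // reject values such as month 13
        try {
            return format.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a date: " + text, e);
        }
    }
}
```

Under that assumption, the hireDate text in the JSON ("Oct 1, 2009 12:00:00 AM") is simply how Gson renders a java.util.Date by default, not something that comes from the mapping file.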

Using the provided test case and the modified mapping file, we get this result (reformatted by me):

myGroup: {
  "trailer": {
    "count": "3",
    "id": "Footer"
  },
  "header": {
    "date": "01012013",
    "id": "Header"
  },
  "emp": [
    {
      "firstName": "Robertus",
      "lastName": "Lilik",
      "hireDate": "Oct 1, 2009 12:00:00 AM",
      "title": "Senior Developer",
      "salary": 7500000
    },
    {
      "firstName": "Jane",
      "lastName": "Doe",
      "hireDate": "Jan 15, 2008 12:00:00 AM",
      "title": "Architect",
      "salary": 8000000
    },
    {
      "firstName": "Jon",
      "lastName": "Anderson",
      "hireDate": "Mar 18, 2006 12:00:00 AM",
      "title": "Manager",
      "salary": 9000000
    }
  ]
}
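If changing the mapping file is not an option, a similar merged shape can be built inside the question's read loop instead. The following is a minimal sketch of that alternative, not what the mapping above does: RecordMerger is a hypothetical helper, and it assumes each record arrives as a name plus a Map, the way getRecordName() and read() deliver them in the original test.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of BeanIO: merges records by name and
// switches an entry to a list once the same name shows up again.
final class RecordMerger {

    // LinkedHashMap keeps the record names in the order they were first read
    private final Map<String, Object> merged = new LinkedHashMap<>();

    @SuppressWarnings("unchecked")
    void add(String recordName, Map<?, ?> record) {
        Object existing = merged.get(recordName);
        if (existing == null) {
            // first record with this name: store it as a single object
            merged.put(recordName, record);
        } else if (existing instanceof List) {
            // third or later record: append to the existing list
            ((List<Object>) existing).add(record);
        } else {
            // second record: replace the single object with a two-element list
            List<Object> list = new ArrayList<>();
            list.add(existing);
            list.add(record);
            merged.put(recordName, list);
        }
    }

    Map<String, Object> result() {
        return merged;
    }
}
```

In the original loop this would mean calling merger.add(beanReader.getRecordName(), record) for every record and then passing merger.result() to gson.toJson once, after the loop. Unlike collection="list", a file with exactly one emp record would produce a plain object rather than a one-element array, so the mapping-based approach gives the more predictable shape.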

Hope this helps.