SSIS C# Script Task source: deserializing JSON, one entry per row

I should start by saying that I am a SQL person, not a C# person.

I have to ingest JSON like this:

[
{
    "gameId": "a_string_id",
    "name": "A string with the name",
    "width": 1280,
    "height": 720,
    "description": "A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble",
    "themeUrl": "/address1/address2/address3/filename.jpg",
    "thumbnailUrl": "/address1/address2/address3/filename.jpg",
    "verticalThumbnailUrl": "",
    "helpUrl": "",
    "trivia": [],
    "traits": [
      "aString"
    ],
    "seoName": "a-long-name",
    "friendlyName": "a-friendly-name"
  },
  {
    "gameId": "a_string_id",
    "name": "A string with the name",
    "width": 1600,
    "height": 878,
    "description": "",
    "themeUrl": "/address1/address2/address3/filename.jpg",
    "thumbnailUrl": "/address1/address2/address3/filename.jpg",
    "verticalThumbnailUrl": "",
    "helpUrl": "",
    "trivia": [],
    "traits": [],
    "seoName": "a-long-name",
    "friendlyName": "a-friendly-name"
  }
]

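I need to do this with a Script Task source. I need to put one JSON entity on each row, ideally already split into columns, because the JSON as a whole is very long and there is no variable that can hold it.

My code is as follows: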

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Collections.Generic;
using System.Net;
using System.Web.Script.Serialization;
using System.Collections;

/// <summary>
/// This is the class to which to add your code.  Do not change the name, attributes, or parent
/// of this class.
/// </summary>
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{


    /// <summary>
    /// This method is called once, before rows begin to be processed in the data flow.
    ///
    /// You can remove this method if you don't need to do anything here.
    /// </summary>
    public override void PreExecute()
    {
        base.PreExecute();
    }

    /// <summary>
    /// This method is called after all the rows have passed through this component.
    ///
    /// You can delete this method if you don't need to do anything here.
    /// </summary>
    public override void PostExecute()
    {
        base.PostExecute();
    }

    public override void CreateNewOutputRows()
    {
        /*
          Add rows by calling the AddRow method on the member variable named "<Output Name>Buffer".
          For example, call MyOutputBuffer.AddRow() if your output was named "MyOutput".
        */

        String json = DownloadJson("http://someapi.com");


        // Convert json string to .net object using the old school JavaScriptSerializer class
        //JavaScriptSerializer serialize = new JavaScriptSerializer();
        //Root righe = (Root)serialize.Deserialize(json, typeof(Root));

        // Loop through array of earthquakes, outputing desired values to SSIS buffer
        //foreach (var feature in righe.MyArray)
        //{
        //    Output0Buffer.AddRow();
        //    Output0Buffer.gameid = feature.gameid;
        //}
        Output0Buffer.AddRow();
        Output0Buffer.gameid = json;


    }


    public static string DownloadJson(string downloadURL)
    {
        using (WebClient client = new WebClient())
        {
            return client.DownloadString(downloadURL);
        }
    }


    // Root myDeserializedClass = JsonConvert.DeserializeObject<Root>(myJsonResponse); 
    public class MyArray
    {
        public string gameId { get; set; }
        public string name { get; set; }
        public int width { get; set; }
        public int height { get; set; }
        public string description { get; set; }
        public string themeUrl { get; set; }
        public string thumbnailUrl { get; set; }
        public string verticalThumbnailUrl { get; set; }
        public string helpUrl { get; set; }
        public List<string> trivia { get; set; }
        public List<object> traits { get; set; }
        public string seoName { get; set; }
        public string friendlyName { get; set; }

        internal static IEnumerator GetEnumerator()
        {
            throw new NotImplementedException();
        }

        public string gameid { get; set; }
    }

    public class Root
    {
        public List<MyArray> MyArray { get; set; }
    }

    public class MyArrayList : IEnumerable<MyArray>
{
        private List<MyArray> carbootsales;



        public IEnumerator<MyArray> GetEnumerator()
    {
        return carbootsales.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return MyArray.GetEnumerator();
    }
}


}

For now I am only trying with one column, GameId.

The error I receive is the following:

at Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer.SetString(Int32 columnIndex, String value) at Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer.set_Item(Int32 columnIndex, Object value) at ScriptMain.CreateNewOutputRows() at UserComponent.PrimeOutput(Int32 Outputs, Int32[] OutputIDs, PipelineBuffer[] Buffers, OutputNameMap OutputMap) at Microsoft.SqlServer.Dts.Pipeline.ScriptComponentHost.PrimeOutput(Int32 outputs, Int32[] outputIDs, PipelineBuffer[] buffers)

This error is all Greek to me.

If anyone can help, or even suggest a different way to solve the problem, that would be great.

Thanks

This block of code is throwing the exception:

String json = DownloadJson("http://someapi.com");
Output0Buffer.AddRow();
Output0Buffer.gameid = json;

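It is most likely overflowing whatever length you defined for the string column: the whole JSON document is being assigned to a single column. But I had a lunch break to spare, so you get a fuller solution ;)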
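To make the overflow concrete: a string column in the output buffer has a fixed length, and assigning a longer value fails inside SetString, which is the top frame of the stack trace in the question. Below is a minimal sketch of a guard. The class name BufferText and the method FitToColumn are hypothetical, and the length you pass in is assumed to match whatever you configured for the column. Truncating silently drops data, so treat this as a diagnostic aid rather than a fix; the real fix is to split the JSON into properly sized columns.

```csharp
using System;

public static class BufferText
{
    // Hypothetical helper: returns the value cut down to at most maxLength characters.
    // A null or empty input comes back as an empty string.
    public static string FitToColumn(string value, int maxLength)
    {
        if (maxLength < 0)
        {
            throw new ArgumentOutOfRangeException("maxLength");
        }
        if (string.IsNullOrEmpty(value))
        {
            return string.Empty;
        }
        return value.Length <= maxLength ? value : value.Substring(0, maxLength);
    }
}
```

Inside the component this might be used as Output0Buffer.gameid = BufferText.FitToColumn(item.gameId, 50); where 50 is an assumed column length, not a value taken from the question.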

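What I did differently

I defined a member variable, myArray, which is an old-school array of type MyArray. Once the JSON has been turned into objects, it holds one element per item (game1, game2, etc.).

In my PreExecute method, I download the JSON and deserialize it into an array of instances of our class.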

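I was getting an ambiguous definition error when trying to deserialize the JSON, so I removed the Root and MyArrayList definitions.

I removed the enumerator from your MyArray class, along with the lowercase gameid property, because the JSON indicates it should be gameId, which is the first definition.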

In my CreateNewOutputRows, I enumerate the member array. For each item I find, I add a new row to the output buffer and then populate it with data.

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Collections.Generic;
using System.Net;
// Add reference to System.Web.Extensions
using System.Web.Script.Serialization;
using System.Collections;

// Root myDeserializedClass = JsonConvert.DeserializeObject<Root>(myJsonResponse); 
public class MyArray
{
    public string gameId { get; set; }
    public string name { get; set; }
    public int width { get; set; }
    public int height { get; set; }
    public string description { get; set; }
    public string themeUrl { get; set; }
    public string thumbnailUrl { get; set; }
    public string verticalThumbnailUrl { get; set; }
    public string helpUrl { get; set; }
    public List<string> trivia { get; set; }
    public List<object> traits { get; set; }
    public string seoName { get; set; }
    public string friendlyName { get; set; }


}

/// <summary>
/// This is the class to which to add your code.  Do not change the name, attributes, or parent
/// of this class.
/// </summary>
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    //Root rootDocument;
    MyArray[] myArray;

    public string DownloadJson(string downloadURL)
    {
        return @"[
{
            'gameId': 'a_string_id',
    'name': 'A string with the name',
    'width': 1280,
    'height': 720,
    'description': 'A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble A lot of mumble jumble',
    'themeUrl': '/address1/address2/address3/filename.jpg',
    'thumbnailUrl': '/address1/address2/address3/filename.jpg',
    'verticalThumbnailUrl': '',
    'helpUrl': '',
    'trivia': [],
    'traits': [
      'aString'
    ],
    'seoName': 'a-long-name',
    'friendlyName': 'a-friendly-name'
  },
  {
    'gameId': 'a_string_id',
    'name': 'A string with the name',
    'width': 1600,
    'height': 878,
    'description': '',
    'themeUrl': '/address1/address2/address3/filename.jpg',
    'thumbnailUrl': '/address1/address2/address3/filename.jpg',
    'verticalThumbnailUrl': '',
    'helpUrl': '',
    'trivia': [],
    'traits': [],
    'seoName': 'a-long-name',
    'friendlyName': 'a-friendly-name'
  }
]";
        //using (WebClient client = new WebClient())
        //{
        //    return client.DownloadString(downloadURL);
        //}
    }

    public override void PreExecute()
    {
        base.PreExecute();
        // You likely want to have this as a Variable and passed in to the method
        string json = DownloadJson("http://someapi.com");

        // Deserialize the top-level JSON array directly into an array of MyArray
        JavaScriptSerializer serialize = new JavaScriptSerializer();
        myArray = serialize.Deserialize<MyArray[]>(json);
    }

    /// <summary>
    /// This method is called after all the rows have passed through this component.
    ///
    /// You can delete this method if you don't need to do anything here.
    /// </summary>
    public override void PostExecute()
    {
        base.PostExecute();
    }

    public override void CreateNewOutputRows()
    {
        foreach (var item in this.myArray)
        {
            Output0Buffer.AddRow();
            Output0Buffer.gameid = item.gameId;
            Output0Buffer.height = item.height;
            Output0Buffer.width = item.width;
            Output0Buffer.name = item.name;
        }
    }

}

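On the "Inputs and Outputs" tab, I defined 4 of the N columns from the MyArray class. Be sure to adjust the data types and lengths to match the maximum expected length. If you need to handle Unicode data, you will likely want to switch those columns to the Unicode string/DT_WSTR data type.

When I run this, two rows flow through the data flow and they contain data.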

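Not addressed in this solution

MyArray contains two lists: trivia and traits. You won't be able to add them to the buffer as they are, because a buffer column cannot hold a list. The pedantic answer is that you can; the realistic answer, from someone with 15 years of SSIS experience, is don't.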

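For those two columns, the answer depends on how you intend to store the data. Maybe it ends up as a poor design decision in a table, such as storing a delimited list. That is the "easiest" case to handle, because you can simply join the values together like this:
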
Output0Buffer.trivia = string.Join(";", item.trivia);

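That code is approximately correct, but it does not check whether the collection is null or empty, or whether the delimiter already appears in the values. Either way, that is a decision for the person who knows the data.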
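The two checks mentioned above could be sketched as follows. ListFlattener and JoinSafely are hypothetical names, and throwing when a value contains the delimiter is only one possible policy (escaping or choosing another delimiter are alternatives). Note that traits is declared as List<object> in the class above, so it would need converting to strings before it could be passed in.

```csharp
using System;
using System.Collections.Generic;

public static class ListFlattener
{
    // Hypothetical helper: joins values with a delimiter.
    // A null list yields an empty string, null entries are skipped,
    // and a value that already contains the delimiter is rejected.
    public static string JoinSafely(IEnumerable<string> values, string delimiter)
    {
        if (string.IsNullOrEmpty(delimiter))
        {
            throw new ArgumentException("A non-empty delimiter is required.", "delimiter");
        }
        if (values == null)
        {
            return string.Empty;
        }

        var parts = new List<string>();
        foreach (string value in values)
        {
            if (value == null)
            {
                continue;
            }
            if (value.Contains(delimiter))
            {
                throw new ArgumentException("Value already contains the delimiter: " + value);
            }
            parts.Add(value);
        }
        return string.Join(delimiter, parts);
    }
}
```

With this in place, the assignment would become Output0Buffer.trivia = ListFlattener.JoinSafely(item.trivia, ";"); assuming a trivia output column has been defined.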

Another approach would be to create a second output, OutputTrivaBuffer, and send the gameId plus each individual trivia value down that separate path.
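The second-output approach amounts to turning each game into zero or more (gameId, trivia) rows. Here is a minimal sketch of just that flattening step, kept free of SSIS types so it can be tried on its own. TriviaFlattener and Flatten are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class TriviaFlattener
{
    // Produces one (gameId, trivia) pair per trivia entry.
    // A game with a null or empty trivia list produces no pairs.
    public static List<KeyValuePair<string, string>> Flatten(string gameId, IEnumerable<string> trivia)
    {
        var rows = new List<KeyValuePair<string, string>>();
        if (trivia == null)
        {
            return rows;
        }
        foreach (string entry in trivia)
        {
            rows.Add(new KeyValuePair<string, string>(gameId, entry));
        }
        return rows;
    }
}
```

In CreateNewOutputRows you would then loop over the pairs and, for each one, call AddRow on the buffer that SSIS generates for the second output and assign Key and Value to its columns. The buffer and column names depend on how you name that output, so they are left out here.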

There is no error handling at all, so you will want to harden this code.
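As one possible starting point for that hardening, the download step can be wrapped so that a failed request returns an error message instead of an unhandled exception. This is a sketch under assumptions: JsonDownloader and TryDownload are hypothetical names, and only the download is covered (deserialization failures would need their own handling).

```csharp
using System;
using System.Net;

public static class JsonDownloader
{
    // Hypothetical helper: returns true and the response body on success,
    // or false and a description of the problem on failure.
    public static bool TryDownload(string url, out string body, out string error)
    {
        body = null;
        error = null;

        if (string.IsNullOrWhiteSpace(url))
        {
            error = "No URL was supplied.";
            return false;
        }

        try
        {
            using (var client = new WebClient())
            {
                body = client.DownloadString(url);
            }
            return true;
        }
        catch (WebException ex)
        {
            // Covers network failures, HTTP error responses and unusable addresses.
            error = "Download failed: " + ex.Message;
            return false;
        }
        catch (NotSupportedException ex)
        {
            error = "Download not supported: " + ex.Message;
            return false;
        }
    }
}
```

When TryDownload returns false, PreExecute could report the message through the component's event mechanism (for example ComponentMetaData.FireError) and leave myArray empty rather than letting the package fail with an opaque stack trace.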

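You will probably want to add a variable to the SSIS package that holds the API URL, then pass it to the Script Component as a ReadOnly variable, so that you can point at different endpoints (dev/prod) without editing the package.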