Create table before the dataflow in BIML

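I am using BIML and BIDSHelper to create SSIS packages. I am trying to import data from a CSV file into SQL Server. I want to create the table in the destination database before the data flow runs. Here is my code: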

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>       
    <OleDbConnection Name="CM_OLE" 
                     ConnectionString="Data Source=(localdb)\projects;Initial Catalog=test;Integrated Security=SSPI;Provider=SQLNCLI11">
    </OleDbConnection>
    <FlatFileConnection
            Name="FF Source"
            FileFormat="FFF Source"
            FilePath="F:\test.csv"
            CreateInProject="false" />
</Connections>
<FileFormats>
    <FlatFileFormat
            Name="FFF Source"
            CodePage="1252"
            RowDelimiter="CRLF"
            ColumnNamesInFirstDataRow="true"
            IsUnicode="false"
            FlatFileType="Delimited"
            TextQualifer="_x0022_"
            HeaderRowsToSkip="0">
        <Columns>               
            <Column Name="Column1" Length="50" InputLength="50" MaximumWidth="50" DataType="AnsiString"  ColumnType="Delimited"  CodePage="1252" Delimiter="," TextQualified="true" />
            <Column Name="Column2" Precision="10" Scale="2"  DataType="Decimal"  ColumnType="Delimited"  CodePage="1252" Delimiter="CRLF" TextQualified="true"  />
        </Columns>
    </FlatFileFormat>
</FileFormats>  
<Packages>      
    <Package ConstraintMode="Linear" Name="NumericParsingFromFlatFileInsertIdentity">
        <Tasks> 
            <ExecuteSQL Name="Create table sometablename" ConnectionName="CM_OLE">
                 <DirectInput>
                      CREATE TABLE sometablename(column1 varchar(50) NOT NULL, column2 decimal(10,2) NOT NULL);
                 </DirectInput>
            </ExecuteSQL>
            <Dataflow Name="DFT Source">
                <Transformations>
                    <FlatFileSource ConnectionName="FF Source" Name="FF Source" />
                    <OleDbDestination ConnectionName="CM_OLE" Name="OLEDB DST">
                        <ExternalTableOutput Table="sometablename"></ExternalTableOutput>
                    </OleDbDestination>                     
                </Transformations>
            </Dataflow>         
        </Tasks>
    </Package>
</Packages>
</Biml>

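When I try to generate the package, it reports: cannot execute query select * from sometablename, invalid object name. I know the table sometablename does not exist, which is why the error is thrown. So how can I have the table created automatically? I have read the series BI Thoughts and Theories. Part 2 shows a way to create the table. My understanding is that, in the end, it also generates an ExecuteSQL task to create the table. I am confused about how to run the table creation script before the data flow, or what other alternatives BIML offers.

Thanks in advance.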

It seems what you're trying to do is not possible with BIML.

SSIS dataflows require ALL external column metadata to be available at design time. There is no way around this, so the Biml compiler is required to query the data source to get this information, which is then emitted into the package. BIDS/SSDT does this validation constantly as you are working. Biml does it only at build time.

The purpose of ValidateExternalMetadata=false is actually for SSIS to refrain from checking that the external columns defined in the dataflow metadata still match the external data source during the validation phase when the package is run. But at design/build time, we still need that metadata to exist so that we can create the external column metadata in the first place. To be clear, this is true both for native BIDS/SSDT and for Biml.

ValidateExternalMetadata was provided by the SSIS team for scenarios such as dynamically creating tables or files that will match a predetermined schema. Usually you would have the schema prebuilt on your dev environment (which you build against) and then dynamically create the same schema on production as it's needed. Disabling validation means that you can do the dynamic creation as part of the same package that reads from or loads into those dynamically created objects.
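As a rough illustration of the "create the schema where it is needed" half of that scenario, the Execute SQL task from the question could be made safe to re-run. This is only a sketch: the `dbo` schema is an assumption, and the column types are taken from the flat file format in the question.

```xml
<ExecuteSQL Name="Create table sometablename" ConnectionName="CM_OLE">
    <DirectInput>
        -- Assumes the table lives in the dbo schema.
        -- Only creates the table when it does not exist yet, so the package can run repeatedly.
        IF OBJECT_ID(N'dbo.sometablename', N'U') IS NULL
            CREATE TABLE dbo.sometablename (
                column1 varchar(50) NOT NULL,
                column2 decimal(10,2) NOT NULL
            );
    </DirectInput>
</ExecuteSQL>
```

Note that there is no `GO` in the statement: `GO` is a batch separator understood by client tools such as SSMS and sqlcmd, not by SQL Server itself, so it should not be sent through an OLE DB connection.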

We do recognize that there's a need to do builds without having the schema materialized in Dev either. One of the things we're looking at doing in a future release is an "Offline Metadata" feature that would allow you to use Biml to declare your dataflow metadata without having to retrieve it at build time. There would be some scripting work on the user's part to construct the metadata to match what it will look like at run time, but if they get that right, scenarios like yours will be enabled.

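You can add ValidateExternalMetadata="false" to your OLE DB destination. Create the table manually in your development environment, then generate the package.

It should then execute correctly in any other environment, because you set ValidateExternalMetadata to false.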
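Applied to the data flow from the question, that suggestion might look like the sketch below. It assumes sometablename already exists in the development database at build time, so that Biml can still read its column metadata.

```xml
<Dataflow Name="DFT Source">
    <Transformations>
        <FlatFileSource ConnectionName="FF Source" Name="FF Source" />
        <!-- ValidateExternalMetadata="false" skips the run-time check against the
             destination, which the preceding Execute SQL task may not have created yet
             when the package is validated. -->
        <OleDbDestination ConnectionName="CM_OLE" Name="OLEDB DST" ValidateExternalMetadata="false">
            <ExternalTableOutput Table="sometablename" />
        </OleDbDestination>
    </Transformations>
</Dataflow>
```

With ConstraintMode="Linear" on the package, the Execute SQL task still runs before this data flow, so the table exists by the time rows are written.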

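On a related note, have a look at Samuel Vanga's article and pay attention to the "Create Objects" part. Running that package creates your tables in the database, after which you can generate the SSIS packages that depend on those tables.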

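I used his example to implement the following workflow:

  1. Read an Excel workbook sheet of field names and data types (this is the template given to clients when requesting data in flat files)
  2. Populate metadata tables with the flat file names/ids and fields [name, data type, delimiter, precision, scale, etc.]
  3. Read the metadata tables to define the flat file sources, create the staging tables, and create the packages that read the flat files and populate the staging tables.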
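The "create objects" step of a workflow like this can be sketched in BimlScript roughly as follows. Everything about the metadata store here is hypothetical: the table `dbo.StagingTableMetadata`, its `TableName` and `CreateStatement` columns, and the package name are placeholders, not taken from the article. The sketch also assumes the `ExternalDataAccess.GetDataTable` helper from `Varigence.Biml.CoreLowerer.SchemaManagement` is available in your Biml engine.

```xml
<#@ template language="C#" #>
<#@ import namespace="System.Data" #>
<#@ import namespace="Varigence.Biml.CoreLowerer.SchemaManagement" #>
<#
    // Hypothetical metadata location: one row per staging table, holding the table
    // name and a ready-made CREATE TABLE statement.
    string metadataConnection = @"Data Source=(localdb)\projects;Initial Catalog=test;Integrated Security=SSPI;Provider=SQLNCLI11";
    DataTable stagingTables = ExternalDataAccess.GetDataTable(
        metadataConnection,
        "SELECT TableName, CreateStatement FROM dbo.StagingTableMetadata");
#>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <OleDbConnection Name="CM_OLE" ConnectionString="<#=metadataConnection#>" />
    </Connections>
    <Packages>
        <Package Name="Create Objects" ConstraintMode="Linear">
            <Tasks>
                <# foreach (DataRow row in stagingTables.Rows) { #>
                <!-- One Execute SQL task per metadata row. The statement is emitted as-is,
                     so it must not contain characters that need XML escaping. -->
                <ExecuteSQL Name="Create <#=row["TableName"]#>" ConnectionName="CM_OLE">
                    <DirectInput><#=row["CreateStatement"]#></DirectInput>
                </ExecuteSQL>
                <# } #>
            </Tasks>
        </Package>
    </Packages>
</Biml>
```

Generating and running this package first materializes the staging tables. A second Biml file can then generate the load packages against tables that actually exist.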

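For anyone else trying to achieve this: Biml can now reference objects that do not exist, through the OfflineSchema metadata element. This lets you describe a table or result set that cannot be connected to, so that the Biml engine can still build the SSIS package from it.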

https://varigence.com/Documentation/Language/Element/AstOfflineSchemaNode