How do I get CAS to update a small subset of record properties during a partial update?

Question

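I'm on Oracle Commerce 11.1, running CAS only (no Forge) for one application.

Baseline updates work fine. My question is about partial updates.

We have an extract file containing the subset of records that need updating. However, the file lists only a small subset of each record's properties (that is, it provides only the properties that actually changed).

When I run a partial update (using the default mechanism that ships with the CAS-only deployment template), it completes successfully, but the updated records contain only the subset of fields supplied in the file; all the unchanged fields are simply gone. It's as if CAS replaces the existing record (with its full set of properties) with a new record containing only the few properties from the extract file.

For example, say one of the records looks like this: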

Record 23
---------
id 23
name Test
inventoryCount 23
buyable 1
imageUrl test.jpg

And say the partial extract file has this entry:

Record 23
---------
id 23
inventoryCount 10

The result I get after the partial update is this:

Record 23
---------
id 23
inventoryCount 10

I'd like to know how to get CAS to preserve those properties instead of removing them. I know Forge can do this.
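
To make the two behaviors concrete, here is a minimal sketch in plain Java that models a record as a map. It is not CAS code, and the class and method names (PartialUpdateSemantics, replace, merge) are hypothetical. It only contrasts the replace behavior described above with the merge behavior being asked for, using Record 23 from the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: models a record as an ordered property map.
final class PartialUpdateSemantics {

    // Replace semantics: the incoming record wins wholesale, so any
    // property missing from the extract is lost.
    static Map<String, String> replace(Map<String, String> existing, Map<String, String> incoming) {
        return new LinkedHashMap<>(incoming);
    }

    // Merge semantics: start from the existing record and overwrite only
    // the properties that the extract actually supplies.
    static Map<String, String> merge(Map<String, String> existing, Map<String, String> incoming) {
        Map<String, String> merged = new LinkedHashMap<>(existing);
        merged.putAll(incoming);
        return merged;
    }

    // Record 23 as it exists before the partial update.
    static Map<String, String> existingRecord() {
        Map<String, String> record = new LinkedHashMap<>();
        record.put("id", "23");
        record.put("name", "Test");
        record.put("inventoryCount", "23");
        record.put("buyable", "1");
        record.put("imageUrl", "test.jpg");
        return record;
    }

    // The entry for Record 23 in the partial extract file.
    static Map<String, String> partialExtract() {
        Map<String, String> record = new LinkedHashMap<>();
        record.put("id", "23");
        record.put("inventoryCount", "10");
        return record;
    }

    public static void main(String[] args) {
        // prints: replace: {id=23, inventoryCount=10}
        System.out.println("replace: " + replace(existingRecord(), partialExtract()));
        // prints: merge: {id=23, name=Test, inventoryCount=10, buyable=1, imageUrl=test.jpg}
        System.out.println("merge: " + merge(existingRecord(), partialExtract()));
    }
}
```

The first line of output matches what the partial update currently produces; the second is the result I am after.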

Answer 1

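I've confirmed that there isn't really a clear-cut mechanism for doing this, so I invented my own.

To summarize how it works: I customized the PartialUpdate beanshell script so that, immediately after the last-mile crawl runs, it calls a custom component I created named DGIDXTransformer (i.e. it extends CustomComponent). The class unzips and parses the file created by the last-mile crawl, which is meant to be fed into DGIDX, and writes out a modified version of that file. Specifically, it rewrites each RECORD_ADD_OR_REPLACE entry as a RECORD_UPDATE and turns every property other than the record spec into a PVAL_DELETE followed by a PVAL_ADD, so that records are updated rather than replaced with the new properties. It's a bit hacky, since the format of the DGIDX input file isn't documented, but based on my research the format is unlikely to change much in future Endeca releases.

Here is DGIDXTransformer: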

import com.endeca.soleng.eac.toolkit.component.*;

import java.io.*;
import java.nio.file.Files;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/**
 * Custom component which runs during the PartialUpdate beanshell script. It transforms the DGIDX-compatible input file
 * that CAS produces so that records will be updated instead of replaced.
 *
 * Expects only one property entry called "dgidxInputFileDirectory", specifying the directory to look in to
 * find the file to transform (relative to the config directory).
 *
 * @author chairbender
 */
public class DGIDXTransformer extends CustomComponent {
    private static final String DGIDX_INPUT_FILE_DIRECTORY_PROPERTY_NAME = "dgidxInputFileDirectory";
    private static final String RECORD_SPEC_PROPERTY_NAME = "record.spec";

    /**
     * Does the transformation as specified in the class javadoc.
     */
    public void transformDGIDXInputFileToUpdateInsteadOfReplace() throws Exception {
        //Find the file in the directory
        Map<String, String> properties = getProperties();
        if (null == properties || !properties.containsKey(DGIDX_INPUT_FILE_DIRECTORY_PROPERTY_NAME)) {
            throw new Exception("Missing required property: " + DGIDX_INPUT_FILE_DIRECTORY_PROPERTY_NAME);
        } else {
            File directory = new File(properties.get(DGIDX_INPUT_FILE_DIRECTORY_PROPERTY_NAME));
            File[] gzipFiles = directory.listFiles(new FilenameFilter() {
                @Override
                public boolean accept(File dir, String name) {
                    return name.endsWith(".xml.gz");

                }
            });
            if (gzipFiles == null || gzipFiles.length == 0) {
                throw new Exception("No .xml.gz file found in directory: " + directory.getAbsolutePath());
            } else {
                File gzipFile = gzipFiles[0];
                File unzippedFile = unzipFile(gzipFile);

                transformInputFile(unzippedFile, unzippedFile.getAbsolutePath().replace(".xml", "transformed.xml"));

                //delete the extra files in a way that throws an exception if deletion fails
                Files.delete(gzipFile.toPath());
                Files.delete(unzippedFile.toPath());

            }
        }
    }

    /**
     * Gzips the passed file and saves it at the specified location
     * @param toGzip file to gzip
     * @param outputPath where to output the gzipped file
     *
     */
    private void gzipFile(File toGzip, String outputPath) throws IOException {
        byte[] buffer = new byte[1024];

        //try-with-resources closes both streams even if the copy fails;
        //closing the GZIPOutputStream also finishes the compressed stream
        try (FileInputStream inputStream = new FileInputStream(toGzip);
             GZIPOutputStream gzipOutputStream =
                     new GZIPOutputStream(new FileOutputStream(outputPath, false))) {
            int len;
            while ((len = inputStream.read(buffer)) > 0) {
                gzipOutputStream.write(buffer, 0, len);
            }
        }
    }

    /**
     *
     * @param unzippedFile file representing DGIDX input data to transform
     * @param transformedFilePath path where transformed file should go.
     * @return the transformed file
     */
    private File transformInputFile(File unzippedFile, String transformedFilePath) throws IOException {
        File outputFile = new File(transformedFilePath);

        //Since the XML and the transformation aren't very complicated, we just read the unzipped file
        //line by line and write the output as we go
        BufferedReader unzippedFileReader = new BufferedReader(new FileReader(unzippedFile));
        BufferedWriter outputFileWriter = new BufferedWriter(new FileWriter(outputFile));

        String nextLine;
        while ((nextLine = unzippedFileReader.readLine()) != null) {
            if (nextLine.contains("RECORD_ADD_OR_REPLACE")) {
                //If the line contains RECORD_ADD_OR_REPLACE, need to change it to RECORD_UPDATE
                outputFileWriter.write(nextLine.replace("RECORD_ADD_OR_REPLACE","RECORD_UPDATE"));
            } else if (nextLine.contains("<PROP NAME=")) {
                //This line opens a <PROP NAME="..."> element. Transform the element
                //only if the property isn't the record spec.
                String propertyName = nextLine.split("\"")[1];
                if (!propertyName.equals(RECORD_SPEC_PROPERTY_NAME)) {
                    //Read the property value from the next line
                    String propertyValueLine = unzippedFileReader.readLine();
                    String propertyValue = propertyValueLine.replace("<PVAL>","").replace("</PVAL>","").trim();

                    //Now write the PVAL_DELETE and PVAL_ADD entries
                    outputFileWriter.write("<PVAL_DELETE><PROPERTY_NAME NAME=\"" + propertyName + "\"/></PVAL_DELETE>");
                    outputFileWriter.write("<PVAL_ADD><PROP NAME=\"" + propertyName + "\"><PVAL>" + propertyValue + "</PVAL></PROP></PVAL_ADD>");

                    //Discard the closing element line of the input file
                    unzippedFileReader.readLine();
                } else {
                    //it's the record spec, so don't transform it.
                    outputFileWriter.write(nextLine);
                }
            } else {
                //Just output the line
                outputFileWriter.write(nextLine);
            }
        }
        unzippedFileReader.close();
        outputFileWriter.close();
        return outputFile;
    }

    /**
     *
     * @param gzipFile file to un-gzip. Will create the un-gzipped version in the same directory as gzipFile,
     *                 but without the ".gz" ending.
     * @return the unzipped version of the file.
     */
    private File unzipFile(File gzipFile) throws IOException {
        //Un-gzip the file in one pass
        GZIPInputStream gzipInputStream =
                new GZIPInputStream(new FileInputStream(gzipFile));
        File outputFile = new File(gzipFile.getAbsolutePath().replace(".gz",""));
        FileOutputStream outputStream =
                new FileOutputStream(outputFile);

        int len;
        byte[] buffer = new byte[1024];
        while ((len = gzipInputStream.read(buffer)) > 0) {
            outputStream.write(buffer, 0, len);
        }

        gzipInputStream.close();
        outputStream.close();

        return outputFile;
    }


}
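
To show what the rewrite does to a single record, here is a standalone sketch that applies the same line-by-line logic as transformInputFile to an in-memory string. The sample fragment is an assumption inferred from what the parsing code expects (one element per line, with the PVAL on the line after its PROP); it is not taken from real CAS output, and the class name DgidxTransformSketch is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical standalone sketch; not part of the deployed component.
final class DgidxTransformSketch {
    static final String RECORD_SPEC = "record.spec";

    // Assumed input shape, inferred from the parsing logic above.
    static final String SAMPLE = String.join("\n",
            "<RECORD_ADD_OR_REPLACE>",
            "  <PROP NAME=\"record.spec\">",
            "    <PVAL>23</PVAL>",
            "  </PROP>",
            "  <PROP NAME=\"inventoryCount\">",
            "    <PVAL>10</PVAL>",
            "  </PROP>",
            "</RECORD_ADD_OR_REPLACE>");

    // Rewrites add-or-replace entries as updates, one input line at a time.
    static String transform(String input) {
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains("RECORD_ADD_OR_REPLACE")) {
                    // Covers both the opening and the closing tag.
                    out.append(line.replace("RECORD_ADD_OR_REPLACE", "RECORD_UPDATE")).append('\n');
                } else if (line.contains("<PROP NAME=")) {
                    String name = line.split("\"")[1];
                    if (name.equals(RECORD_SPEC)) {
                        // The record spec is passed through untouched.
                        out.append(line).append('\n');
                    } else {
                        // Assumes the value is on the next line and </PROP> on the one after.
                        String value = reader.readLine()
                                .replace("<PVAL>", "").replace("</PVAL>", "").trim();
                        reader.readLine(); // discard the closing </PROP> line
                        out.append("<PVAL_DELETE><PROPERTY_NAME NAME=\"").append(name)
                                .append("\"/></PVAL_DELETE>\n");
                        out.append("<PVAL_ADD><PROP NAME=\"").append(name).append("\"><PVAL>")
                                .append(value).append("</PVAL></PROP></PVAL_ADD>\n");
                    }
                } else {
                    out.append(line).append('\n');
                }
            }
        } catch (IOException e) {
            // Not expected for an in-memory reader; rethrown unchecked for simplicity.
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(transform(SAMPLE));
    }
}
```

For the sample above, the record.spec element comes out unchanged inside a RECORD_UPDATE element, and the inventoryCount property becomes a PVAL_DELETE followed by a PVAL_ADD carrying the value 10.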

This gets compiled into a JAR that goes into config/lib/java.

Here is the custom component definition in DataIngest.xml:

<custom-component id="DGIDXTransformer" host-id="ITLHost" class="com.chairbender.DGIDXTransformer">
    <properties>
        <property name="dgidxInputFileDirectory" value="../data/cas_output" />
    </properties>
</custom-component>

And here is the relevant part of the customized PartialUpdate script:

  CAS.runIncrementalCasCrawl("${lastMileCrawlName}");
  DGIDXTransformer.transformDGIDXInputFileToUpdateInsteadOfReplace();
  CAS.archiveDvalIdMappingsForCrawlIfChanged("${lastMileCrawlName}");

etl

endeca