寻找 SnappyCodec 的 Avro 抛出 NoClassDefFoundError

Avro looking for SnappyCodec is throwing NoClassDefFoundError

我浏览了整个网络,发现这是一个很常见的错误,但没有任何解决方案对我有所帮助。

我正在阅读 kafka 主题。到目前为止,我这样做没有遇到任何问题,但是现在我在 aws 环境中 运行 上的 flink 集群上遇到了这个错误 ,但在我的 [=35= 中却没有] (intellij):

NoClassDefFoundError: org/xerial/snappy/Snappy
at org.apache.avro.file.SnappyCodec.decompress(SnappyCodec.java:58)
at org.apache.avro.file.DataFileStream$DataBlock.decompressUsing(DataFileStream.java:352)
at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:199)
at flink.streaming.mtsas.functions.AvroDeserializationSchema.deserialize(AvroDeserializationSchema.java:37)
at org.apache.flink.streaming.util.serialization.KeyedDeserializationSchemaWrapper.deserialize(KeyedDeserializationSchemaWrapper.java:39)
at org.apache.flink.streaming.connectors.kafka.internal.Kafka09Fetcher.runFetchLoop(Kafka09Fetcher.java:145)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:255)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:55)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:95)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:262)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
at java.lang.Thread.run(Thread.java:745)

从我在网上找到的内容来看,通常的原因似乎与编译的版本与预期版本不同有关,简而言之。但我只是不知所措。这也是 pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<parent>
    <groupId>com.group.version1</groupId>
    <artifactId>parent</artifactId>
    <version>1.0-SNAPSHOT</version>
</parent>

<artifactId>this.artifact</artifactId>

<properties>
    <flink.version>1.3.0</flink.version>
    <avro.version>1.8.1</avro.version>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
</properties>

<dependencies>

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-java</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-nifi_2.11</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-java_2.11</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-core</artifactId>
        <version>2.8.8</version>
    </dependency>

    <dependency>
        <groupId>com.google.code.gson</groupId>
        <artifactId>gson</artifactId>
        <version>2.8.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-core</artifactId>
        <version>2.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.codehaus.groovy</groupId>
        <artifactId>groovy-all</artifactId>
        <version>2.4.5</version>
    </dependency>
    <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
        <version>21.0</version>
    </dependency>
    <dependency>
        <groupId>com.googlecode.json-simple</groupId>
        <artifactId>json-simple</artifactId>
        <version>1.1.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro</artifactId>
        <version>${avro.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro-maven-plugin</artifactId>
        <version>${avro.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-kafka-0.10_2.11</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-cassandra_2.11</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-filesystem_2.10</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-avro</artifactId>
        <version>0.10.2</version>
    </dependency>

    <dependency>
        <groupId>com.datastax.cassandra</groupId>
        <artifactId>cassandra-driver-core</artifactId>
        <version>3.2.0</version>
    </dependency>

    <dependency>
        <groupId>com.blah</groupId>
        <artifactId>custom-resources</artifactId>
        <version>1.0</version>
    </dependency>
    <dependency>
        <groupId>com.blah</groupId>
        <artifactId>custom-executor</artifactId>
        <version>1.0</version>
    </dependency>

    <dependency>
        <groupId>net.sf.dozer</groupId>
        <artifactId>dozer</artifactId>
        <version>5.4.0</version>
    </dependency>

    <dependency>
        <groupId>com.group.version1</groupId>
        <artifactId>test-utils</artifactId>
        <scope>test</scope>
        <exclusions>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-simple</artifactId>
            </exclusion>
        </exclusions>
    </dependency>

   <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-statebackend-rocksdb_2.11</artifactId>
        <version>1.2.1</version>
    </dependency>

    <dependency>
        <groupId>com.datastax.cassandra</groupId>
        <artifactId>cassandra-driver-dse</artifactId>
        <version>3.0.0-rc1</version>
    </dependency>
    <dependency>
        <groupId>com.datastax.cassandra</groupId>
        <artifactId>dse-driver</artifactId>
        <version>1.1.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.7.2</version>
    </dependency>
</dependencies>

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.avro</groupId>
            <artifactId>avro-maven-plugin</artifactId>
            <version>${avro.version}</version>
            <executions>
                <execution>
                    <phase>generate-sources</phase>
                    <goals>
                        <goal>schema</goal>
                    </goals>
                    <configuration>
                        <sourceDirectory>${project.basedir}/src/main/customavro/</sourceDirectory>
                        <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
                    </configuration>
                </execution>
            </executions>
        </plugin>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <transformers>
                            <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                <manifestEntries>
                                    <Main-Class>flink.streaming.custom.CustomProcessor</Main-Class>
                                </manifestEntries>
                            </transformer>
                        </transformers>
                        <filters>
                            <filter>
                                <artifact>*:*</artifact>
                                <excludes>
                                    <exclude>META-INF/*.SF</exclude>
                                    <exclude>META-INF/*.DSA</exclude>
                                    <exclude>META-INF/*.RSA</exclude>
                                </excludes>
                            </filter>
                        </filters>
                    </configuration>
                </execution>
            </executions>
        </plugin>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-install-plugin</artifactId>
            <version>2.5</version>
            <executions>
                <execution>
                    <id>inst_1</id>
                    <phase>clean</phase>
                    <goals>
                        <goal>install-file</goal>
                    </goals>
                    <configuration>
                        <groupId>com.blah</groupId>
                        <artifactId>custom-resources</artifactId>
                        <version>1.0</version>
                        <packaging>jar</packaging>
                        <file>${basedir}/lib/custom_resources-1.0.jar</file>
                    </configuration>
                </execution>
                <execution>
                    <id>inst_2</id>
                    <phase>clean</phase>
                    <goals>
                        <goal>install-file</goal>
                    </goals>
                    <configuration>
                        <groupId>com.blah</groupId>
                        <artifactId>custom-executor</artifactId>
                        <version>1.0</version>
                        <packaging>jar</packaging>
                        <file>${basedir}/lib/custom-executor-1.0.jar</file>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>

我敢肯定这不是一件难事,但我确实刚刚达到了这样一个地步,即你最终弊大于利地尝试一件又一件的事情。

提前感谢您的帮助!

编辑:

还应该补充一点,我可以在我的 IDE 中的那条路径 (org.xerial.snappy.Snappy) 中找到 Snappy.java。

在与 data artisans(Flink 的创建者)的联系人交谈后,发现 1.3 存在一个错误,这就是造成这种情况的原因。这是错误报告 https://issues.apache.org/jira/browse/FLINK-6965 的 link。