How to integrate ElasticSearch with MySQL?

Question

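In one of my projects I am planning to use ElasticSearch with MySQL. I have successfully installed ElasticSearch. I am able to manage indices in ES separately, but I don't know how to implement the same with MySQL.

I have read a couple of documents, but I am a bit confused and don't have a clear idea.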

Answer 1

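Finally I was able to find the answer. Sharing my findings.

To use ElasticSearch with MySQL you need the Java Database Connectivity (JDBC) importer. With a JDBC driver you can sync your MySQL data into Elasticsearch.

I am using Ubuntu 14.04 LTS. You will need to install Java 8 to run Elasticsearch, as it is written in Java.

Following are the steps to install ElasticSearch 2.2.0 and ElasticSearch-jdbc 2.2.0. Please note that both versions have to be the same.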
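The version rule can be checked mechanically. The following is only a small Python sketch: it assumes the importer version is the Elasticsearch version plus one extra release component (as with 2.2.0 and 2.2.0.0 in this answer), and the function name is made up for illustration.

```python
def importer_matches_es(es_version: str, importer_version: str) -> bool:
    """Return True if the importer version starts with the full ES version.

    Assumption: importer releases are numbered <es_version>.<n>,
    e.g. Elasticsearch 2.2.0 pairs with importer 2.2.0.0.
    """
    es_parts = es_version.split(".")
    importer_parts = importer_version.split(".")
    # Compare component by component so "2.2.0" does not match "2.2.01.0"
    return importer_parts[:len(es_parts)] == es_parts


matching = importer_matches_es("2.2.0", "2.2.0.0")
mismatched = importer_matches_es("2.2.0", "2.3.3.1")
print(matching, mismatched)
```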

After installing Java 8, install Elasticsearch 2.2.0 as follows:

# cd /opt

# wget https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/deb/elasticsearch/2.2.0/elasticsearch-2.2.0.deb

# sudo dpkg -i elasticsearch-2.2.0.deb

This installs Elasticsearch in /usr/share/elasticsearch/, with its configuration files placed in /etc/elasticsearch.

Now let's do some basic configuration. Here /etc/elasticsearch/elasticsearch.yml is our config file. You can open the file for editing with:

nano /etc/elasticsearch/elasticsearch.yml

and change the cluster name and node name.

For example:

# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
 cluster.name: servercluster
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
 node.name: vps.server.com
#
# Add custom attributes to the node:
#
# node.rack: r1

Now save the file and start Elasticsearch:

 /etc/init.d/elasticsearch start

To test whether ES is installed, run the following:

 curl -XGET 'http://localhost:9200/?pretty'

If you get the following response, your Elasticsearch is now installed :)

{
  "name" : "vps.server.com",
  "cluster_name" : "servercluster",
  "version" : {
    "number" : "2.2.0",
    "build_hash" : "8ff36d139e16f8720f2947ef62c8167a888992fe",
    "build_timestamp" : "2016-01-27T13:32:39Z",
    "build_snapshot" : false,
    "lucene_version" : "5.4.1"
  },
  "tagline" : "You Know, for Search"
}
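If you would rather check this from a script than by eye, a minimal Python sketch could parse the response and report the fields that matter. The sample string is the response shown above; the function name is hypothetical, and in practice the text would come from the curl call or an HTTP client.

```python
import json


def describe_node(raw: str) -> str:
    """Summarise the JSON returned by GET / on an Elasticsearch node."""
    info = json.loads(raw)
    return "{} in cluster {} runs Elasticsearch {}".format(
        info["name"], info["cluster_name"], info["version"]["number"]
    )


# The response from the curl command above, pasted in as a string
sample = """
{
  "name" : "vps.server.com",
  "cluster_name" : "servercluster",
  "version" : {
    "number" : "2.2.0",
    "build_hash" : "8ff36d139e16f8720f2947ef62c8167a888992fe",
    "build_timestamp" : "2016-01-27T13:32:39Z",
    "build_snapshot" : false,
    "lucene_version" : "5.4.1"
  },
  "tagline" : "You Know, for Search"
}
"""

summary = describe_node(sample)
print(summary)
```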

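Now let's install elasticsearch-JDBC.

Download it from http://xbib.org/repository/org/xbib/elasticsearch/importer/elasticsearch-jdbc/2.3.3.1/elasticsearch-jdbc-2.3.3.1-dist.zip, extract it into /etc/elasticsearch/, and create a "logs" folder there as well (the log path should be /etc/elasticsearch/logs). Note that this link points to version 2.3.3.1, while the command below uses the elasticsearch-jdbc-2.2.0.0 directory. Pick the importer version that matches your Elasticsearch version and adjust the paths accordingly.

I have one database created in MySQL named "ElasticSearchDatabase", with a table named "test" containing the fields id, name and email.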

cd /etc/elasticsearch

and run the following:

echo '{
"type":"jdbc",
"jdbc":{

"url":"jdbc:mysql://localhost:3306/ElasticSearchDatabase",
"user":"root",
"password":"",
"sql":"SELECT id as _id, id, name,email FROM test",
"index":"users",
"type":"users",
"autocommit":"true",
"metrics": {
            "enabled" : true
        },
        "elasticsearch" : {
             "cluster" : "servercluster",
             "host" : "localhost",
             "port" : 9300 
        } 
}
}' | java -cp "/etc/elasticsearch/elasticsearch-jdbc-2.2.0.0/lib/*" -"Dlog4j.configurationFile=file:////etc/elasticsearch/elasticsearch-jdbc-2.2.0.0/bin/log4j2.xml" "org.xbib.tools.Runner" "org.xbib.tools.JDBCImporter"
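Hand-writing the JSON inside an echo string is easy to get wrong (a missing comma or quote breaks the import). As a hedged alternative, the same definition could be generated with a few lines of Python. The keys below mirror the definition above; the function name and the script file name mentioned afterwards are my own and only illustrative.

```python
import json


def build_importer_definition(db_url, user, password, sql, index, doc_type,
                              cluster, host="localhost", port=9300):
    """Build the importer definition as a dict, using the same keys as above."""
    return {
        "type": "jdbc",
        "jdbc": {
            "url": db_url,
            "user": user,
            "password": password,
            "sql": sql,
            "index": index,
            "type": doc_type,
            "autocommit": "true",
            "metrics": {"enabled": True},
            # 9300 is the transport port, not the 9200 HTTP port used by curl
            "elasticsearch": {"cluster": cluster, "host": host, "port": port},
        },
    }


definition = build_importer_definition(
    db_url="jdbc:mysql://localhost:3306/ElasticSearchDatabase",
    user="root",
    password="",
    sql="SELECT id as _id, id, name,email FROM test",
    index="users",
    doc_type="users",
    cluster="servercluster",
)
print(json.dumps(definition, indent=2))
```

Saved as, say, make_definition.py (a hypothetical name), its output can be piped into the same java command in place of the echo.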

Now check whether the MySQL data was imported into ES:

curl -XGET http://localhost:9200/users/_search/?pretty

If all goes well, you will be able to see all your MySQL data in JSON format. If there is any error, you will be able to see it in the /etc/elasticsearch/logs/jdbc.log file.
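To pull just the imported rows out of the search response, something like the following Python sketch may help. The sample response is made up for illustration (it only follows the usual shape of a _search result, with the rows under hits.hits[]._source), and both function names are hypothetical.

```python
import json
import urllib.request


def fetch_search(url="http://localhost:9200/users/_search"):
    """Fetch a search response from a running node (not called in this sketch)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def extract_documents(search_response: dict) -> list:
    """Return the original rows, stored by Elasticsearch under _source."""
    return [hit["_source"] for hit in search_response["hits"]["hits"]]


# Made-up response with two rows from the "test" table
sample_response = {
    "took": 3,
    "timed_out": False,
    "hits": {
        "total": 2,
        "hits": [
            {"_index": "users", "_type": "users", "_id": "1",
             "_source": {"id": 1, "name": "alice", "email": "alice@example.com"}},
            {"_index": "users", "_type": "users", "_id": "2",
             "_source": {"id": 2, "name": "bob", "email": "bob@example.com"}},
        ],
    },
}

documents = extract_documents(sample_response)
print([doc["name"] for doc in documents])
```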

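Note:

In older versions of ES, the Elasticsearch-river-jdbc plugin was used for this. It is completely deprecated in the latest versions, so do not use it.

I hope I could save you some time :)

Any further thoughts are appreciated.

Reference URL: https://github.com/jprante/elasticsearch-jdbc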

Answer 2

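As of ES 5.x, this feature is available out of the box through a Logstash plugin.

It periodically imports data from the database and pushes it to the ES server.

You have to create a simple import file like the one given below and use Logstash to run it. Logstash supports running this script on a schedule.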

# file: contacts-index-logstash.conf
input {
    jdbc {
        jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
        jdbc_user => "user"
        jdbc_password => "pswd"
        schedule => "* * * * *"
        jdbc_validate_connection => true
        jdbc_driver_library => "/path/to/latest/mysql-connector-java-jar"
        jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
        statement => "SELECT * from contacts where updatedAt > :sql_last_value"
    }
}
output {
    elasticsearch {
        # Logstash 2.x and later: "hosts" replaces the old "host" option,
        # and "protocol" was removed (HTTP is the only transport)
        hosts => ["ES_NODE_HOST"]
        index => "contacts"
        document_type => "contact"
        document_id => "%{id}"
    }
}
# "* * * * *" -> run every minute
# sql_last_value is a built-in parameter whose value is initially set to Thursday, 1 January 1970,
# or 0 if use_column_value is true and tracking_column is set
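To make the role of :sql_last_value more concrete, here is a small Python simulation using an in-memory SQLite database instead of MySQL. It is only a sketch of the idea, not how Logstash is implemented: it models the use_column_value variant, where the tracked value is taken from the last row returned (without that option, Logstash stores the time the query last ran). The table contents and the function name are invented for the example.

```python
import sqlite3


def fetch_changed(conn, last_value):
    """Return rows changed since last_value, oldest first."""
    cur = conn.execute(
        "SELECT id, name, updatedAt FROM contacts "
        "WHERE updatedAt > :sql_last_value ORDER BY updatedAt",
        {"sql_last_value": last_value},
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, updatedAt TEXT)"
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [(1, "alice", "2016-01-01 10:00:00"), (2, "bob", "2016-01-02 10:00:00")],
)

# First scheduled run: the tracked value starts at the epoch, so all rows match
last_value = "1970-01-01 00:00:00"
first_run = fetch_changed(conn, last_value)
last_value = first_run[-1][2]  # remember the newest updatedAt seen so far

# Between runs, one row is modified and one is added
conn.execute(
    "UPDATE contacts SET name = 'alice2', updatedAt = '2016-01-03 09:00:00' WHERE id = 1"
)
conn.execute("INSERT INTO contacts VALUES (3, 'carol', '2016-01-03 12:00:00')")

# Second run: only the changed and new rows are returned
second_run = fetch_changed(conn, last_value)
print(len(first_run), [row[0] for row in second_run])
```

This is also why the table needs a column such as updatedAt that changes on every write: rows whose tracked value does not move are never picked up again.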

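You can download the MySQL connector jar from Maven.

If the index does not exist in ES when this script is executed, it is created automatically, just as with a normal POST call to Elasticsearch.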
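The document_id => "%{id}" setting is what keeps repeated imports from creating duplicates: each row is indexed under its primary key, so a later run overwrites the earlier copy. As a rough illustration of the kind of request this corresponds to, here is a Python sketch that builds a body in the Elasticsearch bulk API format (one action line followed by one document line per row). The helper name and the sample rows are hypothetical, and the exact requests Logstash sends may differ.

```python
import json


def build_bulk_body(rows, index="contacts", doc_type="contact"):
    """Build a newline-delimited bulk body that indexes each row under its id."""
    lines = []
    for row in rows:
        action = {
            "index": {"_index": index, "_type": doc_type, "_id": str(row["id"])}
        }
        lines.append(json.dumps(action))  # action line
        lines.append(json.dumps(row))     # document line
    # The bulk format requires a trailing newline after the last line
    return "\n".join(lines) + "\n"


rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
]
body = build_bulk_body(rows)
print(body)
```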

Answer 3

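To make it simpler, I have created a PHP class to set up MySQL with Elasticsearch. Using my class you can sync your MySQL data into Elasticsearch and also perform full-text search. You just need to set your SQL query and the class will do the rest for you.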

Answer 4

The Logstash JDBC plugin will do the job:

input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/testdb"
    jdbc_user => "root"
    jdbc_password => "factweavers"
    # The path to our downloaded jdbc driver
    jdbc_driver_library => "/home/comp/Downloads/mysql-connector-java-5.1.38.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # run the query every minute (cron syntax needs five fields)
    schedule => "* * * * *"
    # our query
    statement => "SELECT * FROM testtable where Date > :sql_last_value order by Date"
    use_column_value => true
    # column names are lowercased by default (lowercase_column_names)
    tracking_column => "date"
    # if Date is a timestamp column, tracking_column_type => "timestamp" is likely needed as well
  }
}

output {
  stdout { codec => json_lines }
  elasticsearch {
    hosts => "localhost:9200"
    index => "test-migrate"
    document_type => "data"
    document_id => "%{personid}"
  }
}
