How to delete a row from an index
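I know I can't delete a row from an index; I can only delete a row from a real-time index. But I need to delete rows from the index, and I don't know how to do it. Here is my table and its records: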
+------+------+--------+
| id | name | status |
+------+------+--------+
| 1 | aaa | 1 |
| 2 | bbb | 1 |
| 3 | ccc | 1 |
+------+------+--------+
Here is my Sphinx config:
source mainSourse : mainConfSourse
{
sql_query = \
SELECT id, name, status \
from test_table
sql_field_string = name
sql_attr_uint = status
}
index testIndex
{
source = mainSourse
path = C:/sphinx/data/test/testIndex
morphology = stem_enru
charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+0435, U+451->U+0435
min_prefix_len = 3
index_exact_words = 1
expand_keywords = 1
}
index testIndexRT
{
type = rt
path = C:/sphinx/data/test/testIndexRT
rt_field = name
rt_attr_string = name
rt_attr_uint = status
charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+0435, U+451->U+0435
min_prefix_len = 3
index_exact_words = 1
expand_keywords = 1
}
After the Sphinx server starts, if I want to update a record from testIndex, I just write the new record into testIndexRT, for example:
insert into testIndexRT (id,name,status) values (1,'aaa_updated',1);
Then this query:
select * from testIndex,testIndexRT where status=1;
returns:
+------+-------------+--------+
| id | name | status |
+------+-------------+--------+
| 1 | aaa_updated | 1 |
| 2 | bbb | 1 |
| 3 | ccc | 1 |
+------+-------------+--------+
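It works, great! But the problems start when I want to delete a record from the index. I thought it would be as simple as the update, but after running this statement:
update testIndexRT set status=2 where id=1
I see: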
+------+------+--------+
| id | name | status |
+------+------+--------+
| 1 | aaa | 1 |
| 2 | bbb | 1 |
| 3 | ccc | 1 |
+------+------+--------+
Sphinx just showed me the records from testIndex, even though the row with id 1 was updated in testIndexRT. Running
select * from testIndexRT;
gives:
+------+--------+-------------+
| id | status | name |
+------+--------+-------------+
| 1 | 2 | aaa_updated |
+------+--------+-------------+
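I realized that this approach doesn't work :( I can't store all the records from the database in testIndexRT, because my real table is very large, about 60 GB. Can somebody tell me whether there is another way that I don't know about?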
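60 GB shouldn't be a problem for an RT index, but if you want to stick with plain indexes, you can use the main+delta technique to get what you want. Here is an interactive course about it: https://play.manticoresearch.com/maindelta/ (it's based on Manticore Search, which is a fork of Sphinx, but it should all work the same in Sphinx, except that killlist_target is named differently in Sphinx 3).
Here is another example: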
MySQL:
mysql> desc data;
+---------+------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+---------+------------+------+-----+-------------------+-----------------------------+
| id | bigint(20) | NO | PRI | 0 | |
| body | text | YES | | NULL | |
| updated | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+---------+------------+------+-----+-------------------+-----------------------------+
3 rows in set (0.00 sec)
mysql> desc helper;
+----------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+----------+--------------+------+-----+-------------------+-----------------------------+
| chunk_id | varchar(255) | NO | PRI | | |
| built | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+----------+--------------+------+-----+-------------------+-----------------------------+
2 rows in set (0.00 sec)
Config:
source main
{
type = mysql
sql_host = localhost
sql_user = root
sql_pass =
sql_db = test
sql_query_pre = replace into helper set chunk_id = '1_tmp', built = now()
sql_query = select id, body, unix_timestamp(updated) updated from data where updated >= from_unixtime($start) and updated <= from_unixtime($end)
sql_query_range = select (select unix_timestamp(min(updated)) from data) min, (select unix_timestamp(built) - 1 from helper where chunk_id = '1_tmp') max
sql_query_post_index = replace into helper set chunk_id = '1', built = (select built from helper t where chunk_id = '1_tmp')
sql_range_step = 100
sql_field_string = body
sql_attr_timestamp = updated
}
source delta : main
{
sql_query_pre =
sql_query_range = select (select unix_timestamp(built) from helper where chunk_id = '1') min, unix_timestamp() max
sql_query_killlist = select id from data where updated >= (select built from helper where chunk_id = '1')
killlist_target = idx_main:kl
}
index idx
{
type = distributed
local = idx_main
local = idx_delta
}
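The distributed index above refers to idx_main and idx_delta, which are not shown in the config. A minimal sketch of how those two plain indexes might be declared follows. The paths are assumptions made up for illustration. As far as I know, Manticore documents killlist_target as an index-level setting, so it is placed in idx_delta here.

```
# Sketch only: both paths are hypothetical, adjust them to your setup.
index idx_main
{
    type = plain
    source = main
    path = /var/lib/manticore/idx_main
}

index idx_delta
{
    type = plain
    source = delta
    path = /var/lib/manticore/idx_delta
    # Apply the delta's kill-list (ids returned by sql_query_killlist)
    # to idx_main whenever idx_delta is loaded or rotated.
    killlist_target = idx_main:kl
}
```

With this layout, queries go to idx, and an id returned by sql_query_killlist is suppressed in idx_main. As written, sql_query_killlist only returns ids that still exist in the data table, so a row removed from MySQL outright would presumably need its id recorded somewhere the kill-list query can see, for example a soft-delete flag.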