How to avoid join 'explosions' in JPA with Hibernate when fields from multiple tables are needed?

Question

Let's say I have the following method in a JpaRepository:

@EntityGraph(value = "Subject.allJoins", type = EntityGraphType.FETCH)
@Query("select s from Subject s" + FIND_QUERY_WHERE)
Page<Subject> findInProject(@Param("projectId") UUID projectId, <additional params>);

As you can see, I already use an EntityGraph with the joins I need. The SQL query that Hibernate generates looks like this (most of the where clause omitted):

select
    subject0_.id,
    <all kinds of fields including some duplicates>
from
    subject subject0_
    left outer join project project1_ on subject0_.project_id = project1_.id
    left outer join subject_property_value properties2_ on subject0_.id = properties2_.subject_id
    left outer join property_value propertyva3_ on properties2_.property_value_id = propertyva3_.id
    left outer join ingestion_id_mapping ingestedme4_ on subject0_.id = ingestedme4_.subject_id
where
    subject0_.project_id = '123'
order by
    subject0_.name asc

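Because every join here multiplies the number of rows in the result, the result set explodes to hundreds of thousands of rows even though the total number of subjects is only a few hundred.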
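
As a rough illustration of that multiplication, here is a small back-of-the-envelope sketch. The counts (300 subjects, 20 property values and 50 ingestion mappings per subject) are invented for the example, not measured, and the class and method names are hypothetical.

```java
// Illustration only: all counts are hypothetical.
class JoinExplosion {
    // Two independent to-many joins yield one row per combination of child rows.
    // A LEFT JOIN still yields one row when a collection is empty, hence max(1, n).
    static long rowsPerSubject(long propertyValues, long ingestionMappings) {
        return Math.max(1, propertyValues) * Math.max(1, ingestionMappings);
    }

    public static void main(String[] args) {
        long subjects = 300;
        long rows = subjects * rowsPerSubject(20, 50);
        System.out.println(rows); // prints 300000
    }
}
```

The many-to-one join to project does not add rows; only the two collection joins multiply.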

Note that I will be using projections, which avoids selecting some of the fields, but the joins are still needed.

What can I do to optimize this?

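Note that I really do need all of the data right away to serialize it to the client, so fetching just the model entities and leaving the rest to Hibernate by calling the getter for each association takes even longer than this.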

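My current idea is to execute the query several times with the same where clause, once for each individual join, and then merge the results into one object. It is not the end of the world if subsequent queries read more or fewer rows because rows were added to or removed from the original table, since I can just take the smallest common subset of subject IDs and build the result from that.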
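
A minimal sketch of that merge step in plain Java, assuming each per-join query has already been turned into a map keyed by subject ID. The SubjectMerge and Merged names, the String IDs, and the sample values are all hypothetical.

```java
import java.util.*;

// Sketch only: the two maps stand in for the results of two separate queries.
class SubjectMerge {
    static final class Merged {
        final List<String> properties;
        final List<String> mappings;
        Merged(List<String> properties, List<String> mappings) {
            this.properties = properties;
            this.mappings = mappings;
        }
    }

    static Map<String, Merged> merge(Map<String, List<String>> propertiesBySubject,
                                     Map<String, List<String>> mappingsBySubject) {
        // Keep only the subject IDs that every query returned.
        Set<String> ids = new TreeSet<>(propertiesBySubject.keySet());
        ids.retainAll(mappingsBySubject.keySet());

        Map<String, Merged> result = new LinkedHashMap<>();
        for (String id : ids) {
            result.put(id, new Merged(propertiesBySubject.get(id), mappingsBySubject.get(id)));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> properties = new HashMap<>();
        properties.put("s1", Arrays.asList("a", "b"));
        properties.put("s2", Arrays.asList("c"));
        Map<String, List<String>> mappings = new HashMap<>();
        mappings.put("s2", Arrays.asList("m1"));
        mappings.put("s3", Arrays.asList("m2"));

        // s1 and s3 each appear in only one result, so only s2 survives.
        System.out.println(merge(properties, mappings).keySet()); // prints [s2]
    }
}
```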

But is there something smarter and/or simpler than this?

Answer 1

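The problem is that the fetch join produces a sub-select for every related entity/table. Instead, you should join only the entities that have a 1:1 relationship and fetch the other entities on first access. This gives you one row per Subject, plus one select returning n rows for each entity that is not in the initial select.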

If the sub-selects take too long, try adding the entity with the fewest records to the initial select.

Answer 2

I'll use the example of a football club, which has a country, a stadium, and a list of players.

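Your first query should only be used to filter the rows you want from the database. At this point you can also fetch the 1:1 relationships, but not the 1:n ones. So in my example the first query should:

filter all the clubs that match the criteria
fetch all the 1:1 relationships (the country and the stadium of each club).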

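Then you can make a dedicated query for each sub-list. Still in my example, you would select the players whose club is in the list you supply as a query parameter (the result of your first query). The query would look like: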

 String jpql = "select p from Player p where p.club in :clubs";

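Doing so, you can also supply an entity graph to load the players' attributes. This works well as long as you stay paginated (the result of the first query is then not significant in size).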
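
Stitching the players back onto their clubs afterwards can be sketched in plain Java as follows. The Club and Player classes and the attachPlayers helper are hypothetical stand-ins for your own entities or DTOs; the player list is assumed to be the result of the `p.club in :clubs` query.

```java
import java.util.*;
import java.util.stream.Collectors;

// Sketch only: in-memory stand-ins for the results of the two queries.
class ClubAssembler {
    static final class Player {
        final String name;
        final long clubId;
        Player(String name, long clubId) {
            this.name = name;
            this.clubId = clubId;
        }
    }

    static final class Club {
        final long id;
        final List<Player> players = new ArrayList<>();
        Club(long id) {
            this.id = id;
        }
    }

    // Groups the players by club id once, then hands each club its own list.
    static void attachPlayers(List<Club> clubs, List<Player> players) {
        Map<Long, List<Player>> byClub = players.stream()
                .collect(Collectors.groupingBy(p -> p.clubId));
        for (Club club : clubs) {
            club.players.addAll(byClub.getOrDefault(club.id, Collections.emptyList()));
        }
    }

    public static void main(String[] args) {
        List<Club> page = Arrays.asList(new Club(1), new Club(2)); // first query (one page)
        List<Player> players = Arrays.asList(                      // second query
                new Player("Ann", 1), new Player("Bob", 1), new Player("Cy", 2));
        attachPlayers(page, players);
        System.out.println(page.get(0).players.size()); // prints 2
    }
}
```

With this shape the row count is the number of clubs plus the number of players, rather than their product across several collections.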

Vlad Mihalcea describes this approach well in: The best way to fix the Hibernate MultipleBagFetchException

I strongly recommend taking a look at it.

Answer 3

This is a perfect use case for Blaze-Persistence Entity Views.

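I created the library to allow easy mapping between JPA models and models defined by custom interfaces or abstract classes, something like Spring Data Projections on steroids. The idea is that you define your target structure (domain model) the way you like and map attributes (getters) to the entity model via JPQL expressions. Since the attribute name is used as the default mapping, you mostly don't need explicit mappings, because 80% of the use cases are DTOs that are a subset of the entity model.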

The interesting part is that you can specify the fetch strategy that should be used. A sample model could look like the following:

@EntityView(Subject.class)
public interface SubjectView {
    @IdMapping
    Integer getId();
    ProjectView getProject();
    @Mapping(fetch = FetchStrategy.SUBSELECT)
    Set<PropertyValueView> getProperties();
    Set<IngestionMappingView> getMappings();
}
@EntityView(Project.class)
public interface ProjectView {
    @IdMapping
    Integer getId();
    String getName();
}
@EntityView(PropertyValue.class)
public interface PropertyValueView {
    @IdMapping
    Integer getId();
    String getName();
}
@EntityView(IngestionMapping.class)
public interface IngestionMappingView {
    @IdMapping
    Integer getId();
    String getName();
}

Querying is a matter of applying the entity view to a query, the simplest being a query by id.

SubjectView p = entityViewManager.find(entityManager, SubjectView.class, id);

The Spring Data integration allows you to use it almost like Spring Data Projections: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features That is, you can have a repository similar to the following:

@Repository
public interface SubjectRepository {
    Page<SubjectView> findByProjectId(@Param("projectId") UUID projectId, <additional params>);
}

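You can read more about the supported fetch strategies in the entity-view documentation. I usually recommend using the MULTISET fetch strategy when possible, as that usually gives the best performance.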

Tags: java, spring, hibernate, jpa, jpql