51工具盒子

依楼听风雨
笑看云卷云舒,淡观潮起潮落

使用 Spring 和 Hibernate 进行表分区

简介 {#简介}

本文将带你了解如何使用 Spring 和 Hibernate 实现表分区。

表分区的目标是将一个大型表分割为多个较小的分区表,以便关联的表和索引记录可以放入内存缓冲池,从而实现更高效的查找或扫描操作。

使用 PostgreSQL 进行表分区 {#使用-postgresql-进行表分区}

PostgreSQL 为 表分区 提供了三种策略:

  • 列表分区(List Partitioning)
  • 范围分区(Range Partitioning)
  • Hash 分区(Hash Partitioning)

本例使用列表分区,按大洲来划分数据表。

例如,users 分区如下:

CREATE TABLE users (
    id bigint NOT NULL,
    first_name varchar(255),
    last_name varchar(255),
    registered_on timestamp(6),
    partition_key varchar(255),
    PRIMARY KEY (id, partition_key)
)
PARTITION BY LIST (partition_key)

CREATE TABLE users_asia PARTITION OF users FOR VALUES IN ('Asia')

CREATE TABLE users_africa PARTITION OF users FOR VALUES IN ('Africa')

CREATE TABLE users_north_america PARTITION OF users FOR VALUES IN ('North America')

CREATE TABLE users_south_america PARTITION OF users FOR VALUES IN ('South America')

CREATE TABLE users_europe PARTITION OF users FOR VALUES IN ('Europe')

CREATE TABLE users_australia PARTITION OF users FOR VALUES IN ('Australia')

posts 表分区如下:

CREATE TABLE posts (
    id bigint NOT NULL,
    title varchar(255),
    created_on timestamp(6),
    user_id bigint,
    partition_key varchar(255),
    PRIMARY KEY (id, partition_key)
)
PARTITION BY LIST (partition_key)

CREATE TABLE posts_asia PARTITION OF posts FOR VALUES IN ('Asia')

CREATE TABLE posts_africa PARTITION OF posts FOR VALUES IN ('Africa')

CREATE TABLE posts_north_america PARTITION OF posts FOR VALUES IN ('North America')

CREATE TABLE posts_south_america PARTITION OF posts FOR VALUES IN ('South America')

CREATE TABLE posts_europe PARTITION OF posts FOR VALUES IN ('Europe')

CREATE TABLE posts_australia PARTITION OF posts FOR VALUES IN ('Australia')

ALTER TABLE IF EXISTS posts ADD CONSTRAINT fk_posts_user_id FOREIGN KEY (user_id, partition_key) REFERENCES users

用 JPA 和 Hibernate 映射表分区 {#用-jpa-和-hibernate-映射表分区}

使用 Spring 和 Hibernate 实现表分区需要在读写分区表的记录时提供分区 Key。

首先,创建一个 PartitionAware 基类,每个映射到分区表的 JPA 实体都将继承该基类:

@MappedSuperclass
@FilterDef(
    name = PartitionAware.PARTITION_KEY,
    parameters = @ParamDef(
        name = PartitionAware.PARTITION_KEY,
        type = String.class
    )
)
@Filter(
    name = PartitionAware.PARTITION_KEY,
    condition = "partition_key = :partitionKey"
)
public abstract class PartitionAware<T extends PartitionAware> {
public static final String PARTITION_KEY = &quot;partitionKey&quot;;

@Column(name = &quot;partition_key&quot;)
@PartitionKey
private String partitionKey;

public String getPartitionKey() {
    return partitionKey;
}

public T setPartitionKey(String partitionKey) {
    this.partitionKey = partitionKey;
    return (T) this;
}

public T setPartition(Partition partition) {
    this.partitionKey = partition.getKey();
    return (T) this;
}

}

  • @Filter 注解允许动态启用分区过滤功能
  • @PartitionKey 注解是在 Hibernate 6.2 中引入的,它允定义一个分区 key。

Partition 对象是一个枚举,用于定义应用支持的分区:

public enum Partition {
    ASIA("Asia"),
    AFRICA("Africa"),
    NORTH_AMERICA("North America"),
    SOUTH_AMERICA("South America"),
    EUROPE("Europe"),
    AUSTRALIA("Australia"),
    ;
private final String key;

Partition(String key) {
    this.key = key;
}

public String getKey() {
    return key;
}

}

@MappedSuperclass 注解用于表示实体类的父类。它可以用来定义多个实体类之间的公共映射和字段。它本身并不被视为一个实体,不能直接持久化到数据库中。

User 实体继承了 PartitionAware 类,如下:

@Entity
@Table(name = "users")
public class User extends PartitionAware<User> {
@Id
@GeneratedValue
private Long id;

@Column(name = &quot;first_name&quot;)
private String firstName;

@Column(name = &quot;last_name&quot;)
private String lastName;

@Column(name = &quot;registered_on&quot;)
@CreationTimestamp
private LocalDateTime createdOn = LocalDateTime.now();
 
// get / set 方法省略

}

Post 实体同样,如下:

@Entity
@Table(name = "posts")
public class Post extends PartitionAware<Post> {
@Id
@GeneratedValue
private Long id;

private String title;

@Column(name = &quot;created_on&quot;)
@CreationTimestamp
private LocalDateTime createdOn;

@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = &quot;user_id&quot;)
private User user;
 
// get / set 方法省略

}

现在,通过 Spring EntityManager 初始化机制在创建 JPA EntityManager 时默认启用分区过滤:

@Bean
public JpaTransactionManager transactionManager(
        EntityManagerFactory entityManagerFactory){
    JpaTransactionManager transactionManager = new JpaTransactionManager();
    transactionManager.setEntityManagerFactory(entityManagerFactory);
    transactionManager.setEntityManagerInitializer(entityManager -> {
        User user = UserContext.getCurrent();
        if (user != null) {
            entityManager.unwrap(Session.class)
                .enableFilter(PartitionAware.PARTITION_KEY)
                .setParameter(
                    PartitionAware.PARTITION_KEY,
                    user.getPartitionKey()
                );
        }
    });
    return transactionManager;
}

当前分区是根据当前登录的 User 设置的。由于启用了分区 Filter,每次实体查询都会从当前登录 User 的分区中选择 UserPost 实体。

创建以下 Hibernate PersistEventListener,在当前登录的 User 持久化的每个实体上设置 partition_key

public class PartitionAwareInsertEventListener
        implements PersistEventListener {
public static final PartitionAwareInsertEventListener INSTANCE =
    new PartitionAwareInsertEventListener();

@Override
public void onPersist(PersistEvent event)
        throws HibernateException {
    final Object entity = event.getObject();

    if (entity instanceof PartitionAware partitionAware) {
        if (partitionAware.getPartitionKey() == null) {
            FilterImpl partitionKeyFilter = (FilterImpl) event
                .getSession()
                .getEnabledFilter(PartitionAware.PARTITION_KEY);
            partitionAware.setPartitionKey(
                (String) partitionKeyFilter
                    .getParameter(PartitionAware.PARTITION_KEY)
            );
        }
    }
}

@Override
public void onPersist(PersistEvent event, PersistContext persistContext)
        throws HibernateException {
    onPersist(event);
}

}

使用以下 PartitionAwareEventListenerIntegrator 注册 PartitionAwareInsertEventListener

public class PartitionAwareEventListenerIntegrator
        implements Integrator {
public static final PartitionAwareEventListenerIntegrator INSTANCE =
    new PartitionAwareEventListenerIntegrator();

@Override
public void integrate(
        Metadata metadata,
        BootstrapContext bootstrapContext,
        SessionFactoryImplementor sessionFactory) {
    sessionFactory
        .getServiceRegistry()
        .getService(EventListenerRegistry.class)
        .prependListeners(
            EventType.PERSIST,
            PartitionAwareInsertEventListener.INSTANCE
        );
}

@Override
public void disintegrate(
    SessionFactoryImplementor sessionFactory,
    SessionFactoryServiceRegistry serviceRegistry) {
}

}

PartitionAwareEventListenerIntegrator 是通过基于 Java 的属性配置提供给 Spring 的:

properties.put(
    EntityManagerFactoryBuilderImpl.INTEGRATOR_PROVIDER,
    (IntegratorProvider) () -> List.of(
        PartitionAwareEventListenerIntegrator.INSTANCE
    )
);

就这样!

测试 {#测试}

持久化一个 User 和 3 个 Post 实体:

final User vlad = new User()
    .setFirstName("Vlad")
    .setLastName("Mihalcea")
    .setPartition(Partition.EUROPE);

userRepository.persist(vlad);

UserContext.logIn(vlad);

forumService.createPosts(LongStream.rangeClosed(1, POST_COUNT) .mapToObj(postId -> new Post() .setTitle( String.format( "High-Performance Java Persistence - Part %d", postId ) ) .setUser(vlad) ) .toList() );

Hibernate 将生成以下 SQL INSERT 语句:

INSERT INTO users (
    registered_on,
    first_name,
    last_name,
    partition_key,
    id
)
VALUES (
    '2023-11-09 11:22:55.802704',
    'Vlad, Mihalcea',
    'Europe',
    1
)
INSERT INTO posts (
    created_on,
    partition_key,
    title,
    user_id,
    id
)
VALUES (
    '2023-11-09 11:19:55.856126',
    'Europe',
    'High-Performance Java Persistence - Part 1',
    1,
    1
), (
    '2023-11-09 11:19:55.856126',
    'Europe',
    'High-Performance Java Persistence - Part 2',
    1,
    2
),(
    '2023-11-09 11:19:55.856126',
    'Europe',
    'High-Performance Java Persistence - Part 3',
    1,
    3
)

检索 Post 实体:

List<Post> posts = forumService.findByIds(
    LongStream.rangeClosed(1, POST_COUNT).boxed().toList()
);

Hibernate 将根据当前登录 User 的分区进行过滤:

SELECT
    p.id,
    p.created_on,
    p.partition_key,
    p.title,
    p.user_id
FROM
    posts p
WHERE
    p.partition_key = 'Europe' AND
    p.id in (1, 2, 3)

在对该 SQL 查询运行 EXPLAIN ANALYZE 时,可以看到只有 posts_europe 表被扫描:

Seq Scan on posts_europe p 
  (cost=0.00..11.14 rows=1 width=1056)
  (actual time=0.022..0.023 rows=3 loops=1)
  Filter: (
    ((partition_key)::text = 'Europe'::text) AND
    (id = ANY ('{1,2,3}'::bigint[]))
  )

参考:https://vladmihalcea.com/table-partitioning-spring-hibernate/

赞(5)
未经允许不得转载:工具盒子 » 使用 Spring 和 Hibernate 进行表分区