简介 {#简介}
本文将带你了解如何使用 Spring 和 Hibernate 实现表分区。
表分区的目标是将一个大型表分割为多个较小的分区表,以便关联的表和索引记录可以放入内存缓冲池,从而实现更高效的查找或扫描操作。
使用 PostgreSQL 进行表分区 {#使用-postgresql-进行表分区}
PostgreSQL 为 表分区 提供了三种策略:
- 列表分区(List Partitioning)
- 范围分区(Range Partitioning)
- Hash 分区(Hash Partitioning)
本例使用列表分区,按大洲来划分数据表。
例如,users
分区如下:
CREATE TABLE users ( id bigint NOT NULL, first_name varchar(255), last_name varchar(255), registered_on timestamp(6), partition_key varchar(255), PRIMARY KEY (id, partition_key) ) PARTITION BY LIST (partition_key)
CREATE TABLE users_asia PARTITION OF users FOR VALUES IN ('Asia')
CREATE TABLE users_africa PARTITION OF users FOR VALUES IN ('Africa')
CREATE TABLE users_north_america PARTITION OF users FOR VALUES IN ('North America')
CREATE TABLE users_south_america PARTITION OF users FOR VALUES IN ('South America')
CREATE TABLE users_europe PARTITION OF users FOR VALUES IN ('Europe')
CREATE TABLE users_australia PARTITION OF users FOR VALUES IN ('Australia')
posts
表分区如下:
CREATE TABLE posts ( id bigint NOT NULL, title varchar(255), created_on timestamp(6), user_id bigint, partition_key varchar(255), PRIMARY KEY (id, partition_key) ) PARTITION BY LIST (partition_key)
CREATE TABLE posts_asia PARTITION OF posts FOR VALUES IN ('Asia')
CREATE TABLE posts_africa PARTITION OF posts FOR VALUES IN ('Africa')
CREATE TABLE posts_north_america PARTITION OF posts FOR VALUES IN ('North America')
CREATE TABLE posts_south_america PARTITION OF posts FOR VALUES IN ('South America')
CREATE TABLE posts_europe PARTITION OF posts FOR VALUES IN ('Europe')
CREATE TABLE posts_australia PARTITION OF posts FOR VALUES IN ('Australia')
ALTER TABLE IF EXISTS posts ADD CONSTRAINT fk_posts_user_id FOREIGN KEY (user_id, partition_key) REFERENCES users
用 JPA 和 Hibernate 映射表分区 {#用-jpa-和-hibernate-映射表分区}
使用 Spring 和 Hibernate 实现表分区需要在读写分区表的记录时提供分区 Key。
首先,创建一个 PartitionAware
基类,每个映射到分区表的 JPA 实体都将继承该基类:
@MappedSuperclass @FilterDef( name = PartitionAware.PARTITION_KEY, parameters = @ParamDef( name = PartitionAware.PARTITION_KEY, type = String.class ) ) @Filter( name = PartitionAware.PARTITION_KEY, condition = "partition_key = :partitionKey" ) public abstract class PartitionAware<T extends PartitionAware> {
public static final String PARTITION_KEY = "partitionKey"; @Column(name = "partition_key") @PartitionKey private String partitionKey; public String getPartitionKey() { return partitionKey; } public T setPartitionKey(String partitionKey) { this.partitionKey = partitionKey; return (T) this; } public T setPartition(Partition partition) { this.partitionKey = partition.getKey(); return (T) this; }
}
@Filter
注解允许动态启用分区过滤功能@PartitionKey
注解是在 Hibernate 6.2 中引入的,它允定义一个分区 key。
Partition
对象是一个枚举,用于定义应用支持的分区:
public enum Partition { ASIA("Asia"), AFRICA("Africa"), NORTH_AMERICA("North America"), SOUTH_AMERICA("South America"), EUROPE("Europe"), AUSTRALIA("Australia"), ;
private final String key; Partition(String key) { this.key = key; } public String getKey() { return key; }
}
@MappedSuperclass
注解用于表示实体类的父类。它可以用来定义多个实体类之间的公共映射和字段。它本身并不被视为一个实体,不能直接持久化到数据库中。
User
实体继承了 PartitionAware
类,如下:
@Entity @Table(name = "users") public class User extends PartitionAware<User> {
@Id @GeneratedValue private Long id; @Column(name = "first_name") private String firstName; @Column(name = "last_name") private String lastName; @Column(name = "registered_on") @CreationTimestamp private LocalDateTime createdOn = LocalDateTime.now(); // get / set 方法省略
}
Post
实体同样,如下:
@Entity @Table(name = "posts") public class Post extends PartitionAware<Post> {
@Id @GeneratedValue private Long id; private String title; @Column(name = "created_on") @CreationTimestamp private LocalDateTime createdOn; @ManyToOne(fetch = FetchType.LAZY) @JoinColumn(name = "user_id") private User user; // get / set 方法省略
}
现在,通过 Spring EntityManager
初始化机制在创建 JPA EntityManager
时默认启用分区过滤:
@Bean
public JpaTransactionManager transactionManager(
EntityManagerFactory entityManagerFactory){
JpaTransactionManager transactionManager = new JpaTransactionManager();
transactionManager.setEntityManagerFactory(entityManagerFactory);
transactionManager.setEntityManagerInitializer(entityManager -> {
User user = UserContext.getCurrent();
if (user != null) {
entityManager.unwrap(Session.class)
.enableFilter(PartitionAware.PARTITION_KEY)
.setParameter(
PartitionAware.PARTITION_KEY,
user.getPartitionKey()
);
}
});
return transactionManager;
}
当前分区是根据当前登录的 User
设置的。由于启用了分区 Filter
,每次实体查询都会从当前登录 User
的分区中选择 User
和 Post
实体。
创建以下 Hibernate PersistEventListener
,在当前登录的 User
持久化的每个实体上设置 partition_key
列
public class PartitionAwareInsertEventListener implements PersistEventListener {
public static final PartitionAwareInsertEventListener INSTANCE = new PartitionAwareInsertEventListener(); @Override public void onPersist(PersistEvent event) throws HibernateException { final Object entity = event.getObject(); if (entity instanceof PartitionAware partitionAware) { if (partitionAware.getPartitionKey() == null) { FilterImpl partitionKeyFilter = (FilterImpl) event .getSession() .getEnabledFilter(PartitionAware.PARTITION_KEY); partitionAware.setPartitionKey( (String) partitionKeyFilter .getParameter(PartitionAware.PARTITION_KEY) ); } } } @Override public void onPersist(PersistEvent event, PersistContext persistContext) throws HibernateException { onPersist(event); }
}
使用以下 PartitionAwareEventListenerIntegrator
注册 PartitionAwareInsertEventListener
:
public class PartitionAwareEventListenerIntegrator implements Integrator {
public static final PartitionAwareEventListenerIntegrator INSTANCE = new PartitionAwareEventListenerIntegrator(); @Override public void integrate( Metadata metadata, BootstrapContext bootstrapContext, SessionFactoryImplementor sessionFactory) { sessionFactory .getServiceRegistry() .getService(EventListenerRegistry.class) .prependListeners( EventType.PERSIST, PartitionAwareInsertEventListener.INSTANCE ); } @Override public void disintegrate( SessionFactoryImplementor sessionFactory, SessionFactoryServiceRegistry serviceRegistry) { }
}
PartitionAwareEventListenerIntegrator
是通过基于 Java 的属性配置提供给 Spring 的:
properties.put(
EntityManagerFactoryBuilderImpl.INTEGRATOR_PROVIDER,
(IntegratorProvider) () -> List.of(
PartitionAwareEventListenerIntegrator.INSTANCE
)
);
就这样!
测试 {#测试}
持久化一个 User
和 3 个 Post
实体:
final User vlad = new User() .setFirstName("Vlad") .setLastName("Mihalcea") .setPartition(Partition.EUROPE);
userRepository.persist(vlad);
UserContext.logIn(vlad);
forumService.createPosts(LongStream.rangeClosed(1, POST_COUNT) .mapToObj(postId -> new Post() .setTitle( String.format( "High-Performance Java Persistence - Part %d", postId ) ) .setUser(vlad) ) .toList() );
Hibernate 将生成以下 SQL INSERT
语句:
INSERT INTO users (
registered_on,
first_name,
last_name,
partition_key,
id
)
VALUES (
'2023-11-09 11:22:55.802704',
'Vlad, Mihalcea',
'Europe',
1
)
INSERT INTO posts (
created_on,
partition_key,
title,
user_id,
id
)
VALUES (
'2023-11-09 11:19:55.856126',
'Europe',
'High-Performance Java Persistence - Part 1',
1,
1
), (
'2023-11-09 11:19:55.856126',
'Europe',
'High-Performance Java Persistence - Part 2',
1,
2
),(
'2023-11-09 11:19:55.856126',
'Europe',
'High-Performance Java Persistence - Part 3',
1,
3
)
检索 Post
实体:
List<Post> posts = forumService.findByIds(
LongStream.rangeClosed(1, POST_COUNT).boxed().toList()
);
Hibernate 将根据当前登录 User
的分区进行过滤:
SELECT
p.id,
p.created_on,
p.partition_key,
p.title,
p.user_id
FROM
posts p
WHERE
p.partition_key = 'Europe' AND
p.id in (1, 2, 3)
在对该 SQL 查询运行 EXPLAIN ANALYZE
时,可以看到只有 posts_europe
表被扫描:
Seq Scan on posts_europe p
(cost=0.00..11.14 rows=1 width=1056)
(actual time=0.022..0.023 rows=3 loops=1)
Filter: (
((partition_key)::text = 'Europe'::text) AND
(id = ANY ('{1,2,3}'::bigint[]))
)
参考:https://vladmihalcea.com/table-partitioning-spring-hibernate/