B+ tree optimizations #497

Open
wants to merge 3 commits into
base: main
@Lss-lmj Lss-lmj commented Dec 26, 2024

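1. Lock management optimization

  • Goal: improve concurrency, avoid deadlocks, and shorten lock hold times.
  • Details
    • In insert_entry_into_parent, when creating a new root node, release the child nodes' locks first (disk_buffer_pool_->unpin_page(frame); disk_buffer_pool_->unpin_page(new_frame);), then lock and update the root node. This avoids holding the child locks for the whole root-creation step, which improves concurrency: by the time the new root is created, the child modifications are complete and their locks are no longer needed.
    • Before the recursive call to insert_entry_into_parent, release the current parent node's lock (disk_buffer_pool_->unpin_page(parent_frame);). The current parent's work is done at that point, so holding its lock through the recursion is unnecessary; releasing it shortens the hold time and improves concurrency.
    • In split, the initialization of the new node is extracted into an initializeNewNode function, so that lock acquisition and node initialization are clearly separated and the scope of each lock is easier to understand and maintain.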
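The ordering described in the first bullet (finish the child updates, release the child locks, then lock the root) can be sketched as follows. This is only an illustration under stated assumptions: `Node` and `attach_new_root` are hypothetical names, and the sketch uses `std::mutex` rather than the project's Frame and disk_buffer_pool_ calls, whose exact semantics are not shown here.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a tree node; not the project's Frame type.
struct Node {
  std::mutex latch;
  Node *parent = nullptr;
  Node *left   = nullptr;
  Node *right  = nullptr;
};

// Links `left` and `right` under `root`. The caller hands over guards it
// already holds on both children; they are released before the root is locked.
void attach_new_root(Node &root, Node &left, Node &right,
                     std::unique_lock<std::mutex> left_guard,
                     std::unique_lock<std::mutex> right_guard)
{
  assert(left_guard.owns_lock() && right_guard.owns_lock());

  // Child updates happen while the child latches are still held.
  left.parent  = &root;
  right.parent = &root;

  // Release the children first, so they are not held during the root update.
  left_guard.unlock();
  right_guard.unlock();

  // Only the root latch is held for the root update.
  std::lock_guard<std::mutex> root_guard(root.latch);
  root.left  = &left;
  root.right = &right;
}
```

Whether releasing the children this early is safe depends on the tree's latching protocol: other threads can reach the children between the two steps, so the sketch shows the ordering only, not a proof that it is deadlock-free or race-free in a real tree.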

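2. Resource management optimization

  • Goal: ensure resources are allocated and released correctly, avoid leaks, and use resources more efficiently.
  • Details
    • In make_key, use the auto keyword for the return value of mem_pool_item_->alloc_unique_ptr(). Also add a try-catch block to catch exceptions that the memcpy operation might throw, handling errors during the memory copy in finer detail. If the copy fails, the function logs an error and returns nullptr instead of a partially initialized pointer, so later use of that pointer cannot cause undefined behavior.
    • In open, when opening the file fails, add a more detailed error log (LOG_ERROR("Failed to fully open b+tree. File name: %s, Error code: %d:%s", file_name, rc, strrc(rc));) to make troubleshooting easier. For error cases, comments can also suggest more detailed recovery steps, such as cleaning up partially allocated resources. These are only comments for now, but they give a direction for later work.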

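3. Replacing raw pointers for memory management

  • Goal: use smart pointers instead of raw pointers to simplify memory management, avoid errors from manual deallocation, and ensure memory safety.
  • Details: in make_key, use std::unique_ptr (returned by mem_pool_item_->alloc_unique_ptr()) to manage the dynamically allocated memory. A std::unique_ptr automatically releases the memory it owns when its lifetime ends, which avoids the leaks and dangling pointers that manual deallocation can cause. This makes memory management safer and more reliable and reduces the errors introduced by mishandled memory.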
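A minimal sketch of a make_key-style helper that returns a std::unique_ptr is shown below. It is not the project's implementation: `make_key_sketch` is a hypothetical name, and plain `new (std::nothrow)` stands in for mem_pool_item_->alloc_unique_ptr(), whose signature is not shown here. Note that `std::memcpy` reports no errors and throws no exceptions, so the sketch validates its inputs before copying instead of wrapping the copy in try-catch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>

// Copies `attr_length` bytes of `user_key` into freshly owned memory.
// Returns nullptr on bad input or allocation failure, never a partial key.
std::unique_ptr<char[]> make_key_sketch(const char *user_key, std::size_t attr_length)
{
  if (user_key == nullptr || attr_length == 0) {
    return nullptr;  // nothing valid to copy
  }

  // Assumption: plain heap allocation replaces the project's memory pool.
  std::unique_ptr<char[]> key(new (std::nothrow) char[attr_length]);
  if (!key) {
    return nullptr;  // allocation failed
  }

  // memcpy cannot fail once both pointers and the length are valid.
  std::memcpy(key.get(), user_key, attr_length);
  return key;  // ownership moves to the caller; freed automatically
}
```

The caller never calls delete: the buffer is released when the returned std::unique_ptr goes out of scope.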

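4. Performance optimization of insert_entry_into_leaf_node

  • Goal: reduce the cost of node splits and improve overall insert performance.
  • Details
    • Pre-allocation: added pre-allocation logic for nearly full nodes. When leaf_node.size() >= leaf_node.max_size() - 5, split is called ahead of time to allocate a new node. A node split is relatively expensive, since it involves moving node data and creating a new node. With the split done in advance, no split is needed at the moment the node actually fills up, and data can be inserted directly into the new node. This avoids the bottleneck that splitting only on a full node can cause under highly concurrent inserts, reducing overhead and improving overall insert performance.
    • Less work when a node is full: once the new node is pre-allocated, later inserts use the insert position to decide whether data goes into the original node or the new one. This removes the wait for a split at the moment the node fills up and improves insert response time. It also balances node load to some extent, avoiding the performance hit of some nodes splitting frequently.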
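The early-split idea above can be illustrated with a toy in-memory leaf. Everything here is an assumption made for the example: `ToyLeaf`, `split_leaf`, and `insert_with_presplit` are hypothetical names, keys are plain ints in a sorted vector, and no locking or buffer pool is involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-memory leaf; the real leaf lives in a buffer-pool frame.
struct ToyLeaf {
  std::vector<int> keys;  // kept sorted
};

// Moves the upper half of `leaf` into a new sibling and returns the sibling.
ToyLeaf split_leaf(ToyLeaf &leaf)
{
  ToyLeaf sibling;
  const auto mid = static_cast<std::ptrdiff_t>(leaf.keys.size() / 2);
  sibling.keys.assign(leaf.keys.begin() + mid, leaf.keys.end());
  leaf.keys.resize(static_cast<std::size_t>(mid));
  return sibling;
}

// Inserts `key`, splitting early once size >= max_size - margin.
// Returns true if a split happened; `sibling` then holds the new leaf.
bool insert_with_presplit(ToyLeaf &leaf, ToyLeaf &sibling, int key,
                          std::size_t max_size, std::size_t margin)
{
  assert(max_size > 0);
  // If the leaf is too small for the margin, split only when it is full.
  const std::size_t threshold = max_size > margin ? max_size - margin : max_size;

  bool     did_split = false;
  ToyLeaf *target    = &leaf;
  if (leaf.keys.size() >= threshold) {
    sibling   = split_leaf(leaf);
    did_split = true;
    // Route the key to whichever half now covers it.
    if (!sibling.keys.empty() && key >= sibling.keys.front()) {
      target = &sibling;
    }
  }
  auto pos = std::lower_bound(target->keys.begin(), target->keys.end(), key);
  target->keys.insert(pos, key);
  return did_split;
}
```

One trade-off is visible even in the toy version: a leaf that splits before it is full leaves both halves emptier than a split at capacity would, so the early split exchanges space utilization for moving the split cost off the insert that fills the node.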

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@nautaa nautaa self-requested a review December 26, 2024 11:01
@@ -916,6 +916,8 @@ RC BplusTreeHandler::open(LogHandler &log_handler, BufferPoolManager &bpm, const
if (OB_FAIL(rc)) {
LOG_WARN("Failed to open file name=%s, rc=%d:%s", file_name, rc, strrc(rc));
return rc;
}else {
LOG_ERROR("Failed to fully open b+tree. File name: %s, Error code: %d:%s", file_name, rc, strrc(rc));

Why LOG_ERROR here?

@@ -1258,6 +1260,15 @@ RC BplusTreeHandler::crabing_protocal_fetch_page(
return rc;
}


/*
Optimization note @sjk

can we remove unnecessary comments @sjk ?

if (key == nullptr) {
LOG_WARN("Failed to alloc memory for key.");
return nullptr;
auto key = mem_pool_item_->alloc_unique_ptr();

There is no need to change this to auto.

return nullptr;
}
try {
memcpy(key.get(), user_key, file_header_.attr_length);

Why might memcpy fail? What is the scenario?


/*
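Optimization note
When creating a new root node, the original code released the child node locks only after updating the root node information. The optimized version releases the child locks first and then locks and updates the root, which shortens the time the child locks are held and improves concurrency.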

Comments are not appropriate here.

template <typename IndexNodeHandlerType>
void BplusTreeHandler::initializeNewNode(BplusTreeMiniTransaction &mtr, Frame *new_frame, IndexNodeHandlerType &old_node)
{
IndexNodeHandlerType new_node(mtr, file_header_, new_frame);

Why does new_node appear both inside and outside the function?


Frame *new_frame = nullptr;
// Pre-allocation logic: allocate a new node ahead of time when the node is nearly full
if (leaf_node.size() < leaf_node.max_size() - 5) {

Should this be >=?

Is hard-coding 5 OK? What if max_size() <= 5?
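One way to address both questions above (the comparison direction and a small max_size()) is to compute the threshold in one place. This is a sketch, not the project's code: `presplit_threshold` and `should_presplit` are hypothetical helpers, and the int parameters are an assumption about the types returned by size() and max_size().

```cpp
#include <cassert>

// Hypothetical helper: entry count at which an early split should trigger.
// Falls back to max_size (split only when full) if the margin does not fit.
int presplit_threshold(int max_size, int margin)
{
  assert(max_size > 0);
  if (margin < 0 || max_size <= margin) {
    return max_size;  // node too small for an early split
  }
  return max_size - margin;
}

// True when a leaf holding `size` entries should be split ahead of time.
// Uses >=, matching the condition stated in the PR description.
bool should_presplit(int size, int max_size, int margin = 5)
{
  return size >= presplit_threshold(max_size, margin);
}
```

With the fallback, a node whose max_size is 5 or less never sees a zero or negative threshold, and the margin becomes a named parameter rather than a literal in the condition.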

@nautaa nautaa commented Dec 26, 2024

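@Lss-lmj It looks like you have made some performance optimizations; could you share some numbers? It also looks like your changes introduced a deadlock, and CI did not pass.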

3 participants