Xclbin to ELF flow migration #8581

Open
wants to merge 7 commits into
base: master
from

Conversation

rbramand-xilinx
Collaborator

@rbramand-xilinx rbramand-xilinx commented Oct 29, 2024

Problem solved by the commit

This PR enables a new flow in which an ELF file is the input instead of an xclbin.
spec - https://confluence.amd.com/display/AIE/AIE+Compiler+Artifacts#AIECompilerArtifacts-Flow2

spec of new config elf - https://confluence.amd.com/pages/viewpage.action?pageId=1485002156#Profile/Config/CtrlcodeElfSpec-Config.elf

This new ELF carries the partition size and the kernel signature.
It has multiple .ctrltext and .ctrldata sections and multiple .pdi sections.
Each .ctrltext section represents one control code; the pdi addresses and the kernel args need to be patched into the control code buffer.
The control code to run is selected when the xrt::kernel object is created, using a kernel name of the form <kernel_name>:<ctrl_code_id>. For example, "DPU:0" runs the control code from section .ctrltext0 and "DPU:1" runs the control code from section .ctrltext1.
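As a rough illustration of the section selection described above, the sketch below looks up the .ctrltext section for a given control code id. The `section_map` type and the `find_ctrltext` helper are hypothetical names used only for this example; they are not part of XRT, and the real parser in xrt::module may organize sections differently.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container: ELF section name -> raw section bytes
using section_map = std::map<std::string, std::vector<uint8_t>>;

// Hypothetical helper: returns the bytes of section ".ctrltext<id>",
// following the naming described in the PR text.
// Throws std::runtime_error if the section is not present.
const std::vector<uint8_t>&
find_ctrltext(const section_map& sections, uint32_t ctrl_code_id)
{
  const auto name = ".ctrltext" + std::to_string(ctrl_code_id);
  const auto it = sections.find(name);
  if (it == sections.end())
    throw std::runtime_error("no section " + name);
  return it->second;
}
```

With this sketch, a kernel name of "DPU:1" would end up calling `find_ctrltext(sections, 1)`.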

Sample test case :

#include <string>

#include "xrt/xrt_hw_context.h"
#include "xrt/xrt_kernel.h"
#include "xrt/xrt_elf.h"

int main(int argc, char** argv)
{
    if (argc < 2)
        return 1;   // expects the ELF path as the first argument

    std::string elf_path {argv[1]};   // elf path
    auto elf = xrt::elf(elf_path);

    auto device = xrt::device(0);
    auto ctx = xrt::hw_context(device, elf);

    std::string kernel_name = "DPU:1";
    auto kernel = xrt::ext::kernel(ctx, kernel_name);

    // create args; arg1, arg2, ... are placeholders for the kernel arguments
    auto run = kernel(arg1, arg2 ...);
    run.wait();

    return 0;
}

XRT first-class object changes:
xrt::hw_context - added new APIs to create a context from an ELF instead of an xclbin
xrt::kernel - added a new ext API to create a kernel object from a kernel name; added changes to construct kernel args from the kernel signature
xrt::elf - can now take the new ELF with OS ABI 70 as input
xrt::module - added changes to parse this new ELF; also refactored code

TODO: Refactor the code to parse the ELF file using the xrt::elf class, making it similar to xrt::xclbin. Created CR - https://jira.xilinx.com/browse/CR-1219757 to track this.

Bug / issue (if any) fixed, which PR introduced the bug, how it was discovered

This is a new feature

How problem was solved, alternative solutions (if any) and why they were rejected

A hw context can be created either with an xclbin (traditional flow) or with an xrt::elf. A new constructor is provided for the latter.
A new xrt::ext::kernel constructor is provided that takes the ctx and the kernel name as input.
Added code to parse the new ELF file and patch it according to the spec.
The rest of the flow remains the same.

Risks (if any) associated with the changes in the commit

Low to medium
Existing flows were tested, but more testing with all available test cases is needed.

What has been tested and how, request additional testing if necessary

Tested the new application flow on aie2p simnow. This requires flow changes in the amdxdna shim and firmware that are yet to be merged, so testing used local drops.

Tested existing test cases on phoenix hw (linux); the tests pass, so the existing flow is not broken.

TODO: check whether existing aie2ps test cases work

Documentation impact (if any)

Added doxygen comments in the code for the new APIs. We may need to document the new flow once it has stabilized.

@rbramand-xilinx rbramand-xilinx changed the title from "Enable new XRT test case flow without xclbin" to "Xclbin to ELF flow migration" Oct 29, 2024
Collaborator

@stsoe stsoe left a comment

Overall looks good. I am not completely done reviewing, but maybe address my first points and I will review again.

src/runtime_src/core/common/api/hw_context_int.h (outdated, resolved)
src/runtime_src/core/common/api/module_int.h (outdated, resolved)
src/runtime_src/core/common/api/xrt_hw_context.cpp (outdated, resolved)
src/runtime_src/core/common/api/xrt_hw_context.cpp (outdated, resolved)
src/runtime_src/core/common/api/xrt_hw_context.cpp (outdated, resolved)
src/runtime_src/core/common/api/xrt_kernel.cpp (outdated, resolved)
src/runtime_src/core/common/api/xrt_kernel.cpp (outdated, resolved)
Comment on lines 962 to 980
kernel_properties::mailbox_type
get_mailbox_from_ini(const std::string& kname)
{
static auto mailbox_kernels = xrt_core::config::get_mailbox_kernels();
return (mailbox_kernels.find("/" + kname + "/") != std::string::npos)
? xrt_core::xclbin::kernel_properties::mailbox_type::inout
: xrt_core::xclbin::kernel_properties::mailbox_type::none;
}

// Kernel auto restart counter offset
// Needed until meta-data support (Vitis-1147)
kernel_properties::restart_type
get_restart_from_ini(const std::string& kname)
{
static auto restart_kernels = xrt_core::config::get_auto_restart_kernels();
return (restart_kernels.find("/" + kname + "/") != std::string::npos)
? 1
: 0;
}
Collaborator

I would rather not. Instead of rebuilding the properties, please cache a pointer to the existing properties that have all the information
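A minimal sketch of the suggestion above, caching a pointer to properties that already exist instead of rebuilding them. All types here (`kernel_properties`, `metadata`, `kernel_impl`) are hypothetical stand-ins for illustration and do not reflect the actual XRT classes.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the kernel properties under discussion
struct kernel_properties
{
  std::string name;
  int mailbox;
  int restart;
};

// Hypothetical owner of the already-built properties
class metadata
{
  kernel_properties m_props;
public:
  explicit metadata(kernel_properties props) : m_props(std::move(props)) {}
  const kernel_properties& get_properties() const { return m_props; }
};

// The consumer keeps a pointer to the existing properties
// rather than constructing a second copy from the ini settings.
class kernel_impl
{
  const kernel_properties* m_props;
public:
  explicit kernel_impl(const metadata& md) : m_props(&md.get_properties()) {}
  const kernel_properties& properties() const { return *m_props; }
};
```

The trade-off in this sketch is lifetime: the `metadata` object must outlive every `kernel_impl` that points into it.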

Collaborator

Do we need to export these functions?

Comment on lines 983 to 988
bool
get_sw_reset_from_ini(const std::string& kname)
{
static auto reset_kernels = xrt_core::config::get_sw_reset_kernels();
return (reset_kernels.find("/" + kname + "/") != std::string::npos);
}
Collaborator

ditto

Collaborator

Same export question
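The three get_*_from_ini helpers quoted above share one lookup pattern: the kernel name is wrapped in '/' before searching the configured string. A small sketch of why the delimiters matter follows; `is_listed` is a hypothetical name, and the "/name1/name2/" list format is an assumption inferred from the quoted code, not from XRT documentation.

```cpp
#include <string>

// Hypothetical helper mirroring the lookup in the quoted snippets.
// `list` is assumed to hold kernel names wrapped in '/' delimiters,
// for example "/add/mult/".
bool
is_listed(const std::string& list, const std::string& kname)
{
  // Searching for "/<kname>/" rather than "<kname>" prevents a name
  // from matching when it is only a substring of another listed name.
  return list.find("/" + kname + "/") != std::string::npos;
}
```

For instance, with the list "/adder/" a plain substring search for "add" would match, while the delimited search in this sketch does not.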

Comment on lines 427 to 435
kernel_properties::mailbox_type
get_mailbox_from_ini(const std::string& kname);

kernel_properties::restart_type
get_restart_from_ini(const std::string& kname);

bool
get_sw_reset_from_ini(const std::string& kname);

Collaborator

Let's avoid.

Collaborator

Still open?

Collaborator Author

I need these functions when creating properties in the ELF flow as well, @stsoe. Should I use static functions there too? I moved these functions into the header to avoid code duplication.

@AShivangi AShivangi requested a review from hlaccabu October 29, 2024 17:19
@AShivangi
Collaborator

@hlaccabu recently added some code to enable the ELF flow in xrt-smi validate for npu3. Please work with him to make sure that this change doesn't break the existing code.

Collaborator

@hlaccabu hlaccabu left a comment

good to go from driver perspective. As long as driver's xrt submodule (and corresponding xrt_plugin.deb) contains these changes as well as the XRT .deb.

@rbramand-xilinx
Collaborator Author

good to go from driver perspective. As long as driver's xrt submodule (and corresponding xrt_plugin.deb) contains these changes as well as the XRT .deb.

@hlaccabu can you please check if your feature doesn't break with these changes

@hlaccabu
Collaborator

good to go from driver perspective. As long as driver's xrt submodule (and corresponding xrt_plugin.deb) contains these changes as well as the XRT .deb.

@hlaccabu can you please check if your feature doesn't break with these changes

Yep you're all set, the xrt-smi tests that I enabled for npu3 still run

@rbramand-xilinx rbramand-xilinx force-pushed the xclbin_to_elf1 branch 2 times, most recently from ca47bb5 to a899344 Compare November 4, 2024 12:12
@rbramand-xilinx
Collaborator Author

Hi @stsoe, I created CR - https://jira.xilinx.com/browse/CR-1219757 for refactoring the existing code to move ELF parsing into the xrt::elf object and make it similar to xrt::xclbin.
Please review this PR; I will do the refactoring in the next set of PRs.

@rbramand-xilinx rbramand-xilinx requested a review from stsoe November 4, 2024 12:26
@larry9523
Collaborator

@rbramand-xilinx, did we run any existing preemption tests to make sure we won't break any?

@rbramand-xilinx
Collaborator Author

@rbramand-xilinx, did we run any existing preemption tests to make sure we won't break any?

Hi @larry9523, I ran the preemption test cases with my changes, found one bug, and fixed it. Thanks

Collaborator

@stsoe stsoe left a comment

Some more comments. I did not review xrt_module.cpp.

src/runtime_src/core/common/api/module_int.h (resolved)
src/runtime_src/core/common/api/xrt_hw_context.cpp (outdated, resolved)
src/runtime_src/core/common/api/xrt_hw_context.cpp (outdated, resolved)
src/runtime_src/core/common/api/xrt_hw_context.cpp (outdated, resolved)
src/runtime_src/core/common/api/xrt_hw_context.cpp (outdated, resolved)
Comment on lines 1607 to 1594
// initialize kernel name and ctrl code index
auto i = nm.find(":");
if (i == std::string::npos) {
// default case - ctrl code 0 will be used
name = nm.substr(0, nm.size());
m_ctrl_code_index = 0;
}
else {
name = nm.substr(0, i);
m_ctrl_code_index = std::stoul(nm.substr(i+1, nm.size()-i-1));
}
Collaborator

Uhh, this is a little hard to parse.
This is the second place name is parsed at :. Can this parsing be refactored into, say,

static std::pair<std::string, uint32_t>
get_kernel_name_and_ctrl_code_index(const std::string& nm)
{
   ...
   return {name, index};
}

Then use as

std::tie(name, m_ctrl_code_index) = get_kernel_name_and_ctrl_code_index(nm);

Maybe choose a better function name for the refactored code.

Collaborator Author

I will refactor the code, Soren. My attempt was to use the initializer list for most of the class members, but yes, it does not look good.

src/runtime_src/core/include/experimental/xrt_module.h (outdated, resolved)
@rbramand-xilinx rbramand-xilinx force-pushed the xclbin_to_elf1 branch 3 times, most recently from f3576d4 to fccc82e Compare November 12, 2024 10:07
Collaborator

@stsoe stsoe left a comment

Looks good now. Let me know when you have resolved if we really need to export internal xclbin_parser functions or not.

@rbramand-xilinx
Collaborator Author

Looks good now. Let me know when you have resolved if we really need to export internal xclbin_parser functions or not.

Hi @stsoe, we don't need those functions to be exported. I have reverted the change and verified that the test case passes.
Thanks for the insights.

rbramand added 6 commits November 14, 2024 15:02
Signed-off-by: rbramand <rbramand@amd.com>
Signed-off-by: rbramand <rbramand@amd.com>
Signed-off-by: rbramand <rbramand@amd.com>
Signed-off-by: rbramand <rbramand@amd.com>
Signed-off-by: rbramand <rbramand@amd.com>
Signed-off-by: rbramand <rbramand@amd.com>
Signed-off-by: rbramand <rbramand@amd.com>
5 participants