Kernel-Level Precision Data Collection in PyTorch¶
Overview¶
This document provides a configuration example for kernel-level precision data collection and describes the files it produces. For details about how to use the msProbe data collection function, see Precision Data Collection in PyTorch.
Preparations¶
Environment Setup
Install msProbe by following the msProbe Installation Guide.
Constraints
Only the PyTorch framework is supported.
Quick Start¶
See PyTorch Precision Data Collection - Quick Start.
Kernel Dump Configuration¶
When using kernel dump, set task to tensor and set list to exactly one API name. Kernel dump currently collects data for only one API per step.
The API name follows the format {api_type}.{api_name}.{Number of API calls}.{forward/backward}. To find the exact name of an API, look it up in dump.json, the result file of an L1 dump.
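The naming format above can be checked before a name is placed in list. The following is a minimal sketch, not part of msProbe: the ApiName type and the parse_api_name helper are hypothetical names introduced here, and the sketch assumes the call count is a non-negative integer and that the direction is either forward or backward.

```python
from typing import NamedTuple


class ApiName(NamedTuple):
    """The four fields of a dump API name (hypothetical helper type)."""
    api_type: str
    api_name: str
    call_count: int
    direction: str


def parse_api_name(name: str) -> ApiName:
    """Split '{api_type}.{api_name}.{count}.{forward/backward}' into fields."""
    # Split from the right so that an api_name containing dots stays intact.
    # Whether such names occur is an assumption; the format itself is from the text.
    parts = name.rsplit(".", 2)
    if len(parts) != 3:
        raise ValueError(f"expected four dot-separated fields: {name!r}")
    head, count, direction = parts
    api_type, _, api_name = head.partition(".")
    if not api_type or not api_name:
        raise ValueError(f"missing api_type or api_name: {name!r}")
    if not count.isdigit():
        raise ValueError(f"call count is not a non-negative integer: {name!r}")
    if direction not in ("forward", "backward"):
        raise ValueError(f"direction must be forward or backward: {name!r}")
    return ApiName(api_type, api_name, int(count), direction)


if __name__ == "__main__":
    print(parse_api_name("Functional.linear.0.backward"))
```

A name that fails this check is likely to be rejected or silently ignored by the tool, so running it on the value copied from dump.json is a cheap first diagnostic.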
Example:
{
    "task": "tensor",
    "dump_path": "./dump_path",
    "level": "L2",
    "rank": [],
    "step": [],
    "tensor": {
        "scope": [],
        "list": ["Functional.linear.0.backward"]
    }
}
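If the configuration is generated by a script rather than written by hand, the one-API-per-step constraint can be enforced at build time. The sketch below is illustrative only: build_kernel_dump_config and render_config are hypothetical helper names, and the field values simply mirror the example above.

```python
import json


def build_kernel_dump_config(api_name: str, dump_path: str = "./dump_path") -> dict:
    """Return a kernel dump configuration that targets exactly one API."""
    if not isinstance(api_name, str) or not api_name:
        raise ValueError("kernel dump needs exactly one non-empty API name")
    return {
        "task": "tensor",
        "dump_path": dump_path,
        "level": "L2",
        "rank": [],
        "step": [],
        "tensor": {
            "scope": [],
            # A single-element list, because only one API is collected per step.
            "list": [api_name],
        },
    }


def render_config(config: dict) -> str:
    """Serialize the configuration as indented JSON text."""
    return json.dumps(config, indent=4)


if __name__ == "__main__":
    print(render_config(build_kernel_dump_config("Functional.linear.0.backward")))
```

The printed text can be saved to the configuration file that is passed to msProbe. Taking a single string instead of a list makes it impossible to request several APIs by accident.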
Dump Result File¶
Data Collection Results¶
If the kernel-level data collection is successful, the following information is displayed:
Note: If no data is generated even though this information is printed, see the FAQs for troubleshooting.
If the kernel dump encounters an unsupported API, the following information is displayed:
{api_name} indicates the API name.
Output File Description¶
After kernel-level data collection is successful, the following files are generated in the specified dump_path directory:
├── /home/data_dump/
│ ├── step0
│ │ ├── 20241201103000 # Date and time format, indicating 10:30:00, 2024-12-01.
│ │ │ ├── 0 # Device ID
│ │ │ │ ├──{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp} # Kernel-layer operator data
│ │ │ ...
│ │ ├── kernel_config_{device_id}.json # Intermediate file generated when the kernel dump interface is called. It can usually be ignored.
│ │ ...
│ ├── step1
│ ...
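To inspect a finished collection, the layout above can be walked with a short script. This is a minimal sketch under the assumption that the directory structure is exactly as shown (step directory, timestamp directory, device ID directory, operator files); list_kernel_dumps is a hypothetical helper, not an msProbe function.

```python
from datetime import datetime
from pathlib import Path


def list_kernel_dumps(dump_path):
    """Return (step, collected_at, device_id, file_name) for each operator file."""
    results = []
    # Names sort lexicographically, so step10 is listed before step2.
    for step_dir in sorted(Path(dump_path).glob("step*")):
        if not step_dir.is_dir():
            continue
        for time_dir in sorted(p for p in step_dir.iterdir() if p.is_dir()):
            try:
                # Directory names such as 20241201103000 encode date and time.
                collected_at = datetime.strptime(time_dir.name, "%Y%m%d%H%M%S")
            except ValueError:
                continue  # Not a timestamp directory; skip it.
            for device_dir in sorted(p for p in time_dir.iterdir() if p.is_dir()):
                for data_file in sorted(p for p in device_dir.iterdir() if p.is_file()):
                    results.append(
                        (step_dir.name, collected_at, device_dir.name, data_file.name)
                    )
    return results


if __name__ == "__main__":
    for step, collected_at, device_id, file_name in list_kernel_dumps("./dump_path"):
        print(step, collected_at.isoformat(), device_id, file_name)
```

Files such as kernel_config_{device_id}.json sit directly under the step directory and are skipped, because only regular files inside a device ID directory are reported. An empty result means no operator data was written.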
Appendix¶
FAQs¶
Why is the collection result file empty?¶
- Check whether the tool usage, configuration file content, and API name format in list are correct.
- Check whether the API runs on the Ascend NPU. If it runs on another device, no kernel-level data is generated.
- If the problem persists, collect kernel-level data directly with the torch_npu.npu API provided by the Ascend Extension for PyTorch plugin. The kernel dump function of the tool is built on the init_dump, set_dump, and finalize_dump sub-interfaces of that API. For details, see torch_npu.npu.