Precision Data Collection Baseline in MindSpore

Time Overhead Baseline for Data Collection in "statistics" Mode (MD5 Disabled)

This baseline is a reference for the slowdown to expect when data is collected in "statistics" mode in the MindSpore framework. It shows how the time required by a 38B language model running on eight ranks grows in each collection mode, relative to running without the tool.

| Collection Mode | Without Tool (Time Required) | With Tool but Dump Disabled (Time Required) | With Tool and Dump Enabled (Time Required) |
|---|---|---|---|
| L0 | ≈ 340 ms | ≈ 340 ms (no change) | ≈ 1.2 s (3.5x) |
| L1 | ≈ 340 ms | ≈ 0.7–1.2 s (2–4x) | ≈ 3.8 s (11x) |
| mix | ≈ 340 ms | ≈ 0.7–1.2 s (2–4x) | ≈ 5.5 s (16x) |
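
The multipliers in parentheses are the measured time divided by the time without the tool. As a minimal sketch of that arithmetic, the snippet below recomputes the factors for the dump-enabled column from the figures in the table above. The names `BASELINE_S`, `DUMP_ENABLED_S`, and `slowdown` are hypothetical and are not part of any MindSpore or tool API.

```python
# Times taken from the table above, in seconds.
# 0.34 s is the approximate time required without the tool.
BASELINE_S = 0.34

# Approximate time required with the tool and dump enabled, per collection mode.
DUMP_ENABLED_S = {"L0": 1.2, "L1": 3.8, "mix": 5.5}


def slowdown(measured_s, baseline_s=BASELINE_S):
    """Return how many times slower a run is than the no-tool baseline."""
    return measured_s / baseline_s


# Rounded to one decimal place; the table rounds these further (3.5x, 11x, 16x).
factors = {mode: round(slowdown(t), 1) for mode, t in DUMP_ENABLED_S.items()}

for mode, factor in factors.items():
    print(f"{mode}: about {factor}x the time required without the tool")
```

The same function can be applied to a time measured on a different model to compare its overhead against this baseline, although the factors here are only known to hold for the 38B, eight-rank setup described above.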

Data Size Baseline in "tensor" Mode

This baseline is a reference for data size changes in "tensor" mode in the MindSpore framework. It shows the data size changes of a 38B language model across different collection modes, global batch sizes, and configurations (single-rank vs. eight-rank).

Data Size Changes

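| Collection Mode | global_batch_size | Single-Rank | Eight-Rank |
|---|---|---|---|
| L0 | 1 | 262 GB | 2.1 TB |
| L0 | 2 | 480 GB | 3.8 TB |
| L0 | 3 | 928 GB | 7.4 TB |
| L1 | 1 | 2.1 TB | 17.1 TB |
| L1 | 2 | 2.8 TB | 22.7 TB |
| L1 | 3 | 4.2 TB | 34.3 TB |
| mix | 1 | 2.4 TB | 19.2 TB |
| mix | 2 | 3.3 TB | 26.6 TB |
| mix | 3 | 5.1 TB | 41.4 TB |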
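
In every row of the table above, the eight-rank size is close to eight times the single-rank size. The sketch below checks this from the table figures, which can help when estimating disk space for a multi-rank run from a single-rank measurement. It assumes decimal units (1 TB = 1000 GB), which the source does not state, and the names `SIZES`, `to_gb`, and `ratios` are hypothetical.

```python
# Assumption: decimal units (1 TB = 1000 GB); the source does not say which convention it uses.
GB_PER_TB = 1000


def to_gb(text):
    """Convert a size string such as '2.1 TB' or '262 GB' to gigabytes."""
    value, unit = text.split()
    return float(value) * (GB_PER_TB if unit == "TB" else 1)


# (collection mode, global_batch_size) -> (single-rank size, eight-rank size),
# copied from the table above.
SIZES = {
    ("L0", 1): ("262 GB", "2.1 TB"),
    ("L0", 2): ("480 GB", "3.8 TB"),
    ("L0", 3): ("928 GB", "7.4 TB"),
    ("L1", 1): ("2.1 TB", "17.1 TB"),
    ("L1", 2): ("2.8 TB", "22.7 TB"),
    ("L1", 3): ("4.2 TB", "34.3 TB"),
    ("mix", 1): ("2.4 TB", "19.2 TB"),
    ("mix", 2): ("3.3 TB", "26.6 TB"),
    ("mix", 3): ("5.1 TB", "41.4 TB"),
}

# Ratio of eight-rank to single-rank data size for each row.
ratios = {
    key: to_gb(eight) / to_gb(single)
    for key, (single, eight) in SIZES.items()
}

for (mode, batch_size), ratio in ratios.items():
    print(f"{mode:>3} global_batch_size={batch_size}: eight-rank / single-rank = {ratio:.2f}")
```

Because the table values are rounded, the ratios are not exactly 8. Treat any estimate built this way as a rough planning figure rather than a guarantee.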