
Refactor SchedulePolicy to improve code organization #2571

Open
wants to merge 6 commits into main
Conversation

@libratiger (Contributor) commented Dec 25, 2024

Motivation

When I tried to dig into the Zero-Overhead Batch Scheduler, I found it hard to get a clear picture of the scheduling logic and hard to implement a new scheduling policy. This PR refactors SchedulePolicy to make it easier for me and others to add new policies.

Modifications

  1. Move sorting logic into separate static methods for better maintainability (a minimal sketch of this structure follows the list)
  2. Improve policy validation and adjustment logic
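For illustration, here is a minimal sketch of the kind of structure this refactor aims for: one static sorting helper per policy, with validation done once at construction. All names below (the module name `schedule_policy_sketch`, `Policy`, `Req`, and the `_sort_by_*` helpers) are hypothetical simplifications, not the actual sglang API.

```python
# schedule_policy_sketch.py
# Hypothetical, simplified stand-in for the refactored SchedulePolicy.
# Class, method, and field names here are illustrative assumptions,
# not the actual sglang API.
import random
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import List


@dataclass
class Req:
    """Minimal request stub with only the fields the sorting helpers need."""
    rid: str
    prefix_len: int = 0  # matched prefix tokens in the radix cache (assumed)
    arrival_time: float = field(default_factory=time.monotonic)


class Policy(str, Enum):
    FCFS = "fcfs"      # first come, first served
    LPM = "lpm"        # longest prefix match first
    RANDOM = "random"


class SchedulePolicy:
    def __init__(self, policy: str):
        # Validate and normalize the policy once at construction time,
        # so a bad config fails early instead of inside the scheduling loop.
        try:
            self.policy = Policy(policy)
        except ValueError:
            raise ValueError(f"Unknown schedule policy: {policy!r}") from None

    def calc_priority(self, waiting_queue: List[Req]) -> None:
        # Dispatch to one static sorting helper per policy; adding a new
        # policy means adding one enum member and one helper method.
        if self.policy is Policy.FCFS:
            self._sort_by_fcfs(waiting_queue)
        elif self.policy is Policy.LPM:
            self._sort_by_longest_prefix(waiting_queue)
        elif self.policy is Policy.RANDOM:
            self._sort_randomly(waiting_queue)

    @staticmethod
    def _sort_by_fcfs(waiting_queue: List[Req]) -> None:
        waiting_queue.sort(key=lambda r: r.arrival_time)

    @staticmethod
    def _sort_by_longest_prefix(waiting_queue: List[Req]) -> None:
        waiting_queue.sort(key=lambda r: r.prefix_len, reverse=True)

    @staticmethod
    def _sort_randomly(waiting_queue: List[Req]) -> None:
        random.shuffle(waiting_queue)
```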

Testing

Add a new test file test_schedule_policy.py with basic unit tests.
Cover policy initialization and FCFS scheduling validation (an illustrative sketch of such a test follows).
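As a sketch of what such tests could check, the snippet below exercises the hypothetical `schedule_policy_sketch` module from above; the actual test_schedule_policy.py in this PR targets the real SchedulePolicy, whose constructor and method names may differ.

```python
# test_schedule_policy_sketch.py
# Illustrative only: exercises the hypothetical sketch above, not the real
# SchedulePolicy in sglang, whose API may differ.
import unittest

from schedule_policy_sketch import Req, SchedulePolicy


class TestSchedulePolicySketch(unittest.TestCase):
    def test_init_rejects_unknown_policy(self):
        # Construction should fail fast on an invalid policy name.
        with self.assertRaises(ValueError):
            SchedulePolicy("not-a-policy")

    def test_fcfs_sorts_by_arrival_time(self):
        # FCFS should order the waiting queue by arrival time.
        policy = SchedulePolicy("fcfs")
        queue = [
            Req(rid="b", arrival_time=2.0),
            Req(rid="a", arrival_time=1.0),
            Req(rid="c", arrival_time=3.0),
        ]
        policy.calc_priority(queue)
        self.assertEqual([r.rid for r in queue], ["a", "b", "c"])


if __name__ == "__main__":
    unittest.main()
```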

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@libratiger (Author) commented:

python -m sglang.bench_one_batch --model-path Qwen/Qwen2.5-3B-Instruct

with the following result:

max_total_num_tokens=1802895
Warmup ...
Prefill. latency: 0.04046 s, throughput:  25310.52 token/s
Decode.  latency: 0.00728 s, throughput:    137.29 token/s
Decode.  latency: 0.00711 s, throughput:    140.58 token/s
Decode.  latency: 0.00711 s, throughput:    140.65 token/s
Decode.  latency: 0.00711 s, throughput:    140.62 token/s
Decode.  latency: 0.00710 s, throughput:    140.76 token/s
Decode.  median latency: 0.00711 s, median throughput:    140.62 token/s
Total. latency:  0.090 s, throughput:  11406.91 token/s
Benchmark ...
Prefill. latency: 0.03336 s, throughput:  30699.61 token/s
Decode.  latency: 0.00716 s, throughput:    139.63 token/s
Decode.  latency: 0.00711 s, throughput:    140.73 token/s
Decode.  latency: 0.00710 s, throughput:    140.84 token/s
Decode.  latency: 0.00710 s, throughput:    140.85 token/s
Decode.  latency: 0.00709 s, throughput:    140.99 token/s
Decode.  median latency: 0.00710 s, median throughput:    140.85 token/s
Total. latency:  0.140 s, throughput:   7431.72 token/s
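(As a sanity check on the numbers: 25310 token/s × 0.04046 s ≈ 1024 prefill tokens, and 1 / 0.00711 s ≈ 140.6 token/s, so this appears to be a single-sequence run decoding one token per step.)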

@libratiger (Author) commented:

cc @merrymercy

@libratiger (Author) commented:

ping @hnyls2002 for your review feedback 😄

@merrymercy (Contributor) commented:

@hnyls2002 please take a look
