ATP

In-network Aggregation for Multi-tenant Learning

ATP uses emerging programmable switch hardware to support in-network aggregation at multiple rack switches in a cluster, speeding up distributed training (DT) jobs. In a multi-rack cluster shared by multiple DT jobs, ATP outperforms existing systems, accelerating training throughput by up to 38%-66%.
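To make the idea of in-network aggregation concrete, the following is a minimal toy sketch, not ATP's actual protocol or switch program. It models a switch that sums gradient chunks from the workers of several jobs and releases a chunk's sum once every worker of that job has contributed. The class name `ToySwitchAggregator`, the `on_packet` method, the packet fields, and the use of integer values are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple


class ToySwitchAggregator:
    """Hypothetical model of a switch that aggregates gradient chunks per job."""

    def __init__(self, workers_per_job: Dict[int, int]) -> None:
        # Number of workers expected to contribute to each job (assumed known).
        self.workers_per_job = workers_per_job
        # (job_id, chunk_id) -> (running sum, set of workers already counted)
        self.slots: Dict[Tuple[int, int], Tuple[List[int], Set[int]]] = {}

    def on_packet(
        self, job_id: int, chunk_id: int, worker_id: int, values: List[int]
    ) -> Optional[List[int]]:
        """Add one worker's chunk; return the full sum once all workers arrived."""
        key = (job_id, chunk_id)
        if key not in self.slots:
            self.slots[key] = ([0] * len(values), set())
        acc, seen = self.slots[key]

        if worker_id in seen:
            # Ignore a retransmitted packet so it is not counted twice.
            return None

        for i, v in enumerate(values):
            acc[i] += v
        seen.add(worker_id)

        if len(seen) == self.workers_per_job[job_id]:
            # Aggregation complete: free the slot and emit the result.
            del self.slots[key]
            return acc
        return None


# Two jobs share the same switch: job 1 has 2 workers, job 2 has 3.
agg = ToySwitchAggregator({1: 2, 2: 3})
agg.on_packet(1, 0, 0, [1, 2])           # job 1 still waiting for worker 1
agg.on_packet(2, 0, 0, [10, 10])         # job 2 uses a separate slot
result = agg.on_packet(1, 0, 1, [3, 4])  # completes job 1, chunk 0
```

In this sketch, each worker sends its chunk once and receives a single aggregated result, rather than exchanging chunks with every other worker; keying slots by job keeps tenants' aggregations separate.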
