Many-core processors are envisaged to integrate hundreds of processing cores on a single die and to execute tens of multi-threaded tasks in parallel, exploiting their massive parallel processing potential. A task can be sped up by assigning it more than one core. Moreover, the processing requirements of tasks are in a constant state of flux: cores assigned to a task entering a low-requirement phase can be transferred to a task entering a high-requirement phase, maximizing the overall performance of the system. This scheduling problem of partial core reallocation can be solved optimally in polynomial time using a scheduler based on dynamic programming. However, dynamic programming is inherently centralized; it uses only one of the available cores for scheduling-related computations and hence does not scale. In this work, we introduce a distributed scheduler that spreads all scheduling-related computations across the many-core, allowing it to scale up. We prove that our proposed scheduler is optimal and hence converges to the same solution as the centralized optimal scheduler. Our simulations show that the proposed distributed scheduler can reduce per-core processing overhead by 1000x compared with the centralized scheduler and hence is better suited for scheduling on many-cores.
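
To make the centralized baseline concrete, the following is a minimal sketch of a dynamic programming core allocator. It is an illustration under stated assumptions, not the scheduler evaluated in this work: the objective is assumed to be the sum of per-task performance, and the function name `allocate_cores` and the table `utility` (where `utility[i][c]` is the assumed performance of task `i` on `c` cores) are hypothetical. The triple loop runs on a single core and costs O(T·N²) for T tasks and N cores, which illustrates why a centralized scheduler concentrates overhead on one core.

```python
def allocate_cores(utility, num_cores):
    """Centralized DP sketch (hypothetical helper, not the paper's code).

    utility[i][c] is the assumed performance of task i when given c cores,
    for c = 0..num_cores. Returns (best total performance, cores per task).
    """
    num_tasks = len(utility)
    # best[i][n]: maximum total performance of the first i tasks
    # when at most n cores are available to them.
    best = [[0.0] * (num_cores + 1) for _ in range(num_tasks + 1)]
    # choice[i][n]: number of cores given to task i-1 in that optimum.
    choice = [[0] * (num_cores + 1) for _ in range(num_tasks + 1)]

    for i in range(1, num_tasks + 1):
        for n in range(num_cores + 1):
            best_val, best_c = float("-inf"), 0
            # Try every possible share c of the n cores for task i-1.
            for c in range(n + 1):
                val = best[i - 1][n - c] + utility[i - 1][c]
                if val > best_val:
                    best_val, best_c = val, c
            best[i][n] = best_val
            choice[i][n] = best_c

    # Backtrack through the choice table to recover the allocation.
    alloc = [0] * num_tasks
    remaining = num_cores
    for i in range(num_tasks, 0, -1):
        alloc[i - 1] = choice[i][remaining]
        remaining -= alloc[i - 1]
    return best[num_tasks][num_cores], alloc
```

Re-running such an allocator whenever a task changes phase would yield the partial core reallocations described above; the distributed scheduler proposed in this work is stated to reach the same optimum without performing all of this computation on one core.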