Many-core processors contain a large number of cores housed on a single die and can execute multitudes of tasks in parallel. Many-cores are quintessential in several emerging highperformance computing tasks on embedded systems. The accompanying qualitative schedules are the key in achieving the full potential of many-cores for these tasks. Schedulers are low-level Operating System (OS) sub-routines which develop task execution schedules on processors at run-time. Many-core schedules are several times bigger than existing multi-core schedules. Therefore, we require new schedulers capable of scaling up with the increase in the size of many-cores while preserving the schedule’s quality. Many-cores also have several new micro-architectural features and schedulers must exploit them for a more efficient hardware-software co-design of schedules. A scheduler may have different objectives depending upon the overlying system requirements. This dissertation introduces several different schedulers catering to these objectives. Unfortunately, most of the objectives have NP-hard complexity and hence cannot be achieved in polynomial time for a general case unless P=NP. While most of our peers have tried to attain them heuristically, we attain them using different non-heuristic schedulers. The schedulers provide strong theoretical guarantees on the schedule quality. The schedulers are also designed to be highly scalable and are structurally better suited for further improvements in future. We present two schedulers for maximizing the performance of many-cores. First is a distributed scheduler, while the second one is a centralized greedy scheduler. Schedulers provide optimal many-core performance by moving allocated cores amongst tasks at run-time. Maximization of performance is not the objective of all systems and some systems prioritize fairness over performance. We also present a distributed scheduler for maximizing the fairness of many-cores. Scheduler improves overall fairness by moving allocated cores amongst tasks, and results in a fairer schedule than state-of-the-art. Furthermore, the scheduler can maximize fairness optimally particularly in the case of fixed performance. Performance can be increased even by just spatially rearranging the allocations without any actual change in their sizes. We present a distributed scheduler that defragments the many-core to improve its performance. The scheduler performs optimal defragmentation particularly in the case when all the tasks produce only power-of-two threads outperforming state-of-the-art. The above-proposed scheduler does not work for a many-core with Static Non-Uniform Cache Access (S-NUCA) caches. We, therefore, introduce a new scheduler for this many-core which exploits topology-introduced heterogeneity in the cores of many-core due to the presence of S-NUCA caches to improve performance. The scheduler uses an exact algorithm on the pruned search-space to quickly develop a schedule and outperforms state-of-the-art. Many-cores operate within a strict power budget which constraints their performance. We developed a centralized scheduler for probabilistic power budgeting with only linear-time computational complexity. The scheduler performs power budgeting a magnitude faster than stateof- the-art with near-equal performance. On similar lines, we also developed a probabilistic power budgeting scheduler for many-cores when they execute Quality of Service (QoS) tasks. Finally, we also develop an open-source toolchain called HotSniper for real-world representative evaluations of many-core schedulers. HotSniper integrates Hotspot thermal modeling tool with Sniper many-core simulator. HotSniper also allows interval thermal simulations of many-cores deployed in open systems, which were not previously possible.