Most system calls (essentially those that put the process on a queue to wait for some service) cause the kernel scheduler to take the next most urgent runnable process and run it. This is complicated slightly on multi-core machines, and where processes can be pinned to specific cores or CPUs.
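To make that concrete, here is a small user-space C sketch (not kernel code, and the pipe/fork pattern is just an illustration I've chosen): the parent blocks in `read()` on an empty pipe, so the kernel takes it off the run queue and schedules other work, here the child, until the child writes and the parent becomes runnable again.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int fd[2];
    char buf[32];

    if (pipe(fd) == -1) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }

    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    }

    if (pid == 0) {                 /* child: gets the CPU while the parent sleeps */
        close(fd[0]);
        sleep(1);                   /* stand-in for doing some work */
        write(fd[1], "done", 5);
        close(fd[1]);
        _exit(0);
    }

    close(fd[1]);
    printf("parent: blocking in read(); kernel runs someone else\n");
    ssize_t n = read(fd[0], buf, sizeof buf);  /* sleeps until the child writes */
    if (n > 0)
        printf("parent: woke up with \"%s\"\n", buf);
    close(fd[0]);
    wait(NULL);
    return 0;
}
```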
Processor time is also allocated in maximum-length time slices, and at each clock tick the scheduler checks whether the current process has used up its whole slice. If so, it is suspended (i.e. not returned to from the tick interrupt) and its temporary nice value is increased, moving it further from the head of the run queue. This ensures that programs doing frequent I/O get lots of short bursts of CPU, while CPU hogs get a few long ones.
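A rough sketch of that tick accounting in C, with names I've invented for illustration (`clock_tick`, `ticks_used`, `TIME_SLICE_TICKS`, `temp_nice`); real kernels differ in the details, but the shape is the same:

```c
#include <stdbool.h>

#define TIME_SLICE_TICKS 10   /* maximum slice before forced preemption */
#define MAX_NICE         20

struct proc {
    int ticks_used;           /* ticks consumed in the current slice */
    int temp_nice;            /* temporary nice: grows as the process hogs CPU */
};

/* Called from the clock interrupt for the currently running process.
 * Returns true if the scheduler should preempt and pick another process. */
bool clock_tick(struct proc *p)
{
    if (++p->ticks_used < TIME_SLICE_TICKS)
        return false;         /* slice not exhausted; keep running */

    /* Slice used up: penalize the CPU hog so it sits further back in
     * the queue, reset its accounting, and request a reschedule. */
    if (p->temp_nice < MAX_NICE)
        p->temp_nice++;
    p->ticks_used = 0;
    return true;
}
```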
All of that describes fairly early Unix systems (I haven't been working at that level lately), but it probably hasn't changed greatly: you can't improve much on the original design.