Typically, a scheduling algorithm for an n × n packet switch with a crossbar as the data fabric divides time into slots, each of duration tp, sufficient to transmit one packet. If a scheduling round requires time tr > tp, then the switch can transmit multiple packets, up to s = ⌊tr/tp⌋, between each mapped input-output pair under the current mapping. If s = 1, there exists a frame-based scheduling algorithm with Θ(log n) delay. For uniform random traffic, we establish that the delay is Ω(n) for any s > 1; hence s = 1 is the only case in which a Θ(log n) delay is achievable. Given the importance of achieving a low s, it is imperative to develop extremely fast scheduling algorithms (that reduce tr) on a mesh-based structure (corresponding to the crossbar topology of the switch). We present results for a fast scheduling algorithm that runs on a mesh-of-trees topology that can be overlaid on the crossbar switching fabric.
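As a small illustration of the relation s = ⌊tr/tp⌋, the following sketch computes s from integer durations. The function name, the choice of nanoseconds as the unit, and the clamp to s = 1 when a round fits within a single slot are assumptions made for illustration only; they are not taken from the algorithm described here.

```python
def packets_per_round(t_r_ns: int, t_p_ns: int) -> int:
    """Return s = floor(t_r / t_p), the number of packets that can be sent
    between a mapped input-output pair during one scheduling round.

    t_r_ns: duration of one scheduling round (hypothetical unit: nanoseconds)
    t_p_ns: duration of one packet slot (same unit)
    """
    if t_r_ns <= 0 or t_p_ns <= 0:
        raise ValueError("durations must be positive")
    # Integer floor division avoids floating-point rounding issues.
    # Assumption: a round that fits within one slot still yields s = 1.
    return max(1, t_r_ns // t_p_ns)


if __name__ == "__main__":
    # Illustrative values only: a fixed slot length and several round lengths.
    for t_r in (100, 199, 250, 1000):
        print(t_r, packets_per_round(t_r, 100))
```

Under these assumptions, s stays at 1 only while tr < 2·tp, which is why reducing the scheduling time tr matters for remaining in the s = 1 regime.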