b2915
0fc1e820 · CUDA: faster large batch FA without tensor cores (#7314) · May 17, 2024