26b a3b?

#1
by mothemang - opened

Love your model, but my machine gets very few tokens per second with the 31B version. Is there any possibility of training the MoE Gemma version (26B A3B) as well?
