Recent Posts

Worth reading

A paper on pruning transformer attention heads at inference time, with no retraining required (a rough sketch of the idea appears below).
Why we chose a microblog format over a traditional blog.
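
For context on that first link: inference-time head pruning is commonly implemented as masking. This is not the paper's method, just a minimal sketch under the assumption that pruning amounts to zeroing a head's output before the attention output projection; `mask_attention_heads` and its parameters are hypothetical names.

```python
import torch

def mask_attention_heads(attn_output: torch.Tensor,
                         heads_to_prune: list[int],
                         num_heads: int) -> torch.Tensor:
    """Zero out selected attention heads at inference time.

    Masking a head's output before the output projection is equivalent
    to removing that head's contribution, since the projection is
    linear over the concatenated heads. No retraining is involved.
    """
    batch, seq_len, hidden = attn_output.shape
    head_dim = hidden // num_heads
    # Expose the per-head structure: (batch, seq, heads, head_dim)
    per_head = attn_output.view(batch, seq_len, num_heads, head_dim)
    mask = torch.ones(num_heads, dtype=attn_output.dtype)
    mask[heads_to_prune] = 0.0
    per_head = per_head * mask.view(1, 1, num_heads, 1)
    return per_head.reshape(batch, seq_len, hidden)

# Example: drop heads 2 and 5 of an 8-head layer.
x = torch.randn(1, 16, 512)  # (batch, seq, hidden)
pruned = mask_attention_heads(x, [2, 5], num_heads=8)
```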