Fun fact: internally, we are most excited about something the majority of people find boring. Yesterday we released a completely reworked backend for inner monologue, reducing time to first token by ~25% and, far more importantly, making latency more stable by reducing spikes: pic.twitter.com/E0zBZ3lHyY
— Mikhail Parakhin (@MParakhin) June 29, 2023