Do you know when we can expect an update on the realtime API? It’s still in beta...

jeffharris · 2025-03-20T23:30:13 1742513413

we're working hard on it at the moment and hope we'll have a snapshot ready in the next month or so

we've debugged the cutoff issues and have fixes for them internally but we need a snapshot that's better across the board, not just cutoffs (working on it!)

we're all in on S2S models both for API and ChatGPT, so there will be lots more coming to Realtime this year

For today: the new noise cancellation and semantic voice activity detector are available in Realtime. And ofc you can use gpt-4o-transcribe for user transcripts there
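
For anyone wanting to try the three features mentioned above, here is a minimal sketch of a `session.update` client event that turns them on. It only builds and serializes the JSON payload; sending it over an open Realtime WebSocket is left out. The field names (`turn_detection`, `input_audio_noise_reduction`, `input_audio_transcription`) and the values `"semantic_vad"`, `"near_field"`, and `"auto"` are assumptions based on the Realtime API reference as of this release, and `build_session_update` is a hypothetical helper, so check the current docs before relying on it.

```python
import json


def build_session_update(eagerness="auto", noise_profile="near_field"):
    """Build a Realtime `session.update` client event as a plain dict.

    Hypothetical helper: the field names below are assumed from the
    Realtime API reference and may have changed since.
    """
    return {
        "type": "session.update",
        "session": {
            # Semantic VAD: end-of-turn detection based on what the user said,
            # rather than on silence alone.
            "turn_detection": {"type": "semantic_vad", "eagerness": eagerness},
            # Noise cancellation applied to input audio before VAD and the model.
            "input_audio_noise_reduction": {"type": noise_profile},
            # Transcribe the user's audio with gpt-4o-transcribe.
            "input_audio_transcription": {"model": "gpt-4o-transcribe"},
        },
    }


# Serialize to the JSON text frame you would send on the WebSocket.
payload = json.dumps(build_session_update())
print(payload)
```

The payload is kept as a plain dict so it can be sent with whichever WebSocket client you already use.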

taf2 · 2025-03-20T21:43:11 1742506991

Agreed, really not liking how they are neglecting it… I hope they are just hard at work behind the scenes and will release something soon

jeffharris · 2025-03-20T23:32:57 1742513577

S2S is where we're investing the most effort on audio ... sorry it's been slow but we are working hard on it

Top priorities at the moment:
1) Better function calling performance
2) Improved perception accuracy (not mishearing)
3) More reliable instruction following
4) Bug fixes (cutoffs, run-ons, modality steering)

dandiep · 2025-03-21T02:14:13 1742523253

Appreciate the efforts. It’s not there yet, but when it gets there it will open up a lot of use cases.

Any fine-tuning for S2S on the horizon?