Does ChatGPT's STT use Whisper? It feels stronger than every other voice input

This topic was created 472 days ago; the information in it may have since developed or changed.

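For mixed Chinese-English input it beats iFlytek; for pure Chinese it is about on par with iFlytek.
This is the thing I'm talking about.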

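Addendum 1 · October 18, 2024

Does this have anything to do with multimodality? I remember this feature already existed back in the GPT-3.5 days, and it still works now when GPT-4 is selected.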

whisper

stt

voice input

4 replies • 2024-10-19 19:18:22 +08:00

malusama

October 18, 2024

My guess is that the model itself simply supports voice input and output... after all, it has been multimodal for a long time.

kyor0

October 18, 2024

4o is multimodal.

cyp0633

October 19, 2024

If it were Whisper, the results would be far worse than iFlytek's.

FlashEcho

October 19, 2024

It's right there in the official docs: https://platform.openai.com/docs/guides/speech-to-text

The Audio API provides two speech to text endpoints, transcriptions and translations, based on our state-of-the-art open source large-v2 Whisper model.
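
For anyone who wants to try those two endpoints, here is a rough sketch of how a request could be put together using only the Python standard library. It only builds the request and does not send it. The helper name `build_audio_request` and the placeholder key are made up for illustration; the `whisper-1` model name and the `file`/`model` form fields are what the public API uses as far as I know. Note this only shows the public Audio API; it says nothing about what the ChatGPT app itself runs internally.

```python
import uuid
import urllib.request

API_BASE = "https://api.openai.com/v1/audio"


def build_audio_request(endpoint, audio_bytes, filename, api_key, model="whisper-1"):
    """Build (but do not send) a multipart POST for the Audio API.

    `endpoint` is "transcriptions" (same-language text) or
    "translations" (English text). This helper is hypothetical,
    not part of any official SDK.
    """
    if endpoint not in ("transcriptions", "translations"):
        raise ValueError("endpoint must be 'transcriptions' or 'translations'")

    boundary = uuid.uuid4().hex
    parts = [
        # Form field: which model to use.
        (f"--{boundary}\r\n"
         'Content-Disposition: form-data; name="model"\r\n\r\n'
         f"{model}\r\n").encode(),
        # Form field: the audio file itself.
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode(),
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ]

    return urllib.request.Request(
        f"{API_BASE}/{endpoint}",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

With a real API key and real audio bytes, passing the result to `urllib.request.urlopen` should return a JSON body whose `text` field holds the transcript, assuming the default response format.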
