WhatsApp Business Cloud API: different audio encoding for voice messages from an Android phone vs. the Windows desktop WhatsApp client
I am passing the audio file from a WhatsApp voice message (received through the WhatsApp Business Cloud API) to Google Speech-to-Text. The weird thing is that recognition works for voice messages sent from the official WhatsApp desktop client for Windows, but it does not work for voice messages sent from the WhatsApp app on Android. So the two clients seem to encode the voice audio differently. Both files use the Opus audio codec with a sample rate of 48000 Hz, and both can be played with VLC, but the Android file is much smaller. Any help or ideas on that?
Not sure if this helps: the working audio file from WhatsApp desktop on Windows reports 32 bits per sample, while the non-working file from Android reports no bits-per-sample value at all. The non-working file is also much smaller.
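One way to compare the two files beyond what VLC shows is to read the identification header directly. This is only a sketch, and it assumes both files are Ogg Opus streams, which is not confirmed for the Windows file. Players commonly report 48000 Hz for any Ogg Opus file, so the header's channel count and original input sample rate may be more telling. The function names `read_opus_head` and `read_opus_head_from_file` are made up for illustration; the field layout follows the OpusHead packet defined in RFC 7845.

```python
import struct
import sys


def read_opus_head(data: bytes) -> dict:
    """Parse the OpusHead identification packet from the start of an Ogg Opus stream."""
    if data[:4] != b"OggS":
        # Not an Ogg container at all; that alone would explain a difference.
        raise ValueError("not an Ogg stream")
    pos = data.find(b"OpusHead")
    if pos < 0 or len(data) < pos + 19:
        raise ValueError("no complete OpusHead packet found")
    # After the 8-byte magic: version (u8), channel count (u8), pre-skip (u16 LE),
    # input sample rate (u32 LE), output gain (s16 LE), channel mapping family (u8).
    version, channels, pre_skip, input_rate, gain, mapping = struct.unpack_from(
        "<BBHIhB", data, pos + 8
    )
    return {
        "version": version,
        "channels": channels,
        "pre_skip": pre_skip,
        "input_sample_rate": input_rate,  # informational: rate of the original recording
        "output_gain": gain,
        "mapping_family": mapping,
    }


def read_opus_head_from_file(path: str) -> dict:
    """Read the first bytes of a file and parse its OpusHead packet."""
    with open(path, "rb") as fh:
        return read_opus_head(fh.read(4096))


if __name__ == "__main__":
    # Usage: python opus_head.py android.ogg windows.ogg (file names are placeholders)
    for path in sys.argv[1:]:
        print(path, read_opus_head_from_file(path))
```

If the call raises `ValueError` for one of the files, that file is not a plain Ogg Opus stream, which would itself be a useful clue.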
It happened to me too. You just need to convert the file to WAV; then you don't need to specify sampleRate or encoding (source) when providing it to the Google Speech API, because the WAV header already carries both.
For the conversion, this answer worked for me; just change 'mp3' to 'wav'.
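In case the linked answer is not at hand, here is a minimal sketch of the same conversion in Python. It assumes the `ffmpeg` binary is installed and on PATH, and the linked answer may well use a different tool. The function names `build_ffmpeg_command` and `convert_to_wav` are made up for illustration, and mono at 16 kHz is my own choice rather than a requirement.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Return the ffmpeg argument list that converts src to mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", src,                # input: the voice message downloaded from WhatsApp
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample (16 kHz is an assumption, see above)
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM in a WAV container
        dst,
    ]


def convert_to_wav(src: str, dst: Optional[str] = None) -> str:
    """Convert src to WAV with ffmpeg and return the output path."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # Default output path: same name as the input, with a .wav suffix.
    dst = dst or str(Path(src).with_suffix(".wav"))
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

Calling `convert_to_wav("voice.ogg")` should then leave a `voice.wav` next to the input, which can be sent to the Speech API without setting the encoding or sample rate explicitly.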
If another answer also answers this question, why not just flag or vote to close the question as a duplicate rather than post a low-quality answer?
@HovercraftFullOfEels Because I think it may be helpful to have the answer here too, for people looking for this specific problem with the WhatsApp API. Personally, I was having trouble deciding which format to convert the audio to (given that Google accepts a range of codecs) and then which library to use to do the conversion easily ✌️ Then I found this one, which worked perfectly.