MiniMax

Remote

Scanned

Enable powerful text-to-speech, voice cloning, video, and image generation capabilities through a unified MCP server interface. Integrate with popular MCP clients to generate speech, clone voices, and create multimedia content seamlessly. Enhance your applications with advanced multimedia generation tools backed by MiniMax APIs.

Tools

text_to_audio

Convert text to audio with a given voice and save the output audio file to a given directory. The directory is optional; if it is not provided, the output file is saved to $HOME/Desktop. The voice id is optional; if it is not provided, the default voice is used.

COST WARNING: This tool makes an API call to MiniMax, which may incur costs. Only use it when explicitly requested by the user.

Args:
    text (str): The text to convert to speech.
    voice_id (str, optional): The id of the voice to use. For example, "male-qn-qingse"/"audiobook_female_1"/"cute_boy"/"Charming_Lady"...
    model (str, optional): The model to use.
    speed (float, optional): Speed of the generated speech. Values range from 0.5 to 2.0; the default is 1.0.
    vol (float, optional): Volume of the generated speech. Values range from 0 to 10; the default is 1.
    pitch (int, optional): Pitch of the generated speech. Values range from -12 to 12; the default is 0.
    emotion (str, optional): Emotion of the generated speech. One of ["happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"]; the default is "happy".
    sample_rate (int, optional): Sample rate of the generated speech. One of [8000, 16000, 22050, 24000, 32000, 44100]; the default is 32000.
    bitrate (int, optional): Bitrate of the generated speech. One of [32000, 64000, 128000, 256000]; the default is 128000.
    channel (int, optional): Number of channels in the generated speech. One of [1, 2]; the default is 1.
    format (str, optional): Format of the generated speech. One of ["pcm", "mp3", "flac"]; the default is "mp3".
    language_boost (str, optional): Language boost for the generated speech. One of ['Chinese', 'Chinese,Yue', 'English', 'Arabic', 'Russian', 'Spanish', 'French', 'Portuguese', 'German', 'Turkish', 'Dutch', 'Ukrainian', 'Vietnamese', 'Indonesian', 'Japanese', 'Italian', 'Korean', 'Thai', 'Polish', 'Romanian', 'Greek', 'Czech', 'Finnish', 'Hindi', 'auto']; the default is "auto".

Returns:
    Text content with the path to the output file and the name of the voice used.
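Because every call to this tool may incur costs, it can help to catch out-of-range values before a request is sent. The following is a minimal sketch, not part of the MiniMax server: `build_text_to_audio_args` and `build_tool_call` are hypothetical helper names, the numeric ranges are assumed to be inclusive, and the envelope follows the general MCP JSON-RPC `tools/call` request shape. Only a subset of the documented arguments is covered.

```python
import json

# Allowed values and defaults, copied from the text_to_audio description.
EMOTIONS = {"happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 32000, 44100}
BITRATES = {32000, 64000, 128000, 256000}
FORMATS = {"pcm", "mp3", "flac"}


def build_text_to_audio_args(text, speed=1.0, vol=1.0, pitch=0, emotion="happy",
                             sample_rate=32000, bitrate=128000, channel=1,
                             format="mp3"):
    """Hypothetical helper: check values locally and return the arguments dict."""
    if not text:
        raise ValueError("text must not be empty")
    # Assumption: the documented numeric ranges are inclusive at both ends.
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if not 0 <= vol <= 10:
        raise ValueError("vol must be between 0 and 10")
    if not -12 <= pitch <= 12:
        raise ValueError("pitch must be between -12 and 12")
    for name, value, allowed in [
        ("emotion", emotion, EMOTIONS),
        ("sample_rate", sample_rate, SAMPLE_RATES),
        ("bitrate", bitrate, BITRATES),
        ("channel", channel, {1, 2}),
        ("format", format, FORMATS),
    ]:
        if value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    return {
        "text": text, "speed": speed, "vol": vol, "pitch": pitch,
        "emotion": emotion, "sample_rate": sample_rate, "bitrate": bitrate,
        "channel": channel, "format": format,
    }


def build_tool_call(name, arguments, request_id=1):
    """Wrap the arguments in an MCP JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call(
    "text_to_audio",
    build_text_to_audio_args("Hello, world", speed=1.2, emotion="neutral"),
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library would construct and send this request; the sketch only shows which values the description allows.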

list_voices

List all available voices. Only supported when api_host is https://api.minimax.chat.

Args:
    voice_type (str, optional): The type of voices to list. One of ["all", "system", "voice_cloning"]; the default is "all".

Returns:
    Text content with the list of voices.
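The host restriction can be checked before the tool is called. This is a small sketch under stated assumptions: `build_list_voices_args` is a hypothetical helper, and the caller is assumed to know which api_host the server was configured with (how that value is obtained is not specified in the listing).

```python
SUPPORTED_API_HOST = "https://api.minimax.chat"
VOICE_TYPES = ("all", "system", "voice_cloning")


def build_list_voices_args(api_host, voice_type="all"):
    """Hypothetical guard: return list_voices arguments or raise early."""
    # Tolerate a trailing slash when comparing hosts (an assumption).
    if api_host.rstrip("/") != SUPPORTED_API_HOST:
        raise RuntimeError(
            f"list_voices is only supported when api_host is {SUPPORTED_API_HOST}"
        )
    if voice_type not in VOICE_TYPES:
        raise ValueError(f"voice_type must be one of {VOICE_TYPES}")
    return {"voice_type": voice_type}


print(build_list_voices_args("https://api.minimax.chat", "system"))
```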

voice_clone

Clone a voice using provided audio files. The new voice will be charged upon first use.

COST WARNING: This tool makes an API call to MiniMax, which may incur costs. Only use it when explicitly requested by the user.

Args:
    voice_id (str): The id of the voice to use.
    file (str): The path to the audio file to clone, or a URL to the audio file.
    text (str, optional): The text to use for the demo audio.
    is_url (bool, optional): Whether the file is a URL. Defaults to False.

Returns:
    Text content with the voice id of the cloned voice.
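Since `file` accepts either a local path or a URL, `is_url` has to agree with it. One way to keep the two consistent is to derive the flag from the string itself. The sketch below assumes that only `http` and `https` URLs count as URLs; `build_voice_clone_args` is a hypothetical helper, not part of the server.

```python
from urllib.parse import urlparse


def build_voice_clone_args(voice_id, file, text=None):
    """Hypothetical helper: derive is_url from the file argument."""
    # Assumption: anything without an http/https scheme is a local path.
    is_url = urlparse(file).scheme in ("http", "https")
    args = {"voice_id": voice_id, "file": file, "is_url": is_url}
    if text is not None:
        # text is optional, so it is only sent when provided.
        args["text"] = text
    return args


print(build_voice_clone_args("my_voice_01", "https://example.com/sample.mp3"))
print(build_voice_clone_args("my_voice_01", "/home/user/sample.mp3", text="Hello"))
```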

play_audio

Play an audio file. Supports WAV and MP3 formats. Video is not supported.

Args:
    input_file_path (str): The path to the audio file to play.
    is_url (bool, optional): Whether the audio file is a URL.

Returns:
    Text content with the path to the audio file.
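A client can reject unsupported files before calling the tool. The following sketch assumes the format can be inferred from the file extension, which may not match how the server actually detects formats; `build_play_audio_args` is a hypothetical helper.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

SUPPORTED_SUFFIXES = {".wav", ".mp3"}


def build_play_audio_args(input_file_path):
    """Hypothetical helper: check the extension and set is_url."""
    parsed = urlparse(input_file_path)
    is_url = parsed.scheme in ("http", "https")
    # For URLs, inspect only the path so query strings are ignored.
    path_part = parsed.path if is_url else input_file_path
    suffix = PurePosixPath(path_part).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError("play_audio supports WAV and MP3 files only")
    return {"input_file_path": input_file_path, "is_url": is_url}


print(build_play_audio_args("/tmp/clip.wav"))
```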

