Sberbank’s GigaChat, Russia’s alternative to ChatGPT, is set to add music generation from text prompts. Denis Filippov, Vice President for New Digital Salute Surfaces at Sberbank, announced the upcoming feature at AI Journey 2023, an online conference on artificial intelligence technologies.
Describing the feature, Filippov said, “GigaChat’s new functionality will come in handy for music enthusiasts and people of creative professions. SMEs is one of the target audiences, as we see it. With GigaChat, they can address their business issues quickly, legally, and enjoy high quality, by generating background music for cafés, beauty parlors, waiting areas, for video ads and the social media.”
The feature is built on two neural networks, CLaMP and SymFormer, which Sberbank is integrating into GigaChat. Users start by submitting a prompt such as “compose a lively country-style song” or “create music for a business center lounge.” The result is a unique audio file and a MIDI track compatible with any Digital Audio Workstation (DAW).
SymFormer was trained on the ML Space platform using the Christofari supercomputer and a dataset of more than 200,000 songs spanning various genres. It interprets music as a score, applying the text-to-image methodology to sheet music.
Users can listen to and download the generated songs, and can also use the MIDI files in their own creative projects. Because MIDI stores the notes rather than rendered audio, the files can be brought into existing music production workflows for editing harmonies, changing arrangements, and producing distinct sounds.
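To illustrate why a MIDI track is easier to rework than rendered audio, here is a minimal sketch in Python. It is not GigaChat's API or output format: the `Note` class and `transpose` function are hypothetical names introduced only for this example. The one fact it relies on comes from the MIDI standard, which encodes pitch as an integer from 0 to 127, with 60 as middle C, so shifting a harmony is simple arithmetic on those numbers.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Note:
    """Hypothetical stand-in for one note event read from a MIDI track."""
    pitch: int          # MIDI note number, 0-127 (60 is middle C)
    start: float        # onset, in beats
    duration: float     # length, in beats
    velocity: int = 64  # loudness, 0-127


def transpose(notes: List[Note], semitones: int) -> List[Note]:
    """Return a copy of the notes shifted by the given number of semitones."""
    shifted = []
    for note in notes:
        new_pitch = note.pitch + semitones
        if not 0 <= new_pitch <= 127:
            raise ValueError(f"pitch {new_pitch} is outside the MIDI range 0-127")
        shifted.append(replace(note, pitch=new_pitch))
    return shifted


# A C major triad (C4, E4, G4) held for one bar of 4/4.
c_major = [Note(60, 0.0, 4.0), Note(64, 0.0, 4.0), Note(67, 0.0, 4.0)]

# Move the chord up two semitones to D major (D4, F#4, A4).
d_major = transpose(c_major, 2)
print([note.pitch for note in d_major])  # prints [62, 66, 69]
```

In practice a DAW or a MIDI library would handle reading and writing the file itself; the sketch only shows the kind of edit that note-level data makes possible.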
Sberbank’s plans for GigaChat extend beyond music generation. The platform will soon support uploading and processing PDF files, so that users can get summaries of lengthy documents such as financial reports or pull out their key points. Users will also be able to ask questions about a document’s content.
According to Filippov, “Basic dialog interfaces of services are not yet massively adapted for document editing scenarios; research and experiments in building new interfaces are necessary, and they are being conducted by large companies, including Sber. Large language models can automate complex information search and retrieval, help with idea generation, document planning and structure, automatically create illustrations, generate complex documents for a given purpose, advise on stylistic issues, correct spelling based on context, and more. All these capabilities will soon be available to GigaChat users.”
GigaChat launched in April 2023. Sberbank says the platform sets itself apart from competitors through its ability to communicate intelligently in Russian, emphasizing its proficiency in understanding and generating Russian-language content. The company presents this as the main reason for users in Russia to choose GigaChat.