Audio & Video
AI-assisted audio and video tools for editing, generation, transcription and media workflows.
Use this category as a practical starting point when you want to compare tools by workflow fit instead of vendor language.
How to compare Audio & Video tools
Start with the task you need to solve, then compare pricing, data handling, integrations, export options and how quickly the tool fits into your existing workflow.
What Utildesk checks
Utildesk keeps the catalogue focused on real use cases: what the tool does, who it is useful for, and whether the surrounding metadata is clear enough for people and AI agents.
159 tools in this category
Ableton Live
Ableton Live is a digital audio workstation for people who do not only record music linearly, but work with loops, clips, MIDI ideas, sound design, and stage setups. It is especially strong when a sketch needs to become a playable arrangement quickly.
Acapela Group
Acapela Group is a leading provider of Text-to-Speech (TTS) solutions that offers natural and expressive voices for a variety of applications. The technology enables the conversion of written text into high-quality, understandable speech recordings that are used in various industries such as education, telecommunications, accessibility, and entertainment. Acapela Group places a strong emphasis on individual adaptations and multilingual options to meet the needs of different users.
Acast
Acast is an innovative platform that specializes in hosting, monetizing, and analyzing podcasts. By utilizing modern technologies, including AI-powered tools, Acast enables podcasters to manage their content efficiently and make it accessible to a wide audience. The platform supports both beginners and experienced podcasters and offers a range of features around audio content.
Adobe Enhance Speech
Adobe Enhance Speech is an AI tool for automatically improving spoken audio. It reduces common recording problems such as room echo, background noise, muffled voice quality, and uneven vocal presence, with the goal of turning simple recordings into clearer, podcast-like speech tracks. It is especially useful when audio is recorded outside a studio, using a laptop, headset, phone, or USB microphone in changing conditions.
Adobe Podcast
Adobe Podcast is an innovative platform designed specifically for podcasters and audio producers to simplify the recording, editing, and transcription of audio content. Featuring integrated AI-powered functions, Adobe Podcast helps create and publish professional podcasts more efficiently. Its freemium model allows users to test basic features for free and access advanced functions if needed.
Adobe Premiere Pro
Professional video editor for editing, color, audio, captions, and post-production workflows.
Adobe Premiere Rush
Adobe Premiere Rush is user-friendly video editing software designed for content creators who want to produce high-quality videos quickly and easily. The application combines basic video editing features with an intuitive interface and is available on both desktop and mobile devices. With Premiere Rush, users can capture, edit, and share videos on various platforms.
Aive
Aive supports video-centered marketing and content workflows with automation, analysis, and creative optimization.
Alitu
Alitu is an AI-powered tool designed specifically for podcasters to simplify the recording and editing process. It automates many technical steps that are typically time-consuming, allowing users without extensive audio expertise to create professional podcasts. Alitu is particularly helpful for cleaning up, cutting, and adding music or effects to audio files without requiring complex software.
Amazon Alexa
Amazon Alexa is a voice-controlled virtual assistant that simplifies numerous tasks in everyday life. It integrates with smart home devices, controls music playback, answers questions, and more, which makes it a versatile aid for users. The technology is based on artificial intelligence and allows for intuitive control via speech.
Amazon Polly
Amazon Polly is a cloud-based service from Amazon Web Services (AWS) that converts text into natural-sounding speech. With advanced artificial intelligence, Polly produces realistic speech output from text, which can be used in various applications such as customer service, e-learning, audiobooks, or automation solutions. The API allows for easy integration into different systems and supports many languages and voices.
Amazon Rekognition
Amazon Rekognition is a cloud-based service from Amazon Web Services (AWS) that offers powerful AI-powered image and video analysis. With the help of machine learning, Rekognition can automatically recognize and analyze faces, objects, scenes, and activities in images and videos. Its scalability and easy integration make it suitable for both developers and organizations that want to intelligently analyze visual content.
Amazon Rekognition Video
Amazon Rekognition Video is a cloud-based service from Amazon Web Services (AWS) that enables automatic object, activity, face, and content analysis in video files. By leveraging machine learning, it helps businesses efficiently search, analyze, and manage video content without having to develop their own AI models.
Amazon Transcribe
Amazon Transcribe is Amazon Web Services' automatic speech recognition service for turning audio and video into text. It is used for meeting notes, media transcripts, contact-center analysis, subtitles, research interviews and internal documentation. The service is especially relevant for teams that already store files in AWS or want transcription to become part of a larger processing pipeline rather than a standalone manual task.
Anchor
A podcast hosting and distribution tool for creators who want to record, publish, and track episodes with minimal technical overhead.
Animoto
Animoto is a cloud-based video editor that allows users to create professional-looking videos quickly and easily. Using pre-made templates, automated video editing, and AI-powered features, Animoto turns photos, videos, and music into engaging video content. It is particularly popular among marketing experts, social media managers, and small businesses who want to create visually compelling videos without requiring extensive video editing knowledge.
Apple Siri
Apple Siri is Apple's voice assistant for iPhone, iPad, Mac, HomePod, and simple everyday automation.
AssemblyAI
AssemblyAI is a powerful platform for automatic speech recognition (ASR) and speech processing, primarily developed for developers and enterprises. It offers advanced AI-based transcription services that quickly convert audio and video files into text. The AssemblyAI API enables easy integration into various applications to efficiently analyze and process speech data.
AudioMaster
AudioMaster is a versatile audio software tool specifically designed for mastering and editing audio files. With a user-friendly interface and mobile use options, the tool is aimed at musicians, producers, and audio enthusiasts who want to improve their sound quality quickly and effectively. Whether on the go or in the studio, AudioMaster offers a wide range of functions that make professional results possible even without in-depth technical knowledge.
Audiotool
Audiotool is a browser-based music production platform that allows users to create, edit, and publish electronic music directly in the browser. Without software installation, Audiotool offers a comprehensive collection of virtual instruments, effects, and mixer tools that appeal to both beginners and experienced producers. The platform supports collaborative work and direct exchange of projects in the community.
Audo
Audo is an audio tool for voice enhancement, noise reduction, and clearer recordings in content workflows.
Auphonic
Auphonic is an AI-powered tool for automated audio production and optimization. It helps users quickly improve, transcribe, and prepare audio and video files for various platforms. Auphonic is particularly suitable for podcasters, journalists, content creators, and anyone who values high-quality sound without spending a lot of time on manual editing.
Avatarify
Avatarify is a tool for face animation and avatar effects for video experiments, especially useful for visual prototypes, filters, and playful live formats when rights, labeling, and context are clearly defined.
Avid Media Composer
Avid Media Composer is professional video editing software that has been established in the film and television industry for decades. It offers powerful tools for editing, color correction, and post-production, and is used by many studios and independent producers worldwide. Focused on complex projects and collaborative workflows, Avid Media Composer is a central solution for professional video editing.
Avigilon
Avigilon provides video surveillance, security cameras, and analytics for physical security environments.
Avigilon Control Center
Avigilon Control Center (ACC) is a powerful video management system (VMS) designed specifically for security solutions. It offers advanced analytics, an intuitive interface, and flexible scalability, providing a comprehensive platform to monitor and manage video surveillance systems across various environments.
Avoma
Avoma is an AI-powered tool designed specifically for optimizing meetings, sales processes, and transcription. It helps teams streamline meetings, automatically capture key conversation topics, and gain valuable insights. With intelligent analysis functions, Avoma improves communication and simplifies post-meeting tasks.
Axis Camera Station
Axis Camera Station is a comprehensive video management system (VMS) designed specifically for small and medium-sized businesses to efficiently control video surveillance systems. It supports the management of Axis network cameras and encoders, offering a user-friendly interface for live viewing, recording, and playback of video footage. The software facilitates centralized monitoring and enhances security.
Axis Communications
Axis Communications provides network cameras, video hardware, and analytics for IP-based security and monitoring systems.
BigBlueButton
BigBlueButton is an open-source web conferencing tool for education and training, especially useful for schools, universities, and organizations that want self-hosting, classroom workflows, breakout rooms, recording, and moderation without relying on a proprietary all-in-one platform.
Biteable
Biteable is a user-friendly online tool that utilizes artificial intelligence to quickly and easily create marketing videos, explainer videos, and social media content. The platform is particularly suited for businesses, marketing professionals, and content creators who want to produce professional-looking videos without requiring extensive video production knowledge.
Bitwig Studio
Bitwig Studio is a modern digital audio workstation (DAW) that is known for its flexibility and extensive creative possibilities. Developed for musicians, producers, and sound designers, Bitwig Studio offers a modular environment for music production that provides numerous tools for both beginners and professionals. With an intuitive user interface and innovative features, Bitwig Studio supports the implementation of ideas in all music styles.
Boomy
Boomy is an audio and music tool for AI music generation, suited to quick song sketches, background music, and creative audio experiments.
Boords
Boords is a video and production tool that supports storyboarding and pre-production workflows for videos, animation, and client presentations.
BriefCam
BriefCam is an innovative video analysis platform that utilizes artificial intelligence (AI) to quickly and efficiently evaluate large volumes of video material. The software identifies, filters, and summarizes relevant events in recorded videos, which makes security and surveillance tasks more effective. BriefCam is used in various industries, including public security, retail, transportation, and facility management.
Buzzsprout
Buzzsprout is a user-friendly podcast hosting platform that allows users to easily publish, manage, and distribute their podcasts. With a clear interface and automated tools, Buzzsprout helps podcasters get their content online and available on various platforms. The platform is suitable for both beginners and experienced podcasters who prioritize ease of use and reliable hosting.
Camtasia
Camtasia is a video and production tool for screen recording, tutorial editing, and learning-video production for clear step-by-step content.
Canva Video
Canva Video is a user-friendly online tool for creating and editing videos that stands out for its intuitive interface and versatile design options. It is designed for users who want to create engaging videos for social media, presentations, or marketing purposes without needing deep technical knowledge. Its integrated AI features support both beginners and experienced designers in implementing creative video projects.
CapCut
A versatile mobile video editor for beginners and advanced users, with intuitive tools, AI-powered features, and quick social media publishing.
Celtx
Celtx helps teams organize scriptwriting, production, and pre-production in one place, especially when multiple people need to coordinate scenes, resources, and approvals.
Cisco Webex
Cisco Webex is a comprehensive platform for video conferencing, online meetings, and collaboration. It combines a user-friendly interface with a broad set of features, including AI-powered tools that help streamline meetings, improve communication, and make virtual events easier to organize. The platform is designed for teams and organizations that need reliable, secure, and scalable remote collaboration.
Clarifai
Clarifai is a powerful AI platform specializing in image and video recognition. Using modern artificial intelligence and machine learning, Clarifai enables businesses and developers to automatically analyze, categorize, and understand visual content. The platform supports various application areas from automatic image captioning to recognizing complex visual patterns.
Clarifai Video Recognition
Clarifai Video Recognition is a powerful AI-based solution for automated content analysis and recognition in video footage. The platform uses deep learning models to identify objects, scenes, actions, and other relevant elements in videos, enabling more efficient video workflows for media companies, security services, and marketing departments.
Cleanvoice AI
Cleanvoice AI is an intelligent audio tool designed to automate and simplify post-production of audio recordings. It uses artificial intelligence to automatically detect and remove unwanted elements such as filler words, background noise, and other imperfections in audio recordings. This helps to create professional-sounding audio files more quickly and efficiently without the need for extensive manual editing.
Clipchamp
Clipchamp is an easy-to-use online video editing platform with AI-powered features, templates, collaboration tools, and cloud storage for creating and editing videos without installing complex software.
D-ID
D-ID is an innovative AI-based platform that specializes in creating realistic, animated videos from photos and text. With advanced deep learning technology, D-ID enables the automatic generation of face animations for marketing, training, design projects, and content creation. The platform offers a simple way to automate visual content and save time and resources.
Dahua Technology
Dahua Technology is a leading provider of video surveillance and security solutions. The company offers a wide range of hardware and software products that combine modern video analytics, intelligent monitoring, and reliable security features. Dahua caters to businesses and organizations needing robust and scalable video surveillance systems to effectively protect their assets and personnel.
Deep Dream Generator
Deep Dream Generator is a design and creative tool for AI image experiments, stylized visuals, and creative image variants with a surreal character.
DeepFaceLab
DeepFaceLab is open-source software for creating deepfake videos. The application allows users to swap or manipulate faces in videos using artificial intelligence. It is particularly useful in the fields of research, media production, and creative projects. The software offers a range of tools for face reconstruction, training neural networks, and precise video editing.
Deepgram
Deepgram is a cloud-based platform for automatic speech recognition and transcription. It converts audio and video content into searchable text quickly, accurately, and at scale. The solution is primarily aimed at developers and enterprises who want to integrate speech recognition into their applications, and offers flexible APIs and SDKs.
Descript
Descript is an innovative AI-powered software platform specifically designed for the editing of audio and video content. With a combination of advanced transcription, text-based editing, and multimedia cutting, Descript greatly simplifies the production of podcasts, videos, and other digital media. The intuitive interface and automated features make it a popular tool for content creators, marketers, and creatives of all skill levels.
Descript Overdub
Descript voice workflow for voice cloning, speech repair, and text-based audio editing.
Descript Studio Sound
Descript Studio Sound is an AI speech enhancement feature inside the Descript production workflow. It is designed to make voices sound clearer, closer, and more professional by reducing noise, room echo, muffled microphone quality, and uneven levels. Its practical value is that everyday recordings can become usable much faster, without rebuilding every track through a manual chain of audio plugins.
Discord
Discord is a versatile communication platform designed specifically for interaction within communities, teams, and groups. It combines text, voice, and video chat in a user-friendly interface and is well suited for both productive collaboration and casual communication. With its freemium pricing model, Discord offers both free core features and optional premium features that expand the user experience.
Ecrett Music
Ecrett Music generates licensable background music for videos, games, presentations, and content projects.
ElevenLabs
ElevenLabs is a cutting-edge AI-based audio platform specializing in the creation and editing of speech content. With modern text-to-speech technologies, ElevenLabs enables natural and expressive speech synthesis that can be used in various applications. The platform offers both a free entry-level version and paid plans with enhanced features.
Envision AI
Envision AI is a video-analysis tool for visual assistance and object recognition, with a strong focus on accessibility use cases and the practical questions of privacy, offline use, and misinterpretation.
FabFilter Pro-L 2
A professional limiter for mastering and final loudness control, with transparent signal processing, detailed metering, and flexible limiting modes for music production and audio post-production.
Fathom
Fathom is an intelligent tool for automatic transcription and summarization of online meetings. It helps users capture important conversation content without taking notes manually, and supports team productivity. By integrating with popular video conferencing platforms, Fathom makes post-meeting follow-up easy and efficient.
Filmora
Filmora is a video and production tool that offers accessible video editing for creators, tutorials, social clips, and simple productions.
Fireflies.ai
Fireflies.ai is an AI-powered tool for automatic transcription and recording of meetings. It helps teams keep conversations efficient, create notes, and quickly find important information. By integrating with various meeting platforms and analyzing conversation content, Fireflies.ai simplifies post-meeting follow-up and improves collaboration.
FL Studio
FL Studio is an audio and music tool: a digital audio workstation (DAW) for beatmaking, electronic music, recording, and full music production.
FlexClip
FlexClip is a user-friendly online tool for creating and editing videos. It is designed for users who want to create engaging videos for marketing, social media, presentations, or personal projects without needing advanced technical knowledge. With a wide range of templates, intuitive editing features, and automated functions, FlexClip supports the efficient creation of video content.
Fliki
Fliki is an innovative AI tool designed specifically for creating videos and podcasts from text content. With the help of artificial intelligence, Fliki transforms text into engaging audiovisual media suitable for marketing, education, or social media. The platform offers an intuitive user interface and a wide range of customization options to quickly and efficiently produce content.
FrameForge
FrameForge is an innovative software solution specifically designed for planning and visualizing film and video projects. By utilizing modern AI technologies, FrameForge helps filmmakers, designers, and creatives bring scripts to life in detailed storyboards and virtual sets, enabling more efficient planning, improved team communication, and significant time savings during pre-production.
Genetec Clearance
Genetec Clearance is a cloud-based platform designed for the secure management and sharing of video evidence and other digital proof. It enables law enforcement agencies, businesses, and organizations to efficiently store, organize, and share video data with authorized partners. The platform facilitates investigation collaboration and promotes transparency through straightforward access management and audit trails.
Google Cloud Text-to-Speech
Google Cloud Text-to-Speech is a powerful AI-based service that converts written text into natural-sounding speech. It uses advanced deep learning models to provide a wide range of voices and languages suitable for applications in audiobooks, voice assistants, learning programs, and more. With flexible customization options and a user-friendly API, this service is ideal for developers and businesses looking to create high-quality audio content automatically.
Google Cloud Video Intelligence
Google Cloud Video Intelligence is a cloud service for automatically analyzing video content. It uses machine learning to detect objects, scenes, activities, and spoken content, helping organizations categorize videos, streamline workflows, and quickly extract relevant information.
GoToMeeting
GoToMeeting is an online meeting platform for video calls, screen sharing, and business collaboration.
Hera
Hera is an AI motion-design tool for creating short marketing, product, and launch videos from prompts, assets, and visual direction.
HeyGen
HeyGen is a practical tool for creating AI avatar videos, localizing video content, and producing synthetic presentations for marketing, training, support, and internal communication.
Higgsfield
Higgsfield is a video and production tool for AI video generation and creative motion experiments for social and campaign ideas.
Hindenburg Journalist
Specialized audio editing software for journalists, podcasters, and radio professionals, with an emphasis on ease of use, automation, and a streamlined production workflow.
IBM Watson Speech to Text
A cloud-based speech recognition service that converts audio into text with support for real-time and batch transcription, multiple languages, speaker identification, and API integration.
IBM Watson Text to Speech
A cloud-based text-to-speech service that turns written text into natural-sounding speech, supports multiple languages and voices, and helps teams build accessible, interactive applications.
IBM Watson Video Analytics
IBM Watson Video Analytics is an advanced solution for analyzing and evaluating video data using artificial intelligence. The platform enables companies to automatically process large volumes of video footage in order to gain valuable insights, improve security measures, and optimize operational workflows. With powerful features such as object detection, motion analysis, and automatic event detection, IBM Watson Video Analytics supports a wide range of use cases in industry, retail, public spaces, and more.
InVideo
A template-based video production tool for marketing and social media teams that helps combine scripts, clips, and text panels into publishable videos.
iSpeech
iSpeech is an AI-powered speech processing platform for text-to-speech and speech-to-text workflows, with APIs for integrating voice features into websites, apps, and business systems.
iZotope Ozone
iZotope Ozone is professional audio mastering software that uses AI-powered technologies to simplify and optimize the mastering process. With a broad set of tools and intelligent algorithms, it helps music producers, sound engineers, and creators take their sound to a new level, whether in the studio or on the go.
Jitsi Meet
Jitsi Meet is an open-source video conferencing platform for running online meetings quickly and easily, without registration or installation. It offers a secure and flexible solution for individuals, teams, and organizations looking for a straightforward way to communicate, with a strong focus on privacy and ease of use.
Kapwing
A browser-based, AI-assisted platform for creating and editing videos and multimedia content, with templates, collaboration tools, and simple design features for creators, marketers, teams, and beginners.
Kling AI
Kling AI is a cloud-based AI video tool for fast creation, editing, and export of professional-looking videos, with templates, effects, and team features.
Krisp
AI-powered audio software that removes background noise in real time for calls, video meetings, and recordings, with support for major communication tools and local processing for privacy.
LANDR
LANDR is an audio and music tool for mastering, music distribution, and audio workflows for independent musicians and creators.
Libsyn
Libsyn is an established podcast hosting platform focused on easy distribution and monetization of audio content, with tools for managing, publishing, and analyzing podcasts.
Lingvanex
Translation and language platform for text, speech, files, API, and business scenarios.
Loom
Loom is a powerful screen recording and video communication tool designed primarily for digital collaboration and customer interaction. It allows users to quickly create videos to explain complex topics, provide feedback, or share information visually—avoiding lengthy meetings or emails. Its intuitive interface makes it appealing to both individuals and teams.
Loudly
Loudly is an audio and music tool for AI music, soundtracks, and licensable audio variants for content production.
LoudMax
LoudMax is a free audio limiter designed specifically for mastering and adjusting the loudness of music and audio content. The plugin allows you to significantly boost the volume of an audio signal without audible distortion or quality loss. With its simple interface and efficient processing, LoudMax is a popular choice for musicians, producers, and audio engineers seeking a fast and reliable solution for volume optimization.
Lumiere
Lumiere is an AI tool for creative and productivity workflows, with an intuitive interface, freemium access, and paid plans for advanced features.
Magisto
Magisto is an AI-powered video editing platform that automates cutting, effects, music, and publishing so users can create polished videos quickly for marketing, social media, or personal use.
MeldaProduction MLimiter
A powerful, versatile limiter plugin for audio mastering, designed to maximize loudness while preserving clarity and control. It offers a user-friendly interface, detailed dynamics control, and a free version that makes it accessible for both beginners and experienced producers.
Microsoft Azure Cognitive Services - Text to Speech
Microsoft Azure Cognitive Services - Text to Speech is a powerful cloud-based service that converts written text into natural-sounding speech. With a wide range of voices, languages, and customization options, this service is suitable for applications in areas such as accessibility, customer service, e-learning, and more. Integration is handled through an API, offering flexible deployment options across a variety of software solutions.
Microsoft Azure Speech Service
Microsoft Azure Speech Service is a cloud-based speech processing platform for transcription, text-to-speech, translation, and speech understanding. It supports a wide range of use cases for customer service, media, education, and workflow automation.
Microsoft Azure Speech to Text
Microsoft Azure Speech to Text is a cloud-based service that converts spoken language into text. It is suitable for meeting transcription, app integration, accessibility, and productivity workflows, with support for real-time and batch transcription, speaker identification, and customizable speech models.
Milestone Systems
Milestone Systems is a business and operations platform that provides video management and security infrastructure for professional surveillance and site systems.
Mimic
Mimic is an AI-based speech synthesis tool for generating natural, realistic voices for audiobooks, virtual assistants, audio content, and other applications. It offers flexible voice generation with multiple languages, API integration, and plan-dependent offline use.
Mivi
Mivi is an AI-powered video tool for creating and editing videos quickly, with a freemium model and a range of templates, export options, and collaboration features.
Murf
Murf is an audio and music tool for AI voices, voiceovers, and speech production for videos, courses, and marketing material.
Mycroft
Mycroft is an open-source voice assistant with flexible customization, smart home control, privacy-focused local processing, and support for developers, hobbyists, and organizations looking for an independent alternative to proprietary assistants.
Naoma AI
Naoma AI is an AI video sales agent for B2B SaaS teams, designed to run personalized product demos, qualify leads, and route prospects into the next sales step.
NightCafe Studio
NightCafe Studio is an AI-powered image generation platform for creating artwork from text prompts, with adjustable parameters, cloud-based access, export options, and community features.
Noise Blocker
Noise Blocker is an AI-powered noise suppression tool for calls, meetings, recordings, and streaming, designed to reduce background noise and improve clarity.
Nuance Dragon
Powerful speech recognition software for dictation, transcription, and productivity, with high accuracy, customizable options, and support for both personal and professional use.
OBS Studio
OBS Studio is an open-source video and production tool for streaming and screen recording in live productions, tutorials, and events.
Ocenaudio
Ocenaudio is a free audio editor for quick cuts, recording checks, and simple editing without a complex studio environment.
OpenCV
Computer vision library for image and video processing, suited to teams with their own CV models, camera data, or edge projects.
Otter.ai
Otter.ai is an AI-powered transcription and note-taking tool for meetings, interviews, lectures, and other spoken content.
Pika
Pika is an AI-powered video tool for creating and editing video content more efficiently, with automated features, an intuitive interface, and collaboration options for creators, marketing teams, and businesses.
PixVerse
PixVerse is a video and production tool for AI video generation from prompts, images, or ideas for short creative clips.
Play.ht
Play.ht is a text-to-speech platform for turning written content into natural-sounding audio for podcasts, audiobooks, e-learning, and other use cases.
Podbean
Podbean is a comprehensive podcast platform that offers both hosting and monetization options. With a user-friendly interface and versatile features, Podbean helps podcasters create and publish their content and make it accessible to a broad audience. The platform is well suited to both beginners and experienced podcasters who value ease of use and professional tools.
Podcastle
Podcastle is an AI-powered platform for creating, recording, and editing audio and video content, with tools for transcription, audio enhancement, collaboration, and publishing workflows.
PowerDirector
PowerDirector is a powerful desktop video editing application with an intuitive interface, built-in AI tools, and a wide range of effects, templates, and export options for both beginners and professionals.
ReadSpeaker
Natural-sounding text-to-speech software for websites, apps, and digital learning content, with multilingual voices, accessibility features, and API or widget integration.
Renderforest
Renderforest is a versatile online platform for creating professional videos, animations, logos, and websites with templates and AI-powered tools. It is especially useful for producing marketing videos, explainers, and other visual content quickly, even without deep design or video editing experience.
Resemble AI
Resemble AI is a voice synthesis and cloning tool for teams that need fast, flexible audio production with clear rules around consent, labeling, security, and editorial review.
Respeecher
Respeecher is a cloud-based voice cloning and synthetic speech tool for media teams that need repeatable workflows, clear consent handling, and reliable quality review for film, games, and localization.
ResponsiveVoice
ResponsiveVoice is an AI-powered text-to-speech solution that makes it easy to add voice output to websites and applications. It supports many languages and voices, with straightforward integration for accessibility, interactivity, and automated audio workflows.
Runway
Runway is an innovative AI platform that gives creators and developers powerful tools for creating and editing media content. With a focus on machine learning and real-time video processing, Runway makes it possible to integrate state-of-the-art AI models into creative workflows. The platform is suitable for both beginners and professionals and combines an intuitive interface with extensive functionality.
RX Elements by iZotope
RX Elements by iZotope is specialized audio editing software that focuses primarily on repairing and enhancing audio recordings. With a range of intelligent tools, it enables users to effectively remove unwanted noise such as hiss, clicks, or hum and improve the sound quality of speech and music recordings. The software is suitable for both beginners and advanced users who are looking for a cost-effective solution for audio restoration.
Samsung Bixby
A Samsung-only virtual assistant for voice control, automation, and quick access to information across compatible devices.
Slate Digital FG-X
Slate Digital FG-X is a professional mastering tool for maximizing loudness while preserving transparency, dynamics, and mix clarity.
Sonix
Sonix is an AI transcription and captioning tool for audio and video files. It helps turn interviews, meetings, podcasts, videos, and research recordings into searchable text faster.
Sora
Sora is an AI video generation tool from OpenAI for creating short video clips from text prompts.
Soundraw
Soundraw is an AI music composition tool for creating and adapting tracks quickly for videos, podcasts, and other creative projects.
Soundtrap
Soundtrap is a browser-based audio and music tool for music production and audio collaboration on songs, podcasts, and education projects.
Speech-to-Text
AI-powered speech-to-text tools that automatically convert spoken language into written text for transcription, productivity, accessibility, and content workflows.
Speechify
Speechify is an AI-powered text-to-speech tool that turns written content into natural-sounding audio. It helps users consume text more efficiently for study, work, or leisure, with a user-friendly interface and a range of features. A free version is available, along with paid plans that add more advanced capabilities.
Speechly
Speechly is an AI-powered speech processing solution for adding real-time voice commands, speech recognition, and natural language understanding to web and mobile applications.
Speechmatics
Speechmatics provides automatic speech recognition and transcription for audio, video, meetings, and multilingual workflows.
Splice
Splice is a versatile platform focused on helping creatives produce audio and video content. With a combination of AI-powered tools and an extensive library of sounds, samples, and templates, Splice enables users to make their projects more efficient and more creative. The platform is aimed primarily at musicians, video producers, and content creators who want to boost their productivity.
Spreaker
Spreaker is a versatile platform for podcast creation and publishing, with tools for recording, editing, distribution, live streaming, analytics, monetization, and team collaboration.
Storyboarder
Storyboarder is a storyboarding tool for sketching film, animation, and video ideas. Its practical value shows up where scene order, camera ideas, and timing need to become visible early, without spreading every decision across separate tools.
StudioBinder
A production management platform for film and video teams with planning, collaboration, task tracking, and media organization features that can also support audio-related workflows.
Suno AI
An AI-powered music generation tool for creating songs, including vocals and instrumentation, from text prompts, aimed at beginners and professionals alike.
Synthesia
Synthesia is an AI video production platform for creating professional videos with virtual avatars and automated voice synthesis, suitable for presentations, training, and marketing content.
T-RackS by IK Multimedia
T-RackS by IK Multimedia is a mixing and mastering suite for shaping finished audio with EQ, compression, limiting, saturation, metering, and analog-style color. It is aimed at musicians, producers, engineers, and podcasters who want more control over loudness, balance, and overall polish, while still relying on careful listening and reference-based decisions.
TDR Limiter 6 GE
TDR Limiter 6 GE is a professional audio plugin designed specifically for mastering and volume control. It offers precise and flexible dynamic processing with multiple limiter types and extensive customizable settings. Renowned for its high sound quality and user-friendly interface, it is a popular choice among sound engineers and music producers.
TeamSpeak
TeamSpeak is a versatile voice chat application widely used for online gaming and team communication. It delivers stable audio quality, low latency, and extensive features to support effective collaboration and clear real-time communication. It follows a freemium model and is aimed at users who prioritize reliable voice connections.
Temi
Temi is an automatic transcription service that enables quick and accurate conversion of audio and video files into text. Utilizing modern speech recognition technology, Temi is especially helpful for individuals who regularly need to transcribe audio content, such as journalists, students, and content creators. The service offers ease of use and delivers results rapidly, significantly boosting productivity.
Transana
Transana is specialized software for the transcription, coding, and analysis of audio and video material. It is particularly useful in research and qualitative data analysis, helping users systematically evaluate and interpret extensive multimedia data. The software offers a variety of tools to efficiently search, annotate, and categorize media content.
TurboScribe
TurboScribe is a modern transcription tool powered by artificial intelligence, designed specifically for fast and accurate conversion of audio files into text. It is ideal for users who want to transcribe audio content automatically, whether for interviews, meetings, podcasts, or other voice recordings. With an intuitive user interface and flexible pricing, TurboScribe offers both beginners and professional users an effective solution for audio transcription.
Verkada
Verkada is a modern video surveillance solution that combines cloud-based security technology with smart hardware. It provides businesses with an easy way to manage security cameras, analyze video data, and monitor multiple locations in real time. Verkada integrates video surveillance with advanced analytics to enhance control and enable faster responses to security incidents.
VivaCut
VivaCut is a versatile video editor aimed especially at users who want to edit professional videos on mobile devices or desktop. With a mix of easy-to-use tools and advanced features, VivaCut supports both beginners and experienced video creators in producing engaging videos for social media, presentations, or personal projects.
VLLO
A user-friendly mobile video editing app for creating, editing, and sharing professional-looking videos with an intuitive interface, versatile tools, and a freemium pricing model.
VN
A modern video editing app for beginners and experienced creators, with intuitive tools, multi-track editing, and flexible export options.
Vyrill
Vyrill is a video-commerce and video-intelligence platform for making product, UGC and review videos searchable, analyzable and usable in commerce workflows.
Wave.video
Wave.video is a versatile online platform for creating, editing, hosting, and sharing videos, with a focus on marketing, social media, and streaming. It combines templates, editing tools, branding options, collaboration features, and live streaming support in a single tool designed for both beginners and professionals.
WavePad
WavePad is a versatile audio editing tool for everything from simple trimming to more complex production work. It offers an intuitive interface, broad format support, and practical features for recording, editing, adding effects, batch processing, and exporting audio across different platforms.
Waves Abbey Road TG Mastering Chain
A mastering plugin that recreates the legendary Abbey Road console sound, with EQ, compression, limiting, and saturation in a flexible workflow for mixing and mastering.
Waves L1 Ultramaximizer
The Waves L1 Ultramaximizer is a professional audio plugin designed for mastering and optimizing the loudness of music and audio productions. Using precise limiting, it raises loudness while keeping distortion low and preserving the sound quality of your tracks. As one of the best-known limiters in the audio industry, the L1 Ultramaximizer is aimed at producers, sound engineers, and musicians who want to finalize their productions.
Waves L2 Ultramaximizer
A professional mastering limiter for controlling loudness with transparent clipping protection, dithering, and a simple interface.
WellSaid Labs
WellSaid Labs is a cloud-based AI text-to-speech platform for turning written content into natural-sounding voice recordings. It offers realistic voices, customization controls, API access, team collaboration, and export options for use in voice-overs, audiobooks, learning content, and podcasts.
WeVideo
A cloud-based video editing platform for creating, editing, and sharing videos online with collaboration tools, templates, stock media, and AI-assisted features.
Whereby
Whereby is a browser-based video conferencing platform for quick, simple online meetings, with fixed room links, screen sharing, chat, recording on paid plans, and flexible use for remote work, client meetings, and fast coordination.
Wispr Flow
Wispr Flow is an AI dictation tool for fast voice-first writing in apps, documents, chats, and workflows.
Zamzar AI
A practical file conversion tool for quickly preparing documents, images, audio, and video for further workflows, with clear limits around sensitive data, quality, and governance.
Zencastr
Zencastr is an audio tool for remote podcast recording, with audio/video capture and a production workflow for conversations.