NewsBytes
    In brief
    • Microsoft's AI tool, VALL-E 2, generates 'deepfake voices' so realistic that the company is withholding it from public release over misuse concerns.
    • The tool's advanced speech synthesis surpasses previous systems in robustness, naturalness, and speaker similarity, but its release is withheld due to potential risks around voice cloning and deepfake technology.
    • Despite this, researchers see potential applications in various fields, provided there are protocols for speaker approval and synthesized speech detection.
    Microsoft's AI tool creates 'deepfake voices' so real they're banned 
    It can reproduce human speech from just a few seconds of audio


    By Dwaipayan Roy
    Jul 11, 2024
    10:05 am
    What's the story

    Microsoft has developed an AI speech generator, VALL-E 2, that mimics human voices so convincingly that the company will not release it to the public. According to a paper published on arXiv, the text-to-speech (TTS) generator can reproduce a person's speech using just a few seconds of audio. The researchers describe VALL-E 2 as "the latest advancement in neural codec language models that marks a milestone in zero-shot text-to-speech synthesis (TTS), achieving human parity for the first time."

    AI advancements

    Key reasons behind VALL-E 2's performance

    VALL-E 2's high-quality speech synthesis is attributed to two key features: "Repetition Aware Sampling" and "Grouped Code Modeling." The former makes the conversion of text into speech more stable by detecting when the model keeps repeating the same units of sound, which prevents it from getting stuck in infinite loops of sounds or phrases. The latter groups the audio codes so that the model works with shorter sequences, which speeds up how quickly VALL-E 2 generates speech and eases the difficulty of processing long strings of sounds.
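
    The two ideas can be illustrated with a short Python sketch. It is not Microsoft's code: the function names, the window size, the repetition threshold, and the group size are all assumptions chosen for illustration. The sketch picks the next sound token from only the most likely candidates, falls back to drawing from the full distribution when that token already dominates the recent history, and shows how grouping shortens a sequence.

```python
import random
from typing import List, Sequence, Tuple


def nucleus_sample(probs: Sequence[float], top_p: float, rng: random.Random) -> int:
    """Sample a token index from the smallest set of top tokens whose mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: List[int] = []
    total = 0.0
    for i in ranked:
        kept.append(i)
        total += probs[i]
        if total >= top_p:
            break
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def repetition_aware_sample(
    probs: Sequence[float],
    history: Sequence[int],
    rng: random.Random,
    top_p: float = 0.8,      # assumed value
    window: int = 10,        # assumed value
    max_ratio: float = 0.5,  # assumed value
) -> int:
    """Nucleus sampling with a fallback when the chosen token repeats too often."""
    token = nucleus_sample(probs, top_p, rng)
    recent = list(history[-window:])
    if recent:
        ratio = recent.count(token) / len(recent)
        if ratio > max_ratio:
            # The token dominates the recent history, so draw from the full
            # distribution instead to give other tokens a chance.
            token = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return token


def group_codes(codes: Sequence[int], group_size: int) -> List[Tuple[int, ...]]:
    """Split a code sequence into consecutive groups, shortening it by about group_size."""
    return [tuple(codes[i:i + group_size]) for i in range(0, len(codes), group_size)]


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.9, 0.05, 0.05]
    stuck_history = [0] * 10
    print(repetition_aware_sample(probs, stuck_history, rng))
    print(len(group_codes(list(range(10)), 2)))
```

    In this toy setup, plain nucleus sampling can only ever return token 0, so a decoder that relied on it alone could loop on that token forever. The fallback is what lets the other tokens appear once the history fills up with repeats.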

    Success

    VALL-E 2 surpasses previous AI systems in speech synthesis

    Researchers used audio samples from the LibriSpeech and VCTK speech libraries, along with ELLA-V, an evaluation framework, to assess VALL-E 2's performance. They concluded that "VALL-E 2 surpasses previous zero-shot TTS systems in speech robustness, naturalness, and speaker similarity," making it the first to reach human parity on these benchmarks. However, the quality of VALL-E 2's output depends on the length and quality of the speech prompts and on environmental factors such as background noise.

    Public release

    Microsoft withholds VALL-E 2 over misuse concerns

    Despite its capabilities, Microsoft has decided not to roll out VALL-E 2 to the public due to potential misuse risks. This decision echoes rising concerns around voice cloning and deepfake technology. The researchers stated in a blog post that "VALL-E 2 is purely a research project. Currently, we have no plans to incorporate VALL-E 2 into a product or expand access to the public."

    Future prospects

    Possible applications for AI speech tech

    The researchers suggested potential applications for AI speech technology like VALL-E 2 in education, entertainment, journalism, accessibility features, translation, interactive voice response systems, and chatbots. They stated that if the model is generalized to unseen speakers in the real world, it should include a protocol to ensure that the speaker approves the use of their voice, as well as a synthesized speech detection model.
