• Home
  • About
    • Ravi Singh photo

      Ravi Singh

      A budding Data Scientist

    • Learn More
    • Email
    • Twitter
    • LinkedIn
    • Github
  • Posts
    • All Posts
    • All Tags
  • Projects

Speech to Face

20 Mar 20200

Reading time ~1 minute

How much can we infer about a person’s looks from the way they speak? In this paper, we study the task of reconstructing a facial image of a person from a short audio recording of that person speaking. We design and train a deep neural network to perform this task using millions of natural Internet/YouTube videos of people speaking. During training, our model learns voice-face correlations that allow it to produce images that capture various physical attributes of the speakers such as age, gender and ethnicity. This is done in a self-supervised manner, by utilizing the……. Read the full article here



CNNimageencoder Share Tweet +1