Projects tagged with machine-learning:

  • Tacotron: End-to-end Speech Synthesis

    2016 - present · tags: google machine-learning sound

    [Figure: General architecture of Tacotron.]

    The most exciting work I’ve been involved with on the Sound Understanding team has been the development of Tacotron, a speech synthesis system that works end-to-end, from characters to waveform. The initial system took verbalized characters as input and produced a log-magnitude mel spectrogram, which we then synthesized to a waveform via standard signal processing (Griffin-Lim phase reconstruction and an inverse short-time Fourier transform). In Tacotron 2, we replaced this hand-designed synthesis step with a neural vocoder, initially based on WaveNet.
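
    As a rough sketch of what that Griffin-Lim step looks like, here is a minimal version built on tf.signal (not our production implementation). It assumes a linear-frequency magnitude spectrogram as input; a predicted log-mel spectrogram would first be exponentiated and mapped back to linear frequency, and the frame parameters and iteration count are illustrative rather than the values we actually used:

      import tensorflow as tf

      def griffin_lim(magnitudes, frame_length=1024, frame_step=256,
                      iterations=60):
        """Recovers a waveform from a linear-frequency magnitude spectrogram."""
        magnitudes = tf.cast(magnitudes, tf.complex64)
        # Start from a random phase estimate in [0, 2*pi).
        angles = tf.random.uniform(tf.shape(magnitudes), maxval=2.0 * 3.14159265)
        phase = tf.exp(tf.complex(0.0, 1.0) * tf.cast(angles, tf.complex64))
        stft = magnitudes * phase
        for _ in range(iterations):
          # Invert with the current phase estimate...
          waveform = tf.signal.inverse_stft(stft, frame_length, frame_step)
          # ...then re-analyze, keep only the phase, and restore the known
          # magnitudes before the next pass.
          reanalyzed = tf.signal.stft(waveform, frame_length, frame_step)
          phase = reanalyzed / tf.cast(
              tf.maximum(tf.abs(reanalyzed), 1e-8), tf.complex64)
          stft = magnitudes * phase
        return tf.signal.inverse_stft(stft, frame_length, frame_step)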

    One line of research on my team is direct-to-waveform acoustic models, which skip the intermediate spectrogram representation entirely. In the Wave-Tacotron paper, we published a model that does just that.

    Check out our publications page for the full list of research my team has co-authored with the Google Brain, DeepMind, and Google Speech teams.

  • Google Sound Understanding

    2015 - present · tags: google sound machine-learning research

    In 2015, I joined the Sound Understanding team within Google Perception. We focus on building systems that can both analyze and synthesize sound. Being able to work on my hobby (sound and digital signal processing) as my full-time job has been a dream come true. We operate as a hybrid research team, which means we both publish our work and deploy it to improve Alphabet’s products and services.

    I’ve had the opportunity to work on some neat tasks and projects during my time on the team, but speech synthesis is where I’ve spent the most time.

  • Google TensorFlow

    2015 - present · tags: google machine-learning

    [Image: TensorFlow's OG logo.]

    In 2015, I joined the Sound Understanding team within Google Perception. Our main tool for machine learning research is TensorFlow. Over the years I’ve contributed a number of features to TensorFlow that have been crucial to the research work my team has done.

    The highlights include:

    • Added the tf.signal module of signal processing components (see the usage sketch after this list).
    • Extended real and complex FFT support, implemented with Eigen on CPU and cuFFT on GPU, along with TPU support.
    • Significantly expanded complex number support.
    • Bugfixes and contributions to various parts of the runtime, libraries, and more.
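
    As a rough illustration of what tf.signal enables, here is a minimal sketch of computing a log-mel spectrogram from a waveform. The function name and parameter values are illustrative choices, not code lifted from TensorFlow or from our pipelines:

      import tensorflow as tf

      def log_mel_spectrogram(waveform, sample_rate=16000, frame_length=400,
                              frame_step=160, num_mel_bins=80):
        """Computes a log-magnitude mel spectrogram from a float32 waveform."""
        # Short-time Fourier transform: complex64, shaped [..., frames, fft_bins].
        stft = tf.signal.stft(waveform, frame_length=frame_length,
                              frame_step=frame_step)
        magnitudes = tf.abs(stft)
        # Warp the linear-frequency bins onto a mel scale.
        mel_weights = tf.signal.linear_to_mel_weight_matrix(
            num_mel_bins=num_mel_bins,
            num_spectrogram_bins=magnitudes.shape[-1],
            sample_rate=sample_rate)
        mel = tf.matmul(magnitudes, mel_weights)
        # Log compression, with a small offset to avoid log(0).
        return tf.math.log(mel + 1e-6)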

    Check out the full list of commits on GitHub.