WaveNet, a neural network developed by Google’s DeepMind, offers a better option. It can produce some of the most realistic synthetic human voices to date. It models audio waveforms directly from samples of real human voices and generates its own sounds, capturing the subtleties of human speech. It may not be as realistic as an actual person speaking, but so far it sounds much better than the output of other text-to-speech programs. DeepMind has also published generated samples that closely mimic a real human voice but, on their own, are just sounds with no content. As explained in their paper, WaveNet: A Generative Model for Raw Audio:
WaveNet has also shown that it can synthesize other audio signals, such as automatically generated piano music. This is less complicated than producing speech that sounds as real as a human voice.
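
To give a feel for what modelling a raw waveform involves: the WaveNet paper first compresses every audio sample to one of 256 values using mu-law companding, so the network only has to pick the next value from a small set. Below is a minimal sketch of that step in Python, assuming NumPy is available. The function names are illustrative and are not taken from DeepMind's code, and the sine wave simply stands in for recorded audio.

```python
import numpy as np

MU = 255  # gives 256 quantization levels, as described in the WaveNet paper


def mu_law_encode(audio, mu=MU):
    """Map float samples in [-1, 1] to integer classes 0..mu."""
    audio = np.clip(audio, -1.0, 1.0)
    # Compress: quiet samples get finer resolution than loud ones.
    compressed = np.sign(audio) * np.log1p(mu * np.abs(audio)) / np.log1p(mu)
    # Shift from [-1, 1] to [0, mu] and round to the nearest integer class.
    return ((compressed + 1) / 2 * mu + 0.5).astype(np.int64)


def mu_law_decode(classes, mu=MU):
    """Map integer classes 0..mu back to approximate float samples."""
    compressed = 2 * (classes.astype(np.float64) / mu) - 1
    return np.sign(compressed) * np.expm1(np.abs(compressed) * np.log1p(mu)) / mu


# Stand-in for a recording: 10 ms of a 440 Hz tone at 16 kHz (assumed values).
t = np.arange(160) / 16000.0
waveform = 0.8 * np.sin(2 * np.pi * 440.0 * t)

classes = mu_law_encode(waveform)   # what a model would be trained to predict
restored = mu_law_decode(classes)   # what a listener would hear
```

A generative model in this style then predicts each class from the ones before it, one sample at a time, and the decoded result is played back as audio.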