
deep-learning

SeanNaren/deepspeech.pytorch: Speech Recognition using DeepSpeech2.
GitHub is where people build software. More than 27 million people use GitHub to discover, fork, and contribute to over 80 million projects.
speech-recognition  pytorch  speech  audio  deep-learning 
yesterday by nharbour
pgmmpk/tfrecord: Python way to Read/Write TFRecords
tfrecord  pytorch  deep-learning  reading  format  python 
yesterday by nharbour
yuchenlin/lstm_sentence_classifier: LSTM-based Models for Sentence Classification in PyTorch
nlp  lstm  rnn  classifier  example  code  github  deep-learning 
yesterday by nharbour
[D] How impactful could the choice of the Optimizer in NN be? : MachineLearning
Hi, I was recently training a simple fully connected NN (in Keras) and was stuck at a certain accuracy (45%) using...
adam  optimizer  deep-learning 
yesterday by pmigdal
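The thread asks how much the optimizer alone can matter. A toy illustration in plain Python (a hypothetical setup, not the thread's Keras model): on an ill-conditioned quadratic loss, vanilla gradient descent and gradient descent with momentum use the same learning rate, yet momentum reaches a far lower loss in the same number of steps.

```python
def grad(w):
    # Gradient of the ill-conditioned quadratic f(x, y) = 0.5 * (x**2 + 100 * y**2).
    x, y = w
    return (x, 100.0 * y)

def gd(w, lr=0.01, steps=500):
    """Plain gradient descent: w <- w - lr * grad(w)."""
    for _ in range(steps):
        g = grad(w)
        w = (w[0] - lr * g[0], w[1] - lr * g[1])
    return w

def gd_momentum(w, lr=0.01, beta=0.9, steps=500):
    """Heavy-ball momentum: v <- beta*v + grad(w); w <- w - lr * v."""
    v = (0.0, 0.0)
    for _ in range(steps):
        g = grad(w)
        v = (beta * v[0] + g[0], beta * v[1] + g[1])
        w = (w[0] - lr * v[0], w[1] - lr * v[1])
    return w

def loss(w):
    return 0.5 * (w[0] ** 2 + 100.0 * w[1] ** 2)
```

Starting from `(1.0, 1.0)`, plain descent still has a visible loss after 500 steps (the flat direction shrinks only by a factor 0.99 per step), while momentum drives it orders of magnitude lower; adaptive methods such as Adam address the same conditioning problem by rescaling each coordinate.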
The mostly complete chart of Neural Networks, explained
The zoo of neural network types grows rapidly. One needs a map to navigate between the many emerging architectures and approaches. Fortunately, Fjodor van Veen from the Asimov Institute compiled a…
ml  ai  analytics  big_data  chart  cheatsheet  data_science  deep-learning  deep_learning  dnn 
yesterday by tranqy
[1805.07848] A Universal Music Translation Network
We present a method for translating music across musical instruments, genres, and styles. This method is based on a multi-domain wavenet autoencoder, with a shared encoder and a disentangled latent space that is trained end-to-end on waveforms. Employing a diverse training dataset and large net capacity, the domain-independent encoder allows us to translate even from musical domains that were not seen during training. The method is unsupervised and does not rely on supervision in the form of matched samples between domains or musical transcriptions. We evaluate our method on NSynth, as well as on a dataset collected from professional musicians, and achieve convincing translations, even when translating from whistling, potentially enabling the creation of instrumental music by untrained humans.
music  deep-learning  style-transfer 
2 days ago by arsyed
VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop
"We present a new neural text to speech (TTS) method that is able to transform text to speech in voices that are sampled in the wild. Unlike other systems, our solution is able to deal with unconstrained voice samples and without requiring aligned phonemes or linguistic features. The network architecture is simpler than those in the existing literature and is based on a novel shifting buffer working memory. The same buffer is used for estimating the attention, computing the output audio, and for updating the buffer itself. The input sentence is encoded using a context-free lookup table that contains one entry per character or phoneme. The speakers are similarly represented by a short vector that can also be fitted to new identities, even with only a few samples. Variability in the generated speech is achieved by priming the buffer prior to generating the audio. Experimental results on several datasets demonstrate convincing capabilities, making TTS accessible to a wider range of applications. In order to promote reproducibility, we release our source code and models."
speech-synthesis  deep-learning 
2 days ago by arsyed
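The "shifting buffer working memory" in the abstract above can be pictured abstractly (this is a toy sketch, not the paper's actual model): a fixed-size FIFO of vectors where each step pushes a new representation in at the front and evicts the oldest entry, and the whole buffer is what attention and the output head read from.

```python
from collections import deque

class ShiftingBuffer:
    """Toy fixed-size buffer: each update pushes a new vector and evicts the oldest."""

    def __init__(self, size, dim):
        # Start with `size` zero vectors of dimension `dim`.
        self.buf = deque([[0.0] * dim for _ in range(size)], maxlen=size)

    def update(self, new_vec):
        # Newest entry sits at position 0; maxlen drops the oldest from the other end.
        self.buf.appendleft(list(new_vec))

    def read(self):
        # Return a copy of the buffer contents, newest first.
        return [row[:] for row in self.buf]
```

In the paper the buffer contents additionally feed back into computing the next update; here only the push/evict mechanics are shown.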
