

roamresearch/modern-tensorflow.ipynb at master · roamanalytics/roamresearch
Contribute to roamanalytics/roamresearch development by creating an account on GitHub.
nlp  jupyter  tensorflow  best-practice  deep-learning 
11 hours ago by nharbour
How To Create a ChatBot With tf-seq2seq For Free! - Added June 07, 2018 at 01:15PM
chatbot  machine-learning  nlp  read2of  tutorial 
yesterday by xenocid
Building a Question-Answering System — Part 1 - Added June 07, 2018 at 01:14PM
data-science  nlp  read2of 
yesterday by xenocid
NLP's ImageNet moment has arrived
Word2vec and related methods are shallow approaches that trade expressivity for efficiency. Using word embeddings is like initializing a computer vision model with pretrained representations that only encode edges: they will be helpful for many tasks, but they fail to capture higher-level information that might be even more useful. A model initialized with word embeddings needs to learn from scratch not only to disambiguate words, but also to derive meaning from a sequence of words. This is the core aspect of language understanding, and it requires modeling complex language phenomena such as compositionality, polysemy, anaphora, long-term dependencies, agreement, negation, and many more. It should thus come as no surprise that NLP models initialized with these shallow representations still require a huge number of examples to achieve good performance.
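To make the "pretrained first layer" point concrete, here is a minimal sketch of initializing an embedding layer with pretrained word vectors (the sizes and the random vectors are placeholders, not anything from the article): only that first layer starts from pretrained representations, and everything above it still trains from scratch.

    import numpy as np
    import tensorflow as tf

    # Placeholder sizes and vectors; in practice these would be loaded
    # from a word2vec or GloVe file.
    vocab_size, embed_dim = 10000, 300
    pretrained_vectors = np.random.randn(vocab_size, embed_dim).astype("float32")

    # Only this first layer benefits from pretraining.
    embedding = tf.keras.layers.Embedding(
        vocab_size,
        embed_dim,
        embeddings_initializer=tf.keras.initializers.Constant(pretrained_vectors),
    )

    # The layers above it -- the ones that must learn to compose word
    # meanings into sentence meaning -- are randomly initialized.
    model = tf.keras.Sequential([
        embedding,
        tf.keras.layers.LSTM(128),
        tf.keras.layers.Dense(2),  # e.g. a binary classification head
    ])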

In NLP, models are typically much shallower than their CV counterparts. Analysis of features has thus mostly focused on the first embedding layer, and little work has investigated the properties of higher layers for transfer learning. Let us consider the datasets that are large enough to fulfill desideratum #1. Given the current state of NLP, there are several contenders.

Language modeling (LM) aims to predict the next word given the previous words. Existing benchmark datasets consist of up to 1B words, but since the task is unsupervised, any number of words can be used for training. A popular example is the WikiText-2 dataset, which consists of Wikipedia articles.
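As a sketch of the LM objective (hypothetical model sizes; the WikiText-2 vocabulary figure is approximate): at every position the model sees the tokens so far and is trained to assign high probability to the one that follows.

    import tensorflow as tf

    vocab_size = 33278  # roughly the WikiText-2 vocabulary size (assumption)
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, 128),
        tf.keras.layers.LSTM(256, return_sequences=True),
        tf.keras.layers.Dense(vocab_size),  # logits over the next word
    ])

    # For a batch of token ids `tokens` of shape [batch, seq_len],
    # the inputs are tokens[:, :-1] and the targets are tokens[:, 1:],
    # so the label at each step is simply the following word.
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )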

In light of this step change, it is very likely that in a year's time NLP practitioners will download pretrained language models rather than pretrained word embeddings for use in their own models, much as pretrained ImageNet models are the starting point for most CV projects today.
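An early version of that workflow already exists: a pretrained contextual encoder can be downloaded and dropped into a downstream model. A sketch using the ELMo module from TensorFlow Hub (TF1-era hub.Module API; treat the details as illustrative rather than canonical):

    import tensorflow as tf
    import tensorflow_hub as hub

    # Download a pretrained language-model-based encoder instead of
    # static word vectors.
    elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)
    embeddings = elmo(
        ["the cat sat on the mat"],
        signature="default",
        as_dict=True,
    )["elmo"]  # contextual vectors, shape [batch, max_tokens, 1024]

    with tf.Session() as sess:
        sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
        print(sess.run(embeddings).shape)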
nlp  deeplearning 
2 days ago by mike
