
papers

Visual Discovery at Pinterest
Over the past three years Pinterest has experimented with several visual search and recommendation services, including Related Pins (2014), Similar Looks (2015), Flashlight (2016) and Lens (2017). This paper presents an overview of our visual discovery engine powering these services, and shares the rationales behind our technical and product decisions such as the use of object detection and interactive user interfaces. We conclude that this visual discovery engine significantly improves engagement in both search and recommendation tasks.
papers  ui  ux  object-recognition  neuralnets  recommender-systems  recommendations  pinterest  discovery 
yesterday by tarakc02
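
As a rough illustration of the retrieval pattern underlying engines like this one (embed the query image, then rank catalogue items by similarity), here is a minimal nearest-neighbour sketch. It is not Pinterest's pipeline; the embeddings are assumed to come from some upstream model (e.g. a CNN or an object-detection crop), and the function names are hypothetical.

```python
# Minimal nearest-neighbour retrieval over precomputed image embeddings.
# Illustrates the generic "embed, then search" pattern, not Pinterest's engine.
import numpy as np

def build_index(embeddings: np.ndarray) -> np.ndarray:
    # L2-normalise so that a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.clip(norms, 1e-12, None)

def related_pins(index: np.ndarray, query: np.ndarray, k: int = 10) -> np.ndarray:
    q = query / max(np.linalg.norm(query), 1e-12)
    scores = index @ q                      # cosine similarity to every catalogue item
    return np.argsort(-scores)[:k]          # indices of the k most similar items

# Usage: 10k catalogue images with 128-d embeddings, one query embedding.
rng = np.random.default_rng(0)
catalogue = build_index(rng.normal(size=(10_000, 128)))
print(related_pins(catalogue, rng.normal(size=128), k=5))
```

Production systems replace the brute-force dot product with an approximate nearest-neighbour index, but the interface is the same.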
How to read and understand a scientific paper: a guide for non-scientists
yeah, tip #1: read the intro, not the abstract. Do this and it becomes really obvious after you've read a few papers whose abstracts still make no sense to you.
papers  research  science  education  reference 
2 days ago by teffalump
[1706.04601] Provable benefits of representation learning
There is general consensus that learning representations is useful for a variety of reasons, e.g. efficient use of labeled data (semi-supervised learning), transfer learning and understanding hidden structure of data. Popular techniques for representation learning include clustering, manifold learning, kernel-learning, autoencoders, Boltzmann machines, etc.
To study the relative merits of these techniques, it's essential to formalize the definition and goals of representation learning, so that they all become instances of the same definition. This paper introduces such a formal framework that also formalizes the utility of learning the representation. It is related to previous Bayesian notions, but with some new twists. We show the usefulness of our framework by exhibiting simple and natural settings -- linear mixture models and loglinear models -- where the power of representation learning can be formally shown. In these examples, representation learning can be performed provably and efficiently under plausible assumptions (despite being NP-hard), and furthermore: (i) it greatly reduces the need for labeled data (semi-supervised learning), (ii) it allows solving classification tasks when simpler approaches like nearest neighbors require too much data, and (iii) it is more powerful than manifold learning methods.
papers  to-read  representation-learning 
2 days ago by mraginsky
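
A toy sketch of claim (i), that a learned representation reduces the need for labels: fit k-means (one of the representation learners listed in the abstract) on unlabeled data, then train a linear classifier on a handful of labels in the learned feature space. This is only in the spirit of the paper's semi-supervised argument, not its formal framework; the data are synthetic and deliberately easy, so the point is the two-stage structure rather than the accuracy numbers.

```python
# Semi-supervised toy example: representation = distances to k-means centroids
# fit on unlabeled data; classifier = logistic regression on 2 labels per class.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(n_samples=2000, centers=4, n_features=20, random_state=0)
labeled = np.concatenate([np.where(y == c)[0][:2] for c in range(4)])  # 8 labels total

# "Representation learning" step: clustering uses all points, no labels.
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
Z = km.transform(X)                      # distance-to-centroid features

raw = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
rep = LogisticRegression(max_iter=1000).fit(Z[labeled], y[labeled])
print("raw-feature accuracy:      ", raw.score(X, y))
print("representation accuracy:   ", rep.score(Z, y))
```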
[1707.05199] Online codes for analog signals
We revisit a classical scenario in communication theory: a source is generating a waveform which we sample at regular intervals; we wish to transform the signal in such a way as to minimize distortion in its reconstruction, despite noise. The transformation must be online (also called causal), in order to enable real-time signaling. The noise model we consider is adversarial $\ell_1$-bounded; this is the "atomic norm" convex relaxation of the standard adversary model in discrete-alphabet communications, namely sparsity (low Hamming weight). We require that our encoding not increase the power of the original signal.
In the "block coding" setting such encoding is possible due to the existence of large almost-Euclidean sections in $\ell_1$ spaces (established in the work of Dvoretzky, Milman, Ka\v{s}in, and Figiel, Lindenstrauss and Milman).
Our main result is that an analogous result is achievable even online. Equivalently, we show a "lower triangular" version of $\ell_1$ Dvoretzky theorems. In terms of communication, the result has the following form: If the signal is a stream of reals $x_1,\ldots$, one per unit time, which we encode causally into $\rho$ (a constant) reals per unit time (forming altogether an output stream $\mathcal{E}(x)$), and if the adversarial noise added to this encoded stream up to time $s$ is a vector $\vec{y}$, then at time $s$ the decoder's reconstruction of the input prefix $x_{[s]}$ is accurate in a time-weighted $\ell_2$ norm, to within $s^{-1/2+\delta}$ (any $\delta>0$) times the adversary's noise as measured in a time-weighted $\ell_1$ norm. The time-weighted decoding norm forces increasingly accurate reconstruction of the distant past, while the time-weighted noise norm permits only vanishing effect from noise in the distant past.
Encoding is linear, and decoding is performed by an LP analogous to those used in compressed sensing.
papers  to-read  random-matrices  communication  information-theory  analog-information-theory 
2 days ago by mraginsky
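
For intuition about the decoding step, here is a sketch of the simpler block-coding (non-online) version: a random linear encoding, sparse adversarial corruption, and reconstruction by minimising the $\ell_1$ residual, written as an LP in the compressed-sensing style the abstract alludes to. The matrix sizes and noise model are illustrative choices, not the paper's construction.

```python
# Block-coding sketch: linear encoding, sparse corruption, LP decoding
# via min_xhat ||E xhat - r||_1 (rewritten with slack variables t >= |E xhat - r|).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n, m = 20, 60                                  # message length, codeword length
E = rng.normal(size=(m, n)) / np.sqrt(m)       # random linear encoder
x = rng.normal(size=n)

# Adversarial-style noise: large corruption on a few coordinates.
noise = np.zeros(m)
noise[rng.choice(m, size=5, replace=False)] = rng.normal(scale=5.0, size=5)
r = E @ x + noise

# LP variables are [xhat (n), t (m)]; minimise sum(t) s.t. |E xhat - r| <= t.
c = np.concatenate([np.zeros(n), np.ones(m)])
A_ub = np.block([[E, -np.eye(m)], [-E, -np.eye(m)]])
b_ub = np.concatenate([r, -r])
bounds = [(None, None)] * n + [(0, None)] * m
sol = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
x_hat = sol.x[:n]
print("reconstruction error:", np.linalg.norm(x_hat - x))
```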
[1703.03400] Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
"We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples. In our approach, the parameters of the model are explicitly trained such that a small number of gradient steps with a small amount of training data from a new task will produce good generalization performance on that task. In effect, our method trains the model to be easy to fine-tune. We demonstrate that this approach leads to state-of-the-art performance on two few-shot image classification benchmarks, produces good results on few-shot regression, and accelerates fine-tuning for policy gradient reinforcement learning with neural network policies."
papers  meta-learning 
2 days ago by arsyed
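
The method itself fits in a few lines; below is a minimal second-order MAML sketch on the toy sine-regression setting often used to illustrate it. This is a reading aid, not the authors' released code, and all hyperparameters are arbitrary.

```python
# Minimal MAML: inner loop adapts params with one gradient step per task,
# outer loop backpropagates the post-adaptation query loss into the initial params.
import math
import torch

def net_forward(x, params):
    w1, b1, w2, b2 = params
    return torch.tanh(x @ w1 + b1) @ w2 + b2

def init_params():
    torch.manual_seed(0)
    return [(torch.randn(1, 40) * 0.1).requires_grad_(),
            torch.zeros(40, requires_grad=True),
            (torch.randn(40, 1) * 0.1).requires_grad_(),
            torch.zeros(1, requires_grad=True)]

def sample_task():
    amp = torch.rand(1).item() * 4.0 + 0.1
    phase = torch.rand(1).item() * math.pi
    def draw(n):
        x = torch.rand(n, 1) * 10.0 - 5.0
        return x, amp * torch.sin(x + phase)
    return draw

params, inner_lr = init_params(), 0.01
meta_opt = torch.optim.Adam(params, lr=1e-3)
for _ in range(1000):
    meta_opt.zero_grad()
    meta_loss = 0.0
    for _ in range(4):                                   # tasks per meta-batch
        draw = sample_task()
        xs, ys = draw(10)                                # support set
        xq, yq = draw(10)                                # query set from the same task
        loss = ((net_forward(xs, params) - ys) ** 2).mean()
        grads = torch.autograd.grad(loss, params, create_graph=True)
        adapted = [p - inner_lr * g for p, g in zip(params, grads)]   # one inner step
        meta_loss = meta_loss + ((net_forward(xq, adapted) - yq) ** 2).mean()
    (meta_loss / 4).backward()                           # second-order meta-gradient
    meta_opt.step()
```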
[1701.04079] Agent-Agnostic Human-in-the-Loop Reinforcement Learning
"Providing Reinforcement Learning agents with expert advice can dramatically improve various aspects of learning. Prior work has developed teaching protocols that enable agents to learn efficiently in complex environments; many of these methods tailor the teacher's guidance to agents with a particular representation or underlying learning scheme, offering effective but specialized teaching procedures. In this work, we explore protocol programs, an agent-agnostic schema for Human-in-the-Loop Reinforcement Learning. Our goal is to incorporate the beneficial properties of a human teacher into Reinforcement Learning without making strong assumptions about the inner workings of the agent. We show how to represent existing approaches such as action pruning, reward shaping, and training in simulation as special cases of our schema and conduct preliminary experiments on simple domains."
papers  reinforcement-learning  human-in-the-loop 
2 days ago by arsyed
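
The "agent-agnostic" idea can be pictured as a layer sitting between any agent and the environment. The sketch below shows two of the listed special cases, action pruning and potential-based reward shaping, as a wrapper that assumes a Gym-style reset()/step() interface returning the classic 4-tuple; the pruning predicate and potential function are hypothetical stand-ins for human guidance.

```python
# Human guidance as a wrapper around the environment, leaving the agent untouched.
class ProtocolWrapper:
    def __init__(self, env, allowed_actions, potential, gamma=0.99):
        self.env = env
        self.allowed_actions = allowed_actions   # state -> iterable of permitted actions
        self.potential = potential               # state -> float, used for shaping
        self.gamma = gamma
        self._state = None

    def reset(self):
        self._state = self.env.reset()
        return self._state

    def step(self, action):
        # Action pruning: replace a forbidden action with a permitted one
        # (one simple policy; a real protocol could reject or re-query instead).
        allowed = list(self.allowed_actions(self._state))
        if action not in allowed:
            action = allowed[0]
        next_state, reward, done, info = self.env.step(action)
        # Potential-based reward shaping: r' = r + gamma*phi(s') - phi(s).
        reward = reward + self.gamma * self.potential(next_state) - self.potential(self._state)
        self._state = next_state
        return next_state, reward, done, info
```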
[1706.00764] Hyperparameter Optimization: A Spectral Approach
"We give a simple, fast algorithm for hyperparameter optimization inspired by techniques from the analysis of Boolean functions. We focus on the high-dimensional regime where the canonical example is training a neural network with a large number of hyperparameters. The algorithm - an iterative application of compressed sensing techniques for orthogonal polynomials - requires only uniform sampling of the hyperparameters and is thus easily parallelizable. Experiments for training deep nets on Cifar-10 show that compared to state-of-the-art tools (e.g., Hyperband and Spearmint), our algorithm finds significantly improved solutions, in some cases matching what is attainable by hand-tuning. In terms of overall running time (i.e., time required to sample various settings of hyperparameters plus additional computation time), we are at least an order of magnitude faster than Hyperband and even more so compared to Bayesian Optimization. We also outperform Random Search 5X. Additionally, our method comes with provable guarantees and yields the first quasi-polynomial time algorithm for learning decision trees under the uniform distribution with polynomial sample complexity, the first improvement in over two decades."
papers  optimization  hyperparameter 
2 days ago by arsyed
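
A very rough sketch of the idea as I read it: sample hyperparameter settings uniformly, fit a sparse low-degree polynomial surrogate to the observed objective, and minimise the surrogate. Below, Lasso over degree-at-most-2 parity features stands in for the paper's orthogonal-polynomial compressed-sensing step, and the objective is a synthetic placeholder rather than a real training run.

```python
# Sparse low-degree surrogate for a boolean hyperparameter space {-1,+1}^n.
import itertools
import numpy as np
from sklearn.linear_model import Lasso

n_hp, n_samples = 10, 120
rng = np.random.default_rng(0)

def objective(z):            # hypothetical "validation loss" of a +/-1 setting
    return 1.0 - 0.5 * z[0] + 0.3 * z[1] * z[2] + 0.05 * rng.normal()

def features(Z):             # degree-<=2 parity (Fourier) features over {-1,+1}^n
    cols = [np.ones(len(Z))]
    cols += [Z[:, i] for i in range(n_hp)]
    cols += [Z[:, i] * Z[:, j] for i, j in itertools.combinations(range(n_hp), 2)]
    return np.stack(cols, axis=1)

Z = rng.choice([-1.0, 1.0], size=(n_samples, n_hp))      # uniform sampling, parallelizable
y = np.array([objective(z) for z in Z])
model = Lasso(alpha=0.01).fit(features(Z), y)            # sparse recovery stand-in

candidates = rng.choice([-1.0, 1.0], size=(2000, n_hp))
best = candidates[np.argmin(model.predict(features(candidates)))]
print("suggested setting:", best)
```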
[1707.02476] Adversarial Examples, Uncertainty, and Transfer Testing Robustness in Gaussian Process Hybrid Deep Networks
"Deep neural networks (DNNs) have excellent representative power and are state of the art classifiers on many tasks. However, they often do not capture their own uncertainties well making them less robust in the real world as they overconfidently extrapolate and do not notice domain shift. Gaussian processes (GPs) with RBF kernels on the other hand have better calibrated uncertainties and do not overconfidently extrapolate far from data in their training set. However, GPs have poor representational power and do not perform as well as DNNs on complex domains. In this paper we show that GP hybrid deep networks, GPDNNs, (GPs on top of DNNs and trained end-to-end) inherit the nice properties of both GPs and DNNs and are much more robust to adversarial examples. When extrapolating to adversarial examples and testing in domain shift settings, GPDNNs frequently output high entropy class probabilities corresponding to essentially "don't know". GPDNNs are therefore promising as deep architectures that know when they don't know."
papers  neural-net  gaussian-processes 
5 days ago by arsyed
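
A simplified, two-stage sketch of the architecture: a small neural feature extractor with a GP classifier fit on the frozen features. The paper trains the combination end-to-end (which requires a GP layer inside the network with a marginal-likelihood-style objective); this version only illustrates the stacking and the tendency of the GP head to fall back to uninformative probabilities far from the training data. The toy task and all sizes are arbitrary.

```python
# DNN feature extractor + GP classifier on top (two-stage, not end-to-end).
import torch
import torch.nn as nn
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

torch.manual_seed(0)
X = torch.randn(400, 2)
y = (X[:, 0] * X[:, 1] > 0).long()            # toy two-class problem

feat = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 8))
head = nn.Linear(8, 2)
opt = torch.optim.Adam(list(feat.parameters()) + list(head.parameters()), lr=1e-2)
for _ in range(300):                           # ordinary supervised training of the DNN
    opt.zero_grad()
    loss = nn.functional.cross_entropy(head(feat(X)), y)
    loss.backward()
    opt.step()

with torch.no_grad():
    Z = feat(X).numpy()
gp = GaussianProcessClassifier(kernel=1.0 * RBF()).fit(Z, y.numpy())

far = torch.full((1, 2), 30.0)                 # a point far outside the training data
with torch.no_grad():
    p = gp.predict_proba(feat(far).numpy())
print("class probabilities far from data:", p) # typically drifts toward the uninformative prior
```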
