
nhaliday : deep-learning   248

Ask HN: What's a promising area to work on? | Hacker News
hn  discussion  q-n-a  ideas  impact  trends  the-bones  speedometer  technology  applications  tech  cs  programming  list  top-n  recommendations  lens  machine-learning  deep-learning  security  privacy  crypto  software  hardware  cloud  biotech  CRISPR  bioinformatics  biohacking  blockchain  cryptocurrency  crypto-anarchy  healthcare  graphics  SIGGRAPH  vr  automation  universalism-particularism  expert-experience  reddit  social  arbitrage  supply-demand  ubiquity  cost-benefit  compensation  chart  career  planning  strategy  long-term  advice  sub-super  commentary  rhetoric  org:com  techtariat  human-capital  prioritizing  tech-infrastructure  working-stiff  data-science 
november 2019 by nhaliday
Why is Google Translate so bad for Latin? A longish answer. : latin
hmm:
> All it does is correlate sequences of up to five consecutive words in texts that have been manually translated into two or more languages.
That sort of system ought to be perfect for a dead language, though. Dump all the Cicero, Livy, Lucretius, Vergil, and Oxford Latin Course into a database and we're good.

We're not exactly inundated with brand new Latin to translate.
--
> Dump all the Cicero, Livy, Lucretius, Vergil, and Oxford Latin Course into a database and we're good.
What makes you think that the Google folks haven't done so and used that to create the language models they use?
> That sort of system ought to be perfect for a dead language, though.
Perhaps. But it will be bad at translating novel English sentences to Latin.
foreign-lang  reddit  social  discussion  language  the-classics  literature  dataset  measurement  roots  traces  syntax  anglo  nlp  stackex  links  q-n-a  linguistics  lexical  deep-learning  sequential  hmm  project  arrows  generalization  state-of-art  apollonian-dionysian  machine-learning  google 
june 2019 by nhaliday
classification - ImageNet: what is top-1 and top-5 error rate? - Cross Validated
Now, in the case of top-1 score, you check if the top class (the one having the highest probability) is the same as the target label.

In the case of top-5 score, you check if the target label is one of your top 5 predictions (the 5 ones with the highest probabilities).
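The two checks above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the linked answer: the function names and the toy probabilities are made up, and the error rate is taken to be the fraction of examples that fail the check.

```python
def top_k_correct(probs, target, k):
    """Return True if `target` is among the k highest-probability classes."""
    # Rank class indices by predicted probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return target in ranked[:k]


def top_k_error(batch_probs, targets, k):
    """Fraction of examples whose target is NOT in the top-k predictions."""
    misses = sum(
        not top_k_correct(p, t, k) for p, t in zip(batch_probs, targets)
    )
    return misses / len(targets)


# Hypothetical toy batch: 3 examples, 6 classes each.
batch_probs = [
    [0.50, 0.20, 0.10, 0.10, 0.05, 0.05],  # target 0 is ranked 1st
    [0.30, 0.40, 0.10, 0.10, 0.05, 0.05],  # target 0 is ranked 2nd
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],  # target 5 is ranked 6th
]
targets = [0, 0, 5]

top1 = top_k_error(batch_probs, targets, k=1)  # 2 of 3 examples miss
top5 = top_k_error(batch_probs, targets, k=5)  # 1 of 3 examples misses
```

Top-5 error can never exceed top-1 error on the same predictions, since any top-1 hit is also a top-5 hit.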
nibble  q-n-a  overflow  machine-learning  deep-learning  metrics  comparison  ranking  top-n  classification  computer-vision  benchmarks  dataset  accuracy  error  jargon 
june 2019 by nhaliday
[1803.00085] Chinese Text in the Wild
We introduce Chinese Text in the Wild, a very large dataset of Chinese text in street view images.

...

We give baseline results using several state-of-the-art networks, including AlexNet, OverFeat, Google Inception and ResNet for character recognition, and YOLOv2 for character detection in images. Overall Google Inception has the best performance on recognition with 80.5% top-1 accuracy, while YOLOv2 achieves an mAP of 71.0% on detection. Dataset, source code and trained models will all be publicly available on the website.
nibble  pdf  papers  preprint  machine-learning  deep-learning  deepgoog  state-of-art  china  asia  writing  language  dataset  error  accuracy  computer-vision  pic  ocr  org:mat  benchmarks  questions 
may 2019 by nhaliday
Should I go for TensorFlow or PyTorch?
Honestly, most experts that I know love PyTorch and detest TensorFlow: Karpathy and Justin from Stanford, for example. You can see Karpathy's thoughts, and I've asked Justin personally and the answer was sharp: PYTORCH!!! TF has lots of PR, but its API and graph model are horrible and will waste lots of your research time.

--

...

Updated Mar 12
Update after 2019 TF summit:

TL;DR: previously I was in the PyTorch camp, but with TF 2.0 it’s clear that Google is really going to try to reach parity with, or be better than, PyTorch in all aspects where people voiced concerns (ease of use/debugging/dynamic graphs). They seem to be allocating more resources to development than Facebook, so the longer term currently looks promising for Google. Prior to TF 2.0 I thought that the PyTorch team had more momentum. One area where FB/PyTorch is still stronger is openness: Google is a bit more closed and doesn’t seem to release reproducible cutting-edge models such as AlphaGo, whereas FAIR released OpenGo, for instance. Generally you will end up running into models that are only implemented in one framework or the other, so chances are you might end up learning both.
q-n-a  qra  comparison  software  recommendations  cost-benefit  tradeoffs  python  libraries  machine-learning  deep-learning  data-science  sci-comp  tools  google  facebook  tech  competition  best-practices  trends  debugging  expert-experience  ecosystem  theory-practice  pragmatic  wire-guided  static-dynamic  state  academia  frameworks  open-closed 
may 2019 by nhaliday
Information Processing: Moore's Law and AI
Hint to technocratic planners: invest more in physicists, chemists, and materials scientists. The recent explosion in value from technology has been driven by physical science -- software gets way too much credit. From the former we got a factor of a million or more in compute power, data storage, and bandwidth. From the latter, we gained (perhaps) an order of magnitude or two in effectiveness: how much better are current OSes and programming languages than Unix and C, both of which are ~50 years old now?

...

Of relevance to this discussion: a big chunk of AlphaGo's performance improvement over other Go programs is due to raw compute power (link via Jess Riedel). The vertical axis is ELO score. You can see that without multi-GPU compute, AlphaGo has relatively pedestrian strength.
hsu  scitariat  comparison  software  hardware  performance  sv  tech  trends  ai  machine-learning  deep-learning  deepgoog  google  roots  impact  hard-tech  multiplicative  the-world-is-just-atoms  technology  trivia  cocktail  big-picture  hi-order-bits 
may 2019 by nhaliday
A Recipe for Training Neural Networks
acmtariat  org:bleg  nibble  machine-learning  deep-learning  howto  tutorial  guide  nitty-gritty  gotchas  init  list  checklists  expert-experience  abstraction  composition-decomposition  gradient-descent  data-science  error  debugging  benchmarks  programming  engineering  best-practices  dataviz  checking  plots  generalization  regularization  unsupervised  optimization  ensembles  random  methodology  multi  twitter  social  discussion  techtariat  links  org:med  pdf  visualization  python  recommendations  advice  devtools 
april 2019 by nhaliday
Workshop Abstract | Identifying and Understanding Deep Learning Phenomena
ICML 2019 workshop, June 15th 2019, Long Beach, CA

We solicit contributions that view the behavior of deep nets as natural phenomena, to be investigated with methods inspired from the natural sciences like physics, astronomy, and biology.
unit  workshop  acm  machine-learning  science  empirical  nitty-gritty  atoms  deep-learning  model-class  icml  data-science  rigor  replication  examples  ben-recht  physics 
april 2019 by nhaliday
Lateralization of brain function - Wikipedia
Language
Language functions such as grammar, vocabulary and literal meaning are typically lateralized to the left hemisphere, especially in right handed individuals.[3] While language production is left-lateralized in up to 90% of right-handers, it is more bilateral, or even right-lateralized, in approximately 50% of left-handers.[4]

Broca's area and Wernicke's area, two areas associated with the production of speech, are located in the left cerebral hemisphere for about 95% of right-handers, but about 70% of left-handers.[5]:69

Auditory and visual processing
The processing of visual and auditory stimuli, spatial manipulation, facial perception, and artistic ability are represented bilaterally.[4] Numerical estimation, comparison and online calculation depend on bilateral parietal regions[6][7] while exact calculation and fact retrieval are associated with left parietal regions, perhaps due to their ties to linguistic processing.[6][7]

...

Depression is linked with a hyperactive right hemisphere, with evidence of selective involvement in "processing negative emotions, pessimistic thoughts and unconstructive thinking styles", as well as vigilance, arousal and self-reflection, and a relatively hypoactive left hemisphere, "specifically involved in processing pleasurable experiences" and "relatively more involved in decision-making processes".

Chaos and Order; the right and left hemispheres: https://orthosphere.wordpress.com/2018/05/23/chaos-and-order-the-right-and-left-hemispheres/
In The Master and His Emissary, Iain McGilchrist writes that a creature like a bird needs two types of consciousness simultaneously. It needs to be able to focus on something specific, such as pecking at food, while it also needs to keep an eye out for predators which requires a more general awareness of environment.

These are quite different activities. The Left Hemisphere (LH) is adapted for a narrow focus. The Right Hemisphere (RH) for the broad. The brains of human beings have the same division of function.

The LH governs the right side of the body, the RH, the left side. With birds, the left eye (RH) looks for predators, the right eye (LH) focuses on food and specifics. Since danger can take many forms and is unpredictable, the RH has to be very open-minded.

The LH is for narrow focus, the explicit, the familiar, the literal, tools, mechanism/machines and the man-made. The broad focus of the RH is necessarily more vague and intuitive and handles the anomalous, novel, metaphorical, the living and organic. The LH is high resolution but narrow, the RH low resolution but broad.

The LH exhibits unrealistic optimism and self-belief. The RH has a tendency towards depression and is much more realistic about a person’s own abilities. LH has trouble following narratives because it has a poor sense of “wholes.” In art it favors flatness, abstract and conceptual art, black and white rather than color, simple geometric shapes and multiple perspectives all shoved together, e.g., cubism. Particularly RH paintings emphasize vistas with great depth of field and thus space and time,[1] emotion, figurative painting and scenes related to the life world. In music, LH likes simple, repetitive rhythms. The RH favors melody, harmony and complex rhythms.

...

Schizophrenia is a disease of extreme LH emphasis. Since empathy is RH and the ability to notice emotional nuance facially, vocally and bodily expressed, schizophrenics tend to be paranoid and are often convinced that the real people they know have been replaced by robotic imposters. This is at least partly because they lose the ability to intuit what other people are thinking and feeling – hence they seem robotic and suspicious.

Oswald Spengler’s The Decline of the West as well as McGilchrist characterize the West as awash in phenomena associated with an extreme LH emphasis. Spengler argues that Western civilization was originally much more RH (to use McGilchrist’s categories) and that all its most significant artistic (in the broadest sense) achievements were triumphs of RH accentuation.

The RH is where novel experiences and the anomalous are processed and where mathematical, and other, problems are solved. The RH is involved with the natural, the unfamiliar, the unique, emotions, the embodied, music, humor, understanding intonation and emotional nuance of speech, the metaphorical, nuance, and social relations. It has very little speech, but the RH is necessary for processing all the nonlinguistic aspects of speaking, including body language. Understanding what someone means by vocal inflection and facial expressions is an intuitive RH process rather than explicit.

...

RH is very much the center of lived experience; of the life world with all its depth and richness. The RH is “the master” from the title of McGilchrist’s book. The LH ought to be no more than the emissary; the valued servant of the RH. However, in the last few centuries, the LH, which has tyrannical tendencies, has tried to become the master. The LH is where the ego is predominantly located. In split brain patients where the LH and the RH are surgically divided (this is done sometimes in the case of epileptic patients) one hand will sometimes fight with the other. In one man’s case, one hand would reach out to hug his wife while the other pushed her away. One hand reached for one shirt, the other another shirt. Or a patient will be driving a car and one hand will try to turn the steering wheel in the opposite direction. In these cases, the “naughty” hand is usually the left hand (RH), while the patient tends to identify herself with the right hand governed by the LH. The two hemispheres have quite different personalities.

The connection between LH and ego can also be seen in the fact that the LH is competitive, contentious, and agonistic. It wants to win. It is the part of you that hates to lose arguments.

Using the metaphor of Chaos and Order, the RH deals with Chaos – the unknown, the unfamiliar, the implicit, the emotional, the dark, danger, mystery. The LH is connected with Order – the known, the familiar, the rule-driven, the explicit, and light of day. Learning something means to take something unfamiliar and making it familiar. Since the RH deals with the novel, it is the problem-solving part. Once understood, the results are dealt with by the LH. When learning a new piece on the piano, the RH is involved. Once mastered, the result becomes a LH affair. The muscle memory developed by repetition is processed by the LH. If errors are made, the activity returns to the RH to figure out what went wrong; the activity is repeated until the correct muscle memory is developed in which case it becomes part of the familiar LH.

Science is an attempt to find Order. It would not be necessary if people lived in an entirely orderly, explicit, known world. The lived context of science implies Chaos. Theories are reductive and simplifying and help to pick out salient features of a phenomenon. They are always partial truths, though some are more partial than others. The alternative to a certain level of reductionism or partialness would be to simply reproduce the world which of course would be both impossible and unproductive. The test for whether a theory is sufficiently non-partial is whether it is fit for purpose and whether it contributes to human flourishing.

...

Analytic philosophers pride themselves on trying to do away with vagueness. To do so, they tend to jettison context which cannot be brought into fine focus. However, in order to understand things and discern their meaning, it is necessary to have the big picture, the overview, as well as the details. There is no point in having details if the subject does not know what they are details of. Such philosophers also tend to leave themselves out of the picture even when what they are thinking about has reflexive implications. John Locke, for instance, tried to banish the RH from reality. All phenomena having to do with subjective experience he deemed unreal and once remarked about metaphors, a RH phenomenon, that they are “perfect cheats.” Analytic philosophers tend to check the logic of the words on the page and not to think about what those words might say about them. The trick is for them to recognize that they and their theories, which exist in minds, are part of reality too.

The RH test for whether someone actually believes something can be found by examining his actions. If he finds that he must regard his own actions as free, and, in order to get along with other people, must also attribute free will to them and treat them as free agents, then he effectively believes in free will – no matter his LH theoretical commitments.

...

We do not know the origin of life. We do not know how or even if consciousness can emerge from matter. We do not know the nature of 96% of the matter of the universe. Clearly all these things exist. They can provide the subject matter of theories but they continue to exist as theorizing ceases or theories change. Not knowing how something is possible is irrelevant to its actual existence. An inability to explain something is ultimately neither here nor there.

If thought begins and ends with the LH, then thinking has no content – content being provided by experience (RH), and skepticism and nihilism ensue. The LH spins its wheels self-referentially, never referring back to experience. Theory assumes such primacy that it will simply outlaw experiences and data inconsistent with it; a profoundly wrong-headed approach.

...

Gödel’s Theorem proves that not everything true can be proven to be true. This means there is an ineradicable role for faith, hope and intuition in every moderately complex human intellectual endeavor. There is no one set of consistent axioms from which all other truths can be derived.

Alan Turing’s proof of the halting problem proves that there is no effective procedure for finding effective procedures. Without a mechanical decision procedure, (LH), when it comes to …
gnon  reflection  books  summary  review  neuro  neuro-nitgrit  things  thinking  metabuch  order-disorder  apollonian-dionysian  bio  examples  near-far  symmetry  homo-hetero  logic  inference  intuition  problem-solving  analytical-holistic  n-factor  europe  the-great-west-whale  occident  alien-character  detail-architecture  art  theory-practice  philosophy  being-becoming  essence-existence  language  psychology  cog-psych  egalitarianism-hierarchy  direction  reason  learning  novelty  science  anglo  anglosphere  coarse-fine  neurons  truth  contradiction  matching  empirical  volo-avolo  curiosity  uncertainty  theos  axioms  intricacy  computation  analogy  essay  rhetoric  deep-materialism  new-religion  knowledge  expert-experience  confidence  biases  optimism  pessimism  realness  whole-partial-many  theory-of-mind  values  competition  reduction  subjective-objective  communication  telos-atelos  ends-means  turing  fiction  increase-decrease  innovation  creative  thick-thin  spengler  multi  ratty  hanson  complex-systems  structure  concrete  abstraction  network-s 
september 2018 by nhaliday
Moravec's paradox - Wikipedia
Moravec's paradox is the discovery by artificial intelligence and robotics researchers that, contrary to traditional assumptions, high-level reasoning requires very little computation, but low-level sensorimotor skills require enormous computational resources. The principle was articulated by Hans Moravec, Rodney Brooks, Marvin Minsky and others in the 1980s. As Moravec writes, "it is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility".[1]

Similarly, Minsky emphasized that the most difficult human skills to reverse engineer are those that are unconscious. "In general, we're least aware of what our minds do best", he wrote, and added "we're more aware of simple processes that don't work well than of complex ones that work flawlessly".[2]

...

One possible explanation of the paradox, offered by Moravec, is based on evolution. All human skills are implemented biologically, using machinery designed by the process of natural selection. In the course of their evolution, natural selection has tended to preserve design improvements and optimizations. The older a skill is, the more time natural selection has had to improve the design. Abstract thought developed only very recently, and consequently, we should not expect its implementation to be particularly efficient.

As Moravec writes:

Encoded in the large, highly evolved sensory and motor portions of the human brain is a billion years of experience about the nature of the world and how to survive in it. The deliberate process we call reasoning is, I believe, the thinnest veneer of human thought, effective only because it is supported by this much older and much more powerful, though usually unconscious, sensorimotor knowledge. We are all prodigious olympians in perceptual and motor areas, so good that we make the difficult look easy. Abstract thought, though, is a new trick, perhaps less than 100 thousand years old. We have not yet mastered it. It is not all that intrinsically difficult; it just seems so when we do it.[3]

A compact way to express this argument would be:

- We should expect the difficulty of reverse-engineering any human skill to be roughly proportional to the amount of time that skill has been evolving in animals.
- The oldest human skills are largely unconscious and so appear to us to be effortless.
- Therefore, we should expect skills that appear effortless to be difficult to reverse-engineer, but skills that require effort may not necessarily be difficult to engineer at all.
concept  wiki  reference  paradox  ai  intelligence  reason  instinct  neuro  psychology  cog-psych  hardness  logic  deep-learning  time  evopsych  evolution  sapiens  the-self  EEA  embodied  embodied-cognition  abstraction  universalism-particularism  gnosis-logos  robotics 
june 2018 by nhaliday
Is the human brain analog or digital? - Quora
The brain is neither analog nor digital, but works using a signal processing paradigm that has some properties in common with both.
 
Unlike a digital computer, the brain does not use binary logic or binary addressable memory, and it does not perform binary arithmetic. Information in the brain is represented in terms of statistical approximations and estimations rather than exact values. The brain is also non-deterministic and cannot replay instruction sequences with error-free precision. So in all these ways, the brain is definitely not "digital".
 
At the same time, the signals sent around the brain are "either-or" states that are similar to binary. A neuron fires or it does not. These all-or-nothing pulses are the basic language of the brain. So in this sense, the brain is computing using something like binary signals. Instead of 1s and 0s, or "on" and "off", the brain uses "spike" or "no spike" (referring to the firing of a neuron).
q-n-a  qra  expert-experience  neuro  neuro-nitgrit  analogy  deep-learning  nature  discrete  smoothness  IEEE  bits  coding-theory  communication  trivia  bio  volo-avolo  causation  random  order-disorder  ems  models  methodology  abstraction  nitty-gritty  computation  physics  electromag  scale  coarse-fine 
april 2018 by nhaliday
AI-complete - Wikipedia
In the field of artificial intelligence, the most difficult problems are informally known as AI-complete or AI-hard, implying that the difficulty of these computational problems is equivalent to that of solving the central artificial intelligence problem—making computers as intelligent as people, or strong AI.[1] To call a problem AI-complete reflects an attitude that it would not be solved by a simple specific algorithm.

AI-complete problems are hypothesised to include computer vision, natural language understanding, and dealing with unexpected circumstances while solving any real world problem.[2]

Currently, AI-complete problems cannot be solved with modern computer technology alone, but would also require human computation. This property can be useful, for instance to test for the presence of humans as with CAPTCHAs, and for computer security to circumvent brute-force attacks.[3][4]

...

AI-complete problems are hypothesised to include:

Bongard problems
Computer vision (and subproblems such as object recognition)
Natural language understanding (and subproblems such as text mining, machine translation, and word sense disambiguation[8])
Dealing with unexpected circumstances while solving any real world problem, whether it's navigation or planning or even the kind of reasoning done by expert systems.

...

Current AI systems can solve very simple and/or restricted versions of AI-complete problems, but never in their full generality. When AI researchers attempt to "scale up" their systems to handle more complicated, real world situations, the programs tend to become excessively brittle without commonsense knowledge or a rudimentary understanding of the situation: they fail as unexpected circumstances outside of their original problem context begin to appear. When human beings are dealing with new situations in the world, they are helped immensely by the fact that they know what to expect: they know what all things around them are, why they are there, what they are likely to do and so on. They can recognize unusual situations and adjust accordingly. A machine without strong AI has no other skills to fall back on.[9]
concept  reduction  cs  computation  complexity  wiki  reference  properties  computer-vision  ai  risk  ai-control  machine-learning  deep-learning  language  nlp  order-disorder  tactics  strategy  intelligence  humanity  speculation  crux 
march 2018 by nhaliday
Information Processing: Mathematical Theory of Deep Neural Networks (Princeton workshop)
"Recently, long-past-due theoretical results have begun to emerge. These results, and those that will follow in their wake, will begin to shed light on the properties of large, adaptive, distributed learning architectures, and stand to revolutionize how computer science and neuroscience understand these systems."
hsu  scitariat  commentary  links  research  research-program  workshop  events  princeton  sanjeev-arora  deep-learning  machine-learning  ai  generalization  explanans  off-convex  nibble  frontier  speedometer  state-of-art  big-surf  announcement 
january 2018 by nhaliday
Sequence Modeling with CTC
A visual guide to Connectionist Temporal Classification, an algorithm used to train deep neural networks in speech recognition, handwriting recognition and other sequence problems.
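As a rough illustration of the core CTC idea: a per-frame alignment is mapped to an output sequence by merging consecutive repeats and then dropping a special blank symbol. The sketch below is not taken from the guide. The blank character `-`, the function names, and the toy inputs are assumptions, and greedy per-frame decoding is shown only as the simplest heuristic, not as the CTC training loss itself.

```python
from itertools import groupby

BLANK = "-"  # hypothetical choice of blank symbol for this sketch


def ctc_collapse(alignment, blank=BLANK):
    """Map a frame-level alignment to its output string.

    Step 1: merge runs of identical consecutive symbols.
    Step 2: remove blanks.
    """
    merged = [symbol for symbol, _ in groupby(alignment)]
    return "".join(s for s in merged if s != blank)


def greedy_decode(frame_probs, alphabet, blank=BLANK):
    """Pick the most probable symbol per frame, then collapse.

    A heuristic: the single best alignment is not necessarily the
    most probable output, since many alignments collapse to the same string.
    """
    best = [
        alphabet[max(range(len(p)), key=p.__getitem__)] for p in frame_probs
    ]
    return ctc_collapse(best, blank)


# The blank lets the model emit a genuine double letter ("ll").
collapsed = ctc_collapse("hh-e-ll-lo--")

# Hypothetical 4-frame output over the alphabet (blank, a, b).
alphabet = [BLANK, "a", "b"]
frame_probs = [
    [0.10, 0.80, 0.10],
    [0.10, 0.70, 0.20],
    [0.90, 0.05, 0.05],
    [0.20, 0.10, 0.70],
]
decoded = greedy_decode(frame_probs, alphabet)
```

Without the blank between them, repeated symbols merge into one, which is why `"aa-a"` collapses to `"aa"` while `"aaa"` collapses to `"a"`.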
acmtariat  techtariat  org:bleg  nibble  better-explained  machine-learning  deep-learning  visual-understanding  visualization  analysis  let-me-see  research  sequential  audio  classification  model-class  exposition  language  acm  approximation  comparison  markov  iteration-recursion  concept  atoms  distribution  orders  DP  heuristic  optimization  trees  greedy  matching  gradient-descent  org:popup 
december 2017 by nhaliday
[1709.06560] Deep Reinforcement Learning that Matters
https://twitter.com/WAWilsonIV/status/912505885565452288
I’ve been experimenting w/ various kinds of value function approaches to RL lately, and it’s striking how primitive and bad things seem to be
At first I thought it was just that my code sucks, but then I played with the OpenAI baselines and nope, it’s the children that are wrong.
And now, what comes across my desk but this fantastic paper: https://arxiv.org/abs/1709.06560 How long until the replication crisis hits AI?

https://twitter.com/WAWilsonIV/status/911318326504153088
Seriously I’m not blown away by the PhDs’ records over the last 30 years. I bet you’d get better payoff funding eccentrics and amateurs.
There are essentially zero fundamentally new ideas in AI, the papers are all grotesquely hyperparameter tuned, nobody knows why it works.

Deep Reinforcement Learning Doesn't Work Yet: https://www.alexirpan.com/2018/02/14/rl-hard.html
Once, on Facebook, I made the following claim.

Whenever someone asks me if reinforcement learning can solve their problem, I tell them it can’t. I think this is right at least 70% of the time.
papers  preprint  machine-learning  acm  frontier  speedometer  deep-learning  realness  replication  state-of-art  survey  reinforcement  multi  twitter  social  discussion  techtariat  ai  nibble  org:mat  unaffiliated  ratty  acmtariat  liner-notes  critique  sample-complexity  cost-benefit  todo 
september 2017 by nhaliday
New Theory Cracks Open the Black Box of Deep Learning | Quanta Magazine
A new idea called the “information bottleneck” is helping to explain the puzzling success of today’s artificial-intelligence algorithms — and might also explain how human brains learn.

sounds like he's just talking about autoencoders?
news  org:mag  org:sci  popsci  announcement  research  deep-learning  machine-learning  acm  information-theory  bits  neuro  model-class  big-surf  frontier  nibble  hmm  signal-noise  deepgoog  expert  ideas  wild-ideas  summary  talks  video  israel  roots  physics  interdisciplinary  ai  intelligence  shannon  giants  arrows  preimage  lifts-projections  composition-decomposition  characterization  markov  gradient-descent  papers  liner-notes  experiment  hi-order-bits  generalization  expert-experience  explanans  org:inst  speedometer 
september 2017 by nhaliday
Superintelligence Risk Project Update II
https://www.jefftk.com/p/superintelligence-risk-project-update

https://www.jefftk.com/p/conversation-with-michael-littman
For example, I asked him what he thought of the idea that we could get AGI with current techniques, primarily deep neural nets and reinforcement learning, without learning anything new about how intelligence works or how to implement it ("Prosaic AGI" [1]). He didn't think this was possible, and believes there are deep conceptual issues we still need to get a handle on. He's also less impressed with deep learning than he was before he started working in it: in his experience it's a much more brittle technology than he had been expecting. Specifically, when trying to replicate results, he's often found that they depend on a bunch of parameters being in just the right range, and without that the systems don't perform nearly as well.

The bottom line, to him, was that since we are still many breakthroughs away from getting to AGI, we can't productively work on reducing superintelligence risk now.

He told me that he worries that the AI risk community is not solving real problems: they're making deductions and inferences that are self-consistent but not being tested or verified in the world. Since we can't tell if that's progress, it probably isn't. I asked if he was referring to MIRI's work here, and he said their work was an example of the kind of approach he's skeptical about, though he wasn't trying to single them out. [2]

https://www.jefftk.com/p/conversation-with-an-ai-researcher
Earlier this week I had a conversation with an AI researcher [1] at one of the main industry labs as part of my project of assessing superintelligence risk. Here's what I got from them:

They see progress in ML as almost entirely constrained by hardware and data, to the point that if today's hardware and data had existed in the mid 1950s researchers would have gotten to approximately our current state within ten to twenty years. They gave the example of backprop: we saw how to train multi-layer neural nets decades before we had the computing power to actually train these nets to do useful things.

Similarly, people talk about AlphaGo as a big jump, where Go went from being "ten years away" to "done" within a couple years, but they said it wasn't like that. If Go work had stayed in academia, with academia-level budgets and resources, it probably would have taken nearly that long. What changed was a company seeing promising results, realizing what could be done, and putting way more engineers and hardware on the project than anyone had previously done. AlphaGo couldn't have happened earlier because the hardware wasn't there yet, and was only able to be brought forward by massive application of resources.

https://www.jefftk.com/p/superintelligence-risk-project-conclusion
Summary: I'm not convinced that AI risk should be highly prioritized, but I'm also not convinced that it shouldn't. Highly qualified researchers in a position to have a good sense of the field have massively different views on core questions like how capable ML systems are now, how capable they will be soon, and how we can influence their development. I do think these questions are possible to get a better handle on, but I think this would require much deeper ML knowledge than I have.
ratty  core-rats  ai  risk  ai-control  prediction  expert  machine-learning  deep-learning  speedometer  links  research  research-program  frontier  multi  interview  deepgoog  games  hardware  performance  roots  impetus  chart  big-picture  state-of-art  reinforcement  futurism  🤖  🖥  expert-experience  singularity  miri-cfar  empirical  evidence-based  speculation  volo-avolo  clever-rats  acmtariat  robust  ideas  crux  atoms  detail-architecture  software  gradient-descent 
july 2017 by nhaliday
Unsupervised learning, one notion or many? – Off the convex path
(Task A) Learning a distribution from samples. (Examples: Gaussian mixtures, topic models, variational autoencoders, etc.)

(Task B) Understanding latent structure in the data. This is not the same as Task A; for example, principal component analysis, clustering, manifold learning, etc. identify latent structure but don’t learn a distribution per se.

(Task C) Feature Learning. Learn a mapping from datapoint → feature vector such that classification tasks are easier to carry out on feature vectors rather than datapoints. For example, unsupervised feature learning could help lower the amount of labeled samples needed for learning a classifier, or be useful for domain adaptation.

Task B is often a subcase of Task C, as the intended users of “structure found in data” are humans (scientists) who pore over the representation of data to gain some intuition about its properties, and these “properties” can often be phrased as a classification task.

This post explains the relationship between Tasks A and C, and why they get mixed up in students’ minds. We hope there is also some food for thought here for experts, namely, our discussion about the fragility of the usual “perplexity” definition of unsupervised learning. It explains why Task A doesn’t in practice lead to a good enough solution for Task C. For example, it has been believed for many years that for deep learning, unsupervised pretraining should help supervised training, but this has been hard to show in practice.
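A minimal sketch of the Task A vs. Task B distinction, not taken from the post: the toy data, the function names, and the choice of a single Gaussian and a two-centre k-means are all assumptions made for illustration. The density fit gives a number for every point but misses the two groups; the clustering finds the groups but gives no density at all.

```python
import math

# Hypothetical toy data: two tight groups on the real line.
DATA = [0.0, 0.5, 1.0, 9.0, 9.5, 10.0]


def fit_gaussian(xs):
    """Task A (sketch): maximum-likelihood fit of a single Gaussian."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, var


def log_density(x, mean, var):
    """Log-density of the fitted Gaussian at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def two_means(xs, iters=20):
    """Task B (sketch): 1-D k-means with two centres.

    Returns only the two centres; no probability model is produced.
    """
    lo, hi = min(xs), max(xs)  # deterministic initialisation
    for _ in range(iters):
        left = [x for x in xs if abs(x - lo) <= abs(x - hi)]
        right = [x for x in xs if abs(x - lo) > abs(x - hi)]
        if not right:  # all points identical; nothing to split
            break
        lo, hi = sum(left) / len(left), sum(right) / len(right)
    return lo, hi


mean, var = fit_gaussian(DATA)
centres = two_means(DATA)
```

On this toy data the single Gaussian assigns its highest density to the gap between the groups, while the clustering recovers one centre per group. A richer density model (for instance a mixture) would narrow that gap, so the contrast is about what each task outputs, not about which method is better.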
acmtariat  org:bleg  nibble  machine-learning  acm  thinking  clarity  unsupervised  conceptual-vocab  concept  explanation  features  bayesian  off-convex  deep-learning  latent-variables  generative  intricacy  distribution  sampling  grokkability-clarity  org:popup 
june 2017 by nhaliday
Born Red - The New Yorker
Obama-Xi State Visit: How China's President Defines the Chinese Dream: https://www.theatlantic.com/international/archive/2015/09/xi-jinping-china-book-chinese-dream/406387/
interesting glimpse into Chinese cultural overtones

What’s new on Xi Jinping’s bookshelf this year: https://medium.com/shanghaiist/whats-new-on-xi-jinping-s-bookshelf-this-year-8d913dcc261f

China Moves to Let Xi Stay in Power by Abolishing Term Limit: https://www.nytimes.com/2018/02/25/world/asia/china-xi-jinping.html
news  org:mag  profile  china  asia  politics  government  foreign-policy  authoritarianism  sinosphere  history  mostly-modern  corruption  expansionism  usa  ideology  orient  statesmen  civil-liberty  democracy  obama  leadership  egalitarianism-hierarchy  kinship  communism  cold-war  elite  power  class  class-warfare  organizing  markets  capitalism  noble-lie  anomie  morality  multi  culture  polisci  civilization  expression-survival  individualism-collectivism  diversity  books  review  summary  facebook  barons  current-events  nationalism-globalism  ethnocentrism  identity-politics  great-powers  n-factor  alien-character  org:med  trends  list  speedometer  technology  ai  deep-learning  polanyi-marx  europe  the-great-west-whale  literature  big-peeps  the-classics  military  defense  letters  economics  broad-econ  environment  technocracy  org:rec 
april 2017 by nhaliday
Peter Norvig, the meaning of polynomials, debugging as psychotherapy | Quomodocumque
He briefly showed a demo where, given values of a polynomial, a machine can put together a few lines of code that successfully computes the polynomial. But the code looks weird to a human eye. To compute some quadratic, it nests for-loops and adds things up in a funny way that ends up giving the right output. So has it really “learned” the polynomial? I think in computer science, you typically feel you’ve learned a function if you can accurately predict its value on a given input. For an algebraist like me, a function determines but isn’t determined by the values it takes; to me, there’s something about that quadratic polynomial the machine has failed to grasp. I don’t think there’s a right or wrong answer here, just a cultural difference to be aware of. Relevant: Norvig’s description of “the two cultures” at the end of this long post on natural language processing (which is interesting all the way through!)
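A hypothetical illustration of the kind of code described above. This is not the program from Norvig's demo; the quadratic n^2 + n and both function names are assumptions chosen to make the point concrete.

```python
def quadratic_closed_form(n: int) -> int:
    """The quadratic as an algebraist would write it: n^2 + n."""
    return n * n + n


def quadratic_by_loops(n: int) -> int:
    """Matches n^2 + n for n >= 0 using only nested loops and additions.

    The outer iteration i contributes 2 * (i + 1), and
    2 * (1 + 2 + ... + n) = n * (n + 1).
    """
    total = 0
    for i in range(n):
        for _ in range(i + 1):
            total += 2
    return total


# The two agree on every non-negative input that was checked.
agree = all(quadratic_by_loops(n) == quadratic_closed_form(n) for n in range(50))
```

The two functions agree wherever the loop version was fitted (non-negative integers) but part ways outside that range: for a negative input the loops never run and the result is 0, while the closed form still evaluates the polynomial. That is one concrete sense in which matching a function's values on given inputs is not the same as having the polynomial.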
mathtariat  org:bleg  nibble  tech  ai  talks  summary  philosophy  lens  comparison  math  cs  tcs  polynomials  nlp  debugging  psychology  cog-psych  complex-systems  deep-learning  analogy  legibility  interpretability  composition-decomposition  coupling-cohesion  apollonian-dionysian  heavyweights 
march 2017 by nhaliday