recentpopularlog in

nhaliday : machine-learning   495

« earlier  
Michael Akilian: Worker-in-the-loop Retrospective
Over the last ten years, many companies have created human-in-the-loop services that combine a mix of humans and algorithms. Now that some time has passed, we can tease out some patterns from their collective successes and failures. As someone who started a company in this space, my hope is that this retrospective can help prospective founders, investors, or companies navigating this space save time and fund more impactful projects.

A service is considered human-in-the-loop if it organizes its workflows with the intent to introduce models or heuristics that learn from the work of the humans executing the workflows. In this post, I will make reference to two common forms of human-in-the-loop:

User-in-the-loop (UITL): The end-user is interacting with suggestions from a software heuristic/ML system.
Worker-in-the-loop (WITL): A worker is paid to monitor suggestions from a software heuristic/ML system developed by the same company that pays the worker, but for the ultimate benefit of an end-user.
techtariat  reflection  business  tech  postmortem  automation  startups  hard-tech  ai  machine-learning  human-ml  cost-benefit  analysis  thinking  business-models  things  dimensionality  exploratory  markets  labor  economics  tech-infrastructure  gig-econ 
12 weeks ago by nhaliday
Ask HN: Getting into NLP in 2018? | Hacker News
syllogism (spaCy author):
I think it's probably a bad strategy to try to be the "NLP guy" to potential employers. You'd do much better off being a software engineer on a project with people with ML or NLP expertise.

NLP projects fail a lot. If you line up a job as a company's first NLP person, you'll probably be setting yourself up for failure. You'll get handed an idea that can't work, you won't know enough about how to push back to change it into something that might, etc. After the project fails, you might get a chance to fail at a second one, but maybe not a third. This isn't a great way to move into any new field.

I think a cunning plan would be to angle to be the person who "productionises" models.
...
.--
...

Basically, don't just work on having more powerful solutions. Make sure you've tried hard to have easier problems as well --- that part tends to be higher leverage.

https://news.ycombinator.com/item?id=14008752
https://news.ycombinator.com/item?id=12916498
https://algorithmia.com/blog/introduction-natural-language-processing-nlp
hn  q-n-a  discussion  tech  programming  machine-learning  nlp  strategy  career  planning  human-capital  init  advice  books  recommendations  course  unit  links  automation  project  examples  applications  multi  mooc  lectures  video  data-science  org:com  roadmap  summary  error  applicability-prereqs  ends-means  telos-atelos  cost-benefit 
november 2019 by nhaliday
Ask HN: What's a promising area to work on? | Hacker News
hn  discussion  q-n-a  ideas  impact  trends  the-bones  speedometer  technology  applications  tech  cs  programming  list  top-n  recommendations  lens  machine-learning  deep-learning  security  privacy  crypto  software  hardware  cloud  biotech  CRISPR  bioinformatics  biohacking  blockchain  cryptocurrency  crypto-anarchy  healthcare  graphics  SIGGRAPH  vr  automation  universalism-particularism  expert-experience  reddit  social  arbitrage  supply-demand  ubiquity  cost-benefit  compensation  chart  career  planning  strategy  long-term  advice  sub-super  commentary  rhetoric  org:com  techtariat  human-capital  prioritizing  tech-infrastructure  working-stiff  data-science 
november 2019 by nhaliday
Three best practices for building successful data pipelines - O'Reilly Media
Drawn from their experiences and my own, I’ve identified three key areas that are often overlooked in data pipelines, and those are making your analysis:
1. Reproducible
2. Consistent
3. Productionizable

...

Science that cannot be reproduced by an external third party is just not science — and this does apply to data science. One of the benefits of working in data science is the ability to apply the existing tools from software engineering. These tools let you isolate all the dependencies of your analyses and make them reproducible.

Dependencies fall into three categories:
1. Analysis code ...
2. Data sources ...
3. Algorithmic randomness ...

...

Establishing consistency in data
...

There are generally two ways of establishing the consistency of data sources. The first is by checking-in all code and data into a single revision control repository. The second method is to reserve source control for code and build a pipeline that explicitly depends on external data being in a stable, consistent format and location.

Checking data into version control is generally considered verboten for production software engineers, but it has a place in data analysis. For one thing, it makes your analysis very portable by isolating all dependencies into source control. Here are some conditions under which it makes sense to have both code and data in source control:
Small data sets ...
Regular analytics ...
Fixed source ...

Productionizability: Developing a common ETL
...

1. Common data format ...
2. Isolating library dependencies ...

https://blog.koresoftware.com/blog/etl-principles
Rigorously enforce the idempotency constraint
For efficiency, seek to load data incrementally
Always ensure that you can efficiently process historic data
Partition ingested data at the destination
Rest data between tasks
Pool resources for efficiency
Store all metadata together in one place
Manage login details in one place
Specify configuration details once
Parameterize sub flows and dynamically run tasks where possible
Execute conditionally
Develop your own workflow framework and reuse workflow components

more focused on details of specific technologies:
https://medium.com/@rchang/a-beginners-guide-to-data-engineering-part-i-4227c5c457d7

https://www.cloudera.com/documentation/director/cloud/topics/cloud_de_best_practices.html
techtariat  org:com  best-practices  engineering  code-organizing  machine-learning  data-science  yak-shaving  nitty-gritty  workflow  config  vcs  replication  homo-hetero  multi  org:med  design  system-design  links  shipping  minimalism  volo-avolo  causation  random  invariance  structure  arrows  protocol-metadata  interface-compatibility  project-management 
august 2019 by nhaliday
Information Processing: Beijing 2019 Notes
Trump, the trade war, and US-China relations came up frequently in discussion. Chinese opinion tends to focus on the long term. Our driver for a day trip to the Great Wall was an older man from the countryside, who has lived only 3 years in Beijing. I was surprised to hear him expressing a very balanced opinion about the situation. He understood Trump's position remarkably well -- China has done very well trading with the US, and owes much of its technological and scientific development to the West. A recalibration is in order, and it is natural for Trump to negotiate in the interest of US workers.

China's economy is less and less export-dependent, and domestic drivers of growth seem easy to identify. For example, there is still a lot of low-hanging fruit in the form of "catch up growth" -- but now this means not just catching up with the outside developed world, but Tier 2 and Tier 3 cities catching up with Tier 1 cities like Beijing, Shanghai, Shenzhen, etc.

China watchers have noted the rapidly increasing government and private sector debt necessary to drive growth here. Perhaps this portends a future crisis. However, I didn't get any sense of impending doom for the Chinese economy. To be fair there was very little inkling of what would happen to the US economy in 2007-8. Some of the people I met with are highly placed with special knowledge -- they are among the most likely to be aware of problems. Overall I had the impression of normalcy and quiet confidence, but perhaps this would have been different in an export/manufacturing hub like Shenzhen. [ Update: Today after posting this I did hear something about economic concerns... So situation is unclear. ]

Innovation is everywhere here. Perhaps the most obvious is the high level of convenience from the use of e-payment and delivery services. You can pay for everything using your mobile (increasingly, using just your face!), and you can have food and other items (think Amazon on steroids) delivered quickly to your apartment. Even museum admissions can be handled via QR code.

A highly placed technologist told me that in fields like AI or computer science, Chinese researchers and engineers have access to in-depth local discussions of important arXiv papers -- think StackOverflow in Mandarin. Since most researchers here can read English, they have access both to Western advances, and a Chinese language reservoir of knowledge and analysis. He anticipates that eventually the pace and depth of engineering implementation here will be unequaled.

IVF and genetic testing are huge businesses in China. Perhaps I'll comment more on this in the future. New technologies, in genomics as in other areas, tend to be received more positively here than in the US and Europe.

...

Note Added: In the comments AG points to a Quora post by a user called Janus Dongye Qimeng, an AI researcher in Cambridge UK, who seems to be a real China expert. I found these posts to be very interesting.

Infrastructure development in poor regions of China

Size of Chinese internet social network platforms

Can the US derail China 2025? (Core technology stacks in and outside China)

Huawei smartphone technology stack and impact of US entity list interdiction (software and hardware!)

Agriculture at Massive Scale

US-China AI competition

More recommenations: Bruno Maçães is one of my favorite modern geopolitical thinkers. A Straussian of sorts (PhD under Harvey Mansfield at Harvard), he was Secretary of State for European Affairs in Portugal, and has thought deeply about the future of Eurasia and of US-China relations. He spent the last year in Beijing and I was eager to meet with him while here. His recent essay Equilibrium Americanum appeared in the Berlin Policy Journal. Podcast interview -- we hope to have him on Manifold soon :-)
hsu  scitariat  china  asia  thucydides  tech  technology  ai  automation  machine-learning  trends  the-bones  links  reflection  qra  q-n-a  foreign-policy  world  usa  trade  nationalism-globalism  great-powers  economics  research  journos-pundits  straussian 
july 2019 by nhaliday
Why is Google Translate so bad for Latin? A longish answer. : latin
hmm:
> All it does its correlate sequences of up to five consecutive words in texts that have been manually translated into two or more languages.
That sort of system ought to be perfect for a dead language, though. Dump all the Cicero, Livy, Lucretius, Vergil, and Oxford Latin Course into a database and we're good.

We're not exactly inundated with brand new Latin to translate.
--
> Dump all the Cicero, Livy, Lucretius, Vergil, and Oxford Latin Course into a database and we're good.
What makes you think that the Google folks haven't done so and used that to create the language models they use?
> That sort of system ought to be perfect for a dead language, though.
Perhaps. But it will be bad at translating novel English sentences to Latin.
foreign-lang  reddit  social  discussion  language  the-classics  literature  dataset  measurement  roots  traces  syntax  anglo  nlp  stackex  links  q-n-a  linguistics  lexical  deep-learning  sequential  hmm  project  arrows  generalization  state-of-art  apollonian-dionysian  machine-learning  google 
june 2019 by nhaliday
classification - ImageNet: what is top-1 and top-5 error rate? - Cross Validated
Now, in the case of top-1 score, you check if the top class (the one having the highest probability) is the same as the target label.

In the case of top-5 score, you check if the target label is one of your top 5 predictions (the 5 ones with the highest probabilities).
nibble  q-n-a  overflow  machine-learning  deep-learning  metrics  comparison  ranking  top-n  classification  computer-vision  benchmarks  dataset  accuracy  error  jargon 
june 2019 by nhaliday
Analysis of Current and Future Computer Science Needs via Advertised Faculty Searches for 2019 - CRN
Differences are also seen when analyzing results based on the type of institution. Positions related to Security have the highest percentages for all but top-100 institutions. The area of Artificial Intelligence/Data Mining/Machine Learning is of most interest for top-100 PhD institutions. Roughly 35% of positions for PhD institutions are in data-oriented areas. The results show a strong interest in data-oriented areas by public PhD and private PhD, MS, and BS institutions while public MS and BS institutions are most interested in Security.
org:edu  data  analysis  visualization  trends  recruiting  jobs  career  planning  academia  higher-ed  cs  tcs  machine-learning  systems  pro-rata  measure  long-term  🎓  uncertainty  progression  grad-school  phd  distribution  ranking  top-n  security  status  s-factor  comparison  homo-hetero  correlation  org:ngo  white-paper  cost-benefit 
june 2019 by nhaliday
[1803.00085] Chinese Text in the Wild
We introduce Chinese Text in the Wild, a very large dataset of Chinese text in street view images.

...

We give baseline results using several state-of-the-art networks, including AlexNet, OverFeat, Google Inception and ResNet for character recognition, and YOLOv2 for character detection in images. Overall Google Inception has the best performance on recognition with 80.5% top-1 accuracy, while YOLOv2 achieves an mAP of 71.0% on detection. Dataset, source code and trained models will all be publicly available on the website.
nibble  pdf  papers  preprint  machine-learning  deep-learning  deepgoog  state-of-art  china  asia  writing  language  dataset  error  accuracy  computer-vision  pic  ocr  org:mat  benchmarks  questions 
may 2019 by nhaliday
Basic Error Rates
This page describes human error rates in a variety of contexts.

Most of the error rates are for mechanical errors. A good general figure for mechanical error rates appears to be about 0.5%.

Of course the denominator differs across studies. However only fairly simple actions are used in the denominator.

The Klemmer and Snyder study shows that much lower error rates are possible--in this case for people whose job consisted almost entirely of data entry.

The error rate for more complex logic errors is about 5%, based primarily on data on other pages, especially the program development page.
org:junk  list  links  objektbuch  data  database  error  accuracy  human-ml  machine-learning  ai  pro-rata  metrics  automation  benchmarks  marginal  nlp  language  density  writing  dataviz  meta:reading  speedometer 
may 2019 by nhaliday
Should I go for TensorFlow or PyTorch?
Honestly, most experts that I know love Pytorch and detest TensorFlow. Karpathy and Justin from Stanford for example. You can see Karpthy's thoughts and I've asked Justin personally and the answer was sharp: PYTORCH!!! TF has lots of PR but its API and graph model are horrible and will waste lots of your research time.

--

...

Updated Mar 12
Update after 2019 TF summit:

TL/DR: previously I was in the pytorch camp but with TF 2.0 it’s clear that Google is really going to try to have parity or try to be better than Pytorch in all aspects where people voiced concerns (ease of use/debugging/dynamic graphs). They seem to be allocating more resources on development than Facebook so the longer term currently looks promising for Google. Prior to TF 2.0 I thought that Pytorch team had more momentum. One area where FB/Pytorch is still stronger is Google is a bit more closed and doesn’t seem to release reproducible cutting edge models such as AlphaGo whereas FAIR released OpenGo for instance. Generally you will end up running into models that are only implemented in one framework of the other so chances are you might end up learning both.
q-n-a  qra  comparison  software  recommendations  cost-benefit  tradeoffs  python  libraries  machine-learning  deep-learning  data-science  sci-comp  tools  google  facebook  tech  competition  best-practices  trends  debugging  expert-experience  ecosystem  theory-practice  pragmatic  wire-guided  static-dynamic  state  academia  frameworks  open-closed 
may 2019 by nhaliday
Information Processing: Moore's Law and AI
Hint to technocratic planners: invest more in physicists, chemists, and materials scientists. The recent explosion in value from technology has been driven by physical science -- software gets way too much credit. From the former we got a factor of a million or more in compute power, data storage, and bandwidth. From the latter, we gained (perhaps) an order of magnitude or two in effectiveness: how much better are current OSes and programming languages than Unix and C, both of which are ~50 years old now?

...

Of relevance to this discussion: a big chunk of AlphaGo's performance improvement over other Go programs is due to raw compute power (link via Jess Riedel). The vertical axis is ELO score. You can see that without multi-GPU compute, AlphaGo has relatively pedestrian strength.
hsu  scitariat  comparison  software  hardware  performance  sv  tech  trends  ai  machine-learning  deep-learning  deepgoog  google  roots  impact  hard-tech  multiplicative  the-world-is-just-atoms  technology  trivia  cocktail  big-picture  hi-order-bits 
may 2019 by nhaliday
Burrito: Rethinking the Electronic Lab Notebook
Seems very well-suited for ML experiments (if you can get it to work), also the nilfs aspect is cool and basically implements exactly one of the my project ideas (mini-VCS for competitive programming). Unfortunately gnarly installation instructions specify running it on Linux VM: https://github.com/pgbovine/burrito/blob/master/INSTALL. Linux is hard requirement due to nilfs.
techtariat  project  tools  devtools  linux  programming  yak-shaving  integration-extension  nitty-gritty  workflow  exocortex  scholar  software  python  app  desktop  notetaking  state  machine-learning  data-science  nibble  sci-comp  oly  vcs  multi  repo  paste  homepage  research 
may 2019 by nhaliday
A Recipe for Training Neural Networks
acmtariat  org:bleg  nibble  machine-learning  deep-learning  howto  tutorial  guide  nitty-gritty  gotchas  init  list  checklists  expert-experience  abstraction  composition-decomposition  gradient-descent  data-science  error  debugging  benchmarks  programming  engineering  best-practices  dataviz  checking  plots  generalization  regularization  unsupervised  optimization  ensembles  random  methodology  multi  twitter  social  discussion  techtariat  links  org:med  pdf  visualization  python  recommendations  advice  devtools 
april 2019 by nhaliday
Workshop Abstract | Identifying and Understanding Deep Learning Phenomena
ICML 2019 workshop, June 15th 2019, Long Beach, CA

We solicit contributions that view the behavior of deep nets as natural phenomena, to be investigated with methods inspired from the natural sciences like physics, astronomy, and biology.
unit  workshop  acm  machine-learning  science  empirical  nitty-gritty  atoms  deep-learning  model-class  icml  data-science  rigor  replication  examples  ben-recht  physics 
april 2019 by nhaliday
Applications of computational learning theory in the cognitive sciences - Psychology & Neuroscience Stack Exchange
1. Gold's theorem on the unlearnability in the limit of certain sets of languages, among them context-free ones.

2. Ronald de Wolf's master's thesis on the impossibility to PAC-learn context-free languages.

The first made quiet a stir in the poverty-of-the-stimulus debate, and the second has been unnoticed by cognitive science.
q-n-a  stackex  psychology  cog-psych  learning  learning-theory  machine-learning  PAC  lower-bounds  no-go  language  linguistics  models  fall-2015 
february 2019 by nhaliday
Stack Overflow Developer Survey 2018
Rust, Python, Go in top most loved
F#/OCaml most high paying globally, Erlang/Scala/OCaml in the US (F# still in top 10)
ML specialists high-paid
editor usage: VSCode > VS > Sublime > Vim > Intellij >> Emacs
ranking  list  top-n  time-series  data  database  programming  engineering  pls  trends  stackex  poll  career  exploratory  network-structure  ubiquity  ocaml-sml  rust  golang  python  dotnet  money  jobs  compensation  erlang  scala  jvm  ai  ai-control  risk  futurism  ethical-algorithms  data-science  machine-learning  editors  devtools  tools  pro-rata  org:com  software  analysis  article  human-capital  let-me-see  expert-experience  complement-substitute 
december 2018 by nhaliday
Lateralization of brain function - Wikipedia
Language
Language functions such as grammar, vocabulary and literal meaning are typically lateralized to the left hemisphere, especially in right handed individuals.[3] While language production is left-lateralized in up to 90% of right-handers, it is more bilateral, or even right-lateralized, in approximately 50% of left-handers.[4]

Broca's area and Wernicke's area, two areas associated with the production of speech, are located in the left cerebral hemisphere for about 95% of right-handers, but about 70% of left-handers.[5]:69

Auditory and visual processing
The processing of visual and auditory stimuli, spatial manipulation, facial perception, and artistic ability are represented bilaterally.[4] Numerical estimation, comparison and online calculation depend on bilateral parietal regions[6][7] while exact calculation and fact retrieval are associated with left parietal regions, perhaps due to their ties to linguistic processing.[6][7]

...

Depression is linked with a hyperactive right hemisphere, with evidence of selective involvement in "processing negative emotions, pessimistic thoughts and unconstructive thinking styles", as well as vigilance, arousal and self-reflection, and a relatively hypoactive left hemisphere, "specifically involved in processing pleasurable experiences" and "relatively more involved in decision-making processes".

Chaos and Order; the right and left hemispheres: https://orthosphere.wordpress.com/2018/05/23/chaos-and-order-the-right-and-left-hemispheres/
In The Master and His Emissary, Iain McGilchrist writes that a creature like a bird needs two types of consciousness simultaneously. It needs to be able to focus on something specific, such as pecking at food, while it also needs to keep an eye out for predators which requires a more general awareness of environment.

These are quite different activities. The Left Hemisphere (LH) is adapted for a narrow focus. The Right Hemisphere (RH) for the broad. The brains of human beings have the same division of function.

The LH governs the right side of the body, the RH, the left side. With birds, the left eye (RH) looks for predators, the right eye (LH) focuses on food and specifics. Since danger can take many forms and is unpredictable, the RH has to be very open-minded.

The LH is for narrow focus, the explicit, the familiar, the literal, tools, mechanism/machines and the man-made. The broad focus of the RH is necessarily more vague and intuitive and handles the anomalous, novel, metaphorical, the living and organic. The LH is high resolution but narrow, the RH low resolution but broad.

The LH exhibits unrealistic optimism and self-belief. The RH has a tendency towards depression and is much more realistic about a person’s own abilities. LH has trouble following narratives because it has a poor sense of “wholes.” In art it favors flatness, abstract and conceptual art, black and white rather than color, simple geometric shapes and multiple perspectives all shoved together, e.g., cubism. Particularly RH paintings emphasize vistas with great depth of field and thus space and time,[1] emotion, figurative painting and scenes related to the life world. In music, LH likes simple, repetitive rhythms. The RH favors melody, harmony and complex rhythms.

...

Schizophrenia is a disease of extreme LH emphasis. Since empathy is RH and the ability to notice emotional nuance facially, vocally and bodily expressed, schizophrenics tend to be paranoid and are often convinced that the real people they know have been replaced by robotic imposters. This is at least partly because they lose the ability to intuit what other people are thinking and feeling – hence they seem robotic and suspicious.

Oswald Spengler’s The Decline of the West as well as McGilchrist characterize the West as awash in phenomena associated with an extreme LH emphasis. Spengler argues that Western civilization was originally much more RH (to use McGilchrist’s categories) and that all its most significant artistic (in the broadest sense) achievements were triumphs of RH accentuation.

The RH is where novel experiences and the anomalous are processed and where mathematical, and other, problems are solved. The RH is involved with the natural, the unfamiliar, the unique, emotions, the embodied, music, humor, understanding intonation and emotional nuance of speech, the metaphorical, nuance, and social relations. It has very little speech, but the RH is necessary for processing all the nonlinguistic aspects of speaking, including body language. Understanding what someone means by vocal inflection and facial expressions is an intuitive RH process rather than explicit.

...

RH is very much the center of lived experience; of the life world with all its depth and richness. The RH is “the master” from the title of McGilchrist’s book. The LH ought to be no more than the emissary; the valued servant of the RH. However, in the last few centuries, the LH, which has tyrannical tendencies, has tried to become the master. The LH is where the ego is predominantly located. In split brain patients where the LH and the RH are surgically divided (this is done sometimes in the case of epileptic patients) one hand will sometimes fight with the other. In one man’s case, one hand would reach out to hug his wife while the other pushed her away. One hand reached for one shirt, the other another shirt. Or a patient will be driving a car and one hand will try to turn the steering wheel in the opposite direction. In these cases, the “naughty” hand is usually the left hand (RH), while the patient tends to identify herself with the right hand governed by the LH. The two hemispheres have quite different personalities.

The connection between LH and ego can also be seen in the fact that the LH is competitive, contentious, and agonistic. It wants to win. It is the part of you that hates to lose arguments.

Using the metaphor of Chaos and Order, the RH deals with Chaos – the unknown, the unfamiliar, the implicit, the emotional, the dark, danger, mystery. The LH is connected with Order – the known, the familiar, the rule-driven, the explicit, and light of day. Learning something means to take something unfamiliar and making it familiar. Since the RH deals with the novel, it is the problem-solving part. Once understood, the results are dealt with by the LH. When learning a new piece on the piano, the RH is involved. Once mastered, the result becomes a LH affair. The muscle memory developed by repetition is processed by the LH. If errors are made, the activity returns to the RH to figure out what went wrong; the activity is repeated until the correct muscle memory is developed in which case it becomes part of the familiar LH.

Science is an attempt to find Order. It would not be necessary if people lived in an entirely orderly, explicit, known world. The lived context of science implies Chaos. Theories are reductive and simplifying and help to pick out salient features of a phenomenon. They are always partial truths, though some are more partial than others. The alternative to a certain level of reductionism or partialness would be to simply reproduce the world which of course would be both impossible and unproductive. The test for whether a theory is sufficiently non-partial is whether it is fit for purpose and whether it contributes to human flourishing.

...

Analytic philosophers pride themselves on trying to do away with vagueness. To do so, they tend to jettison context which cannot be brought into fine focus. However, in order to understand things and discern their meaning, it is necessary to have the big picture, the overview, as well as the details. There is no point in having details if the subject does not know what they are details of. Such philosophers also tend to leave themselves out of the picture even when what they are thinking about has reflexive implications. John Locke, for instance, tried to banish the RH from reality. All phenomena having to do with subjective experience he deemed unreal and once remarked about metaphors, a RH phenomenon, that they are “perfect cheats.” Analytic philosophers tend to check the logic of the words on the page and not to think about what those words might say about them. The trick is for them to recognize that they and their theories, which exist in minds, are part of reality too.

The RH test for whether someone actually believes something can be found by examining his actions. If he finds that he must regard his own actions as free, and, in order to get along with other people, must also attribute free will to them and treat them as free agents, then he effectively believes in free will – no matter his LH theoretical commitments.

...

We do not know the origin of life. We do not know how or even if consciousness can emerge from matter. We do not know the nature of 96% of the matter of the universe. Clearly all these things exist. They can provide the subject matter of theories but they continue to exist as theorizing ceases or theories change. Not knowing how something is possible is irrelevant to its actual existence. An inability to explain something is ultimately neither here nor there.

If thought begins and ends with the LH, then thinking has no content – content being provided by experience (RH), and skepticism and nihilism ensue. The LH spins its wheels self-referentially, never referring back to experience. Theory assumes such primacy that it will simply outlaw experiences and data inconsistent with it; a profoundly wrong-headed approach.

...

Gödel’s Theorem proves that not everything true can be proven to be true. This means there is an ineradicable role for faith, hope and intuition in every moderately complex human intellectual endeavor. There is no one set of consistent axioms from which all other truths can be derived.

Alan Turing’s proof of the halting problem proves that there is no effective procedure for finding effective procedures. Without a mechanical decision procedure, (LH), when it comes to … [more]
gnon  reflection  books  summary  review  neuro  neuro-nitgrit  things  thinking  metabuch  order-disorder  apollonian-dionysian  bio  examples  near-far  symmetry  homo-hetero  logic  inference  intuition  problem-solving  analytical-holistic  n-factor  europe  the-great-west-whale  occident  alien-character  detail-architecture  art  theory-practice  philosophy  being-becoming  essence-existence  language  psychology  cog-psych  egalitarianism-hierarchy  direction  reason  learning  novelty  science  anglo  anglosphere  coarse-fine  neurons  truth  contradiction  matching  empirical  volo-avolo  curiosity  uncertainty  theos  axioms  intricacy  computation  analogy  essay  rhetoric  deep-materialism  new-religion  knowledge  expert-experience  confidence  biases  optimism  pessimism  realness  whole-partial-many  theory-of-mind  values  competition  reduction  subjective-objective  communication  telos-atelos  ends-means  turing  fiction  increase-decrease  innovation  creative  thick-thin  spengler  multi  ratty  hanson  complex-systems  structure  concrete  abstraction  network-s 
september 2018 by nhaliday
The Hanson-Yudkowsky AI-Foom Debate - Machine Intelligence Research Institute
How Deviant Recent AI Progress Lumpiness?: http://www.overcomingbias.com/2018/03/how-deviant-recent-ai-progress-lumpiness.html
I seem to disagree with most people working on artificial intelligence (AI) risk. While with them I expect rapid change once AI is powerful enough to replace most all human workers, I expect this change to be spread across the world, not concentrated in one main localized AI system. The efforts of AI risk folks to design AI systems whose values won’t drift might stop global AI value drift if there is just one main AI system. But doing so in a world of many AI systems at similar abilities levels requires strong global governance of AI systems, which is a tall order anytime soon. Their continued focus on preventing single system drift suggests that they expect a single main AI system.

The main reason that I understand to expect relatively local AI progress is if AI progress is unusually lumpy, i.e., arriving in unusually fewer larger packages rather than in the usual many smaller packages. If one AI team finds a big lump, it might jump way ahead of the other teams.

However, we have a vast literature on the lumpiness of research and innovation more generally, which clearly says that usually most of the value in innovation is found in many small innovations. We have also so far seen this in computer science (CS) and AI. Even if there have been historical examples where much value was found in particular big innovations, such as nuclear weapons or the origin of humans.

Apparently many people associated with AI risk, including the star machine learning (ML) researchers that they often idolize, find it intuitively plausible that AI and ML progress is exceptionally lumpy. Such researchers often say, “My project is ‘huge’, and will soon do it all!” A decade ago my ex-co-blogger Eliezer Yudkowsky and I argued here on this blog about our differing estimates of AI progress lumpiness. He recently offered Alpha Go Zero as evidence of AI lumpiness:

...

In this post, let me give another example (beyond two big lumps in a row) of what could change my mind. I offer a clear observable indicator, for which data should have available now: deviant citation lumpiness in recent ML research. One standard measure of research impact is citations; bigger lumpier developments gain more citations that smaller ones. And it turns out that the lumpiness of citations is remarkably constant across research fields! See this March 3 paper in Science:

I Still Don’t Get Foom: http://www.overcomingbias.com/2014/07/30855.html
All of which makes it look like I’m the one with the problem; everyone else gets it. Even so, I’m gonna try to explain my problem again, in the hope that someone can explain where I’m going wrong. Here goes.

“Intelligence” just means an ability to do mental/calculation tasks, averaged over many tasks. I’ve always found it plausible that machines will continue to do more kinds of mental tasks better, and eventually be better at pretty much all of them. But what I’ve found it hard to accept is a “local explosion.” This is where a single machine, built by a single project using only a tiny fraction of world resources, goes in a short time (e.g., weeks) from being so weak that it is usually beat by a single human with the usual tools, to so powerful that it easily takes over the entire world. Yes, smarter machines may greatly increase overall economic growth rates, and yes such growth may be uneven. But this degree of unevenness seems implausibly extreme. Let me explain.

If we count by economic value, humans now do most of the mental tasks worth doing. Evolution has given us a brain chock-full of useful well-honed modules. And the fact that most mental tasks require the use of many modules is enough to explain why some of us are smarter than others. (There’d be a common “g” factor in task performance even with independent module variation.) Our modules aren’t that different from those of other primates, but because ours are different enough to allow lots of cultural transmission of innovation, we’ve out-competed other primates handily.

We’ve had computers for over seventy years, and have slowly build up libraries of software modules for them. Like brains, computers do mental tasks by combining modules. An important mental task is software innovation: improving these modules, adding new ones, and finding new ways to combine them. Ideas for new modules are sometimes inspired by the modules we see in our brains. When an innovation team finds an improvement, they usually sell access to it, which gives them resources for new projects, and lets others take advantage of their innovation.

...

In Bostrom’s graph above the line for an initially small project and system has a much higher slope, which means that it becomes in a short time vastly better at software innovation. Better than the entire rest of the world put together. And my key question is: how could it plausibly do that? Since the rest of the world is already trying the best it can to usefully innovate, and to abstract to promote such innovation, what exactly gives one small project such a huge advantage to let it innovate so much faster?

...

In fact, most software innovation seems to be driven by hardware advances, instead of innovator creativity. Apparently, good ideas are available but must usually wait until hardware is cheap enough to support them.

Yes, sometimes architectural choices have wider impacts. But I was an artificial intelligence researcher for nine years, ending twenty years ago, and I never saw an architecture choice make a huge difference, relative to other reasonable architecture choices. For most big systems, overall architecture matters a lot less than getting lots of detail right. Researchers have long wandered the space of architectures, mostly rediscovering variations on what others found before.

Some hope that a small project could be much better at innovation because it specializes in that topic, and much better understands new theoretical insights into the basic nature of innovation or intelligence. But I don’t think those are actually topics where one can usefully specialize much, or where we’ll find much useful new theory. To be much better at learning, the project would instead have to be much better at hundreds of specific kinds of learning. Which is very hard to do in a small project.

What does Bostrom say? Alas, not much. He distinguishes several advantages of digital over human minds, but all software shares those advantages. Bostrom also distinguishes five paths: better software, brain emulation (i.e., ems), biological enhancement of humans, brain-computer interfaces, and better human organizations. He doesn’t think interfaces would work, and sees organizations and better biology as only playing supporting roles.

...

Similarly, while you might imagine someday standing in awe in front of a super intelligence that embodies all the power of a new age, superintelligence just isn’t the sort of thing that one project could invent. As “intelligence” is just the name we give to being better at many mental tasks by using many good mental modules, there’s no one place to improve it. So I can’t see a plausible way one project could increase its intelligence vastly faster than could the rest of the world.

Takeoff speeds: https://sideways-view.com/2018/02/24/takeoff-speeds/
Futurists have argued for years about whether the development of AGI will look more like a breakthrough within a small group (“fast takeoff”), or a continuous acceleration distributed across the broader economy or a large firm (“slow takeoff”).

I currently think a slow takeoff is significantly more likely. This post explains some of my reasoning and why I think it matters. Mostly the post lists arguments I often hear for a fast takeoff and explains why I don’t find them compelling.

(Note: this is not a post about whether an intelligence explosion will occur. That seems very likely to me. Quantitatively I expect it to go along these lines. So e.g. while I disagree with many of the claims and assumptions in Intelligence Explosion Microeconomics, I don’t disagree with the central thesis or with most of the arguments.)
ratty  lesswrong  subculture  miri-cfar  ai  risk  ai-control  futurism  books  debate  hanson  big-yud  prediction  contrarianism  singularity  local-global  speed  speedometer  time  frontier  distribution  smoothness  shift  pdf  economics  track-record  abstraction  analogy  links  wiki  list  evolution  mutation  selection  optimization  search  iteration-recursion  intelligence  metameta  chart  analysis  number  ems  coordination  cooperate-defect  death  values  formal-values  flux-stasis  philosophy  farmers-and-foragers  malthus  scale  studying  innovation  insight  conceptual-vocab  growth-econ  egalitarianism-hierarchy  inequality  authoritarianism  wealth  near-far  rationality  epistemic  biases  cycles  competition  arms  zero-positive-sum  deterrence  war  peace-violence  winner-take-all  technology  moloch  multi  plots  research  science  publishing  humanity  labor  marginal  urban-rural  structure  composition-decomposition  complex-systems  gregory-clark  decentralized  heavy-industry  magnitude  multiplicative  endogenous-exogenous  models  uncertainty  decision-theory  time-prefer 
april 2018 by nhaliday
AI-complete - Wikipedia
In the field of artificial intelligence, the most difficult problems are informally known as AI-complete or AI-hard, implying that the difficulty of these computational problems is equivalent to that of solving the central artificial intelligence problem—making computers as intelligent as people, or strong AI.[1] To call a problem AI-complete reflects an attitude that it would not be solved by a simple specific algorithm.

AI-complete problems are hypothesised to include computer vision, natural language understanding, and dealing with unexpected circumstances while solving any real world problem.[2]

Currently, AI-complete problems cannot be solved with modern computer technology alone, but would also require human computation. This property can be useful, for instance to test for the presence of humans as with CAPTCHAs, and for computer security to circumvent brute-force attacks.[3][4]

...

AI-complete problems are hypothesised to include:

Bongard problems
Computer vision (and subproblems such as object recognition)
Natural language understanding (and subproblems such as text mining, machine translation, and word sense disambiguation[8])
Dealing with unexpected circumstances while solving any real world problem, whether it's navigation or planning or even the kind of reasoning done by expert systems.

...

Current AI systems can solve very simple and/or restricted versions of AI-complete problems, but never in their full generality. When AI researchers attempt to "scale up" their systems to handle more complicated, real world situations, the programs tend to become excessively brittle without commonsense knowledge or a rudimentary understanding of the situation: they fail as unexpected circumstances outside of its original problem context begin to appear. When human beings are dealing with new situations in the world, they are helped immensely by the fact that they know what to expect: they know what all things around them are, why they are there, what they are likely to do and so on. They can recognize unusual situations and adjust accordingly. A machine without strong AI has no other skills to fall back on.[9]
concept  reduction  cs  computation  complexity  wiki  reference  properties  computer-vision  ai  risk  ai-control  machine-learning  deep-learning  language  nlp  order-disorder  tactics  strategy  intelligence  humanity  speculation  crux 
march 2018 by nhaliday
Information Processing: Mathematical Theory of Deep Neural Networks (Princeton workshop)
"Recently, long-past-due theoretical results have begun to emerge. These results, and those that will follow in their wake, will begin to shed light on the properties of large, adaptive, distributed learning architectures, and stand to revolutionize how computer science and neuroscience understand these systems."
hsu  scitariat  commentary  links  research  research-program  workshop  events  princeton  sanjeev-arora  deep-learning  machine-learning  ai  generalization  explanans  off-convex  nibble  frontier  speedometer  state-of-art  big-surf  announcement 
january 2018 by nhaliday
Sequence Modeling with CTC
A visual guide to Connectionist Temporal Classification, an algorithm used to train deep neural networks in speech recognition, handwriting recognition and other sequence problems.
acmtariat  techtariat  org:bleg  nibble  better-explained  machine-learning  deep-learning  visual-understanding  visualization  analysis  let-me-see  research  sequential  audio  classification  model-class  exposition  language  acm  approximation  comparison  markov  iteration-recursion  concept  atoms  distribution  orders  DP  heuristic  optimization  trees  greedy  matching  gradient-descent  org:popup 
december 2017 by nhaliday
« earlier      
per page:    204080120160

Copy this bookmark:





to read