big list - Are there proofs that you feel you did not "understand" for a long time? - MathOverflow

nibble q-n-a overflow soft-question big-list math proofs expert-experience heavyweights gowers mathtariat reflection learning intricacy grokkability intuition algebra math.GR motivation math.GN topology synthesis math.CT computation tcs logic iteration-recursion math.CA extrema smoothness span-cover grokkability-clarity

august 2019 by nhaliday


The Existential Risk of Math Errors - Gwern.net

july 2019 by nhaliday

How big is this upper bound? Mathematicians have often made errors in proofs, but it's rarer for ideas to be accepted for a long time and then rejected. We can divide errors into 2 basic cases corresponding to type I and type II errors:

1. Mistakes where the theorem is still true, but the proof was incorrect (type I)

2. Mistakes where the theorem was false, and the proof was also necessarily incorrect (type II)

Before someone comes up with a final answer, a mathematician may have many levels of intuition in formulating & working on the problem, but we’ll consider the final end-product where the mathematician feels satisfied that he has solved it. Case 1 is perhaps the most common case, with innumerable examples; this is sometimes due to mistakes in the proof that anyone would accept as a mistake, but many of these cases are due to changing standards of proof. For example, when David Hilbert discovered errors in Euclid’s proofs which no one noticed before, the theorems were still true, and the gaps were due more to Hilbert being a modern mathematician thinking in terms of formal systems (which of course Euclid did not think in). (David Hilbert himself turns out to be a useful example of the other kind of error: his famous list of 23 problems was accompanied by definite opinions on the outcome of each problem and sometimes timings, several of which were wrong or questionable5.) Similarly, early calculus used ‘infinitesimals’ which were sometimes treated as being 0 and sometimes treated as an indefinitely small non-zero number; this was incoherent, and, strictly speaking, practically all of the calculus results were wrong because they relied on an incoherent concept - but of course the results were some of the greatest mathematical work ever conducted6 and when later mathematicians put calculus on a more rigorous footing, they immediately re-derived those results (sometimes with important qualifications), and doubtless as modern math evolves other fields have sometimes needed to go back and clean up the foundations and will in the future.7

...

Isaac Newton, incidentally, gave two proofs of the same solution to a problem in probability, one via enumeration and the other more abstract; the enumeration was correct, but the other proof was totally wrong, and this was not noticed for a long time, leading Stigler to remark:

...

TYPE I > TYPE II?

“Lefschetz was a purely intuitive mathematician. It was said of him that he had never given a completely correct proof, but had never made a wrong guess either.”

- Gian-Carlo Rota13

Case 2 is disturbing, since it is a case in which we wind up with false beliefs and also false beliefs about our beliefs (we no longer know that we don’t know). Case 2 could lead to extinction.

...

Except, errors do not seem to be evenly & randomly distributed between case 1 and case 2. There seem to be far more case 1s than case 2s, as already mentioned in the early calculus example: far more than 50% of the early calculus results were correct when checked more rigorously. Richard Hamming attributes to Ralph Boas a comment, from his time editing Mathematical Reviews, that “of the new results in the papers reviewed most are true but the corresponding proofs are perhaps half the time plain wrong”.

...

Gian-Carlo Rota gives us an example with Hilbert:

...

Olga labored for three years; it turned out that all mistakes could be corrected without any major changes in the statement of the theorems. There was one exception, a paper Hilbert wrote in his old age, which could not be fixed; it was a purported proof of the continuum hypothesis, you will find it in a volume of the Mathematische Annalen of the early thirties.

...

Leslie Lamport advocates for machine-checked proofs and a more rigorous style of proofs similar to natural deduction, noting a mathematician acquaintance guesses at a broad error rate of 1/329 and that he routinely found mistakes in his own proofs and, worse, believed false conjectures30.

[more on these "structured proofs":

https://academia.stackexchange.com/questions/52435/does-anyone-actually-publish-structured-proofs

https://mathoverflow.net/questions/35727/community-experiences-writing-lamports-structured-proofs

]

We can probably add software to that list: early software engineering work found that, dismayingly, bug rates seem to be simply a function of lines of code, and one would expect diseconomies of scale. So one would expect that in going from the ~4,000 lines of code of the Microsoft DOS operating system kernel to the ~50,000,000 lines of code in Windows Server 2003 (with full systems of applications and libraries being even larger: the comprehensive Debian repository in 2007 contained ~323,551,126 lines of code) that the number of active bugs at any time would be… fairly large. Mathematical software is hopefully better, but practitioners still run into issues (eg Durán et al 2014, Fonseca et al 2017) and I don’t know of any research pinning down how buggy key mathematical systems like Mathematica are or how much published mathematics may be erroneous due to bugs. This general problem led to predictions of doom and spurred much research into automated proof-checking, static analysis, and functional languages31.
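The "bugs as a function of lines of code" claim above reduces to a one-line calculation. A minimal sketch, where the defect density of 10 bugs per thousand lines is an illustrative assumption (industry estimates vary widely), not a figure from the text:

```python
# Illustrative only: the defects_per_kloc value is an assumption, not a
# measured rate; the point is that bug counts scale with code size.
def expected_bugs(lines_of_code: int, defects_per_kloc: float) -> float:
    """Naive linear model: expected bugs = KLOC * defect density."""
    return lines_of_code / 1000 * defects_per_kloc

dos_kernel = expected_bugs(4_000, 10)         # ~40 bugs
win_2003   = expected_bugs(50_000_000, 10)    # ~500,000 bugs
```

Even before accounting for any diseconomies of scale, the linear model alone takes the expected bug count from dozens to hundreds of thousands across that size range.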

[related:

https://mathoverflow.net/questions/11517/computer-algebra-errors

I don't know any interesting bugs in symbolic algebra packages but I know a true, enlightening and entertaining story about something that looked like a bug but wasn't.

Define sinc(x) = (sin x)/x.

Someone found the following result in an algebra package: ∫₀^∞ sinc(x) dx = π/2

They then found the following results:

...

So of course when they got:

∫₀^∞ sinc(x) sinc(x/3) sinc(x/5) ⋯ sinc(x/15) dx = (467807924713440738696537864469 / 935615849440640907310521750000) π

hmm:

Which means that nobody knows Fourier analysis nowadays. Very sad and discouraging story... – fedja Jan 29 '10 at 18:47
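These are the Borwein integrals, and the "bug" can be checked numerically. A pure-stdlib sketch (not from the original answer): the 8-factor product through sinc(x/15) integrates to a value that falls short of π/2 by only about 2×10⁻¹¹, which is below the resolution of this crude quadrature, so the numerical result is indistinguishable from π/2 here; only exact symbolic evaluation reveals the shortfall.

```python
import math

def sinc(x: float) -> float:
    """sinc(x) = sin(x)/x, with the removable singularity at 0 filled in."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def borwein_integrand(x: float, n_terms: int) -> float:
    """Product sinc(x/1) * sinc(x/3) * ... * sinc(x/(2*n_terms - 1))."""
    p = 1.0
    for k in range(n_terms):
        p *= sinc(x / (2 * k + 1))
    return p

def simpson(f, a: float, b: float, n: int) -> float:
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += f(a + i * h) * (4 if i % 2 else 2)
    return total * h / 3

# Truncating at x = 200 is safe: the 8-factor product is bounded in
# magnitude by (1*3*5*7*9*11*13*15)/x**8, so the tail beyond 200 is ~2e-11.
val = simpson(lambda x: borwein_integrand(x, 8), 0.0, 200.0, 200_000)
print(val, math.pi / 2)  # agree to many decimal places
```

The exact fraction in the answer above shows the deviation symbolically; numerically the first seven integrals (through sinc(x/13)) are exactly π/2 and the eighth is π/2 times a rational number just barely below 1.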

--

Because the most popular systems are all commercial, they tend to guard their bug database rather closely -- making them public would seriously cut their sales. For example, for the open source project Sage (which is quite young), you can get a list of all the known bugs from this page. 1582 known issues on Feb.16th 2010 (which includes feature requests, problems with documentation, etc).

That is an order of magnitude less than the commercial systems. And it's not because it is better, it is because it is younger and smaller. It might be better, but until SAGE does a lot of analysis (about 40% of CAS bugs are there) and a fancy user interface (another 40%), it is too hard to compare.

I once ran a graduate course whose core topic was studying the fundamental disconnect between the algebraic nature of CAS and the analytic nature of what it is mostly used for. There are issues of logic -- CASes work more or less in an intensional logic, while most of analysis is stated in a purely extensional fashion. There is no well-defined 'denotational semantics' for expressions-as-functions, which strongly contributes to the deeper bugs in CASes.]

...

Should such widely-believed conjectures as P≠NP or the Riemann hypothesis turn out to be false, then because they are assumed by so many existing proofs, a far larger math holocaust would ensue38 - and our previous estimates of error rates will turn out to have been substantial underestimates. But it may be a cloud with a silver lining, if it doesn’t come at a time of danger.

https://mathoverflow.net/questions/338607/why-doesnt-mathematics-collapse-down-even-though-humans-quite-often-make-mista

more on formal methods in programming:

https://www.quantamagazine.org/formal-verification-creates-hacker-proof-code-20160920/

https://intelligence.org/2014/03/02/bob-constable/

https://softwareengineering.stackexchange.com/questions/375342/what-are-the-barriers-that-prevent-widespread-adoption-of-formal-methods

Update: measured effort

In the October 2018 issue of Communications of the ACM there is an interesting article about Formally verified software in the real world with some estimates of the effort.

Interestingly (based on OS development for military equipment), it seems that producing formally proved software requires 3.3 times more effort than with traditional engineering techniques. So it's really costly.

On the other hand, it requires 2.3 times less effort to get high-security software this way than with traditionally engineered software, if you add the effort to make such software certified at a high security level (EAL 7). So if you have high reliability or security requirements there is definitely a business case for going formal.
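Putting those two ratios together makes the business case concrete. A back-of-envelope sketch, where the 100-unit baseline is an arbitrary illustrative assumption and only the 3.3x and 2.3x ratios come from the figures quoted above:

```python
# Hypothetical effort units; only the ratios are from the quoted article.
baseline_effort = 100.0                       # traditional engineering
formal_effort = 3.3 * baseline_effort         # formally verified: 330 units
# Traditional development *plus* EAL 7 certification costs 2.3x the formal route:
certified_traditional = 2.3 * formal_effort   # 759 units
```

So formal methods roughly triple the cost of plain development, but if you were going to pay for high-assurance certification anyway, they cut the total bill by more than half.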

WHY DON'T PEOPLE USE FORMAL METHODS?: https://www.hillelwayne.com/post/why-dont-people-use-formal-methods/

You can see examples of how all of these look at Let’s Prove Leftpad. HOL4 and Isabelle are good examples of “independent theorem” specs, SPARK and Dafny have “embedded assertion” specs, and Coq and Agda have “dependent type” specs.6

If you squint a bit it looks like these three forms of code spec map to the three main domains of automated correctness checking: tests, contracts, and types. This is not a coincidence. Correctness is a spectrum, and formal verification is one extreme of that spectrum. As we reduce the rigour (and effort) of our verification we get simpler and narrower checks, whether that means limiting the explored state space, using weaker types, or pushing verification to the runtime. Any means of total specification then becomes a means of partial specification, and vice versa: many consider Cleanroom a formal verification technique, which primarily works by pushing code review far beyond what’s humanly possible.
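The tests/contracts/types triad above can be shown on one toy function. A minimal sketch using the post's leftpad example, with all names illustrative (the real Let's Prove Leftpad specs are in HOL4, SPARK, Coq, etc., not Python):

```python
# A toy "leftpad" carrying all three partial-specification styles the
# post maps to types, contracts, and tests.

def leftpad(s: str, n: int, c: str = " ") -> str:   # type-level spec (partial)
    assert len(c) == 1, "pad char must be a single character"  # input contract
    result = c * max(0, n - len(s)) + s
    assert len(result) == max(n, len(s))            # output contract
    return result

# Test-style spec: concrete input/output examples.
assert leftpad("foo", 5) == "  foo"
assert leftpad("foobar", 5) == "foobar"
```

Each style checks a strictly narrower property than a full formal proof would: the type says nothing about padding, the contracts check only lengths, and the tests cover only two inputs.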

...

The question, then: “is 90/95/99% correct significantly cheaper than 100% correct?” The answer is very yes. We all are comfortable saying that a codebase we’ve well-tested and well-typed is mostly correct modulo a few fixes in prod, and we’re even writing more than four lines of code a day. In fact, the vast… [more]

ratty
gwern
analysis
essay
realness
truth
correctness
reason
philosophy
math
proofs
formal-methods
cs
programming
engineering
worse-is-better/the-right-thing
intuition
giants
old-anglo
error
street-fighting
heuristic
zooming
risk
threat-modeling
software
lens
logic
inference
physics
differential
geometry
estimate
distribution
robust
speculation
nonlinearity
cost-benefit
convexity-curvature
measure
scale
trivia
cocktail
history
early-modern
europe
math.CA
rigor
news
org:mag
org:sci
miri-cfar
pdf
thesis
comparison
examples
org:junk
q-n-a
stackex
pragmatic
tradeoffs
cracker-prog
techtariat
invariance
DSL
chart
ecosystem
grokkability
heavyweights
CAS
static-dynamic
lower-bounds
complexity
tcs
open-problems
big-surf
ideas
certificates-recognition
proof-systems
PCP
mediterranean
SDP
meta:prediction
epistemic
questions
guessing
distributed
overflow
nibble
soft-question
track-record
big-list
hmm
frontier
state-of-art
move-fast-(and-break-things)
grokkability-clarity
technical-writing
trust

july 2019 by nhaliday

Lateralization of brain function - Wikipedia

september 2018 by nhaliday

Language

Language functions such as grammar, vocabulary and literal meaning are typically lateralized to the left hemisphere, especially in right-handed individuals.[3] While language production is left-lateralized in up to 90% of right-handers, it is more bilateral, or even right-lateralized, in approximately 50% of left-handers.[4]

Broca's area and Wernicke's area, two areas associated with the production of speech, are located in the left cerebral hemisphere for about 95% of right-handers, but about 70% of left-handers.[5]:69

Auditory and visual processing

The processing of visual and auditory stimuli, spatial manipulation, facial perception, and artistic ability are represented bilaterally.[4] Numerical estimation, comparison and online calculation depend on bilateral parietal regions[6][7] while exact calculation and fact retrieval are associated with left parietal regions, perhaps due to their ties to linguistic processing.[6][7]

...

Depression is linked with a hyperactive right hemisphere, with evidence of selective involvement in "processing negative emotions, pessimistic thoughts and unconstructive thinking styles", as well as vigilance, arousal and self-reflection, and a relatively hypoactive left hemisphere, "specifically involved in processing pleasurable experiences" and "relatively more involved in decision-making processes".

Chaos and Order; the right and left hemispheres: https://orthosphere.wordpress.com/2018/05/23/chaos-and-order-the-right-and-left-hemispheres/

In The Master and His Emissary, Iain McGilchrist writes that a creature like a bird needs two types of consciousness simultaneously. It needs to be able to focus on something specific, such as pecking at food, while it also needs to keep an eye out for predators which requires a more general awareness of environment.

These are quite different activities. The Left Hemisphere (LH) is adapted for a narrow focus. The Right Hemisphere (RH) for the broad. The brains of human beings have the same division of function.

The LH governs the right side of the body, the RH, the left side. With birds, the left eye (RH) looks for predators, the right eye (LH) focuses on food and specifics. Since danger can take many forms and is unpredictable, the RH has to be very open-minded.

The LH is for narrow focus, the explicit, the familiar, the literal, tools, mechanism/machines and the man-made. The broad focus of the RH is necessarily more vague and intuitive and handles the anomalous, novel, metaphorical, the living and organic. The LH is high resolution but narrow, the RH low resolution but broad.

The LH exhibits unrealistic optimism and self-belief. The RH has a tendency towards depression and is much more realistic about a person’s own abilities. LH has trouble following narratives because it has a poor sense of “wholes.” In art it favors flatness, abstract and conceptual art, black and white rather than color, simple geometric shapes and multiple perspectives all shoved together, e.g., cubism. Particularly RH paintings emphasize vistas with great depth of field and thus space and time,[1] emotion, figurative painting and scenes related to the life world. In music, LH likes simple, repetitive rhythms. The RH favors melody, harmony and complex rhythms.

...

Schizophrenia is a disease of extreme LH emphasis. Since empathy is RH and the ability to notice emotional nuance facially, vocally and bodily expressed, schizophrenics tend to be paranoid and are often convinced that the real people they know have been replaced by robotic imposters. This is at least partly because they lose the ability to intuit what other people are thinking and feeling – hence they seem robotic and suspicious.

Oswald Spengler’s The Decline of the West as well as McGilchrist characterize the West as awash in phenomena associated with an extreme LH emphasis. Spengler argues that Western civilization was originally much more RH (to use McGilchrist’s categories) and that all its most significant artistic (in the broadest sense) achievements were triumphs of RH accentuation.

The RH is where novel experiences and the anomalous are processed and where mathematical, and other, problems are solved. The RH is involved with the natural, the unfamiliar, the unique, emotions, the embodied, music, humor, understanding intonation and emotional nuance of speech, the metaphorical, nuance, and social relations. It has very little speech, but the RH is necessary for processing all the nonlinguistic aspects of speaking, including body language. Understanding what someone means by vocal inflection and facial expressions is an intuitive RH process rather than explicit.

...

RH is very much the center of lived experience; of the life world with all its depth and richness. The RH is “the master” from the title of McGilchrist’s book. The LH ought to be no more than the emissary; the valued servant of the RH. However, in the last few centuries, the LH, which has tyrannical tendencies, has tried to become the master. The LH is where the ego is predominantly located. In split brain patients where the LH and the RH are surgically divided (this is done sometimes in the case of epileptic patients) one hand will sometimes fight with the other. In one man’s case, one hand would reach out to hug his wife while the other pushed her away. One hand reached for one shirt, the other another shirt. Or a patient will be driving a car and one hand will try to turn the steering wheel in the opposite direction. In these cases, the “naughty” hand is usually the left hand (RH), while the patient tends to identify herself with the right hand governed by the LH. The two hemispheres have quite different personalities.

The connection between LH and ego can also be seen in the fact that the LH is competitive, contentious, and agonistic. It wants to win. It is the part of you that hates to lose arguments.

Using the metaphor of Chaos and Order, the RH deals with Chaos – the unknown, the unfamiliar, the implicit, the emotional, the dark, danger, mystery. The LH is connected with Order – the known, the familiar, the rule-driven, the explicit, and light of day. Learning something means to take something unfamiliar and making it familiar. Since the RH deals with the novel, it is the problem-solving part. Once understood, the results are dealt with by the LH. When learning a new piece on the piano, the RH is involved. Once mastered, the result becomes a LH affair. The muscle memory developed by repetition is processed by the LH. If errors are made, the activity returns to the RH to figure out what went wrong; the activity is repeated until the correct muscle memory is developed in which case it becomes part of the familiar LH.

Science is an attempt to find Order. It would not be necessary if people lived in an entirely orderly, explicit, known world. The lived context of science implies Chaos. Theories are reductive and simplifying and help to pick out salient features of a phenomenon. They are always partial truths, though some are more partial than others. The alternative to a certain level of reductionism or partialness would be to simply reproduce the world which of course would be both impossible and unproductive. The test for whether a theory is sufficiently non-partial is whether it is fit for purpose and whether it contributes to human flourishing.

...

Analytic philosophers pride themselves on trying to do away with vagueness. To do so, they tend to jettison context which cannot be brought into fine focus. However, in order to understand things and discern their meaning, it is necessary to have the big picture, the overview, as well as the details. There is no point in having details if the subject does not know what they are details of. Such philosophers also tend to leave themselves out of the picture even when what they are thinking about has reflexive implications. John Locke, for instance, tried to banish the RH from reality. All phenomena having to do with subjective experience he deemed unreal and once remarked about metaphors, a RH phenomenon, that they are “perfect cheats.” Analytic philosophers tend to check the logic of the words on the page and not to think about what those words might say about them. The trick is for them to recognize that they and their theories, which exist in minds, are part of reality too.

The RH test for whether someone actually believes something can be found by examining his actions. If he finds that he must regard his own actions as free, and, in order to get along with other people, must also attribute free will to them and treat them as free agents, then he effectively believes in free will – no matter his LH theoretical commitments.

...

We do not know the origin of life. We do not know how or even if consciousness can emerge from matter. We do not know the nature of 96% of the matter of the universe. Clearly all these things exist. They can provide the subject matter of theories but they continue to exist as theorizing ceases or theories change. Not knowing how something is possible is irrelevant to its actual existence. An inability to explain something is ultimately neither here nor there.

If thought begins and ends with the LH, then thinking has no content – content being provided by experience (RH), and skepticism and nihilism ensue. The LH spins its wheels self-referentially, never referring back to experience. Theory assumes such primacy that it will simply outlaw experiences and data inconsistent with it; a profoundly wrong-headed approach.

...

Gödel’s incompleteness theorem shows that, in any consistent formal system rich enough to express arithmetic, there are true statements that cannot be proven within the system. This means there is an ineradicable role for faith, hope and intuition in every moderately complex human intellectual endeavor. There is no one set of consistent axioms from which all other truths can be derived.

Alan Turing’s proof of the undecidability of the halting problem shows that there is no effective procedure for finding effective procedures. Without a mechanical decision procedure, (LH), when it comes to … [more]

gnon
reflection
books
summary
review
neuro
neuro-nitgrit
things
thinking
metabuch
order-disorder
apollonian-dionysian
bio
examples
near-far
symmetry
homo-hetero
logic
inference
intuition
problem-solving
analytical-holistic
n-factor
europe
the-great-west-whale
occident
alien-character
detail-architecture
art
theory-practice
philosophy
being-becoming
essence-existence
language
psychology
cog-psych
egalitarianism-hierarchy
direction
reason
learning
novelty
science
anglo
anglosphere
coarse-fine
neurons
truth
contradiction
matching
empirical
volo-avolo
curiosity
uncertainty
theos
axioms
intricacy
computation
analogy
essay
rhetoric
deep-materialism
new-religion
knowledge
expert-experience
confidence
biases
optimism
pessimism
realness
whole-partial-many
theory-of-mind
values
competition
reduction
subjective-objective
communication
telos-atelos
ends-means
turing
fiction
increase-decrease
innovation
creative
thick-thin
spengler
multi
ratty
hanson
complex-systems
structure
concrete
abstraction
network-s
Language functions such as grammar, vocabulary and literal meaning are typically lateralized to the left hemisphere, especially in right handed individuals.[3] While language production is left-lateralized in up to 90% of right-handers, it is more bilateral, or even right-lateralized, in approximately 50% of left-handers.[4]

Broca's area and Wernicke's area, two areas associated with the production of speech, are located in the left cerebral hemisphere for about 95% of right-handers, but about 70% of left-handers.[5]:69

Auditory and visual processing

The processing of visual and auditory stimuli, spatial manipulation, facial perception, and artistic ability are represented bilaterally.[4] Numerical estimation, comparison and online calculation depend on bilateral parietal regions[6][7] while exact calculation and fact retrieval are associated with left parietal regions, perhaps due to their ties to linguistic processing.[6][7]

...

Depression is linked with a hyperactive right hemisphere, with evidence of selective involvement in "processing negative emotions, pessimistic thoughts and unconstructive thinking styles", as well as vigilance, arousal and self-reflection, and a relatively hypoactive left hemisphere, "specifically involved in processing pleasurable experiences" and "relatively more involved in decision-making processes".

Chaos and Order; the right and left hemispheres: https://orthosphere.wordpress.com/2018/05/23/chaos-and-order-the-right-and-left-hemispheres/

In The Master and His Emissary, Iain McGilchrist writes that a creature like a bird needs two types of consciousness simultaneously. It needs to be able to focus on something specific, such as pecking at food, while it also needs to keep an eye out for predators which requires a more general awareness of environment.

These are quite different activities. The Left Hemisphere (LH) is adapted for a narrow focus. The Right Hemisphere (RH) for the broad. The brains of human beings have the same division of function.

The LH governs the right side of the body, the RH, the left side. With birds, the left eye (RH) looks for predators, the right eye (LH) focuses on food and specifics. Since danger can take many forms and is unpredictable, the RH has to be very open-minded.

The LH is for narrow focus, the explicit, the familiar, the literal, tools, mechanism/machines and the man-made. The broad focus of the RH is necessarily more vague and intuitive and handles the anomalous, novel, metaphorical, the living and organic. The LH is high resolution but narrow, the RH low resolution but broad.

The LH exhibits unrealistic optimism and self-belief. The RH has a tendency towards depression and is much more realistic about a person’s own abilities. The LH has trouble following narratives because it has a poor sense of “wholes.” In art it favors flatness, abstract and conceptual art, black and white rather than color, simple geometric shapes and multiple perspectives all shoved together, e.g., cubism. RH paintings, in particular, emphasize vistas with great depth of field, and thus space and time,[1] as well as emotion, figurative painting and scenes related to the life world. In music, LH likes simple, repetitive rhythms. The RH favors melody, harmony and complex rhythms.

...

Schizophrenia is a disease of extreme LH emphasis. Since empathy, and the ability to notice emotional nuance expressed facially, vocally and bodily, are RH functions, schizophrenics tend to be paranoid and are often convinced that the real people they know have been replaced by robotic imposters. This is at least partly because they lose the ability to intuit what other people are thinking and feeling – hence other people seem robotic and suspicious to them.

Oswald Spengler’s The Decline of the West as well as McGilchrist characterize the West as awash in phenomena associated with an extreme LH emphasis. Spengler argues that Western civilization was originally much more RH (to use McGilchrist’s categories) and that all its most significant artistic (in the broadest sense) achievements were triumphs of RH accentuation.

The RH is where novel experiences and the anomalous are processed and where mathematical, and other, problems are solved. The RH is involved with the natural, the unfamiliar, the unique, emotions, the embodied, music, humor, understanding intonation and emotional nuance of speech, the metaphorical, nuance, and social relations. It has very little speech, but the RH is necessary for processing all the nonlinguistic aspects of speaking, including body language. Understanding what someone means by vocal inflection and facial expressions is an intuitive RH process rather than explicit.

...

RH is very much the center of lived experience; of the life world with all its depth and richness. The RH is “the master” from the title of McGilchrist’s book. The LH ought to be no more than the emissary; the valued servant of the RH. However, in the last few centuries, the LH, which has tyrannical tendencies, has tried to become the master. The LH is where the ego is predominantly located. In split brain patients where the LH and the RH are surgically divided (this is done sometimes in the case of epileptic patients) one hand will sometimes fight with the other. In one man’s case, one hand would reach out to hug his wife while the other pushed her away. One hand reached for one shirt, the other another shirt. Or a patient will be driving a car and one hand will try to turn the steering wheel in the opposite direction. In these cases, the “naughty” hand is usually the left hand (RH), while the patient tends to identify herself with the right hand governed by the LH. The two hemispheres have quite different personalities.

The connection between LH and ego can also be seen in the fact that the LH is competitive, contentious, and agonistic. It wants to win. It is the part of you that hates to lose arguments.

Using the metaphor of Chaos and Order, the RH deals with Chaos – the unknown, the unfamiliar, the implicit, the emotional, the dark, danger, mystery. The LH is connected with Order – the known, the familiar, the rule-driven, the explicit, and light of day. Learning something means to take something unfamiliar and making it familiar. Since the RH deals with the novel, it is the problem-solving part. Once understood, the results are dealt with by the LH. When learning a new piece on the piano, the RH is involved. Once mastered, the result becomes a LH affair. The muscle memory developed by repetition is processed by the LH. If errors are made, the activity returns to the RH to figure out what went wrong; the activity is repeated until the correct muscle memory is developed in which case it becomes part of the familiar LH.

Science is an attempt to find Order. It would not be necessary if people lived in an entirely orderly, explicit, known world. The lived context of science implies Chaos. Theories are reductive and simplifying and help to pick out salient features of a phenomenon. They are always partial truths, though some are more partial than others. The alternative to a certain level of reductionism or partialness would be to simply reproduce the world which of course would be both impossible and unproductive. The test for whether a theory is sufficiently non-partial is whether it is fit for purpose and whether it contributes to human flourishing.

...

Analytic philosophers pride themselves on trying to do away with vagueness. To do so, they tend to jettison context which cannot be brought into fine focus. However, in order to understand things and discern their meaning, it is necessary to have the big picture, the overview, as well as the details. There is no point in having details if the subject does not know what they are details of. Such philosophers also tend to leave themselves out of the picture even when what they are thinking about has reflexive implications. John Locke, for instance, tried to banish the RH from reality. All phenomena having to do with subjective experience he deemed unreal and once remarked about metaphors, a RH phenomenon, that they are “perfect cheats.” Analytic philosophers tend to check the logic of the words on the page and not to think about what those words might say about them. The trick is for them to recognize that they and their theories, which exist in minds, are part of reality too.

The RH test for whether someone actually believes something can be found by examining his actions. If he finds that he must regard his own actions as free, and, in order to get along with other people, must also attribute free will to them and treat them as free agents, then he effectively believes in free will – no matter his LH theoretical commitments.

...

We do not know the origin of life. We do not know how or even if consciousness can emerge from matter. We do not know the nature of 96% of the matter of the universe. Clearly all these things exist. They can provide the subject matter of theories but they continue to exist as theorizing ceases or theories change. Not knowing how something is possible is irrelevant to its actual existence. An inability to explain something is ultimately neither here nor there.

If thought begins and ends with the LH, then thinking has no content – content being provided by experience (RH), and skepticism and nihilism ensue. The LH spins its wheels self-referentially, never referring back to experience. Theory assumes such primacy that it will simply outlaw experiences and data inconsistent with it; a profoundly wrong-headed approach.

...

Gödel’s Theorem proves that not everything true can be proven to be true. This means there is an ineradicable role for faith, hope and intuition in every moderately complex human intellectual endeavor. There is no one set of consistent axioms from which all other truths can be derived.
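Stated a little more precisely (this is the standard form of the first incompleteness theorem, supplied here for reference rather than taken from the text above):

```latex
\textbf{First Incompleteness Theorem (G\"odel, 1931).}
If $T$ is a consistent, effectively axiomatizable theory that interprets
basic arithmetic, then there is an arithmetical sentence $G_T$ such that
$T \nvdash G_T$ and $T \nvdash \neg G_T$; under the standard interpretation,
$G_T$ is true.
```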

Alan Turing’s proof of the halting problem proves that there is no effective procedure for finding effective procedures. Without a mechanical decision procedure, (LH), when it comes to … [more]
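Turing's argument can be sketched in a few lines of Python; `halts` below is the hypothetical decider whose assumed existence leads to contradiction (it is not implementable, which is the point):

```python
def halts(program, inp):
    """Hypothetical total decider: True iff program(inp) eventually halts.
    Turing's theorem says no such function can actually be implemented."""
    raise NotImplementedError("no effective procedure exists")

def diagonal(program):
    # Do the opposite of whatever `halts` predicts for program run on itself.
    if halts(program, program):
        while True:     # predicted to halt, so loop forever
            pass
    return              # predicted to loop, so halt immediately

# Running diagonal on itself forces a contradiction:
#   if halts(diagonal, diagonal) is True,  diagonal(diagonal) loops forever;
#   if halts(diagonal, diagonal) is False, diagonal(diagonal) halts.
# Either way `halts` answered wrongly, so no such decider exists.
```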

september 2018 by nhaliday

Argument, intuition, and recursion

ratty lesswrong clever-rats acmtariat nibble reflection thinking metameta metabuch skeleton reason math thick-thin empirical science rationality epistemic intuition logic economics models theory-practice applicability-prereqs heuristic problem-solving analytical-holistic futurism lens speedometer frontier caching universalism-particularism duality fourier examples ai risk speed robust reinforcement machine-learning social-science tricki meta:rhetoric debate crux composition-decomposition structure convergence zooming neurons checklists advice strategy meta:prediction tetlock

april 2018 by nhaliday

The Hanson-Yudkowsky AI-Foom Debate - Machine Intelligence Research Institute

april 2018 by nhaliday

How Deviant Recent AI Progress Lumpiness?: http://www.overcomingbias.com/2018/03/how-deviant-recent-ai-progress-lumpiness.html

I seem to disagree with most people working on artificial intelligence (AI) risk. While with them I expect rapid change once AI is powerful enough to replace most all human workers, I expect this change to be spread across the world, not concentrated in one main localized AI system. The efforts of AI risk folks to design AI systems whose values won’t drift might stop global AI value drift if there is just one main AI system. But doing so in a world of many AI systems at similar abilities levels requires strong global governance of AI systems, which is a tall order anytime soon. Their continued focus on preventing single system drift suggests that they expect a single main AI system.

The main reason I see to expect relatively local AI progress is if AI progress is unusually lumpy, i.e., arriving in unusually few, large packages rather than in the usual many smaller packages. If one AI team finds a big lump, it might jump way ahead of the other teams.

However, we have a vast literature on the lumpiness of research and innovation more generally, which clearly says that usually most of the value in innovation is found in many small innovations. We have also so far seen this in computer science (CS) and AI, even if there have been historical examples where much value was found in particular big innovations, such as nuclear weapons or the origin of humans.

Apparently many people associated with AI risk, including the star machine learning (ML) researchers that they often idolize, find it intuitively plausible that AI and ML progress is exceptionally lumpy. Such researchers often say, “My project is ‘huge’, and will soon do it all!” A decade ago my ex-co-blogger Eliezer Yudkowsky and I argued here on this blog about our differing estimates of AI progress lumpiness. He recently offered AlphaGo Zero as evidence of AI lumpiness:

...

In this post, let me give another example (beyond two big lumps in a row) of what could change my mind. I offer a clear observable indicator, for which data should be available now: deviant citation lumpiness in recent ML research. One standard measure of research impact is citations; bigger, lumpier developments gain more citations than smaller ones. And it turns out that the lumpiness of citations is remarkably constant across research fields! See this March 3 paper in Science:
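To make "lumpiness of citations" concrete, here is one common concentration measure, the Gini coefficient, applied to two toy citation distributions (the measure and the data are illustrative assumptions, not the statistic used in the Science paper):

```python
def gini(xs):
    """Gini coefficient of a list of nonnegative counts:
    0 means perfectly even; values near 1 mean one item dominates."""
    xs = sorted(xs)
    n, total = len(xs), sum(xs)
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * cum) / (n * total) - (n + 1) / n

even = [10] * 100            # many equally cited papers: not lumpy
lumpy = [1] * 99 + [1000]    # one blockbuster paper: very lumpy
print(round(gini(even), 2))  # -> 0.0
print(round(gini(lumpy), 2)) # -> 0.9
```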

I Still Don’t Get Foom: http://www.overcomingbias.com/2014/07/30855.html

All of which makes it look like I’m the one with the problem; everyone else gets it. Even so, I’m gonna try to explain my problem again, in the hope that someone can explain where I’m going wrong. Here goes.

“Intelligence” just means an ability to do mental/calculation tasks, averaged over many tasks. I’ve always found it plausible that machines will continue to do more kinds of mental tasks better, and eventually be better at pretty much all of them. But what I’ve found it hard to accept is a “local explosion.” This is where a single machine, built by a single project using only a tiny fraction of world resources, goes in a short time (e.g., weeks) from being so weak that it is usually beaten by a single human with the usual tools, to so powerful that it easily takes over the entire world. Yes, smarter machines may greatly increase overall economic growth rates, and yes such growth may be uneven. But this degree of unevenness seems implausibly extreme. Let me explain.

If we count by economic value, humans now do most of the mental tasks worth doing. Evolution has given us a brain chock-full of useful well-honed modules. And the fact that most mental tasks require the use of many modules is enough to explain why some of us are smarter than others. (There’d be a common “g” factor in task performance even with independent module variation.) Our modules aren’t that different from those of other primates, but because ours are different enough to allow lots of cultural transmission of innovation, we’ve out-competed other primates handily.
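Hanson's parenthetical claim, that a common "g" factor emerges even when modules vary independently, is easy to check with a toy simulation (the population size, module counts and task structure below are arbitrary illustrative choices):

```python
import random

random.seed(0)
n_people, n_modules, n_tasks, per_task = 500, 50, 12, 20

# Each person gets independent module qualities; each task draws on many modules.
people = [[random.gauss(0, 1) for _ in range(n_modules)] for _ in range(n_people)]
tasks = [random.sample(range(n_modules), per_task) for _ in range(n_tasks)]
scores = [[sum(p[m] for m in task) for task in tasks] for p in people]

def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Tasks overlap in the modules they use, so scores correlate positively
# across tasks: a "g" factor, despite fully independent module variation.
pairs = [(i, j) for i in range(n_tasks) for j in range(i + 1, n_tasks)]
avg_r = sum(corr([s[i] for s in scores], [s[j] for s in scores])
            for i, j in pairs) / len(pairs)
print(round(avg_r, 2))  # positive, roughly per_task / n_modules
```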

We’ve had computers for over seventy years, and have slowly built up libraries of software modules for them. Like brains, computers do mental tasks by combining modules. An important mental task is software innovation: improving these modules, adding new ones, and finding new ways to combine them. Ideas for new modules are sometimes inspired by the modules we see in our brains. When an innovation team finds an improvement, they usually sell access to it, which gives them resources for new projects, and lets others take advantage of their innovation.

...

In Bostrom’s graph above the line for an initially small project and system has a much higher slope, which means that it becomes in a short time vastly better at software innovation. Better than the entire rest of the world put together. And my key question is: how could it plausibly do that? Since the rest of the world is already trying the best it can to usefully innovate, and to abstract to promote such innovation, what exactly gives one small project such a huge advantage to let it innovate so much faster?

...

In fact, most software innovation seems to be driven by hardware advances, instead of innovator creativity. Apparently, good ideas are available but must usually wait until hardware is cheap enough to support them.

Yes, sometimes architectural choices have wider impacts. But I was an artificial intelligence researcher for nine years, ending twenty years ago, and I never saw an architecture choice make a huge difference, relative to other reasonable architecture choices. For most big systems, overall architecture matters a lot less than getting lots of detail right. Researchers have long wandered the space of architectures, mostly rediscovering variations on what others found before.

Some hope that a small project could be much better at innovation because it specializes in that topic, and much better understands new theoretical insights into the basic nature of innovation or intelligence. But I don’t think those are actually topics where one can usefully specialize much, or where we’ll find much useful new theory. To be much better at learning, the project would instead have to be much better at hundreds of specific kinds of learning. Which is very hard to do in a small project.

What does Bostrom say? Alas, not much. He distinguishes several advantages of digital over human minds, but all software shares those advantages. Bostrom also distinguishes five paths: better software, brain emulation (i.e., ems), biological enhancement of humans, brain-computer interfaces, and better human organizations. He doesn’t think interfaces would work, and sees organizations and better biology as only playing supporting roles.

...

Similarly, while you might imagine someday standing in awe in front of a super intelligence that embodies all the power of a new age, superintelligence just isn’t the sort of thing that one project could invent. As “intelligence” is just the name we give to being better at many mental tasks by using many good mental modules, there’s no one place to improve it. So I can’t see a plausible way one project could increase its intelligence vastly faster than could the rest of the world.

Takeoff speeds: https://sideways-view.com/2018/02/24/takeoff-speeds/

Futurists have argued for years about whether the development of AGI will look more like a breakthrough within a small group (“fast takeoff”), or a continuous acceleration distributed across the broader economy or a large firm (“slow takeoff”).

I currently think a slow takeoff is significantly more likely. This post explains some of my reasoning and why I think it matters. Mostly the post lists arguments I often hear for a fast takeoff and explains why I don’t find them compelling.

(Note: this is not a post about whether an intelligence explosion will occur. That seems very likely to me. Quantitatively I expect it to go along these lines. So e.g. while I disagree with many of the claims and assumptions in Intelligence Explosion Microeconomics, I don’t disagree with the central thesis or with most of the arguments.)

ratty
lesswrong
subculture
miri-cfar
ai
risk
ai-control
futurism
books
debate
hanson
big-yud
prediction
contrarianism
singularity
local-global
speed
speedometer
time
frontier
distribution
smoothness
shift
pdf
economics
track-record
abstraction
analogy
links
wiki
list
evolution
mutation
selection
optimization
search
iteration-recursion
intelligence
metameta
chart
analysis
number
ems
coordination
cooperate-defect
death
values
formal-values
flux-stasis
philosophy
farmers-and-foragers
malthus
scale
studying
innovation
insight
conceptual-vocab
growth-econ
egalitarianism-hierarchy
inequality
authoritarianism
wealth
near-far
rationality
epistemic
biases
cycles
competition
arms
zero-positive-sum
deterrence
war
peace-violence
winner-take-all
technology
moloch
multi
plots
research
science
publishing
humanity
labor
marginal
urban-rural
structure
composition-decomposition
complex-systems
gregory-clark
decentralized
heavy-industry
magnitude
multiplicative
endogenous-exogenous
models
uncertainty
decision-theory
time-prefer
april 2018 by nhaliday

Fisica ingenua o Fisica di senso comune

november 2017 by nhaliday

naive physics

cited by Lucio Russo

you can plug pdf into google translate here: https://translate.google.com/?tr=f&hl=en

nibble
pdf
presentation
physics
mechanics
psychology
cog-psych
heuristic
science
west-hunter
scitariat
lens
the-classics
slides
foreign-lang
mediterranean
multi
intuition
feynman
giants
neurons
biases
error
gotchas
instinct
init
teaching
tutoring
november 2017 by nhaliday

The Function of Reason | Edge.org

august 2017 by nhaliday

https://www.edge.org/conversation/hugo_mercier-the-argumentative-theory

How Social Is Reason?: http://www.overcomingbias.com/2017/08/how-social-is-reason.html

https://gnxp.nofe.me/2017/07/02/open-thread-732017/

Reading The Enigma of Reason. Pretty good so far. Not incredibly surprising to me so far. To be clear, their argument is somewhat orthogonal to the whole ‘rationality’ debate you may be familiar with from Daniel Kahneman and Amos Tversky’s work (e.g., see Heuristics and Biases).

One of the major problems in analysis is that rationality, reflection and ratiocination, are slow and error prone. To get a sense of that, just read ancient Greek science. Eratosthenes may have calculated to within 1% of the true circumference of the world, but Aristotle’s speculations on the nature of reproduction were rather off.

You may be as clever as Eratosthenes, but most people are not. But you probably accept that the world is round and 24,901 miles around. If you are not American you probably are vague on miles anyway. But you know what the social consensus is, and you accept it because it seems reasonable.
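Eratosthenes' calculation itself takes two lines; the round numbers below (a noon shadow angle of 1/50 of a circle at Alexandria, and ~500 miles from Alexandria to Syene) are the commonly quoted simplification, and the ancient stadion-to-mile conversion is contested, so treat this as illustrative:

```python
shadow_angle_deg = 7.2            # 1/50 of a full circle
alexandria_to_syene_miles = 500   # ~5000 stadia, on a rough stadion conversion

# If the sun is directly overhead at Syene while shadows at Alexandria tilt
# 7.2 degrees, the arc between the cities is 7.2/360 of Earth's circumference.
circumference = alexandria_to_syene_miles * (360 / shadow_angle_deg)
print(circumference)  # -> 25000.0 miles

error = abs(circumference - 24901) / 24901
print(f"{error:.1%}")  # -> 0.4%, inside the 1% figure quoted above
```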

One of the points in cultural evolution work is that a lot of the time rather than relying on your own intuition and or reason, it is far more effective and cognitively cheaper to follow social norms of your ingroup. I only bring this up because unfortunately many pathologies of our political and intellectual world today are not really pathologies. That is, they’re not bugs, but features.

https://gnxp.nofe.me/2017/07/23/open-thread-07232017/

Finished The Enigma of Reason. The basic thesis that reasoning is a way to convince people after you’ve already come to a conclusion, that is, rationalization, was already one I shared. That makes sense since one of the coauthors, Dan Sperber, has been influential in the “naturalistic” school of anthropology. If you’ve read books like In Gods We Trust, The Enigma of Reason goes fast. But it is important to note that the cognitive anthropology perspective is useful in things besides religion. I’m thinking in particular of politics.

https://gnxp.nofe.me/2017/07/30/the-delusion-of-reasons-empire/

My point here is that many of our beliefs are arrived at in an intuitive manner, and we find reasons to justify those beliefs. One of the core insights you’ll get from The Enigma of Reason is that rationalization isn’t that big of a misfire or abuse of our capacities. It’s probably just a natural outcome for what and how we use reason in our natural ecology.

Mercier and Sperber contrast their “interactionist” model of what reason is for with an “intellectualist” model. The intellectualist model is rather straightforward. It is one where individual reasoning capacities exist so that one may make correct inferences about the world around us, often using methods that mimic those in abstract elucidated systems such as formal logic or Bayesian reasoning. When reasoning doesn’t work right, it’s because people aren’t using it for its proper purpose. It can be entirely solitary because the tools don’t rely on social input or opinion.

The interactionist model holds that reasoning exists because it is a method of persuasion within social contexts. It is important here to note that the authors do not believe that reasoning is simply a tool for winning debates. That is, increasing your status in a social game. Rather, their overall thesis seems to be in alignment with the idea that cognition of reasoning properly understood is a social process. In this vein they offer evidence of how juries may be superior to judges, and the general examples you find in the “wisdom of the crowds” literature. Overall the authors make a strong case for the importance of diversity of good-faith viewpoints, because they believe that the truth on the whole tends to win out in dialogic formats (that is, if there is a truth; they are rather unclear and muddy about normative disagreements and how those can be resolved).

The major issues tend to crop up when reasoning is used outside of its proper context. One of the literature examples, which you are surely familiar with, in The Enigma of Reason is a psychological experiment where there are two conditions, and the researchers vary the conditions and note wide differences in behavior. In particular, the experiment where psychologists put subjects into a room where someone out of view is screaming for help. When they are alone, they quite often go to see what is wrong immediately. In contrast, when there is a confederate of the psychologists in the room who ignores the screaming, people also tend to ignore the screaming.

The researchers know the cause of the change in behavior. It’s the introduction of the confederate and that person’s behavior. But the subjects, when interviewed, give a wide range of plausible and possible answers. In other words, they are rationalizing their behavior when called to justify it in some way. This is not entirely unexpected; we all know that people are very good at coming up with answers to explain their behavior (often in the best light possible). But that doesn’t mean they truly understand their internal reasons, which seem to be more about intuition.

But much of The Enigma of Reason also recounts how bad people are at coming up with coherent and well-thought-out rationalizations. That is, their “reasons” tend to be ad hoc and weak. We’re not very good at formal logic or even simple syllogistic reasoning. The explanation for this seems to be two-fold.

...

At this point we need to address the elephant in the room: some humans seem extremely good at reasoning in a classical sense. I’m talking about individuals such as Blaise Pascal, Carl Friedrich Gauss, and John von Neumann. Early on in The Enigma of Reason the authors point out the power of reason by alluding to Eratosthenes’s calculation of the circumference of the earth, which was only off by one percent. Myself, I would have mentioned Archimedes, who I suspect was a genius on the same level as the ones mentioned above.

Mercier and Sperber state near the end of the book that math in particular is special and a powerful way to reason. We all know this. In math the axioms are clear and agreed upon, and one can inspect the chain of propositions in a very transparent manner. Mathematics has guard-rails for any human who attempts to engage in reasoning. By reducing the opportunities for humans to make unforced errors, math is the ideal avenue for solitary individual reasoning. But it is exceptional.

Second, though it is not discussed in The Enigma of Reason, there does seem to be variation in general and domain-specific intelligence within the human population. People who flourish in mathematics usually have high general intelligence, but they also often excel at visual-spatial conceptualization.

On the whole, the more intelligent you are, the better you are able to reason. But that does not mean that those with high intelligence are immune from the traps of motivated reasoning or faulty logic. Mercier and Sperber give many examples. Here are two. Linus Pauling was indisputably brilliant, but by the end of his life he was consistently pushing Vitamin C quackery (in part through a very selective interpretation of the scientific literature).* They also point out that much of Isaac Newton’s prodigious intellectual output turns out to have been focused on alchemy and esoteric exegesis which is totally impenetrable. Newton undoubtedly had a first-class mind, but if the domain it was applied to was garbage, then the output was also garbage.

...

Overall, the take-homes are:

Reasoning exists to persuade in a group context through dialogue, not individual ratiocination.

Reasoning can give rise to storytelling when prompted, even if the reasons have no relationship to the underlying causality.

Motivated reasoning emerges because we are not skeptical of the reasons we proffer, but highly skeptical of reasons which refute our own.

The “wisdom of the crowds” is not just a curious phenomenon, but one of the primary reasons that humans have become more socially complex and our brains have grown larger.

Ultimately, if you want to argue someone out of their beliefs…well, good luck with that. But you should read The Enigma of Reason to understand the best strategies (many of them are common sense, and I’ve come to them independently simply through 15 years of having to engage with people of diverse viewpoints).

* R. A. Fisher, who was one of the pioneers of both evolutionary genetics and statistics, famously did not believe there was a connection between smoking and cancer. He himself smoked a pipe regularly.

** From what we know about Blaise Pascal and Isaac Newton, their personalities were such that they’d probably be killed or expelled from a hunter-gatherer band.

books
summary
psychology
social-psych
cog-psych
anthropology
rationality
biases
epistemic
thinking
neurons
realness
truth
info-dynamics
language
speaking
persuasion
dark-arts
impro
roots
ideas
speculation
hypocrisy
intelligence
eden
philosophy
multi
review
critique
ratty
hanson
org:edge
video
interview
communication
insight
impetus
hidden-motives
X-not-about-Y
signaling
🤖
metameta
metabuch
dennett
meta:rhetoric
gnxp
scitariat
open-things
giants
fisher
old-anglo
history
iron-age
mediterranean
the-classics
reason
religion
theos
noble-lie
intuition
instinct
farmers-and-foragers
egalitarianism-hierarchy
early-modern
britain
europe
gallic
hari-seldon
theory-of-mind
parallax
darwinian
evolution
telos-atelos
intricacy
evopsych
chart
traces

august 2017 by nhaliday

Geometers, Scribes, and the structure of intelligence | Compass Rose

july 2017 by nhaliday

cf related comments by Roger T. Ames (I highlighted them) on Greeks vs. Chinese, spatiality leading to objectivity, etc.

ratty
ssc
thinking
rationality
neurons
intelligence
iq
psychometrics
psychology
cog-psych
spatial
math
geometry
roots
chart
insight
law
social-norms
contracts
coordination
language
religion
judaism
adversarial
programming
structure
ideas
history
iron-age
mediterranean
the-classics
alien-character
intuition
lens
n-factor
thick-thin
systematic-ad-hoc
analytical-holistic
metameta
metabuch
🤖
rigidity
info-dynamics
flexibility
things
legacy
investing
securities
trivia
wealth
age-generation
s:*
wordlessness
problem-solving
rigor
discovery
🔬
science
revolution
reason
apollonian-dionysian
essence-existence
elegance

Understanding statistics through interactive visualizations

explanation list visualization gotchas paradox stats methodology hypothesis-testing visual-understanding better-explained links regression-to-mean metabuch examples data-science street-fighting intuition ground-up nitty-gritty

march 2017 by nhaliday

A map of the Tricki | Tricki

gowers wiki reference math problem-solving proofs structure list top-n synthesis hi-order-bits tricks yoga scholar tricki metabuch 👳 toolkit unit duplication insight intuition meta:math better-explained metameta wisdom skeleton p:whenever s:*** chart knowledge org:mat elegance

february 2017 by nhaliday

Do grad school students remember everything they were taught in college all the time? - Quora

q-n-a qra grad-school learning synthesis hi-order-bits neurons physics lens analogy cartoons links 🎓 scholar gowers mathtariat feynman giants quotes games nibble thinking zooming retention meta:research big-picture skeleton s:** p:whenever wire-guided narrative intuition lesswrong commentary ground-up limits examples problem-solving info-dynamics knowledge studying ideas the-trenches chart

february 2017 by nhaliday

Einstein's Most Famous Thought Experiment

february 2017 by nhaliday

When Einstein abandoned an emission theory of light, he had also to abandon the hope that electrodynamics could be made to conform to the principle of relativity by the normal sorts of modifications to electrodynamic theory that occupied the theorists of the second half of the 19th century. Instead Einstein knew he must resort to extraordinary measures. He was willing to seek realization of his goal in a re-examination of our basic notions of space and time. Einstein concluded his report on his youthful thought experiment:

"One sees that in this paradox the germ of the special relativity theory is already contained. Today everyone knows, of course, that all attempts to clarify this paradox satisfactorily were condemned to failure as long as the axiom of the absolute character of time, or of simultaneity, was rooted unrecognized in the unconscious. To recognize clearly this axiom and its arbitrary character already implies the essentials of the solution of the problem."

einstein
giants
physics
history
stories
gedanken
exposition
org:edu
electromag
relativity
nibble
innovation
novelty
the-trenches
synchrony
discovery
🔬
org:junk
science
absolute-relative
visuo
explanation
ground-up
clarity
state
causation
intuition
ideas
mostly-modern
pre-ww2
marginal
grokkability-clarity
"One sees that in this paradox the germ of the special relativity theory is already contained. Today everyone knows, of course, that all attempts to clarify this paradox satisfactorily were condemned to failure as long as the axiom of the absolute character of time, or of simultaneity, was rooted unrecognized in the unconscious. To recognize clearly this axiom and its arbitrary character already implies the essentials of the solution of the problem."

february 2017 by nhaliday

general topology - What should be the intuition when working with compactness? - Mathematics Stack Exchange

january 2017 by nhaliday

http://math.stackexchange.com/questions/485822/why-is-compactness-so-important

The situation with compactness is sort of like the above. It turns out that finiteness, which you think of as one concept (in the same way that you think of "Foo" as one concept above), is really two concepts: discreteness and compactness. You've never seen these concepts separated before, though. When people say that compactness is like finiteness, they mean that compactness captures part of what it means to be finite in the same way that shortness captures part of what it means to be Foo.

--

As many have said, compactness is sort of a topological generalization of finiteness. And this is true in a deep sense, because topology deals with open sets, and this means that we often "care about how something behaves on an open set", and for compact spaces this means that there are only finitely many possible behaviors.

--

Compactness does for continuous functions what finiteness does for functions in general.

If a set A is finite then every function f:A→R has a max and a min, and every function f:A→R^n is bounded. If A is compact, then every continuous function from A to R has a max and a min and every continuous function from A to R^n is bounded.

If A is finite then every sequence of members of A has a subsequence that is eventually constant, and "eventually constant" is the only kind of convergence you can talk about without talking about a topology on the set. If A is compact, then every sequence of members of A has a convergent subsequence.
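A numerical sketch of my own (not from the thread): the fact that every sequence in the compact interval [0, 1] has a convergent subsequence can be mimicked with the usual bisection argument, keeping at each step the half-interval that holds the most terms.

```python
def limit_point(points, tol=1e-6):
    """Bisection search for an accumulation point of a sequence in [0, 1].

    At each step, keep the half-interval holding more sample points,
    a finite stand-in for "the half containing infinitely many terms".
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        left = [x for x in points if x <= mid]
        right = [x for x in points if x > mid]
        if len(left) >= len(right):
            points, hi = left, mid
        else:
            points, lo = right, mid
    return (lo + hi) / 2

# A sequence that oscillates around 1/2 and accumulates there:
xs = [0.5 + (-1) ** n / (n + 2) for n in range(1000)]
print(limit_point(xs))  # close to 0.5
```

On a non-compact set the same trick can fail: in the open interval (0, 1), the sequence 1/n has its only accumulation point at 0, which lies outside the set.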

q-n-a
overflow
math
topology
math.GN
concept
finiteness
atoms
intuition
oly
mathtariat
multi
discrete
gowers
motivation
synthesis
hi-order-bits
soft-question
limits
things
nibble
definition
convergence
abstraction
span-cover

pr.probability - "Entropy" proof of Brunn-Minkowski Inequality? - MathOverflow

q-n-a overflow math information-theory wormholes proofs geometry math.MG estimate gowers mathtariat dimensionality limits intuition insight stat-mech concentration-of-measure 👳 cartoons math.FA additive-combo measure entropy-like nibble tensors coarse-fine brunn-minkowski boltzmann high-dimension curvature convexity-curvature
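The inequality in the title states that vol(A + B)^(1/n) >= vol(A)^(1/n) + vol(B)^(1/n) for the Minkowski sum A + B. A quick sanity check (my own sketch, unrelated to the entropy proof being asked about) is possible for axis-aligned boxes, where the Minkowski sum is again a box with summed side lengths:

```python
import math
import random

def vol(sides):
    return math.prod(sides)

def brunn_minkowski_holds(a, b, eps=1e-12):
    """vol(A+B)^(1/n) >= vol(A)^(1/n) + vol(B)^(1/n) for boxes A, B
    with side lengths a and b (their Minkowski sum has sides a_i + b_i)."""
    n = len(a)
    lhs = vol([ai + bi for ai, bi in zip(a, b)]) ** (1 / n)
    rhs = vol(a) ** (1 / n) + vol(b) ** (1 / n)
    return lhs >= rhs - eps

random.seed(0)
checks = []
for _ in range(1000):
    n = random.randint(1, 6)
    a = [random.uniform(0.1, 5) for _ in range(n)]
    b = [random.uniform(0.1, 5) for _ in range(n)]
    checks.append(brunn_minkowski_holds(a, b))
print(all(checks))  # True
```

For boxes this reduces to superadditivity of the geometric mean, with equality exactly when the two boxes are homothetic.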

january 2017 by nhaliday

teaching - Intuitive explanation for dividing by $n-1$ when calculating standard deviation? - Cross Validated

january 2017 by nhaliday

The standard deviation calculated with a divisor of n-1 is a standard deviation calculated from the sample as an estimate of the standard deviation of the population from which the sample was drawn. Because the observed values fall, on average, closer to the sample mean than to the population mean, the standard deviation which is calculated using deviations from the sample mean underestimates the desired standard deviation of the population. Using n-1 instead of n as the divisor corrects for that by making the result a little bit bigger.

Note that the correction has a larger proportional effect when n is small than when it is large, which is what we want because when n is larger the sample mean is likely to be a good estimator of the population mean.

...

A common one is that the definition of variance (of a distribution) is the second moment recentered around a known, definite mean, whereas the estimator uses an estimated mean. This loss of a degree of freedom (given the mean, you can reconstitute the dataset with knowledge of just n−1 of the data values) requires the use of n−1 rather than n to "adjust" the result.
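The bias described above is easy to see in a small simulation (an illustrative sketch of my own, not from the thread): with samples of size n = 5 from a distribution with variance 4, dividing the sum of squared deviations by n underestimates on average, while dividing by n − 1 is correct on average.

```python
import random

random.seed(0)
n, trials = 5, 20000          # small samples from N(0, 2^2), true variance 4
biased = unbiased = 0.0
for _ in range(trials):
    xs = [random.gauss(0, 2) for _ in range(n)]
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    biased += ss / n / trials          # divisor n: too small on average
    unbiased += ss / (n - 1) / trials  # divisor n - 1: corrected
print(biased)    # about (n-1)/n * 4 = 3.2
print(unbiased)  # about 4.0
```

Note the gap matches the factor (n − 1)/n mentioned in the answer: it is large for tiny samples and negligible for large ones.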

q-n-a
overflow
stats
acm
intuition
explanation
bias-variance
methodology
moments
nibble
degrees-of-freedom
sampling-bias
generalization
dimensionality
ground-up
intricacy

linear algebra - What's an intuitive way to think about the determinant? - Mathematics Stack Exchange

january 2017 by nhaliday

goes through the standard volume of parallelepiped/multilinear alternating map formulations
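Both formulations can be checked concretely in 3 dimensions (a hedged sketch of my own): the determinant equals the scalar triple product u · (v × w), the signed volume of the parallelepiped spanned by the rows, and swapping two rows flips the sign (the alternating property).

```python
def det3(M):
    """Determinant of a 3x3 matrix via cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def triple(u, v, w):
    """Scalar triple product u . (v x w): signed parallelepiped volume."""
    cross = (v[1] * w[2] - v[2] * w[1],
             v[2] * w[0] - v[0] * w[2],
             v[0] * w[1] - v[1] * w[0])
    return sum(ui * ci for ui, ci in zip(u, cross))

u, v, w = (1, 2, 0), (0, 1, 3), (2, 0, 1)
print(det3([u, v, w]))                      # 13, the signed volume
print(det3([u, v, w]) == triple(u, v, w))   # True
print(det3([v, u, w]) == -det3([u, v, w]))  # True: alternating in the rows
```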

q-n-a
overflow
intuition
math
linear-algebra
ground-up
explanation
characterization
spatial
measure
nibble
identity

Dvoretzky's theorem - Wikipedia

january 2017 by nhaliday

In mathematics, Dvoretzky's theorem is an important structural theorem about normed vector spaces proved by Aryeh Dvoretzky in the early 1960s, answering a question of Alexander Grothendieck. In essence, it says that every sufficiently high-dimensional normed vector space will have low-dimensional subspaces that are approximately Euclidean. Equivalently, every high-dimensional bounded symmetric convex set has low-dimensional sections that are approximately ellipsoids.

http://mathoverflow.net/questions/143527/intuitive-explanation-of-dvoretzkys-theorem

http://mathoverflow.net/questions/46278/unexpected-applications-of-dvoretzkys-theorem

math
math.FA
inner-product
levers
characterization
geometry
math.MG
concentration-of-measure
multi
q-n-a
overflow
intuition
examples
proofs
dimensionality
gowers
mathtariat
tcstariat
quantum
quantum-info
norms
nibble
high-dimension
wiki
reference
curvature
convexity-curvature
tcs

mg.metric geometry - How to explain the concentration-of-measure phenomenon intuitively? - MathOverflow

q-n-a overflow soft-question math geometry probability intuition tcstariat orourke concentration-of-measure dimensionality tcs math.MG random pigeonhole-markov nibble paradox novelty high-dimension s:** spatial elegance
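One standard intuition pump for the phenomenon in the title, sketched numerically (my own illustration, not from the thread): for a uniform random point on the sphere S^(n-1), any single coordinate concentrates near 0, with typical size about 1/sqrt(n), so in high dimension almost all of the sphere's mass lies near any given "equator".

```python
import math
import random

random.seed(0)

def first_coord(n):
    """First coordinate of a uniform random point on the sphere S^(n-1),
    sampled by normalizing a standard Gaussian vector."""
    g = [random.gauss(0, 1) for _ in range(n)]
    r = math.sqrt(sum(x * x for x in g))
    return g[0] / r

mean_abs = {n: sum(abs(first_coord(n)) for _ in range(1000)) / 1000
            for n in (3, 30, 1000)}
print(mean_abs)  # shrinks roughly like 1/sqrt(n) as n grows
```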

january 2017 by nhaliday

ag.algebraic geometry - Why do combinatorial abstractions of geometric objects behave so well? - MathOverflow

q-n-a overflow math math.CO geometry synthesis intuition soft-question todo regularity math.AG math.RT polynomials positivity monotonicity nibble abstraction signum guessing

january 2017 by nhaliday

dimensionality reduction - Relationship between SVD and PCA. How to use SVD to perform PCA? - Cross Validated

q-n-a overflow stats data-science intuition ground-up linear-algebra methodology explanation confusion links faq exploratory large-factor nibble s:null matrix-factorization
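A compact version of the recipe discussed in that thread (a hedged sketch of my own, assuming NumPy is available): center the data, take a thin SVD X_c = U S V^T; then the rows of V^T are the principal directions, S^2/(n-1) are the explained variances, and U S are the PC scores.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

Xc = X - X.mean(axis=0)                            # 1. center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)  # 2. thin SVD

components = Vt                        # rows: principal directions
explained_var = S ** 2 / (len(X) - 1)  # eigenvalues of the covariance
scores = Xc @ Vt.T                     # PC coordinates, equal to U * S

# Cross-check against the covariance eigendecomposition:
evals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))
print(np.allclose(np.sort(evals)[::-1], explained_var))  # True
print(np.allclose(scores, U * S))                        # True
```

The SVD route avoids ever forming the covariance matrix, which is the main numerical argument made in that thread.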

january 2017 by nhaliday

reference request - Why are two "random" vectors in $\mathbb{R}^n$ approximately orthogonal for large $n$? - MathOverflow

q-n-a overflow math probability tidbits intuition cartoons math.MG spatial geometry linear-algebra mathtariat dimensionality magnitude concentration-of-measure probabilistic-method random separation inner-product nibble relaxation paradox novelty high-dimension direction guessing

january 2017 by nhaliday

fa.functional analysis - Almost orthogonal vectors - MathOverflow

january 2017 by nhaliday

- you can pick exp(Θ(nε^2)) ε-almost orthogonal unit vectors in R^n w/ probabilistic method

- can also use Johnson-Lindenstrauss
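The probabilistic method in the first bullet can be sketched numerically (my own illustration, with arbitrary choices n = 500 and k = 50): independent random unit vectors in high dimension are pairwise nearly orthogonal, with inner products of typical size about 1/sqrt(n).

```python
import math
import random

random.seed(0)
n, k = 500, 50                # 50 random unit vectors in R^500

def unit_vector(n):
    """A uniform random unit vector, via a normalized Gaussian vector."""
    g = [random.gauss(0, 1) for _ in range(n)]
    r = math.sqrt(sum(x * x for x in g))
    return [x / r for x in g]

vs = [unit_vector(n) for _ in range(k)]
worst = max(abs(sum(a * b for a, b in zip(vs[i], vs[j])))
            for i in range(k) for j in range(i + 1, k))
print(worst < 0.3)  # True: every pair is nearly orthogonal, |cos| ~ 1/sqrt(n)
```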

q-n-a
overflow
math
tidbits
intuition
geometry
spatial
cartoons
dimensionality
linear-algebra
magnitude
gowers
mathtariat
tcstariat
math.CO
probabilistic-method
embeddings
math.MG
random
separation
inner-product
nibble
relaxation
paradox
novelty
high-dimension
direction
shift

"Surely You're Joking, Mr. Feynman!": Adventures of a Curious Character ... - Richard P. Feynman - Google Books

january 2017 by nhaliday

Actually, there was a certain amount of genuine quality to my guesses. I had a scheme, which I still use today when somebody is explaining something that I’m trying to understand: I keep making up examples. For instance, the mathematicians would come in with a terrific theorem, and they’re all excited. As they’re telling me the conditions of the theorem, I construct something which fits all the conditions. You know, you have a set (one ball)—disjoint (two balls). Then the balls turn colors, grow hairs, or whatever, in my head as they put more conditions on. Finally they state the theorem, which is some dumb thing about the ball which isn’t true for my hairy green ball thing, so I say, “False!”

physics
math
feynman
thinking
empirical
examples
lens
intuition
operational
stories
metabuch
visual-understanding
thurston
hi-order-bits
geometry
topology
cartoons
giants
👳
nibble
the-trenches
metameta
meta:math
s:**
quotes
gbooks
elegance
january 2017 by nhaliday

ca.analysis and odes - What's a nice argument that shows the volume of the unit ball in $\mathbb{R}^n$ approaches 0? - MathOverflow

q-n-a overflow intuition math geometry spatial dimensionality limits tidbits math.MG measure magnitude visual-understanding oly concentration-of-measure pigeonhole-markov nibble fedja coarse-fine novelty high-dimension elegance guessing

january 2017 by nhaliday

pr.probability - What is convolution intuitively? - MathOverflow

january 2017 by nhaliday

I remember as a graduate student that Ingrid Daubechies frequently referred to convolution by a bump function as "blurring" - its effect on images is similar to what a short-sighted person experiences when taking off his or her glasses (and, indeed, if one works through the geometric optics, convolution is not a bad first approximation for this effect). I found this to be very helpful, not just for understanding convolution per se, but as a lesson that one should try to use physical intuition to model mathematical concepts whenever one can.

More generally, if one thinks of functions as fuzzy versions of points, then convolution is the fuzzy version of addition (or sometimes multiplication, depending on the context). The probabilistic interpretation is one example of this (where the fuzz is a probability distribution), but one can also have signed, complex-valued, or vector-valued fuzz, of course.
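The probabilistic reading ("convolution is fuzzy addition") can be checked directly: the distribution of a sum of independent variables is the convolution of their distributions. A minimal sketch with two fair dice (NumPy assumed):

```python
import numpy as np

# Convolution as "fuzzy addition": for independent X, Y the mass
# function of X + Y is the convolution of their mass functions.
die = np.full(6, 1 / 6)          # fair die, outcomes 1..6

total = np.convolve(die, die)    # distribution of X + Y over sums 2..12
print(total[7 - 2])              # P(X + Y = 7) = 6/36
```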

q-n-a
overflow
math
concept
atoms
intuition
motivation
gowers
visual-understanding
aphorism
soft-question
tidbits
👳
mathtariat
cartoons
ground-up
metabuch
analogy
nibble
yoga
neurons
retrofit
optics
concrete
s:*
multiplicative
fourier

january 2017 by nhaliday

soft question - Thinking and Explaining - MathOverflow

january 2017 by nhaliday

- good question from Bill Thurston

- great answers by Terry Tao, fedja, Minhyong Kim, gowers, etc.

Terry Tao:

- symmetry as blurring/vibrating/wobbling, scale invariance

- anthropomorphization, adversarial perspective for estimates/inequalities/quantifiers, spending/economy

fedja walks through his thought process in another answer

Minhyong Kim: anthropology of mathematical philosophizing

Per Vognsen: normality as isotropy

comment: conjugate subgroup gHg^-1 ~ "H but somewhere else in G"

gowers: hidden things in basic mathematics/arithmetic

comment by Ryan Budney: x sin(x) via x -> (x, sin(x)), (x, y) -> xy

I kinda get what he's talking about but needed to use Mathematica to get the initial visualization down.

To remind myself later:

- xy can be easily visualized by juxtaposing the two parabolae x^2 and -x^2 diagonally

- x sin(x) can be visualized along that surface by moving your finger along the line (x, 0) but adding some oscillations in y direction according to sin(x)
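The two reminders above can be sanity-checked numerically (a small sketch; the identity xy = ((x+y)² − (x−y)²)/4 is the algebraic form of the "two parabolae juxtaposed diagonally" picture, and Budney's construction composes x ↦ (x, sin x) with the surface (x, y) ↦ xy):

```python
import math

# Budney's factoring: x*sin(x) is the curve x -> (x, sin x) pushed
# through the surface (x, y) -> x*y; and x*y itself is a difference
# of two parabolas along the diagonals (the "juxtaposed parabolae").
surface = lambda x, y: x * y
curve = lambda x: (x, math.sin(x))

for x in [0.0, 0.5, 1.0, 2.0, -3.0]:
    assert math.isclose(surface(*curve(x)), x * math.sin(x))
    y = 1.7
    assert math.isclose(x * y, ((x + y) ** 2 - (x - y) ** 2) / 4)
```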

q-n-a
soft-question
big-list
intuition
communication
teaching
math
thinking
writing
thurston
lens
overflow
synthesis
hi-order-bits
👳
insight
meta:math
clarity
nibble
giants
cartoons
gowers
mathtariat
better-explained
stories
the-trenches
problem-solving
homogeneity
symmetry
fedja
examples
philosophy
big-picture
vague
isotropy
reflection
spatial
ground-up
visual-understanding
polynomials
dimensionality
math.GR
worrydream
scholar
🎓
neurons
metabuch
yoga
retrofit
mental-math
metameta
wisdom
wordlessness
oscillation
operational
adversarial
quantifiers-sums
exposition
explanation
tricki
concrete
s:***
manifolds
invariance
dynamical
info-dynamics
cool
direction
elegance
heavyweights
analysis
guessing
grokkability-clarity
technical-writing

january 2017 by nhaliday

Fourier transform for dummies - Mathematics Stack Exchange

q-n-a big-list fourier intuition math visual-understanding motivation overflow soft-question ground-up nibble qra concept init IEEE space sky models occam parsimony stories history iron-age mediterranean the-classics

january 2017 by nhaliday

q-n-a big-list fourier intuition math visual-understanding motivation overflow soft-question ground-up nibble qra concept init IEEE space sky models occam parsimony stories history iron-age mediterranean the-classics

january 2017 by nhaliday

Shtetl-Optimized » Blog Archive » Why I Am Not An Integrated Information Theorist (or, The Unconscious Expander)

january 2017 by nhaliday

In my opinion, how to construct a theory that tells us which physical systems are conscious and which aren’t—giving answers that agree with “common sense” whenever the latter renders a verdict—is one of the deepest, most fascinating problems in all of science. Since I don’t know a standard name for the problem, I hereby call it the Pretty-Hard Problem of Consciousness. Unlike with the Hard Hard Problem, I don’t know of any philosophical reason why the Pretty-Hard Problem should be inherently unsolvable; but on the other hand, humans seem nowhere close to solving it (if we had solved it, then we could reduce the abortion, animal rights, and strong AI debates to “gentlemen, let us calculate!”).

Now, I regard IIT as a serious, honorable attempt to grapple with the Pretty-Hard Problem of Consciousness: something concrete enough to move the discussion forward. But I also regard IIT as a failed attempt on the problem. And I wish people would recognize its failure, learn from it, and move on.

In my view, IIT fails to solve the Pretty-Hard Problem because it unavoidably predicts vast amounts of consciousness in physical systems that no sane person would regard as particularly “conscious” at all: indeed, systems that do nothing but apply a low-density parity-check code, or other simple transformations of their input data. Moreover, IIT predicts not merely that these systems are “slightly” conscious (which would be fine), but that they can be unboundedly more conscious than humans are.

To justify that claim, I first need to define Φ. Strikingly, despite the large literature about Φ, I had a hard time finding a clear mathematical definition of it—one that not only listed formulas but fully defined the structures that the formulas were talking about. Complicating matters further, there are several competing definitions of Φ in the literature, including Φ_DM (discrete memoryless), Φ_E (empirical), and Φ_AR (autoregressive), which apply in different contexts (e.g., some take time evolution into account and others don’t). Nevertheless, I think I can define Φ in a way that will make sense to theoretical computer scientists. And crucially, the broad point I want to make about Φ won’t depend much on the details of its formalization anyway.

We consider a discrete system in a state x = (x_1, …, x_n) ∈ S^n, where S is a finite alphabet (the simplest case is S = {0,1}). We imagine that the system evolves via an “updating function” f: S^n → S^n. Then the question that interests us is whether the x_i’s can be partitioned into two sets A and B, of roughly comparable size, such that the updates to the variables in A don’t depend very much on the variables in B and vice versa. If such a partition exists, then we say that the computation of f does not involve “global integration of information,” which on Tononi’s theory is a defining aspect of consciousness.
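The partition question in the excerpt can be made concrete with a toy brute-force search (a sketch of the decoupling idea only, not Tononi's actual Φ; the example update function is made up for illustration):

```python
from itertools import combinations, product

# Toy sketch (NOT Tononi's actual Phi): given f: {0,1}^n -> {0,1}^n,
# does some bipartition (A, B) exist where each bit in A updates
# without looking at B, and vice versa? If yes, f involves no
# "global integration of information" in the excerpt's sense.

def depends_on(f, n, i, j):
    """Does output bit i of f depend on input bit j?"""
    for x in product([0, 1], repeat=n):
        y = list(x)
        y[j] ^= 1  # flip bit j and see whether output bit i changes
        if f(tuple(x))[i] != f(tuple(y))[i]:
            return True
    return False

def has_decoupling_partition(f, n):
    # A decoupling partition always has a side of size <= n/2.
    for size in range(1, n // 2 + 1):
        for A in combinations(range(n), size):
            B = [j for j in range(n) if j not in A]
            if all(not depends_on(f, n, i, j) for i in A for j in B) and \
               all(not depends_on(f, n, i, j) for i in B for j in A):
                return A
    return None

# Two independent 2-bit swap subsystems: bits (0,1) and (2,3) never interact.
f = lambda x: (x[1], x[0], x[3], x[2])
print(has_decoupling_partition(f, 4))   # -> (0, 1)
```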

aaronson
tcstariat
philosophy
dennett
interdisciplinary
critique
nibble
org:bleg
within-without
the-self
neuro
psychology
cog-psych
metrics
nitty-gritty
composition-decomposition
complex-systems
cybernetics
bits
information-theory
entropy-like
forms-instances
empirical
walls
arrows
math.DS
structure
causation
quantitative-qualitative
number
extrema
optimization
abstraction
explanation
summary
degrees-of-freedom
whole-partial-many
network-structure
systematic-ad-hoc
tcs
complexity
hardness
no-go
computation
measurement
intricacy
examples
counterexample
coding-theory
linear-algebra
fields
graphs
graph-theory
expanders
math
math.CO
properties
local-global
intuition
error
definition
coupling-cohesion

january 2017 by nhaliday

soft question - Why does Fourier analysis of Boolean functions "work"? - Theoretical Computer Science Stack Exchange

december 2016 by nhaliday

Here is my point of view, which I learned from Guy Kindler, though someone more experienced can probably give a better answer: Consider the linear space of functions f: {0,1}^n -> R and consider a linear operator of the form σ_w (for w in {0,1}^n), that maps a function f(x) as above to the function f(x+w). In many of the questions of TCS, there is an underlying need to analyze the effects that such operators have on certain functions.

Now, the point is that the Fourier basis is the basis that diagonalizes all those operators at the same time, which makes the analysis of those operators much simpler. More generally, the Fourier basis diagonalizes the convolution operator, which also underlies many of those questions. Thus, Fourier analysis is likely to be effective whenever one needs to analyze those operators.
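The diagonalization claim is easy to verify on a small cube (a sketch with n = 3; the choice of S and w is arbitrary): the parity character χ_S(x) = (−1)^⟨S,x⟩ is an eigenfunction of every shift operator (σ_w f)(x) = f(x + w), with eigenvalue (−1)^⟨S,w⟩.

```python
import numpy as np
from itertools import product

# The Fourier (character) basis on {0,1}^n diagonalizes every shift
# sigma_w: chi_S(x + w) = (-1)^<S,w> * chi_S(x).
n = 3
cube = list(product([0, 1], repeat=n))

def chi(S):
    # Parity character chi_S as a vector of values over the cube.
    return np.array([(-1) ** sum(s & x for s, x in zip(S, xs)) for xs in cube])

def shift(f, w):
    # (sigma_w f)(x) = f(x + w), addition mod 2 (i.e. XOR).
    idx = {x: k for k, x in enumerate(cube)}
    return np.array([f[idx[tuple(a ^ b for a, b in zip(x, w))]] for x in cube])

S, w = (1, 0, 1), (1, 1, 0)
f = chi(S)
eigval = (-1) ** sum(s & wi for s, wi in zip(S, w))
print(np.array_equal(shift(f, w), eigval * f))   # -> True
```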

q-n-a
math
tcs
synthesis
boolean-analysis
fourier
👳
tidbits
motivation
intuition
linear-algebra
overflow
hi-order-bits
insight
curiosity
ground-up
arrows
nibble
s:*
elegance
guessing

december 2016 by nhaliday

gt.geometric topology - Intuitive crutches for higher dimensional thinking - MathOverflow

december 2016 by nhaliday

Terry Tao:

I can't help you much with high-dimensional topology - it's not my field, and I've not picked up the various tricks topologists use to get a grip on the subject - but when dealing with the geometry of high-dimensional (or infinite-dimensional) vector spaces such as R^n, there are plenty of ways to conceptualise these spaces that do not require visualising more than three dimensions directly.

For instance, one can view a high-dimensional vector space as a state space for a system with many degrees of freedom. A megapixel image, for instance, is a point in a million-dimensional vector space; by varying the image, one can explore the space, and various subsets of this space correspond to various classes of images.

One can similarly interpret sound waves, a box of gases, an ecosystem, a voting population, a stream of digital data, trials of random variables, the results of a statistical survey, a probabilistic strategy in a two-player game, and many other concrete objects as states in a high-dimensional vector space, and various basic concepts such as convexity, distance, linearity, change of variables, orthogonality, or inner product can have very natural meanings in some of these models (though not in all).

It can take a bit of both theory and practice to merge one's intuition for these things with one's spatial intuition for vectors and vector spaces, but it can be done eventually (much as after one has enough exposure to measure theory, one can start merging one's intuition regarding cardinality, mass, length, volume, probability, cost, charge, and any number of other "real-life" measures).

For instance, the fact that most of the mass of a unit ball in high dimensions lurks near the boundary of the ball can be interpreted as a manifestation of the law of large numbers, using the interpretation of a high-dimensional vector space as the state space for a large number of trials of a random variable.

More generally, many facts about low-dimensional projections or slices of high-dimensional objects can be viewed from a probabilistic, statistical, or signal processing perspective.
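The "mass lurks near the boundary" fact admits a quick Monte Carlo check (a sketch; the dimension and sample size are arbitrary choices, and NumPy is assumed): for x uniform in the unit ball of R^n, the radius has CDF r^n, so P(|x| > 0.9) = 1 − 0.9^n, which tends to 1 rapidly.

```python
import numpy as np

# Empirical check of "most of the mass lurks near the boundary":
# for x uniform in the unit ball of R^n, P(|x| > 0.9) = 1 - 0.9^n.
rng = np.random.default_rng(0)
n, m = 50, 100_000

# Uniform in the ball: Gaussian direction times radius U^(1/n).
x = rng.standard_normal((m, n))
x *= (rng.random(m) ** (1 / n) / np.linalg.norm(x, axis=1))[:, None]

frac = np.mean(np.linalg.norm(x, axis=1) > 0.9)
print(frac, 1 - 0.9 ** n)   # empirical vs exact, both near 0.995
```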

Scott Aaronson:

Here are some of the crutches I've relied on. (Admittedly, my crutches are probably much more useful for theoretical computer science, combinatorics, and probability than they are for geometry, topology, or physics. On a related note, I personally have a much easier time thinking about R^n than about, say, R^4 or R^5!)

1. If you're trying to visualize some 4D phenomenon P, first think of a related 3D phenomenon P', and then imagine yourself as a 2D being who's trying to visualize P'. The advantage is that, unlike with the 4D vs. 3D case, you yourself can easily switch between the 3D and 2D perspectives, and can therefore get a sense of exactly what information is being lost when you drop a dimension. (You could call this the "Flatland trick," after the most famous literary work to rely on it.)

2. As someone else mentioned, discretize! Instead of thinking about R^n, think about the Boolean hypercube {0,1}^n, which is finite and usually easier to get intuition about. (When working on problems, I often find myself drawing {0,1}^4 on a sheet of paper by drawing two copies of {0,1}^3 and then connecting the corresponding vertices.)

3. Instead of thinking about a subset S⊆R^n, think about its characteristic function f:R^n→{0,1}. I don't know why that trivial perspective switch makes such a big difference, but it does ... maybe because it shifts your attention to the process of computing f, and makes you forget about the hopeless task of visualizing S!

4. One of the central facts about R^n is that, while it has "room" for only n orthogonal vectors, it has room for exp(n) almost-orthogonal vectors. Internalize that one fact, and so many other properties of R^n (for example, that the n-sphere resembles a "ball with spikes sticking out," as someone mentioned before) will suddenly seem non-mysterious. In turn, one way to internalize the fact that R^n has so many almost-orthogonal vectors is to internalize Shannon's theorem that there exist good error-correcting codes.

5. To get a feel for some high-dimensional object, ask questions about the behavior of a process that takes place on that object. For example: if I drop a ball here, which local minimum will it settle into? How long does this random walk on {0,1}^n take to mix?

Gil Kalai:

This is a slightly different point, but Vitali Milman, who works in high-dimensional convexity, likes to draw high-dimensional convex bodies in a non-convex way. This is to convey the point that if you take the convex hull of a few points on the unit sphere of R^n, then for large n very little of the measure of the convex body is anywhere near the corners, so in a certain sense the body is a bit like a small sphere with long thin "spikes".

q-n-a
intuition
math
visual-understanding
list
discussion
thurston
tidbits
aaronson
tcs
geometry
problem-solving
yoga
👳
big-list
metabuch
tcstariat
gowers
mathtariat
acm
overflow
soft-question
levers
dimensionality
hi-order-bits
insight
synthesis
thinking
models
cartoons
coding-theory
information-theory
probability
concentration-of-measure
magnitude
linear-algebra
boolean-analysis
analogy
arrows
lifts-projections
measure
markov
sampling
shannon
conceptual-vocab
nibble
degrees-of-freedom
worrydream
neurons
retrofit
oscillation
paradox
novelty
tricki
concrete
high-dimension
s:***
manifolds
direction
curvature
convexity-curvature
elegance
guessing

december 2016 by nhaliday

The probabilistic heuristic justification of the ABC conjecture | What's new

open-problems gowers thinking tricks intuition probability math tidbits yoga mathtariat models heuristic math.NT cartoons nibble org:bleg borel-cantelli big-surf additive multiplicative questions guessing

october 2016 by nhaliday

nt.number theory - When has the Borel-Cantelli heuristic been wrong? - MathOverflow

intuition counterexample list q-n-a tidbits math thinking tricks synthesis yoga probability models big-list heuristic overflow levers math.NT rigor cartoons nibble borel-cantelli guessing truth error

october 2016 by nhaliday

Overcoming Bias : Two Kinds Of Status

september 2016 by nhaliday

prestige and dominance

More here. I was skeptical at first, but now am convinced: humans see two kinds of status, and approve of prestige-status much more than domination-status. I’ll have much more to say about this in the coming days, but it is far from clear to me that prestige-status is as much better than domination-status as people seem to think. Efforts to achieve prestige-status also have serious negative side-effects.

Two Ways to the Top: Evidence That Dominance and Prestige Are Distinct Yet Viable Avenues to Social Rank and Influence: https://henrich.fas.harvard.edu/files/henrich/files/cheng_et_al_2013.pdf

Dominance (the use of force and intimidation to induce fear) and Prestige (the sharing of expertise or know-how to gain respect)

...

According to the model, Dominance initially arose in evolutionary history as a result of agonistic contests for material resources and mates that were common among nonhuman species, but continues to exist in contemporary human societies, largely in the form of psychological intimidation, coercion, and wielded control over costs and benefits (e.g., access to resources, mates, and well-being). In both humans and nonhumans, Dominance hierarchies are thought to emerge to help maintain patterns of submission directed from subordinates to Dominants, thereby minimizing agonistic battles and incurred costs.

In contrast, Prestige is likely unique to humans, because it is thought to have emerged from selection pressures to preferentially attend to and acquire cultural knowledge from highly skilled or successful others, a capacity considered to be less developed in other animals (Boyd & Richerson, 1985; Laland & Galef, 2009). In this view, social learning (i.e., copying others) evolved in humans as a low-cost fitness-maximizing, information-gathering mechanism (Boyd & Richerson, 1985). Once it became adaptive to copy skilled others, a preference for social models with better than average information would have emerged. This would promote competition for access to the highest quality models, and deference toward these models in exchange for copying and learning opportunities. Consequently, selection likely favored Prestige differentiation, with individuals possessing high-quality information or skills elevated to the top of the hierarchy. Meanwhile, other individuals may reach the highest ranks of their group’s hierarchy by wielding threat of force, regardless of the quality of their knowledge or skills. Thus, Dominance and Prestige can be thought of as coexisting avenues to attaining rank and influence within social groups, despite being underpinned by distinct motivations and behavioral patterns, and resulting in distinct patterns of imitation and deference from subordinates.

Importantly, both Dominance and Prestige are best conceptualized as cognitive and behavioral strategies (i.e., suites of subjective feelings, cognitions, motivations, and behavioral patterns that together produce certain outcomes) deployed in certain situations, and can be used (with more or less success) by any individual within a group. They are not types of individuals, or even, necessarily, traits within individuals. Instead, we assume that all situated dyadic relationships contain differential degrees of both Dominance and Prestige, such that each person is simultaneously Dominant and Prestigious to some extent, to some other individual. Thus, it is possible that a high degree of Dominance and a high degree of Prestige may be found within the same individual, and may depend on who is doing the judging. For example, by controlling students’ access to rewards and punishments, school teachers may exert Dominance in their relationships with some students, but simultaneously enjoy Prestige with others, if they are respected and deferred to for their competence and wisdom. Indeed, previous studies have shown that based on both self- and peer ratings, Dominance and Prestige are largely independent (mean r = -.03; Cheng et al., 2010).

Status Hypocrisy: https://www.overcomingbias.com/2017/01/status-hypocrisy.html


things status hanson thinking comparison len:short anthropology farmers-and-foragers phalanges ratty duty power humility hypocrisy hari-seldon multi sex gender signaling 🐝 tradeoffs evopsych insight models sexuality gender-diff chart postrat yvain ssc simler critique essay debate paying-rent gedanken empirical operational vague info-dynamics len:long community henrich long-short-run rhetoric contrarianism coordination social-structure hidden-motives politics 2016-election rationality links study summary list hive-mind speculation coalitions values 🤖 metabuch envy universalism-particularism egalitarianism-hierarchy s-factor unintended-consequences tribalism group-selection justice inequality competition cultural-dynamics peace-violence ranking machiavelli authoritarianism strategy tactics organizing leadership management n-factor duplication thiel volo-avolo todo technocracy rent-seeking incentives econotariat marginal-rev civilization rot gibbon
More here. I was skeptical at first, but now am convinced: humans see two kinds of status, and approve of prestige-status much more than domination-status. I’ll have much more to say about this in the coming days, but it is far from clear to me that prestige-status is as much better than domination-status as people seem to think. Efforts to achieve prestige-status also have serious negative side-effects.

Two Ways to the Top: Evidence That Dominance and Prestige Are Distinct Yet Viable Avenues to Social Rank and Influence: https://henrich.fas.harvard.edu/files/henrich/files/cheng_et_al_2013.pdf

Dominance (the use of force and intimidation to induce fear) and Prestige (the sharing of expertise or know-how to gain respect)

...

According to the model, Dominance initially arose in evolutionary history as a result of agonistic contests for material resources and mates that were common among nonhuman species, but continues to exist in contemporary human societies, largely in the form of psychological intimidation, coercion, and wielded control over costs and benefits (e.g., access to resources, mates, and well-being). In both humans and nonhumans, Dominance hierarchies are thought to emerge to help maintain patterns of submission directed from subordinates to Dominants, thereby minimizing agonistic battles and incurred costs.

In contrast, Prestige is likely unique to humans, because it is thought to have emerged from selection pressures to preferentially attend to and acquire cultural knowledge from highly skilled or successful others, a capacity considered to be less developed in other animals (Boyd & Richerson, 1985; Laland & Galef, 2009). In this view, social learning (i.e., copying others) evolved in humans as a low-cost fitness-maximizing, information-gathering mechanism (Boyd & Richerson, 1985). Once it became adaptive to copy skilled others, a preference for social models with better than average information would have emerged. This would promote competition for access to the highest quality models, and deference toward these models in exchange for copying and learning opportunities. Consequently, selection likely favored Prestige differentiation, with individuals possessing high-quality information or skills elevated to the top of the hierarchy. Meanwhile, other individuals may reach the highest ranks of their group’s hierarchy by wielding threat of force, regardless of the quality of their knowledge or skills. Thus, Dominance and Prestige can be thought of as coexisting avenues to attaining rank and influence within social groups, despite being underpinned by distinct motivations and behavioral patterns, and resulting in distinct patterns of imitation and deference from subordinates.

Importantly, both Dominance and Prestige are best conceptualized as cognitive and behavioral strategies (i.e., suites of subjective feelings, cognitions, motivations, and behavioral patterns that together produce certain outcomes) deployed in certain situations, and can be used (with more or less success) by any individual within a group. They are not types of individuals, or even, necessarily, traits within individuals. Instead, we assume that all situated dyadic relationships contain differential degrees of both Dominance and Prestige, such that each person is simultaneously Dominant and Prestigious to some extent, to some other individual. Thus, it is possible that a high degree of Dominance and a high degree of Prestige may be found within the same individual, and may depend on who is doing the judging. For example, by controlling students’ access to rewards and punishments, school teachers may exert Dominance in their relationships with some students, but simultaneously enjoy Prestige with others, if they are respected and deferred to for their competence and wisdom. Indeed, previous studies have shown that based on both self- and peer ratings, Dominance and Prestige are largely independent (mean r = -.03; Cheng et al., 2010).

Status Hypocrisy: https://www.overcomingbias.com/2017/01/status-hypocrisy.html

Today we tend to say that our leaders have prestige, while their leaders have dominance. That is, their leaders hold power via personal connections and the threat and practice of violence, bribes, sex, gossip, and conformity pressures. Our leaders, instead, mainly just have whatever abilities follow from our deepest respect and admiration regarding their wisdom and efforts on serious topics that matter for us all. Their leaders more seek power, while ours more have leadership thrust upon them. Because of this us/them split, we tend to try to use persuasion on us, but force on them, when seeking to change behaviors.

...

Clearly, while there is some fact of the matter about how much a person gains their status via licit or illicit means, there is also a lot of impression management going on. We like to give others the impression that we personally mainly want prestige in ourselves and our associates, and that we only grant others status via the prestige they have earned. But let me suggest that, compared to this ideal, we actually want more dominance in ourselves and our associates than we like to admit, and we submit more often to dominance.

Cads, Dads, Doms: https://www.overcomingbias.com/2010/07/cads-dads-doms.html

"The proper dichotomy is not “virile vs. wimpy” as has been supposed, but “exciting vs. drab,” with the former having the two distinct sub-groups “macho man vs. pretty boy.” Another way to see that this is the right dichotomy is to look around the world: wherever girls really dig macho men, they also dig the peacocky musician type too, finding safe guys a bit boring. And conversely, where devoted dads do the best, it’s more difficult for macho men or in-town-for-a-day rockstars to make out like bandits. …

Whatever it is about high-pathogen-load areas that selects for greater polygynous behavior … will result in an increase in both gorilla-like and peacock-like males, since they’re two viable ways to pursue a polygynous mating strategy."

This fits with there being two kinds of status: dominance and prestige. Macho men, such as CEOs and athletes, have dominance, while musicians and artists have prestige. But women seek both short and long term mates. Since both kinds of status suggest good genes, both attract women seeking short term mates. This happens more when women are younger and richer, and when there is more disease. Foragers pretend they don’t respect dominance as much as they do, so prestigious men get more overt attention, while dominant men get more covert attention.

Women seeking long term mates also consider a man’s ability to supply resources, and may settle for poorer genes to get more resources. Dominant men tend to have more resources than prestigious men, so such men are more likely to fill both roles, being long term mates for some women and short term mates for others. Men who can offer only prestige must accept worse long term mates, while men who can offer only resources must accept few short term mates. Those low in prestige, resources, or dominance must accept no mates. A man who had prestige, dominance, and resources would get the best short and long term mates – what men are these?

Stories are biased toward dramatic events, and so are biased toward events with risky men; it is harder to tell a good story about the attraction of a resource-rich man. So stories naturally encourage short term mating. Shouldn’t this make long-term mates wary of strong mate attraction to dramatic stories?

https://www.overcomingbias.com/2010/07/cads-dads-doms.html#comment-518319076

Women want three things: someone to fight for them (the Warrior), someone to provide for them (the Tycoon), and someone to excite their emotions or entertain them (the Wizard).

In this context,

Dom = Warrior

Dad = Tycoon

Cad = Wizard

To repeat:

Dom (Cocky) + Dad (Generous) + Cad (Exciting/Funny) = Laid

https://www.overcomingbias.com/2010/07/cads-dads-doms.html#comment-518318987

There is an old distinction between "proximate" and "ultimate" causes. Evolution is an ultimate cause, physiology (and psychology, here) is a proximate cause. The flower bends to follow the sun because it gathers more light that way, but the immediate mechanism of the bending involves hormones called auxins. I see a lot of speculation about, say, sexual cognitive dimorphism whose ultimate cause is evolutionary, but not so much speculation about the proximate cause - the "how" of the difference, rather than the "why". And here I think a visit to an older mode of explanation like Marsden's - one which is psychological rather than genetic - can sensitize us to the fact that the proximate causes of a behavioral tendency need not be a straightforward matter of being hardwired differently.

This leads to my second point, which is just that we should remember that human beings actually possess consciousness. This means not only that the proximate cause of a behavior may deeply involve subjectivity, self-awareness, and an existential situation. It also means that all of these propositions about what people do are susceptible to change once they have been spelled out and become part of the culture. It is rather like the stock market: once everyone knows (or believes) something, then that information provides no advantage, creating an incentive for novelty.

Finally, consider the consequences of new beliefs about the how and the why of human nature and human behavior. Right or wrong, theories begin to have consequences as soon as they are taken up and incorporated into subjectivity. We really need a new Foucault to take on this topic.

The Economics of Social Status: http://www.meltingasphalt.com/the-economics-of-social-status/

Prestige vs. dominance. Joseph Henrich (of WEIRD fame) distinguishes two types of status. Prestige is the kind of status we get from being an impressive human specimen (think Meryl Streep), and it's governed by our 'approach' instincts. Dominance, on the other hand, is … [more]

september 2016 by nhaliday

Information Processing: Visualization of geometric intuitions underlying linear algebra (video)

video algebra intuition thinking lectures motivation visualization thurston insight hsu visual-understanding linear-algebra scitariat nibble worrydream better-explained concrete elegance

september 2016 by nhaliday


machine learning - Euclidean distance is usually not good for sparse data? - Cross Validated

machine-learning acm intuition synthesis thinking q-n-a sparsity overflow soft-question dimensionality curiosity separation concentration-of-measure norms nibble novelty high-dimension direction metric-space yoga measure inner-product best-practices

september 2016 by nhaliday


machine learning - Why is Euclidean distance not a good metric in high dimensions? - Cross Validated

thinking machine-learning math acm synthesis intuition q-n-a overflow soft-question dimensionality hi-order-bits curiosity cartoons concentration-of-measure norms nibble novelty high-dimension direction metric-space yoga measure best-practices

september 2016 by nhaliday


Alon Amit's answer to In an online lecture, a professor mentioned that Einstein could draw or imagine a 4-dimensional figure. How can one possibly do that? - Quora

september 2016 by nhaliday

like the idea of treating color as 4th dimension

thurston visualization math problem-solving worrydream tidbits insight intuition q-n-a visual-understanding qra oly nibble dimensionality retrofit concrete elegance
september 2016 by nhaliday

On proof and progress in mathematics

pdf thurston math writing thinking synthesis papers essay unit nibble intuition worrydream communication proofs the-trenches reflection geometry meta:math better-explained stories virtu 🎓 scholar metameta wisdom narrative p:whenever inference cs programming rigor formal-methods meta:research info-dynamics elegance technical-writing heavyweights guessing trust

august 2016 by nhaliday

