Shtetl-Optimized » Blog Archive » What does the NSA think of academic cryptographers? Recently-declassified document provides clues

tcstariat aaronson commentary links quotes lens academia interdisciplinary government intel white-paper crypto rigorous-crypto complexity tcs conference impetus telos-atelos motivation impact proof-systems expert-experience gedanken reduction

november 2019 by nhaliday

Parallel Computing: Theory and Practice

september 2019 by nhaliday

by Umut Acar, who also co-authored a different book on parallel algorithms with Guy Blelloch, from a more high-level and functional perspective

unit
books
cmu
cs
programming
tcs
algorithms
concurrency
c(pp)
divide-and-conquer
libraries
complexity
time-complexity
data-structures
orders
graphs
graph-theory
trees
models
functional
metal-to-virtual
systems
september 2019 by nhaliday

big list - Are there proofs that you feel you did not "understand" for a long time? - MathOverflow

nibble q-n-a overflow soft-question big-list math proofs expert-experience heavyweights gowers mathtariat reflection learning intricacy grokkability intuition algebra math.GR motivation math.GN topology synthesis math.CT computation tcs logic iteration-recursion math.CA extrema smoothness span-cover grokkability-clarity

august 2019 by nhaliday

The Existential Risk of Math Errors - Gwern.net

july 2019 by nhaliday

How big is this upper bound? Mathematicians have often made errors in proofs. But it’s rarer for ideas to be accepted for a long time and then rejected. We can divide errors into 2 basic cases corresponding to type I and type II errors:

1. Mistakes where the theorem is still true, but the proof was incorrect (type I)

2. Mistakes where the theorem was false, and the proof was also necessarily incorrect (type II)

Before someone comes up with a final answer, a mathematician may have many levels of intuition in formulating & working on the problem, but we’ll consider the final end-product where the mathematician feels satisfied that he has solved it. Case 1 is perhaps the most common case, with innumerable examples; this is sometimes due to mistakes in the proof that anyone would agree is a mistake, but many of these cases are due to changing standards of proof. For example, when David Hilbert discovered errors in Euclid’s proofs which no one noticed before, the theorems were still true, and the gaps were more due to Hilbert being a modern mathematician thinking in terms of formal systems (which of course Euclid did not think in). (David Hilbert himself turns out to be a useful example of the other kind of error: his famous list of 23 problems was accompanied by definite opinions on the outcome of each problem and sometimes timings, several of which were wrong or questionable.) Similarly, early calculus used ‘infinitesimals’ which were sometimes treated as being 0 and sometimes treated as an indefinitely small non-zero number; this was incoherent and strictly speaking, practically all of the calculus results were wrong because they relied on an incoherent concept - but of course the results were some of the greatest mathematical work ever conducted and when later mathematicians put calculus on a more rigorous footing, they immediately re-derived those results (sometimes with important qualifications), and doubtless as modern math evolves other fields have sometimes needed to go back and clean up the foundations and will in the future.

...

Isaac Newton, incidentally, gave two proofs of the same solution to a problem in probability, one via enumeration and the other more abstract; the enumeration was correct, but the other proof was totally wrong, and this was not noticed for a long time, leading Stigler to remark:

...

TYPE I > TYPE II?

“Lefschetz was a purely intuitive mathematician. It was said of him that he had never given a completely correct proof, but had never made a wrong guess either.”

- Gian-Carlo Rota

Case 2 is disturbing, since it is a case in which we wind up with false beliefs and also false beliefs about our beliefs (we no longer know that we don’t know). Case 2 could lead to extinction.

...

Except, errors do not seem to be evenly & randomly distributed between case 1 and case 2. There seem to be far more case 1s than case 2s, as already mentioned in the early calculus example: far more than 50% of the early calculus results were correct when checked more rigorously. Richard Hamming attributes to Ralph Boas a comment that, while editing Mathematical Reviews, “of the new results in the papers reviewed most are true but the corresponding proofs are perhaps half the time plain wrong”.

...

Gian-Carlo Rota gives us an example with Hilbert:

...

Olga labored for three years; it turned out that all mistakes could be corrected without any major changes in the statement of the theorems. There was one exception, a paper Hilbert wrote in his old age, which could not be fixed; it was a purported proof of the continuum hypothesis; you will find it in a volume of the Mathematische Annalen of the early thirties.

...

Leslie Lamport advocates for machine-checked proofs and a more rigorous style of proofs similar to natural deduction, noting a mathematician acquaintance guesses at a broad error rate of 1/3 and that he routinely found mistakes in his own proofs and, worse, believed false conjectures.

[more on these "structured proofs":

https://academia.stackexchange.com/questions/52435/does-anyone-actually-publish-structured-proofs

https://mathoverflow.net/questions/35727/community-experiences-writing-lamports-structured-proofs

]

We can probably add software to that list: early software engineering work found that, dismayingly, bug rates seem to be simply a function of lines of code, and one would expect diseconomies of scale. So one would expect that in going from the ~4,000 lines of code of the Microsoft DOS operating system kernel to the ~50,000,000 lines of code in Windows Server 2003 (with full systems of applications and libraries being even larger: the comprehensive Debian repository in 2007 contained ~323,551,126 lines of code) that the number of active bugs at any time would be… fairly large. Mathematical software is hopefully better, but practitioners still run into issues (e.g. Durán et al 2014, Fonseca et al 2017) and I don’t know of any research pinning down how buggy key mathematical systems like Mathematica are or how much published mathematics may be erroneous due to bugs. This general problem led to predictions of doom and spurred much research into automated proof-checking, static analysis, and functional languages.

[related:

https://mathoverflow.net/questions/11517/computer-algebra-errors

I don't know any interesting bugs in symbolic algebra packages but I know a true, enlightening and entertaining story about something that looked like a bug but wasn't.

Define sinc(x) = (sin x)/x.

Someone found the following result in an algebra package: ∫_0^∞ sinc(x) dx = π/2

They then found the following results:

...

So of course when they got:

∫_0^∞ sinc(x) sinc(x/3) sinc(x/5) ⋯ sinc(x/15) dx = (467807924713440738696537864469 / 935615849440640907310521750000) π

hmm:

Which means that nobody knows Fourier analysis nowadays. Very sad and discouraging story... – fedja Jan 29 '10 at 18:47

--

Because the most popular systems are all commercial, they tend to guard their bug database rather closely -- making them public would seriously cut their sales. For example, for the open source project Sage (which is quite young), you can get a list of all the known bugs from this page. 1582 known issues on Feb.16th 2010 (which includes feature requests, problems with documentation, etc).

That is an order of magnitude less than the commercial systems. And it's not because it is better, it is because it is younger and smaller. It might be better, but until SAGE does a lot of analysis (about 40% of CAS bugs are there) and a fancy user interface (another 40%), it is too hard to compare.

I once ran a graduate course whose core topic was studying the fundamental disconnect between the algebraic nature of CAS and the analytic nature of what it is mostly used for. There are issues of logic -- CASes work more or less in an intensional logic, while most of analysis is stated in a purely extensional fashion. There is no well-defined 'denotational semantics' for expressions-as-functions, which strongly contributes to the deeper bugs in CASes.]
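
[a quick sanity check of the opening integral in the sinc story above, sketched in Python with sympy; this only covers the base case, since the sinc-product (Borwein) integrals need far more care than a one-liner:

```python
# Symbolic check that the integral of sinc(x) = sin(x)/x over (0, oo) equals pi/2.
# A sketch using sympy; it says nothing about the harder sinc-product integrals above.
import sympy as sp

x = sp.symbols('x', positive=True)
print(sp.integrate(sp.sin(x) / x, (x, 0, sp.oo)))  # prints: pi/2
```

]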

...

Should such widely-believed conjectures as P≠NP or the Riemann hypothesis turn out to be false, then because they are assumed by so many existing proofs, a far larger math holocaust would ensue - and our previous estimates of error rates will turn out to have been substantial underestimates. But it may be a cloud with a silver lining, if it doesn’t come at a time of danger.

https://mathoverflow.net/questions/338607/why-doesnt-mathematics-collapse-down-even-though-humans-quite-often-make-mista

more on formal methods in programming:

https://www.quantamagazine.org/formal-verification-creates-hacker-proof-code-20160920/

https://intelligence.org/2014/03/02/bob-constable/

https://softwareengineering.stackexchange.com/questions/375342/what-are-the-barriers-that-prevent-widespread-adoption-of-formal-methods

Update: measured effort

In the October 2018 issue of Communications of the ACM there is an interesting article, “Formally verified software in the real world”, with some estimates of the effort.

Interestingly (based on OS development for military equipment), it seems that producing formally proved software requires 3.3 times more effort than with traditional engineering techniques. So it's really costly.

On the other hand, it requires 2.3 times less effort to get high security software this way than with traditionally engineered software if you add the effort to make such software certified at a high security level (EAL 7). So if you have high reliability or security requirements there is definitely a business case for going formal.

WHY DON'T PEOPLE USE FORMAL METHODS?: https://www.hillelwayne.com/post/why-dont-people-use-formal-methods/

You can see examples of how all of these look at Let’s Prove Leftpad. HOL4 and Isabelle are good examples of “independent theorem” specs, SPARK and Dafny have “embedded assertion” specs, and Coq and Agda have “dependent type” specs.

If you squint a bit it looks like these three forms of code spec map to the three main domains of automated correctness checking: tests, contracts, and types. This is not a coincidence. Correctness is a spectrum, and formal verification is one extreme of that spectrum. As we reduce the rigour (and effort) of our verification we get simpler and narrower checks, whether that means limiting the explored state space, using weaker types, or pushing verification to the runtime. Any means of total specification then becomes a means of partial specification, and vice versa: many consider Cleanroom a formal verification technique, which primarily works by pushing code review far beyond what’s humanly possible.
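
[to make the "tests" end of that spectrum concrete, a sketch in Python of leftpad plus property-based tests with the hypothesis library; the function and property names here are my own illustration rather than code from the post, though the three properties mirror the usual Let's Prove Leftpad spec (length, suffix, padding):

```python
# A partial, test-based specification of leftpad (an illustrative sketch).
from hypothesis import given, strategies as st

def leftpad(c: str, n: int, s: str) -> str:
    """Pad s on the left with the character c until it is at least n long."""
    return c * max(n - len(s), 0) + s

@given(st.characters(), st.integers(min_value=0, max_value=200), st.text(max_size=200))
def test_leftpad(c, n, s):
    out = leftpad(c, n, s)
    assert len(out) == max(n, len(s))            # length property
    assert out.endswith(s)                       # the original string is a suffix
    assert set(out[: len(out) - len(s)]) <= {c}  # everything before it is the pad char

test_leftpad()  # hypothesis generates and checks many random cases
```

]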

...

The question, then: “is 90/95/99% correct significantly cheaper than 100% correct?” The answer is very yes. We all are comfortable saying that a codebase we’ve well-tested and well-typed is mostly correct modulo a few fixes in prod, and we’re even writing more than four lines of code a day. In fact, the vast… [more]

ratty
gwern
analysis
essay
realness
truth
correctness
reason
philosophy
math
proofs
formal-methods
cs
programming
engineering
worse-is-better/the-right-thing
intuition
giants
old-anglo
error
street-fighting
heuristic
zooming
risk
threat-modeling
software
lens
logic
inference
physics
differential
geometry
estimate
distribution
robust
speculation
nonlinearity
cost-benefit
convexity-curvature
measure
scale
trivia
cocktail
history
early-modern
europe
math.CA
rigor
news
org:mag
org:sci
miri-cfar
pdf
thesis
comparison
examples
org:junk
q-n-a
stackex
pragmatic
tradeoffs
cracker-prog
techtariat
invariance
DSL
chart
ecosystem
grokkability
heavyweights
CAS
static-dynamic
lower-bounds
complexity
tcs
open-problems
big-surf
ideas
certificates-recognition
proof-systems
PCP
mediterranean
SDP
meta:prediction
epistemic
questions
guessing
distributed
overflow
nibble
soft-question
track-record
big-list
hmm
frontier
state-of-art
move-fast-(and-break-things)
grokkability-clarity
technical-writing
trust

july 2019 by nhaliday

data structures - Why are Red-Black trees so popular? - Computer Science Stack Exchange

june 2019 by nhaliday

- AVL trees have smaller average depth than red-black trees, and thus searching for a value in AVL tree is consistently faster.

- Red-black trees make fewer structural changes to balance themselves than AVL trees, which could make them potentially faster for insert/delete. I'm saying potentially, because this would depend on the cost of the structural change to the tree, as this will depend a lot on the runtime and implementation (might also be completely different in a functional language when the tree is immutable?)

There are many benchmarks online that compare AVL and Red-black trees, but what struck me is that my professor basically said, that usually you'd do one of two things:

- Either you don't really care that much about performance, in which case the 10-20% difference of AVL vs Red-black in most cases won't matter at all.

- Or you really care about performance, in which case you'd ditch both AVL and Red-black trees, and go with B-trees, which can be tweaked to work much better (or (a,b)-trees, I'm gonna put all of those in one basket.)

--

> For some kinds of binary search trees, including red-black trees but not AVL trees, the "fixes" to the tree can fairly easily be predicted on the way down and performed during a single top-down pass, making the second pass unnecessary. Such insertion algorithms are typically implemented with a loop rather than recursion, and often run slightly faster in practice than their two-pass counterparts.

So a red-black tree insert can be implemented without recursion; on some CPUs recursion is very expensive if you overrun the function call cache (e.g. SPARC, due to its use of register windows).

--

There are some cases where you can't use B-trees at all.

One prominent case is std::map from the C++ STL. The standard requires that insert does not invalidate existing iterators.

...

I also believe that "single pass tail recursive" implementation is not the reason for red black tree popularity as a mutable data structure.

First of all, stack depth is irrelevant here, because (given log n height) you would run out of main memory before you run out of stack space. Jemalloc is happy with preallocating worst case depth on the stack.

nibble
q-n-a
overflow
cs
algorithms
tcs
data-structures
functional
orders
trees
cost-benefit
tradeoffs
roots
explanans
impetus
performance
applicability-prereqs
programming
pls
c(pp)
ubiquity

june 2019 by nhaliday

Bareiss algorithm - Wikipedia

june 2019 by nhaliday

During the execution of the Bareiss algorithm, every integer that is computed is the determinant of a submatrix of the input matrix. This allows one, using the Hadamard inequality, to bound the size of these integers. Otherwise, the Bareiss algorithm may be viewed as a variant of Gaussian elimination and needs roughly the same number of arithmetic operations.
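
[a minimal sketch of the fraction-free elimination in Python (my own illustration, not from the article); the key point is that the division in the inner loop is exact, so every intermediate value stays an integer (a minor of the input matrix):

```python
def bareiss_determinant(matrix):
    """Fraction-free (Bareiss) Gaussian elimination for an integer matrix.

    Every intermediate entry is the determinant of a submatrix of the input,
    so the divisions below are exact and all arithmetic stays in the integers.
    """
    A = [list(map(int, row)) for row in matrix]
    n = len(A)
    sign, prev = 1, 1  # prev is the previous pivot in the Bareiss recurrence
    for k in range(n - 1):
        if A[k][k] == 0:  # swap in a nonzero pivot if possible
            for i in range(k + 1, n):
                if A[i][k] != 0:
                    A[k], A[i] = A[i], A[k]
                    sign = -sign
                    break
            else:
                return 0  # whole column is zero: singular matrix
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # Bareiss step: exact integer division by the previous pivot
                A[i][j] = (A[i][j] * A[k][k] - A[i][k] * A[k][j]) // prev
            A[i][k] = 0
        prev = A[k][k]
    return sign * A[n - 1][n - 1]

print(bareiss_determinant([[2, 0, 1], [1, 1, 0], [0, 3, 1]]))  # 5
```

]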

nibble
ground-up
cs
tcs
algorithms
complexity
linear-algebra
numerics
sci-comp
fields
calculation
nitty-gritty
june 2019 by nhaliday

What's the expected level of paper for top conferences in Computer Science - Academia Stack Exchange

june 2019 by nhaliday

Top. The top level.

My experience on program committees for STOC, FOCS, ITCS, SODA, SOCG, etc., is that there are FAR more submissions of publishable quality than can be accepted into the conference. By "publishable quality" I mean a well-written presentation of a novel, interesting, and non-trivial result within the scope of the conference.

...

There are several questions that come up over and over in the FOCS/STOC review cycle:

- How surprising / novel / elegant / interesting is the result?

- How surprising / novel / elegant / interesting / general are the techniques?

- How technically difficult is the result? Ironically, FOCS and STOC committees have a reputation for ignoring the distinction between trivial (easy to derive from scratch) and nondeterministically trivial (easy to understand after the fact).

- What is the expected impact of this result? Is this paper going to change the way people do theoretical computer science over the next five years?

- Is the result of general interest to the theoretical computer science community? Or is it only of interest to a narrow subcommunity? In particular, if the topic is outside the STOC/FOCS mainstream—say, for example, computational topology—does the paper do a good job of explaining and motivating the results to a typical STOC/FOCS audience?

nibble
q-n-a
overflow
academia
tcs
cs
meta:research
publishing
scholar
lens
properties
cost-benefit
analysis
impetus
increase-decrease
soft-question
motivation
proofs
search
complexity
analogy
problem-solving
elegance
synthesis
hi-order-bits
novelty
discovery

june 2019 by nhaliday

Analysis of Current and Future Computer Science Needs via Advertised Faculty Searches for 2019 - CRN

june 2019 by nhaliday

Differences are also seen when analyzing results based on the type of institution. Positions related to Security have the highest percentages for all but top-100 institutions. The area of Artificial Intelligence/Data Mining/Machine Learning is of most interest for top-100 PhD institutions. Roughly 35% of positions for PhD institutions are in data-oriented areas. The results show a strong interest in data-oriented areas by public PhD and private PhD, MS, and BS institutions while public MS and BS institutions are most interested in Security.

org:edu
data
analysis
visualization
trends
recruiting
jobs
career
planning
academia
higher-ed
cs
tcs
machine-learning
systems
pro-rata
measure
long-term
🎓
uncertainty
progression
grad-school
phd
distribution
ranking
top-n
security
status
s-factor
comparison
homo-hetero
correlation
org:ngo
white-paper
cost-benefit
june 2019 by nhaliday

algorithm - Skip List vs. Binary Search Tree - Stack Overflow

may 2019 by nhaliday

Skip lists are more amenable to concurrent access/modification. Herb Sutter wrote an article about data structures in concurrent environments. It has more in-depth information.

The most frequently used implementation of a binary search tree is a red-black tree. The concurrency problems come in when the tree is modified: it often needs to rebalance. The rebalance operation can affect large portions of the tree, which would require a mutex lock on many of the tree nodes. Inserting a node into a skip list is far more localized: only nodes directly linked to the affected node need to be locked.
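
[for concreteness, a single-threaded sketch of skip-list insertion in Python (my own illustration; no locking shown): the splice at the end only rewrites the forward pointers of the new node's predecessors, which is the locality the answer is pointing at:

```python
import random

class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * (level + 1)  # one next-pointer per level

class SkipList:
    MAX_LEVEL = 16
    P = 0.5

    def __init__(self):
        self.head = Node(None, self.MAX_LEVEL)
        self.level = 0

    def _random_level(self):
        lvl = 0
        while random.random() < self.P and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, key):
        # find the predecessor of `key` at every level
        update = [self.head] * (self.MAX_LEVEL + 1)
        node = self.head
        for lvl in range(self.level, -1, -1):
            while node.forward[lvl] is not None and node.forward[lvl].key < key:
                node = node.forward[lvl]
            update[lvl] = node
        # splice in the new node: only the predecessors' pointers change,
        # which is why a lock-based concurrent variant can lock so locally
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = Node(key, lvl)
        for i in range(lvl + 1):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def __contains__(self, key):
        node = self.head
        for lvl in range(self.level, -1, -1):
            while node.forward[lvl] is not None and node.forward[lvl].key < key:
                node = node.forward[lvl]
        node = node.forward[0]
        return node is not None and node.key == key

s = SkipList()
for k in [5, 1, 9, 3]:
    s.insert(k)
print(3 in s, 4 in s)  # True False
```

]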

q-n-a
stackex
nibble
programming
tcs
data-structures
performance
concurrency
comparison
cost-benefit
applicability-prereqs
random
trees
tradeoffs

may 2019 by nhaliday

Computer Scientists Close In on Unique Games Conjecture Proof | Quanta Magazine

news org:mag org:sci popsci tcs cs computation UGC complexity approximation lower-bounds hardness nibble org:inst rand-approx proofs big-surf announcement SDP optimization open-problems questions research

june 2018 by nhaliday

Complexity no Bar to AI - Gwern.net

april 2018 by nhaliday

Critics of AI risk suggest diminishing returns to computing (formalized asymptotically) means AI will be weak; this argument relies on a large number of questionable premises and ignoring additional resources, constant factors, and nonlinear returns to small intelligence advantages, and is highly unlikely. (computer science, transhumanism, AI, R)

created: 1 June 2014; modified: 01 Feb 2018; status: finished; confidence: likely; importance: 10

ratty
gwern
analysis
faq
ai
risk
speedometer
intelligence
futurism
cs
computation
complexity
tcs
linear-algebra
nonlinearity
convexity-curvature
average-case
adversarial
article
time-complexity
singularity
iteration-recursion
magnitude
multiplicative
lower-bounds
no-go
performance
hardware
humanity
psychology
cog-psych
psychometrics
iq
distribution
moments
complement-substitute
hanson
ems
enhancement
parable
detail-architecture
universalism-particularism
neuro
ai-control
environment
climate-change
threat-modeling
security
theory-practice
hacker
academia
realness
crypto
rigorous-crypto
usa
government

april 2018 by nhaliday

If Quantum Computers are not Possible Why are Classical Computers Possible? | Combinatorics and more

november 2017 by nhaliday

As most of my readers know, I regard quantum computing as unrealistic. You can read more about it in my Notices AMS paper and its extended version (see also this post) and in the discussion of Puzzle 4 from my recent puzzles paper (see also this post). The amazing progress and huge investment in quantum computing (that I presented and update routinely in this post) will put my analysis to test in the next few years.

tcstariat
mathtariat
org:bleg
nibble
tcs
cs
computation
quantum
volo-avolo
no-go
contrarianism
frontier
links
quantum-info
analogy
comparison
synthesis
hi-order-bits
speedometer
questions
signal-noise
november 2017 by nhaliday

Rank aggregation basics: Local Kemeny optimisation | David R. MacIver

september 2017 by nhaliday

This turns our problem from a global search to a local one: Basically we can start from any point in the search space and search locally by swapping adjacent pairs until we hit a minimum. This turns out to be quite easy to do. _We basically run insertion sort_: At step n we have the first n items in a locally Kemeny optimal order. Swap the n+1th item backwards until the majority think its predecessor is < it. This ensures all adjacent pairs are in the majority order, so swapping them would result in a greater than or equal K. This is of course an O(n^2) algorithm. In fact, the problem of merely finding a locally Kemeny optimal solution can be done in O(n log(n)) (for much the same reason as you can sort better than insertion sort). You just take the directed graph of majority votes and find a Hamiltonian Path. The nice thing about the above version of the algorithm is that it gives you a lot of control over where you start your search.
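
[a minimal sketch of that insertion-sort procedure in Python (my own reading of the post, not MacIver's code); majority_prefers plays the role of the pairwise majority vote:

```python
def majority_prefers(rankings, a, b):
    """True iff a strict majority of the rankings place a before b."""
    votes = sum(1 if r.index(a) < r.index(b) else -1 for r in rankings)
    return votes > 0

def locally_kemeny_optimal(candidates, rankings):
    """Insertion sort with the majority relation as comparator.

    The result has every adjacent pair in majority order, i.e. no adjacent swap
    lowers the Kemeny score (local optimality), using O(n^2) comparisons.
    """
    order = []
    for c in candidates:
        i = len(order)
        # swap c backwards while the majority prefers it to its predecessor
        while i > 0 and majority_prefers(rankings, c, order[i - 1]):
            i -= 1
        order.insert(i, c)
    return order

# toy usage: three voters ranking four candidates
rankings = [["a", "b", "c", "d"], ["b", "c", "a", "d"], ["a", "c", "d", "b"]]
print(locally_kemeny_optimal(["a", "b", "c", "d"], rankings))  # ['a', 'b', 'c', 'd']
```

]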

techtariat
liner-notes
papers
tcs
algorithms
machine-learning
acm
optimization
approximation
local-global
orders
graphs
graph-theory
explanation
iteration-recursion
time-complexity
nibble
september 2017 by nhaliday

Algorithms for Finding Nearest Neighbors (and Relatives) - Piotr Indyk

september 2017 by nhaliday

kd-tree mostly

pdf
nibble
slides
talks
mit
tcs
algorithms
data-structures
geometry
spatial
metric-space
mihai
yoga
rand-approx
approximation
high-dimension
dimensionality
hashing
embeddings
trees
contiguity-proximity
september 2017 by nhaliday

Correlated Equilibria in Game Theory | Azimuth

july 2017 by nhaliday

Given this, it’s not surprising that Nash equilibria can be hard to find. Last September a paper came out making this precise, in a strong way:

• Yakov Babichenko and Aviad Rubinstein, Communication complexity of approximate Nash equilibria.

The authors show there’s no guaranteed method for players to find even an approximate Nash equilibrium unless they tell each other almost everything about their preferences. This makes the Nash equilibrium prohibitively difficult to find when there are lots of players… in general. There are particular games where it’s not difficult, and that makes these games important: for example, if you’re trying to run a government well. (A laughable notion these days, but still one can hope.)

Klarreich’s article in Quanta gives a nice readable account of this work and also a more practical alternative to the concept of Nash equilibrium. It’s called a ‘correlated equilibrium’, and it was invented by the mathematician Robert Aumann in 1974. You can see an attempt to define it here:
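
[for a concrete feel for the definition, a Python sketch that checks the textbook correlated equilibrium of the game of Chicken (the standard traffic-light-style example, not something from Baez's post): a mediator draws one of three action profiles uniformly, tells each player only their own action, and no player gains by deviating from the recommendation:

```python
from itertools import product
from fractions import Fraction

# Chicken payoffs: actions are D(are) and C(hicken); (row, col) -> (row payoff, col payoff)
payoff = {
    ("D", "D"): (0, 0),
    ("D", "C"): (7, 2),
    ("C", "D"): (2, 7),
    ("C", "C"): (6, 6),
}

# Candidate correlated equilibrium: uniform over these three profiles
dist = {("D", "C"): Fraction(1, 3), ("C", "D"): Fraction(1, 3), ("C", "C"): Fraction(1, 3)}

def is_correlated_equilibrium(dist, payoff, actions=("D", "C")):
    for player in (0, 1):
        for told, deviation in product(actions, repeat=2):
            if told == deviation:
                continue
            gain = Fraction(0)
            for profile, p in dist.items():
                if profile[player] != told:
                    continue  # recommendation wasn't `told`, so this profile is irrelevant
                deviated = list(profile)
                deviated[player] = deviation
                gain += p * (payoff[tuple(deviated)][player] - payoff[profile][player])
            if gain > 0:  # profitable deviation conditional on the signal
                return False
    return True

print(is_correlated_equilibrium(dist, payoff))  # True
```

]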

baez
org:bleg
nibble
mathtariat
commentary
summary
news
org:mag
org:sci
popsci
equilibrium
GT-101
game-theory
acm
conceptual-vocab
concept
definition
thinking
signaling
coordination
tcs
complexity
communication-complexity
lower-bounds
no-go
liner-notes
big-surf
papers
research
algorithmic-econ
volo-avolo

july 2017 by nhaliday

Talks

may 2017 by nhaliday

Quantum Supremacy: Office of Science and Technology Policy QIS Forum, Eisenhower Executive Office Building, White House Complex, Washington DC, October 18, 2016. Another version at UTCS Faculty Lunch, October 26, 2016. Another version at UT Austin Physics Colloquium, Austin, TX, November 9, 2016.

Complexity-Theoretic Foundations of Quantum Supremacy Experiments: Quantum Algorithms Workshop, Aspen Center for Physics, Aspen, CO, March 25, 2016

When Exactly Do Quantum Computers Provide A Speedup?: Yale Quantum Institute Seminar, Yale University, New Haven, CT, October 10, 2014. Another version at UT Austin Physics Colloquium, Austin, TX, November 19, 2014; Applied and Interdisciplinary Mathematics Seminar, Northeastern University, Boston, MA, November 25, 2014; Hebrew University Physics Colloquium, Jerusalem, Israel, January 5, 2015; Computer Science Colloquium, Technion, Haifa, Israel, January 8, 2015; Stanford University Physics Colloquium, January 27, 2015

tcstariat
aaronson
tcs
complexity
quantum
quantum-info
talks
list
slides
accretion
algorithms
applications
physics
nibble
frontier
computation
volo-avolo
speedometer
questions

may 2017 by nhaliday

Discrepancy algorithm inspired by gradient descent and multiplicative weights; after Levy, Ramadas and Rothvoss | I’m a bandit

acmtariat sebastien-bubeck org:bleg nibble exposition liner-notes algorithms math.CO online-learning tcs research combo-optimization gradient-descent linear-algebra amortization-potential

april 2017 by nhaliday

Sets with Small Intersection | Academically Interesting

march 2017 by nhaliday

- nice application of the Lovász Local Lemma (LLL); the symmetric statement is recalled after the links below

- reference for the "not hard to show" claim: https://homes.cs.washington.edu/~anuprao/pubs/CSE599sExtremal/lecture3.pdf

I think there are some typos, e.g., the inequality in the last line should be reversed (we already have an upper bound on sum d(x), want a lower bound)

cf also Exercise 2.14 and Section 13.6 in Jukna's Extremal Combinatorics, and the Erdös–Ko–Rado Theorem in Alon-Spencer

more:

https://arxiv.org/abs/1404.4622

https://mathoverflow.net/questions/64596/minimal-intersecting-subsets

https://mathoverflow.net/questions/175969/an-upper-bound-on-families-of-subsets-with-a-small-pairwise-intersection

https://mathoverflow.net/questions/21245/pairwise-intersecting-sets-of-fixed-size
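
[as mentioned above, for reference the symmetric form of the local lemma being applied, stated from memory rather than quoted from the post:

```latex
% symmetric Lovász Local Lemma
\textbf{Lemma.} Let $A_1,\dots,A_n$ be events, each with probability at most $p$,
and suppose each $A_i$ is mutually independent of all but at most $d$ of the others.
If $e\,p\,(d+1) \le 1$, then $\Pr\!\left[\,\bigcap_{i=1}^{n} \overline{A_i}\,\right] > 0$.
```

]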

ratty
clever-rats
acmtariat
org:bleg
nibble
math
math.CO
tidbits
probability
probabilistic-method
intersection
rigidity
magnitude
combo-optimization
intersection-connectedness
multi
pdf
proofs
extrema
estimate
bare-hands
q-n-a
overflow
preprint
papers
pseudorandomness
tcs
complexity

march 2017 by nhaliday

Peter Norvig, the meaning of polynomials, debugging as psychotherapy | Quomodocumque

march 2017 by nhaliday

He briefly showed a demo where, given values of a polynomial, a machine can put together a few lines of code that successfully computes the polynomial. But the code looks weird to a human eye. To compute some quadratic, it nests for-loops and adds things up in a funny way that ends up giving the right output. So has it really ”learned” the polynomial? I think in computer science, you typically feel you’ve learned a function if you can accurately predict its value on a given input. For an algebraist like me, a function determines but isn’t determined by the values it takes; to me, there’s something about that quadratic polynomial the machine has failed to grasp. I don’t think there’s a right or wrong answer here, just a cultural difference to be aware of. Relevant: Norvig’s description of “the two cultures” at the end of this long post on natural language processing (which is interesting all the way through!)
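
[to make the "weird but correct" flavor concrete, an invented example in Python (mine, not Norvig's actual demo) of computing a quadratic with nested loops that add things up in a funny way:

```python
def weird_quadratic(x):
    """Compute 3*x*x + 2*x + 1 in the style a program synthesizer might emit:
    nested loops accumulating the answer (valid for non-negative integer x)."""
    total = 1
    for _ in range(2):       # contributes 2*x
        total += x
    for _ in range(3):       # contributes 3*x*x, as three rounds of x copies of x
        for _ in range(x):
            total += x
    return total

# it predicts the right values on every tested input, even though no human would write it this way
assert all(weird_quadratic(x) == 3 * x * x + 2 * x + 1 for x in range(50))
```

]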

mathtariat
org:bleg
nibble
tech
ai
talks
summary
philosophy
lens
comparison
math
cs
tcs
polynomials
nlp
debugging
psychology
cog-psych
complex-systems
deep-learning
analogy
legibility
interpretability
composition-decomposition
coupling-cohesion
apollonian-dionysian
heavyweights
march 2017 by nhaliday

Path integral formulation - Wikipedia

february 2017 by nhaliday

Scott Aaronson compared this with the Schrödinger perspective in terms of time complexity during his presentation on quantum supremacy

https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-845-quantum-complexity-theory-fall-2010/lecture-notes/MIT6_845F10_lec04.pdf

physics
quantum
models
integral
feynman
mechanics
wiki
reference
nibble
multi
aaronson
tcstariat
time-complexity
simulation
quantum-info
ideas
tcs
volo-avolo
speedometer
questions

february 2017 by nhaliday

big list - Overarching reasons why problems are in P or BPP - Theoretical Computer Science Stack Exchange

q-n-a overflow nibble tcs complexity algorithms linear-algebra polynomials markov monte-carlo DP math.CO greedy math.NT synthesis list big-list hi-order-bits big-picture aaronson tcstariat graphs graph-theory proofs structure tricki yoga mathtariat time-complexity top-n metabuch metameta skeleton s:*** chart knowledge curvature convexity-curvature

february 2017 by nhaliday

Equivalence between counting and sampling

february 2017 by nhaliday

also: every counting problem either has an FPTRAS or no approximation within a polynomial factor

pdf
exposition
lecture-notes
berkeley
nibble
tcs
counting
sampling
characterization
complexity
approximation
rand-approx
proofs
february 2017 by nhaliday

References for TCS proof techniques - Theoretical Computer Science Stack Exchange

q-n-a overflow tcs complexity proofs structure meta:math problem-solving yoga 👳 synthesis encyclopedic list big-list nibble tricki p:someday knowledge princeton sanjeev-arora lecture-notes elegance advanced

february 2017 by nhaliday

reference request - What are the recent TCS books whose drafts are available online? - Theoretical Computer Science Stack Exchange

q-n-a overflow books draft list big-list tcs links accretion 👳 encyclopedic unit nibble confluence p:someday quixotic advanced proof-systems average-case complexity iidness pseudorandomness rand-approx markov mixing sublinear crypto rigorous-crypto circuits properties checking

february 2017 by nhaliday

cc.complexity theory - The complexity of sampling (approximately) the Fourier transform of a Boolean function - Theoretical Computer Science Stack Exchange

q-n-a overflow tcs complexity quantum quantum-info sampling fourier boolean-analysis research aaronson tcstariat mathtariat hierarchy rand-complexity open-problems counting nibble questions

february 2017 by nhaliday

cc.complexity theory - Evidence that PPAD is hard? - Theoretical Computer Science Stack Exchange

q-n-a overflow tcs complexity algorithmic-econ game-theory motivation open-problems lower-bounds research relativization crypto rigorous-crypto nibble big-surf hardness questions equilibrium evidence

february 2017 by nhaliday
