
nhaliday : tricks   44

"Performance Matters" by Emery Berger - YouTube
Stabilizer is a tool that enables statistically sound performance evaluation, making it possible to understand the impact of optimizations and to conclude, for example, that the -O2 and -O3 optimization levels are indistinguishable from noise (sadly true).

Since compiler optimizations have run out of steam, we need better profiling support, especially for modern concurrent, multi-threaded applications. Coz is a new "causal profiler" that lets programmers optimize for throughput or latency, and which pinpoints and accurately predicts the impact of optimizations.

- randomize extraneous factors like code layout and stack size to avoid spurious speedups
- simulate speedup of component of concurrent system (to assess effect of optimization before attempting) by slowing down the complement (all but that component)
- latency vs. throughput, Little's law
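Little's law ties the latency and throughput views together; a minimal sketch with assumed numbers (the rates and latencies are illustrative, not from the talk):

```python
# Little's law: L = lambda * W
# (mean number of items in a system = arrival rate x mean time each spends in it)
arrival_rate = 200.0   # requests per second (assumed workload)
mean_latency = 0.05    # seconds per request (assumed)
in_flight = arrival_rate * mean_latency
print(in_flight)       # 10.0 requests in flight at steady state
```

Any two of the three quantities determine the third, so measuring throughput and queue length yields latency, and vice versa.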
video  presentation  programming  engineering  nitty-gritty  performance  devtools  compilers  latency-throughput  concurrency  legacy  causation  wire-guided  let-me-see  manifolds  pro-rata  tricks  endogenous-exogenous  control  random  signal-noise  comparison  marginal  llvm  systems  hashing  computer-memory  build-packaging  composition-decomposition  coupling-cohesion  local-global  dbs  direct-indirect  symmetry  research  models  metal-to-virtual  linux  measurement  simulation  magnitude  realness  hypothesis-testing  techtariat 
october 2019 by nhaliday
How can lazy importing be implemented in Python? - Quora
The Mercurial revision control system has the most solid lazy import implementation I know of. Note well that it's licensed under the GPL, so you can't simply use that code in a project of your own.
- Bryan O'Sullivan
q-n-a  qra  programming  python  howto  examples  performance  tricks  time  latency-throughput  yak-shaving  expert-experience  hg  build-packaging  oss  property-rights  intellectual-property 
august 2019 by nhaliday
How to make a fast command line tool in Python
An overview of why Python programs tend to be slow to start running, and some techniques Bazaar uses to start quickly, such as lazy imports.
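A hedged sketch of the lazy-import trick using only the standard library's `importlib.util.LazyLoader` (not Bazaar's or Mercurial's actual implementation):

```python
import importlib.util
import sys

def lazy_import(name):
    """Register `name` so the module body only executes on first
    attribute access, keeping program startup cheap."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

json = lazy_import("json")          # cheap: json's module body not run yet
print(json.dumps({"fast": True}))   # first attribute access triggers the import
```

A command-line tool can apply this to every heavyweight dependency and pay the import cost only on the code paths that actually use it.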
techtariat  presentation  howto  objektbuch  tutorial  python  programming  performance  tricks  time  latency-throughput  yak-shaving  build-packaging 
august 2019 by nhaliday
unix - How can I profile C++ code running on Linux? - Stack Overflow
If your goal is to use a profiler, use one of the suggested ones.

However, if you're in a hurry and you can manually interrupt your program under the debugger while it's being subjectively slow, there's a simple way to find performance problems.

Just halt it several times, and each time look at the call stack. If there is some code that is wasting some percentage of the time, 20% or 50% or whatever, that is the probability that you will catch it in the act on each sample. So that is roughly the percentage of samples on which you will see it. There is no educated guesswork required. If you do have a guess as to what the problem is, this will prove or disprove it.

You may have multiple performance problems of different sizes. If you clean out any one of them, the remaining ones will take a larger percentage, and be easier to spot, on subsequent passes. This magnification effect, when compounded over multiple problems, can lead to truly massive speedup factors.

Caveat: Programmers tend to be skeptical of this technique unless they've used it themselves. They will say that profilers give you this information, but that is only true if they sample the entire call stack, and then let you examine a random set of samples. (The summaries are where the insight is lost.) Call graphs don't give you the same information, because they don't summarize at the instruction level, and they give confusing summaries in the presence of recursion.
They will also say it only works on toy programs, when actually it works on any program, and it seems to work better on bigger programs, because they tend to have more problems to find. They will say it sometimes finds things that aren't problems, but that is only true if you see something once. If you see a problem on more than one sample, it is real.
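The random-pausing idea can be sketched in pure Python: instead of halting under gdb, a helper thread periodically samples the main thread's call stack and counts function names (a toy, CPython-specific stand-in; `sys._current_frames` and the function names here are illustrative):

```python
import collections
import sys
import threading
import time

def sample_stacks(duration=0.5, interval=0.005):
    """Periodically capture the main thread's call stack from a helper
    thread and count which functions appear in it."""
    main_id = threading.main_thread().ident
    counts = collections.Counter()
    stop = time.monotonic() + duration

    def sampler():
        while time.monotonic() < stop:
            frame = sys._current_frames().get(main_id)
            while frame is not None:          # walk the entire call stack
                counts[frame.f_code.co_name] += 1
                frame = frame.f_back
            time.sleep(interval)

    t = threading.Thread(target=sampler, daemon=True)
    t.start()
    return t, counts

def slow():  # deliberately wastes time, so it should dominate the samples
    return sum(i * i for i in range(300_000))

sampler_thread, counts = sample_stacks()
while sampler_thread.is_alive():
    slow()
print(counts.most_common(3))  # 'slow' should account for most samples
```

The fraction of samples in which a function appears estimates the fraction of wall time it is responsible for, exactly as in the manual-interrupt version.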

gprof, Valgrind and gperftools - an evaluation of some tools for application level CPU profiling on Linux:
gprof is the dinosaur among the evaluated profilers - its roots go back to the 1980s. It seems it was widely used and a good solution during the past decades. But its limited support for multi-threaded applications, its inability to profile shared libraries, and the need for recompilation with compatible compilers and special flags that produce considerable runtime overhead make it unsuitable for today's real-world projects.

Valgrind delivers the most accurate results and is well suited for multi-threaded applications. It’s very easy to use and there is KCachegrind for visualization/analysis of the profiling data, but the slow execution of the application under test disqualifies it for larger, longer running applications.

The gperftools CPU profiler has very little runtime overhead, provides some nice features like selectively profiling certain areas of interest, and has no problem with multi-threaded applications. KCachegrind can be used to analyze the profiling data. Like all sampling-based profilers, it suffers from statistical inaccuracy and therefore its results are not as accurate as Valgrind's, but in practice that's usually not a big problem (you can always increase the sampling frequency if you need more accurate results). I'm using this profiler on a large code-base and from my personal experience I can definitely recommend using it.
q-n-a  stackex  programming  engineering  performance  devtools  tools  advice  checklists  hacker  nitty-gritty  tricks  lol  multi  unix  linux  techtariat  analysis  comparison  recommendations  software  measurement  oly-programming  concurrency  debugging  metabuch 
may 2019 by nhaliday
Surveil things, not people – The sideways view
Technology may reach a point where free use of one person’s share of humanity’s resources is enough to easily destroy the world. I think society needs to make significant changes to cope with that scenario.

Mass surveillance is a natural response, and sometimes people think of it as the only response. I find mass surveillance pretty unappealing, but I think we can capture almost all of the value by surveilling things rather than surveilling people. This approach avoids some of the worst problems of mass surveillance; while it still has unattractive features it’s my favorite option so far.


The idea
We’ll choose a set of artifacts to surveil and restrict. I’ll call these heavy technology and everything else light technology. Our goal is to restrict as few things as possible, but we want to make sure that someone can’t cause unacceptable destruction with only light technology. By default something is light technology if it can be easily acquired by an individual or small group in 2017, and heavy technology otherwise (though we may need to make some exceptions, e.g. certain biological materials or equipment).

Heavy technology is subject to two rules:

1. You can’t use heavy technology in a way that is unacceptably destructive.
2. You can’t use heavy technology to undermine the machinery that enforces these two rules.

To enforce these rules, all heavy technology is under surveillance, and is situated such that it cannot be unilaterally used by any individual or small group. That is, individuals can own heavy technology, but they cannot have unmonitored physical access to that technology.


This proposal does give states a de facto monopoly on heavy technology, and would eventually make armed resistance totally impossible. But it’s already the case that states have a massive advantage in armed conflict, and it seems almost inevitable that progress in AI will make this advantage larger (and enable states to do much more with it). Realistically I’m not convinced this proposal makes things much worse than the default.

This proposal definitely expands regulators’ nominal authority and seems prone to abuses. But amongst candidates for handling a future with cheap and destructive dual-use technology, I feel this is the best of many bad options with respect to the potential for abuse.
ratty  acmtariat  clever-rats  risk  existence  futurism  technology  policy  alt-inst  proposal  government  intel  authoritarianism  orwellian  tricks  leviathan  security  civilization  ai  ai-control  arms  defense  cybernetics  institutions  law  unintended-consequences  civil-liberty  volo-avolo  power  constraint-satisfaction  alignment 
april 2018 by nhaliday
Indiana Jones, Economist?! - Marginal REVOLUTION
In a stunningly original paper Gojko Barjamovic, Thomas Chaney, Kerem A. Coşar, and Ali Hortaçsu use the gravity model of trade to infer the location of lost cities from Bronze Age Assyria! The simplest gravity model makes predictions about trade flows based on the sizes of cities and the distances between them. More complicated models add costs based on geographic barriers. The authors have data from ancient texts on trade flows between all the cities, they know the locations of some of the cities, and they know the geography of the region. Using this data they can invert the gravity model and, triangulating from the known cities, find the lost cities that would best “fit” the model. In other words, by assuming the model is true the authors can predict where the lost cities should be located. To test the idea the authors pretend that some known cities are lost, and amazingly the model is able to accurately rediscover those cities.
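A toy illustration of the inversion step (the coordinates, sizes, and flows are made up, and this simple size-product-over-distance-squared form with a grid search is not the paper's actual estimator):

```python
import math

# Toy gravity model: flow(i, j) = size_i * size_j / dist(i, j)**2.
# Known cities as (x, y, size); the numbers are invented for illustration.
known = {"Kanesh": (0.0, 0.0, 5.0),
         "Assur": (10.0, 0.0, 3.0),
         "Hattusa": (0.0, 8.0, 4.0)}
lost_size = 2.0
lost_true = (6.0, 5.0)   # the "unknown" location we will try to recover

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

# "Observed" trade flows between the lost city and each known city:
flows = {name: lost_size * s / dist(lost_true, (x, y)) ** 2
         for name, (x, y, s) in known.items()}

# Invert the model: grid-search the location that best reproduces the flows.
best, best_err = None, float("inf")
for gx in range(101):
    for gy in range(101):
        p = (gx / 10, gy / 10)
        err = sum((lost_size * s / max(dist(p, (x, y)), 1e-9) ** 2
                   - flows[name]) ** 2
                  for name, (x, y, s) in known.items())
        if err < best_err:
            best, best_err = p, err

print(best)  # recovers (6.0, 5.0)
```

Triangulating from flows to several known cities pins the location down, which is the same logic the paper applies to real Assyrian trade records.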
econotariat  marginal-rev  commentary  study  summary  economics  broad-econ  cliometrics  interdisciplinary  letters  history  antiquity  MENA  urban  geography  models  prediction  archaeology  trade  trivia  cocktail  links  cool  tricks  urban-rural  inference  traces 
november 2017 by nhaliday
Let George Do It | West Hunter
I was thinking about how people would have adapted to local differences in essential micronutrients, stuff like iodine, selenium, manganese, molybdenum, zinc, etc. Australia, for example,  hasn’t had much geological activity in ages and generally has mineral-poor soils. At first I thought that Aboriginals, who have lived in such places for a long time,  might have developed better transporters, etc – ways of eking out scarce trace elements.

Maybe they have, but on second thought, they may not have needed to.  Sure, the Aboriginals were exposed to these conditions for tens of thousands of years, but not nearly as long as kangaroos and wombats have been.  If those animals had effective ways of accumulating the necessary micronutrients,  hunter-gatherers could have solved their problems by consuming local fauna. Let George do it, and then eat George.

The real problems should occur in people who rely heavily on plant foods (European farmers) and in their livestock, which are generally not adapted to the mineral-poor environments. If I’m right, even in areas where sheep without selenium supplements get white muscle disease (nutritional muscular dystrophy), indigenous wildlife should not.
west-hunter  scitariat  discussion  ideas  speculation  sapiens  pop-diff  embodied  metabolic  nutrition  diet  food  farmers-and-foragers  agriculture  nature  tricks  direct-indirect 
august 2017 by nhaliday
Book review: "Working Effectively with Legacy Code" by Michael C. Feathers - Eli Bendersky's website
The basic premise of the book is simple, and can be summarized as follows:

To improve some piece of code, we must be able to refactor it.
To be able to refactor code, we must have tests that prove our refactoring didn't break anything.
To have reasonable tests, the code has to be testable; that is, it should be in a form amenable to test harnessing. This most often means breaking implicit dependencies.
... and the author spends about 400 pages on how to achieve that. This book is dense, and it took me a long time to plow through it. I started reading linearly, but very soon discovered this approach doesn't work. So I began hopping forward and backward between the main text and the "dependency-breaking techniques" chapter which holds isolated recipes for dealing with specific kinds of dependencies. There's quite a bit of repetition in the book, which makes it even more tedious to read.

The techniques described by the author are as terrible as the code they're up against. Horrible abuses of the preprocessor in C/C++, abuses of inheritance in C++ and Java, and so on. Particularly the latter is quite sobering. If you love OOP beware - this book may leave you disenchanted, if not full of hate.

To reiterate the conclusion I already presented earlier - get this book if you have to work with old balls of mud; it will be effort well spent. Otherwise, if you're working on one of those new-age continuously integrated codebases with a 2/1 test to code ratio, feel free to skip it.
techtariat  books  review  summary  critique  engineering  programming  intricacy  code-dive  best-practices  checklists  checking  working-stiff  retrofit  oop  code-organizing  legacy  correctness  coupling-cohesion  composition-decomposition  tricks  metabuch  nitty-gritty  move-fast-(and-break-things)  methodology  project-management 
july 2017 by nhaliday
Surnames: a New Source for the History of Social Mobility
This paper explains how surname distributions can be used as a way to measure rates of social mobility in contemporary and historical societies. This allows for estimates of social mobility rates for any population for which the distribution of surnames overall is known as well as the distribution of surnames among some elite or underclass. Such information exists, for example, for England back to 1300, and for Sweden back to 1700. However surname distributions reveal a different, more fundamental type of mobility than that conventionally estimated. Thus surname estimates also allow for measuring a different aspect of social mobility, but the aspect that matters for mobility of social groups, and for families in the long run.

Immobile Australia: Surnames Show Strong Status Persistence, 1870–2017:

The Big Sort: Selective Migration and the Decline of Northern England, 1800-2017:
The north of England in recent years has been poorer, less healthy, less educated and slower growing than the south. Using two sources - surnames that had a different regional distribution in England in the 1840s, and a detailed genealogy of 78,000 people in England giving birth and death locations - we show that the decline of the north is mainly explained by selective outmigration of the educated and talented.

Genetic Consequences of Social Stratification in Great Britain:
pdf  study  spearhead  gregory-clark  economics  cliometrics  status  class  mobility  language  methodology  metrics  natural-experiment  🎩  tricks  history  early-modern  britain  china  asia  path-dependence  europe  nordic  pro-rata  higher-ed  elite  success  society  legacy  stylized-facts  age-generation  broad-econ  s-factor  measurement  within-group  pop-structure  flux-stasis  microfoundations  multi  shift  mostly-modern  migration  biodet  endo-exo  behavioral-gen  regression-to-mean  human-capital  education  oxbridge  endogenous-exogenous  ideas  bio  preprint  genetics  genomics  GWAS  labor  anglo  egalitarianism-hierarchy  welfare-state  sociology  org:ngo  white-paper 
march 2017 by nhaliday
Placebo interventions for all clinical conditions. - PubMed - NCBI
We did not find that placebo interventions have important clinical effects in general. However, in certain settings placebo interventions can influence patient-reported outcomes, especially pain and nausea, though it is difficult to distinguish patient-reported effects of placebo from biased reporting. The effect on pain varied, even among trials with low risk of bias, from negligible to clinically important. Variations in the effect of placebo were partly explained by variations in how trials were conducted and how patients were informed.

How much of the placebo 'effect' is really statistical regression?:
Statistical regression to the mean predicts that patients selected for abnormalcy will, on the average, tend to improve. We argue that most improvements attributed to the placebo effect are actually instances of statistical regression. First, whereas older clinical trials susceptible to regression resulted in a marked improvement in placebo-treated patients, in a modern series of clinical trials whose design tended to protect against regression, we found no significant improvement (median change 0.3 per cent, p greater than 0.05) in placebo-treated patients.
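The selection effect described here is easy to reproduce in simulation (the numbers are assumed; every "patient" has the same fixed severity, so any apparent improvement is pure regression to the mean):

```python
import random

random.seed(42)
TRUE, NOISE, THRESHOLD = 5.0, 2.0, 7.0

# First measurement for a large pool of identical "patients":
first = [TRUE + random.gauss(0, NOISE) for _ in range(100_000)]

# Enroll only those whose first measurement looks abnormally bad...
enrolled = [m for m in first if m > THRESHOLD]
# ...then remeasure them with no treatment at all:
second = [TRUE + random.gauss(0, NOISE) for _ in enrolled]

mean_before = sum(enrolled) / len(enrolled)
mean_after = sum(second) / len(second)
# mean_after falls back toward 5.0: an apparent "improvement" despite
# the complete absence of any treatment, placebo or otherwise.
print(round(mean_before, 2), round(mean_after, 2))
```

Selecting on an extreme noisy measurement guarantees the next measurement looks better on average, which is exactly the confound modern trial designs try to protect against.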

Placebo effects are weak: regression to the mean is the main reason ineffective treatments appear to work:

A radical new hypothesis in medicine: give patients drugs they know don’t work:
People on no treatment got about 30 percent better. And people who were given an open-label placebo got 60 percent improvement in the adequate relief of their irritable bowel syndrome.

Surgery Is One Hell Of A Placebo:
study  psychology  social-psych  medicine  meta:medicine  contrarianism  evidence-based  embodied-cognition  intervention  illusion  realness  meta-analysis  multi  science  stats  replication  gelman  regularizer  thinking  regression-to-mean  methodology  insight  hmm  news  org:data  org:lite  interview  tricks  drugs  cost-benefit  health  ability-competence  chart 
march 2017 by nhaliday
Doomsday rule - Wikipedia, the free encyclopedia
It takes advantage of each year having a certain day of the week, called the doomsday, upon which certain easy-to-remember dates fall; for example, 4/4, 6/6, 8/8, 10/10, 12/12, and the last day of February all occur on the same day of the week in any year. Applying the Doomsday algorithm involves three steps:
1. Determination of the anchor day for the century.
2. Calculation of the doomsday for the year from the anchor day.
3. Selection of the closest date out of those that always fall on the doomsday, e.g., 4/4 and 6/6, and count of the number of days (modulo 7) between that date and the date in question to arrive at the day of the week.

This technique applies to both the Gregorian calendar A.D. and the Julian calendar, although their doomsdays are usually different days of the week.
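The three steps above can be sketched directly (Gregorian calendar only; day numbering 0 = Sunday .. 6 = Saturday; the monthly anchor dates are the standard mnemonics like 4/4, 6/6, and pi day):

```python
def doomsday(year):
    """Day of week (0=Sunday .. 6=Saturday) on which 4/4, 6/6, 8/8, 10/10,
    12/12, and the last day of February all fall in the given year."""
    c, y = divmod(year, 100)
    anchor = (5 * (c % 4) + 2) % 7      # step 1: Gregorian century anchor day
    return (anchor + y + y // 4) % 7    # step 2: doomsday for the year

def weekday(year, month, day):
    """Step 3: count days (mod 7) from a memorable doomsday date."""
    leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    monthly = [4 if leap else 3, 29 if leap else 28,   # Jan, Feb
               14, 4, 9, 6, 11, 8, 5, 10, 7, 12]      # Mar .. Dec
    return (doomsday(year) + day - monthly[month - 1]) % 7

print(weekday(2019, 10, 1))  # 2, i.e. Tuesday
```

With the century anchors and the twelve monthly dates memorized, the same arithmetic is practical to do in your head.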

Easter date:
*When is Easter? (Short answer)*
Easter Sunday is the first Sunday after the first full moon on or after the vernal equinox.

*When is Easter? (Long answer)*
The calculation of Easter is complicated because it is linked to (an inaccurate version of) the Hebrew calendar.


It was therefore decided to make Easter Sunday the first Sunday after the first full moon after vernal equinox. Or more precisely: Easter Sunday is the first Sunday after the “official” full moon on or after the “official” vernal equinox.

The official vernal equinox is always 21 March.

The official full moon may differ from the real full moon by one or two days.


The full moon that precedes Easter is called the Paschal full moon. Two concepts play an important role when calculating the Paschal full moon: The Golden Number and the Epact. They are described in the following sections.


*What is the Golden Number?*
Each year is associated with a Golden Number.

Considering that the relationship between the moon’s phases and the days of the year repeats itself every 19 years (as described in the section about astronomy), it is natural to associate a number between 1 and 19 with each year. This number is the so-called Golden Number. It is calculated thus:

GoldenNumber=(year mod 19) + 1

However, 19 tropical years is 234.997 synodic months, which is very close to an integer. So every 19 years the phases of the moon fall on the same dates (if it were not for the skewness introduced by leap years). 19 years is called a Metonic cycle (after Meton, an astronomer from Athens in the 5th century BC).

So, to summarise: There are three important numbers to note:

A tropical year is 365.24219 days.
A synodic month is 29.53059 days.
19 tropical years is close to an integral number of synodic months.

In years which have the same Golden Number, the new moon will fall on (approximately) the same date. The Golden Number is sufficient to calculate the Paschal full moon in the Julian calendar.


Under the Gregorian calendar, things became much more complicated. One of the changes made in the Gregorian calendar reform was a modification of the way Easter was calculated. There were two reasons for this. First, the 19 year cycle of the phases of moon (the Metonic cycle) was known not to be perfect. Secondly, the Metonic cycle fitted the Gregorian calendar year worse than it fitted the Julian calendar year.

It was therefore decided to base Easter calculations on the so-called Epact.

*What is the Epact?*
Each year is associated with an Epact.

The Epact is a measure of the age of the moon (i.e. the number of days that have passed since an “official” new moon) on a particular date.


In the Julian calendar, the Epact is the age of the moon on 22 March.

In the Gregorian calendar, the Epact is the age of the moon at the start of the year.

The Epact is linked to the Golden Number in the following manner:

Under the Julian calendar, 19 years were assumed to be exactly an integral number of synodic months, and the following relationship exists between the Golden Number and the Epact:

Epact=(11 × (GoldenNumber – 1)) mod 30


In the Gregorian calendar reform, some modifications were made to the simple relationship between the Golden Number and the Epact.

In the Gregorian calendar the Epact should be calculated thus: [long algorithm]


Suppose you know the Easter date of the current year; can you easily find the Easter date of the next year? No, but you can make a qualified guess.

If Easter Sunday in the current year falls on day X and the next year is not a leap year, Easter Sunday of next year will fall on one of the following days: X–15, X–8, X+13 (rare), or X+20.


If you combine this knowledge with the fact that Easter Sunday never falls before 22 March and never falls after 25 April, you can narrow the possibilities down to two or three dates.
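The Golden Number and Epact machinery above is usually folded into a single closed-form computus; a sketch using the well-known anonymous Gregorian (Meeus/Jones/Butcher) algorithm:

```python
def easter(year):
    """Gregorian Easter Sunday via the anonymous (Meeus/Jones/Butcher)
    computus. Returns (month, day)."""
    a = year % 19                        # Golden Number - 1
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30   # Epact-style full-moon correction
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return month, day + 1

print(easter(2024))  # (3, 31): 31 March 2024
```

The result always lands between 22 March and 25 April, matching the bounds quoted above.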
tricks  street-fighting  concept  wiki  reference  cheatsheet  trivia  nitty-gritty  objektbuch  time  calculation  mental-math  multi  religion  christianity  events  howto  cycles 
august 2016 by nhaliday
Answer to What is it like to understand advanced mathematics? - Quora
thinking like a mathematician

some of the points:
- small # of tricks (echoes Rota)
- web of concepts and modularization (zooming out) allow quick reasoning
- comfort w/ ambiguity and lack of understanding, study high-dimensional objects via projections
- above is essential for research (and often what distinguishes research mathematicians from people who were good at math, or majored in math)
math  reflection  thinking  intuition  expert  synthesis  wormholes  insight  q-n-a  🎓  metabuch  tricks  scholar  problem-solving  aphorism  instinct  heuristic  lens  qra  soft-question  curiosity  meta:math  ground-up  cartoons  analytical-holistic  lifts-projections  hi-order-bits  scholar-pack  nibble  the-trenches  innovation  novelty  zooming  tricki  virtu  humility  metameta  wisdom  abstraction  skeleton  s:***  knowledge  expert-experience  elegance  judgement  advanced  heavyweights  guessing 
may 2016 by nhaliday
