nhaliday : project   122

Zettlr | "Wtf is a Zettelkasten?"
The Zettelkasten Manifesto
In case you're still wondering what a Zettelkasten is and need a little more incentive to get started, please have a look at a video we made earlier this week, in which we outline why the notion of a Zettelkasten has become so intrinsically linked to the name of Niklas Luhmann, why we think this is bad, and how we think we should think of Zettelkästen.
techtariat  org:com  project  software  tools  exocortex  notetaking  workflow  thinking  dbs  structure  network-structure  critique  graphs  stay-organized  germanic  metabuch 
11 weeks ago by nhaliday
Ask HN: Getting into NLP in 2018? | Hacker News
syllogism (spaCy author):
I think it's probably a bad strategy to try to be the "NLP guy" to potential employers. You'd be much better off being a software engineer on a project with people with ML or NLP expertise.

NLP projects fail a lot. If you line up a job as a company's first NLP person, you'll probably be setting yourself up for failure. You'll get handed an idea that can't work, you won't know enough about how to push back to change it into something that might, etc. After the project fails, you might get a chance to fail at a second one, but maybe not a third. This isn't a great way to move into any new field.

I think a cunning plan would be to angle to be the person who "productionises" models.
...
--
...

Basically, don't just work on having more powerful solutions. Make sure you've tried hard to have easier problems as well --- that part tends to be higher leverage.

https://news.ycombinator.com/item?id=14008752
https://news.ycombinator.com/item?id=12916498
https://algorithmia.com/blog/introduction-natural-language-processing-nlp
hn  q-n-a  discussion  tech  programming  machine-learning  nlp  strategy  career  planning  human-capital  init  advice  books  recommendations  course  unit  links  automation  project  examples  applications  multi  mooc  lectures  video  data-science  org:com  roadmap  summary  error  applicability-prereqs  ends-means  telos-atelos  cost-benefit 
november 2019 by nhaliday
The Open Steno Project | Hacker News
https://web.archive.org/web/20170315133208/http://www.danieljosephpetersen.com/posts/programming-and-stenography.html
I think at the end of the day, the Plover guys are trying to solve the wrong problem. Stenography is a dying field. I don’t wish anyone to lose their livelihood, but realistically speaking, the job should not exist once speech-to-text technology advances far enough. I’m not claiming that the field will be replaced by it, but I also don’t love the idea of people having to learn such an inane and archaic system.
hn  commentary  keyboard  speed  efficiency  writing  language  maker  homepage  project  multi  techtariat  cost-benefit  critique  expert-experience  programming  backup  contrarianism 
november 2019 by nhaliday
Ask HN: Favorite note-taking software? | Hacker News
Ask HN: What is your ideal note-taking software and/or hardware?: https://news.ycombinator.com/item?id=13221158

my wishlist as of 2019:
- web + desktop macOS + mobile iOS (at least viewing on the last but ideally also editing)
- sync across all those
- open-source data format that's easy to manipulate for scripting purposes
- flexible organization: mostly tree hierarchical (subsuming linear/unorganized) but with the option for directed (acyclic) graph (possibly a second layer of structure/linking)
- can store plain text, LaTeX, diagrams, sketches, and raster/vector images (video prob not necessary except as links to elsewhere)
- full-text search
- somehow digest/import data from Pinboard, Workflowy, Papers 3/Bookends, Skim, and iBooks/e-readers (esp. Kobo), ideally absorbing most of their functionality
- so, eg, track notes/annotations side-by-side w/ original PDF/DjVu/ePub documents (to replace Papers3/Bookends/Skim), and maybe web pages too (to replace Pinboard)
- OCR of handwritten notes (how to handle equations/diagrams?)
- various forms of NLP analysis of everything (topic models, clustering, etc)
- maybe version control (less important than export)

candidates?:
- Evernote prob ruled out due to heavy use of proprietary data formats (unless I can find some way to export with tolerably clean output)
- Workflowy/Dynalist are good but only cover a subset of functionality I want
- org-mode doesn't interact w/ mobile well (and I haven't evaluated it in detail otherwise)
- TiddlyWiki/Zim are in the running, but not sure about mobile
- idk about vimwiki but I'm not that wedded to vim and it seems less widely used than org-mode/TiddlyWiki/Zim so prob pass on that
- Quiver/Joplin/Inkdrop look similar and cover a lot of bases, TODO: evaluate more
- Trilium looks especially promising, tho read-only mobile and for macOS desktop look at this: https://github.com/zadam/trilium/issues/511
- RocketBook is interesting scanning/OCR solution but prob not sufficient due to proprietary data format
- TODO: many more candidates, eg, TreeSheets, Gingko, OneNote (macOS?...), Notion (proprietary data format...), Zotero, Nodebook (https://nodebook.io/landing), Polar (https://getpolarized.io), Roam (looks very promising)

Ask HN: What do you use for you personal note taking activity?: https://news.ycombinator.com/item?id=15736102

Ask HN: What are your note-taking techniques?: https://news.ycombinator.com/item?id=9976751

Ask HN: How do you take notes (useful note-taking strategies)?: https://news.ycombinator.com/item?id=13064215

Ask HN: How to get better at taking notes?: https://news.ycombinator.com/item?id=21419478

Ask HN: How do you keep your notes organized?: https://news.ycombinator.com/item?id=21810400

Ask HN: How did you build up your personal knowledge base?: https://news.ycombinator.com/item?id=21332957
nice comment from math guy on structure and difference between math and CS: https://news.ycombinator.com/item?id=21338628
useful comment collating related discussions: https://news.ycombinator.com/item?id=21333383
highlights:
Designing a Personal Knowledge base: https://news.ycombinator.com/item?id=8270759
Ask HN: How to organize personal knowledge?: https://news.ycombinator.com/item?id=17892731
Do you use a personal 'knowledge base'?: https://news.ycombinator.com/item?id=21108527
Ask HN: How do you share/organize knowledge at work and life?: https://news.ycombinator.com/item?id=21310030
Managing my personal knowledge base: https://news.ycombinator.com/item?id=22000791
The sad state of personal data and infrastructure: https://beepb00p.xyz/sad-infra.html
Building personal search infrastructure for your knowledge and code: https://beepb00p.xyz/pkm-search.html

How to annotate literally everything: https://beepb00p.xyz/annotating.html
Ask HN: How do you organize document digests / personal knowledge?: https://news.ycombinator.com/item?id=21642289
Ask HN: Good solution for storing notes/excerpts from books?: https://news.ycombinator.com/item?id=21920143
Ask HN: What's your cross-platform pdf / ePub reading workflow?: https://news.ycombinator.com/item?id=22170395
some related stuff in the reddit links at the bottom of this pin

https://beepb00p.xyz/grasp.html
How to capture information from your browser and stay sane

Ask HN: Best solutions for keeping a personal log?: https://news.ycombinator.com/item?id=21906650

other stuff:
plain text: https://news.ycombinator.com/item?id=21685660

https://www.getdnote.com/blog/how-i-built-personal-knowledge-base-for-myself/
Tiago Forte: https://www.buildingasecondbrain.com

hn search: https://hn.algolia.com/?query=notetaking&type=story

Slant comparison commentary: https://news.ycombinator.com/item?id=7011281

good comparison of options in the comments here (and Trilium itself looks good): https://news.ycombinator.com/item?id=18840990

https://en.wikipedia.org/wiki/Comparison_of_note-taking_software

stuff from Andy Matuschak and Michael Nielsen on general note-taking:
https://twitter.com/andy_matuschak/status/1202663202997170176
https://archive.is/1i9ep
Software interfaces undervalue peripheral vision! (a thread)
https://twitter.com/andy_matuschak/status/1199378287555829760
https://archive.is/J06UB
This morning I implemented PageRank to sort backlinks in my prototype note system. Mixed results!
https://twitter.com/andy_matuschak/status/1211487900505792512
https://archive.is/BOiCG
https://archive.is/4zB37
One way to dream up post-book media to make reading more effective and meaningful is to systematize "expert" practices (e.g. How to Read a Book), so more people can do them, more reliably and more cheaply. But… the most erudite people I know don't actually do those things!
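A minimal sketch of the PageRank-sorted backlinks idea quoted above, in plain Python. It is not Andy Matuschak's implementation: the note graph, the function names `pagerank` and `sorted_backlinks`, and the damping factor of 0.85 are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank. `links` maps each note to the notes it links to."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this note's rank evenly among the notes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling note (no outgoing links): spread its rank over all notes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


def sorted_backlinks(links, target):
    """Notes linking to `target`, highest PageRank first."""
    rank = pagerank(links)
    backlinks = [src for src, targets in links.items() if target in targets]
    return sorted(backlinks, key=lambda src: rank[src], reverse=True)


# Hypothetical note graph: "hub" is linked from many notes, "c" from none.
notes = {
    "hub": ["zettelkasten"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "zettelkasten"],
    "zettelkasten": [],
}
print(sorted_backlinks(notes, "zettelkasten"))
```

Here "hub" sorts ahead of "c" because more notes point to it. Whether that ordering is actually useful for backlinks is a separate question, which may be what "mixed results" refers to.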

the memex essay and comments from various people including Andy on it: https://pinboard.in/u:nhaliday/b:1cddf69c0b31

some more stuff specific to Roam below, and cf "Why books don't work": https://pinboard.in/u:nhaliday/b:b4d4461f6378

wikis:
https://www.slant.co/versus/5116/8768/~tiddlywiki_vs_zim
https://www.wikimatrix.org/compare/tiddlywiki+zim
http://tiddlymap.org/
https://www.zim-wiki.org/manual/Plugins/BackLinks_Pane.html
https://zim-wiki.org/manual/Plugins/Link_Map.html

apps:
Roam: https://news.ycombinator.com/item?id=21440289
https://www.reddit.com/r/RoamResearch/
https://twitter.com/hashtag/roamcult
https://twitter.com/search?q=RoamResearch%20fortelabs
https://twitter.com/search?q=from%3AQiaochuYuan%20RoamResearch&src=typd
https://twitter.com/vgr/status/1199391391803043840
https://archive.is/TJPQN
https://archive.is/CrNwZ
https://www.nateliason.com/blog/roam
https://twitter.com/andy_matuschak/status/1190102757430063106
https://archive.is/To30Q
https://archive.is/UrI1x
https://archive.is/Ww22V
Knowledge systems which display contextual backlinks to a node open up an interesting new behavior. You can bootstrap a new node extensionally (rather than intensionally) by simply linking to it from many other nodes—even before it has any content.
https://twitter.com/michael_nielsen/status/1220197017340612608
Curious: what are the most striking public @RoamResearch pages that you know? I'd like to see examples of people using it for interesting purposes, or in interesting ways.
https://acesounderglass.com/2019/10/24/epistemic-spot-check-the-fate-of-rome-round-2/
https://twitter.com/andy_matuschak/status/1206011493495513089
https://archive.is/xvaMh
If I weren't doing my own research on questions in knowledge systems (which necessitates tinkering with my own), and if I weren't allergic to doing serious work in webapps, I'd likely use Roam instead!
https://talk.dynalist.io/t/roam-research-new-web-based-outliner-that-supports-transclusion-wiki-features-thoughts/5911/16
http://forum.eastgate.com/t/roam-research-interesting-approach-to-note-taking/2713/10
interesting app: http://www.eastgate.com/Tinderbox/
https://www.theatlantic.com/notes/2016/09/labor-day-software-update-tinderbox-scrivener/498443/

intriguing but probably not appropriate for my needs: https://www.sophya.ai/

Inkdrop: https://news.ycombinator.com/item?id=20103589

Joplin: https://news.ycombinator.com/item?id=15815040
https://news.ycombinator.com/item?id=21555238

MindForgr: https://news.ycombinator.com/item?id=22088175
one comment links to this, mostly on Notion: https://tkainrad.dev/posts/managing-my-personal-knowledge-base/

https://wreeto.com/

Leo Editor (combines tree outlining w/ literate programming/scripting, I think?): https://news.ycombinator.com/item?id=17769892

Frame: https://news.ycombinator.com/item?id=18760079

https://www.reddit.com/r/TheMotte/comments/cb18sy/anyone_use_a_personal_wiki_software_to_catalog/
https://archive.is/xViTY
Notion: https://news.ycombinator.com/item?id=18904648
https://coda.io/welcome
https://news.ycombinator.com/item?id=15543181

accounting: https://news.ycombinator.com/item?id=19833881
Coda mentioned

https://www.reddit.com/r/slatestarcodex/comments/ap437v/modified_cornell_method_the_optimal_notetaking/
https://archive.is/e9oHu
https://www.reddit.com/r/slatestarcodex/comments/bt8a1r/im_about_to_start_a_one_month_journaling_test/
https://www.reddit.com/r/slatestarcodex/comments/9cot3m/question_how_do_you_guys_learn_things/
https://archive.is/HUH8V
https://www.reddit.com/r/slatestarcodex/comments/d7bvcp/how_to_read_a_book_for_understanding/
https://archive.is/VL2mi

Anki:
https://www.reddit.com/r/Anki/comments/as8i4t/use_anki_for_technical_books/
https://www.freecodecamp.org/news/how-anki-saved-my-engineering-career-293a90f70a73/
https://www.reddit.com/r/slatestarcodex/comments/ch24q9/anki_is_it_inferior_to_the_3x5_index_card_an/
https://archive.is/OaGc5
maybe not the best source for a review/advice

interesting comment(s) about tree outliners and spreadsheets: https://news.ycombinator.com/item?id=21170434
https://lightsheets.app/

tablet:
https://www.inkandswitch.com/muse-studio-for-ideas.html
https://www.inkandswitch.com/capstone-manuscript.html
https://news.ycombinator.com/item?id=20255457
hn  discussion  recommendations  software  tools  desktop  app  notetaking  exocortex  wkfly  wiki  productivity  multi  comparison  crosstab  properties  applicability-prereqs  nlp  info-foraging  chart  webapp  reference  q-n-a  retention  workflow  reddit  social  ratty  ssc  learning  studying  commentary  structure  thinking  network-structure  things  collaboration  ocr  trees  graphs  LaTeX  search  todo  project  money-for-time  synchrony  pinboard  state  duplication  worrydream  simplification-normalization  links  minimalism  design  neurons  ai-control  openai  miri-cfar  parsimony  intricacy  meta:reading  examples  prepping  new-religion  deep-materialism  techtariat  review  critique  mobile  integration-extension  interface-compatibility  api  twitter  backup  vgr  postrat  personal-finance  pragmatic  stay-organized  project-management  news  org:mag  epistemic  steel-man  explore-exploit  correlation  cost-benefit  convexity-curvature  michael-nielsen  hci  ux  oly  skunkworks  europe  germanic 
october 2019 by nhaliday
Is there a common method for detecting the convergence of the Gibbs sampler and the expectation-maximization algorithm? - Quora
In practice and theory it is much easier to diagnose convergence in EM (vanilla or variational) than in any MCMC algorithm (including Gibbs sampling).
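One reason EM is easier to diagnose: each EM iteration cannot decrease the observed-data log-likelihood, so a flat log-likelihood trace is a usable stopping rule. The sketch below illustrates this for a two-component 1-D Gaussian mixture in plain Python. The function name, the tolerance, the initialization, and the synthetic data are assumptions for illustration, not taken from the answer.

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(data, tol=1e-8, max_iter=500):
    """Fit a two-component 1-D Gaussian mixture; return (means, log-likelihood trace)."""
    means = [min(data), max(data)]  # crude initialization (an assumption)
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    trace = []
    for _ in range(max_iter):
        # E-step: responsibilities, plus log-likelihood under the current parameters.
        resp = []
        log_lik = 0.0
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = p[0] + p[1]
            log_lik += math.log(total)
            resp.append([p[0] / total, p[1] / total])
        trace.append(log_lik)
        # Convergence check: stop once the log-likelihood stops improving.
        if len(trace) > 1 and trace[-1] - trace[-2] < tol:
            break
        # M-step: weighted maximum-likelihood updates.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # numerical floor; not part of textbook EM
    return means, trace


rng = random.Random(0)
data = [rng.gauss(0, 1) for _ in range(200)] + [rng.gauss(5, 1) for _ in range(200)]
means, trace = em_two_gaussians(data)
print(len(trace), means)
```

A flat trace only shows that EM has reached a stationary point of the likelihood, which may be a local optimum. That is still a far clearer signal than anything available for an MCMC chain.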

https://www.quora.com/How-can-you-determine-if-your-Gibbs-sampler-has-converged
There is a special case when you can actually obtain the stationary distribution, and be sure that you did! If your Markov chain consists of a discrete state space, then take the first time that a state repeats in your chain: if you randomly sample an element between the repeating states (but only including one of the endpoints) you will have a sample from your true distribution.

One can achieve this 'exact MCMC sampling' more generally by using the coupling from the past algorithm.

Otherwise, there is no rigorous statistical test for convergence. It may be possible to obtain a theoretical bound for the convergence rates: but these are quite difficult to obtain, and quite often too large to be of practical use. For example, even for the simple case of using the Metropolis algorithm for sampling from a two-dimensional uniform distribution, the best convergence rate upper bound achieved, by Persi Diaconis, was something with an astronomical constant factor like 10^300.

In fact, it is fair to say that for most high dimensional problems, we have really no idea whether Gibbs sampling ever comes close to converging, but the best we can do is use some simple diagnostics to detect the most obvious failures.
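As one example of a "simple diagnostic", the sketch below computes the Gelman-Rubin potential scale reduction factor (R-hat) for several chains of one scalar quantity. The answer does not name this diagnostic, so the choice is an assumption, and the "chains" are synthetic draws rather than output from a real Gibbs sampler.

```python
import random
import statistics


def gelman_rubin(chains):
    """R-hat for equal-length chains of a scalar quantity."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average within-chain sample variance.
    within = statistics.fmean(statistics.variance(chain) for chain in chains)
    # B: between-chain variance, scaled by the chain length.
    between = n * statistics.variance(chain_means)
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5


rng = random.Random(1)
# Four chains drawing from the same distribution: R-hat should be close to 1.
mixed = [[rng.gauss(0, 1) for _ in range(1000)] for _ in range(4)]
# Two chains stuck near 0 and two near 3 (e.g. separate modes): R-hat well above 1.
stuck = [[rng.gauss(offset, 1) for _ in range(1000)] for offset in (0, 0, 3, 3)]

print(gelman_rubin(mixed), gelman_rubin(stuck))
```

In line with the caveat above, an R-hat near 1 only fails to detect a problem. If every chain is stuck in the same mode, the statistic looks fine even though nothing has converged.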
nibble  q-n-a  qra  acm  stats  probability  limits  convergence  distribution  sampling  markov  monte-carlo  ML-MAP-E  checking  equilibrium  stylized-facts  gelman  levers  mixing  empirical  plots  manifolds  multi  fixed-point  iteration-recursion  heuristic  expert-experience  theory-practice  project 
october 2019 by nhaliday
Python Tutor - Visualize Python, Java, C, C++, JavaScript, TypeScript, and Ruby code execution
C++ support but not STL

Ten years and nearly ten million users: my experience being a solo maintainer of open-source software in academia: http://www.pgbovine.net/python-tutor-ten-years.htm
I HYPERFOCUS ON ONE SINGLE USE CASE
I (MOSTLY*) DON'T LISTEN TO USER REQUESTS
I (MOSTLY*) REFUSE TO EVEN TALK TO USERS
I DON'T DO ANY MARKETING OR COMMUNITY OUTREACH
I KEEP EVERYTHING STATELESS
I DON'T WORRY ABOUT PERFORMANCE OR RELIABILITY
I USE SUPER OLD AND STABLE TECHNOLOGIES
I DON'T MAKE IT EASY FOR OTHERS TO USE MY CODE
FINALLY, I DON'T LET OTHER PEOPLE CONTRIBUTE CODE
UNINSPIRATIONAL PARTING THOUGHTS
APPENDIX: ON OPEN-SOURCE SOFTWARE MAINTENANCE
tools  devtools  worrydream  ux  hci  research  project  homepage  python  programming  c(pp)  javascript  jvm  visualization  software  internet  web  debugging  techtariat  state  form-design  multi  reflection  oss  shipping  community  collaboration  marketing  ubiquity  robust  worse-is-better/the-right-thing  links  performance  engineering  summary  list  top-n  pragmatic  cynicism-idealism 
september 2019 by nhaliday
A Formal Verification of Rust's Binary Search Implementation
Part of the reason for this is that it’s quite complicated to apply mathematical tools to something unmathematical like a functionally impure language (which, unfortunately, most programs tend to be written in). In mathematics, you don’t expect a variable to suddenly change its value, and it only gets more complicated when you have pointers to those dang things:

“Dealing with aliasing is one of the key challenges for the verification of imperative programs. For instance, aliases make it difficult to determine which abstractions are potentially affected by a heap update and to determine which locks need to be acquired to avoid data races.”

While there are whole logics focused on trying to tackle these problems, a master’s thesis wouldn’t be nearly enough time to model a formal Rust semantics on top of these, so I opted for a more straightforward solution: Simply make Rust a purely functional language!

Electrolysis: Simple Verification of Rust Programs via Functional Purification
If you know a bit about Rust, you may have noticed something about that quote in the previous section: There actually are no data races in (safe) Rust, precisely because there is no mutable aliasing. Either all references to some datum are immutable, or there is a single mutable reference. This means that mutability in Rust is much more localized than in most other imperative languages, and that it is sound to replace a destructive update like

p.x += 1
with a functional one – we know there’s no one else around observing p:

let p = Point { x: p.x + 1, ..p };
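To see why the absence of aliasing matters, here is a small Python sketch (Python rather than Rust, precisely because Python allows aliases to mutable data). The `Point` class and variable names are illustrative assumptions. With an alias present, a destructive update and a functional update are observably different, so the rewrite would not be sound. Rust's borrow rules exclude the alias, which is what makes the two forms interchangeable there.

```python
from dataclasses import dataclass, replace


@dataclass
class Point:
    x: int
    y: int


# Destructive update: every alias of the object observes the change.
p = Point(1, 2)
alias = p
p.x += 1
print(alias.x)  # the alias sees the mutated field

# Functional update: a new value is built and the name is rebound;
# the old object, still reachable through `old`, is untouched.
q = Point(1, 2)
old = q
q = replace(q, x=q.x + 1)
print(old.x, q.x)
```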
techtariat  plt  programming  formal-methods  rust  arrows  reduction  divide-and-conquer  correctness  project  state  functional  concurrency  direct-indirect  pls  examples  simplification-normalization  compilers 
august 2019 by nhaliday
Why is Google Translate so bad for Latin? A longish answer. : latin
hmm:
> All it does is correlate sequences of up to five consecutive words in texts that have been manually translated into two or more languages.
That sort of system ought to be perfect for a dead language, though. Dump all the Cicero, Livy, Lucretius, Vergil, and Oxford Latin Course into a database and we're good.

We're not exactly inundated with brand new Latin to translate.
--
> Dump all the Cicero, Livy, Lucretius, Vergil, and Oxford Latin Course into a database and we're good.
What makes you think that the Google folks haven't done so and used that to create the language models they use?
> That sort of system ought to be perfect for a dead language, though.
Perhaps. But it will be bad at translating novel English sentences to Latin.
foreign-lang  reddit  social  discussion  language  the-classics  literature  dataset  measurement  roots  traces  syntax  anglo  nlp  stackex  links  q-n-a  linguistics  lexical  deep-learning  sequential  hmm  project  arrows  generalization  state-of-art  apollonian-dionysian  machine-learning  google 
june 2019 by nhaliday
Burrito: Rethinking the Electronic Lab Notebook
Seems very well-suited for ML experiments (if you can get it to work); also, the nilfs aspect is cool and basically implements one of my project ideas exactly (a mini-VCS for competitive programming). Unfortunately, the gnarly installation instructions specify running it in a Linux VM: https://github.com/pgbovine/burrito/blob/master/INSTALL. Linux is a hard requirement due to nilfs.
techtariat  project  tools  devtools  linux  programming  yak-shaving  integration-extension  nitty-gritty  workflow  exocortex  scholar  software  python  app  desktop  notetaking  state  machine-learning  data-science  nibble  sci-comp  oly  vcs  multi  repo  paste  homepage  research 
may 2019 by nhaliday
Dimensions - Geert Hofstede
http://geerthofstede.com/culture-geert-hofstede-gert-jan-hofstede/6d-model-of-national-culture/

https://www.reddit.com/r/europe/comments/4g88kt/eu28_countries_ranked_by_hofstedes_cultural/
https://archive.is/rXnII

https://hbdchick.wordpress.com/2013/09/07/national-individualism-collectivism-scores/

Individualism and Collectivism in Israeli Society: Comparing Religious and Secular High-School Students: https://sci-hub.tw/https://link.springer.com/article/10.1023/A:1016945121604
A common collective basis of mutual value consensus was found in the two groups; however, as predicted, there were differences between secular and religious students on the three kinds of items, since the religious scored higher than the secular students on items emphasizing collectivist orientation. The differences, however, do not fit the common theoretical framework of collectivism-individualism, but rather tend to reflect the distinction between in-group and universal collectivism.

Individualism and Collectivism in Two Conflicted Societies: Comparing Israeli-Jewish and Palestinian-Arab High School Students: https://sci-hub.tw/http://journals.sagepub.com/doi/10.1177/0044118X01033001001
Both groups were found to be more collectivistic than individualistic oriented. However, as predicted, the Palestinians scored higher than the Israeli students on items emphasizing in-group collectivist orientation (my nationality, my country, etc.). The differences between the two groups tended to reflect some subdistinctions such as different elements of individualism and collectivism. Moreover, they reflected the historical context and contemporary influences, such as the stage where each society is at in the nation-making process.

Religion as culture: religious individualism and collectivism among american catholics, jews, and protestants.: https://www.ncbi.nlm.nih.gov/pubmed/17576356
We propose the theory that religious cultures vary in individualistic and collectivistic aspects of religiousness and spirituality. Study 1 showed that religion for Jews is about community and biological descent but about personal beliefs for Protestants. Intrinsic and extrinsic religiosity were intercorrelated and endorsed differently by Jews, Catholics, and Protestants in a pattern that supports the theory that intrinsic religiosity relates to personal religion, whereas extrinsic religiosity stresses community and ritual (Studies 2 and 3). Important life experiences were likely to be social for Jews but focused on God for Protestants, with Catholics in between (Study 4). We conclude with three perspectives in understanding the complex relationships between religion and culture.

Inglehart–Welzel cultural map of the world: https://en.wikipedia.org/wiki/Inglehart%E2%80%93Welzel_cultural_map_of_the_world
Live cultural map over time 1981 to 2015: https://www.youtube.com/watch?v=ABWYOcru7js

https://en.wikipedia.org/wiki/Post-materialism

https://ourworldindata.org/materialism-and-post-materialism
By Income of the Country

Most of the low post-materialism, high income countries are East Asian :(. Some decent options: Norway, Netherlands, Iceland (surprising!). Other Euro countries fall into that category but interest me less for other reasons.

https://graphpaperdiaries.com/2016/06/10/materialism-and-post-materialism/

Postmaterialism and the Economic Condition: https://www.jstor.org/stable/2111573
prof  psychology  social-psych  values  culture  cultural-dynamics  anthropology  individualism-collectivism  expression-survival  long-short-run  time-preference  uncertainty  outcome-risk  gender  egalitarianism-hierarchy  things  phalanges  group-level  world  tools  comparison  data  database  n-factor  occident  social-norms  project  microfoundations  multi  maps  visualization  org:junk  psych-architecture  personality  hari-seldon  discipline  self-control  geography  shift  developing-world  europe  the-great-west-whale  anglosphere  optimate  china  asia  japan  sinosphere  orient  MENA  reddit  social  discussion  backup  EU  inequality  envy  britain  anglo  nordic  ranking  top-n  list  eastern-europe  germanic  gallic  mediterranean  cog-psych  sociology  guilt-shame  duty  tribalism  us-them  cooperate-defect  competition  gender-diff  metrics  politics  wiki  concept  society  civilization  infographic  ideology  systematic-ad-hoc  let-me-see  general-survey  chart  video  history  metabuch  dynamic  trends  plots  time-series  reference  water  mea 
june 2017 by nhaliday
Reuters Institute Digital News Report 2017
Section 3.2, p. 39 has polarization data
A new way to chart ideological leanings in news media: https://www.axios.com/a-new-way-to-chart-ideological-leanings-in-news-media-2475716743.html
(using Twitter follows)
Exploring the Ideological Nature of Journalists’ Social Networks on Twitter and Associations with News Story Content: https://drive.google.com/file/d/0B8CcT_0LwJ8QVnJMR1QzcGNuTkk/view
Visualizing Political Polarization on Twitter: http://www.theoutgroup.org/
Dear Mainstream Media: Why so liberal?: https://www.washingtonpost.com/blogs/erik-wemple/wp/2017/01/27/dear-mainstream-media-why-so-liberal/
Political Leanings of US Journalists vs. the Public in 2002

Topline Results: 2017 Texas Media & Society Survey: https://moody.utexas.edu/sites/default/files/TMASS_2017Topline_final.pdf
https://twitter.com/gelliottmorris/status/915295562123108352
https://archive.is/sE5cg
Some interesting results from a poll about media & polarization that I presented today for @AStraussInst <THREAD>
pdf  news  org:lite  media  database  data  analysis  politics  polarization  poll  values  time-use  world  usa  europe  EU  britain  internet  tv  social  white-paper  org:ngo  org:edu  ideology  multi  visualization  spatial  exploratory  polisci  wonkish  network-structure  twitter  techtariat  ssc  neocons  info-dynamics  project  org:junk  journos-pundits  info-foraging  track-record  objektbuch  chart  commentary  backup  org:rec  distribution  biases  comparison  within-without  input-output  supply-demand 
june 2017 by nhaliday
Distribution of Word Lengths in Various Languages - Ravi Parikh's Website
Note that this visualization isn't normalized based on usage. For example the English word 'the' is used frequently, while the word 'lugubrious' is rarely used; however both words count the same in computing the histogram and average word lengths. A great idea for a follow-up would be to use language corpuses instead of word lists in order to build these histograms.
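The difference between the two normalizations can be sketched in a few lines of Python. The toy corpus and function names below are made up for illustration. A word list counts each distinct word once (types), while a corpus counts every occurrence (tokens).

```python
from collections import Counter


def mean_length_by_type(words):
    """Average length over distinct words, as with a plain word list."""
    vocab = set(words)
    return sum(len(w) for w in vocab) / len(vocab)


def mean_length_by_token(words):
    """Average length over every occurrence, i.e. weighted by usage."""
    return sum(len(w) for w in words) / len(words)


def length_histogram(words, weighted=True):
    """Map word length -> count, over tokens (weighted) or distinct words."""
    counts = Counter(words) if weighted else Counter(set(words))
    hist = Counter()
    for word, count in counts.items():
        hist[len(word)] += count
    return dict(sorted(hist.items()))


# Toy corpus: 'the' is frequent, 'lugubrious' appears once.
corpus = "the cat saw the dog and the lugubrious owl".split()

print(mean_length_by_type(corpus))   # each distinct word counted once
print(mean_length_by_token(corpus))  # frequent short words pull the mean down
print(length_histogram(corpus, weighted=False))
print(length_histogram(corpus, weighted=True))
```

Because short function words dominate real usage, the token-weighted average usually comes out lower than the word-list average, as it does in this toy case.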
techtariat  data  visualization  project  anglo  language  foreign-lang  distribution  expectancy  measure  lexical 
june 2017 by nhaliday
Comprehensive Military Power: World’s Top 10 Militaries of 2015 - The Unz Review
gnon  military  defense  scale  top-n  list  ranking  usa  china  asia  analysis  data  sinosphere  critique  russia  capital  magnitude  street-fighting  individualism-collectivism  europe  germanic  world  developing-world  latin-america  MENA  india  war  meta:war  history  mostly-modern  world-war  prediction  trends  realpolitik  strategy  thucydides  great-powers  multi  news  org:mag  org:biz  org:foreign  current-events  the-bones  org:rec  org:data  org:popup  skunkworks  database  dataset  power  energy-resources  heavy-industry  economics  growth-econ  foreign-policy  geopolitics  maps  project  expansionism  the-world-is-just-atoms  civilization  let-me-see  wiki  reference  metrics  urban  population  japan  britain  gallic  allodium  definite-planning  kumbaya-kult  peace-violence  urban-rural  wealth  wealth-of-nations  econ-metrics  dynamic  infographic 
june 2017 by nhaliday
Lost and Found | West Hunter
I get the distinct impression that someone (probably someone other than Varro) came up with an approximation of germ theory 1500 years before Girolamo Fracastoro. But his work was lost.

Everybody knows, or should know, that the vast majority of Classical literature has not been preserved. Those lost works contained facts and ideas that might have value today – certainly there are topics that we understand much better because of insights from Classical literature. For example, Reich and Patterson find that some of the Indian castes have existed for something like three thousand years: this is easier to believe when you consider that Megasthenes wrote about the caste system as early as 300 BC.

We don’t put much effort into recovering lost Classical literature. But there are ways in which we could push harder – by increased funding for work on the Herculaneum scrolls, or the Oxyrhynchus papyri collection, for example. Some old-fashioned motivated archaeology might get lucky and find another set of Amarna cuneiform letters, or a new Antikythera mechanism.

https://westhunt.wordpress.com/2012/03/06/spontaneous-generation/
Here we have yet another case in which a discovery was possible for a long time before it was actually accepted. Aristotle is the villain here: he clearly endorses spontaneous generation of many plants and animals. On the other hand, I don’t remember him saying that people should accept all of his conclusions uncritically and without further experimentation for the next couple of thousand years, which is what happened. So maybe we’re all guilty.

...

Part of the funny here (not even counting practical experience) is that almost every educated man over these two millennia had read, and indeed studied deeply, a work with a fairly clear statement of the actual fly->egg->maggot->fly process. As far as I can tell, only one person (Redi) seems to have picked up on this.

“But the more Achilles gazed, the greater rose his desire for vengeance, and his eyes flashed terribly, like coals beneath his lids, as he lifted the god’s marvellous gifts and exulted. When he had looked his fill on their splendour, he spoke to Thetis winged words; ‘Mother, the god grants me a gift fit for the immortals, such as no mortal smith could fashion. Now I shall arm myself for war. Yet I fear lest flies infest the wounds the bronze blades made, and maggots breed in the corpse of brave Patroclus, and now his life is fled, rot the flesh, and disfigure all his body.’ ”

You’d think a blind man would have noticed this.

Anyhow, the lesson is clear. Low hanging fruit can persist for a long time if the conventional wisdom is wrong – and sometimes it is.

http://www.bede.org.uk/literature.htm

Transmission of the Greek Classics: https://en.wikipedia.org/wiki/Transmission_of_the_Greek_Classics
https://www.quora.com/How-much-writing-from-ancient-Greece-is-preserved-Is-it-a-finite-amount-that-someone-could-potentially-read

By way of comparison, the complete Loeb Classical Library (which includes all the important classical texts) has 337 volumes for Ancient Greek --- and those aren't 100,000 word-long door-stoppers.
https://www.loebclassics.com/
$65/year for individuals (I wonder if public libraries have subscriptions?)

http://www.roger-pearse.com/weblog/2009/10/26/reference-for-the-claim-that-only-1-of-ancient-literature-survives/
http://www.patheos.com/blogs/geneveith/2015/01/finding-the-lost-texts-of-classical-antiquity/
http://www.historyofinformation.com/narrative/loss-of-information.php

https://twitter.com/futurepundit/status/927344648154112000
https://archive.is/w86uL
1/ Thinking about what Stephen Greenblatt described in The Swerve as a mass extinction of ancient books (we have little of what they wrote)
2/ If I could go back in time to, say, 100 AD or 200 AD I would go with simple tech for making books last for a thousand years. Possible?

https://www.gnxp.com/WordPress/2018/01/28/the-rapid-fading-of-information/
I’ve put a lot of content out there over the years. Probably on the order of 5 million words across my blogs. Some publications here and there. Lots of tweets. But very little of it will persist into future generations. Digital is evanescent.

But so is paper. I believe that even good hardcover books probably won’t last more than a few hundred years.

Perhaps we should go back to some form of cuneiform? Stone and metal will last thousands of years.

How long does a paperback book last?: https://www.quora.com/How-long-does-a-paperback-book-last

A 500 years vault for books?: https://worldbuilding.stackexchange.com/questions/137583/a-500-years-vault-for-books
There are about four solutions that have actually worked in history:

1. The desert method
2. Give them to an institution which will preserve them
3. The opposite of secrecy: duplicate them extensively
4. Transcribe them to durable materials

It is hard to keep books for a really long time because paper, parchment and papyrus are easily destroyed. However books have been produced on much more durable materials. Nowadays a holographic copy can be laser etched into stainless steel. In Sumer, 5300 years ago they pressed them into clay tablets. If the document was important, they fired the clay; otherwise they just let it dry. The fired versions are close to indestructible.
west-hunter  scitariat  discussion  ideas  speculation  history  iron-age  mediterranean  the-classics  innovation  low-hanging  spreading  disease  parasites-microbiome  🔬  archaeology  discovery  epidemiology  canon  multi  literature  fiction  agriculture  india  asia  pop-structure  social-structure  ethnography  the-trenches  nihil  flux-stasis  science  medieval  europe  the-great-west-whale  letters  info-dynamics  being-right  scale  wiki  reference  trivia  cocktail  curiosity  enlightenment-renaissance-restoration-reformation  article  q-n-a  qra  data  database  project  toys  religion  christianity  civilization  twitter  social  gedanken  gnon  backup  time  volo-avolo  brands  money  gnxp  store  stackex  traces  sequential  knowledge  pro-rata  human-capital  age-generation 
may 2017 by nhaliday
:feed v1 - /fora/posts/~2017.4.12..21.14.00..fe17~
The goal of this demo was to show that building a Twitter replacement actually isn't that hard at all; and it can be done almost entirely on the frontend. As shown, you don't even have to use React/Redux. But that's probably the way to go if you want to build the real thing.
techtariat  urbit  software  decentralized  twitter  social  internet  web  programming  tutorial  project  gnon 
april 2017 by nhaliday
Dgsh – Directed graph shell | Hacker News
I've worked with and looked at a lot of data processing helpers: tools that try to help you build data pipelines for the sake of performance, reproducibility, or simply code uniformity.
What I found so far: most tools that invent a new language, or try to cram complex processes into less suitable syntactic environments, are not much loved.

...

I'll give dgsh a try. The tool reuse approach and the UNIX spirit seem nice. But my initial impression of the "C code metrics" example from the site is mixed: it reminds me of awk, about which one of the authors said that it's a beautiful language, but if your programs are getting longer than a hundred lines, you might want to switch to something else.

Two libraries that have a good grip on the plumbing aspect of data processing systems are airflow and luigi. They are Python libraries, and with them you get a concise syntax and basically all Python libraries, plus non-Python tools with a command-line interface, at your fingertips.
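
As a rough illustration of the "plumbing" such libraries handle, here is a minimal sketch of running tasks in dependency order using only the Python standard library. It does not use the airflow or luigi APIs; `run_pipeline`, the task names, and the `deps` mapping are hypothetical names made up for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks, deps):
    """Run every task after the tasks it depends on.

    tasks: task name -> callable taking a dict of upstream results
    deps:  task name -> set of upstream task names (hypothetical layout)
    Returns (results by task name, execution order).
    """
    # Include every task in the graph, even those with no dependencies.
    graph = {name: deps.get(name, set()) for name in tasks}
    results = {}
    order = []
    for name in TopologicalSorter(graph).static_order():
        upstream = {dep: results[dep] for dep in graph[name]}
        results[name] = tasks[name](upstream)
        order.append(name)
    return results, order


# Illustrative tasks only: a tiny extract -> (sort, count) -> report graph.
tasks = {
    "extract": lambda up: [3, 1, 2],
    "sort": lambda up: sorted(up["extract"]),
    "count": lambda up: len(up["extract"]),
    "report": lambda up: f"{up['count']} items, min={up['sort'][0]}",
}
deps = {
    "sort": {"extract"},
    "count": {"extract"},
    "report": {"sort", "count"},
}

results, order = run_pipeline(tasks, deps)
print(order)
print(results["report"])
```

Real orchestration libraries add what this sketch leaves out, such as persisted outputs, retries, and scheduling.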

I am curious: what kind of process orchestration tools do people use and recommend?

--

Exactly our experience too, from complex machine learning workflows in various aspects of drug discovery.
We basically did not really find any of the popular DSL-based bioinformatics pipeline tools (snakemake, bpipe etc) to fit the bill. Nextflow came close, but in fact allows quite some custom code too.

What worked for us was to use Spotify's Luigi, which is a Python library rather than a DSL.

The only thing was that we had to develop a flow-based inspired API on top of Luigi's more functional programming based one, in order to make defining dependencies fluent and easy enough to specify for our complex workflows.

Our flow-based inspired Luigi API (SciLuigi) for complex workflows, is available at:

https://github.com/pharmbio/sciluigi

--

We have measured many of the examples against the use of temporary files and the web report one against (single-threaded) implementations in Perl and Java. In almost all cases dgsh takes less wall clock time, but often consumes more CPU resources.
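
The distinction between wall clock time and CPU time in the comment above can be sketched in Python. This is not how the dgsh authors measured anything; `measure`, `busy`, and `idle` are hypothetical helpers for illustration. Note that `time.process_time` only covers the current process, so a pipeline made of child processes would need something like the children fields of `os.times()` instead.

```python
import time


def measure(fn, *args):
    """Return (result, wall_seconds, cpu_seconds) for one call of fn."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time of this process only
    result = fn(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu


def busy(n):
    """CPU-bound work: wall time and CPU time should be close."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def idle(seconds):
    """Waiting: wall time passes, but almost no CPU time is used."""
    time.sleep(seconds)
    return seconds


for fn, arg in ((busy, 200_000), (idle, 0.2)):
    _, wall, cpu = measure(fn, arg)
    print(f"{fn.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

A tool that parallelises work can show the opposite pattern from `idle`: total CPU time summed over processes exceeds wall clock time, which matches the trade-off the comment describes.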
commentary  project  programming  terminal  worrydream  pls  plt  unix  hn  graphs  tools  devtools  let-me-see  composition-decomposition  yak-shaving  workflow  exocortex  hmm  cool  software  desktop  sci-comp  stock-flow  performance  comparison  links  libraries  python 
january 2017 by nhaliday
Cryptpad: Zero Knowledge, Collaborative Real Time Editing | Hacker News
comments have interesting discussion of use of "zero-knowledge" in practice
commentary  hn  project  software  tools  crypto  privacy  hmm  engineering 
september 2016 by nhaliday