
newtonapple : library   302

linkedin/luminol: Anomaly Detection and Correlation library
Luminol is a lightweight Python library for time series data analysis. The two major functionalities it supports are anomaly detection and correlation. It can be used to investigate possible causes of anomalies. You collect time series data and Luminol can:

Given a time series, detect whether the data contains any anomaly and give you back a time window in which the anomaly happened, a timestamp at which the anomaly reaches its peak severity, and a score indicating how severe the anomaly is compared to ...
anomaly  anomaly-detection  linkedin  python  timeseries  library 
4 weeks ago by newtonapple
Falcon - A modern high-performance web server for Ruby, supporting HTTP/2 and HTTPS out of the box.
Falcon is a multi-process, multi-fiber rack-compatible HTTP server built on top of async, async-io, async-container and async-http. Each request is run within a lightweight fiber and can block on up-stream requests without stalling the entire server process. Supports HTTP/1 and HTTP/2 natively.
ruby  http  http2  http-server  server  gem  github  library 
october 2018 by newtonapple
Awesome asynchronous I/O for Ruby.
Several years ago, I was hosting websites on a server in my garage. Back then, my ADSL modem was very basic, and I wanted to have a DNS server which would resolve to an internal IP address when the domain itself resolved to my public IP. Thus was born RubyDNS. This project was originally built on top of EventMachine, but a lack of support for IPv6 at the time and other problems, meant that I started looking for other options. Around that time Celluloid was picking up steam. I had not encountered actors before and I wanted to learn more about it. So, I reimplemented RubyDNS on top of Celluloid and this eventually became the first stable release.

Moving forward, I refactored the internals of RubyDNS into Celluloid::DNS. This rewrite helped solidify the design of RubyDNS and to a certain extent it works. However, unfixed bugs and design problems in Celluloid meant that RubyDNS 2.0 was delayed by almost 2 years. I wasn't happy releasing it with known bugs and problems. After sitting on the problem for a while, and thinking about possible solutions, I decided to build a small event reactor using nio4r and timers, the core parts of Celluloid::IO which made it work so well. The result is this project.

In addition, there is a similarly designed C++ library of the same name. These two libraries share similar design principles, but are different in some areas due to the underlying semantic differences of the languages.
ruby  async  async-io  io  gem  github  library  c++ 
october 2018 by newtonapple
dgryski/go-simstore: simhash storage and searching
go-simstore: store and search through simhashes

This package is an implementation of section 3 of "Detecting Near-Duplicates for Web Crawling" by Manku, Jain, and Sarma.

- simhash is a simple simhashing library.
- simstore is the storage and searching logic.
- simd is a small daemon that wraps simstore and exposes an HTTP /search endpoint.

This code is licensed under the MIT license.
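As a rough illustration of what a simhash is, the sketch below computes a 64-bit fingerprint from whitespace-separated tokens and compares two fingerprints by Hamming distance. It is not the go-simstore API: the function names, the choice of FNV-1a as the per-token hash, and the unweighted tokens are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/bits"
	"strings"
)

// simhash64 computes a 64-bit simhash over whitespace-separated tokens.
// Each token is hashed with FNV-1a (an assumption; any 64-bit hash would do),
// and every bit of the token hash votes +1 or -1 for the same output bit.
func simhash64(text string) uint64 {
	var votes [64]int
	for _, tok := range strings.Fields(text) {
		h := fnv.New64a()
		h.Write([]byte(tok))
		v := h.Sum64()
		for i := 0; i < 64; i++ {
			if v&(uint64(1)<<uint(i)) != 0 {
				votes[i]++
			} else {
				votes[i]--
			}
		}
	}
	var out uint64
	for i, c := range votes {
		if c > 0 {
			out |= uint64(1) << uint(i)
		}
	}
	return out
}

// hammingDistance counts the bit positions in which a and b differ.
func hammingDistance(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}

func main() {
	a := simhash64("the quick brown fox jumps over the lazy dog")
	b := simhash64("the quick brown fox jumped over the lazy dog")
	fmt.Printf("%016x\n%016x\ndistance=%d\n", a, b, hammingDistance(a, b))
}
```

Near-duplicate search then amounts to finding stored fingerprints within a small Hamming distance of the query fingerprint, which is the lookup problem the storage layer addresses.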
simhash  golang  github  code  library 
june 2018 by newtonapple
rs/xid: xid is a globally unique id generator thought for the web
Package xid is a globally unique id generator library, ready to be used safely directly in your server code.

Xid uses the Mongo Object ID algorithm to generate globally unique ids, with a different serialization (base32 hex) to make them shorter when transported as a string:

- 4-byte value representing the seconds since the Unix epoch,
- 3-byte machine identifier,
- 2-byte process id, and
- 3-byte counter, starting with a random value.

The binary representation of the id is compatible with Mongo's 12-byte Object IDs. The string representation uses base32 hex (without padding) for better space efficiency when stored in that form (20 bytes). The hex variant of base32 is used to retain the sortable property of the id.

Xid doesn't use base64 because case sensitivity and the two non-alphanumeric characters may be an issue when the id is transported as a string between various systems. Base36 wasn't retained either because (1) it's not standard, (2) the resulting size is not predictable (not bit aligned), and (3) it would not remain sortable. To validate a base32 xid, expect a 20-character, all-lowercase sequence of the letters a to v and the digits 0 to 9 ([0-9a-v]{20}).

UUIDs are 16 bytes (128 bits) and 36 chars as string representation. Twitter Snowflake ids are 8 bytes (64 bits) but require machine/data-center configuration and/or central generator servers. xid stands in between with 12 bytes (96 bits) and a more compact URL-safe string representation (20 chars). No configuration or central generator server is required, so it can be used directly in server code.
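The layout and encoding described above can be sketched with the Go standard library alone. This is not the xid package's code: buildID, encodeID and validID are hypothetical names, and the big-endian byte order of the fields is an assumption (it is what keeps ids sortable by time).

```go
package main

import (
	"encoding/base32"
	"encoding/binary"
	"fmt"
	"regexp"
)

// Lowercase base32 hex alphabet without padding, as described in the text.
var enc = base32.NewEncoding("0123456789abcdefghijklmnopqrstuv").WithPadding(base32.NoPadding)

// validID matches the 20-character string form.
var validID = regexp.MustCompile(`^[0-9a-v]{20}$`)

// buildID packs the four fields into 12 bytes:
// 4-byte seconds, 3-byte machine id, 2-byte process id, 3-byte counter.
// Big-endian order is an assumption made for this sketch.
func buildID(seconds uint32, machine [3]byte, pid uint16, counter uint32) [12]byte {
	var id [12]byte
	binary.BigEndian.PutUint32(id[0:4], seconds)
	copy(id[4:7], machine[:])
	binary.BigEndian.PutUint16(id[7:9], pid)
	id[9] = byte(counter >> 16) // only the low 24 bits of counter are kept
	id[10] = byte(counter >> 8)
	id[11] = byte(counter)
	return id
}

// encodeID returns the 20-character base32 hex string for a 12-byte id.
func encodeID(id [12]byte) string {
	return enc.EncodeToString(id[:])
}

func main() {
	older := encodeID(buildID(1500000000, [3]byte{1, 2, 3}, 4242, 1))
	newer := encodeID(buildID(1500000001, [3]byte{1, 2, 3}, 4242, 2))
	fmt.Println(older, len(older), validID.MatchString(older))
	fmt.Println(newer, older < newer)
}
```

Because the base32 hex alphabet is in ascending ASCII order, comparing two encoded strings gives the same result as comparing the underlying bytes, which is why ids with a later timestamp sort after earlier ones.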
uuid  golang  generator  guid  github  library  code 
march 2018 by newtonapple
Java Library that implements and integrates concepts from TCP congestion control to auto-detect concurrency limits to achieve optimal throughput with optimal latency.

When thinking of service availability, operators traditionally think in terms of RPS (requests per second). Stress tests are normally performed to determine the RPS at which point the service tips over. RPS limits are then set somewhere below this tipping point (say 75% of this value) and enforced via a token bucket. However, in large distributed systems that auto-scale, this value quickly goes out of date and the service falls over anyway, becoming non-responsive to the point where it is unable to gracefully shed load. Instead of thinking in terms of RPS, we should be thinking in terms of concurrent requests, where we apply queuing theory to determine the number of concurrent requests a service can handle before a queue starts to build up, latencies increase and the service eventually exhausts a hard limit such as CPU, memory, disk or network. This relationship is covered very nicely by Little's Law, where Limit = Average RPS * Average Latency.

Concurrency limits are very easy to enforce but difficult to determine as they would require operators to fully understand the hardware services run on and coordinate how they scale. Instead we'd prefer to measure or estimate the concurrency limits at each point in the network. As systems scale and hit limits each node will adjust and enforce its local view of the limit. To estimate the limit we borrow from common TCP congestion control algorithms by equating a system's concurrency limit to a TCP congestion window.

Before applying the algorithm we need to set some ground rules.

- We accept that every system has an inherent concurrency limit that is determined by a hard resource, such as the number of CPU cores.
- We accept that this limit can change as a system auto-scales.
- For large systems it's impossible to know all the hard resource limits, so we'd rather measure and estimate that limit.
- We use minimum latency measurements to determine when queuing happens.
- We use timeouts and rejected requests to aggressively back off.

A quick note on using minimum latency. Not all requests are the same and can have varying latency distributions but do tend to converge on an average. But we're not trying to measure average latency. We're trying to detect queuing using the minimum observed latency. That is, regardless of how long it takes to process a request, if there is queuing even the fastest requests to process will sit in the queue and the overall observed latency in a sampling window, especially the minimum observed latency, will increase. When the queue is small the limit can be increased. When the queue grows the limit is decreased.
concurrency  java  library  libraries  TCP 
march 2018 by newtonapple
zalando/skipper: An HTTP router and reverse proxy for service composition
Skipper is an HTTP router and reverse proxy for service composition. It's designed to handle >100k HTTP route definitions with detailed lookup conditions, and flexible augmentation of the request flow with filters. It can be used out of the box or extended with custom lookup, filter logic and configuration sources.
golang  clustering  proxy  router  http  reverse-proxy  github  library 
march 2018 by newtonapple
Libdill is a C library that makes writing structured concurrent programs easy.
concurrency  c  library  multithreading 
december 2017 by newtonapple
seaborn: statistical data visualization
Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.
statistics  visualization  programming  library  pydata  python 
october 2017 by newtonapple
logv/sybil: sybil - a fast and simple NoSQL OLAP engine
Sybil is an append only analytics datastore with no up front table schema requirements; just log JSON records to a table and run queries. Written in Go, sybil is designed for fast full table scans of multi-dimensional data on a single machine.

If sybil by itself is uninteresting (who wants to run command line queries, anyways?), sybil is a supported backend for snorkel.
database  golang  github  library  olap  data  datastore 
october 2017 by newtonapple
Gonum is a set of packages designed to make writing numeric and scientific algorithms productive, performant, and scalable.

Gonum contains libraries for matrices and linear algebra; statistics, probability distributions, and sampling; tools for function differentiation, integration, and optimization; network creation and analysis; and more.

We encourage you to get started with Go and Gonum if:

- You are tired of sluggish performance, and fighting C and vectorization.
- You are struggling with managing programs as they grow larger.
- You struggle to re-use code, even the code you tried to make reusable.
- You would like easy access to parallel computing.
- You want code to be fully transparent, and want the ability to read the source code you use.
- You’d like a compiler to catch mistakes early, but hate fighting linker and unintelligible compile errors.
golang  numeric  library  math  numeric-computing  linear-algebra  machinelearning 
october 2017 by newtonapple
minio/highwayhash: Native Go implementation of HighwayHash
HighwayHash is a pseudo-random-function (PRF) developed by Jyrki Alakuijala, Bill Cox and Jan Wassenberg (Google research). HighwayHash takes a 256 bit key and computes 64, 128 or 256 bit hash values of given messages.

It can be used to prevent hash-flooding attacks or authenticate short-lived messages. It can also be used as a fingerprint function. This repository provides native Go and optimized assembly implementations for the AMD64 and ARM64 platforms.

HighwayHash is not a general purpose cryptographic hash function (such as BLAKE2b, SHA-3 or SHA-2) and cannot be used if strong collision resistance is required.
hash  hashing  algorithms  golang  github  library  hash-function 
august 2017 by newtonapple
demerphq/BeagleHash: Fast 64 bit evolved hash.

This is BeagleHash and friends, a family of hash functions developed
using genetic-algorithm techniques. Each hash function has a different
set of trade-offs, primarily relating to the size of the seed, the
size of the run-time state of the hash, the method of seeding the hash's
state, the block size for reads, and the size of the final hash.

Hash        WordBits  SeedBits  StateBits  HashBits
BeagleHash  64        Variable  128        64
Zaphod64    64        191       192        64
StadtX      64        128       256        64
Zaphod32    32        95        96         32
Phat4       32        96        128        32
SBOX        64        128       524480     64

BeagleHash is named in honor of the H.M.S. Beagle, the ship
which carried Charles Darwin to the Galapagos Islands. Zaphod is named
after the character from the Hitchhiker's Guide to the Galaxy, and is
also a play on the fact that Microsoft has a hash called "Marvin".
StadtX is a hash function inspired by the metrohash family of hashes,
and was named accordingly, "stadt" being the German word for city.
Phat4 needs a new name. SBOX is named after how it works, which is
substitution box hashing. SBOX is a toy, designed to demonstrate how
"perfect unbreakable hashing" might behave. It hashes by looking up
a random 64 bit value for each input byte. (And then switches to Zaphod64
like hashing after 32 bytes.) The purpose is purely for testing. I do
not recommend you use it.


The primary intention of the hash functions contained in this
repository is for use in scripting languages such as Perl and other
contexts where there is a single seed used to hash many keys,
including ones from untrusted sources, and where there may be
"leakage" of details about how the hash behaves. In particular one
of the assumptions of the hash functions contained here is that
they will be *rarely* seeded, but that we will hash many times
with the same seed.
hash  hashing  hash-function  github  library 
march 2017 by newtonapple
http4s is a minimal, idiomatic Scala interface for HTTP. http4s is Scala's answer to Ruby's Rack, Python's WSGI, Haskell's WAI, and Java's Servlets.
scala  http  framework  library  api 
march 2016 by newtonapple
Two.js is a two-dimensional drawing api geared towards modern web browsers. It is renderer agnostic enabling the same api to draw in multiple contexts: svg, canvas, and webgl.
javascript  graphics  library  2d  canvas  svg  webgl 
july 2015 by newtonapple
H2O is a very fast HTTP server written in C. It can also be used as a library.
networking  http  http2  c  library  server 
may 2015 by newtonapple
Ramjet makes it look as though one DOM element is capable of transforming into another, no matter where the two elements sit in the DOM tree.

It does so by making copies of the two elements (and all their children), setting a fixed position on each, then using CSS transforms to morph the two elements in sync.
animation  css  dom  javascript  library  transform  css-transform 
april 2015 by newtonapple
Go implementation of SipHash-2-4, a fast short-input PRF created by Jean-Philippe Aumasson and Daniel J. Bernstein.


$ go get
Supports Go 1.1 and later.


import ""
There are two ways to use this package. The slower one is to use the standard hash.Hash64 interface:

h := siphash.New(key)
sum := h.Sum(nil) // returns 8-byte []byte

sum64 := h.Sum64() // returns uint64
The faster one is to use the Hash() function, which takes the two uint64 halves of a 16-byte key and a byte slice, and returns a uint64 hash:

sum64 := siphash.Hash(key0, key1, []byte("Hello"))
The keys and output are little-endian.
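Since the keys and output are little-endian, one way to move between the byte form and the uint64 form is the standard encoding/binary package. The sketch below uses only the standard library and does not call the siphash package; splitKey and sumToUint64 are hypothetical helper names, and the assumption that the first eight key bytes form the first uint64 follows from the little-endian note above rather than from the package's documentation.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// splitKey interprets a 16-byte key as two little-endian uint64 halves.
// Assumption: the first eight bytes form the first half.
func splitKey(key [16]byte) (k0, k1 uint64) {
	k0 = binary.LittleEndian.Uint64(key[0:8])
	k1 = binary.LittleEndian.Uint64(key[8:16])
	return k0, k1
}

// sumToUint64 decodes an 8-byte little-endian digest into a uint64.
func sumToUint64(sum []byte) uint64 {
	return binary.LittleEndian.Uint64(sum)
}

func main() {
	key := [16]byte{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
	k0, k1 := splitKey(key)
	fmt.Printf("%016x %016x\n", k0, k1)
}
```

With halves obtained this way, an 8-byte digest decoded by sumToUint64 would be expected to equal the uint64 form of the same hash, under the little-endian assumption stated above.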
golang  hash  hashing  siphash  algorithm  github  code  library  asm  performance  security 
march 2015 by newtonapple
context - GoDoc
Package context defines the Context type, which carries deadlines, cancelation signals, and other request-scoped values across API boundaries and between processes.

Incoming requests to a server should create a Context, and outgoing calls to servers should accept a Context. The chain of function calls between must propagate the Context, optionally replacing it with a modified copy created using WithDeadline, WithTimeout, WithCancel, or WithValue.

Programs that use Contexts should follow these rules to keep interfaces consistent across packages and enable static analysis tools to check context propagation:

Do not store Contexts inside a struct type; instead, pass a Context explicitly to each function that needs it. The Context should be the first parameter, typically named ctx:

func DoSomething(ctx context.Context, arg Arg) error {
	// ... use ctx ...
}
Do not pass a nil Context, even if a function permits it. Pass context.TODO if you are unsure about which Context to use.

Use context Values only for request-scoped data that transits processes and APIs, not for passing optional parameters to functions.

The same Context may be passed to functions running in different goroutines; Contexts are safe for simultaneous use by multiple goroutines.

See for example code for a server that uses Contexts.
golang  google  context  http  server  library 
december 2014 by newtonapple
sqlx is a library which provides a set of extensions on Go's standard database/sql library. The sqlx versions of sql.DB, sql.Tx, sql.Stmt, et al. all leave the underlying interfaces untouched, so that their interfaces are a superset of the standard ones. This makes it relatively painless to integrate existing codebases using database/sql with sqlx.

Major additional concepts are:

- Marshal rows into structs (with embedded struct support), maps, and slices
- Named parameter support including prepared statements
- Get and Select to go quickly from query to struct/slice
- LoadFile for executing statements from a file
golang  database  sql  library  github 
december 2014 by newtonapple
Velocity is a jQuery plugin that re-implements $.animate() to produce significantly greater performance (making Velocity also faster than CSS animations) while including several new features to improve animation workflow.
jquery  animation  javascript  library  css 
april 2014 by newtonapple
MeTA is a modern C++ data sciences toolkit featuring

- text tokenization, including deep semantic features like parse trees
- inverted and forward indexes with compression and various caching strategies
- various ranking functions for searching the indexes
- topic modeling algorithms
- classification algorithms
- wrappers for liblinear and libsvm
- UTF8 support for analysis on various languages
- multithreaded algorithms
c++  machinelearning  github  code  programming  library  text  tokenizer  nlp 
april 2014 by newtonapple
nanomsg is a socket library that provides several common communication patterns. It aims to make the networking layer fast, scalable, and easy to use. Implemented in C, it works on a wide range of operating systems with no further dependencies.

The communication patterns, also called "scalability protocols", are basic blocks for building distributed systems. By combining them you can create a vast array of distributed applications. The following scalability protocols are currently available:

- PAIR: simple one-to-one communication
- BUS: simple many-to-many communication
- REQREP: allows building clusters of stateless services to process user requests
- PUBSUB: distributes messages to large sets of interested subscribers
- PIPELINE: aggregates messages from multiple sources and load balances them among many destinations
- SURVEY: allows querying the state of multiple applications in a single go

Scalability protocols are layered on top of the transport layer in the network stack. At the moment, the nanomsg library supports the following transport mechanisms:

- INPROC: transport within a process (between threads, modules etc.)
- IPC: transport between processes on a single machine
- TCP: network transport via TCP

The library exposes a BSD-socket-like C API to applications.

It is licensed under the MIT/X11 license.
messaging  zeromq  c  library  protocol  networking  socket 
april 2014 by newtonapple
jOOQ: Get Back in Control of Your SQL
jOOQ generates Java code from your database and lets you build typesafe SQL queries through its fluent API.
orm  java  database  sql  library 
february 2014 by newtonapple
Simple Dynamic Strings

SDS is a string library for C designed to augment the limited libc string handling functionalities by adding heap allocated strings that are:

- Simpler to use.
- Binary safe.
- Computationally more efficient.
- But yet... Compatible with normal C string functions.
redis  c  programming  library  string 
february 2014 by newtonapple
Now is a time toolkit for golang
github  programming  golang  library  time 
december 2013 by newtonapple