Useful Unix commands for data science
Imagine you have a 4.2GB CSV file. It has over 12 million records and 50 columns. All you need from this file is the sum of all values in one particular column.
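A minimal sketch of one way to do this without loading the file into memory, using awk to stream the file line by line. The helper name `sum_column`, the file `sample.csv`, its values, and the assumption that the first row is a header are all illustrative and not taken from the original data set.

```shell
# Hypothetical helper: sum one column of a delimited text file.
# Usage: sum_column FILE COLUMN [DELIMITER]   (the delimiter defaults to a comma)
sum_column() {
    file=$1
    col=$2
    delim=${3:-,}
    # NR > 1 skips the first row, which is assumed to be a header.
    # awk reads one line at a time, so memory use stays flat on large files.
    awk -F "$delim" -v col="$col" \
        'NR > 1 { sum += $col } END { printf "%.2f\n", sum }' "$file"
}

# Tiny stand-in for the real file (made-up values).
printf 'id,name,amount\n1,a,1.50\n2,b,2.25\n3,c,3.00\n' > sample.csv

# Sum the third column.
sum_column sample.csv 3
```

On the sample above this prints `6.75`. For a pipe-delimited file, pass the delimiter as the third argument, for example `sum_column data.csv 4 '|'`.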

OK, but I'd point out the useless use of 'cat' to anyone learning from this guide: awk can read the file itself, so the extra process and pipe are unnecessary. Alternatives:
<code class="language-bash">
<data.csv awk -F "|" '{ sum += $4 } END { printf "%.2f\n", sum }'
awk -F "|" '{ sum += $4 } END { printf "%.2f\n", sum }' data.csv
</code>
Building a data science portfolio: Machine learning project
Another reason why you should wrap your READMEs and code at <80 columns.
July 2016

