[GLLUG] Digests of CSV files
james.dutton at gmail.com
Fri Jan 5 19:52:04 UTC 2018
I would probably use Apache Pig (Pig Latin) for that. It should take about three lines of Pig code:
1) load ...
2) group ...
3) count ...
Another option could be a Jupyter notebook.
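
For anyone who would rather stay in Python (for example inside a Jupyter notebook), the same load / group / count idea can be sketched roughly as below. This is only an illustration, not tested against the real files: it assumes each CSV has a header row with columns literally named gender, prison and trade, and the sample rows are made up for the example.

```python
import csv
import io
from collections import Counter
from itertools import combinations

# Assumed header names; adjust to match the real CSV files.
COLUMNS = ("gender", "prison", "trade")


def digest(rows, columns=COLUMNS):
    """Count every combination of values over one, two or all three columns."""
    counts = Counter()
    for row in rows:
        for size in range(1, len(columns) + 1):
            for combo in combinations(columns, size):
                # Key is a tuple of (column, value) pairs, e.g.
                # (("gender", "male"), ("trade", "weaver"))
                key = tuple((col, row[col].strip()) for col in combo)
                counts[key] += 1
    return counts


# Made-up sample data standing in for one of the real files.
SAMPLE = """gender,prison,trade
male,Norwich Castle,weaver
female,Norwich Castle,weaver
male,Fleet,weaver
male,Fleet,baker
"""

counts = digest(csv.DictReader(io.StringIO(SAMPLE)))

# Print one line per combination: the count, then the column=value pairs.
for key, n in sorted(counts.items()):
    print(n, ", ".join(f"{col}={value}" for col, value in key))
```

To run it on a real file, the io.StringIO(SAMPLE) part could be replaced with something like open("debtors.csv", newline=""), where debtors.csv is just a placeholder name.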
On 5 Jan 2018 18:13, "John Levin via GLLUG" <gllug at mailman.lug.org.uk> wrote:
> Dear list,
> I'm having a bad google day, and am not sure what terms to search on, so I
> hope the list will point me in the right direction.
> I have a number of csv files (of c18th imprisoned debtors). There are
> three important columns: gender, prison, trade. What I want is a program or
> script that will simply digest each column and relate them to each other,
> producing something along the lines of:
> There are 200 weavers.
> There are 190 male weavers.
> There are 20 weavers in Norwich Castle.
> There are 18 male weavers in Norwich Castle.
> This strikes me as a very obvious need, but aside from fantastically
> complex apps like SPSS (which, in any case, doesn't seem to have a simple
> way of doing this) I have not found anything that satisfies it.
> Very happy to try writing some bash script to do this, but am not sure
> where to start. CSVkit or suchlike?
> Thanks in advance,
> John Levin