q - Text as Data
q is a command-line tool that allows direct execution of SQL-like queries on CSVs/TSVs (and any other tabular text files).
q treats ordinary files as database tables and supports all SQL constructs, such as WHERE, GROUP BY, and JOINs. It detects column names and types automatically and provides full support for multiple character encodings.
q's web site is http://harelba.github.io/q/. It contains everything you need to download and use q immediately.
Installation
Installation is extremely simple. Instructions for all operating systems are here.
Examples
q "SELECT COUNT(*) FROM ./clicks_file.csv WHERE c3 > 32.3"
ps -ef | q -H "SELECT UID, COUNT(*) cnt FROM - GROUP BY UID ORDER BY cnt DESC LIMIT 3"
Go here for more examples.
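To make the "text as data" idea concrete, here is a minimal Python sketch of what the first example does conceptually: load delimited text into a table, then run SQL against it. This is an illustration only, not q's actual implementation or its Python API. The function name `query_text_as_table`, the table name `data`, and the sample rows are all hypothetical. The column names c1, c2, ... mirror the `c3` reference in the first example, and SQLite's NUMERIC column affinity is used as a crude stand-in for type detection.

```python
import csv
import io
import sqlite3


def query_text_as_table(text, sql, table="data", delimiter=","):
    """Load delimited text into an in-memory SQLite table and run sql on it.

    Hypothetical helper for illustration; assumes every row has the same
    number of fields and that there is no header row.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    # With no header row, name the columns c1, c2, ... by position.
    columns = [f"c{i}" for i in range(1, len(rows[0]) + 1)]
    conn = sqlite3.connect(":memory:")
    try:
        # NUMERIC affinity makes SQLite store number-like text as numbers,
        # so a comparison such as c3 > 32.3 is numeric rather than textual.
        col_defs = ", ".join(f'"{name}" NUMERIC' for name in columns)
        conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Made-up sample data standing in for a file such as clicks_file.csv.
sample = "alice,home,12.5\nbob,cart,40.1\ncarol,cart,33.0\n"
result = query_text_as_table(sample, "SELECT COUNT(*) FROM data WHERE c3 > 32.3")
print(result)
```

The same helper accepts any other query over the loaded rows, for example a GROUP BY on `c2`, which is the kind of aggregation the second example performs on the output of ps.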
Python API
A development branch for exposing q's capabilities as a Python module can be viewed here, along with examples of the alpha version of the API.
Existing functionality as a command-line tool will not be affected by this. Your input will be most appreciated.
Change log
Click here to see the change log.
Contact
Any feedback/suggestions/complaints regarding this tool would be much appreciated. Contributions are most welcome as well, of course.
Harel Ben-Attia, harelba@gmail.com, @harelba on Twitter
q on Twitter: #qtextasdata

