Have you ever wanted to extract only a certain set of lines from a file? Maybe you wanted to get everything from line 400 onwards, or just lines 25 to 50? Well I did. I call the end result ‘splitter’.
Splitter is a command-line program written entirely in Haskell, and I have uploaded it to Hackage. You can find the source code for splitter on BitBucket, along with the source code for ‘range’, the library that I wrote to make the splitter program easier to deal with. The repositories are here:
- Splitter on BitBucket: https://bitbucket.org/robertmassaioli/splitter
- range on BitBucket: https://bitbucket.org/robertmassaioli/range
But words are just words and I really need to show you some examples.
Show me an Example!
For this demo, let's make a file that has twenty lines in it, where each line holds one of the numbers from one to twenty written out as a word, like this one: Twenty Numbers.
If you save that file as ‘twenty.txt’, then the following commands produce the following results. You can get single lines from a file:
    $ cat twenty.txt | splitter 3
    three
    $
You could get an entire range of lines from a file:
    $ cat twenty.txt | splitter 5-9
    five
    six
    seven
    eight
    nine
    $
You could get multiple ranges from the file:
    $ cat twenty.txt | splitter 10-14,2-4
    two
    three
    four
    ten
    eleven
    twelve
    thirteen
    fourteen
    $
You can get ranges that are only bounded on one side:
    $ cat twenty.txt | splitter -5,15-
    one
    two
    three
    four
    five
    fifteen
    sixteen
    seventeen
    eighteen
    nineteen
    twenty
    $
You can invert the selection if you choose to:
    $ cat twenty.txt | splitter -i -5,15-
    six
    seven
    eight
    nine
    ten
    eleven
    twelve
    thirteen
    fourteen
    $
And you can specify an infinite range if you really want to (even though it would be the same as ‘cat’):
    $ cat twenty.txt | splitter *
    one
    two
    three
    four
    five
    six
    seven
    eight
    nine
    ten
    eleven
    twelve
    thirteen
    fourteen
    fifteen
    sixteen
    seventeen
    eighteen
    nineteen
    twenty
    $
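To make the specification format in these examples concrete, here is a minimal Haskell sketch of how a string such as ‘10-14,2-4’ or ‘-5,15-’ could be parsed and applied to a list of lines. It is not splitter's actual parser and it does not use the ‘range’ library: the `LineRange` type and the functions `parseSpec`, `contains` and `selectLines` are names made up for this illustration.

```haskell
import Data.Char (isDigit)

-- Hypothetical range type for this sketch (not the API of the 'range' library).
data LineRange
  = Single Int     -- "3"
  | Span Int Int   -- "5-9"
  | UpTo Int       -- "-5"
  | From Int       -- "15-"
  | Everything     -- "*"
  deriving (Eq, Show)

-- Split a string on commas: "10-14,2-4" becomes ["10-14", "2-4"].
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnComma rest

-- Read a non-empty run of digits as a number.
number :: String -> Maybe Int
number s
  | not (null s) && all isDigit s = Just (read s)
  | otherwise                     = Nothing

-- Parse one comma-free piece of the specification.
parseRange :: String -> Maybe LineRange
parseRange "*" = Just Everything
parseRange s = case break (== '-') s of
  (lo, [])       -> Single <$> number lo
  ("", '-' : hi) -> UpTo <$> number hi
  (lo, "-")      -> From <$> number lo
  (lo, '-' : hi) -> Span <$> number lo <*> number hi
  _              -> Nothing

-- Parse a whole specification; any malformed piece makes the result Nothing.
parseSpec :: String -> Maybe [LineRange]
parseSpec = traverse parseRange . splitOnComma

-- Does a range contain the given 1-based line number?
contains :: LineRange -> Int -> Bool
contains (Single n)   i = i == n
contains (Span lo hi) i = lo <= i && i <= hi
contains (UpTo hi)    i = i <= hi
contains (From lo)    i = lo <= i
contains Everything   _ = True

-- Keep the lines whose number falls in at least one range, in file order.
selectLines :: [LineRange] -> [String] -> [String]
selectLines ranges ls =
  [l | (i, l) <- zip [1 ..] ls, any (`contains` i) ranges]
```

Loaded into GHCi, `parseSpec "-5,15-"` evaluates to `Just [UpTo 5,From 15]`. Because `selectLines` walks the input in order, the ranges ‘10-14,2-4’ yield lines two to four before lines ten to fourteen, matching the output shown above.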
And there are a few more options to choose from, which you can see by running ‘splitter --help’. I would recommend that you have a play around with it yourself. You can install it on any platform that has cabal-install, which is part of the Haskell Platform.
The bottom line is that splitter makes it really easy to extract only certain lines from your files. Its features let you:
- Select any range that you like, whether infinite or fixed.
- Invert your selection so that you get all of the lines that you did NOT specify.
- Print the line numbers alongside the lines from the file.
- See lines as soon as they are ready, which means that you can use splitter on a logfile in the same way that you can use ‘tail -f’.
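As a rough illustration of the last three features (inversion, line numbers, and printing lines as soon as they are ready), here is a minimal Haskell sketch. It is an assumption about how such a filter could be written, not splitter's actual implementation: `numberedSelection`, `render` and `streamLines` are hypothetical names, and the ‘number: line’ output format is made up.

```haskell
import System.IO (BufferMode (LineBuffering), hSetBuffering, stdout)

-- Pair each line with its 1-based number and keep those the predicate accepts.
-- When 'invert' is True the complement is kept instead.
-- The result is built lazily, so it also works on input that never ends.
numberedSelection :: Bool -> (Int -> Bool) -> [String] -> [(Int, String)]
numberedSelection invert wanted ls =
  [(i, l) | (i, l) <- zip [1 ..] ls, wanted i /= invert]

-- Render a line with its number in front (this format is made up).
render :: (Int, String) -> String
render (i, l) = show i ++ ": " ++ l

-- Filter stdin to stdout. 'getContents' reads lazily and stdout is flushed
-- after every line, so each selected line is printed once it has been read.
streamLines :: Bool -> (Int -> Bool) -> IO ()
streamLines invert wanted = do
  hSetBuffering stdout LineBuffering
  contents <- getContents
  mapM_ (putStrLn . render) (numberedSelection invert wanted (lines contents))
```

To try it, define something like `main = streamLines True (\i -> i <= 5 || i >= 15)` and pipe a file into the program. The design choice worth noting is that nothing here needs the total number of lines in advance, which is what makes a ‘tail -f’ style of use possible.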
I have tried to make it a highly useful and focussed tool for getting certain lines from files, using an easy-to-understand format to specify which lines you want. For more detailed information you should check out the README file on BitBucket. It is perhaps the most comprehensive and up-to-date resource on how to use the splitter tool.
Extra: Range code huh? That sounds useful.
While I was writing this I did indeed look around for Range libraries that would meet my criteria. I discovered the following:
- A nice-looking package that has been marked as Obsolete by the author. I did not want to be stuck on an obsolete version of code that would not be updated. Also, this library cannot handle infinite ranges.
- A nice library that makes good use of Haskell classes, but it does not support infinite ranges either and thus was not suitable for this project.
So, getting excited and wanting to start from scratch, I wrote my own library called range, which I have now placed on Hackage. Please feel free to use it for your own purposes, and I will happily accept pull requests on that work.