Text File Manipulation in Windows using GNU Utilities – head, tail, split


I often need to work with large text files, and opening a huge file can choke even the best text editors. Most of the time I only need a sample of the file to get a picture of what is happening, and these are my go-to utilities.

I’m a big fan of Linux and GNU utilities, but as a practical matter I use Windows as my primary workstation. One of my first installs on a new build is the GnuWin32 package, available at http://gnuwin32.sourceforge.net/. The library is fairly small and lean, and doesn’t require a complex environment.

Three useful utilities in the coreutils package are head, tail, and split.

Usage for head and tail is simple:

head -n 50 input.txt

Optionally send the output to a new file:

head -n 50 input.txt > 50lines.txt

tail works the same way. The -n option sets how many lines to grab: the first n lines for head, the last n lines for tail. The default value for n is 10.
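Here is a minimal sketch of both commands side by side, assuming a POSIX-style shell with GNU coreutils on the path. The file names (sample.txt, first5.txt, and so on) are made up for illustration, and seq and sed are used only to generate sample data. The last command shows one way to pull a slice from the middle of a file by chaining head into tail.

```shell
# Build a small sample file with 100 numbered lines (a stand-in for a large file).
seq 1 100 | sed 's/^/line /' > sample.txt

# First 5 lines and last 5 lines.
head -n 5 sample.txt > first5.txt
tail -n 5 sample.txt > last5.txt

# With no -n option, both commands default to 10 lines.
head sample.txt > head_default.txt
tail sample.txt > tail_default.txt

# Lines 41-50: take the first 50 lines, then keep the last 10 of those.
head -n 50 sample.txt | tail -n 10 > middle.txt
```

The same pipe trick works on the real file: head limits how far into the file you read, and tail trims that down to the window you care about.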

split is a little more complicated. Here’s the usage:

Usage: split [OPTION] [INPUT [PREFIX]]
Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
PREFIX is 'x'.  With no INPUT, or when INPUT is -, read standard input.

  -b, --bytes=SIZE        put SIZE bytes per output file
  -C, --line-bytes=SIZE   put at most SIZE bytes of lines per output file
  -l, --lines=NUMBER      put NUMBER lines per output file
  -NUMBER                 same as -l NUMBER
      --verbose           print a diagnostic to standard error just
                            before each output file is opened
      --help              display this help and exit
      --version           output version information and exit

SIZE may have a multiplier suffix: b for 512, k for 1K, m for 1 Meg.

Report bugs to .

My task for split was to take a large file and break it up into files no larger than 8 MB each. The command I used:

split -C8m input.txt split

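That produces files no larger than 8 MB each, named with the prefix split (splitaa, splitab, and so on). The best part is that -C keeps lines together: as long as no single line is longer than the size limit, each line stays whole in one output file instead of being cut at the byte boundary.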
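To see the line-preserving behavior on a small scale, here is a rough sketch that assumes a POSIX-style shell with GNU coreutils. The file names and the piece_ prefix are invented for the example, and the 256-byte limit stands in for 8m so the pieces are easy to inspect.

```shell
# Clear pieces left over from any earlier run.
rm -f piece_*

# Build a sample file of 200 short lines (a stand-in for a much larger file).
seq 1 200 | sed 's/^/record number /' > sample_split.txt

# Split into pieces holding at most 256 bytes of whole lines, prefixed "piece_".
split -C 256 sample_split.txt piece_

# Show the resulting pieces and their sizes in bytes.
wc -c piece_*

# Concatenating the pieces in name order should reproduce the original file.
cat piece_* > rejoined.txt
cmp sample_split.txt rejoined.txt && echo "pieces rejoin cleanly"
```

Because the suffixes are assigned in alphabetical order (aa, ab, ac, ...), a plain wildcard is enough to put the pieces back together in the right sequence.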
