
split: To split a file based on line or byte size

The command "split" can be used to split a file into parts or pieces of various sizes based on byte size or number of lines.
Let us say we have a file, temp, with the following contents:

temp:
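As a stand-in for the sample file, here is a minimal sketch that builds a small temp file. The contents (ten lines of the form "line 001") are made up for illustration; they are chosen so that each line is 9 bytes long including the newline.

```shell
# Work in a scratch directory so that no existing files are overwritten.
cd "$(mktemp -d)"

# Hypothetical sample data: printf reuses the format for every argument,
# so this writes ten lines, "line 001" through "line 010".
printf 'line %03d\n' 1 2 3 4 5 6 7 8 9 10 > temp

cat temp      # show the contents
wc -l temp    # number of lines
wc -c temp    # number of bytes
```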



We can use the command split to create new files that each hold a part of the above file. By default, split breaks a file into pieces of 1000 lines each, i.e. one new file for every 1000 lines of data in the original file.

The new files created by split are named by default with the prefix "x" followed by a two-letter suffix: xaa, xab, and so on.

example:
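A minimal sketch of the default behaviour, assuming a small made-up temp file of ten short lines (the sample data is an assumption, not the original file):

```shell
# Work in a scratch directory and create the hypothetical sample file.
cd "$(mktemp -d)"
printf 'line %03d\n' 1 2 3 4 5 6 7 8 9 10 > temp

# No options: default chunk size and default prefix "x".
split temp

# temp is small, so everything fits in a single output file.
ls            # lists temp and xaa
cmp temp xaa  # silent: xaa is an exact copy of temp
```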



We can see that along with temp, we have the new file xaa. As the file temp does not have more than 1000 lines of data, it was not broken up into multiple files.

We can split by size instead, and set the number of bytes per file, by using the option -b.

Example:
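A sketch of splitting by bytes, assuming a made-up 90-byte temp file (ten lines of 9 bytes each, chosen for illustration):

```shell
# Work in a scratch directory and create the hypothetical sample file.
cd "$(mktemp -d)"
printf 'line %03d\n' 1 2 3 4 5 6 7 8 9 10 > temp

# Write at most 10 bytes into each output file.
split -b 10 temp

# 90 bytes / 10 bytes per file = 9 pieces, xaa through xai.
ls
```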



We can see that there are 9 files created, each having 10 bytes of data from the original file temp.
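To see where a byte-based cut falls, we can look inside the first piece. This is a sketch on the same kind of made-up temp file (9-byte lines, an assumption for illustration):

```shell
# Work in a scratch directory and create the hypothetical sample file.
cd "$(mktemp -d)"
printf 'line %03d\n' 1 2 3 4 5 6 7 8 9 10 > temp

split -b 10 temp

wc -c xaa   # 10 bytes
cat xaa     # the whole first line, then only the "l" of the second line
```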



When byte size is the limiting factor, we cannot control where within a line the split falls. As we can see in the above example, file xaa has the first line and one character of the next line.

To make sure that the splitting happens at exact line boundaries we can use the option -l along with the number of lines per file.

example:
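A sketch of splitting by lines, again assuming a made-up temp file of ten lines:

```shell
# Work in a scratch directory and create the hypothetical sample file.
cd "$(mktemp -d)"
printf 'line %03d\n' 1 2 3 4 5 6 7 8 9 10 > temp

# Put exactly one line of temp into each output file.
split -l 1 temp

ls          # temp plus ten pieces, xaa through xaj
cat xab     # the second line of temp, complete
```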



Each file created will have exactly one line of the file "temp".



If we don't want letters after "x" in the names of the files, we can get numeric suffixes instead by using the option -d.
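A sketch of numeric suffixes, assuming GNU split (the -d option is not part of POSIX) and a made-up ten-line temp file. The -l 5 is added here only so that more than one piece is produced:

```shell
# Work in a scratch directory and create the hypothetical sample file.
cd "$(mktemp -d)"
printf 'line %03d\n' 1 2 3 4 5 6 7 8 9 10 > temp

# Five lines per file, numeric suffixes instead of letters.
split -d -l 5 temp

ls          # temp, x00 and x01
```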



The prefix "x" can also be changed to whatever prefix we need by passing the prefix after the file name.

example:
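A sketch of a custom prefix, assuming a made-up ten-line temp file. The -l 5 is an addition for illustration, so that two pieces are produced:

```shell
# Work in a scratch directory and create the hypothetical sample file.
cd "$(mktemp -d)"
printf 'line %03d\n' 1 2 3 4 5 6 7 8 9 10 > temp

# The last argument replaces the default prefix "x".
split -l 5 temp file

ls          # fileaa, fileab and temp
```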



We can see that passing the string "file" after the filename "temp" creates the new files with the prefix "file".
