Flat File Splitter

At work the other day, I needed to write a program to split a text file (.txt) into smaller pieces. The file we received from our client was a flat file of about 2.5 GB, and I needed to import its contents into our SQL Server database. Since the file was pipe-delimited with 33 columns of data elements, simply using the import feature of SQL Server should have done the job. But the import failed: some faulty rows didn’t match the required format of 33 columns. I needed to open the file and see where the issue was, but because of its huge size, the text editor (Notepad) couldn’t open it. So I wrote the following code.
using System;
using System.Collections.Generic;
using System.Configuration;
using System.IO;
using System.Linq;
using System.Text;
namespace FileSplitter
{   
    class Program
    {
        static void Main(string[] args)
        {
            FileSplitter();
            Console.WriteLine("Program completed");
            // Keep the console window open until the user presses Enter.
            Console.ReadLine();
        }       
        /// <summary>
        /// Reads the specified input file one line at a time and writes output file(s),
        /// each holding at most the configured number of lines.
        /// </summary>
        public static void FileSplitter()
        {
            int counter = 0;     // lines written to the current output file
            int fileCursor = 0;  // lines written to all completed output files
            int fileLengthLimit = int.Parse(ConfigurationManager.AppSettings["lineLimit"]);

            string filePath = ConfigurationManager.AppSettings["input"];
            string outputDir = ConfigurationManager.AppSettings["outputDir"];

            StreamWriter output = null;
            try
            {
                // StreamReader reads the input file one line at a time.
                using (var file = new StreamReader(filePath))
                {
                    string line;
                    while ((line = file.ReadLine()) != null)
                    {
                        if (output == null)
                        {
                            // StreamWriter creates the next output file, named after the line range it holds.
                            string name = string.Format("{0}-{1}", fileCursor + 1, fileCursor + fileLengthLimit);
                            output = new StreamWriter(Path.Combine(outputDir, name));
                        }
                        output.WriteLine(line);
                        counter++;
                        if (counter == fileLengthLimit)
                        {
                            output.Close();
                            output = null;
                            fileCursor += counter;
                            Console.WriteLine("finished processing line: " + fileCursor);
                            counter = 0;
                        }
                    }
                }
            }
            finally
            {
                // Close the last, partially filled output file, if any.
                if (output != null)
                {
                    output.Close();
                }
            }
        }
    }   
}
App.config
<?xml version="1.0"?>
<configuration>
  <appSettings>
    <add key="input" value=""/>
    <add key="outputDir" value=""/>
    <add key="lineLimit" value=""/>
  </appSettings>
</configuration>
The code is pretty straightforward. The static method FileSplitter() opens the specified text file and reads it one line at a time, writing output files with the desired number of lines (the line limit) per file. The input file path, output directory, and line limit must be provided in the config file.
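For illustration, a filled-in config might look something like the sketch below. The two paths are hypothetical placeholders, not the locations actually used, and the line limit is only an example value.

```xml
<?xml version="1.0"?>
<configuration>
  <appSettings>
    <!-- Hypothetical example values; replace with your own paths and limit. -->
    <add key="input" value="C:\data\client_file.txt"/>
    <add key="outputDir" value="C:\data\split"/>
    <add key="lineLimit" value="10000"/>
  </appSettings>
</configuration>
```

The output directory has to exist before the program runs, because StreamWriter creates files but not missing directories.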
I think I used 10000 lines per text file and got about 1200 output files. I forgot the exact numbers. Thankfully, the error was in the last record, i.e. the last line of the file: it had only six elements instead of 33. Since I looked at the last output file first, I didn’t need to dig into more than one file.
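A more direct check is also possible. What follows is only a minimal sketch, not the approach described in this post: scan the file once and report every line whose number of pipe-delimited fields is not 33. The names ColumnChecker and FindBadLines are hypothetical, the input path is assumed to come from the first command-line argument, and the sketch assumes no field contains a '|' character itself.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ColumnChecker
{
    // Returns the 1-based numbers of the lines whose pipe-delimited field
    // count differs from expectedColumns.
    // Assumption: fields never contain an escaped or quoted '|'.
    public static List<long> FindBadLines(TextReader reader, int expectedColumns)
    {
        var badLines = new List<long>();
        long lineNumber = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lineNumber++;
            if (line.Split('|').Length != expectedColumns)
            {
                badLines.Add(lineNumber);
            }
        }
        return badLines;
    }
}

static class Program
{
    static void Main(string[] args)
    {
        // args[0]: path to the flat file (hypothetical usage);
        // 33 is the expected column count mentioned in this post.
        using (var reader = new StreamReader(args[0]))
        {
            foreach (long n in ColumnChecker.FindBadLines(reader, 33))
            {
                Console.WriteLine("Line " + n + " does not have 33 columns");
            }
        }
    }
}
```

Like the splitter, this reads one line at a time, so memory use stays small even for a multi-gigabyte file.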