  1. #1
    Join Date
    Jul 2011
    Location
    .Net V4.0
    Posts
    7

    [RESOLVED] System.outofmemoryexception, how to handle

    Hello,

    Below is code that compares 122 text files (there are two copies of each file: 61 old copies and 61 new copies). These are data files from a database and can be extremely large. I am having a problem when I hit my largest file (471,483 KB, and it grows every day depending on what is added to it). In my last test run the two files were 471,483 KB and 485,359 KB. I compare these files to extract only the new data to a new text file that I write in another directory. Is there a way I can free up memory to handle this exception? Thanks in advance.
    Code:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;
    
    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
    
                Console.WriteLine("Comparing text files to find new data...");
    
                //these are the folders that have the data
                // i perform two loops to go through both of them
                string folderUsing;
    
                try
                {
                    //to iterate over both folders
                    for (int i = 1; i <= 2; i++)
                    {
                        if (i == 1)
                        {
                            folderUsing = "dataset1";
                        }
                        else
                        {
                            folderUsing = "dataset2";
                        }
    
                        Console.WriteLine();
                        Console.WriteLine("Using data from {0}", folderUsing);
    
                        //New appended data location, this folder will always be there
                        string folderSending = "C:\\Reporting\\" + folderUsing + "\\Uploads\\";
    
                        //New Data location, this folder will always be there
                        string folderNewData = "C:\\Reporting\\" + folderUsing + "\\NewData";
    
                        //Old files get moved here, need to create this folder if not one already
                        //this acts as my backup
                        string folderArchive = "C:\\Reporting\\" + folderUsing + "\\" + DateTime.Now.ToString("yyyyMMdd") + "\\";
    
                        //Prior files location, this folder will always be there
                        string folderPrior = "C:\\Reporting\\" + folderUsing + "\\";
    
                        //Create the archive folder if it does not exist
                        if (!Directory.Exists(folderArchive))
                        {
                            Directory.CreateDirectory(folderArchive);
                        }
    
                        //Prelim test, make sure all files in new dir are in old dir
                        //if there is a new file move it into the new directory immediately
                        string[] fileEntriesNew = Directory.GetFiles(folderNewData);
                        foreach (string fileName in fileEntriesNew)
                        {
    
                            string newFileName = Path.GetFileName(fileName);
                            FileInfo newFileInfo = new FileInfo(fileName);
                            FileInfo oldFileInfo = new FileInfo(folderPrior + newFileName);
    
                            //Check old dir name for similar file, if it does not exist, copy entire file over
                            if (!File.Exists(folderPrior + newFileName))
                            {
                                Console.WriteLine("{0} does not exist.", folderPrior + newFileName);
                                Console.WriteLine("{0} will be copied over.", fileName);
                                Console.WriteLine("Press enter to continue");
                                Console.ReadLine();
                                //File.Move needs a destination file name, not just a folder
                                File.Move(fileName, folderSending + newFileName);
                            }
                            else
                            {
                                Console.WriteLine();
                                Console.WriteLine("Comparing files {0}, {1} KB", newFileName, newFileInfo.Length / 1024);
    
                                //the new file should never be less than the old file size
                                if (newFileInfo.Length < oldFileInfo.Length)
                                {
                                    Console.WriteLine("Possible error: The new file is smaller than the old file");
                                    Console.WriteLine("Press enter to continue");
                                    Console.ReadLine();
                                }
    
                                //if there are two similar files open them both
                                if (File.Exists(folderPrior + newFileName))
                                {
    
                                    Console.WriteLine("Reading files");
                                    //IEnumerable data sources (arrays), this will get all lines in newText not in oldText
                                    string[] newText = File.ReadAllLines(fileName);
                                    string[] oldText = File.ReadAllLines(folderPrior + newFileName);
    
                                    //The query based on the data sources
                                    IEnumerable<string> differenceQuery = newText.Except(oldText);
    
                                    //Any will get any differences and I do not want the first line to be blank
                                    if (differenceQuery.Any() && differenceQuery.First() != "")
                                    {
    
                                        Console.WriteLine("Outputting new lines");
    
                                        using (StreamWriter fsSending = new StreamWriter(folderSending + newFileName, true))
                                        {
    
                                            foreach (string newLine in differenceQuery)
                                            {
                                                fsSending.WriteLine(newLine);
                                            }
    
                                        }
                                    }
    
                                }
                            }
    
                        }
    
                        //this will move new files in to the old dir and old files into the archive dir
                        Console.WriteLine();
                        Console.WriteLine("Moving files to appropriate destination");
    
                        string[] fileEntriesOld = Directory.GetFiles(folderNewData);
                        foreach (string fileName in fileEntriesOld)
                        {
                            string fileMove = Path.GetFileName(fileName);
                            File.Move(folderPrior + fileMove, folderArchive + fileMove);
                            File.Move(fileName, folderPrior + fileMove);
                        }
    
                    }
    
                }
    
                catch (Exception e)
                {
                    Console.WriteLine("The process failed: {0}", e.ToString());
                    throw;
                }
                finally
                {
                    Console.WriteLine();
                    Console.WriteLine("Program finished");
                }
    
            }
        }
    }

  2. #2
    Join Date
    Dec 2011
    Posts
    61

    Re: System.outofmemoryexception, how to handle

    These two statements will eat up your memory if your files are very big, because each one loads an entire file into a string array:

    string[] newText = File.ReadAllLines(fileName);
    string[] oldText = File.ReadAllLines(folderPrior + newFileName);

    You can use the ReadLine method of StreamReader to compare the text line by line.

  3. #3
    Join Date
    Jul 2011
    Location
    .Net V4.0
    Posts
    7

    Re: System.outofmemoryexception, how to handle

    Would I still be able to use the IEnumerable? Initially I did use your suggested method: I would open the old file and read a line, then open the new file and go through the entire file to find a matching line. If it was not there, I would use StreamWriter to write it to a new file; if a match was found in the old file, I would go to the next line in the new file, close the old file and continue the process. However, this process was taking far too long, as you can imagine.
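    A single-pass alternative to the nested scan described above can be sketched as follows. This is only a sketch, not code from the thread: `LineDiff` and `WriteNewLines` are hypothetical names, and it assumes that the distinct lines of the old file fit in memory even when the whole file does not. The old file is read once into a HashSet, and the new file is then streamed with File.ReadLines.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LineDiff
{
    // Hypothetical helper: appends every line of newPath that does not occur
    // anywhere in oldPath to outPath and returns how many lines were written.
    public static int WriteNewLines(string oldPath, string newPath, string outPath)
    {
        // Only the distinct lines of the old file are kept in memory.
        var seen = new HashSet<string>(File.ReadLines(oldPath), StringComparer.Ordinal);
        int written = 0;

        using (var writer = new StreamWriter(outPath, true))
        {
            // The new file is streamed one line at a time.
            foreach (string line in File.ReadLines(newPath))
            {
                // Add returns false for a line that is already in the set,
                // so old lines and repeated new lines are both skipped.
                if (seen.Add(line))
                {
                    writer.WriteLine(line);
                    written++;
                }
            }
        }
        return written;
    }
}
```

    Each lookup in the set takes roughly constant time, so the work grows with the combined size of the two files rather than with their product, which is what made the reopen-and-rescan approach so slow.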

  4. #4
    Join Date
    Dec 2011
    Posts
    61

    Re: System.outofmemoryexception, how to handle

    Then you need to figure out the maximum file size your system will accept, and use a byte[] buffer to hold one part of the text file at a time: read a part, process it, clear the buffer, then read in the next part of the file.

  5. #5
    Join Date
    May 2007
    Posts
    1,546

    Re: System.outofmemoryexception, how to handle

    Or use:

    Code:
    foreach (var line in File.ReadLines (path_to_file))
        Process (line);
    or

    Code:
    using (var stream = new StreamReader (path_to_file)) {
        string line = null;
        while ((line = stream.ReadLine ()) != null)
            Process (line);
    }
    www.monotorrent.com For all your .NET bittorrent needs

    NOTE: My code snippets are just snippets. They demonstrate an idea which can be adapted by you to solve your problem. They are not 100% complete and fully functional solutions equipped with error handling.

  6. #6
    Join Date
    Jul 2011
    Location
    .Net V4.0
    Posts
    7

    Re: System.outofmemoryexception, how to handle

    I will revisit my code and let you know what I come up with. Thanks for the help!

  7. #7
    Join Date
    Jul 2011
    Location
    .Net V4.0
    Posts
    7

    Re: [RESOLVED] System.outofmemoryexception, how to handle

    Will the code below find the differences between the two text files regardless of where the lines are? (For example, is this code just comparing line 1 with line 1, line 2 with line 2, etc.?)

    Code:
    var differenceQuery = File.ReadLines(fileName).Except(File.ReadLines(folderPrior + newFileName));

    Console.WriteLine("Outputting new lines");

    using (StreamWriter fsSending = new StreamWriter(folderSending + newFileName, true))
    {
        foreach (string newLine in differenceQuery)
        {
            fsSending.WriteLine(newLine);
        }
    }

  8. #8
    Join Date
    May 2007
    Posts
    1,546

    Re: [RESOLVED] System.outofmemoryexception, how to handle

    This query will work: Except is a set difference, so it finds the new lines regardless of where they sit in the file. However, it can hit memory issues similar to your original approach. Except has to load every distinct line of the file passed as its argument (the old file) into memory before it returns anything, and it also remembers every distinct line it has already returned. If your files are massive (which they are), you are quite likely to need custom logic to do the checking and comparing. Is every line likely to be unique in your text file, or does it contain about 100 unique lines which are just repeated a lot?
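    To illustrate how Except treats its inputs, here is a small in-memory sketch. `ExceptDemo` and `NewLines` are hypothetical names used only for this example; the point is that the comparison ignores line positions and that a line repeated in the new data is returned only once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ExceptDemo
{
    // Returns the distinct lines of newLines that do not occur in oldLines.
    // The position of a line in either sequence does not matter.
    public static List<string> NewLines(IEnumerable<string> newLines, IEnumerable<string> oldLines)
    {
        return newLines.Except(oldLines).ToList();
    }
}
```

    For example, calling `ExceptDemo.NewLines` with the new lines gamma, delta, alpha, delta, beta and the old lines alpha, beta, gamma returns a single entry, delta, even though the shared lines appear in a different order and delta occurs twice.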

  9. #9
    Join Date
    Jul 2011
    Location
    .Net V4.0
    Posts
    7

    Re: [RESOLVED] System.outofmemoryexception, how to handle

    There are a lot of repeating lines, yes. The new data that is added to the second file depends on how much the user appends to the end of the file or changes in lines within the file. You are right that the last bit of code eats my memory, but for some reason it did not give me the exception; it just froze my computer. For example, I had an old file of 64,141 KB and a new file of 66,885 KB, and the new text file I created was only 2,974 KB. Will hashing the files first help?
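    One way to read the hashing idea is to hash each line rather than each file, since a whole-file hash only says whether two files differ, not which lines are new. The sketch below is an assumption about what that could look like, not a solution given in this thread: `HashedLineDiff`, `Fingerprint` and `WriteNewLines` are hypothetical names. It keeps an 8-byte fingerprint per distinct line (the first 8 bytes of its SHA-256 digest) instead of the line itself.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;
using System.Text;

static class HashedLineDiff
{
    // Hypothetical helper: reduces a line to the first 8 bytes of its SHA-256 digest.
    static ulong Fingerprint(HashAlgorithm sha, string line)
    {
        byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(line));
        return BitConverter.ToUInt64(digest, 0);
    }

    // Appends the lines of newPath whose fingerprint was not seen in oldPath
    // to outPath and returns how many lines were written.
    public static int WriteNewLines(string oldPath, string newPath, string outPath)
    {
        var seen = new HashSet<ulong>();
        int written = 0;

        using (var sha = SHA256.Create())
        {
            // Pass 1: remember a fingerprint for every line of the old file.
            foreach (string line in File.ReadLines(oldPath))
                seen.Add(Fingerprint(sha, line));

            // Pass 2: stream the new file and keep lines with an unseen fingerprint.
            using (var writer = new StreamWriter(outPath, true))
            {
                foreach (string line in File.ReadLines(newPath))
                {
                    if (seen.Add(Fingerprint(sha, line)))
                    {
                        writer.WriteLine(line);
                        written++;
                    }
                }
            }
        }
        return written;
    }
}
```

    The trade-off is that two different lines could in principle share a fingerprint, in which case a genuinely new line would be dropped silently. With 64-bit fingerprints that is very unlikely for files of this size, but it is not impossible, and hashing every line costs extra CPU time.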
    Last edited by dssrun; January 10th, 2012 at 09:54 AM.

  10. #10
    Join Date
    Jul 2011
    Location
    .Net V4.0
    Posts
    7

    Re: [RESOLVED] System.outofmemoryexception, how to handle

    Does it help that these two files are really data sets (comma-separated files stored in different formats)? Would creating two data sets and merging them to find the differences work? If so, how would I go about this?
