Results 1 to 15 of 23
  1. #1
    Join Date
    Mar 2007
    Location
    Argentina
    Posts
    579

    large strings memory footprint

    I have an application that reads a 70 MB text file (compressed into a 5 MB file).

    The problem is that it initially takes 256 MB to read the file with this code:
    Code:
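        ' Requires Imports System.IO and Imports System.IO.Compression at the top of the file.
        Public Function CompressedFileToStringArray(ByVal FilePath As String) As String()
            Try
                ' The Using blocks close all three streams, even if reading fails.
                Using fs As FileStream = File.OpenRead(FilePath)
                    Using GZfs As New GZipStream(fs, CompressionMode.Decompress, False)
                        Using sr As New StreamReader(GZfs)
                            ' Decompress the whole file into one string, then split it into lines.
                            Return sr.ReadToEnd().Split(vbCrLf.ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
                        End Using
                    End Using
                End Using
            Catch ex As Exception
                ' Keep the original error as the InnerException.
                Throw New Exception( _
                      "Error in Function LíneasDeArchivo" & vbCrLf & _
                       "could not open:" & vbCrLf & _
                       FilePath, ex)
            End Try
        End Function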
    Later, it grows to 720 MB!

    Any ideas on how to reduce the memory footprint?
    [Vb.NET 2008 (ex Express)]

  2. #2
    Join Date
    Mar 2002
    Location
    St. Petersburg, Florida, USA
    Posts
    12,125

    Re: large strings memory footprint

    This is a perfect example of the reasons to PRE-allocate memory...

    Most growable collections (and the arrays behind them) use a "doubling" approach to memory allocation, as this has proven to be the most useful and efficient in the general case.

    This means that if you have a collection allocated for 50 items and add a 51st, the allocated size becomes 100. When you add a 101st, it becomes 200.

    This is for each individual collection.
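
    The doubling described above can be observed with a small sketch. It assumes List(Of String) as the collection; the module and function names (CapacityDemo, CapacityAfter) are made up for illustration, and the exact growth policy is an implementation detail of the framework rather than a documented guarantee.

```vbnet
Imports System
Imports System.Collections.Generic

Module CapacityDemo
    ' Adds itemCount strings to a list created with initialCapacity
    ' and returns the capacity of its backing array afterwards.
    Public Function CapacityAfter(ByVal initialCapacity As Integer, ByVal itemCount As Integer) As Integer
        Dim items As New List(Of String)(initialCapacity)
        For i As Integer = 1 To itemCount
            items.Add("x")
        Next
        Return items.Capacity
    End Function

    Sub Main()
        ' Report the capacity right at, and just past, each growth step.
        For Each count As Integer In New Integer() {50, 51, 100, 101}
            Console.WriteLine("Count={0} Capacity={1}", count, CapacityAfter(50, count))
        Next
    End Sub
End Module
```

    If the number of items is known (or can be estimated) beforehand, passing it to the constructor avoids the intermediate arrays that each growth step leaves behind as garbage.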

    -----

    Now, if you are talking about the aggregate memory footprint, you need to be very careful...

    The framework allocates memory from the OS as required. Provided it is getting memory without issues, it will continue (within certain bounds) to keep growing.

    This is NOT a problem. Generally speaking, the most efficient use of resources is when the resources are being used. There is no reason to keep "Free" memory.

    As memory pressure increases, the GC will begin to run and make memory available WITHIN the process for future allocations. This does NOT mean that the memory will be returned to the operating system.

    ------

    Bottom line. You need to look at HOW the memory is being used, and you need to look at the entire system environment.
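
    One way to look at how the memory is being used is to compare what the managed heap holds with what the process has taken from the OS. The following is only a sketch: the module and function names are invented for illustration, and which of these numbers Task Manager displays depends on the Windows version and the column selected.

```vbnet
Imports System
Imports System.Diagnostics

Module MemoryReport
    ' Bytes currently allocated on the managed heap (no collection is forced).
    Public Function ManagedBytes() As Long
        Return GC.GetTotalMemory(False)
    End Function

    ' Physical memory currently assigned to the process.
    Public Function WorkingSetBytes() As Long
        Using p As Process = Process.GetCurrentProcess()
            Return p.WorkingSet64
        End Using
    End Function

    ' Memory committed to the process that cannot be shared with other processes.
    Public Function PrivateBytes() As Long
        Using p As Process = Process.GetCurrentProcess()
            Return p.PrivateMemorySize64
        End Using
    End Function

    Sub Main()
        Console.WriteLine("managed heap : {0:N0} bytes", ManagedBytes())
        Console.WriteLine("working set  : {0:N0} bytes", WorkingSetBytes())
        Console.WriteLine("private bytes: {0:N0} bytes", PrivateBytes())
    End Sub
End Module
```

    Calling this before and after the file is loaded shows whether the growth comes from live managed objects or from memory that the process is merely holding on to.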
    TheCPUWizard is a registered trademark, all rights reserved. (If this post was helpful, please RATE it!)
    2008, 2009,2010
    In theory, there is no difference between theory and practice; in practice there is.

    * Join the fight, refuse to respond to posts that contain code outside of [code] ... [/code] tags. See here for instructions
    * How NOT to post a question here
    * Of course you read this carefully before you posted
    * Need homework help? Read this first

  3. #3
    Join Date
    Mar 2007
    Location
    Argentina
    Posts
    579

    Re: large strings memory footprint

    Quote Originally Posted by TheCPUWizard View Post
    This is a perfect example of the reasons to PRE-allocate memory...

    Most of the collections/arrays use a "doubling" approach to memory allocation, as this has proven to be most useful and efficient in the general case.
    That is probably the reason why Notepad reserves 150 MB to read the 70 MB file.
    Quote Originally Posted by TheCPUWizard View Post
    This means that if you have a collection allocated for 50 items, and add a 51st, the allocated size becomes 100. When you add a 101st it becomes 200.

    This is for each individual collection.
    The problem is that I tried to copy each string (one by one) into another array of strings. Then I set OriginalString = Nothing and even called GC.Collect, but it did nothing. The memory usage remains the same.

    Probably each string allocates much more memory than needed.

  4. #4
    Join Date
    Mar 2002
    Location
    St. Petersburg, Florida, USA
    Posts
    12,125

    Re: large strings memory footprint

    Quote Originally Posted by Marraco View Post
    That is probably the reason why Notepad reserves 150 MB to read the 70 MB file.
    The problem is that I tried to copy each string (one by one) into another array of strings. Then I set OriginalString = Nothing and even called GC.Collect, but it did nothing. The memory usage remains the same.

    Probably each string allocates much more memory than needed.
    Go back and REREAD my post. Go read the DOCUMENTATION.

    Just because you are no longer referencing an object (setting something to Nothing when it is about to go out of scope is BAD practice) does NOT mean that the memory will be returned to the operating system, nor will you see a reduction in memory "usage".

    This expectation is one of the clearest indicators that a person does not understand the very fundamentals of using .NET (this is not VB.NET specific).

    Consider (a sketch; Item stands for any class with a DoSomething method):
    Code:
    For num As Integer = 1 To 1000000
        ' A new Item is created on every pass; the previous one becomes garbage.
        Dim item As New Item()
        item.DoSomething()
    Next num
    Clearly there will only be one "Item" in use at any time (a total of one million will be created).

    IF the computer system has sufficient memory to allocate all one million (i.e. GC does not run for the duration of the loop), there is NO problem.

    IF GC (explicitly or implicitly) runs after the loop, the process will have already allocated significant memory from the OS. But there is no prima facie reason for the process to return the memory to the OS.

    Therefore a perfectly valid memory profile would be a rapid increase during the running of the loop, and the memory footprint of the process NEVER going down.
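
    The loop above can be turned into something measurable. This is a hedged sketch: the Item class with its 256-byte buffer is a placeholder invented here, and how many collections actually run depends on the machine and the runtime.

```vbnet
Imports System

Module LoopDemo
    ' Placeholder for the Item class used in the loop above.
    Public Class Item
        Private buffer(255) As Byte

        Public Sub DoSomething()
            buffer(0) = 1
        End Sub
    End Class

    ' Runs the loop and returns how many generation 0 collections
    ' the runtime performed while it was running.
    Public Function RunLoop(ByVal iterations As Integer) As Integer
        Dim before As Integer = GC.CollectionCount(0)
        For num As Integer = 1 To iterations
            Dim item As New Item()
            item.DoSomething()
        Next num
        Return GC.CollectionCount(0) - before
    End Function

    Sub Main()
        Dim heapBefore As Long = GC.GetTotalMemory(False)
        Dim collections As Integer = RunLoop(1000000)
        Dim heapAfter As Long = GC.GetTotalMemory(False)
        Console.WriteLine("gen 0 collections during loop: {0}", collections)
        Console.WriteLine("managed heap before: {0:N0} bytes", heapBefore)
        Console.WriteLine("managed heap after : {0:N0} bytes", heapAfter)
    End Sub
End Module
```

    The managed heap figures describe what the GC is tracking inside the process; they say nothing about whether the process has handed any memory back to the OS.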

  5. #5
    Join Date
    Mar 2007
    Location
    Argentina
    Posts
    579

    Re: large strings memory footprint

    Quote Originally Posted by TheCPUWizard View Post
    This is a perfect example of the reasons to PRE-allocate memory...
    As I understand it, you suggest declaring an array of fixed length (of fixed-length strings) instead of Dim Answer() As String. But I don't know the length of the strings beforehand.
    Otherwise, I don't understand how I can preallocate memory. There is no malloc() in VB.NET (or maybe I am wrong?).
    (Also, I suspect that declaring Answer another way would prevent the use of this line:
    Code:
    Answer = sr.ReadToEnd.Split(vbCrLf.ToCharArray, StringSplitOptions.RemoveEmptyEntries)
    )
    Quote Originally Posted by TheCPUWizard View Post
    Most of the collections/arrays use a "doubling" approach to memory allocation, as this has proven to be most useful and efficient in the general case.

    This means that if you have a collection allocated for 50 items, and add a 51st, the allocated size becomes 100. When you add a 101st it becomes 200.

    This is for each individual collection.
    I think that I understand it.

    I think you mean that sr.ReadToEnd.Split probably reserves a power-of-2 amount of memory (or something like that), and that an array is internally a collection, or maybe you assume that I copy the array elements into some collection.

    Quote Originally Posted by TheCPUWizard View Post
    -----

    Now, if you are talking about the aggregate memory footprint, you need to be very careful...
    My English is not good. I don't know whether you mean some specific technical term by the word "footprint". I mean the total size that Task Manager reports for the executable.
    That is, before calling these lines:
    Code:
    Dim fs As FileStream = File.OpenRead(FilePath)
    Dim GZfs As New GZipStream(fs, CompressionMode.Decompress, False)
    Dim sr As New StreamReader(GZfs)
    Answer = sr.ReadToEnd.Split(vbCrLf.ToCharArray, StringSplitOptions.RemoveEmptyEntries)
    fs.Close()
    my application uses less than 30 MB of memory.
    After calling that routine, it jumps to 250 MB!

    So I tried cloning each string, one by one, and tried to free the original array's memory with
    Code:
    myArray = Nothing
    GC.Collect()
    but it makes no difference.
    Quote Originally Posted by TheCPUWizard View Post
    The framework allocates memory from the OS as required. Provided it is getting memory without issues, it will continue (within certain bounds) to keep growing.

    This is NOT a problem. Generally speaking, the most efficient use of resources is when the resources are being used. There is no reason to keep "Free" memory.
    My app later reaches 900 MB. Then only 4 or 5 MB of memory is free, and the system starts using the swap file heavily.

    Quote Originally Posted by TheCPUWizard View Post
    As memory pressure increases, the GC will begin to run and make memory available WITHIN the process for future allocations. This does NOT mean that the memory will be returned to the operating system.
    As I understand it, you mean the memory is free to be allocated again by my own program, but it is not given back to the OS.
    Quote Originally Posted by TheCPUWizard View Post
    Bottom line. You need to look at HOW the memory is being used, and you need to look at the entire system environment.
    I don't get it.

    Quote Originally Posted by TheCPUWizard View Post
    Go back and REREAD my post. Go read the DOCUMENTATION.
    The MS documentation is the worst I have ever seen. I never find any answer on MSDN. I get a lot of unrelated topics on Java or anything else (even when I specifically restrict the results to VB). Sometimes I can't find even ONE of the words I searched for in the MSDN search results.

    Can you be more specific? Do you suggest searching for documentation about strings, or about the garbage collector?
    Quote Originally Posted by TheCPUWizard View Post
    Just because you are no longer referencing an object (setting something to nothing if it is going out of scope is BAD practice) does NOT mean that the memory will be returned to the operating system, nor will you see a reduction in memory "usage".
    I did search before posting here (obviously in the wrong way). The array does not have a .Dispose sub, and I have not found any way to free the array's memory.
    Quote Originally Posted by TheCPUWizard View Post
    This expectation is one of the clearest indicators that a person does not understand the very fundamentals of using .NET (this is not VB.NET specific).

    Consider: (pseudo code)
    Code:
    for num = 1 to 1000000
        item = new Item()
        item.DoSomething();
    next num
    Clearly there will only be one "Item" in use at any time ( a total of one million will be created).

    IF the computer system has sufficient memory to allocate all one million (i.e. GC does not run for the duration of the loop), there is NO problem.

    IF GC (explicitly or implicitly) runs after the loop, the process will have already allocated significant memory from the OS. But there is no prima facie reason for the process to return the memory to the OS.

    Therefore a perfectly valid memory profile would be a rapid increase during the running of the loop, and the memory footprint of the process NEVER going down.
    My best interpretation is that I need to free each string one by one?

    ...and how do I do that? I have found only a .Finalize method, but it is not accessible.
    Last edited by Marraco; November 14th, 2008 at 11:13 AM. Reason: Added the malloc line

  6. #6
    Join Date
    Mar 2002
    Location
    St. Petersburg, Florida, USA
    Posts
    12,125

    Re: large strings memory footprint

    A couple of points:

    1) "Total FootPrint" is what is being shown by TaskMgr (and some of the PerfMon counters).

    2) When using MSDN, it is a good idea to utilize the filters feature. Many people find Google, with "msdn.microsoft.com" as part of the query, a better way to search MSDN than msdn.microsoft.com itself.

    3) EXPLICIT calls to GC.Collect are a BAD idea. They can actually cause memory requirements to INCREASE.

    4) Dispose is only applicable if the managed code [VB.NET] is using a resource which must be explicitly returned or which has limited availability [WIN32 API objects such as Pens, and DB objects such as connection, etc]

    5) The Finalizer will only be invoked IF you have a BUG in your program (you failed to call Dispose and/or Dispose failed to suppress the finalizer).

    6) Consider the following:

    a) I am at work and need to write something down, so I get a pen from the supply cabinet [new]
    b) I write down what I need [method call]
    c) I put the pen in my pocket [no more usage]
    d) I go home and empty my pockets [no more reference - I can no longer "Reach" the pen]

    Tomorrow I repeat this process... and again the next day.

    Eventually I have many pens at home. This is not a problem provided:

    a) I have a place to store all of them
    b) The supply cabinet does not run out of pens for myself or others.

    It is only when one of the above occurs, that I must [because I am honest and decent] return the pens to the supply cabinet.

    The same is true for memory utilization. There is nothing "wrong" with a program using ALL of the available memory.... until the OS requests that the process return some.
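
    Points 4 and 5 above (Dispose and the finalizer) can be sketched as follows. ScarceResource is a hypothetical class invented for this example; it only tracks a flag where a real wrapper would hand a pen, handle or connection back.

```vbnet
Imports System

Module DisposeDemo
    ' Hypothetical wrapper for a scarce resource (a GDI pen, a DB connection, ...).
    Public Class ScarceResource
        Implements IDisposable

        Private released As Boolean = False

        Public ReadOnly Property IsReleased() As Boolean
            Get
                Return released
            End Get
        End Property

        ' Called by user code, directly or through a Using block.
        Public Sub Dispose() Implements IDisposable.Dispose
            Release()
            ' The cleanup is done, so the finalizer has nothing left to do.
            GC.SuppressFinalize(Me)
        End Sub

        ' Safety net: only needed if Dispose was never called.
        Protected Overrides Sub Finalize()
            Try
                Release()
            Finally
                MyBase.Finalize()
            End Try
        End Sub

        Private Sub Release()
            If released Then Return
            ' A real class would give the underlying resource back here.
            released = True
        End Sub
    End Class

    Sub Main()
        Dim resource As New ScarceResource()
        Using resource
            Console.WriteLine("inside Using, released = {0}", resource.IsReleased)
        End Using
        Console.WriteLine("after Using, released = {0}", resource.IsReleased)
    End Sub
End Module
```

    Plain strings and arrays hold no such resource, which is why they offer no Dispose method; their memory is reclaimed by the GC alone.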

  7. #7
    Join Date
    Jul 2000
    Location
    Milano, Italy
    Posts
    7,726

    Re: large strings memory footprint

    I believe the question was:
    "I am running out of memory. Is there any way to get some of it back sooner than usual?"
    and I am afraid the answer is: "no".
    ...at present time, using mainly Net 4.0, Vs 2010



    Special thanks to Lothar "the Great" Haensler, Chris Eastwood , dr_Michael, ClearCode, Iouri and
    all the other wonderful people who made and make Codeguru a great place.
    Come back soon, you Gurus.

  8. #8
    Join Date
    Mar 2007
    Location
    Argentina
    Posts
    579

    Re: large strings memory footprint

    Thanks for your help.
    Quote Originally Posted by TheCPUWizard View Post
    ...
    2) When using MSDN, it is a good idea to utilize the filters feature. Many people find google, with "msdn.microsoft.com" as part of the query a better way to search msdn than msdn.microsoft.com itself.
    The filters don't work. (And right NOW they are not available, so I cannot provide an easy example.)

    I totally agree about Google, although Microsoft breaks many of the Google links to MSDN. Frequently, direct Google links to MSDN do not work, but once Google tells you what to look for, you can research it on MSDN.
    Quote Originally Posted by TheCPUWizard View Post
    3) EXPLICIT calls to GC.Collect are a BAD idea. They can actually cause memory requirements to INCREASE.
    That is a good piece of advice. It tells me that I am even more lost than I thought.
    Quote Originally Posted by TheCPUWizard View Post
    ...
    It is only when one of the above occurs, that I must [because I am honest and decent] return the pens to the supply cabinet.
    ...hmm, I'm not decent, but at least I'm honest.
    Is there a way to return the string pens? (Or the entire array?)
    Quote Originally Posted by TheCPUWizard View Post

    The same is true for memory utilization. There is nothing "wrong" with a program using ALL of the available memory.... until the OS requests that the process return some.
    (I am being kicked out of the building now, so I will be back on Monday.)

  9. #9
    Join Date
    Apr 2008
    Posts
    82

    Re: large strings memory footprint

    If you are having memory problems, then don't use ReadToEnd.
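
    A minimal sketch of reading the compressed file line by line instead, so that the whole 70 MB text never exists as a single string. The module and function names (LineReader, CompressedFileToLines) are made up for illustration; ReadLine accepts CR, LF or CRLF line endings, and empty lines are skipped to mirror RemoveEmptyEntries in the original code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.IO.Compression

Module LineReader
    ' Decompresses the file and collects its non-empty lines one at a time.
    Public Function CompressedFileToLines(ByVal filePath As String) As List(Of String)
        Dim lines As New List(Of String)()
        Using fs As FileStream = File.OpenRead(filePath)
            Using gz As New GZipStream(fs, CompressionMode.Decompress)
                Using sr As New StreamReader(gz)
                    Dim line As String = sr.ReadLine()
                    While line IsNot Nothing
                        If line.Length > 0 Then lines.Add(line)
                        line = sr.ReadLine()
                    End While
                End Using
            End Using
        End Using
        Return lines
    End Function
End Module
```

    Calling lines.ToArray() at the end would give back a String() like the original function, at the cost of one extra copy of the references (not of the strings themselves).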

  10. #10
    Join Date
    Mar 2007
    Location
    Argentina
    Posts
    579

    Re: large strings memory footprint

    Ok. I give up.

    Maybe if I do the reading of the data in an independent DLL, then I can call the DLL to do the reading and have it send me the data in a memory-efficient structure. But that will only work if I can unload the DLL from RAM after using it.

    Does that make sense, or is it not worth the work?

    (My application needs to stay in memory all day, so I cannot mess with the available RAM.)

  11. #11
    Join Date
    Mar 2002
    Location
    St. Petersburg, Florida, USA
    Posts
    12,125

    Re: large strings memory footprint

    Quote Originally Posted by Marraco View Post
    Does that make sense, or is it not worth the work?

    (My application needs to stay in memory all day, so I cannot mess with the available RAM.)
    1) No, it does not make sense... but thinking that it does is a very common mistake (even among professional developers).

    2) If your program is running, but not accessing specific pages of memory, they will be swapped out to disk, and have NO impact on the running state of your machine.

  12. #12
    Join Date
    Mar 2007
    Location
    Argentina
    Posts
    579

    Re: large strings memory footprint

    Quote Originally Posted by TheCPUWizard View Post
    1) No, it does not make sense... but thinking that it does is a very common mistake (even among professional developers).

    2) If your program is running, but not accessing specific pages of memory, they will be swapped out to disk, and have NO impact on the running state of your machine.
    The problem is that, since the memory usage increases later, it easily reaches 1 GB, and Windows starts swapping memory to the hard disk. That makes the computer unusable.
    Worse, the swapping causes the code to run for hours instead of minutes.

    It would be solved if it were possible to free the memory of the unused strings.

  13. #13
    Join Date
    Jul 2000
    Location
    Milano, Italy
    Posts
    7,726

    Re: large strings memory footprint

    If that is really an issue, then you should think about rewriting the code that reads and populates the array: do more of the work by hand, and you will be able to preallocate the exact amount of memory you need. The real question, however, is: why does it keep on consuming RAM? Are you sure you need all those new instances? Could it be done with a single instance (see the "Shared" keyword)?
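
    One possible reading of "preallocate the exact amount" is a two-pass approach: count the lines first, then fill an array of exactly that size. This is a sketch under assumptions, not the poster's code: the names ExactArrayReader, OpenReader and ReadLinesExact are invented, the file is assumed not to change between the two passes, and the price is decompressing the file twice.

```vbnet
Imports System
Imports System.IO
Imports System.IO.Compression

Module ExactArrayReader
    ' Opens the compressed file for line-by-line reading.
    ' Disposing the reader also closes the underlying streams.
    Private Function OpenReader(ByVal filePath As String) As StreamReader
        Return New StreamReader(New GZipStream(File.OpenRead(filePath), CompressionMode.Decompress))
    End Function

    Public Function ReadLinesExact(ByVal filePath As String) As String()
        ' Pass 1: count the non-empty lines.
        Dim count As Integer = 0
        Using sr As StreamReader = OpenReader(filePath)
            Dim line As String = sr.ReadLine()
            While line IsNot Nothing
                If line.Length > 0 Then count += 1
                line = sr.ReadLine()
            End While
        End Using

        ' Pass 2: allocate the array once and fill it.
        Dim result(count - 1) As String
        Dim index As Integer = 0
        Using sr As StreamReader = OpenReader(filePath)
            Dim line As String = sr.ReadLine()
            While line IsNot Nothing
                If line.Length > 0 Then
                    result(index) = line
                    index += 1
                End If
                line = sr.ReadLine()
            End While
        End Using
        Return result
    End Function
End Module
```

    Only the array of references is preallocated this way; each line is still a separate string object whose size depends on its content.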

  14. #14
    Join Date
    Mar 2002
    Location
    St. Petersburg, Florida, USA
    Posts
    12,125

    Re: large strings memory footprint

    Quote Originally Posted by Cimperiali View Post
    If that is really an issue, then you should think about rewriting the code that reads and populates the array: do more of the work by hand, and you will be able to preallocate the exact amount of memory you need. The real question, however, is: why does it keep on consuming RAM? Are you sure you need all those new instances? Could it be done with a single instance (see the "Shared" keyword)?
    Just remember that:

    1) EVERY modification to a string creates a new string. ALWAYS.
    2) If you are creating objects greater than 84,999 bytes [42,499 (minus overhead) characters], they go onto the LOH (Large Object Heap), and fragmentation will cause memory growth.

    Ideally, a long running program should NEVER have a string that exceeds about 40K in length. Not one, not for an instant.
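
    The size threshold in point 2 can be observed indirectly: the runtime reports large-object-heap objects as belonging to the oldest generation from the moment they are created. A minimal sketch, with the module and function names (LohDemo, GenerationOfNewString) invented for illustration:

```vbnet
Imports System

Module LohDemo
    ' Returns the GC generation reported for a freshly created string
    ' of the given length (each character takes 2 bytes).
    Public Function GenerationOfNewString(ByVal length As Integer) As Integer
        Dim s As New String("x"c, length)
        Return GC.GetGeneration(s)
    End Function

    Sub Main()
        Console.WriteLine("100 chars    -> generation {0}", GenerationOfNewString(100))
        Console.WriteLine("50,000 chars -> generation {0}", GenerationOfNewString(50000))
        Console.WriteLine("oldest generation is {0}", GC.MaxGeneration)
    End Sub
End Module
```

    The string returned by ReadToEnd for a 70 MB text file is far beyond this limit, so it is allocated on the LOH as one contiguous block.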

  15. #15
    Join Date
    Mar 2007
    Location
    Argentina
    Posts
    579

    Re: large strings memory footprint

    Quote Originally Posted by Cimperiali View Post
    if ...(see about "shared" keyword)?
    Quote Originally Posted by TheCPUWizard View Post
    Just remember that...Not one, not for an instant.
    It looks like I have a lot of unexpected work...
