Loading file into memory is taking up TOO much memory!

Thread: Loading file into memory is taking up TOO much memory!

  1. #1
    Join Date: Mar 2013
    Posts: 7

    Loading file into memory is taking up TOO much memory!

    Hi, I am trying to load a small file (Excel format for now) into memory. The Excel file is roughly 80 KB and contains only about 20 or 30 columns and 80 rows. I've set my JVM parameters to 2 GB. Since it's only an 80 KB file, I wouldn't expect it to take up more than 1 MB of memory, but for some reason my JVM grows by 40 MB once the file is loaded!

    At startup my program uses 40 MB according to Task Manager, which I think is fine. But growing from 40 MB to 80 MB for an 80 KB file is too much, because I plan on loading much larger files, and if memory requirements grow linearly then it's simply not going to be possible.

    Some classes
    Code:
    /*
     TableTabInfo Class
     *
     * This Class is a container for all the information that needs to be
     * segregated between multiple loaded tables
     */
    package dataStorageClasses;
    
    import java.util.ArrayList;
    import javax.swing.JTable;
    
    public class TableTabInfo {
    
        public JTable jDataTable;
        public ArrayList<String> headers;
        public RowData[] rData;
    
        public TableTabInfo() {
            this.headers = new ArrayList<String>();
        }
    
        public void setSizeOfRowData(int size) {
            this.rData = new RowData[size];
    
            for (int i = 0; i < size; i++) {
                this.rData[i] = new RowData();
            }
    
        }
    
        public void addCell(Object cell, int rowNumber) {
            this.rData[rowNumber].addToMyList(cell);
        }
    
        public Object getCell(int row, int col) {
            return this.rData[row].myList.get(col);
        }
    
        public void addToHeaderList(String header) {
            this.headers.add(header);
        }
    
        public void emptyHeaderList() {
            this.headers.clear();
        }
    
        public int sizeOfRowData() {
            return this.rData.length;
        }
    }
    
    package dataStorageClasses;
    import java.util.ArrayList;
    
    public class RowData<T> {
    
        public ArrayList<T> myList = new ArrayList<T>();
    
        public RowData()
        {
        }
    
        public T getFromMyList(int index) {
            return this.myList.get(index);
        }
    
        public void addToMyList(T e) {
            this.myList.add(e);
        }
    }
    Loading the file
    Code:
    if (getExtension.contains("xls") || getExtension.contains("lsx")) {
                            try {
                                File inputFile = new File(chooser.getSelectedFile().getAbsolutePath());
                                FileInputStream inputStream = new FileInputStream(inputFile);
    
                                Workbook wb = null;
                                try { 
                                    wb = WorkbookFactory.create(inputStream);
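                                } catch (InvalidFormatException ex) {
                                    // Without a workbook the code below would throw a NullPointerException,
                                    // so pass the failure on to the IOException handler at the end instead.
                                    throw new IOException("Not a valid Excel file", ex);
                                }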
    
                                Sheet sheet = wb.getSheetAt(0);
                               
                                //Add a new tab info object into our vector
                                TableTabInfo newTab = new TableTabInfo();
                                newTab.setSizeOfRowData(sheet.getLastRowNum());
    
                                vTabInfo.add(newTab);
                                
                                Boolean boolIterOnce = false; 
                                int maxSize = 0; // the variable that is used as the pseudo max
    
                                // Iterate over each row in the sheet
                                Iterator<Row> rowsIterator = sheet.rowIterator();
                                while (rowsIterator.hasNext()) {
                                    Row row = rowsIterator.next();
    
                                    if (!boolIterOnce) //only 1 time for headers
                                    {
                                        Iterator<Cell> cellsIterator = row.cellIterator();
                                        ((TableTabInfo) vTabInfo.get(jTabs.getTabCount())).addToHeaderList("Row");
    
                                        while (cellsIterator.hasNext()) {
                                            Cell cell = cellsIterator.next();
    
                                            switch (cell.getCellType()) {
    
                                                case Cell.CELL_TYPE_STRING:
                                                    if (row.getRowNum() == 0) { //In the file there is no row number column so I am just inserting one manually
                                                        ((TableTabInfo) vTabInfo.get(jTabs.getTabCount())).addToHeaderList(cell.getStringCellValue());
                                                    }
                                                    break;
    
                                                default:
                                                    break;
                                            }
    
                                        } 
                                    }
                                    else if (boolIterOnce) // if not traversing the headers
                                    {                      
                                        for (int i = 0; i < maxSize; i++) {
                                            if (i == 0) {
                                                //First column is used for row numbers, convert to string so they are left aligned by default
                                                ((TableTabInfo) vTabInfo.get(jTabs.getTabCount())).addCell(Integer.toString(row.getRowNum() - 1), row.getRowNum() - 1);
                                            } else {
                                                Cell cell = row.getCell(i - 1, Row.CREATE_NULL_AS_BLANK);
                                                switch (cell.getCellType()) {
                                                    case Cell.CELL_TYPE_NUMERIC: // if the cell is a number
                                                        if (row.getRowNum() == 0) {
                                                        } else {
                                                            if (DateUtil.isCellDateFormatted(cell)) { // special case when it's a date
                                                                ((TableTabInfo) vTabInfo.get(jTabs.getTabCount())).addCell(cell.getDateCellValue(), row.getRowNum() - 1);
                                                            } else {
                                                                ((TableTabInfo) vTabInfo.get(jTabs.getTabCount())).addCell(cell.getNumericCellValue(), row.getRowNum() - 1);
                                                            }
                                                        }
                                                        break;
    
                                                    case Cell.CELL_TYPE_STRING:
                                                        if (row.getRowNum() == 0) {
                                                        } else {
                                                            ((TableTabInfo) vTabInfo.get(jTabs.getTabCount())).addCell(cell.getStringCellValue(), row.getRowNum() - 1);
                                                        }
                                                        break;
                                                    case Cell.CELL_TYPE_BLANK:
                                                        ((TableTabInfo) vTabInfo.get(jTabs.getTabCount())).addCell("NO DATA", row.getRowNum() - 1);
                                                        break;
                                                    case Cell.CELL_TYPE_BOOLEAN: // if the cell contains a boolean value                                       
                                                        ((TableTabInfo) vTabInfo.get(jTabs.getTabCount())).addCell(cell.getBooleanCellValue(), row.getRowNum() - 1);
                                                        break;
                                                    case Cell.CELL_TYPE_FORMULA: // if the cell contains a formula (shouldn't be called)
                                                        ((TableTabInfo) vTabInfo.get(jTabs.getTabCount())).addCell(cell.getCellFormula(), row.getRowNum() - 1);
                                                        break;
                                                    default: 
                                                        System.out.println("Unsupported cell type. Row: " + row.getRowNum() + " Column: " + cell.getColumnIndex());
                                                        break;
                                                }
                                            }
                                        }
                                    }
                                    maxSize = ((TableTabInfo) vTabInfo.get(jTabs.getTabCount())).headers.size();
                                    boolIterOnce = true;
                                } 
                            } catch (IOException ex) {
                                ex.printStackTrace();
                                outputTextArea.setText("ERROR: FILE COULD NOT BE OPENED!");
                            }
                        }
    I am using a Vector to store each new instance of the TableTabInfo class, which shouldn't be an issue because only a few exist at any given time. I thought the data inside was taking up the space, perhaps the rData variable, since that's where the raw data is actually stored (in addition to being loaded into the JTable that is also in that class). But I set rData to null after loading the file and java.exe stayed at 80 MB. As I understand it, setting the object to null essentially deletes it and frees the memory, but that did not seem to be the case.

    Any suggestions would be great.

  2. #2
    Join Date: May 2006
    Location: UK
    Posts: 4,474

    Re: Loading file into memory is taking up TOO much memory!

    But I set rData to null after loading the file and java.exe stayed at 80 MB. As I understand it, setting the object to null essentially deletes it and frees the memory, but that did not seem to be the case.
    No. Removing all references to an object allows it to be garbage collected, but there is no guarantee that it will be. Whether the garbage collector runs, and whether it then collects a particular object, depends on many factors, such as which garbage collector is in use, the current level of free memory, and how many GC cycles the object has survived.
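
To illustrate the difference between "eligible for collection" and "collected", here is a minimal sketch. The class name `GcEligibilityDemo` is made up for this example. It holds a large array through a strong reference and watches it through a `WeakReference`. Clearing the strong reference only makes the array eligible for collection, and `System.gc()` is just a request, so the second check may report either outcome.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class; not part of the original poster's program.
class GcEligibilityDemo {

    // Returns true while the referent has not been reclaimed.
    static boolean isStillInMemory(WeakReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) {
        byte[] data = new byte[16 * 1024 * 1024];   // 16 MB, strongly reachable
        WeakReference<byte[]> watcher = new WeakReference<byte[]>(data);

        // The array is still used on this line, so it is reachable during the check.
        System.out.println("before null: in memory = " + isStillInMemory(watcher)
                + ", length = " + data.length);

        data = null;   // now only weakly reachable: eligible, not necessarily gone
        System.gc();   // a hint to the JVM, not a command

        // May print true or false depending on the collector and its timing.
        System.out.println("after null + gc hint: in memory = " + isStillInMemory(watcher));
    }
}
```

Even when the array is reclaimed, the JVM may keep the freed heap reserved for later use, so the figure shown by Task Manager does not have to drop.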

    BTW, you need to make sure you are closing all streams you open.
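
A minimal sketch of closing a stream reliably with try-with-resources (Java 7 and later). The class and method names are placeholders, and the byte-counting loop merely stands in for the real work, which in the code above would be the call to `WorkbookFactory.create(inputStream)`.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; the names are placeholders for illustration.
class StreamClosingDemo {

    // Counts the bytes in a file. The stream is closed automatically when the
    // try block exits, whether normally or because of an exception.
    static long countBytes(File file) throws IOException {
        try (InputStream in = new FileInputStream(file)) {
            byte[] buffer = new byte[8192];
            long total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("demo", ".bin");
        tmp.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(tmp)) {
            out.write(new byte[] {1, 2, 3, 4, 5});
        }
        System.out.println(countBytes(tmp) + " bytes read");
    }
}
```

Closing a stream releases the underlying file handle. It is not expected to shrink the Java heap noticeably on its own.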
    Posting code? Use code tags like this: [code]...Your code here...[/code]
    Click here for examples of Java Code

  3. #3
    Join Date: Mar 2013
    Posts: 5

    Re: Loading file into memory is taking up TOO much memory!

    You should try using Java in concert with MongoDB. MongoDB is a hot emerging database technology and it'd be useful to match the power of these two. I put a list of tutorials together about this, take a look if you're interested. http://www.verious.com/board/AKumar/...eries-in-java/

  4. #4
    Join Date: Mar 2013
    Posts: 7

    Re: Loading file into memory is taking up TOO much memory!

    Closing the file stream didn't seem to have any effect.

    I'll look into MongoDB, but I still think there must be something wrong with the code above for it to be consuming that much memory.

  5. #5
    Join Date: Mar 2013
    Posts: 7

    Re: Loading file into memory is taking up TOO much memory!

    I do want it to run on other operating systems like OS X though, which is why I'm using Java. I guess MongoDB would not be possible then.

  6. #6
    Join Date: May 2006
    Location: UK
    Posts: 4,474

    Re: Loading file into memory is taking up TOO much memory!

    Have you tried loading several spreadsheets of different sizes to see how the memory usage changes? You will possibly find that a chunk of that 40 MB is a fixed overhead for using the library, so doubling the workbook size won't double the memory usage.
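
One way to run that kind of experiment from inside the JVM, rather than reading Task Manager, is sketched below. It fills a list of lists with synthetic rows (a stand-in for the poster's `RowData` structure, not the real loader) at several sizes and prints the approximate heap growth for each. `ScalingProbe`, `usedHeap` and `buildRows` are names invented here, and the figures are rough because `System.gc()` is only a hint.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical probe; synthetic data stands in for a real file.
class ScalingProbe {

    // Approximate number of heap bytes currently in use.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Builds a table of rows x cols distinct strings.
    static List<List<String>> buildRows(int rows, int cols) {
        List<List<String>> table = new ArrayList<List<String>>(rows);
        for (int r = 0; r < rows; r++) {
            List<String> row = new ArrayList<String>(cols);
            for (int c = 0; c < cols; c++) {
                row.add("cell-" + r + "-" + c);
            }
            table.add(row);
        }
        return table;
    }

    public static void main(String[] args) {
        int cols = 35;
        for (int rows : new int[] {1000, 10000, 50000}) {
            System.gc();                       // request a cleanup before measuring
            long before = usedHeap();
            List<List<String>> table = buildRows(rows, cols);
            System.gc();                       // request removal of temporary garbage
            long after = usedHeap();
            // table.size() is read here so the table stays reachable while measuring.
            System.out.printf("%,d rows: ~%,d KB of heap growth (%d rows held)%n",
                    rows, (after - before) / 1024, table.size());
        }
    }
}
```

If the growth per row is roughly constant across sizes, the cost scales with the data. A large first measurement followed by small increments points to a fixed overhead instead.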

  7. #7
    Join Date: Mar 2013
    Posts: 7

    Re: Loading file into memory is taking up TOO much memory!

    You were right about the different sizes. I loaded a second file as a new tab and it only grew a few MB, not 40 MB.

    I tried a 100 MB text file with 65,000 lines and 35 columns and it took up about 1.5 GB in total. When loaded into memory, does it take up more space as an object than when it's on disk?
    Last edited by mapleleafs91; March 12th, 2013 at 11:22 PM.

  8. #8
    Join Date: May 2006
    Location: UK
    Posts: 4,474

    Re: Loading file into memory is taking up TOO much memory!

    When loaded into memory, does it take up more space as an object than when it's on disk?
    That depends on how it is stored on disk and how the library is designed. But in all probability, yes, it will take up far more space, because spreadsheets are complex things. For example, each cell holds far more information than just the actual data, such as data type, formatting, borders, formula, etc. One would assume that cells with the same format or border share the same formatter/border object, but each cell still needs to hold a reference to that object, and each variable uses some memory.
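
The same effect applies to plain text once it becomes Java objects. As a back-of-the-envelope sketch for the 100 MB, 65,000-line, 35-column text file mentioned in this thread: that is 2,275,000 cells at roughly 46 bytes each on disk. The estimator below assumes a 64-bit HotSpot JVM with compressed references, 8-byte alignment, a 24-byte `String` object, a 16-byte array header and 2 bytes per character (as in Java 8 and earlier; newer JVMs can store Latin-1 text in 1 byte per character). These sizes are assumptions for illustration, not measurements, and `StringFootprint` is an invented name.

```java
// Hypothetical estimator; all sizes below are assumptions, not measurements.
class StringFootprint {

    static final long STRING_OBJECT = 24;  // assumed: object header plus fields
    static final long ARRAY_HEADER = 16;   // assumed: header of the backing char[]
    static final long LIST_SLOT = 4;       // assumed: one reference in a list's backing array

    // Rounds a size up to the next multiple of 8 bytes.
    static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // Estimated heap bytes for one cell stored as a String inside a list.
    static long estimateCellBytes(int chars) {
        return STRING_OBJECT + align8(ARRAY_HEADER + 2L * chars) + LIST_SLOT;
    }

    public static void main(String[] args) {
        long cells = 65000L * 35;
        long perCell = estimateCellBytes(46);
        System.out.println(cells + " cells x " + perCell + " bytes = "
                + (cells * perCell) / (1024 * 1024) + " MB");
        // prints "2275000 cells x 140 bytes = 303 MB"
    }
}
```

Under these assumptions the strings alone come to roughly 300 MB, about three times the file size, before counting the per-row lists, any copy of the data held by a `JTable`, or temporary objects created while parsing. That does not by itself account for the 1.5 GB observed; a profiler would be needed to attribute the rest.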

  9. #9
    Join Date: Mar 2013
    Posts: 7

    Re: Loading file into memory is taking up TOO much memory!

    The way I'm loading the data does not store that information, only the actual text in the cell.

    Getting the data
    ((TableTabInfo) vTabInfo.get(jTabs.getTabCount())).addCell(cell.getStringCellValue(), row.getRowNum() - 1);

    My addCell function is also above.

  10. #10
    Join Date: May 2006
    Location: UK
    Posts: 4,474

    Re: Loading file into memory is taking up TOO much memory!

    That information has nothing to do with what you are doing with the data; it's in the Excel spreadsheet and will get loaded by the Excel library. Is it POI you are using?

    Try loading the Excel sheet without creating any TableTabInfo objects and see how much memory it uses.

  11. #11
    Join Date: Mar 2013
    Posts: 7

    Re: Loading file into memory is taking up TOO much memory!

    Well, I think it's the TableTabInfo class, because I also tried loading a plain text file with the data. It was 100 MB in size and took up 1.6 GB in memory.

  12. #12
    Join Date: Mar 2013
    Posts: 7

    Re: Loading file into memory is taking up TOO much memory!

    Quote Originally Posted by copeg View Post
    Memory management in Java isn't as simple as watching memory usage in the task manager. Memory is managed by a garbage collector. It's pretty smart, but it is not constantly monitoring object references and immediately acting accordingly. Thus setting an object to null (and presuming no more references to the object exist) does not mean it is immediately removed from memory. If you truly wish to inspect memory usage, use a Java profiler (for instance VisualVM) to inspect the runtime of your program. From there you can try to nail down what is consuming the most memory, whether you need it, and/or whether you are holding onto references to objects you no longer need.
    Thanks, I tried VisualVM and here are the results from the sampler.

    http://i.imgur.com/QhaJSmO.png

    It looks like it's showing basic objects, so it's hard to tell where exactly they come from.

    These are results from a 27 MB text file. It takes up 650 MB at the start, then drops for some reason.

    See here http://i.imgur.com/cJw2xyU.png

    I really have no clue what's going on. It looks like high usage and then a manageable amount, then it climbs back up.
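
A rise followed by a sharp drop in a heap graph is commonly what a garbage collection looks like: temporary objects created while parsing pile up and are then reclaimed in one go. A rough way to check this without a profiler is to read the collection counters the JVM exposes through `java.lang.management`. The sketch below is an illustration under that assumption, not a diagnosis of the program above; `GcCounter` and `totalCollections` are names invented for it.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for counting garbage collections around a piece of work.
class GcCounter {

    // Sums the collection counts of all collectors; -1 means "undefined" and is skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        long checksum = 0;
        // Create many short-lived strings, similar to splitting the lines of a text file.
        for (int i = 0; i < 5000000; i++) {
            String[] parts = ("row," + i + ",value").split(",");
            checksum += parts[1].length();
        }
        long after = totalCollections();
        System.out.println("collections during loop: " + (after - before)
                + " (checksum " + checksum + ")");
    }
}
```

If the counter rises while a file is loading, the peaks in the graph are at least partly temporary garbage rather than data that stays resident.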
