CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 9 of 9

Thread: Need opinions on best practices passing datatables

  1. #1
    Join Date
    Jun 2016
    Location
    Columbus, Ohio, USA
    Posts
    12

    Need opinions on best practices passing datatables

    My task is simple. I am taking two .csv files, importing them to datatables, comparing each datatable to the other, placing non-matches into new datatables, exporting to a workbook, and notifying email lists of success or failure. I haven't been in C# for a while and am getting back up to speed.

    My question deals with the comparing of the datatables. I am building a class to do the compares and figure on passing the import datatables through the constructor. I am struggling with how to handle the export datatables. As I see it, I have three main options and none are elegant:

    • Create the datatable externally and pass as a constructor parameter
    • Create and populate internally and retrieve using a "getter"
    • Create and populate internally and build and populate the workbook from within the class (don't like this one as I will be instantiating the class for each compare)


    In addition to being rusty, this is my first full application using C#, so I probably have overlooked other viable options. Please let me know if I did.

  2. #2
    Arjay's Avatar
    Arjay is offline Moderator / EX MS MVP Power Poster
    Join Date
    Aug 2004
    Posts
    13,274

    Re: Need opinions on best practices passing datatables

    How many lines are the csv files?

    Are you comparing the files line by line (i.e. line 1 of file 1 compared with line 1 of file 2, etc.)?

  3. #3
    Join Date
    Jun 2016
    Location
    Columbus, Ohio, USA
    Posts
    12

    Re: Need opinions on best practices passing datatables

    Quote Originally Posted by Arjay View Post
    How many lines are the csv files?

    Are you comparing the files line by line (i.e. line 1 of file 1 compared with line 1 of file 2, etc.)?
    There are a variable amount of lines in each .csv. To explain a bit more thoroughly, one file is a hospital file with denied payments being sent to the vendor for resolution. The other file is from the vendor showing what they have active in their system. I am comparing the two twice, once to see if everything the hospital sent was entered into the vendors system, and once to see if the vendor's acknowledged inventory shows as still open in the hospital's database.

    The comparison will be each line in one file compared to every line in the other and a list of those not found being exported to a new datatable in preparation of generating an Excel worksheet with the exceptions. The workbook will have a sheet for each compare showing the exceptions that need to be manually resolved by hospital/vendor.

    The report will be sent whether populated or not, so the only issue is a best practices of where to create the datatables receiving the accounts not found. Another class creates the datatables from the .csv files and I can either pass as constructor parameters or use properties to "get" them. The datatables for the exceptions are where I am finding a mental block on "best practices".

    The class handling the compares was built to handle a single compare and to be instantiated twice, differing in either parameter position (currently), or manipulating "setters" from the creating class.

  4. #4
    Arjay's Avatar
    Arjay is offline Moderator / EX MS MVP Power Poster
    Join Date
    Aug 2004
    Posts
    13,274

    Re: Need opinions on best practices passing datatables

    Do the columns in the csv files remain constant (i.e. same number and data type for each column) between the files being compared?

    I ask because if they do, you can represent each line of the csv as an instance of a class (say, 'Record')
    Next write an IEqualityComparer<> for the Record.
    Then you read each file into a generic list - List<Record>. The TextFieldParser class is helpful.
    Finally use the Linq Except method to get the differences between the two lists

    See https://msdn.microsoft.com/en-us/lib...v=vs.110).aspx for the Except comparison
    Search TextFieldParser in C# in the internet for related examples.

    Then you can use Linq on the two lists

  5. #5
    Join Date
    Jun 2016
    Location
    Columbus, Ohio, USA
    Posts
    12

    Re: Need opinions on best practices passing datatables

    Quote Originally Posted by Arjay View Post
    Do the columns in the csv files remain constant (i.e. same number and data type for each column) between the files being compared?

    I ask because if they do, you can represent each line of the csv as an instance of a class (say, 'Record')
    Next write an IEqualityComparer<> for the Record.
    Then you read each file into a generic list - List<Record>. The TextFieldParser class is helpful.
    Finally use the Linq Except method to get the differences between the two lists

    See https://msdn.microsoft.com/en-us/lib...v=vs.110).aspx for the Except comparison
    Search TextFieldParser in C# in the internet for related examples.

    Then you can use Linq on the two lists
    Both of those are extra overhead for what is expected to be from 0- a dozen daily records. The columns in the .csv files are constant, and only the first column has a key to compare. The rest are visual aids (total of 7 columns). My intent was to load them into a datatable using a StreamReader, compare the datatables, generate exceptions datatables, push the data into an .xlsx file and use sFTP to send. Even at the beginning of this process, I am looking at ~200+ records. The comparisons have been done manually for a while and no corrections have been implemented allowing a backlog of out of sync accounts.

    An incomplete example from the web for this process is pasted below:
    Code:
    public static DataTable ConvertCSVtoDataTable(string strFilePath)
     {
        StreamReader sr = new StreamReader(strFilePath);
        string[] headers = sr.ReadLine().Split(','); 
        DataTable dt = new DataTable();
        foreach (string header in headers)
        {
            dt.Columns.Add(header);
        }
        while (!sr.EndOfStream)
        {
            string[] rows = sr.ReadLine().Split(',');
            DataRow dr = dt.NewRow();
            for (int i = 0; i < headers.Length; i++)
            {
                dr[i] = rows[i];
            }
            dt.Rows.Add(dr);
        }
        return dt;
    }
    .NET Framework Version 4.5.50938; Visual Studio Professional 2013 Version 12.0.21005.1 REL
    "Having power is nice, having a lot of power is very nice, having too much power is just about right!"

  6. #6
    Arjay's Avatar
    Arjay is offline Moderator / EX MS MVP Power Poster
    Join Date
    Aug 2004
    Posts
    13,274

    Re: Need opinions on best practices passing datatables

    Quote Originally Posted by Jim Snyder View Post
    Both of those are extra overhead for what is expected to be from 0- a dozen daily records.
    Personally, in terms of separation of concerns, error handling and code maintainability, I find that is helpful to separate the processing of the files (and getting them into a form that is useful in code) from the logic of the processing.

    If you get them into a list of Record objects, then you can use the Linq Extract extension to do the compare between the two lists in one line of code.

    Of course, if you are only doing one operation, you may not want to take the 'overhead'. On the other hand, you might decide to code with code reuse in mind and build up a library that wraps the functionality that I'm talking about.

    Something like
    Code:
    // Load the records from the csv (TextFileParserHelper is a class that wraps the
    // TextFileParser specifying csv format and a column mapping for the Record class)
    var records1 = TextFileParserHelper.Load<Record, ColumnDefinitionCsv, RecordColumnMap>(filePath1);
    var records2 = TextFileParserHelper.Load<Record, ColumnDefinitionCsv, RecordColumnMap>(filePath2);
    
    // Retrieve the different records using Linq
    var differences = records1.Except(records2);
    This type of overhead may not be desirable for a one off job, but it becomes useful if you do this sort of thing more than once.

  7. #7
    Join Date
    Jun 2016
    Location
    Columbus, Ohio, USA
    Posts
    12

    Re: Need opinions on best practices passing datatables

    Much appreciated. Readability and maintainability are high on my list. At 61, the thing I battle most is memory. Clean code and easily understood construction not only helps me, but also the coworkers whose styles ranges from one giant class doing everything to C like code.
    .NET Framework Version 4.5.50938; Visual Studio Professional 2013 Version 12.0.21005.1 REL
    "Having power is nice, having a lot of power is very nice, having too much power is just about right!"

  8. #8
    Join Date
    Jun 2016
    Location
    Columbus, Ohio, USA
    Posts
    12

    Re: Need opinions on best practices passing datatables

    Quote Originally Posted by Arjay View Post
    Personally, in terms of separation of concerns, error handling and code maintainability, I find that is helpful to separate the processing of the files (and getting them into a form that is useful in code) from the logic of the processing.

    If you get them into a list of Record objects, then you can use the Linq Extract extension to do the compare between the two lists in one line of code.

    Of course, if you are only doing one operation, you may not want to take the 'overhead'. On the other hand, you might decide to code with code reuse in mind and build up a library that wraps the functionality that I'm talking about.

    Something like
    Code:
    // Load the records from the csv (TextFileParserHelper is a class that wraps the
    // TextFileParser specifying csv format and a column mapping for the Record class)
    var records1 = TextFileParserHelper.Load<Record, ColumnDefinitionCsv, RecordColumnMap>(filePath1);
    var records2 = TextFileParserHelper.Load<Record, ColumnDefinitionCsv, RecordColumnMap>(filePath2);
    
    // Retrieve the different records using Linq
    var differences = records1.Except(records2);
    This type of overhead may not be desirable for a one off job, but it becomes useful if you do this sort of thing more than once.
    It is always good for a developer to have a library of tools to tap into. Gotta love code reuse!
    .NET Framework Version 4.5.50938; Visual Studio Professional 2013 Version 12.0.21005.1 REL
    "Having power is nice, having a lot of power is very nice, having too much power is just about right!"

  9. #9
    Join Date
    Jun 2016
    Location
    Columbus, Ohio, USA
    Posts
    12

    Re: Need opinions on best practices passing datatables

    A follow-on question:
    Assuming I end up with a generic list as the result of the LINQ .Except, is it possible to dump this directly to a querytable to get it into a reportable spreadsheet?

    I notice that 90+% of the "export generic list to Excel" solutions do a cell by painful cell conversions somewhere, and I am looking for the 'magic bullet 'to do set to set. If you haven't heard of this, neither have I, but I am looking for efficient reusable ways all the time.
    .NET Framework Version 4.5.50938; Visual Studio Professional 2013 Version 12.0.21005.1 REL
    "Having power is nice, having a lot of power is very nice, having too much power is just about right!"

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Windows Mobile Development Center


Click Here to Expand Forum to Full Width




On-Demand Webinars (sponsored)