Need opinions on best practices passing datatables
My task is simple. I am taking two .csv files, importing them to datatables, comparing each datatable to the other, placing non-matches into new datatables, exporting to a workbook, and notifying email lists of success or failure. I haven't been in C# for a while and am getting back up to speed.
My question deals with the comparing of the datatables. I am building a class to do the compares and figure on passing the import datatables through the constructor. I am struggling with how to handle the export datatables. As I see it, I have three main options and none are elegant:
- Create the datatable externally and pass as a constructor parameter
- Create and populate internally and retrieve using a "getter"
- Create and populate internally and build and populate the workbook from within the class (don't like this one as I will be instantiating the class for each compare)
In addition to being rusty, this is my first full application using C#, so I probably have overlooked other viable options. Please let me know if I did.
Re: Need opinions on best practices passing datatables
How many lines are the csv files?
Are you comparing the files line by line (i.e. line 1 of file 1 compared with line 1 of file 2, etc.)?
Re: Need opinions on best practices passing datatables
Quote:
Originally Posted by
Arjay
How many lines are the csv files?
Are you comparing the files line by line (i.e. line 1 of file 1 compared with line 1 of file 2, etc.)?
There are a variable amount of lines in each .csv. To explain a bit more thoroughly, one file is a hospital file with denied payments being sent to the vendor for resolution. The other file is from the vendor showing what they have active in their system. I am comparing the two twice, once to see if everything the hospital sent was entered into the vendors system, and once to see if the vendor's acknowledged inventory shows as still open in the hospital's database.
The comparison will be each line in one file compared to every line in the other and a list of those not found being exported to a new datatable in preparation of generating an Excel worksheet with the exceptions. The workbook will have a sheet for each compare showing the exceptions that need to be manually resolved by hospital/vendor.
The report will be sent whether populated or not, so the only issue is a best practices of where to create the datatables receiving the accounts not found. Another class creates the datatables from the .csv files and I can either pass as constructor parameters or use properties to "get" them. The datatables for the exceptions are where I am finding a mental block on "best practices".
The class handling the compares was built to handle a single compare and to be instantiated twice, differing in either parameter position (currently), or manipulating "setters" from the creating class.
Re: Need opinions on best practices passing datatables
Do the columns in the csv files remain constant (i.e. same number and data type for each column) between the files being compared?
I ask because if they do, you can represent each line of the csv as an instance of a class (say, 'Record')
Next write an IEqualityComparer<> for the Record.
Then you read each file into a generic list - List<Record>. The TextFieldParser class is helpful.
Finally use the Linq Except method to get the differences between the two lists
See https://msdn.microsoft.com/en-us/lib...v=vs.110).aspx for the Except comparison
Search TextFieldParser in C# in the internet for related examples.
Then you can use Linq on the two lists
Re: Need opinions on best practices passing datatables
Quote:
Originally Posted by
Arjay
Do the columns in the csv files remain constant (i.e. same number and data type for each column) between the files being compared?
I ask because if they do, you can represent each line of the csv as an instance of a class (say, 'Record')
Next write an IEqualityComparer<> for the Record.
Then you read each file into a generic list - List<Record>. The TextFieldParser class is helpful.
Finally use the Linq Except method to get the differences between the two lists
See
https://msdn.microsoft.com/en-us/lib...v=vs.110).aspx for the Except comparison
Search TextFieldParser in C# in the internet for related examples.
Then you can use Linq on the two lists
Both of those are extra overhead for what is expected to be from 0- a dozen daily records. The columns in the .csv files are constant, and only the first column has a key to compare. The rest are visual aids (total of 7 columns). My intent was to load them into a datatable using a StreamReader, compare the datatables, generate exceptions datatables, push the data into an .xlsx file and use sFTP to send. Even at the beginning of this process, I am looking at ~200+ records. The comparisons have been done manually for a while and no corrections have been implemented allowing a backlog of out of sync accounts.
An incomplete example from the web for this process is pasted below:
Code:
public static DataTable ConvertCSVtoDataTable(string strFilePath)
{
StreamReader sr = new StreamReader(strFilePath);
string[] headers = sr.ReadLine().Split(',');
DataTable dt = new DataTable();
foreach (string header in headers)
{
dt.Columns.Add(header);
}
while (!sr.EndOfStream)
{
string[] rows = sr.ReadLine().Split(',');
DataRow dr = dt.NewRow();
for (int i = 0; i < headers.Length; i++)
{
dr[i] = rows[i];
}
dt.Rows.Add(dr);
}
return dt;
}
Re: Need opinions on best practices passing datatables
Quote:
Originally Posted by
Jim Snyder
Both of those are extra overhead for what is expected to be from 0- a dozen daily records.
Personally, in terms of separation of concerns, error handling and code maintainability, I find that is helpful to separate the processing of the files (and getting them into a form that is useful in code) from the logic of the processing.
If you get them into a list of Record objects, then you can use the Linq Extract extension to do the compare between the two lists in one line of code.
Of course, if you are only doing one operation, you may not want to take the 'overhead'. On the other hand, you might decide to code with code reuse in mind and build up a library that wraps the functionality that I'm talking about.
Something like
Code:
// Load the records from the csv (TextFileParserHelper is a class that wraps the
// TextFileParser specifying csv format and a column mapping for the Record class)
var records1 = TextFileParserHelper.Load<Record, ColumnDefinitionCsv, RecordColumnMap>(filePath1);
var records2 = TextFileParserHelper.Load<Record, ColumnDefinitionCsv, RecordColumnMap>(filePath2);
// Retrieve the different records using Linq
var differences = records1.Except(records2);
This type of overhead may not be desirable for a one off job, but it becomes useful if you do this sort of thing more than once.
Re: Need opinions on best practices passing datatables
Much appreciated. Readability and maintainability are high on my list. At 61, the thing I battle most is memory. Clean code and easily understood construction not only helps me, but also the coworkers whose styles ranges from one giant class doing everything to C like code.
Re: Need opinions on best practices passing datatables
Quote:
Originally Posted by
Arjay
Personally, in terms of separation of concerns, error handling and code maintainability, I find that is helpful to separate the processing of the files (and getting them into a form that is useful in code) from the logic of the processing.
If you get them into a list of Record objects, then you can use the Linq Extract extension to do the compare between the two lists in one line of code.
Of course, if you are only doing one operation, you may not want to take the 'overhead'. On the other hand, you might decide to code with code reuse in mind and build up a library that wraps the functionality that I'm talking about.
Something like
Code:
// Load the records from the csv (TextFileParserHelper is a class that wraps the
// TextFileParser specifying csv format and a column mapping for the Record class)
var records1 = TextFileParserHelper.Load<Record, ColumnDefinitionCsv, RecordColumnMap>(filePath1);
var records2 = TextFileParserHelper.Load<Record, ColumnDefinitionCsv, RecordColumnMap>(filePath2);
// Retrieve the different records using Linq
var differences = records1.Except(records2);
This type of overhead may not be desirable for a one off job, but it becomes useful if you do this sort of thing more than once.
It is always good for a developer to have a library of tools to tap into. Gotta love code reuse!
Re: Need opinions on best practices passing datatables
A follow-on question:
Assuming I end up with a generic list as the result of the LINQ .Except, is it possible to dump this directly to a querytable to get it into a reportable spreadsheet?
I notice that 90+% of the "export generic list to Excel" solutions do a cell by painful cell conversions somewhere, and I am looking for the 'magic bullet 'to do set to set. If you haven't heard of this, neither have I, but I am looking for efficient reusable ways all the time.