Click to See Complete Forum and Search --> : Read performance


beck
October 24th, 2008, 06:34 AM
I am using the following code to test the performance of reading of data (1000 records)

using (IDataReader reader = database.ExecuteReader(command))
{
while (reader.Read())
{
LogManager.LogInformation(DateTime.Now.Millisecond.ToString("000"));
}
}

When I inspect the logfile that is written to, I see that every second record read takes 15ms-16ms of execution time.
...
24.10.2008 13:08:36 : #1033 [unknown] 515
24.10.2008 13:08:36 : #1034 [unknown] 515
24.10.2008 13:08:36 : #1035 [unknown] 530
24.10.2008 13:08:36 : #1036 [unknown] 530
24.10.2008 13:08:36 : #1037 [unknown] 546
24.10.2008 13:08:36 : #1038 [unknown] 546
24.10.2008 13:08:36 : #1039 [unknown] 562
24.10.2008 13:08:36 : #1040 [unknown] 562
...

I thought looping through the datareader should be very fast. Are there something I could do to speed things up?

mmetzger
October 24th, 2008, 07:40 AM
We'd have to see what your reader is doing overall - meaning what query are you running?, what database are you using?, what client are you using?, what are your connection string options?, etc...

Usually when I've had a slow data reader either my query is too complex or I'm asking for too much data.

Also, please make sure to use CODE tags - it makes everything more readable.

beck
October 24th, 2008, 08:42 AM
Here is the select that is executed:

SELECT *
FROM (SELECT utctime, source, message, severity, hidden, description, condition, subcondition, user1, user2, user3, user4, user5, ROWNUM rnum
FROM (SELECT TO_TIMESTAMP((TO_CHAR(utctime,'DD.MM.YYYY HH:MI:SS.') || TO_CHAR(MSEC,'000') || TO_CHAR(utctime,' AM')), 'DD.MM.YYYY HH:MI:SS.FF3 AM') AS utctime,LOCALTIME,
SYSTEM,source,message,severity,Hidden,description,condition,subcondition,user1,user2,user3,user4,user5
FROM EVLOG_ARCHIVE
WHERE category IN(1,2,3,4,5) AND Hidden='False'
AND utctime > TO_DATE('2007-05-21 08:00:00','yyyy-mm-dd hh24:mi:ss') AND utctime < TO_DATE('2007-05-21 09:00:00','yyyy-mm-dd hh24:mi:ss') AND severity > 0 AND severity <= 1000
ORDER BY UTCTime)
WHERE ROWNUM <= 1000)
WHERE rnum > 0

and the code:

using (IDbConnection connection = database.CreateConnection())
{
connection.Open();
DbCommand command = database.GetSqlStringCommand(query);

using (IDataReader reader = database.ExecuteReader(command))
{
while (reader.Read())
{
LogManager.LogInformation(DateTime.Now.Millisecond.ToString("000"));
}
}
}

BTW The database used is Oracle

TheCPUWizard
October 24th, 2008, 08:49 AM
You are skewing the information by the very fact you are attempting to log inside the loop.

Do a simple test:

for (int i=0; i<100;++i)
LogManager.LogInformation(DateTime.Now.Millisecond.ToString("000"));


Also ".Now" is SIGNIFICANTLY (a few hundred times) SLOWER than .UtcNow as every invocation must query the locale to get the timezone, and perform all of the necessary calculations.

Change to:

List<DataTime> stamps = new List<DateTime>();
while (reader.Read())
{
stamps.Add(DateTime.UtcNow);
}
// Print out the list here...


This will be much closer to "reality".

beck
October 24th, 2008, 09:04 AM
You are skewing the information by the very fact you are attempting to log inside the loop.

Do a simple test:

for (int i=0; i<100;++i)
LogManager.LogInformation(DateTime.Now.Millisecond.ToString("000"));


Also ".Now" is SIGNIFICANTLY (a few hundred times) SLOWER than .UtcNow as every invocation must query the locale to get the timezone, and perform all of the necessary calculations.

Change to:

List<DataTime> stamps = new List<DateTime>();
while (reader.Read())
{
stamps.Add(DateTime.UtcNow);
}
// Print out the list here...


This will be much closer to "reality".

That might be, but I am also timing the whole execution and it does not change significantly if I comment out the line of code that logs info inside the loop. My question is regarding performance of the reading of data.

mmetzger
October 24th, 2008, 09:20 AM
Not certain it will make a difference, but the query is pulling 14 columns of data - if you perform a simpler query does it improve?

Also you mentioned using Oracle - which client library are you using to access the data - Oracle's, the .NET version, OleDB, ODBC, etc?

beck
October 24th, 2008, 09:23 AM
What I have seen is that some of the columns that are included in the select-statement can hold a quite large textstring (Oracle type VARCHAR2(4000)). If I try the same query as above, but only get one coulmn, looping through the datareader is much faster. Typically I can see in the logfile that 30 lines are logged at the same time before between each 15ms-16ms delay. This is not a solution to the problem of course, but gives information about how the datareader works.

TheCPUWizard
October 24th, 2008, 09:50 AM
<snip> Typically I can see in the logfile that 30 lines are logged at the same time before between each 15ms-16ms delay. <snip>.
Unless you are multi-threading two things can NEVER happen "at the same time". Therefore your approach is flawed (you would have seen this if you had run the experiments I recommended previously).

It appears that you are getting the SAME sample multiple times followed by the next quantized sample..... (typically 16.666mS)