November 7th, 2008, 04:23 AM
#11
Re: Consistent performance issues at high bandwidths, UDP.
I went to the site yesterday and discovered the problem. Before going on site I did a few more experiments with the theory that it was a driver problem on the MacBooks we were using, as I never saw the issue on the PC development machine. There was only a 10/100 switch available to test with at the time. Testing on that, with a 100mbit connection connected to one of the hardware devices, there were no issues at all. We dropped the connection speed down to 10mbits and still, no issues. There was no opportunity to test with a gigabit connection there, so I went to the site with the idea that it was a strange bug with the MacBook drivers in gigabit mode only.
Well, after getting there I immediately tested on the MacBook in gigabit mode and the problems happened. Skipping 100mbits, I tested at 10mbits and the delays disappeared and performance was significantly better. Ok, that supports the broken driver theory. So I tested on a PC running at 1gbit and... it didn't work. Same delays. Confused, I tested on my development machine at 1gbit and... same problems. There goes the MacBook theory.
Well, I was not responsible for the network setup at the site and had assumed it was sane. As it turns out, the 59 hardware devices can only run at 10mbit speeds (they are not configurable). However, the network consisted of 5 Netgear unmanaged switches, daisy-chained with 1gbit connections, with each one breaking out into 11-13 of the 10mbit devices, and the machine connected at 1gbit to the end of the chain. (The person on-site who had verified that all devices were connected at gigabit speeds had misinterpreted the lights on the front of the switches, which actually showed 10mbits to each device.)
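As a rough illustration of why that speed mismatch hurts, here is a back-of-the-envelope sketch of how quickly a switch port buffer fills when a 1gbit sender bursts toward a 10mbit device. The link speeds come from the post; `BUFFER_BYTES` is a made-up figure for illustration, not a spec for any Netgear model, and the helper names are hypothetical.

```python
# Link speeds are from the post; the buffer size is an assumption.
INGRESS_BPS = 1_000_000_000   # sender link: 1 Gbit/s
EGRESS_BPS = 10_000_000       # device link: 10 Mbit/s
BUFFER_BYTES = 128 * 1024     # assumed per-port buffer, illustration only


def seconds_until_overflow(ingress_bps, egress_bps, buffer_bytes):
    """Time for a buffer to fill while data arrives faster than it drains."""
    net_fill_bps = ingress_bps - egress_bps
    if net_fill_bps <= 0:
        return float("inf")  # the port drains at least as fast as it fills
    return buffer_bytes * 8 / net_fill_bps


def drain_seconds(egress_bps, buffer_bytes):
    """Time to empty a full buffer onto the slow link once the burst stops."""
    return buffer_bytes * 8 / egress_bps


fill = seconds_until_overflow(INGRESS_BPS, EGRESS_BPS, BUFFER_BYTES)
drain = drain_seconds(EGRESS_BPS, BUFFER_BYTES)
print(f"buffer fills after about {fill * 1e3:.2f} ms of line-rate burst")
print(f"and needs about {drain * 1e3:.1f} ms to drain at 10 Mbit/s")
```

Under these assumed numbers the buffer fills roughly a hundred times faster than it can drain, so anything sent after that point is either dropped or held back by flow control.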
All of the dropped packets I was seeing were caused entirely by connection speed mismatches and the inability of the dumb switches to convert. The inconsistent frame rates with delays at regular intervals that I was seeing were entirely caused by ethernet flow control. I suspect that Windows somehow handles flow control logic per-socket, which may explain the difference in delays with different socket configurations, and also may explain the shorter delays and better performance with the one-socket-per-destination configuration. All of these problems were made worse by per-switch inconsistencies caused by the network layout.
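To make the drop-versus-delay trade-off concrete, here is a toy tick-based model of one fast sender feeding one slow port. It is only a sketch: it does not model real 802.3x PAUSE timing or any particular switch, and the function name and all parameter values are hypothetical.

```python
def simulate(ticks, ingress, egress, capacity, flow_control):
    """Toy model: a sender offers `ingress` packets per tick to a port buffer
    holding `capacity` packets, which drains `egress` packets per tick.

    With flow_control=True the sender is paused for the whole tick whenever
    the buffer cannot take a full burst; otherwise the overflow is dropped.
    """
    queued = dropped = paused_ticks = delivered = 0
    for _ in range(ticks):
        room = capacity - queued
        if flow_control and room < ingress:
            paused_ticks += 1              # sender stalls, nothing is lost
        else:
            accepted = min(ingress, room)
            dropped += ingress - accepted  # overflow is silently discarded
            queued += accepted
        sent = min(egress, queued)         # slow link drains the buffer
        queued -= sent
        delivered += sent
    return {"delivered": delivered, "dropped": dropped,
            "paused_ticks": paused_ticks}


# Assumed 10:1 rate mismatch and a 20-packet buffer, for illustration only.
without_fc = simulate(100, ingress=10, egress=1, capacity=20, flow_control=False)
with_fc = simulate(100, ingress=10, egress=1, capacity=20, flow_control=True)
print("no flow control:", without_fc)
print("flow control:   ", with_fc)
```

In this model the slow port delivers the same amount of data either way. Without flow control the excess is dropped; with it, nothing is dropped but the sender spends most ticks stalled, in a regular pattern, which is at least consistent with the periodic delays described above.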
This also explains why the development machine did not show any of the issues: it was never connected to a full set of 59 devices, and when it was connected to test hardware, it was connected through switches that could handle the data rate conversion properly. It also explains why we didn't see the problem in the tests earlier that day with the 10/100 switch, which could handle rate changes correctly. Had we had a smart gigabit switch available at that time, it would have also worked just fine.
To verify all of that: disabling flow control eliminated the delays, dropping the computer down to a 10mbit connection eliminated the dropped packets, and reducing the number of devices on the network also eliminated the delays.
The solution is to fix the network layout. We ordered a new set of nicer, smarter switches and reorganized the layout to something more sane. One switch will serve as the top level switch with a gigabit connection to the machine, and will connect to each of the 5 other switches which will be connected to the devices -- the new switches can handle the data rate conversions, and the new layout also equalizes the bandwidth through all of the switches. We'll have to experiment with flow control settings.
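A small sketch of what "equalizes the bandwidth" means here, comparing the old chain with the planned layout. The split of the 59 devices across the 5 switches is an assumption (the post only gives 11-13 per switch), and the function names are hypothetical.

```python
# Assumed split of the 59 devices; the post only says 11-13 per switch.
DEVICES_PER_SWITCH = [12, 12, 12, 12, 11]


def daisy_chain(devices_per_switch):
    """Old layout: the machine sits at one end of a chain of switches.

    Returns (switches traversed to reach each group of devices,
             number of devices whose traffic crosses each switch).
    """
    n = len(devices_per_switch)
    hops = [i + 1 for i in range(n)]
    load = [sum(devices_per_switch[i:]) for i in range(n)]
    return hops, load


def star(devices_per_switch):
    """Planned layout: a top-level switch feeds every edge switch directly."""
    hops = [2] * len(devices_per_switch)  # top-level switch + one edge switch
    load = list(devices_per_switch)       # each edge switch carries only its own
    return hops, load


chain_hops, chain_load = daisy_chain(DEVICES_PER_SWITCH)
star_hops, star_load = star(DEVICES_PER_SWITCH)
print("chain: hops", chain_hops, "devices per switch path", chain_load)
print("star:  hops", star_hops, "devices per switch path", star_load)
```

In the chain, the switch nearest the machine forwards traffic for all 59 devices while the last one handles only its own, and devices sit anywhere from one to five switches away. In the planned layout every device is two switches away and each edge switch carries only its own 11-13 devices.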
Unfortunately I had to return home before the new devices arrived; hopefully the new setup will resolve all of the issues.
Thanks for all the advice here. I thought I'd post back with the (likely) solution, since it was completely different from all of the other theories.
J