Accidental repost, please read the second post.
Is there a recommended I/O strategy for streaming server programming for Windows sockets?
All the articles I've read so far say that overlapped I/O with completion ports offers the best performance and scalability for socket server programming. But I've also noticed that these articles mainly cover web server programming. They highlight the importance of servicing thousands of connection requests in a matter of seconds. But what about a smaller number of connections, each requiring a constant feed for a longer period of time, or maybe continuously? Maybe UDP would be better suited for such a scenario, but for the sake of this post let's stick with TCP.
Are there any articles that compare web servers with streaming servers? Or are there any real differences at all? In my understanding of a streaming server, the producer is the creator of the data, for example a microphone connected to the PC or a network sniffer (it fills/creates the buffers to be sent to the clients), and the consumer is the server (it empties/posts the buffers to the network stack). This sounds different from a web server, where there's no urgent need to free buffers quickly so that the producer doesn't block because a circular buffer is full or no memory is available. Okay, this is what I think. But is my thinking right?
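The producer/consumer relationship described above can be sketched with a bounded queue of buffers. This is only a minimal, portable illustration (the class and method names are made up, not any real API): the producer blocks when the queue is full, which is exactly the situation a streaming server has to avoid by draining buffers fast enough.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <vector>

// Sketch of the streaming-server data path: the producer (e.g. a capture
// device) pushes byte buffers, the consumer (the send path) pops them.
// A fixed capacity makes the producer block when the consumer falls behind.
class BoundedBufferQueue {
public:
    explicit BoundedBufferQueue(std::size_t capacity) : capacity_(capacity) {}

    // Called by the producer; blocks while the queue is full.
    void push(std::vector<char> buf) {
        std::unique_lock<std::mutex> lock(m_);
        not_full_.wait(lock, [this] { return q_.size() < capacity_; });
        q_.push(std::move(buf));
        not_empty_.notify_one();
    }

    // Called by the consumer; blocks while the queue is empty.
    std::vector<char> pop() {
        std::unique_lock<std::mutex> lock(m_);
        not_empty_.wait(lock, [this] { return !q_.empty(); });
        std::vector<char> buf = std::move(q_.front());
        q_.pop();
        not_full_.notify_one();
        return buf;
    }

private:
    std::size_t capacity_;
    std::queue<std::vector<char>> q_;
    std::mutex m_;
    std::condition_variable not_empty_, not_full_;
};
```

In a real server the consumer side would hand each popped buffer to the network stack (e.g. a WSASend call); the point of the sketch is only the backpressure between producer and consumer.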
One thing I can think of against IOCP is the overhead of allocating/freeing the buffers used in WSASend. If I ever write such a server with IOCP, I think I'll have to use a memory pool. I know this is outside the scope of a network programming forum, but I've read an article or two about that too, and the reactions seem mixed. Some people say that malloc/HeapAlloc is just another memory pool routine anyway, so there's no way of beating it at its own game. Others say that a memory pool tailored to the needs of a particular program will beat general-purpose allocation routines any day. Is there anyone here willing to share their real-world experience in a couple of words?
Thanks for reading.
Please do not start more than one thread for the same question
Sorry about that, it was not intentional. When I first started the thread I got the message 'you are not logged in', and not the expected message 'your thread will be reviewed by an admin'. And since there was no notification in my inbox that someone was reviewing my post, I thought I'd done something wrong and reposted it.
The simplest approach from a coding point of view is starting a thread for each socket and using blocking reads/writes.
For small number of connections this will work.
For dozens/hundreds/thousands of connections it won't. If you try it that way, the biggest problems will be:
- a lot of time will be spent by the OS just servicing/managing the threads, with little actual work being done.
- you'll be wasting memory on all the thread stacks and local variables. This can get bad enough that you spend a large part of your time just paging in and out.
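For comparison, the thread-per-connection pattern looks roughly like this. The blocking recv()/send() loop is stubbed out with a placeholder handler so the sketch stays portable and compilable; all names here are illustrative, not Winsock APIs.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Counts how many client handlers have run; stands in for real work.
std::atomic<int> clients_served{0};

// In a real server this function would loop, blocking in recv() until data
// arrives and blocking in send() until it is written back out.
void handle_client(int /*socket_stand_in*/) {
    ++clients_served;
}

// One thread per accepted "connection": trivial to write, but every
// connection costs a full thread stack plus scheduler overhead, which is
// exactly why this stops scaling at larger connection counts.
void run_server(int connection_count) {
    std::vector<std::thread> workers;
    for (int s = 0; s < connection_count; ++s)
        workers.emplace_back(handle_client, s);
    for (std::thread& t : workers)
        t.join();
}
```

With a handful of connections this is perfectly fine; the memory and scheduling costs mentioned above only bite as the thread count grows.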
Thanks for the reply. There's lots of example code out there: simple select servers, multi-threaded servers, the Windows SDK example for an IOCP server, AsyncSelect and EventSelect servers. And I think I can find ways to test them, up to a point. But my tests would always be limited in scope and complexity, and it would be hard to write code for all the possible cases. For example, I asked in the original post whether a memory pool implementation would fare better for an IOCP server than calling HeapAlloc/malloc directly. That's just one case, and I honestly don't want to implement and test every scenario. That's why I'd appreciate real-life experiences.
Now you're confusing two totally unrelated items.
Memory management issues:
- Depending on the problem, regular new/delete might be enough.
- If you have (many) threads doing lots of memory (de)allocation, then it may not be, and depending on how localized the problem is, it could be solved by separating out thread-local memory management with C++ allocators.
- If the problem is severe, then you may even need custom allocators for everything.
- In extreme cases, even the overhead of the C++ library and the OS heap allocator may be too much, and you may need to write custom memory management.
- Memory pooling is sort of an in-between of the above that may or may not be appropriate for your problem. It is not a catch-all, and it's not needed in every situation either.
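To make "memory pooling" concrete: a fixed-size pool recycles blocks on a free list instead of returning them to the general-purpose allocator. This is only a sketch under the assumption of a single block size (the class name is made up); whether something like this beats HeapAlloc/malloc depends entirely on the program's allocation pattern.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Recycles fixed-size blocks on a free list. Acquiring a block reuses a
// recycled one when available and only falls back to the heap when the
// pool is empty; releasing a block recycles it instead of freeing it.
class FixedBufferPool {
public:
    explicit FixedBufferPool(std::size_t block_size) : block_size_(block_size) {}

    ~FixedBufferPool() {
        for (char* p : free_list_) delete[] p;
    }

    char* acquire() {
        std::lock_guard<std::mutex> lock(m_);
        if (!free_list_.empty()) {
            char* p = free_list_.back();   // reuse a recycled block
            free_list_.pop_back();
            return p;
        }
        return new char[block_size_];      // pool empty: fall back to the heap
    }

    void release(char* p) {
        std::lock_guard<std::mutex> lock(m_);
        free_list_.push_back(p);           // recycle rather than free
    }

    std::size_t cached() const {
        std::lock_guard<std::mutex> lock(m_);
        return free_list_.size();
    }

private:
    std::size_t block_size_;
    std::vector<char*> free_list_;
    mutable std::mutex m_;
};
```

For an IOCP server the attraction is that WSASend/WSARecv buffers tend to be of one or a few fixed sizes, which is the pattern a pool like this handles well.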
TL;DR: IOCP and memory management are unrelated: you pick your I/O method based on your I/O needs, and you pick your memory management scheme based on your (de)allocation needs.
No need for a tl;dr. I shouldn't be posting here if I didn't want to read the replies ;)
The memory issue was just one example of many things I could face. I was looking for articles with something to say about the pros and cons of various I/O strategies. The ones I've found are always in favor of IOCP, and the only negative thing they say about it is the difficulty of implementation. I've also seen a comparison of how many connections per second different server models accepted, but that was an echo server. Then I asked myself some questions. Is it the same for a streaming server (I'm not repeating my understanding of a streaming server here, it's in the first post)? Would memory allocation/deallocation create problems? Is it possible to have a case in which another I/O model could beat an IOCP server?
IOCP is complex to implement, but it is the only means of handling hundreds/thousands of simultaneous connections efficiently. That's also why any "good" implementation will favour it.
A separate thread with blocking reads is the simplest, but its disadvantage is that it doesn't scale well to larger numbers of connections. If you know up front that scaling is never going to be an issue, then it's a good way to get the programming done faster.
Even if scaling might become an issue, there's still the question of "do we go IOCP right away even though we don't need it yet, or can we afford a simple implementation now and a possible complete rewrite to IOCP somewhere in the future?"
Scaling doesn't stop with network I/O alone. As your needs for higher throughput and/or more simultaneous connections grow, you can compensate in part with better hardware, but EVERY aspect of your design can end up being the bottleneck. Improving hardware is often the "cheap" way out of bottlenecks, since software development (and especially high-end optimisation) takes time and quickly becomes more expensive.
- If your application does a lot of database access, the bottleneck might be there, so you'll need a better database system, possibly a database cluster.
- If you do disk I/O, the disk might be the bottleneck; maybe you need SSDs, and/or RAID striping, or ways to reduce/improve disk I/O.
- You could have a problem of excessive memory use (install more memory, or find ways to conserve it).
- Excessive memory allocation could be a bottleneck (multiple heaps, memory pooling, or one of many other remedies, or simply do fewer allocations: find ways to group them into a single (de)allocation, maybe by switching to a container better suited to the data).
- Your problem could be calculations (you need more CPU cores, or more optimized calculation routines).
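As a tiny example of grouping allocations into one: reserving a container's capacity up front replaces many incremental growth allocations with a single one. This is illustrative only; the same idea applies to any container whose size you can predict.

```cpp
#include <vector>

// Building a vector of n elements with one up-front allocation instead of
// letting it grow (and reallocate) repeatedly as elements are appended.
std::vector<int> build(int n) {
    std::vector<int> v;
    v.reserve(n);  // single allocation covers all n push_backs below
    for (int i = 0; i < n; ++i)
        v.push_back(i);
    return v;
}
```

The same principle scales up: one big (de)allocation for a batch of buffers is usually cheaper than many small ones.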
There's no point in trying to take "the best" approach for every part of your design; you identify the parts that are the bottleneck and improve there.
If your entire IOCP-based app only does a handful of (de)allocations per second, then the best (and possibly most complex) memory management method will not help you one bit.
The thing is: if you have a problem that's "big enough" to need IOCP, then you'll likely have several other key parts of your software that will need attention to keep up with the network I/O. Memory management COULD be one of them, or it could not; it depends on what you're doing with the network data.
But there are cases where the difference is near zero or certainly not big enough to warrant the extra complexity/programming time.
As for streaming... it all depends on speed/throughput/data size, on expectations of guaranteed delivery, and on server CPU/memory/disk usage.
The closer expectations come to the physical limits of the hardware, the better your programming will need to be.