Array subset

Thread: Array subset

  1. #1
    Join Date
    Jul 2003
    Location
    Springfield
    Posts
    188

    Array subset

    Hi,
    I need to create an array subset. ArraySegment is not really what I'm looking for, since I would like to access the subset the same way you access any other array. The most important thing is that the subset must not be a copy of the array.

    Code:
    		byte[] buffer = new byte[]{ 1, 2, 3, 4, 5 };
    
    		int startindex = 1, count = 2;
    		byte[] subarray = Something(buffer, startindex, count);
    
    		subarray[0] = 8; // Access buffer's position [1] with [0] index on subarray
    		subarray[1] = 9; // Access buffer's position [2] with [1] index on subarray
    
    		// buffer must now contain { 1, 8, 9, 4, 5 };

    Can anyone help me implement the Something function, please?
    Mr. Burns

  2. #2
    Join Date
    Jul 2012
    Posts
    90

    Re: Array subset

    I think this may suit your needs...

    Code:
        public class SubArray
        {
            private byte[] m_baseArray;
            private byte[] m_subArray;
            private Int32 m_startIndex = 0;
    
            public SubArray(byte[] buffer, Int32 startIndex, Int32 count)
            {
                if (buffer.Length - startIndex - count < 0) { throw new ArgumentOutOfRangeException(); }
                m_subArray = new byte[count];
                m_baseArray = buffer;
                Array.ConstrainedCopy(buffer, startIndex, m_subArray, 0, count);
                m_startIndex = startIndex;
            }
    
            public byte this[Int32 index]
            {
                get { return m_subArray[index]; }
                set
                {
                    m_subArray[index] = value;
                    Array.ConstrainedCopy(m_subArray, 0, m_baseArray, m_startIndex, m_subArray.Length);
                }
            }
    
            public byte[] baseArray
            {
                get { return m_baseArray; }
                set { }
            }
        }
    The baseArray property is effectively read-only (its setter does nothing) and allows access to the base array after the class is instantiated. Pass the base array, start index, and count in the constructor. Internally a new array is created and exposed via the indexer property. The indexer's setter copies any updates made to the new array back to the base array.

    Usage:

    Code:
        byte[] buffer = new byte[]{ 1, 2, 3, 4, 5 };
    
        int startindex = 1, count = 2;
        SubArray subarray = new SubArray(buffer, startindex, count);
    
        subarray[0] = 8;
        subarray[1] = 9;
        buffer = subarray.baseArray;
    
        //  buffer now contains { 1, 8, 9, 4, 5 };

    I "cheated" a little in that internally to the class the sub array is actually a copy of a segment of the original array, but any updates to it are synced back to the original array, which can then be accessed at any time via the baseArray property.
    Last edited by CGKevin; October 12th, 2012 at 09:47 AM.

  3. #3
    Join Date
    Jul 2003
    Location
    Springfield
    Posts
    188

    Re: Array subset

    Thank you for your reply.

    As you can see in my code the function I need returns a byte[], which is what I need, because I must also pass the sub buffer to methods that accept byte[], and that is where I must not have a copy of the buffer. (If you think, for example, of adding a conversion to byte[] to your class, that would solve it, but it would need a copy.)
    Mr. Burns

  4. #4
    Join Date
    Jan 2010
    Posts
    1,099

    Re: Array subset

    Quote Originally Posted by MontgomeryBurns View Post
    As you can see in my code the function I need returns a byte[], which is what I need
    Well, sorry, but that is never gonna work that way in C#, because byte is a value type, which means it's always passed around by value (a copy is made). What you can do is wrap each byte into a reference type (a class you'd make specifically for this purpose), and then instead have an array of wrapper objects, so you can share them and use them to manipulate the bytes they contain, and then provide a method that will turn the whole thing into a byte[] when required. Or you could alternatively modify your code so that it always uses the original byte array.

    You can also try using List<byte>, which provides methods for replacing/inserting elements or subsections with one or more bytes, so you can simulate the behavior that way. It also provides a ToArray() method which returns a byte[].

    Can you describe what you're trying to do in more detail?

  5. #5
    Join Date
    Jul 2003
    Location
    Springfield
    Posts
    188

    Re: Array subset

    Quote Originally Posted by TheGreatCthulhu View Post
    Can you describe what you're trying to do in more detail?
    Hi!
    One of my classes, which takes care of socket communication between devices, returns an array (byte[]) containing a whole message. The message consists of some initial bytes followed by a data part. I need to operate on the data part without losing performance to copying (and I don't actually need a copy for any other reason). I represent the message here as follows:

    [P] [P] [P] [P] [P] [D] [D] [D] ... [D] [D] [D] [CHK]

    [P] bytes are the bytes needed by protocol communication
    [D] bytes are the "sub buffer" of data bytes
    [CHK] final checksum

    The thing is that for every message that arrives I currently copy the data bytes into a new array that contains only the [D] bytes.
    I need to use the sub buffer for all further handling of the data after the message arrives. I do not need the [P] bytes anymore once the message is decoded.

    On a desktop PC everything works fine, but when the application runs on a PDA, all of those useless copies consume a lot of RAM and CPU time, and sometimes the PDA even throws an OutOfMemoryException. Consider that my message usually consists of a few [P] bytes and many [D] bytes. For example, a message might be 5000 bytes, of which only 10 are [P] bytes.
    That is why I would like to operate on the original array and not on a copy of it. I understand now that this may not be possible in C#. Everything would have been much easier in C++, where you could simply take a pointer to a byte in the middle, use it as an array of bytes, and pass it as a byte array argument.
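
    As a rough sketch of the layout above: the [D] region can be described by an offset and a count into the original message, for example with ArraySegment<byte>. This does not provide the byte[]-style indexing asked for in the first post; it only shows that the data part can be located without copying. The header and checksum lengths and the MessageLayout / DataPart names are assumptions made up for illustration.

```csharp
using System;

public static class MessageLayout
{
    // Hypothetical sizes, chosen only for illustration.
    const int HeaderLength = 5;   // number of [P] bytes
    const int ChecksumLength = 1; // the trailing [CHK] byte

    // Describes the [D] region of a message without copying any bytes.
    public static ArraySegment<byte> DataPart(byte[] message)
    {
        if (message == null) throw new ArgumentNullException("message");
        int count = message.Length - HeaderLength - ChecksumLength;
        if (count < 0) throw new ArgumentException("Message too short.", "message");
        return new ArraySegment<byte>(message, HeaderLength, count);
    }

    public static void Main()
    {
        byte[] message = { 1, 1, 1, 1, 1, 10, 20, 30, 99 };
        ArraySegment<byte> data = DataPart(message);

        // Writing through the segment touches the original array.
        data.Array[data.Offset] = 42;
        Console.WriteLine(message[5]); // 42
        Console.WriteLine(data.Count); // 3
    }
}
```

    The segment only stores the array reference, the offset and the count, so creating one per message costs a few bytes regardless of the message size.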
    Last edited by MontgomeryBurns; October 15th, 2012 at 04:33 AM.
    Mr. Burns

  6. #6
    Join Date
    Jul 2012
    Posts
    90

    Re: Array subset

    Just a thought...

    Is there any possibility of modifying the socket communication class to return separate arrays? A [P] array, a [D] array and the [CHK] separately. Then if, for some reason, you need the entire thing as a single array, you can combine the 2 arrays and [CHK] into a single array.

  7. #7
    Join Date
    Jan 2010
    Posts
    1,099

    Re: Array subset

    You could do something like this.
    While byte is a value type, array types aren't - they are reference types, which really means they behave much like C++ pointers, except that you don't have to worry about deleting them, since C# is a managed language.
    So, create a class to represent a subarray, pass the original array to a constructor, or a init method, along with the bounds of the subarray, and then implement an indexer on that class, so that it translates the subarray indices into the original indices, and updates the original array, a reference to which is maintained internally. This will probably allow you to keep most of your original code, as far as app logic is concerned, but some changes will be necessary.

    I currently don't have time to write a short example, but I'm sure some of the other members will be able to expand on the idea, if required.

  8. #8
    Join Date
    Jul 2003
    Location
    Springfield
    Posts
    188

    Re: Array subset

    Quote Originally Posted by TheGreatCthulhu View Post
    I currently don't have time to write a short example, but I'm sure some of the other members will be able to expand on the idea, if required.
    Like CGKevin's class in first thread's reply? I already have exactly what you describe, with an indexer too. As I explained, all of my problems are in passing that kind of object to methods that take a byte[] as an input argument. The only way to do that is to implement a cast function in the SubArray class, but this would end up requiring an array copy of the data part.
    Mr. Burns

  9. #9
    Join Date
    Jan 2010
    Posts
    1,099

    Re: Array subset

    Quote Originally Posted by MontgomeryBurns View Post
    Like CGKevin's class in first thread's reply?
    No. What CGKevin tried to do was to meet your requirement for the method to return a byte[]. However, as I already told you, in C# you can't both meet the requirement to return a byte[] and the requirement for it to be a shallow copy of the original array, at least not without doing some kind of boxing on the byte type, since byte is a value type. So, you need to redesign your app, but, fortunately, only slightly.

    Take a look at this piece of code - it's just a quick example to demonstrate the idea:
    Code:
    class SubArray
        {
            private byte[] _srcArray = null;   // this is a reference to the *original* array
            private int _start = 0;
            private int _count = 0;
    
            public SubArray(byte[] array, int start, int count)
            {
                _srcArray = array;
                _start = start;
                _count = count;
            }
    
            public byte this[int index]
            {
                get
                {
                    if (!CheckBounds(index))
                        throw new IndexOutOfRangeException();
    
                    int i = index + _start;
                    return _srcArray[i];
                }
                set
                {
                    if (!CheckBounds(index))
                        throw new IndexOutOfRangeException();
    
                    int i = index + _start;
                    _srcArray[i] = value;
                }
            }
    
            public int Count
            {
                get { return _count; }
            }
    
            private bool CheckBounds(int index)
            {
                if (index < 0 || index >= _count)
                    return false;
    
                return true;
            }
        }
    Basically, the _srcArray variable is not a copy - it references (points to) the original array. The indexer enables you to work with a SubArray object pretty much the same way you'd work with an ordinary array. This code is just bare bones; so you'll want to add guard conditions and such, to make sure all the invariants are maintained.

    You can then use the class like this:
    Code:
            static void Main(string[] args)
            {
                byte[] testArray = new byte[] { 0, 1, 2, 3, 4, 5, 6, 7 };
    
                SubArray sub = new SubArray(testArray, 2, 3);
                sub[0] = 255;
                sub[1] = 255;
                sub[2] = 255;
    
                foreach (byte b in testArray)
                    Console.WriteLine(b);
            }
    This prints out:
    0
    1
    255
    255
    255
    5
    6
    7


    No copy is made, and the original array is updated.

    You can go even further than that, and make the class implement the IList<byte> interface. This interface is implemented by byte[] as well. Although it provides generic methods like Add(), Remove(), Insert(), and such, the array class simply throws an InvalidOperationException for these. It also implements them explicitly, so that they normally don't appear in IntelliSense unless the interface itself is used to access the members. The only thing that would require a bit of work is the implementation of the GetEnumerator() methods, where you would need to provide a custom enumerator to iterate only through the desired segment. This method is used by the foreach statement, so it must be implemented in order for foreach to work. But the for loop will work normally with or without it. Once you've implemented the IList<byte> interface, you can replace all your byte[] method parameters with IList<byte>, and it will work the same as before (you won't have to change most of the code, and you'll still be able to pass byte[] variables).
    The up side is, if you later on decide to use some other list-like structure instead of a byte[] (like List<byte>), you can do that with minimum (or almost no) additional work.
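
    A minimal sketch of the IList<byte> idea described above, not a finished implementation. It assumes the size-changing members are rejected with NotSupportedException and uses a C# iterator (yield return) for GetEnumerator(); the Demo class and its Fill method are hypothetical and exist only to show a method that accepts IList<byte>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Sketch only: a fixed-size, zero-copy window over an existing byte[].
public class SubArray : IList<byte>
{
    private readonly byte[] _source; // reference to the original array, not a copy
    private readonly int _start;
    private readonly int _count;

    public SubArray(byte[] source, int start, int count)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (start < 0 || count < 0 || start > source.Length - count)
            throw new ArgumentOutOfRangeException();
        _source = source; _start = start; _count = count;
    }

    public byte this[int index]
    {
        get { Check(index); return _source[_start + index]; }
        set { Check(index); _source[_start + index] = value; }
    }

    public int Count { get { return _count; } }

    // Elements can be overwritten in place, so report a writable (though fixed-size) list.
    public bool IsReadOnly { get { return false; } }

    public int IndexOf(byte item)
    {
        int i = Array.IndexOf(_source, item, _start, _count);
        return i < 0 ? -1 : i - _start; // translate back to sub-array indices
    }

    public bool Contains(byte item) { return IndexOf(item) >= 0; }

    public void CopyTo(byte[] array, int arrayIndex)
    {
        Array.Copy(_source, _start, array, arrayIndex, _count);
    }

    // Iterates only over the window, which is what foreach will use.
    public IEnumerator<byte> GetEnumerator()
    {
        for (int i = 0; i < _count; i++)
            yield return _source[_start + i];
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    // Size-changing members are implemented explicitly and rejected.
    void ICollection<byte>.Add(byte item) { throw new NotSupportedException(); }
    void ICollection<byte>.Clear() { throw new NotSupportedException(); }
    bool ICollection<byte>.Remove(byte item) { throw new NotSupportedException(); }
    void IList<byte>.Insert(int index, byte item) { throw new NotSupportedException(); }
    void IList<byte>.RemoveAt(int index) { throw new NotSupportedException(); }

    private void Check(int index)
    {
        if (index < 0 || index >= _count) throw new IndexOutOfRangeException();
    }
}

public static class Demo
{
    // Hypothetical consumer: accepts byte[], List<byte> or SubArray alike.
    public static void Fill(IList<byte> buffer, byte value)
    {
        for (int i = 0; i < buffer.Count; i++) buffer[i] = value;
    }

    public static void Main()
    {
        byte[] original = { 0, 1, 2, 3, 4, 5, 6, 7 };
        Fill(new SubArray(original, 2, 3), 255);
        foreach (byte b in original)
            Console.WriteLine(b); // 0 1 255 255 255 5 6 7, one per line
    }
}
```

    Note that the loop in Fill uses Count rather than Length, which is the one change existing byte[] code typically needs when its parameter type becomes IList<byte>.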

  10. #10
    Join Date
    Jul 2003
    Location
    Springfield
    Posts
    188

    Re: Array subset

    OK, what I was trying to ask is, how can I pass such a SubArray class to a function like:
    Code:
    		public void Update(byte[] buffer)
    		{
    			...
    		}
    Mr. Burns

  11. #11
    Join Date
    Jan 2010
    Posts
    1,099

    Re: Array subset

    And what I'm trying to tell you is that you can't do it that way; you have to change the signatures of such methods in one of two ways:
    1. Change the signature of the methods that are supposed to operate on the sub-array so that they accept a SubArray instance, and adapt your code to that change. Depending on how your app is structured, you either already have candidate methods for this refactoring, or you can create some.

      So, it would be:
      public void Update(SubArray buffer) { /* ... */ }

      Only minor changes in the code that actually does the work on buffer will be required.

    2. Make SubArray implement IList<byte>, and then simply replace byte[] with IList<byte> wherever you see it (except of course where you must pass the modified original array as a byte[] to some library function, or something along those lines). After that, probably the only change you'll have to make is to replace occurrences of xyz.Length with xyz.Count. Once that is done, your existing code should work just the same with byte[], SubArray, List<byte> or whatever other structure you might use, as long as it too implements IList<byte> - it wouldn't matter.

      To sum up:
      public void Update(IList<byte> buffer) { /* ... */ }

      However, if the sole purpose of those methods is to manipulate the sub-array, then, even if you implement IList<byte>, you can still use the declaration from (1) for convenience.
      public void Update(SubArray buffer) { /* ... */ }


    It is important for you to understand the reason why this can't be done the way you originally wanted to do it. In order to have a method return a sub-array as a byte[] array, the method has to make a copy of each byte. What you needed is for the two arrays to refer to the bytes from the same set, to put it that way. Even in C++, you would have to have an array of pointers to bytes in order to do this. But, those pointers would require more memory than the bytes themselves, in order to store memory addresses, so nothing would be gained. C# equivalent of this would be to wrap each byte into a wrapper class (since in C# classes behave a lot like pointers), but again, this would do more harm than good.
    So, what I did there with the SubArray class is: I took a reference to the original array, kept it hidden from the outside world, and then basically pretended that only a subset of elements is available, by allowing access only to those elements, and passing the modifications back to the original array via the reference. (Don't be confused by the term "reference". The only real reason it's not called "pointer" is that it's garbage collected, and that, unlike with C++ pointers, you can't directly manipulate the address it stores.)

    So, there's really no way to avoid redesign of your code - but again, it's nothing drastic.
    If you choose to implement the IList<byte> interface, the only part that might appear tricky to you will be implementing the two GetEnumerator() methods - but that is not that hard at all.
    These methods need to return an enumerator object, the primary purpose of which is to enable your SubArray class to work with the foreach loop.
    One method returns an IEnumerator<byte>, while the other returns a more abstract and non generic IEnumerator, but since IEnumerator<byte> inherits IEnumerator, you can return the same object for both.
    All you need to do is implement a small SubArrayEnumerator class, and here you can see how to do it (scroll down to the Remarks section, followed by an example). It's just a few extra lines of code.

    Again, for the other IList<byte> methods (Add, Insert, Remove, RemoveAt, Clear) which are not supposed to be supported by the SubArray class, simply do what byte[] array itself does: use explicit interface implementation, and just throw a NotSupportedException (I originally said it's InvalidOperationException, sry, my bad).
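
    For illustration, a hand-written enumerator along the lines described above might look like the following. This is a sketch under assumptions: the SubArrayEnumerator constructor taking the source array, start and count is invented for this example, and it does no argument validation.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical enumerator that walks only the [start, start + count) window.
public sealed class SubArrayEnumerator : IEnumerator<byte>
{
    private readonly byte[] _source;
    private readonly int _start;
    private readonly int _count;
    private int _position = -1; // positioned before the first element

    public SubArrayEnumerator(byte[] source, int start, int count)
    {
        _source = source; _start = start; _count = count;
    }

    // Advances to the next element; returns false once past the end of the window.
    public bool MoveNext()
    {
        if (_position < _count) _position++;
        return _position < _count;
    }

    public byte Current
    {
        get
        {
            if (_position < 0 || _position >= _count)
                throw new InvalidOperationException();
            return _source[_start + _position];
        }
    }

    // The non-generic interface returns the same value, boxed.
    object IEnumerator.Current { get { return Current; } }

    public void Reset() { _position = -1; }

    public void Dispose() { } // nothing to release
}

public static class EnumeratorDemo
{
    public static void Main()
    {
        byte[] original = { 0, 1, 2, 3, 4, 5, 6, 7 };
        using (SubArrayEnumerator e = new SubArrayEnumerator(original, 2, 3))
        {
            while (e.MoveNext())
                Console.WriteLine(e.Current); // 2, 3, 4 on separate lines
        }
    }
}
```

    Both GetEnumerator() methods of the sub-array class could then return a new instance of this one type, since IEnumerator<byte> also satisfies the non-generic IEnumerator.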

  12. #12
    Join Date
    Jul 2003
    Location
    Springfield
    Posts
    188

    Re: Array subset

    Quote Originally Posted by TheGreatCthulhu View Post
    public void Update(SubArray buffer) { /* ... */ }
    public void Update(IList<byte> buffer) { /* ... */ }
    These are some library functions. I cannot imagine to rewrite anything that accepts byte[], even .Net Framework functions like FileStream.Write. It is not applicable.

    Quote Originally Posted by TheGreatCthulhu View Post
    It is important for you to understand the reason why this can't be done the way you originally wanted to do it.
    Even in C++, you would have to have an array of pointers to bytes in order to do this. But, those pointers would require more memory than the bytes themselves, in order to store memory addresses, so nothing would be gained.
    In C++ I would simply do it like the following, without any array copy to access the subbuffer nor array of pointers:
    Code:
    FILE * pFile;
    
    unsigned char * Something(unsigned char * arr)
    {
    	return arr +2;
    }
    
    void Update(unsigned char * buffer)
    {
    	// ...
    	// treat buffer as any other array:
    	fwrite (buffer , 1 , sizeof(buffer) , pFile );
    
    	return;
    }
    
    
    int _tmain(int argc, _TCHAR* argv[])
    {
    	unsigned char buffer[] = { 1, 2, 3, 4, 5 };;
    
    	unsigned char * subarray = Something(buffer);
    
    	subarray[0] = 8;
    	subarray[1] = 9;
    
    	Update(subarray);
    
    	return 0;
    }
    Anyway I now understand that what I need is something that C# can't give. I see it as a limit of this programming language.
    Mr. Burns

  13. #13
    Join Date
    Jan 2010
    Posts
    1,099

    Re: Array subset

    Quote Originally Posted by MontgomeryBurns View Post
    In C++ I would simply do it like the following, without any array copy to access the subbuffer nor array of pointers:
    Code:
    FILE * pFile;
    
    unsigned char * Something(unsigned char * arr)
    {
    	return arr +2;
    }
    
    void Update(unsigned char * buffer)
    {
    	// ...
    	// treat buffer as any other array:
    	fwrite (buffer , 1 , sizeof(buffer) , pFile );
    
    	return;
    }
    
    
    int _tmain(int argc, _TCHAR* argv[])
    {
    	unsigned char buffer[] = { 1, 2, 3, 4, 5 };;
    
    	unsigned char * subarray = Something(buffer);
    
    	subarray[0] = 8;
    	subarray[1] = 9;
    
    	Update(subarray);
    
    	return 0;
    }
    Oh, really? Well, sorry to disappoint you, but you just broke that program, and introduced a nasty bug. It's because sizeof(buffer) doesn't return what you think it does. In this case, it's going to return the "char-relative" size of the unsigned char *, i.e. how many bytes the pointer takes to store an address. It's not going to return the size of your sub-array. The bottom line is, if a library function such as Update() is not specifically designed to handle this kind of pointer arithmetic gymnastics, then you'll be doing something dangerous.

    Quote Originally Posted by MontgomeryBurns View Post
    These are some library functions. I cannot imagine to rewrite anything that accepts byte[], even .Net Framework functions like FileStream.Write. It is not applicable.
    Aah, I see what's your scenario now. I was under the impression that you want to create a sub-array, pass it to some auxiliary methods of your own that would modify the segment, and then, once all that was done, pass the whole, original array to some library functions. Of course I don't expect you to rewrite a bunch of libraries - I thought we were talking about some app-specific methods, still work-in-progress. Nevertheless, if those library methods aren't designed to handle such a scenario, then, even if you could pull it off, there's no guarantee that the application will work correctly.
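
    For the specific case of FileStream.Write mentioned above, the .NET Stream.Write method takes the array together with an offset and a count, so a sub-range can be written without copying it first. The sketch below uses a MemoryStream as a stand-in for a FileStream; the DataPartWriter helper and the header and checksum lengths are assumptions for illustration. Whether other library functions (such as Update) offer a comparable overload depends on the library.

```csharp
using System;
using System.IO;

public static class DataPartWriter
{
    // Hypothetical helper: writes only the [D] bytes of a message to any Stream.
    public static void WriteData(Stream target, byte[] message, int headerLength, int checksumLength)
    {
        int count = message.Length - headerLength - checksumLength;
        if (count < 0) throw new ArgumentException("Message too short.", "message");

        // Stream.Write(byte[], int, int) reads straight from the original array.
        target.Write(message, headerLength, count);
    }

    public static void Main()
    {
        // Hypothetical message: 5 [P] bytes, 3 [D] bytes, 1 [CHK] byte.
        byte[] message = { 1, 1, 1, 1, 1, 10, 20, 30, 99 };
        using (MemoryStream stream = new MemoryStream())
        {
            WriteData(stream, message, 5, 1);
            Console.WriteLine(stream.Length); // 3
        }
    }
}
```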

    In C#, you can work with pointers, however, under certain restrictions (related to the fact that C# is a managed language), in a so-called unsafe context - but I don't think that's going to help in this case. Are those libraries .NET libraries, or are they wrappers around native code? Maybe you can pull something off in C++/CLI ("managed c++")?

    In any case, this could indicate that you could maybe redesign your app, if possible, to avoid the need for this. Also, are you sure that copying is really the primary bottleneck here? It could be other things - are you using some tools to analyze performance?

    Quote Originally Posted by MontgomeryBurns View Post
    Anyway I now understand that what I need is something that C# can't give. I see it as a limit of this programming language.
    Well, maybe. It does, at first glance, seem a little stupid that there isn't a way to create an array that would internally maintain a pointer to some location in an existing array, as well as a length variable, taking advantage of the fact that arrays are full-blown objects in C#... No reason why it couldn't still be managed by the garbage collector. It would certainly be better than messing with pointers in C++. But, maybe it's a conscious design choice by the C# team, maybe it somehow introduces problems or complexity to the garbage collection algorithm...
    Last edited by TheGreatCthulhu; October 18th, 2012 at 05:32 PM.

  14. #14
    Join Date
    Jul 2003
    Location
    Springfield
    Posts
    188

    Re: Array subset

    Quote Originally Posted by TheGreatCthulhu View Post
    Oh, really? Well, sorry to disappoint you, but you just broke that program, and introduced a nasty bug.
    Sorry for this, I copy-pasted some example code just to show what Update might look like. The point I wanted to make is that the Update function will write bytes starting from the position passed to it. If you correct the bug by changing sizeof(buffer) to 3 in my example, the subarray that is passed to Update will be written. Therefore 8, 9, 5 will be written by fwrite (buffer[2], buffer[3], buffer[4]).
    Quote Originally Posted by TheGreatCthulhu View Post
    Aah, I see what's your scenario now. I was under the impression that you want to create a sub-array, pass it to some auxiliary methods of your own that would modify the segment, and then, once all that was done, pass the whole, original array to some library functions. Of course I don't expect you to rewrite a bunch of libraries - I thought we were talking about some app-specific methods, still work-in-progress. Nevertheless, if those library methods aren't designed to handle such a scenario, then, even if you could pull it off, there's no guarantee that the application will work correctly.
    No, I don't want to pass the whole original array to Library functions or .Net functions, I want to pass the subarray. I want to pass the subarray to some library functions such as Update or FileStream.Write and others.
    Quote Originally Posted by TheGreatCthulhu View Post
    In any case, this could indicate that you could maybe redesign your app, if possible, to avoid the need for this. Also, are you sure that copying is really the primary bottleneck here? It could be other things - are you using some tools to analyze performance?
    I've got a profiler for the application when it runs on a PC, but the problem appears when I run it under Windows Mobile on a PDA, where resources are limited. I have a solution in mind, which is to change the communication protocol in order to reduce the number of bytes exchanged per message. That might solve the OutOfMemoryException on the PDA, but it would be a bigger effort and would reduce speed (which must be taken care of).
    Mr. Burns
