End of File Character?

Thread: End of File Character?


  1. #1
    Join Date
    Feb 2002
    Location
    New Delhi
    Posts
    25

    End of File Character?

    As far as I know, the end-of-file character is ^Z.

    Is there a way to delete it from a given file?

    Whether it is ^Z or some other character, would placing that character somewhere in the middle of a file mark EOF at that point? I guess not, because binary files have a high probability of containing such a character.

    Is there a tool to edit the EOF? How do calls such as fopen, open, or other file manipulation calls detect EOF internally?

    Spread Love and Knowledge.

  2. #2
    Join Date
    Sep 2001
    Location
    San Diego
    Posts
    2,147

    Re: End of File Character?


    EOF, or ^Z, as you point out, is simply a marker used when dealing with text-only files. Applications reading text files see the EOF marker and decide that's the end of the file. It's just for convenience. Obviously there may be ^Z characters within binary files; when a file is opened in binary mode they get no special treatment and are read as ordinary data, in exactly the same way as line feeds, carriage returns and tabs.

    There is no difference at all between a binary file and a text file. It is whoever opens them that determines the file mode (text/binary).
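
    To see that the mode is chosen by whoever opens the file, here is a minimal sketch that reads the same file under a caller-supplied fopen mode and counts what comes back. The name count_bytes_in_mode is made up for illustration. On Windows C runtimes the "r" count is typically smaller than the "rb" count when the file contains a Ctrl-Z or CR-LF pairs; on POSIX systems the two modes behave the same, so treat any difference as platform-dependent.

```c
#include <stdio.h>

/* Hypothetical helper: count how many characters fgetc() delivers
   when the file at `path` is opened with the given fopen() mode.
   Returns -1 if the file cannot be opened or a read error occurs. */
long count_bytes_in_mode(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    long count = 0;
    int ch;                 /* int, not char, so EOF stays distinguishable */

    if (fp == NULL)
        return -1;

    while ((ch = fgetc(fp)) != EOF)
        ++count;

    if (ferror(fp))         /* the loop ended on an error, not end of file */
        count = -1;
    fclose(fp);
    return count;
}
```

    Calling it once with "r" and once with "rb" on the same file shows whether your runtime treats the two modes differently.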

    EOF (or ^Z) is equivalent to character 26 - to ignore or change them, open the file in binary mode, then look for character 26 or these other text-mode characters:

    Symbol  Name            Decimal code
    ^Z      EOF             26
    \n      NewLine         10
    \r      CarriageReturn  13
    \t      Tab             9
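
    As a rough sketch of "open in binary mode, then look for character 26", something like the following should work with any standard C library. The function name count_ctrl_z is an assumption for illustration, not an existing API.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes with value 26 (Ctrl-Z, 0x1A)
   in a file. Returns the count, or -1 if the file cannot be opened. */
long count_ctrl_z(const char *path)
{
    FILE *fp = fopen(path, "rb");   /* "b": no text-mode translation */
    long count = 0;
    int ch;

    if (fp == NULL)
        return -1;

    while ((ch = fgetc(fp)) != EOF) {
        if (ch == 26)               /* just another byte in binary mode */
            ++count;
    }
    fclose(fp);
    return count;
}
```

    The same loop can look for 10, 13 or 9 instead of 26.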



    Hope this helps,

    - Nigel



  3. #3
    Join Date
    Feb 2002
    Location
    Somewhere in the United States.
    Posts
    58

    Re: End of File Character?

    This is an interesting thing to note. I had no idea that the end of file was actually a character; I was under the impression that it was more or less an abstraction handed out by the filesystem, and that there was no end-of-file character, just an end-of-file position.

    I have spoken with a great many people on this subject and they all seem to either not know what the character (or value in general) is or believe that it's something dictated and specific to the filesystem on which you're running - are you sure that you can be so specific as to say that end of file is ^Z on all filesystems at all times?

    You mentioned files storing text data; does this apply only to them? I'm not sure I completely understand or agree that end of file is ^Z, and I would like to see some proof if it can be provided.


  4. #4
    Join Date
    Sep 2001
    Location
    San Diego
    Posts
    2,147

    Re: End of File Character?


    I think the general concept of the end-of-file marker has changed over the years. Some file mechanisms interpret the end of file one way, some another.

    Originally, text files were terminated with an optional ^Z character, following the end-of-file marker used for the stdin stream (from the days before DOS). Anyone reading a file stream would see this marker, know it was the end of the file, and stop reading. If the marker was not there, the reading mechanism reported that there was no more data by returning the ^Z character to the requesting routine (thereby simulating the EOF marker).

    This is still apparent in the documentation for low-level file routines such as '_read' (declared in io.h rather than stdio.h), although the details regarding Ctrl-Z are not very clear there. It states that a Ctrl-Z will terminate the reading in text mode, but that you can bypass it by using lseek to jump beyond it - hardly the point of using text mode any more...

    Stranger still, routines such as fgetc that retrieve single characters from a file stream represent EOF as -1 (0xFFFFFFFF as a 32-bit int), while their wide character counterparts represent the end of file as WEOF (0xFFFF in the Microsoft headers).
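
    One consequence worth spelling out: the EOF that fgetc returns is an out-of-band int value, not a byte stored in the file. A real 0xFF byte comes back as 255, so the two cannot be confused as long as the result is kept in an int. The sketch below illustrates this; contains_byte_ff is a made-up name.

```c
#include <stdio.h>

/* Hypothetical helper: report whether a file contains a 0xFF byte.
   Returns 1 if it does, 0 if it does not, -1 on open or read failure. */
int contains_byte_ff(const char *path)
{
    FILE *fp = fopen(path, "rb");
    int found = 0;
    int ch;     /* must be int: a char could not tell 0xFF from EOF */

    if (fp == NULL)
        return -1;

    while ((ch = fgetc(fp)) != EOF) {
        if (ch == 0xFF)     /* fgetc returns the byte as 0..255 */
            found = 1;
    }
    if (ferror(fp))         /* EOF was returned because of an error */
        found = -1;
    fclose(fp);
    return found;
}
```

    Storing the result of fgetc in a plain char is the classic way to get this wrong, since a 0xFF byte may then compare equal to EOF and end the loop early.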

    This becomes more confusing when you look at modern implementations that do not process the Ctrl-Z in the same way as some of the older routines. In fact most new routines give the ^Z character no special meaning, electing to just read and return what they find.

    So, back to the original question, as to whether it's possible to open a file and change the Ctrl-Z characters to something else - sure, but it's a lot less painful if you open the file in binary mode, which has none of the above 'features'.
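
    For what it's worth, a binary-mode replacement pass could be sketched roughly like this. It copies to a second file rather than editing in place, which is only one way to do it; the name replace_ctrl_z and its parameters are assumptions for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: copy `src` to `dst`, replacing every Ctrl-Z
   byte (26) with `replacement`. Returns the number of bytes replaced,
   or -1 on any open, read or write failure. */
long replace_ctrl_z(const char *src, const char *dst,
                    unsigned char replacement)
{
    FILE *in = fopen(src, "rb");
    FILE *out;
    long replaced = 0;
    int ch;

    if (in == NULL)
        return -1;
    out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    while ((ch = fgetc(in)) != EOF) {
        if (ch == 26) {             /* Ctrl-Z is ordinary data here */
            ch = replacement;
            ++replaced;
        }
        if (fputc(ch, out) == EOF) {
            replaced = -1;          /* write failure */
            break;
        }
    }
    if (ferror(in))
        replaced = -1;              /* read failure */
    fclose(in);
    if (fclose(out) == EOF)         /* flush can fail too */
        replaced = -1;
    return replaced;
}
```

    Passing a space as the replacement keeps the file length unchanged; dropping the byte instead would only need the fputc call to be skipped for 26.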

    Hope this helps,

    - Nigel



  5. #5
    Join Date
    Feb 2002
    Location
    Somewhere in the United States.
    Posts
    58

    Re: End of File Character?

    So, what I can gather from what you've written is that there really are a lot of ways to look at it and there are a lot of implementations out there that may or may not have been accepted at one time or another and may or may not be in use still.....very nice......


  6. #6
    Join Date
    Sep 2001
    Location
    San Diego
    Posts
    2,147

    Re: End of File Character?


    Absolutely - it's things like this that keep you on your toes :-)

    - Nigel


