Opening files where the name contains '
I have a cpp app that reads in a number of files and writes revised output. The app doesn't seem to be able to open a file with a ' in the file name, such as,
N,N'-dimethylethylenediamine.mol
I have been using this app for a while and I think I would have noticed this, but I haven't. Is there some reason why there would be a problem with the file name?
This is the function that opens the file,
Code:
#include <algorithm>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// opens mol file, reads in rows to string vector and returns vector
vector<string> get_mol_file(string& filePath) {
    vector<string> mol_file;
    string new_mol_line;
    // create an input stream and open the mol file
    ifstream read_mol_input;
    read_mol_input.open( filePath.c_str() );
    // check if the file was opened
    if(!read_mol_input.is_open()) {
        cout << "mol file " << filePath << " could not be opened" << endl;
        exit(-3);
    }
    // add each keep line to mol_file
    while(getline(read_mol_input, new_mol_line)) {
        // remove windows '\r' character if present
        new_mol_line.erase(remove(new_mol_line.begin(), new_mol_line.end(), '\r'), new_mol_line.end());
        // store keep values in vector
        mol_file.push_back(new_mol_line);
    }
    return mol_file;
}
The path to the file is passed as a C++ string, and its C-string version (c_str()) is used to open the file. Do I need to handle this as a special case? The names could also contain ", parentheses, etc.
Should I expect there to be a problem with this as it is written, or is something else going on? I can post the code and test files if that would help.
LMHmedchem
Re: Opening files where the name contains '
Does filePath contain the correct file name?
Re: Opening files where the name contains '
Are you creating your filePath string with escape sequences for directories?
eg. "c:\\C++\\test.txt"
Re: Opening files where the name contains '
Quote:
Originally Posted by LMHmedchem
I have a cpp app that reads in a number of files and writes revised output. The app doesn't seem to be able to open a file with a ' in the file name, such as,
N,N'-dimethylethylenediamine.mol
Well instead of all that code, where we don't know what's behind all of those variables, why not post a program that proves that you have an issue?
Code:
#include <fstream>
#include <iostream>
int main()
{
std::ifstream ifs("N,N'-dimethylethylenediamine.mol");
if ( !ifs )
std::cout << "File could not be opened";
else
std::cout << "File opened successfully";
}
So what gets printed when you run this program? If the file cannot be opened, what if you specified a full path name?
Regards,
Paul McKenzie
Re: Opening files where the name contains '
Quote:
Originally Posted by laserlight
Does filePath contain the correct file name?
This is an interesting question. There may be a problem in that the list of file names comes out of Excel and is then processed by a bash script. It looks like there may be more than one version of the single quote in Excel. I can search in Excel for a file name and it says it's not there, even though I can clearly see the file name. The single quote in Excel looks a bit different, like a closing quote (slightly curved). If I change it to the single quote from the keyboard, I can find the file. The file names were copied from a website, so I don't know if a different font was used there, etc. Is there more than one kind of single quote in ASCII? Single quotes are a pain in that all of the different tools I am using seem to handle them differently.
Presuming that the file path is correct and matches an actual file, is there any reason why I should have to handle the single quote differently than other characters? I wouldn't expect it to need to be escaped, like in a bash script or something like that.
Quote:
Originally Posted by Paul McKenzie
Well instead of all that code, where we don't know what's behind all of those variables, why not post a program that proves that you have an issue?
I certainly can post the entire program, but it's a few hundred lines and I didn't know if anyone wanted to bother to sift through all of that. I have used this app for quite a while and it has worked fine. Suddenly today, it said it couldn't open files that were actually there. All of the problem file names contain ', so I started tracking down if there was an issue with that, such as if cpp looked at that as a regular expression.
The output is a revised format of the input information. I have attached the src and test files.
This was built with,
g++ -O2 -o sdf_build.exe sdf_build_j.cpp; strip sdf_build.exe
and run as,
./sdf_build.exe -d test_molfiles/ -k _keep_build.txt -i makesdf_test_input.txt -o test_output.sdf
The file test_output.sdf is the correct output. With the files in the .zip, this runs correctly, which isn't much help. There must be something about the single quote from my excel files that is the problem, since I am able to get it to work with these files.
Tomorrow I will try again to replicate the list of files that doesn't work from excel. Hopefully someone can spot what the difference is.
Quote:
Originally Posted by Mavens
Are you creating your filePath string with escape sequences for directories?
eg. "c:\\C++\\test.txt"
The file path is just a concatenation of the file name and the directory that is passed in as an argument. The path is always relative and the directory is always in pwd. This is not a very sophisticated method, but the app is never used in cases where these conditions are not true, so I didn't include a way to use a full path. There is code that adds ./ to the dir_name/ that is passed in.
LMHmedchem
Re: Opening files where the name contains '
The quote in N,N'-dimethylethylenediamine.mol needs to be set as escape sequence as N,N\'-dimethylethylenediamine.mol
Regards,
Thomas
Re: Opening files where the name contains '
Quote:
Originally Posted by LMHmedchem
Presuming that the file path is correct and matches an actual file, is there any reason why I should have to handle the single quote differently than other characters?
If it is really just a single quote, no.
Quote:
Originally Posted by greve
The quote in N,N'-dimethylethylenediamine.mol needs to be set as escape sequence as N,N\'-dimethylethylenediamine.mol
That is not true. Even if this were embedded in a string literal, escaping the single quote would be optional.
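A minimal sketch of that point, in case it helps. The function name escape_is_optional is made up for illustration; the only thing being shown is that ' and \' inside a double-quoted string literal are the same character (0x27).

```cpp
#include <string>

// Hypothetical helper for illustration only.
// Inside a double-quoted string literal, ' and \' both denote the
// single ASCII apostrophe (0x27), so the two strings compare equal.
bool escape_is_optional() {
    const std::string plain   = "N,N'-dimethylethylenediamine.mol";
    const std::string escaped = "N,N\'-dimethylethylenediamine.mol";
    return plain == escaped && plain[3] == 0x27;
}
```

The backslash is only required inside a character literal ('\''), and none of this applies to a name read from a file at run time, since escape sequences are processed by the compiler.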
Re: Opening files where the name contains '
Quote:
Originally Posted by LMHmedchem
I certainly can post the entire program, but it's a few hundred lines and I didn't know if anyone wanted to bother to sift through all of that.
I did not ask you to post your program. I'm asking you to run the program I posted. If it works, OK, if it doesn't then the file doesn't exist or truly can't be opened.
Quote:
so I started tracking down if there was an issue with that, such as if cpp looked at that as a regular expression.
Why would it do that, unless the underlying OS treats the character as something special?
Quote:
There must be something about the single quote from my excel files that is the problem, since I am able to get it to work with these files.
Well, that might be the problem. Is it an ASCII quote, or a special "curly" quote that is not the normal ASCII quote?
I know that single and double quotes copied from a Word document and pasted into a text editor are usually not the ASCII single/double quotes, since Word substitutes the curly versions.
Do you have a small C++ program that lists the files in a directory? If you do, then run it under the debugger and see what names come back. Inspect carefully the quote character, and see what the ASCII value of that character is. If it's not 0x27 (the ASCII quote), then that's the problem.
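If a debugger is not handy, something along these lines would show the same thing. This is only a sketch: hex_bytes and is_plain_ascii are names I made up, and it assumes the name is already in a std::string (for example, a line read from the list file).

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: returns the bytes of `name` as space-separated
// two-digit hex values, so a look-alike quote stands out from the
// ASCII apostrophe (27).
std::string hex_bytes(const std::string& name) {
    std::string out;
    char buf[4];
    for (unsigned char c : name) {
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(c));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}

// Hypothetical helper: true if every byte is 7-bit ASCII (<= 0x7F).
bool is_plain_ascii(const std::string& name) {
    for (unsigned char c : name)
        if (c > 0x7F) return false;
    return true;
}
```

Printing hex_bytes(filePath) just before the open call would show 27 for a real ASCII quote. A curly quote would typically show up as E2 80 99 if the text is UTF-8, or as a single byte 92 if it is Windows-1252; which one you see depends on how the list was exported.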
Regards,
Paul McKenzie
Re: Opening files where the name contains '
Quote:
Originally Posted by Mavens
Are you creating your filePath string with escape sequences for directories?
eg. "c:\\C++\\test.txt"
These days "c:/C++/test.txt" works just as well.
Re: Opening files where the name contains '
Well, it looks like the single quote was the problem. The same quote character was not used consistently, either in the file names themselves or in the file that lists them. The curly single quote was there in many places instead of the ASCII single quote. If I pasted the file names into Crimson Editor, it showed a ? instead of the single quote, so it didn't recognize the character.
I could modify the program to treat the file names in the input file as having ASCII single quotes, but I'm not sure there is anything reasonable I can do about the file names. There is another issue in that the double quotes " are really two consecutive single quotes because filenames can't contain a double quote. I've wished for a long time that there were dedicated ASCII delimiter characters so that the characters that are used in common language wouldn't be interpreted as an instruction, but I guess that train left the station long ago.
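For the first option (normalizing the names read from the input file), a rough sketch might look like the following. The names replace_all and normalize_quotes are my own, not from the app, and it assumes the list file is UTF-8, where the curly single quotes U+2018 and U+2019 are the byte sequences E2 80 98 and E2 80 99.

```cpp
#include <cstddef>
#include <string>

// Replaces every occurrence of `from` in `text` with `to`.
static void replace_all(std::string& text, const std::string& from,
                        const std::string& to) {
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
}

// Hypothetical helper: maps curly single quotes to the ASCII apostrophe.
// Assumption: `name` is UTF-8 encoded.
std::string normalize_quotes(std::string name) {
    replace_all(name, "\xE2\x80\x98", "'");  // U+2018 left single quote
    replace_all(name, "\xE2\x80\x99", "'");  // U+2019 right single quote
    return name;
}
```

Each name from the input file would be passed through normalize_quotes before the path is built. If the list turns out to be Windows-1252 instead, the quotes are the single bytes 0x91 and 0x92 and the patterns would need to change. This only helps when the files on disk already use the ASCII quote.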
My app prints the name of any file it can't find, so I guess I will just have to fix the file names as I find them. I will do a find and replace in Excel, and I can probably write a script to change the file names. I will have to be careful when copying a name from a website.
LMHmedchem