I have a very simple capitalization program that auto-capitalizes text files. It works from a capitalization dictionary, which is also a text file. The dictionary is nothing more than a list of words and phrases that are always capitalized a certain way, one entry per line. So, for example, if the capitalization dictionary consists of the following:


"Napoleon Dynamite"
French
iPad
I


and the string to be processed is:

i watched "napoleon dynamite" in french on my ipad

the output after processing will be:

I watched "Napoleon Dynamite" in French on my iPad
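
For reference, here is a minimal sketch of how a dictionary file in this format could be loaded, one entry per line. The loading code is not part of the functions posted further down, so the names `DictionaryLoader`, `ReadEntries`, and `LoadFromFile` are hypothetical, and passing an explicit `Encoding` is an assumption about how the file was saved rather than a description of what my program does:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class DictionaryLoader
{
    // Hypothetical helper: reads one entry per line, trims surrounding
    // whitespace, and skips blank lines.
    public static List<string> ReadEntries(TextReader reader)
    {
        List<string> entries = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            line = line.Trim();
            if (line.Length > 0)
            {
                entries.Add(line);
            }
        }
        return entries;
    }

    // Hypothetical helper: opens the file with an explicitly chosen encoding
    // instead of relying on the reader's default.
    public static List<string> LoadFromFile(string path, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(path, encoding))
        {
            return ReadEntries(reader);
        }
    }

    public static void Main()
    {
        // In-memory stand-in for the dictionary file from the example above.
        string text = "\"Napoleon Dynamite\"\nFrench\n\niPad\nI\n";
        List<string> entries = ReadEntries(new StringReader(text));
        Console.WriteLine(entries.Count); // 4
    }
}
```

With a real file this would be called as something like `DictionaryLoader.LoadFromFile("capitalization.txt", Encoding.UTF8)`, where the file name is made up and the encoding has to match however the file was actually saved.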

However, for some reason I cannot figure out, the program strips out (i.e., deletes) any character that has a diacritic. So if I add to my capitalization dictionary the word:

Napoléon

and the string is:

i watched "napoleon dynamite" in french on my ipad with my friend napoléon

what I end up with is:

I watched "Napoleon Dynamite" in French on my iPad with my friend napolon

Obviously, this is not desired. Can anyone help me figure out what the fix might be, to make sure that letters with diacritics are treated properly rather than being deleted? I think it might have something to do with the ToLower function...
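
To test the ToLower suspicion in isolation, a small check like the following could be run. It is only a sketch: the class name `ToLowerCheck` and its helper `KeepsLength` are made up for illustration, and the é is written as the escape `\u00E9` so that the result does not depend on how the source file itself is saved.

```csharp
using System;

public static class ToLowerCheck
{
    // Hypothetical helper: true when lowercasing did not change the
    // number of characters in the string.
    public static bool KeepsLength(string s)
    {
        return s.ToLower().Length == s.Length;
    }

    public static void Main()
    {
        // "Napoléon", with the é spelled as an escape sequence.
        string word = "Napol\u00E9on";
        string lower = word.ToLower();

        Console.WriteLine(KeepsLength(word));             // True
        Console.WriteLine(lower.IndexOf('\u00E9') >= 0);  // True: the é is still there
    }
}
```

If both lines print True, ToLower is not what removes the character, and the loss would have to happen somewhere else, for example where the text is read from or written to a file.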

Here is the code of the function:

Code:
        public static string CapitalizeString(List<string> wordList, string str)
        {

            // Nothing to capitalize for a null or empty input.
            if (string.IsNullOrEmpty(str))
            {
                return "";
            }

            string capitalizedString = str.ToLower();

            capitalizedString = ReplaceCapitalizedWord(wordList, capitalizedString);

            // Restore the capital on the first letter or digit, but only if it was uppercase in the original input.
            for (int i = 0; i < capitalizedString.Length; i++)
            {
                char ch = capitalizedString[i];
                if (char.IsLetter(ch) || char.IsNumber(ch))
                {
                    if (char.IsUpper(str[i]))
                    {
                        capitalizedString = capitalizedString.Substring(0, i) + capitalizedString.Substring(i, 1).ToUpper() + capitalizedString.Substring(i + 1);
                    }

                    break;
                }
            }

            return capitalizedString;

        }


        private static string ReplaceCapitalizedWord(List<string> capitalsWordList, string stringToCapitalize)
        {
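            // Lowercased copy used only for searching (the caller already passes a lowercased string).
            string lowerCaseString = stringToCapitalize.ToLower();
            // Working copy that receives the capitalized replacements.
            string capitalizedString = stringToCapitalize;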

            foreach (string capStr in capitalsWordList)
            {
                string capStrLower = capStr.ToLower();
                int startIndex = 0;
                int foundIndex = -1;
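                // Ordinal search: a match always spans exactly capStr.Length characters, which the Substring splice below relies on.
                while (startIndex < capitalizedString.Length && (foundIndex = lowerCaseString.IndexOf(capStrLower, startIndex, StringComparison.Ordinal)) >= 0)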
                {
                    bool isSeparatorPrevChar = true;
                    bool isSeparatorNextChar = true;

                    if (foundIndex > 0)
                    {
                        char prevChar = capitalizedString[foundIndex - 1];
                        isSeparatorPrevChar = !char.IsLetterOrDigit(prevChar) && prevChar != '-' && prevChar != '\'';
                    }

                    if (foundIndex + capStr.Length < capitalizedString.Length)
                    {
                        char nextChar = capitalizedString[foundIndex + capStr.Length];
                        isSeparatorNextChar = !char.IsLetterOrDigit(nextChar) && nextChar != '-';
                    }


                    if (isSeparatorPrevChar && isSeparatorNextChar)
                    {
                        capitalizedString = capitalizedString.Substring(0, foundIndex) + capStr + capitalizedString.Substring(foundIndex + capStr.Length);
                    }

                    startIndex = foundIndex + 1;
                }

            }

            return capitalizedString;
        }

Using .NET 2.0 (I think; I am compiling in Visual C# 2005 Express, anyway).