character count
anissurendran
May 20th, 2006, 03:53 AM
Hi everyone,
I want to find the character count of a document file. I used the code below.
$lines = file('E:/FILELIST.DOC');
$tot = 0;
foreach ($lines as $line_num => $line)
{
    echo $line_num . "=" . $line . "<br>";
    $tot = $tot + strlen($lines[$line_num]) - 2;
}
echo $tot;
This code worked correctly at first, but now I get output like this:
0=ÐÏࡱá>þÿ 35ÿÿÿÿ Code guru (I have deleted the remaining symbols and characters because the output is too long).
The count shown is 28648. The actual character count is 8.
Why does this happen? Is there any other way to find the character count?
cherish
May 20th, 2006, 08:56 AM
I wonder, which language is the document you're trying to count written in? It might be a language issue. If so, you might want to check PHP's mb_strlen() function (http://us3.php.net/manual/en/function.mb-strlen.php).
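To show the difference between the two functions, here is a minimal sketch. It assumes the text is UTF-8 encoded and that the mbstring extension is loaded; the sample string is made up for illustration.

```php
<?php
// Assumption: the source text is UTF-8 and the mbstring extension is available.
// The sample string is hypothetical, not taken from the document in question.
$text = "naïve café";

// strlen() counts bytes: "ï" and "é" each take two bytes in UTF-8.
$bytes = strlen($text);

// mb_strlen() counts characters when given the correct encoding.
$chars = mb_strlen($text, 'UTF-8');

echo $bytes . " bytes, " . $chars . " characters\n";
```

If both functions return the same number for your text, multibyte characters are probably not the cause of the wrong count.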
anissurendran
May 21st, 2006, 10:23 PM
Hi friends,
Thanks, cherish.
I don't think it is a language problem. The language I used is English. The symbols and characters shown in the output include the properties of the .doc file.
I think the header and footer are also included. At first it worked well and I got the correct count. When I check the same text in a .txt file, I get the correct count. But I need the count in a .doc file. Is there any way to solve this problem?
PeejAvery
May 21st, 2006, 10:33 PM
But I need the count in a .doc file. Is there any way to solve this problem?
If the .doc file was created in Microsoft Word or another word processor, you will not be able to get the count that simply. I am not even sure you can get it at all. Such a file is not plain text: it stores formatting and document properties alongside the text, so there is a lot of extra data in there.
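One way to tell the two cases apart before counting: the symbols at the start of the output posted above (ÐÏ...) match the 8-byte signature of the OLE2 compound file container that binary Word documents use. The sketch below checks for that signature. The function name looks_like_binary_doc is made up for illustration, and the check only says what kind of file it is; it does not extract any text.

```php
<?php
// Hypothetical helper (not part of PHP): returns true when the file at $path
// starts with the 8-byte OLE2 compound file signature used by binary .doc files.
function looks_like_binary_doc($path) {
    if (!is_readable($path)) {
        return false;
    }
    // Open in binary mode and read only the first 8 bytes.
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }
    $header = fread($handle, 8);
    fclose($handle);
    // Signature bytes: D0 CF 11 E0 A1 B1 1A E1
    return $header === "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";
}
```

If this returns true, counting with file() and strlen() will include all the binary data, which would explain a count far larger than the visible text.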
dave2k
May 22nd, 2006, 03:25 AM
One possible way is to do what you did, but with an array of accepted characters, i.e. 'a', 'b', 'c', etc., and only increment the counter when one of them appears. The only trouble is that those characters may also be present in data you don't want. I also found http://www.phpwordlib.motion-bg.com/
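A rough sketch of that idea, assuming the accepted set is just ASCII letters and digits. The function name count_accepted_chars and the sample data are made up for illustration, and the caveat above still applies: letters that happen to occur inside the binary parts of the file get counted too.

```php
<?php
// Hypothetical helper: counts only the bytes of $data that appear in $accepted.
function count_accepted_chars($data, $accepted) {
    $count = 0;
    $length = strlen($data);
    for ($i = 0; $i < $length; $i++) {
        // Strict comparison so each byte is matched as a one-character string.
        if (in_array($data[$i], $accepted, true)) {
            $count++;
        }
    }
    return $count;
}

// Assumption: only ASCII letters and digits count as characters.
$accepted = str_split('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789');

// Made-up sample: two binary bytes, the visible text, then a NUL byte.
echo count_accepted_chars("\xD0\xCFCode guru\x00", $accepted); // prints 8
```

Spaces and punctuation are skipped here; add them to the accepted list if they should count.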
e_har
June 2nd, 2006, 01:54 AM
Hi... I hope that no one here has given up hope of finding new ways to solve this problem... at least I haven't... hehe...
I have come up with the following function. I hope that I got it right.
<?php
function adv_count_words($str) {
    $words = 0;
    // Collapse each run of spaces into a single space so that explode() does not produce empty pieces.
    $str = preg_replace('/ +/', ' ', $str);
    // Break the string into pieces that are separated by spaces and place them into an array.
    $array = explode(' ', $str);
    // For every piece in the array, make one more test to ensure it is a word: it must contain a letter or a digit.
    for ($i = 0; $i < count($array); $i++) {
        if (preg_match('/[0-9A-Za-z]/', $array[$i]))
            $words++;
    }
    return $words;
}
?>
codeguru.com
Copyright Internet.com Inc., All Rights Reserved.