str_word_count
Return information about words used in a string
(PHP 4 >= 4.3.0, PHP 5)
Example 2450. A str_word_count() example
Code Examples / Notes » str_word_count
megat
[Ed: You'd probably want to use regular expressions if this were the case. --alindeman @ php.net]
Consider what will happen in some of the above suggestions when a person puts more than one space between words. That's why it's not sufficient just to explode the string.
webmaster
I was trying to make an efficient word splitter and "paragraph limiter", e.g. to limit item text to 100 or 200 words and so forth. I don't know how well this compares with the alternatives, but it works nicely.
function trim_text($string, $word_count = 100)
{
    $trimmed = "";
    // collapse runs of spaces so explode() yields no empty entries
    $string = preg_replace("/\040+/", " ", trim($string));
    $stringc = explode(" ", $string);
    if ($word_count >= sizeof($stringc)) {
        // nothing to do, our string is smaller than the limit.
        return $string;
    }
    // trim the string to the word count
    for ($i = 0; $i < $word_count; $i++) {
        $trimmed .= $stringc[$i] . " ";
    }
    $trimmed = trim($trimmed);
    // if the cut already ends on a full stop, add only two more dots
    if (substr($trimmed, -1) == '.')
        return $trimmed . '..';
    else
        return $trimmed . '...';
}
$text = "some test text goes in here, I'm not sure, but ok.";
echo trim_text($text, 5);
geertdd
This is an update to my previously posted word_limiter() function. The regex is even more optimized now. Just replace the preg_match line with:
<?php
preg_match('/^\s*(?:\S+\s*){1,'. (int) $limit .'}/', $str, $matches);
?>
aidan
This functionality is now implemented in the PEAR package PHP_Compat. More information about using this function without upgrading your version of PHP can be found at the link below:
http://pear.php.net/package/PHP_Compat
16-jan-2005 02:38
This function seems to view numbers as whitespace. I.e. a word consisting of numbers only won't be counted.
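A minimal sketch of the behaviour described in this note. It assumes PHP 5.1.0 or later, where str_word_count() accepts an optional third argument listing extra characters to treat as part of a word. The sample sentence is made up for illustration.

```php
<?php
// Sample input chosen for illustration only.
$text = "PHP 5 has 3 modes";

// By default digits are not word characters, so "5" and "3" are skipped.
$default = str_word_count($text);

// Listing the digits in the third argument makes them count as word characters.
$withDigits = str_word_count($text, 0, '0123456789');

echo $default, "\n";    // 3
echo $withDigits, "\n"; // 5
```

If digit-only words should count, passing the digits explicitly is probably the smallest change. A regular expression gives finer control when other characters matter too.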
muz1
This function is awesome; however, I needed to display the first 100 words of a string. I am submitting this as a possible solution, but also to get feedback on whether it is the most efficient way of doing it.
<?php
$currString = explode(" ", $string);
// stop early when the string has fewer than 100 words
$limit = min(100, count($currString));
for ($wordCounter = 0; $wordCounter < $limit; $wordCounter++) {
    echo $currString[$wordCounter] . " ";
}
?>
brettnospam
This example may not be pretty, but it proves accurate:
<?php
//count words
$words_to_count = strip_tags($body);
$pattern = "/[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-\-|:|\&|@)]+/";
$words_to_count = preg_replace($pattern, " ", $words_to_count);
$words_to_count = trim($words_to_count);
$total_words = count(explode(" ", $words_to_count));
?>
Hope I didn't miss any punctuation. ;-)
rabin
There is a small bug in the "trim_text" function by "webmaster at joshstmarie dot com" below. If the string's word count is less than or equal to $truncation, that function will cut off the last word in the string.
[EDITOR'S NOTE: above referenced note has been removed]
This fixes the problem:
<?php
function trim_text_fixed($string, $truncation = 250)
{
    // split into at most $truncation + 1 pieces; the last piece holds the remainder
    $matches = preg_split("/\s+/", $string, $truncation + 1);
    $sz = count($matches);
    if ($sz > $truncation) {
        // drop the remainder and rejoin the first $truncation words
        unset($matches[$sz - 1]);
        return implode(' ', $matches);
    }
    return $string;
}
?>
philip
Some ask why not just split on ' '. It's because simply exploding on a ' ' isn't fully accurate. Words can be separated by tabs, newlines, double spaces, etc. This is why people tend to separate on all whitespace with regular expressions.
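To make the difference concrete, here is a short sketch comparing the two approaches. The sample string is invented for illustration.

```php
<?php
// Sample input with a double space, a tab and a newline (illustrative only).
$text = "one  two\tthree\nfour";

// explode() splits on single spaces only: it yields an empty entry for the
// double space and leaves the tab- and newline-separated words glued together.
$exploded = explode(' ', $text);

// preg_split() on \s+ treats any run of whitespace as one separator;
// PREG_SPLIT_NO_EMPTY drops empty pieces caused by leading or trailing whitespace.
$split = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

echo count($exploded), "\n"; // 3
echo count($split), "\n";    // 4
```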
aix
One function.
<?php
if (!function_exists('word_count')) {
    function word_count($str, $n = "0") {
        // repeatedly collapse double spaces into single spaces
        $m = strlen($str) / 2;
        $a = 1;
        while ($a < $m) {
            $str = str_replace("  ", " ", $str);
            $a++;
        }
        $b = explode(" ", $str);
        $i = 0;
        foreach ($b as $v) {
            $i++;
        }
        if ($n == 1)
            return $b;
        else
            return $i;
    }
}
$str = "Tere Tartu linn";
$c = word_count($str, 1); // returns an array of the words
$d = word_count($str);    // returns an int: how many words were in the text
print_r($c);
echo $d;
?>
kirils solovjovs
None of this worked for me. I think countwords() is very encoding dependent. This is the code for win1257. For other layouts you just need to redefine the ranges of letters...
<?php
function countwords($text)
{
    $ls = 0;   // 1 while inside a word, 0 after a separator
    $cc33 = 0; // counter
    for ($i = 0; $i < strlen($text); $i++) {
        $spstat = false; // is it a number or a letter?
        $ot = ord($text[$i]);
        if ((($ot >= 48) && ($ot <= 57)) || (($ot >= 97) && ($ot <= 122)) || (($ot >= 65) && ($ot <= 90)) || ($ot == 170) ||
            (($ot >= 192) && ($ot <= 214)) || (($ot >= 216) && ($ot <= 246)) || (($ot >= 248) && ($ot <= 254)))
            $spstat = true;
        if (($ls == 0) && ($spstat)) {
            // first character of a new word
            $ls = 1;
            $cc33++;
        }
        if (!$spstat)
            $ls = 0;
    }
    return $cc33;
}
?>
artimis
Never use this function to count or separate alphanumeric words; it splits them wherever letters meet digits. Use preg_split() instead when splitting alphanumeric words. It works with Chinese characters as well.
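A small sketch of the difference for alphanumeric words. The sample sentence is made up, and splitting on whitespace is only one possible choice of pattern for preg_split().

```php
<?php
// Sample input mixing letters and digits (illustrative only).
$text = "model x200 shipped in 2007";

// str_word_count() treats digits as separators: "x200" becomes "x" and "2007" disappears.
$words = str_word_count($text, 1);

// Splitting on whitespace keeps alphanumeric tokens intact.
$tokens = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

print_r($words);  // model, x, shipped, in
print_r($tokens); // model, x200, shipped, in, 2007
```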
jtey
In the previous note, the example will only extract words separated by exactly one space. To properly extract words from all strings, use regular expressions.
Example (extracting the first 4 words):
<?php
$string = "One    two three       four  five six";
echo implode(" ", array_slice(preg_split("/\s+/", $string), 0, 4));
?>
The above $string would not have otherwise worked when using the explode() method below.
lwright
If you are looking to count the frequency of words, try:
<?php
$wordfrequency = array_count_values(str_word_count($string, 1));
?>
andrea
If the string doesn't contain a space " ", the explode method doesn't do anything, so I wrote this, and it seems to work better. I don't know about its time and resource usage.
<?php
function str_incounter($match, $string)
{
    $count_match = 0;
    for ($i = 0; $i < strlen($string); $i++) {
        // case-insensitive comparison at every offset
        if (strtolower(substr($string, $i, strlen($match))) == strtolower($match)) {
            $count_match++;
        }
    }
    return $count_match;
}
?>
Example:
<?php
$string = "something:something!!something";
$count_some = str_incounter("something", $string); // will return 3
?>
olivier
I will not discuss the accuracy of this function, but one of the source codes above does this:
<?php
function wrdcnt($haystack) {
    $cnt = explode(" ", $haystack);
    return count($cnt) - 1;
}
?>
That could be replaced by:
<?php
function wrdcnt($haystack) {
    // explode() returns one more piece than there are spaces
    return substr_count($haystack, ' ');
}
?>
I doubt this needs to be a function :)
josh
I was interested in a function which returned the first few words out of a larger string. In reality, I wanted a preview of the first hundred words of a blog entry which was well over that. I found that all of the other functions, which explode strings to arrays and implode them back, lost key markup such as line breaks. So, this is what I came up with:
function WordTruncate($input, $numWords)
{
    if (str_word_count($input, 0) > $numWords) {
        $WordKey = str_word_count($input, 1);
        $WordIndex = array_flip(str_word_count($input, 2));
        return substr($input, 0, $WordIndex[$WordKey[$numWords]]);
    } else {
        return $input;
    }
}
While I haven't counted per se, it's accurate enough for my needs. It will also return the entire string if it's less than the specified number of words.
The idea behind it? Use str_word_count to identify the nth word, then use str_word_count to identify the position of that word within the string, then use substr to extract up to that position.
Josh.
gorgonzola
I tried to write a word counter and ended up with this:
<?php
//strip html-codes or entities
$text = strip_tags(strtr($text, array_flip(get_html_translation_table(HTML_ENTITIES))));
//count the words
$wordcount = preg_match_all("#(\w+)#", $text, $match_dummy);
?>
joshua dot blake
I needed a function which would extract the first hundred words out of a given input while retaining all markup such as line breaks, double spaces and the like. Most of the regexp based functions posted above were accurate in that they counted out a hundred words, but recombined the paragraph by imploding an array down to a string. This did away with any such hopes of line breaks, and thus I devised a crude but very accurate function which does all that I ask it to:
function Truncate($input, $numWords)
{
    if (str_word_count($input, 0) > $numWords) {
        $WordKey = str_word_count($input, 1);
        $PosKey = str_word_count($input, 2);
        reset($PosKey);
        // replace each word with its character position in $input
        foreach ($WordKey as $key => &$value) {
            $value = key($PosKey);
            next($PosKey);
        }
        return substr($input, 0, $WordKey[$numWords]);
    } else {
        return $input;
    }
}
The idea behind it? Go through the keys of the arrays returned by str_word_count and associate the number of each word with its character position in the phrase. Then use substr to return everything up until the nth character. I have tested this function on rather large entries and it seems to be efficient enough that it does not bog down at all.
Cheers!
Josh
aurelien marchand
I found a more reliable way to print, say, the first 100 words and then print ellipses. My code goes this way:
$threshold_length = 80; // 80 words max
$phrase = "...."; // populate this with the text you want to display
$abody = str_word_count($phrase, 2);
if (count($abody) > $threshold_length) { // gotta cut
    $tbody = array_keys($abody);
    echo " " . substr($phrase, 0, $tbody[$threshold_length]) . "... <span class=\"more\"><a href=\"?\">read more</a></span> \n";
} else { // put the whole thing
    echo " " . $phrase . "\n";
}
For any questions, com.iname@artaxerxes2
geertdd
Here's a very fast word limiter function that preserves the original whitespace.
<?php
function word_limiter($str, $limit = 100, $end_char = '…')
{
    if (trim($str) == '')
        return $str;
    preg_match('/\s*(?:\S*\s*){'. (int) $limit .'}/', $str, $matches);
    if (strlen($matches[0]) == strlen($str))
        $end_char = '';
    return rtrim($matches[0]) . $end_char;
}
?>
For the thought process behind this function, please read: http://codeigniter.com/forums/viewthread/51788/
Geert De Deckere
madcoder
Here's a function that will trim a $string down to a certain number of words and add "..." on the end of it (an expansion of muz1's first-100-words code).
----------------------------------------------
function trim_text($text, $count)
{
    $text = str_replace("  ", " ", $text);
    $string = explode(" ", $text);
    // never read past the last word
    $count = min($count, count($string));
    $trimed = "";
    for ($wordCounter = 0; $wordCounter < $count; $wordCounter++) {
        $trimed .= $string[$wordCounter] . " ";
    }
    return trim($trimed) . "...";
}
Usage
------------------------------------------------
$string = "one two three four";
echo trim_text($string, 3);
returns: one two three...
rcatinterfacesdotfr
Here is another way to count words:
$word_count = count(preg_split('/\W+/', $text, -1, PREG_SPLIT_NO_EMPTY));
30-jan-2007 04:15
Here is a PHP word-counting function together with a JavaScript version which will print the same result.
<?php
//PHP word counting function
function word_count($theString)
{
    $char_count = strlen($theString);
    $fullStr = $theString . " ";
    // strip leading non-alphanumeric characters
    $initial_whitespace_rExp = "/^[^[:alnum:]]+/";
    $left_trimmedStr = preg_replace($initial_whitespace_rExp, "", $fullStr);
    // turn every remaining run of non-alphanumerics into one space
    $non_alphanumerics_rExp = "/[^[:alnum:]]+/";
    $cleanedStr = preg_replace($non_alphanumerics_rExp, " ", $left_trimmedStr);
    $splitString = explode(" ", $cleanedStr);
    $word_count = count($splitString) - 1;
    if (strlen($fullStr) < 2) {
        $word_count = 0;
    }
    return $word_count;
}
?>
//JavaScript function to count words in a phrase
function wordCount(theString)
{
    var char_count = theString.length;
    var fullStr = theString + " ";
    var initial_whitespace_rExp = /^[^A-Za-z0-9]+/gi;
    var left_trimmedStr = fullStr.replace(initial_whitespace_rExp, "");
    var non_alphanumerics_rExp = /[^A-Za-z0-9]+/gi;
    var cleanedStr = left_trimmedStr.replace(non_alphanumerics_rExp, " ");
    var splitString = cleanedStr.split(" ");
    var word_count = splitString.length - 1;
    if (fullStr.length < 2) {
        word_count = 0;
    }
    return word_count;
}
tim
As used above:
"/[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-\-|:|\&|@)]+/";
When using this pattern for counting words, does anyone else have a problem when someone puts quotes anywhere in the body? For me, it cuts off the rest of the data in the field and just puts the pre-quote info into the DB.
cathy
A cute little function for truncating text to a given word limit:
<?php
function limit_text($text, $limit)
{
    // compare the word count, not the character count, against the limit
    if (str_word_count($text, 0) > $limit) {
        $words = str_word_count($text, 2);
        $pos = array_keys($words);
        $text = substr($text, 0, $pos[$limit]) . '...';
    }
    return $text;
}
?>
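The position-based truncation idea that appears in several notes on this page can be sketched as follows. The helper name first_words() is hypothetical and the sample strings are invented. It relies only on str_word_count() with format 2, which returns an array keyed by each word's offset in the string.

```php
<?php
// Hypothetical helper name; not part of PHP or of any note on this page.
function first_words($text, $limit)
{
    // Format 2 returns an array keyed by each word's byte offset in $text.
    $offsets = array_keys(str_word_count($text, 2));

    // Nothing to cut when the text has no more than $limit words.
    if (count($offsets) <= $limit) {
        return $text;
    }

    // Cut just before word number $limit + 1, then drop trailing whitespace.
    return rtrim(substr($text, 0, $offsets[$limit])) . '...';
}

echo first_words("one two\nthree four five", 3), "\n"; // keeps the line break: "one two\nthree..."
echo first_words("short text", 3), "\n";               // unchanged: "short text"
```

Because the cut is made on the original string, line breaks and repeated spaces before the cut point survive, which the explode/implode variants cannot guarantee.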