strcmp
Binary safe string comparison
(PHP 4, PHP 5)
Code Examples / Notes » strcmp
timpie2000
[Ed Note: See strcasecmp(). --irc-html@php.net] You can make this function case-insensitive by using strcmp(strtolower($a), strtolower($b));
element @ no spam dot net
When using strcmp() to compare results received from a form, keep in mind that the way you quote the value attributes in the form will have an effect on your strcmp() results. Example:
<input type="text" name="user[0]" value="abc">
<input type="text" name="user[1]" value='abc'>
strcmp() will not return 0 for the values sent from this form. However, if BOTH values are enclosed in the same kind of quotes (both single or both double), strcmp() will return 0.
jesse
Well, I am using PHP 4.0, and both strcmp and strcasecmp appear to be giving me very arbitrary and incomprehensible results. When I input strings, it appears that "equal" strings return "1", as do some unequal strings; if the first argument is "smaller" I *tend* to get negative numbers, but sometimes I get 1; and if it is larger I *tend* to get numbers larger than 1. Both strcmp and strcasecmp are thus totally unusable for me, whether to alphabetize strings or to compare them for inequality.
nobody@nogroup
To Monsieur Egmont and 'php or die': use setlocale and strcoll to compare strings in languages such as German and Swedish that use accented characters, and strings in non-Latin scripts.
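A minimal sketch of that suggestion, assuming a German locale is installed. The locale names and the sample words are assumptions for illustration; the names that setlocale() accepts vary by system, so no expected output is shown.

```php
<?php
// The locale names below are assumptions; setlocale() tries each in turn
// and returns false if none of them is available on this system.
$locale = setlocale(LC_COLLATE, 'de_DE.UTF-8', 'de_DE', 'German');

if ($locale === false) {
    echo "No German locale found; strcoll() will use the current locale.\n";
}

// Hypothetical sample words.
$words = array("Zebra", "Äpfel", "Apfel");

// strcmp() orders by byte value; strcoll() follows the LC_COLLATE rules.
usort($words, 'strcoll');
print_r($words);
```

Checking the return value of setlocale() matters here: if the locale is missing, strcoll() silently falls back to whatever locale is active.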
jeff
The definition of the return values of this function is listed correctly on this page; however, there is a common misconception in the notes posted here previously by users. A previous poster said:
If $str1 == $str2 strcmp return 0.
If $str1 > $str2 strcmp return 1.
If $str1 < $str2 strcmp return -1.
That is incorrect; please look at the definition of the function at the top of this page. It returns less than 0 if str1 is less than str2. Note the phrase "less than": it does not return just -1, but any negative value. The same applies when str1 is greater than str2: it returns a positive, non-zero value, which can be 1 or any larger number. The number strcmp() returns is the difference between the two strings at the first character where they differ. Here is an example:
$output = strcmp("red", "blue");
The variable $output will contain a value of 16 ('r' is 114 and 'b' is 98).
jcanals
Some notes about the Spanish locale. I've read some notes that say "CH", "RR" or "LL" must be considered a single letter in Spanish. That's not really true. "CH", "RR" and "LL" were considered single letters in the past (many years ago); for that you must use the "Traditional Sort". Nowadays, the Academy uses the Modern Sort and recommends no longer treating "CH", "RR" and "LL" as single letters. They must be considered two separate letters, and sorted and compared that way. You just have to take a look at the Official Spanish Language Dictionary: for many years there has been no separate section for "CH", "LL" or "RR", i.e. words starting with CH come after the ones starting with CG, and before the ones starting with CI.
robertb
S dot Radovanovic suggested: "So my conclusion is that when comparing strings, you'd better not use == (use strcmp or === instead). For integer comparisons, == can be useful, since our values will always be cast to an integer (1 == "1" returns true)."
When I tried this, I ran into a problem with parameters passed in the URL. I forgot that a parameter that is not supplied wouldn't be a string, so this code -- which I expected to default to the current year -- didn't work:
<?php
// Did we get a parameter?
if ($year === '') {
    // Default to current year
    $temp = getdate();
    $year = $temp['year'];
}
$fname = "magic{$year}.txt";
if (file_exists($fname) == FALSE) {
    echo "<h2>Data file doesn't exist</h2>";
    exit;
}
?>
Since $year was uninitialized, it didn't exactly equal '', and the file wasn't found. I went back to == in this case. However, I did heed your advice when looking for a particular value ($debug === 'Y'), instead of looking for the lack of a value. Thanks!
admin nospam
Regarding the above note on language-specific string comparisons, LL and RR are also single letters in the Spanish language.
owen
Regarding bizarre return values from str*cmp(), I was having similar troubles until I realized that I was attempting to compare a string containing HTML formatting with its plain-text equivalent. The formatted string was an <OPTION> value, so the HTML was rendered without the <b> and <i> formatting I was using. Consequently the formatted and unformatted strings looked identical in the browser. D'oh!
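One way to guard against that situation, sketched under the assumption that the markup is not meant to take part in the comparison, is to run the formatted value through strip_tags() first. The two sample values are made up for illustration.

```php
<?php
// Hypothetical values: the same text with and without markup.
$formatted = "<b>Option</b> <i>one</i>";
$plain     = "Option one";

// The raw strings differ byte for byte, even if a browser shows them alike.
var_dump(strcmp($formatted, $plain) === 0);             // bool(false)

// After removing the tags, the two strings compare equal.
var_dump(strcmp(strip_tags($formatted), $plain) === 0); // bool(true)
```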
x123
Please say what you mean by "alphabetically", and when you say "because of the way it works, it is not very useful..." or "I get strange results, 1 for strings that are equal, etc.", give an example where it does not work and/or show where your script works better! (To test some PHP code, it is sufficient to have a file of 3 lines:
<form action='' method=POST><input type=submit value=Evaluate>
<textarea name=c><?=$c=stripslashes($_POST['c'])?></textarea>
Result of the above:<pre><?=eval($c)?></pre></form>)
Imagine how much time is lost by visitors trying to figure out what your program does differently from strcmp. Imagine how many resources are wasted if PHP users are made to think that PHP functions don't work well and use the EXTREMELY inefficient routines proposed below (e.g. a string comparison routine that uses substr(...,$i,1) to access individual characters...!) If the documentation says 'binary safe', this should mean that strings are compared byte by byte (according to internal format); if strcmp() instead uses locale collation tables, this should be clearly mentioned in the documentation.
m. egmond
php dot or dot die at phpuser dot net wrote that he got an unexpected difference between case-sensitive and case-insensitive comparison. The key there is that the case-insensitive comparison converts both strings to lowercase before comparing. Since the underscore character sits in a different place relative to uppercase and lowercase letters, the result is different. There is no 'clear' order of punctuation and other characters in or around the alphabet. Most code assumes ASCII order, in which case there are several characters before both upper- and lowercase, a few in between, and some after both. Note also that many other/older sorting implementations sort accented characters wrongly, since they appear after all other alphabetical characters in most character sets. There is probably a function in PHP to take this into account, though. Therefore I would not recommend making detailed assumptions about how punctuation and other characters sort in relation to alphabetical characters. If sorting these characters at a specific place and in a specific order is important to you, you should probably write a custom string comparison function that does it the way you want. Usually it's sufficient to have a consistent sorting order, though, which is what you get by using either strcmp or strcasecmp consistently.
s dot radovanovic
One thing to note in comparison with ==: when we make a comparison with ==, PHP automatically converts strings to integers when either side of the comparison is an integer, e.g.:
<?php
$value = 0;
if ($value == "submit") {
    echo "Let's submit";
}
?>
The above would succeed, since "submit" is converted to an integer (equal to 0) and the comparison returns true (that's why (1 == "1submit") would also return true). That's why we should use strcmp or === (which also checks the type) for string comparisons. So my conclusion is that when comparing strings, you'd better not use == (use strcmp or === instead). For integer comparisons, == can be useful, since our values will always be cast to an integer (1 == "1" returns true).
pabloatnkstudiosdotnet
Just note that the documentation about the function returns is a little confused. So...
If $str1 == $str2 strcmp return 0.
If $str1 > $str2 strcmp return 1.
If $str1 < $str2 strcmp return -1.
Pablo Rosciani http://pablo.rosciani.com.ar
madsen
It's definitely worth noting that the return value of strcmp(), when used for e.g. password checking, is the opposite of that of the == operator. I.e.:
$pw1 = "yeah";
$pw2 = "yeah";
if (strcmp($pw1, $pw2)) {
    // Not reached here: strcmp() returns 0 (false) for equal strings.
    // This branch means $pw1 and $pw2 are NOT the same.
} else {
    // $pw1 and $pw2 are the same.
}
Whereas the == operator would give us:
if ($pw1 == $pw2) {
    // Reached here: the comparison returns true.
    // $pw1 and $pw2 are the same.
} else {
    // $pw1 and $pw2 are NOT the same.
}
Additionally, to check if $pw1 and $pw2 are of the same type you can use the === operator.
27-aug-2002 02:45
In summary, strcmp() does not necessarily use the ASCII code order of each character as in the 'C' locale, but instead parses each string to match language-specific character entities (such as 'ch' in Spanish, or 'dz' in Czech), whose collation order is then compared. When both character entities have the same collation order (such as 'ss' and 'ß' in German), they are compared relative to their code by strcmp(), or considered equal by strcasecmp(). The LC_COLLATE locale setting is then considered: only if LC_COLLATE=C or LC_ALL=C does strcmp() compare strings by character code.
Generally, most locales define the following order: control, space, punctuation and underscore, digit, alpha (lower then upper with Latin scripts; or final, middle, then isolated, initial with Arabic script), symbols, others... With strcasecmp(), the alpha subclass is ignored and all forms of a letter are considered equal.
Note also that some locales behave differently with accented characters: some consider them the same letter as the unaccented letter (with a minor collation order, e.g. French, Italian, Spanish), some consider them distinct letters with an independent collation order (e.g. in the C locale, or in Nordic languages).
Finally, the collation string does not consider individual characters but instead groups of characters that form a single letter:
- for example "ch" or "CH" in Spanish, which always comes after all other strings beginning with 'c' or 'C', including "cz", but before 'd' or 'D';
- 'ss' and 'ß' in German;
- 'dz', 'DZ' and 'Dz' in some Central European languages written with the Latin script...
- the UTF-8, UTF-16 (Unicode), S-JIS, Big5 or ISO2022 character encoding of a locale (the suffix in the locale name) is used first to decode the characters into UCS4/ISO10646 code positions before the rules of the language indicated by the main locale are applied...
So be extremely careful about what you consider a "character", as it may just mean an encoding byte with no significance in the string collation algorithm: the first character of the string "cholera" in Spanish is "ch", not "c"!
gregd
In cases where you need to compare a line just read from a file stream against a user-defined "nametag" (useful for parsing ini and configuration files), keep the 'end of line' characters in mind as well:
// nametag to look for in a file (notice the required "\r\n" at the end)
$nametag = "[System]\r\n";
// ...assuming the file has already been opened for reading and the stream is bound to $filehandle...
// parse the file until EOF or $nametag is encountered
while (!feof($filehandle)) {
    $buffer = fgets($filehandle);
    if (strcmp($nametag, $buffer) == 0) {
        // at this point "[System]" is found, do additional parsing...
        break;
    }
}
francis
If you want to compare strings according to locale, use strcoll instead.
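For contrast with strcoll(), here is a small sketch of how strcmp() orders plain ASCII text: by byte value, so uppercase letters, the underscore and lowercase letters fall into separate ranges. The sample strings are made up for illustration, and only the sign of each result is checked.

```php
<?php
// Byte values of three ASCII characters.
echo ord('Z'), " ", ord('_'), " ", ord('a'), "\n"; // 90 95 97

// Hypothetical sample strings.
var_dump(strcmp("Zebra", "apple") < 0); // bool(true): 'Z' (90) < 'a' (97)
var_dump(strcmp("A_B", "AaB") < 0);     // bool(true): '_' (95) < 'a' (97)
var_dump(strcmp("A_B", "AZB") > 0);     // bool(true): '_' (95) > 'Z' (90)
```

Because the underscore sits between the uppercase and lowercase ranges, lowercasing both strings before comparing can flip the order of a string containing an underscore.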
php dot or dot die
I've a strange problem. I tried to compare strings in the same order Oracle does. The strange thing is:
strcmp('ACQUE_SOTTERRANEE','ACQUE_SOTT_ORAC') //result is -1
strcasecmp('ACQUE_SOTTERRANEE','ACQUE_SOTT_ORAC') //result is 6
I'm surprised by the opposite results! The correct order (Oracle ascending order) is:
1- ACQUE_SOTTERRANEE
2- ACQUE_SOTT_ORAC
izhan dot khalib
I have tried the strcmp function. Please be very careful: the strings being compared must be exactly equal. Many people get confused by this. For example, my program reads a string from the file test.txt to get the "[company name]" string.
<?php
// get contents of a file into a string
$filename = "test.txt";
$fd = fopen($filename, "rb");
$contents = fread($fd, filesize($filename));
for ($i = 0; $i < strpos($contents, "]") + 1; $i++) {
    //print $contents[$i];
    //$a=trim($contents[$i]);
    $a = $contents[$i];
    echo $a;
    //echo $i;
}
$str2 = "[companyname]";
// this comparison results in "greater" ($result = 1),
// because $a holds a single character while $str2 is the whole string
// (please remember $tempvariable[2] != $tempvariable)
$result = strcmp(strtolower($a), strtolower($str2));
// this comparison works properly, $result = 0
//$result = strcmp(strtolower($a), strtolower($str2[12]));
echo $result;
if ($a == $str2[12]) // double-check the equality of the strings
{
    echo "read the NextLine";
} else {
    echo "not equal";
}
//end
fclose($fd);
?>
I hope the above example will help you.
frewuill
Hey, be sure the string you are comparing has no special characters like '\n' or something like that.
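A quick way to check for such characters, sketched with a made-up value standing in for a line read from a file: compare the lengths and dump the bytes with bin2hex(), since a trailing newline is invisible when echoed.

```php
<?php
// Hypothetical values: $fromFile imitates a line as fgets() might return it.
$fromFile = "abc\n";
$expected = "abc";

var_dump(strcmp($fromFile, $expected) === 0);            // bool(false)

// The lengths differ, which hints at an invisible character.
echo strlen($fromFile), " vs ", strlen($expected), "\n"; // 4 vs 3

// The hex dump makes the trailing newline (0a) visible.
echo bin2hex($fromFile), "\n";                           // 6162630a
```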
phpnotes
Here is a quick note to explain what is meant by > and <. I wrote a script to compare results. If a string is closer to 'A' in the alphabet, it is < the other string. Here is the output; I hope this saves people lots of time:
acc = acc | ac < acc
accc > acc | acd > acc
acb < acc | acb < accc
acb < accd | bcc > acc
bcc > acc | bcca > acc
bcc > acca | 1bcc < acca
bcc < bcc1 | bcc1 > bcc
1bcc < bcc | 1bcc < 1bcd
1bcd > 1bcc | _bcc < bcc
bcc > _bcc | ;bcc < bcc
bcc > ;bcc | bcc < bcc;
bcc; = bcc;
ahmed
For some reason the strcmp fails for this function if $item_to_compare->ID = "AB123" or some similar string. This happened even though $ID = "AB123":
<pre>
function item_exists($ID) { // returns 0 for error
    global $item_list; // this is an array of class objects
    if (is_array($item_list)) {
        foreach ($item_list as $key => $item_to_compare) {
            if (!strcmp($item_to_compare->ID, $ID)) {
                unset ($item_list);
                return $key;
            }
        }
    }
    // Else fail
    unset ($item_list);
    return 0;
}
</pre>
So I was forced to do this:
<pre>
function item_exists($ID) { // returns 0 for error
    global $item_list;
    if (is_array($item_list)) {
        for ($i = 0; $i < count($item_list); $i++) {
            if (!strcmp($item_list[$i]->ID, $ID)) {
                unset ($item_list);
                return $i;
            }
        }
    }
    // Else fail
    unset ($item_list);
    return 0;
}
</pre>
mnunemacher
Because of the way this function works, it's not very useful for ordering strings alphabetically. If you're trying to alphabetize a list of strings (like a dictionary does), you may want to use the following functions instead:
-----------------
// Returns 1 if $str1 comes after $str2 alphabetically
// Returns -1 if $str1 comes before $str2 alphabetically
// Returns 0 if $str1 and $str2 are the same
function orderAlpha ( $str1, $str2 ) {
    $limit = null;
    if ( strlen( $str1 ) > strlen( $str2 ) ) {
        $limit = strlen( $str2 );
    } else {
        $limit = strlen( $str1 );
    }
    for ( $i = 0; $i < $limit; $i++ ) {
        if ( substr( $str1, $i, 1 ) > substr( $str2, $i, 1 ) ) {
            return 1;
        } else if ( substr( $str1, $i, 1 ) < substr( $str2, $i, 1 ) ) {
            return -1;
        }
    }
    if ( strlen( $str1 ) > strlen( $str2 ) ) {
        return 1;
    } else if ( strlen( $str1 ) < strlen( $str2 ) ) {
        return -1;
    }
    return 0;
}
// Case insensitive version of orderAlpha
function orderiAlpha ( $str1, $str2 ) {
    return orderAlpha( strtolower( $str1 ), strtolower( $str2 ) );
}
-----------------
anonymous
As mentioned above...be careful of trailing whitespace when making string comparisons...to be sure that you are rid of it all, use the trim() function.
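A minimal sketch of that advice, using a made-up input value: trim() strips leading and trailing whitespace (spaces, tabs, carriage returns, newlines) before the comparison.

```php
<?php
// Hypothetical input with surrounding whitespace and a Windows line ending.
$input = "  abc \r\n";

var_dump(strcmp($input, "abc") === 0);       // bool(false)
var_dump(strcmp(trim($input), "abc") === 0); // bool(true)
```

If only the line ending should be ignored and leading spaces are significant, rtrim() may be the better fit.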