PHP has two operators and six functions for comparing strings to each other.
You can compare two strings for equality with the == and === operators. These operators differ in how they deal with non-string operands. The == operator converts operands of different types to a common type before comparing them, so it reports that 3 and "3" are equal. The === operator does not convert, and returns false if the types of the arguments differ.
$o1 = 3;
$o2 = "3";
if ($o1 == $o2) {
    echo("== returns true<br>");
}
if ($o1 === $o2) {
    echo("=== returns true<br>");
}

== returns true
The comparison operators (<, <=, >, >=) also work on strings:
$him = "Fred";
$her = "Wilma";
if ($him < $her) {
    print "$him comes before $her in the alphabet.\n";
}

Fred comes before Wilma in the alphabet.
However, the comparison operators give unexpected results when comparing strings and numbers:
$string = "PHP Rocks";
$number = 5;
if ($string < $number) {
    echo("$string < $number");
}

PHP Rocks < 5
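When one argument to a comparison operator is a number, the other argument is cast to a number. This means that "PHP Rocks" is cast to a number, giving 0 (since the string does not start with a number). Because 0 is less than 5, PHP prints "PHP Rocks < 5". This is the behavior of PHP versions before 8.0. From PHP 8.0 on, a non-numeric string is no longer cast to a number; the number is compared in its string form instead, so the test above is false and nothing is printed.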
To explicitly compare two strings as strings, casting numbers to strings if necessary, use the strcmp() function:
$relationship = strcmp(string_1, string_2);
The function returns a number less than 0 if string_1 sorts before string_2, greater than 0 if string_2 sorts before string_1, or 0 if they are the same:
$n = strcmp("PHP Rocks", 5);
echo($n);

1
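As a further illustration of what "comparing as strings" means, the sketch below contrasts the < operator with strcmp() on two numeric strings. The variable names and values are made up for this example.

```php
<?php
// Hypothetical values chosen for illustration.
$a = "10";
$b = "9";

// Both operands are numeric strings, so < compares them as numbers:
// 10 < 9 is false.
$asNumbers = ($a < $b);

// strcmp() compares character by character: "1" sorts before "9",
// so the result is negative.
$asStrings = (strcmp($a, $b) < 0);

var_dump($asNumbers);  // bool(false)
var_dump($asStrings);  // bool(true)
```

When the order must not depend on whether the strings happen to look like numbers, strcmp() is the safer choice.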
A variation on strcmp() is strcasecmp(), which converts strings to lowercase before comparing them. Its arguments and return values are the same as those for strcmp():
$n = strcasecmp("Fred", "frED"); // $n is 0
Another variation on string comparison is to compare only the first few characters of the string. The strncmp() and strncasecmp() functions take an additional argument, the initial number of characters to use for the comparisons:
$relationship = strncmp(string_1, string_2, len);
$relationship = strncasecmp(string_1, string_2, len);
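A minimal sketch of the prefix comparisons follows. The strings and the length of 3 are arbitrary choices for illustration, not values from the original text.

```php
<?php
// Hypothetical sample strings.
$a = "Programming";
$b = "Project";

// Whole-string comparison: "Prog..." sorts before "Proj...", so this is negative.
$full = strcmp($a, $b);

// Only the first three characters ("Pro" and "Pro") are compared, so this is 0.
$prefix = strncmp($a, $b, 3);

// Same idea, ignoring case: "PRO" and "pro" compare equal.
$prefixNoCase = strncasecmp("PROgram", "project", 3);

var_dump($full < 0);           // bool(true)
var_dump($prefix === 0);       // bool(true)
var_dump($prefixNoCase === 0); // bool(true)
```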
The final variation on these functions is natural-order comparison with strnatcmp() and strnatcasecmp(), which take the same arguments as strcmp() and return the same kinds of values. Natural-order comparison identifies numeric portions of the strings being compared and sorts the string parts separately from the numeric parts.
Table 4-5 shows strings in natural order and ASCII order.
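The difference between the two orders can be sketched by sorting the same list with strcmp() and with strnatcmp() as the comparison callback. The file names below are hypothetical and are not the contents of Table 4-5.

```php
<?php
// Hypothetical file names, used only for illustration.
$files = ["pic10.jpg", "pic2.jpg", "pic1.jpg", "pic20.jpg"];

// ASCII order: strings are compared character by character.
$ascii = $files;
usort($ascii, 'strcmp');

// Natural order: runs of digits are compared as numbers.
$natural = $files;
usort($natural, 'strnatcmp');

echo implode(", ", $ascii), "\n";    // pic1.jpg, pic10.jpg, pic2.jpg, pic20.jpg
echo implode(", ", $natural), "\n";  // pic1.jpg, pic2.jpg, pic10.jpg, pic20.jpg
```

In ASCII order "pic10.jpg" sorts before "pic2.jpg" because "1" precedes "2". In natural order the numbers 2 and 10 are compared instead.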
PHP provides several functions that let you test whether two strings are approximately equal: soundex(), metaphone(), similar_text(), and levenshtein().
$soundex_code = soundex($string);
$metaphone_code = metaphone($string);
$in_common = similar_text($string_1, $string_2 [, $percentage ]);
$similarity = levenshtein($string_1, $string_2);
$similarity = levenshtein($string_1, $string_2 [, $cost_ins, $cost_rep, $cost_del ]);
The Soundex and Metaphone algorithms each yield a string that represents roughly how a word is pronounced in English. To see whether two strings are approximately equal with these algorithms, compare their pronunciations. You can compare Soundex values only to Soundex values and Metaphone values only to Metaphone values. The Metaphone algorithm is generally more accurate, as the following example demonstrates:
$known = "Fred";
$query = "Phred";
if (soundex($known) == soundex($query)) {
    print "soundex: $known sounds like $query<br>";
} else {
    print "soundex: $known doesn't sound like $query<br>";
}
if (metaphone($known) == metaphone($query)) {
    print "metaphone: $known sounds like $query<br>";
} else {
    print "metaphone: $known doesn't sound like $query<br>";
}

soundex: Fred doesn't sound like Phred
metaphone: Fred sounds like Phred
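To see why the two algorithms disagree here, it can help to look at the codes themselves. The following sketch computes and prints them for the same two names; the variable names are made up for this example.

```php
<?php
$known = "Fred";
$query = "Phred";

// Soundex keeps the first letter of the word, so the two codes
// start with different letters and cannot match.
$soundexKnown = soundex($known);   // "F630"
$soundexQuery = soundex($query);   // "P630"

// Metaphone encodes sounds rather than letters, so "F" and "Ph"
// end up with the same code.
$metaphoneKnown = metaphone($known);
$metaphoneQuery = metaphone($query);

echo "$soundexKnown vs $soundexQuery\n";
echo "$metaphoneKnown vs $metaphoneQuery\n";
```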
The similar_text() function returns the number of characters that its two string arguments have in common. The third argument, if present, is a variable in which to store the commonality as a percentage:
$string_1 = "Rasmus Lerdorf";
$string_2 = "Razmus Lehrdorf";
$common = similar_text($string_1, $string_2, $percent);
printf("They have %d chars in common (%.2f%%).", $common, $percent);

They have 13 chars in common (89.66%).
The Levenshtein algorithm calculates the similarity of two strings based on how many characters you must add, substitute, or remove to make them the same. For instance, "cat" and "cot" have a Levenshtein distance of 1, because you need to change only one character (the "a" to an "o") to make them the same:
$similarity = levenshtein("cat", "cot"); // $similarity is 1
This measure of similarity is generally quicker to calculate than that used by the similar_text() function. Optionally, you can pass three values to the levenshtein() function to individually weight insertions, replacements, and deletions (in that order), for instance, to compare a word against a contraction.
This example excessively weights insertions when comparing a string against its possible contraction, because contractions should never insert characters:
echo levenshtein('would not', 'wouldn\'t', 500, 1, 1);
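The effect of the weights depends on the direction of the comparison. The sketch below reuses the two strings from the example above, with hypothetical variable names, and compares the default costs against the weighted call in both directions.

```php
<?php
$full        = 'would not';
$contraction = "wouldn't";

// Default costs: each insertion, replacement, and deletion costs 1.
// One deletion (the space) and one replacement ("o" to "'") are enough.
$default = levenshtein($full, $contraction);

// Cost arguments are given in the order insertion, replacement, deletion.
// Turning the full form into the contraction needs no insertions,
// so the heavy insertion cost does not change the result.
$weighted = levenshtein($full, $contraction, 500, 1, 1);

// In the opposite direction a character has to be inserted,
// so the insertion cost dominates the result.
$reversed = levenshtein($contraction, $full, 500, 1, 1);

echo $default, "\n";   // 2
echo $weighted, "\n";  // 2
echo $reversed, "\n";  // 501
```

A small result from the weighted call therefore suggests that the second string could be a contraction of the first, while a result of 500 or more means at least one character had to be inserted.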