|
preg_replace
Perform a regular expression search and replace
(PHP 4, PHP 5)
Example 1726. Using backreferences followed by numeric literals
[code omitted in source]
The above example will output:
April1,2003

Example 1727. Using indexed arrays with preg_replace()
[code omitted in source]
The above example will output:
The bear black slow jumped over the lazy dog.

By ksorting patterns and replacements, we should get what we wanted.
[code omitted in source]
The above example will output:
The slow black bear jumped over the lazy dog.

Example 1728. Replacing several values
[code omitted in source]
The above example will output:
$startDate = 5/27/1999

Example 1729. Using the 'e' modifier
[code omitted in source]
This would capitalize all HTML tags in the input text.

Example 1730. Strip whitespace
This example strips excess whitespace from a string.
[code omitted in source]

Example 1731. Using the count parameter
[code omitted in source]
The above example will output:
xp***to

Code Examples / Notes » preg_replace
as
What about strtr:
<?php
$DNA = "AGTCTGCCCTAG";
echo "$DNA\n";
$DNA = strtr($DNA, "AGCT", "TCGA");
echo "$DNA\n";
?>
Andreas
alexey lebedev
Wasted several hours because of this:

$str = 'It&#039;s a string with HTML entities';
preg_replace('~&#(\d+);~e', 'code2utf($1)', $str);

This code should convert numeric HTML entities to UTF-8, and it does, with one exception: it mishandles codes that start with &#0. The reason is that code2utf is called with the leading zero, exactly as the pattern matched it: code2utf(039). And it does matter, because PHP treats 039 as an octal number. Try print(011);

Solution:

preg_replace('~&#0*(\d+);~e', 'code2utf($1)', $str);
anonymous
To djurredenboer: It's because PHP stores array elements in the order of their creation, not in key order. Use ksort() on both arrays to put them in key order before calling preg_replace().
istvan dot csiszar
This is an addition to the previously sent removeEvilTags function. If you don't want to remove the style tag entirely, only certain style attributes within it, then you might find this piece of code useful:
<?php
function removeEvilStyles($tagSource)
{
    // this will leave everything else, but:
    $evilStyles = array('font', 'font-family', 'font-face', 'font-size',
                        'font-size-adjust', 'font-stretch', 'font-variant');
    $find = array();
    $replace = array();
    foreach ($evilStyles as $v) {
        $find[] = "/$v:.*?;/";
        $replace[] = '';
    }
    return preg_replace($find, $replace, $tagSource);
}

function removeEvilTags($source)
{
    $allowedTags = '<h1><h2><h3><h4><h5><a><img><label>'.
        ' <span><sup><sub><ul><li><ol>'.
        '<table><tr><td><th><tbody><div><hr><em><b><i>';
    $source = strip_tags(stripslashes($source), $allowedTags);
    return trim(preg_replace('/<(.*?)>/ie', "'<'.removeEvilStyles('\\1').'>'", $source));
}
?>
tim
This function has a little quirk. When you use backreferences inside the pattern itself, you MUST write \\n (n being the group number), and not $n. $n doesn't work there.
flar
Special care needs to be taken, since quantifiers in regular expressions are "greedy" by default. What does that mean? Example:

Text: 'the big scary house here and the big weird house over there'
Expression: '/big(.*)house/'
Matches: 'big scary house here and the big weird house'

In other words, the expression matches everything between the first occurrence of the word "big" and the last occurrence of the word "house". If you need to preserve the words 'here and the', you need a non-greedy expression. To get a non-greedy regular expression in PHP, add the character "U" after the final slash. Our updated example then looks like this:

Expression: '/big(.*)house/U'
Match 1: 'big scary house'
Match 2: 'big weird house'

Note that with the non-greedy expression we now get two matches, and the words between the matches are preserved (not replaced).
ae
Something innovative for a change ;-) For a news system, I have a special format for links:

"Go to the [Blender3D Homepage|http://www.blender3d.org] for more Details"

To turn this into a link, use:

$new = preg_replace('/\[(.*?)\|(.*?)\]/', '<a href="$2" target="_blank">$1</a>', $new);
sg_01
Re: wcc at techmonkeys dot org
You could put this in one replace for faster execution as well:
<?php
/*
 * Removes all blank lines from a string.
 */
function removeEmptyLines($string)
{
    return preg_replace("/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/", "\n", $string);
}
?>
steven -a-t- acko dot net
People using the /e modifier with preg_replace should be aware of the following weird behaviour. It is not a bug per se, but it can cause bugs if you don't know it's there. The example in the docs for /e in fact suffers from this mistake.

With /e, the replacement string is a PHP expression. So when you use a backreference in the replacement expression, you need to put the backreference inside quotes, or otherwise it would be interpreted as PHP code. Like the example from the manual for preg_replace:

preg_replace("/(<\/?)(\w+)([^>]*>)/e", "'\\1'.strtoupper('\\2').'\\3'", $html_body);

To make this easier, the data in a backreference with /e is run through addslashes() before being inserted in your replacement expression. So if you have the string

He said: "You're here"

it would become

He said: \"You\'re here\"

and be inserted into the expression. However, if you put this inside a set of single quotes, PHP will not strip away all the slashes correctly! Try this:

print ' He said: \"You\'re here\" ';

Output:

He said: \"You're here\"

This is because the sequence \" inside single quotes is not recognized as anything special, and it is output literally. Using double quotes to surround the string/backreference will not help either, because inside double quotes the sequence \' is not recognized and is also output literally. And in fact, if you have any dollar signs in your data, they would be interpreted as PHP variables. So double quotes are not an option.

The 'solution' is to fix it manually in your expression. It is easiest to use a separate processing function and do the replacing there (i.e. use "my_processing_function('\\1')" or something similar as the replacement expression, and do the fixing in that function). If you surrounded your backreference with single quotes, it is the double quotes that are corrupted:

$text = str_replace('\"', '"', $text);

People using preg_replace with /e should at least be aware of this. I'm not sure how it would best be fixed in preg_replace. Because double quotes are a really bad idea anyway (due to the variable expansion), I would suggest that preg_replace's auto-escaping be modified to suit the placement of backreferences inside single quotes (which seemed to be the intention from the start, but was incorrectly applied).
santosh patnaik
Once a match is identified, the regular expression engine appears to set aside the matching segment of the target string. Matches cannot overlap, so whitespace consumed at the end of one match is no longer available as the leading whitespace of the next. A second segment that you expect to match may therefore end up not getting matched:

// Expect 'pa pa pa pa' but get 'pa ma pa ma'
echo preg_replace('`(^|\s)ma(\s|$)`', '$1pa$2', 'ma ma ma ma');

Here the issue can be solved by using a 'lookahead', which checks the following whitespace without consuming it:

// Expect and get 'pa pa pa pa'
echo preg_replace('`(^|\s)ma(?=\s|$)`', '$1pa', 'ma ma ma ma');
dani dot church
Note that it is in most cases much more efficient to use preg_replace_callback(), with a named function or an anonymous function created with create_function(), instead of the /e modifier. When preg_replace() is called with the /e modifier, the interpreter must parse the replacement string into PHP code once for every replacement made, while preg_replace_callback() uses a function that only needs to be parsed once.
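
As a rough illustration of the point above, here is the manual's "capitalize all HTML tags" idea written with preg_replace_callback() instead of /e. This is only a sketch: the input string is made up, and the anonymous function syntax assumes PHP 5.3 or later (on older versions a named function or create_function() would take its place). It is also worth knowing that the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so on current PHP the callback form is the only one available.

```php
<?php
// Hypothetical input; any HTML fragment would do.
$html = '<p>Hello <b class="x">world</b></p>';

// The callback receives the match array: $m[0] is the whole match,
// $m[1], $m[2] and $m[3] are the three captured groups.
$result = preg_replace_callback(
    '/(<\/?)(\w+)([^>]*>)/',
    function (array $m) {
        // Rebuild the tag with only the tag name upper-cased.
        return $m[1] . strtoupper($m[2]) . $m[3];
    },
    $html
);

echo $result, "\n"; // <P>Hello <B class="x">world</B></P>
```

The function body is compiled once, and each match is handed to it as ordinary data rather than being spliced into code.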
alexandre
Match and replace for arrays. Useful for parsing an entire $_POST.
<?php
function array_preg_match(array $patterns, array $subjects, &$errors = array())
{
    $errors = array();
    foreach ($patterns as $k => $v)
        preg_match($v, $subjects[$k]) or $errors[$k] = TRUE;
    return count($errors) == 0 ? TRUE : FALSE;
}

function array_preg_replace(array $patterns, array $replacements, array $subject)
{
    $r = array();
    foreach ($patterns as $k => $v)
        $r[$k] = preg_replace($v, $replacements[$k], $subject[$k]);
    return $r + $subject;
}
?>
jhm
It took me a while to figure this one out, but here is a nice way to use preg_replace to hex-encode a string and to convert the hex-encoded string back to clear text:
<?php
$text = "PHP rocks!";
$encoded = preg_replace("'(.)'e", "dechex(ord('\\1'))", $text);
print "ENCODED: $encoded\n";
?>
ENCODED: 50485020726f636b7321
<?php
print "DECODED: " . preg_replace("'([\S,\d]{2})'e", "chr(hexdec('\\1'))", $encoded) . "\n";
?>
DECODED: PHP rocks!
gabe
It is useful to note that the 'limit' parameter, when used with 'pattern' and 'replacement' arguments that are arrays, applies to each individual pattern in the patterns array, and not to the array as a whole.
<?php
$pattern = array('/one/', '/two/');
$replace = array('uno', 'dos');
$subject = "test one, one two, one two three";
echo preg_replace($pattern, $replace, $subject, 1);
?>
If limit were applied to the whole array (which it isn't), it would return:
test uno, one two, one two three
However, in reality this returns:
test uno, one dos, one two three
robert
If you want to strip a variety of timestamps, try this:
<?php
$var = "[09:21:32] Testing";
$var = preg_replace('/\W*\d+:\d+(?::\d+)?[a-zA-Z]*[\W\s]*/', '', $var);
echo $var;
?>
Hope this helps :)
pcabc
if (!function_exists('htmlspecialchars_decode')) {
    function htmlspecialchars_decode($str)
    {
        $str = preg_replace('/\\\"/', '"', $str);
        return strtr($str, array_flip(get_html_translation_table(HTML_SPECIALCHARS)));
    }
}
A small correction: this also changes the \" sequence back to ".
iasmin
I thought that someone could use this hyperlink function. preg_replace is about 6 times faster than ereg_replace. I took the original example from the ereg_replace function page and modified it so that it works well. I added a comment describing what each pattern matches. One thing to note is that I added a space at the beginning, so that only links that don't have <a href="" around them, or anything else touching them, will be replaced.
<?php
function hyperlink(&$text)
{
    // match protocol://address/path/file.extension?some=variable&another=asf%
    $text = preg_replace("/\s(([a-zA-Z]+:\/\/)([a-z][a-z0-9_\..-]*[a-z]{2,6})([a-zA-Z0-9\/*-?&%]*))\s/i", " <a href=\"$1\">$3</a> ", $text);
    // match www.something.domain/path/file.extension?some=variable&another=asf%
    $text = preg_replace("/\s(www\.([a-z][a-z0-9_\..-]*[a-z]{2,6})([a-zA-Z0-9\/*-?&%]*))\s/i", " <a href=\"http://$1\">$2</a> ", $text);
    return $text;
}
?>
Play around with it and see how it works.
Courtesy of AmazingDiscoveries.org
God bless,
Iasmin Balaj
lehongviet
I had a problem echoing text that contains double quotes into a text field, because the quotes confuse the value attribute. I use the function below to match each pair of double quotes and replace it with smart quotes. Any unpaired quote left over is replaced by a hyphen (-). It works for me.

function smart_quotes($text)
{
    $pattern = '/"((.)*?)"/i';
    $text = preg_replace($pattern, "“\\1”", stripslashes($text));
    $text = str_replace("\"", "-", $text);
    $text = addslashes($text);
    return $text;
}
djurredenboer
I found something strange:
<?php
$string = 'The quick brown fox jumped over the lazy dog.';
$patterns[0] = '/quick/';
$patterns[1] = '/brown/';
$patterns[2] = '/fox/';
$replacements[2] = 'bear';
$replacements[1] = 'black';
$replacements[0] = 'slow';
echo preg_replace($patterns, $replacements, $string);
?>
Output:
The bear black slow jumped over the lazy dog.

But when you swap the replacements like this:
$replacements[0] = 'slow';
$replacements[2] = 'bear';
$replacements[1] = 'black';
you get:
The slow bear black jumped over the lazy dog.
mrozenoer
I could not find a function to unescape JavaScript Unicode escapes anywhere (e.g., "\u003c" => "<").
<?php
function js_uni_decode($s)
{
    return preg_replace('/\\\u([0-9a-f]{4})/ie', "chr(hexdec('\\1'))", $s);
}

echo js_uni_decode("\u003c");
?>
mike dot hayward
Hi. Not sure if this will be a great help to anyone out there, but I thought I'd post it just in case. I was having an issue with a project that relied on $_SERVER['REQUEST_URI']. This wasn't working on IIS (I am using mod_rewrite in Apache to call up pages from a database, and IIS doesn't set REQUEST_URI). So I knocked up this simple little preg_replace to use the query string set by IIS when redirecting to a PHP error page.
<?php
// My little IIS hack :)
if (!isset($_SERVER['REQUEST_URI'])) {
    $_SERVER['REQUEST_URI'] = preg_replace('/404;([a-zA-Z]+:\/\/)(.*?)\//i', "/", $_SERVER['QUERY_STRING']);
}
?>
Hope this helps someone else out there trying to do the same thing :) If anyone finds a better way, please let me know, I'm still learning ;)
mac
Here's a derivative of Iasmin's function. This one also handles URLs followed by a period or comma, and adds a _blank target to the hyperlinks. Additionally, it replaces e-mail addresses with a mailto hyperlink.
<?php
function hyperlink($text)
{
    // match protocol://address/path/file.extension?some=variable&another=asf%
    $text = preg_replace("/\s([a-zA-Z]+:\/\/[a-z][a-z0-9\_\.\-]*[a-z]{2,6}[a-zA-Z0-9\/\*\-\?\&\%]*)([\s|\.|\,])/i", " <a href=\"$1\" target=\"_blank\">$1</a>$2", $text);
    // match www.something.domain/path/file.extension?some=variable&another=asf%
    $text = preg_replace("/\s(www\.[a-z][a-z0-9\_\.\-]*[a-z]{2,6}[a-zA-Z0-9\/\*\-\?\&\%]*)([\s|\.|\,])/i", " <a href=\"http://$1\" target=\"_blank\">$1</a>$2", $text);
    // match name@address (mailto: links take no slashes after the colon)
    $text = preg_replace("/\s([a-zA-Z][a-zA-Z0-9\_\.\-]*[a-zA-Z]*\@[a-zA-Z][a-zA-Z0-9\_\.\-]*[a-zA-Z]{2,6})([\s|\.|\,])/i", " <a href=\"mailto:$1\">$1</a>$2", $text);
    return $text;
}
?>
eric
Recently I needed a way to replace links (<a href="blah.com/blah.php">Blah</a>) with their anchor text, in this case Blah. It might seem simple enough to some, or most, but for the benefit of others:
<?php
$value = '<a href="http://www.domain.com/123.html">123</a>';
echo preg_replace('/<a href="(.*?)">(.*?)<\\/a>/i', '$2', $value);
// Output
// 123
?>
kyle
Here is a regular expression to "slashdotify" HTML links. This has worked well for me, but if anyone spots errors, feel free to make corrections.
<?php
$url = '<a attr="garbage" href="http://us3.php.net/preg_replace">preg_replace - php.net</a>';
$url = preg_replace('/<.*href="?(.*:\/\/)?([^ \/]*)([^ >"]*)"?[^>]*>(.*)(<\/a>)/', '<a href="$1$2$3">$4</a> [$2]', $url);
?>
Will output:
<a href="http://us3.php.net/preg_replace">preg_replace - php.net</a> [us3.php.net]
sternkinder
From what I can see, the problem is that if you go straight ahead and substitute all 'A's with 'T's, you can't tell for sure which 'T's to substitute with 'A's afterwards. This can be solved, for instance, by first replacing all 'A's with another character (for instance '_' or whatever you like), then replacing all 'T's with 'A's, and then replacing all '_'s (or whatever character you chose) with 'T's:

$dna = "AGTCTGCCCTAG";
echo str_replace(array("A","G","C","T","_","-"), array("_","-","G","A","T","C"), $dna); // output will be TCAGACGGGATC

Although I don't know how transliteration in Perl works (though I remember that it is similar to the UNIX command "tr"), I would suggest the following function for "switching" single chars:

function switch_chars($subject, $switch_table, $unused_char = "_")
{
    foreach ($switch_table as $_1 => $_2) {
        $subject = str_replace($_1, $unused_char, $subject);
        $subject = str_replace($_2, $_1, $subject);
        $subject = str_replace($unused_char, $_2, $subject);
    }
    return $subject;
}

echo switch_chars("AGTCTGCCCTAG", array("A" => "T", "G" => "C")); // output will be TCAGACGGGATC
robvdl
For those of you who have ever had the problem where clients paste text from MS Word into a CMS, and Word has placed all those fancy quotes throughout the text, breaking the XHTML validator: I have created a regular expression that replaces ALL high UTF-8 characters with HTML entities, such as &#8217;.

Note that most user examples I have read on php.net replace only selected characters, such as single and double quotes. This replaces all high characters, including Greek characters, Arabic characters, smilies, whatever. It took me ages to get it down to just two regular expressions, but it handles all two-byte and three-byte characters properly.

$text = preg_replace('/([\xc0-\xdf].)/se', "'&#' . ((ord(substr('$1', 0, 1)) - 192) * 64 + (ord(substr('$1', 1, 1)) - 128)) . ';'", $text);
$text = preg_replace('/([\xe0-\xef]..)/se', "'&#' . ((ord(substr('$1', 0, 1)) - 224) * 4096 + (ord(substr('$1', 1, 1)) - 128) * 64 + (ord(substr('$1', 2, 1)) - 128)) . ';'", $text);
anon
For the benefit of Perl coders,

$s =~ s/PATTERN/REPLACEMENT/g;

becomes:
<?php
$s = preg_replace('/PATTERN/', 'REPLACEMENT', $s);
?>
Note that you have to assign the result back to $s. If your preg_replace doesn't seem to be working, you may simply have forgotten to assign the return value to $s.
flar
For completeness, with regard to my previous message, it should be noted that the following expressions are functionally identical in PHP:

'/big(.*)house/U'
'/big(.*?)house/'

Last note: the parentheses can be omitted.
jon
For a URL explode I would suggest parse_url($url). It's far simpler than the list of preg_replaces used.
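
To show what that suggestion looks like in practice, here is a small sketch using parse_url() and, for the query string, parse_str(). The URL is invented for the example; the array keys are the ones parse_url() returns for the components that are present.

```php
<?php
// Hypothetical URL used only for illustration.
$url = 'http://user@www.example.com:8080/path/page.php?id=5&sort=asc#top';

// parse_url() returns an associative array of the components it finds.
$parts = parse_url($url);

echo $parts['scheme'], "\n";   // http
echo $parts['host'], "\n";     // www.example.com
echo $parts['port'], "\n";     // 8080
echo $parts['path'], "\n";     // /path/page.php
echo $parts['query'], "\n";    // id=5&sort=asc
echo $parts['fragment'], "\n"; // top

// The query string can be split further with parse_str().
parse_str($parts['query'], $query);
echo $query['id'], "\n";       // 5
```

Components missing from the URL are simply absent from the array, so real code would check with isset() before reading a key.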
igasparetto
Displaying the results of a search engine, with the search words highlighted:

$words = explode(" ", $_POST['query']);
$patterns = array();
$replaces = array();
foreach ($words as $word) {
    // preg_quote() keeps user input from being read as regex syntax
    $patterns[] = '/' . preg_quote($word, '/') . '/i';
    $replaces[] = '<span class="textFound">$0</span>';
}
// run sql
$display_results = "";
foreach ($res as $row)
    $display_results .= " " . preg_replace($patterns, $replaces, nl2br(htmlentities($row['field']))) . "\n";
echo $display_results;
ismith
Be aware that when using the "/u" modifier, if your input text contains any bad UTF-8 code sequences, preg_replace will return NULL (which prints as an empty string), regardless of whether there were any matches. This is due to the PCRE library returning an error code if the string contains bad UTF-8.
131 dot php
Based on the previous comment, I suggest the following (this function already exists in PHP 6):

function unicode_decode($str)
{
    return preg_replace('#\\\u([0-9a-f]{4})#e', "unicode_value('\\1')", $str);
}

function unicode_value($code)
{
    $value = hexdec($code);
    if ($value < 0x0080)
        return chr($value);
    elseif ($value < 0x0800)
        return chr((($value & 0x07c0) >> 6) | 0xc0)
            . chr(($value & 0x3f) | 0x80);
    else
        return chr((($value & 0xf000) >> 12) | 0xe0)
            . chr((($value & 0x0fc0) >> 6) | 0x80)
            . chr(($value & 0x3f) | 0x80);
}
matt
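Regarding the post below:
<?php
$template = "Price: #price#";
$price = '$5';
print "Price: $price\n";
$res = preg_replace("/#price#/", $price, $template);
print "From template: -> $res\n";
?>
Here the replacement string '$5' is read as a reference to capture group 5, which does not exist, so the price is missing from the second line of output.
anzenews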
As steven -a-t- acko dot net explained, using /e modifier is tricky if strings have quotes in them. However, the solution might be easier than manual coding of some sort of stripslashes - just use preg_replace_callback instead of preg_replace. It is safer anyway.
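
A small sketch of why the callback form sidesteps the quoting problem: the callback is handed the captured text as plain data, so nothing is run through addslashes() and nothing needs to be un-escaped afterwards. The input string and the bracket pattern are made up for the example, and the anonymous function syntax assumes PHP 5.3 or later.

```php
<?php
// Hypothetical input containing both kinds of quotes.
$text = 'He said: [You\'re "here"]';

// Upper-case whatever sits between square brackets. The callback gets the
// captured text exactly as it appears in the subject, with no added slashes.
$result = preg_replace_callback(
    '/\[(.*?)\]/',
    function (array $m) {
        return '[' . strtoupper($m[1]) . ']';
    },
    $text
);

echo $result, "\n"; // He said: [YOU'RE "HERE"]
```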
rob
Also worth noting is that you can use array_keys()/array_values() with preg_replace like this:

$subs = array(
    '/\[b\](.+)\[\/b\]/Ui' => '<strong>$1</strong>',
    '/_(.+)_/Ui' => '<em>$1</em>',
    // ...
);
$raw_text = '[b]this is bold[/b] and this is _italic!_';
$bb_text = preg_replace(array_keys($subs), array_values($subs), $raw_text);
santosh patnaik
@giel dot berkers
Use the 'PCRE_DOTALL' ('s') option so that the '.' also covers newline characters:

$code = preg_replace('/\/\*.*\*\//ms', '', $code);
jefkin
@ Santosh Patnaik
The Perl regular expression engine will handle this expression better and much faster if you use the word boundary escape code \b, though that may not be obvious except to long-time Perl geeks such as me :) So:

// Expect and get 'pa pa pa pa'
echo preg_replace('`\bma\b`', 'pa', 'ma ma ma ma');
Jeff
info
@ Mac: Your adjustment only seems to work when the period or comma is followed by a newline. The code breaks when more characters follow. Try the following code instead. It parses the URL up to a point where all punctuation is followed by either a word character or a slash.
<?php
$string = preg_replace('#(^|\s)([a-z]+://([^\s\w/]?[\w/])*)#is', '\\1<a href="\\2">\\2</a>', $string);
$string = preg_replace('#(^|\s)((www|ftp)\.([^\s\w/]?[\w/])*)#is', '\\1<a href="http://\\2">\\2</a>', $string);
$string = preg_replace('#(^|\s)(([a-z0-9._%+-]+)@(([.-]?[a-z0-9])*))#is', '\\1<a href="mailto:\\2">\\2</a>', $string);
?>
This seems fairly watertight, but please post your improvements.
|