split
Split string into array by regular expression
(PHP 4, PHP 5)
Example 1897. split() example
To split off the first four fields from a line from
<?php
Example 1898. split() example
To parse a date which may be delimited with slashes, dots, or hyphens:
<?php
Code Examples / Notes » split
jchart
[Ed. note: Close. The pipe *is* an operator in PHP, but the reason this fails is that it is also an operator in the regex syntax. The distinction is important, since a PHP operator inside a string is just a character.]
The reason your code:
$line = "12|3|Fred";
list ($msgid, $msgref, $msgtopic) = split('|', $line);
didn't work is that the "|" symbol is an operator in PHP. If you want to use the pipe symbol as a delimiter, you must escape it with a backslash: "\|". Your code should look like this:
$line = "12|3|Fred";
list ($msgid, $msgref, $msgtopic) = split('\|', $line);
robin
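For the pipe-delimiter case above, here is a minimal sketch using explode() and preg_split() instead of split(), which was deprecated in PHP 5.3 and removed in PHP 7. The sample line comes from the note; the variable $fields is only illustrative.

```php
<?php
// Sample line taken from the note above.
$line = "12|3|Fred";

// explode() treats the delimiter as a plain string, so no escaping is needed.
list($msgid, $msgref, $msgtopic) = explode('|', $line);

// preg_split() takes a regular expression, so the pipe must be escaped there.
$fields = preg_split('/\|/', $line);

print_r($fields);
```

When the delimiter is a fixed string, explode() is usually the simpler choice because regex metacharacters never come into play.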
wchris's quotesplit assumes that anything that is quoted must also be a complete delimiter-separated entry by itself. This version does not. It also uses split's argument order.
function quotesplit( $splitter=',', $s )
{
    //First step is to split it up into the bits that are surrounded by quotes
    //and the bits that aren't. Adding the delimiter to the ends simplifies
    //the logic further down
    $getstrings = explode('"', $splitter.$s.$splitter);
    //$instring toggles so we know if we are in a quoted string or not
    $delimlen = strlen($splitter);
    $instring = 0;
    while (list($arg, $val) = each($getstrings))
    {
        if ($instring==1)
        {
            //Add the whole string, untouched to the previous value in the array
            $result[count($result)-1] = $result[count($result)-1].$val;
            $instring = 0;
        }
        else
        {
            //Break up the string according to the delimiter character
            //Each string has extraneous delimiters around it (inc the ones we added
            //above), so they need to be stripped off
            $temparray = split($splitter, substr($val, $delimlen, strlen($val)-$delimlen-$delimlen+1 ) );
            while(list($iarg, $ival) = each($temparray))
            {
                $result[] = trim($ival);
            }
            $instring = 1;
        }
    }
    return $result;
}
fotw
Oops! It seems that neither explode nor split REALLY takes a STRING, only a single character, as the delimiter for splitting. I found this problem in my own code when trying to split a string using ";\n" as the delimiter. As a result, only ";" was taken and the rest of the delimiter was ignored. The same happened when I replaced "\n" with anything else. :(
claes
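As a cross-check on the multi-character delimiter report above: explode() does accept a delimiter of any length. The sketch below also shows one possible cause of such a failure, which is only an assumption about the original code: inside single quotes, '\n' is a backslash followed by "n", not a line feed. The sample data is hypothetical.

```php
<?php
// Input with ";\n" between records (sample data, not from the original note).
$input = "alpha;\nbeta;\ngamma";

// Double quotes are needed so that \n is an actual line feed.
$records = explode(";\n", $input);

// In single quotes, '\n' is a backslash plus "n", so the delimiter never matches.
$unsplit = explode(';\n', $input);

print_r($records);
print_r($unsplit);
```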
Though this is obvious, the manual is slightly incorrect in claiming that the number of returned elements is always 1 + the number of times the split pattern occurs. If the split pattern is the first part of the string, the count will still be 1. E.g.
$a = split("zz", "zzxsj.com");
count($a); => 1.
This result cannot in any way be distinguished from the result when the split pattern is not found.
kang
This is a good way to display a comma delimited file with two columns. The first column is the URL's description, the second is the actual URL. <ul> <?php $fname="relatedlinks.csv"; $fp=fopen($fname,"r") or die("Error found."); $line = fgets( $fp, 1024 ); while(!feof($fp)) { list($desc,$url,$dummy) = split( ",", $line, 3 ); print "<li>"; print "<a href='$url'>$desc</a>"; print "</li>\n"; $line = fgets( $fp, 1024 ); } fclose($fp); ?> </ul> nate sweet
This function takes a string like this... this=cool that="super cool" thing='ridiculous cool' And gives you an associative array of names and values. function parseNameValues ($text) { $values = array(); if (preg_match_all('/([^=\s]+)=("(?P<value1>[^"]+)"|' . '\'(?P<value2>[^\']+)\'|(?P<value3>.+?)\b)/', $text, $matches, PREG_SET_ORDER)) foreach ($matches as $match) $values[trim($match[1])] = trim(@$match['value1'] . @$match['value2'] . @$match['value3']); return $values; } (regex broken into two strings so it won't be too long for this webpage) justin
The previous solution assumes that a quoted string always starts a new element (true in real CSV files, but not in my application). The following routine does not make any such assumptions. It also deals with pairs of quotes and does not use any regular expressions. <?php function getCSVValues($string, $separator=",") { $elements = explode($separator, $string); for ($i = 0; $i < count($elements); $i++) { $nquotes = substr_count($elements[$i], '"'); if ($nquotes %2 == 1) { for ($j = $i+1; $j < count($elements); $j++) { if (substr_count($elements[$j], '"') > 0) { // Put the quoted string's pieces back together again array_splice($elements, $i, $j-$i+1, implode($separator, array_slice($elements, $i, $j-$i+1))); break; } } } if ($nquotes > 0) { // Remove first and last quotes, then merge pairs of quotes $qstr =& $elements[$i]; $qstr = substr_replace($qstr, '', strpos($qstr, '"'), 1); $qstr = substr_replace($qstr, '', strrpos($qstr, '"'), 1); $qstr = str_replace('""', '"', $qstr); } } return $elements; } ?> franz
The example from ramkumar rajendran did not work.
$line = split("/\n", $input_several_lines_long);
I do not know why this does not work for me. The following worked for me to get a maximum of 2 array parts separated by the first newline (independent of whether the text was saved under UNIX or WINDOWS):
$line = preg_split('/[\n\r]+/',$input_several_lines_long,2);
Note that empty lines are not preserved here.
mcgarry
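If empty lines should be kept when splitting on newlines, one option is to match each line ending once instead of a run of them. This is a sketch with made-up sample text, not the note author's code.

```php
<?php
// Sample text mixing Windows (\r\n) and UNIX (\n) endings, with one empty line.
$input = "first\r\nsecond\n\nfourth";

// \r\n is listed first so a Windows line ending counts as a single separator.
$lines = preg_split('/\r\n|\r|\n/', $input);

// With a limit of 2, only the first line is split off from the rest.
$head_and_rest = preg_split('/\r\n|\r|\n/', $input, 2);

print_r($lines);
print_r($head_and_rest);
```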
split() doesn't like NUL characters within the string, it treats the first one it meets as the end of the string, so if you have data you want to split that can contain a NUL character you'll need to convert it into something else first, eg: $line=str_replace(chr(0),'',$line); destes
Some corrections to robin-at-teddyb's CSV splitting function. Recall that the point of this is to properly implement a split() function that handles data exported to CSV, where data containing commas gets quote-delimited. * Problem 1: As jh-at-junetz pointed out, the +1 in robin's nonquoted splitting command mistakenly adds an extra element to the resulting array. * Problem 2: If consecutive fields are quote-delimited, the remaining "separator" between them only contains one delimiter and no actual fields - so an extra element gets added to the parsed array. * Problem 3: When double-quotes appear in a spreadsheet exported to CSV, they get escaped by doubling them, i.e. a data field reading "this is a test of a "special" case" gets written to CSV as, "this is a test of a ""special"" case". These quotes are also interpreted as top-level delimiters and (mistakenly) add extra array elements to the output. I have hacked a conversion of "" to a single quote ( ' ), but a truly clever preg_split for the top-level splitter (instead of the explode) might preserve the original doubled "s without bugging up the top-level parsing. i.e., a smarter man than I could solve the problem rather than avoiding it by replacing the bad data. (current) Solution: <?php function quotesplit( $splitter=',', $s, $restore_quotes=0 ) { // hack because i'm a bad programmer - replace doubled "s with a ' $s = str_replace('""', "'", $s); //First step is to split it up into the bits that are surrounded by quotes //and the bits that aren't. 
Adding the delimiter to the ends simplifies //the logic further down $getstrings = explode('"', $splitter.$s.$splitter); //$instring toggles so we know if we are in a quoted string or not $delimlen = strlen($splitter); $instring = 0; while (list($arg, $val) = each($getstrings)) { if ($instring==1) { if( $restore_quotes ) { //Add the whole string, untouched to the previous value in the array $result[count($result)-1] = $result[count($result)-1].'"'.$val.'"'; } else { //Add the whole string, untouched to the array $result[] = $val; } $instring = 0; } else { // check that we have data between multiple $splitter delimiters if ((strlen($val)-$delimlen) >= 1) { //Break up the string according to the delimiter character //Each string has extraneous delimiters around it (inc the ones we added //above), so they need to be stripped off $temparray = split($splitter, substr($val, $delimlen, strlen($val)-$delimlen-$delimlen ) ); while(list($iarg, $ival) = each($temparray)) { $result[] = trim($ival); } } // else, the next element needing parsing is a quoted string and the comma // here is just a single separator and contains no data, so skip it $instring = 1; } } return $result; } ?> jh
robin: Nice function, saved my day. The +1 at the end of split / substr is wrong, though.
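For the CSV cases discussed in the quotesplit notes above (quoted fields containing the delimiter, and quotes escaped by doubling), PHP 5.3 and later ship str_getcsv(), which may make a hand-written splitter unnecessary. A minimal sketch with a hypothetical sample line:

```php
<?php
// Sample CSV line: one quoted field containing a comma, one with doubled quotes.
$line = 'plain,"has, comma","say ""hi"""';

// Delimiter, enclosure and escape are passed explicitly rather than left to defaults.
$fields = str_getcsv($line, ',', '"', '\\');

print_r($fields);
```

The doubled quotes come back as a single " inside the field, so no substitute character is needed.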
dalu
php4.3.0 strange things happen with split this didn't work $vontag $vonmonat were empty strings <?php function ckdate($fromdate="01.01", $todate="31.12") { $nowyear = date("Y"); list ($vontag , $vonmonat) = split ('.' , $fromdate); // << bad $vondatum = "$nowyear-$vonmonat-$vontag"; list ($bistag , $bismonat) = split ('.' , $todate); // << bad $bisdatum = "$nowyear-$bismonat-$bistag"; $von = strtotime($vondatum); $bis = strtotime($bisdatum); $now = time(); if (($now <= $bis) and ($now >= $von)) { return TRUE; } else { return FALSE; } } ?> however this one worked perfectly <?php function ckdate($fromdate="01.01", $todate="31.12") { $nowyear = date("Y"); list ($vontag , $vonmonat) = split ('[.]' , $fromdate); // << good $vondatum = "$nowyear-$vonmonat-$vontag"; list ($bistag , $bismonat) = split ('[.]' , $todate); // << good $bisdatum = "$nowyear-$bismonat-$bistag"; $von = strtotime($vondatum); $bis = strtotime($bisdatum); $now = time(); if (($now <= $bis) and ($now >= $von)) { return TRUE; } else { return FALSE; } } ?> btw this fn checks if $now if between $fromdate and $todate use it if you like re: gcerretini
Original problem:
=================
I tried using the split function.
<?php
$ferro="2ø12";
$valore=split("[ø]",$ferro);
echo $ferro." ";
echo "p1-".$valore[0]." ";
echo "p2-".$valore[1]." ";
echo "p3-".$valore[2]." ";
$ferro="2d12";
$valore=split("[d]",$ferro);
echo $ferro." ";
echo "p1-".$valore[0]." ";
echo "p2-".$valore[1]." ";
echo "p3-".$valore[2]." ";
?>
This returns:
============
2ø12 p1-2 p2- p3-12
2d12 p1-2 p2-12 p3-
I use charset UTF-8. When I use the character ø, the split function adds an empty string between "2" and "12"... Why?
Explanation:
============
The UTF-8 charset encodes some characters (like the "ø" character) as two bytes. In fact, the regular expression "[ø]" contains 4 bytes (4 non-Unicode characters). To demonstrate the real situation I wrote the following example:
$ferro="2de12";
$valore=split("[de]",$ferro);
echo $ferro." ";
echo "p1-".$valore[0]." ";
echo "p2-".$valore[1]." ";
echo "p3-".$valore[2]." ";
This returns:
=============
2de12 p1-2 p2- p3-12
moritz
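The byte-versus-character effect described above can be reproduced with preg_split(), and avoided with the /u pattern modifier. This sketch assumes the script file itself is saved as UTF-8.

```php
<?php
// "ø" occupies two bytes in UTF-8 (the file is assumed to be saved as UTF-8).
$ferro = "2ø12";

// Without /u the class [ø] is two separate bytes, so an empty piece appears.
$bytewise = preg_split('/[ø]/', $ferro);

// With /u, pattern and subject are treated as UTF-8 and ø is one character.
$valore = preg_split('/[ø]/u', $ferro);

print_r($bytewise);
print_r($valore);
```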
Often you want to split CSV-Like data, so this is the function for this :) It splits data formatted like: 1,2,3 -> [1,2,3] 1 , 3, 4 -> [1,3,4] one; two;three -> ['one','two','three'] "this is a string", "this is a string with , and ;", 'this is a string with quotes like " these', "this is a string with escaped quotes \" and \'.", 3 -> ['this is a string','this is a string with , and ;','this is a string with quotes like " these','this is a string with escaped quotes " and '.',3] function quotesplit($s) { $r = Array(); $p = 0; $l = strlen($s); while ($p < $l) { while (($p < $l) && (strpos(" \r\t\n",$s[$p]) !== false)) $p++; if ($s[$p] == '"') { $p++; $q = $p; while (($p < $l) && ($s[$p] != '"')) { if ($s[$p] == '\\') { $p+=2; continue; } $p++; } $r[] = stripslashes(substr($s, $q, $p-$q)); $p++; while (($p < $l) && (strpos(" \r\t\n",$s[$p]) !== false)) $p++; $p++; } else if ($s[$p] == "'") { $p++; $q = $p; while (($p < $l) && ($s[$p] != "'")) { if ($s[$p] == '\\') { $p+=2; continue; } $p++; } $r[] = stripslashes(substr($s, $q, $p-$q)); $p++; while (($p < $l) && (strpos(" \r\t\n",$s[$p]) !== false)) $p++; $p++; } else { $q = $p; while (($p < $l) && (strpos(",;",$s[$p]) === false)) { $p++; } $r[] = stripslashes(trim(substr($s, $q, $p-$q))); while (($p < $l) && (strpos(" \r\t\n",$s[$p]) !== false)) $p++; $p++; } } return $r; } wchris
moritz's quotesplit didn't work for me. It seemed to split on a comma even though it was between a pair of quotes. However, this did work: function quotesplit($s, $splitter=',') { //First step is to split it up into the bits that are surrounded by quotes and the bits that aren't. Adding the delimiter to the ends simplifies the logic further down $getstrings = split('\"', $splitter.$s.$splitter); //$instring toggles so we know if we are in a quoted string or not $delimlen = strlen($splitter); $instring = 0; while (list($arg, $val) = each($getstrings)) { if ($instring==1) { //Add the whole string, untouched to the result array. $result[] = $val; $instring = 0; } else { //Break up the string according to the delimiter character //Each string has extraneous delimiters around it (inc the ones we added above), so they need to be stripped off $temparray = split($splitter, substr($val, $delimlen, strlen($val)-$delimlen-$delimlen ) ); while(list($iarg, $ival) = each($temparray)) { $result[] = trim($ival); } $instring = 1; } } return $result; } paha
It's evident, but not mentioned in the documentation, that using asterisks is more restricted than in a normal regular expression. For example, you cannot write:
split(";*",$string);
because the pattern also matches when there is no ";" separator at all (an empty match). So in this situation you have to use at least
split(";+",$string);
jeffrey
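The same "one or more" pattern carries over to preg_split(). A short sketch with hypothetical input:

```php
<?php
// Sample input with a doubled separator (hypothetical data).
$string = "a;;b;c";

// ";+" requires at least one semicolon, so a run of them counts as one separator.
$parts = preg_split('/;+/', $string);

print_r($parts);
```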
In answer to gwyne at gmx dot net, dec 1, 2002: For split(), when using a backslash as the delimiter, you have to *double escape* the backslash. example: ================================== <pre> <? $line = 'stuff\\\thing\doodad\\'; $linearray = split('\\\\', $line); //<--NOTE USE OF FOUR(4)backslashes print join(":", $linearray); ?> </pre> ================================== output is: <pre> stuff::thing:doodad: </pre> nomail
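The double escaping in the backslash note above applies to any regex-based split. A sketch comparing explode(), which needs only PHP-level escaping, with preg_split(), using the sample string from the note:

```php
<?php
// Same sample as in the note: two backslashes after "stuff",
// one before "doodad", and one at the very end.
$line = 'stuff\\\thing\doodad\\';

// explode() works on plain strings: '\\' is a single backslash.
$by_explode = explode('\\', $line);

// preg_split() needs the backslash escaped twice: once for PHP, once for the regex.
$by_regex = preg_split('/\\\\/', $line);

print join(":", $by_regex);
```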
If you want to use split to check on line feeds (\n), the following won't work: $line = split("\n", $input_several_lines_long); You really have to do this instead, notice the second slash: $line = split("\\n", $input_several_lines_long); Took me a little while to figure out. not
If you need to split on a period, make sure you escape the period:
$ext_arr = split("\.","something.jpg");
... because
$ext_arr = split(".","something.jpg");
won't work properly.
jort
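For a literal period, two ways to sidestep manual escaping are explode(), which takes a plain string, and preg_quote(), which escapes regex metacharacters for you. A minimal sketch; $pattern and $by_regex are illustrative names.

```php
<?php
// File name from the note above.
$filename = "something.jpg";

// explode() treats "." literally.
$ext_arr = explode('.', $filename);

// preg_quote() turns "." into "\." so it is safe inside a regex.
$pattern = '/' . preg_quote('.', '/') . '/';
$by_regex = preg_split($pattern, $filename);

print_r($ext_arr);
```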
If you are looking for EITHER open square brackets OR close square brackets, then '[[]]' won't work (reasonably expected), but neither will '[\[\]]', nor with any number of escapes. HOWEVER, if your pattern is '[][]' it will work.
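The bracket behaviour above is specific to the POSIX syntax used by split(). With preg_split() (PCRE), both the escaped form and the "]"-first form appear to work; a sketch with hypothetical input:

```php
<?php
// Sample input (hypothetical).
$s = 'a[b]c';

// In PCRE, escaped brackets inside a character class work as expected.
$escaped = preg_split('/[\[\]]/', $s);

// The POSIX-style form from the note, with "]" placed first, is also accepted.
$posix_style = preg_split('/[][]/', $s);

print_r($escaped);
print_r($posix_style);
```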
gcerretini
I tried using the split function.
<?PHP
$ferro="2ø12";
$valore=split("[ø]",$ferro);
echo $ferro." ";
echo "p1-".$valore[0]." ";
echo "p2-".$valore[1]." ";
echo "p3-".$valore[2]." ";
$ferro="2d12";
$valore=split("[d]",$ferro);
echo $ferro." ";
echo "p1-".$valore[0]." ";
echo "p2-".$valore[1]." ";
echo "p3-".$valore[2]." ";
?>
This returns:
2ø12 p1-2 p2- p3-12
2d12 p1-2 p2-12 p3-
I use charset UTF-8. When I use the character ø, the split function adds an empty string between "2" and "12". Why?
alphibia
I'd like to correct myself, I found that after testing my last solution it will create 5 lines no matter what... So I added this to make sure that it only displays 5 if there are five newlines. :-) <?php $MaxNewLines = 5; $BRCount = substr_count($Message, '<br />'); if ($BRCount<$MaxNewLines) $MaxNewLines=$BRCount; else if($BRCount == 0) $MaxNewLines=1; $Message = str_replace(chr(13), "<br />", $Message); $MessageArray = split("<br />", $Message, $MaxNewLines); $Message = ""; $u=0; do { $Message.=$MessageArray[$u].'<br />'; $u++; } while($u<($MaxNewLines-1)); $Message.=str_replace("<br />"," ",$MessageArray[$u]); ?> -Tim http://www.alphibia.com dan dot jones
Here's a function to split a string into csv values where they are optionally enclosed by " to allow values with commas in. I think it works. Let me know if I'm wrong. Cheers. Dan function getCSVValues($string) { // split the string at double quotes " $bits = split('"',$string); $elements = array(); for ($i=0;$i<count($bits);$i++) { /* odd numbered elements would have been enclosed by double quotes even numbered elements would not have been */ if (($i%2) == 1) { /* if the element number is odd add the whole string to the output array */ $elements[] = $bits[$i]; } else { /* otherwise split the unquoted stuff at commas and add the elements to the array */ $rest = $bits[$i]; $rest = preg_replace("/^,/","",$rest); $rest = preg_replace("/,$/","",$rest); $elements = array_merge($elements,split(',',$rest)); } } return $elements; } passtschu
divide a string with a template. the "template dividers" are the keys for the output array. <?PHP function string2array ($string, $template){ #search defined dividers preg_match_all ("|%(.+)%|U", $template, $template_matches); #replace dividers with "real dividers" $template = preg_replace ("|%(.+)%|U", "(.+)", $template); #search matches preg_match ("|" . $template . "|", $string, $string_matches); #[template_match] => $string_match foreach ($template_matches[1] as $key => $value){ $output[$value] = $string_matches[($key + 1)]; } return $output; } $string1 = 'www.something.com 66.196.91.121 - - [01/Sep/2005:04:20:39 +0200] "GET /robots.txt HTTP/1.0" 200 49 "-"'; $string2= '%Domain% %IP% - %User% \[%Date%:%Time% %TimeZone%\] "%Method% %Request% %Protocol%" %ServerCode% %Bytes% "%Referer%"'; print_r (string2array ($string1, $string2)); /* Array ( [ServerAddress] => www.something.com [IP] => 66.196.91.121 [User] => - [Date] => 01/Sep/2005 [Time] => 04:20:39 [TimeZone] => +0200 [Method] => GET [Request] => /robots.txt [Protocol] => HTTP/1.0 [ServerCode] => 200 [Bytes] => 49 [Referer] => - ) */ ?> 04-dec-2005 01:57
Be advised $arr = split("x", "x" ); print_r($arr); will output: Array ( [0] => [1] => ) That is it will catch 2 empty strings on each side of the delimiter. robin
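preg_split() shows the same two empty strings for this input, and its PREG_SPLIT_NO_EMPTY flag can drop them. A short sketch:

```php
<?php
// Same case as in the note: the subject consists only of the delimiter.
$with_empty = preg_split('/x/', 'x');

// PREG_SPLIT_NO_EMPTY drops empty pieces; -1 means "no limit".
$without_empty = preg_split('/x/', 'x', -1, PREG_SPLIT_NO_EMPTY);

var_dump($with_empty, $without_empty);
```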
Actually, this version is better than the last I submitted. The goal here is to be able to engage in *multiple* delimiter removal passes; for all but the last pass, set the third value to "1", and everything should go well.
function quotesplit( $splitter=',', $s, $restore_quotes=0 )
{
    //First step is to split it up into the bits that are surrounded by quotes
    //and the bits that aren't. Adding the delimiter to the ends simplifies
    //the logic further down
    $getstrings = explode('"', $splitter.$s.$splitter);
    //$instring toggles so we know if we are in a quoted string or not
    $delimlen = strlen($splitter);
    $instring = 0;
    while (list($arg, $val) = each($getstrings))
    {
        if ($instring==1)
        {
            if( $restore_quotes )
            {
                //Add the whole string, untouched to the previous value in the array
                $result[count($result)-1] = $result[count($result)-1].'"'.$val.'"';
            } else {
                //Add the whole string, untouched to the array
                $result[] = $val;
            }
            $instring = 0;
        }
        else
        {
            //Break up the string according to the delimiter character
            //Each string has extraneous delimiters around it (inc the ones we added
            //above), so they need to be stripped off
            $temparray = split($splitter, substr($val, $delimlen, strlen($val)-$delimlen-$delimlen+1 ) );
            while(list($iarg, $ival) = each($temparray))
            {
                $result[] = trim($ival);
            }
            $instring = 1;
        }
    }
    return $result;
}
theodule
A small modification of dan jones's function.
_New_: I added a parameter to specify the separator (default ",").
_Fix_: when double quotes appear in a spreadsheet exported to CSV, they get escaped by doubling them, so I replace them with the ' character.
<?php
function getCSVValues($string,$separator=",")
{
    $string = str_replace('""', "'", $string);
    // split the string at double quotes "
    $bits = explode('"',$string);
    $elements = array();
    for ( $i=0; $i < count($bits) ; $i++ ) {
        /*
        odd numbered elements would have been
        enclosed by double quotes
        even numbered elements would not have been
        */
        if (($i%2) == 1) {
            /* if the element number is odd add the
            whole string to the output array */
            $elements[] = $bits[$i];
        } else {
            /* otherwise split the unquoted stuff at commas
            and add the elements to the array */
            $rest = $bits[$i];
            $rest = preg_replace("/^".$separator."/","",$rest);
            $rest = preg_replace("/".$separator."$/","",$rest);
            $elements = array_merge($elements,explode($separator,$rest));
        }
    }
    return $elements;
}
?>
ramkumar rajendran
A correction to an earlier note. If you want to use split to check on line feeds (\n), the following won't work:
$line = split("\n", $input_several_lines_long);
You really have to do this instead, notice the second slash:
$line = split("/\n", $input_several_lines_long);
Took me a little while to figure out.
mike
// Split a string into words on boundaries of one or more spaces, tabs or new-lines $s = "Please cut \t me \n in pieces"; $words = split("[\n\r\t ]+", $s); print_r($words); // Output: Array ( [0] => Please [1] => cut [2] => me [3] => in [4] => pieces ) shimon
<? // ** // * splitslash() // * // * this function enables to split with an escape char; // * // * @since 25/12/05 21:26:00 // * @author Shimon Doodkin // * // * @param $string // * @param $string // * @return Array() // ** function splitslash($split,$str,$esc='\\\\') { $o=explode($split,$str); $oc=count($o); $a=array(); for($i=0;$i<$oc;$i++) { $o2=explode($esc.$esc,$o[$i]); $o2c=count($o2); if($o2[$o2c-1][strlen($o2[$o2c-1])-1]==$esc) { $o2[$o2c-1]=substr($o2[$o2c-1],0,-1); if($i+1<$oc) { $o[$i+1]=join($esc.$esc,$o2).$split.$o[$i+1]; } else { //echo "error"; $a[]=join($esc,$o2); //do like ok } } else { $a[]=join($esc,$o2); } } return $a; } // example: $r=splitslash("NA","mooNAmooNAma\\\\ma\\NA"); print_r($r); //output: /* Array ( [0] => moo [1] => moo [2] => ma\\maNA ) */ ?> krahn
> strange things happen with split > this didn't work > $vontag $vonmonat were empty strings ... > list ($vontag , $vonmonat) = split ('.' , $fromdate); // << bad Split is acting exactly as it should; it splits on regular expressions. A period is a regular expression pattern for a single character. So, an actual period must be escaped with a backslash: '\.' A period within brackets is not an any-character pattern, because it does not make sense in that context. Beware that regular expressions can be confusing becuase there are a few different varieties of patterns. |