A while ago I wrote a short tutorial on how to write a PHP function that extracts the content between specific delimiters. It has come to my attention that many people are looking for a way to replace, or even modify, the content between two delimiters, so I decided to write a script that helps with that. It uses regular expressions to find and parse the matched content.
Replace all the content found between the delimiters
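    function replace_content_inside_delimiters($start, $end, $new, $source) {
        // Quote the delimiters (including the '#' pattern delimiter) so they are matched literally.
        $pattern = '#(' . preg_quote($start, '#') . ')(.*?)(' . preg_quote($end, '#') . ')#si';
        // ${1} and ${3} keep the delimiters; the braces stop a leading digit in $new from being read as part of the group number.
        return preg_replace($pattern, '${1}' . $new . '${3}', $source);
    }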
As you can see, the function takes four arguments. The first two are the delimiters (the starting and the ending one), the third is the replacement string, and the last is the source string that is parsed.
    $data = '<body><div class="wrap"><div class="inside">Lorem ipsum dolor sit amet, consectetur adipiscing elit.</div></div></body>';

    $start = '<div class="inside">';
    $end = '</div>';
    $replace_with = 'PHP: Hypertext Preprocessor';

    $str = replace_content_inside_delimiters($start, $end, $replace_with, $data);

    echo $str;
    // Result: <body><div class="wrap"><div class="inside">PHP: Hypertext Preprocessor</div></div></body>
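One caveat with the function above: `preg_replace` interprets sequences such as `$1` or `\1` inside the replacement string as backreferences, so a replacement like `$15 only` may not come out as written. The following is a minimal sketch of a variant that inserts the replacement literally by using `preg_replace_callback`. The name `replace_content_literal` is my own and purely illustrative.

```php
<?php
// Hypothetical variant: inserts $new as-is, without interpreting "$1" or "\1" inside it.
function replace_content_literal($start, $end, $new, $source) {
    // Group 1 is the starting delimiter, group 2 the ending one.
    $pattern = '#(' . preg_quote($start, '#') . ').*?(' . preg_quote($end, '#') . ')#si';

    return preg_replace_callback($pattern, function ($m) use ($new) {
        // The callback's return value is used verbatim, so $new needs no escaping.
        return $m[1] . $new . $m[2];
    }, $source);
}

$html = '<p>Price: <span class="price">10 EUR</span></p>';
$result = replace_content_literal('<span class="price">', '</span>', '$15 only', $html);

echo $result;
// Result: <p>Price: <span class="price">$15 only</span></p>
```

The trade-off is a small amount of extra code in exchange for not having to think about which characters are special in the replacement string.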
Filter the content found between the delimiters
If you need to modify the text between the delimiters rather than replace it outright, here’s how you can do it:
    $source = '<div class="subtitle">PHP is a widely-used general-purpose scripting language for web development.</div>';

    $start = '<div class="subtitle">';
    $end = '</div>';

    // Run parse_content() on the text captured between the delimiters ($m[2]).
    $data = preg_replace_callback(
        '#(' . preg_quote($start, '#') . ')(.*?)(' . preg_quote($end, '#') . ')#si',
        function ($m) { return $m[1] . parse_content($m[2]) . $m[3]; },
        $source
    );

    function parse_content($content) {
        $words = array('PHP', 'scripting', 'development');

        // Let's bold some words!
        foreach ($words as $word) {
            $content = str_replace($word, '<strong>' . $word . '</strong>', $content);
        }

        return $content;
    }
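If you need this kind of filtering in more than one place, the same idea can be wrapped in a reusable function that accepts any callable as the filter. This is only a sketch: the name `filter_content_inside_delimiters` and its argument order are assumptions of mine, chosen to mirror the replace function from the first section.

```php
<?php
// Hypothetical helper: applies $filter to every block of text found between $start and $end.
function filter_content_inside_delimiters($start, $end, callable $filter, $source) {
    $pattern = '#(' . preg_quote($start, '#') . ')(.*?)(' . preg_quote($end, '#') . ')#si';

    return preg_replace_callback($pattern, function ($m) use ($filter) {
        // $m[1] and $m[3] are the delimiters, $m[2] is the content between them.
        return $m[1] . $filter($m[2]) . $m[3];
    }, $source);
}

$list = '<li>one</li><li>two</li>';
$filtered = filter_content_inside_delimiters('<li>', '</li>', 'strtoupper', $list);

echo $filtered;
// Result: <li>ONE</li><li>TWO</li>
```

Any callable works as the filter here: a built-in function name such as `'strtoupper'`, a closure, or a user-defined function like `parse_content`.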
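How does it work?
The regex uses the s and i pattern modifiers. The former (aka DOTALL) makes the dot metacharacter match newlines too. The latter makes the match case-insensitive. The quantifier in (.*?) is lazy, so the match stops at the first occurrence of the ending delimiter instead of the last one.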
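To see what the modifiers actually change, here is a small experiment. The sample strings are made up for illustration. It also compares the lazy `.*?` used in the pattern with a greedy `.*`.

```php
<?php
$text = "<DIV>first line\nsecond line</DIV>";

// No modifiers: "<div>" does not match "<DIV>", so nothing is found.
$plain = preg_match('#<div>(.*?)</div>#', $text);

// "i" only: the case now matches, but "." still cannot cross the newline.
$only_i = preg_match('#<div>(.*?)</div>#i', $text);

// "s" and "i" together: "." also matches the newline, so the whole block is captured.
$both = preg_match('#<div>(.*?)</div>#si', $text, $m);

var_dump($plain, $only_i, $both); // int(0) int(0) int(1)

// Lazy versus greedy matching when the source contains two blocks.
$two = '<b>one</b> and <b>two</b>';
preg_match('#<b>(.*)</b>#s', $two, $greedy);  // captures "one</b> and <b>two"
preg_match('#<b>(.*?)</b>#s', $two, $lazy);   // captures "one"
```

The lazy form is what lets the functions above handle several delimited blocks in one source string independently.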