I'm new to PHP so I don't know whether this is possible.
I need to add brackets to different timestamps so that this:
<span class="time">2:26</span>
<span class="time">2:51</span>
<span class="time">3:37</span>
<span class="time">1:19</span>
becomes this:
<span class="time">(2:26)</span>
<span class="time">(2:51)</span>
<span class="time">(3:37)</span>
<span class="time">(1:19)</span>
EDIT
The above HTML is generated using a simple DOM parser to grab info from a webpage.
If this is part of a larger HTML string or if the syntax might differ, it's a better idea to use a DOM parser.
However, if that isn't the case you can do this:
$string = str_replace('<span class="time">', '<span class="time">(', $string);
$string = str_replace('</span>', ')</span>', $string);
Or you can use regex:
$string = preg_replace('/<span class=\"time\">(\d+\:\d+)<\/span>/', '<span class="time">($1)</span>', $string);
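For the DOM parser route mentioned above, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath rather than the Simple HTML DOM library from the question. The sample markup, the wrapper <div> and the variable names are assumptions made for the demo.

```php
<?php
// Sample input: two timestamp spans plus an unrelated span (made up for the demo).
$string = '<span class="time">2:26</span><span class="time">2:51</span><span class="note">live</span>';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
// A wrapper <div> makes it easy to get the fragment back out afterwards.
$doc->loadHTML('<div>' . $string . '</div>');

// Only spans whose class attribute is exactly "time" are touched.
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//span[@class="time"]') as $span) {
    $span->nodeValue = '(' . $span->nodeValue . ')';
}

// Serialize only the children of the wrapper, not the <html>/<body> that loadHTML adds.
$result = '';
foreach ($doc->getElementsByTagName('div')->item(0)->childNodes as $child) {
    $result .= $doc->saveHTML($child);
}
echo $result;
```

Unlike the str_replace() version, this leaves other spans in the same string alone.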
Assuming your timestamp string is in the $timestamp variable, you could use concatenation, e.g.:
$output = '(' . $timestamp . ')';
Or, as you mentioned, use a regular expression so that brackets are only added around text that looks like a timestamp:
$output = preg_replace("/(\d+):(\d+)/", "($0)", $timestamp);
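If you want real validation first, one option is to check the whole string with an anchored pattern and only then concatenate. This is just a sketch: bracketTimestamp() is a hypothetical helper, and the accepted formats (m:ss or h:mm:ss) are an assumption.

```php
<?php
// Hypothetical helper: wraps the value in brackets only if it looks like m:ss or h:mm:ss.
function bracketTimestamp(string $timestamp): string
{
    // ^ and \z anchor the pattern so the whole string must be a timestamp.
    if (preg_match('/^\d+(?::\d{2}){1,2}\z/', $timestamp) !== 1) {
        return $timestamp; // leave anything else untouched
    }
    return '(' . $timestamp . ')';
}

echo bracketTimestamp('2:26'), "\n";    // (2:26)
echo bracketTimestamp('1:02:03'), "\n"; // (1:02:03)
echo bracketTimestamp('n/a'), "\n";     // n/a
```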
Related
I have the following regex :
$string = preg_replace("/([\w]+:\/\/[\w-?&;#~=\.\/\#]+[\w\/])/i","<a target=\"_blank\" href=\"$1\">$1</A>",$string);
Using it to parse this string : http://www.ttt.com.ar/hello_world
Produces this new string :
<a target="_blank" href="http://www.ttt.com.ar/hello_world">http://www.ttt.com.ar/hello_world</A>
So far, so good. What I want is for the second $1 in the replacement to be a substring of the match, producing output like:
<a target="_blank" href="http://www.ttt.com.ar/hello_world">http://www.ttt.com.ar/...</A>
Pseudocode of what I mean:
$string = preg_replace("/([\w]+:\/\/[\w-?&;#~=\.\/\#]+[\w\/])/i","<a target=\"_blank\" href=\"$1\">substring($1,0,24)..</A>",$string);
Is this even possible? Probably I'm just doing it all wrong :)
Thanks in advance.
Check out preg_replace_callback():
$string = 'http://www.ttt.com.ar/hello_world';
$string = preg_replace_callback(
// The hyphen is escaped because PHP 7.3+ (PCRE2) rejects "\w-?" as an invalid range.
"/([\w]+:\/\/[\w\-?&;#~=\.\/\#]+[\w\/])/i",
function($matches) {
$link = $matches[1];
$substring = substr($link, 0, 24) . '..';
return "<a target=\"_blank\" href=\"$link\">$substring</a>";
},
$string
);
var_dump($string);
// The link text is now the first 24 characters plus "..":
// <a target="_blank" href="http://www.ttt.com.ar/hello_world">http://www.ttt.com.ar/he..</a>
Note that older PHP versions also allowed the e modifier to execute code in preg_replace(). It was deprecated in PHP 5.5.0 and removed in PHP 7.0.0, so use preg_replace_callback() instead.
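A slightly more defensive variation on the callback approach, as a sketch only: linkify() is a hypothetical helper, it assumes the input is plain text rather than already-escaped HTML, and it adds the ".." only when the URL really was shortened.

```php
<?php
// Hypothetical helper: turns URLs into links and shortens only the visible text.
function linkify(string $text, int $maxLength = 24): string
{
    $result = preg_replace_callback(
        // Same idea as the pattern above, with the hyphen escaped for PCRE2.
        '/\w+:\/\/[\w\-?&;#~=.\/]+[\w\/]/i',
        function (array $matches) use ($maxLength): string {
            $url = $matches[0];
            // Append ".." only if something was actually cut off.
            $label = strlen($url) > $maxLength
                ? substr($url, 0, $maxLength) . '..'
                : $url;
            return '<a target="_blank" href="' . htmlspecialchars($url) . '">'
                . htmlspecialchars($label) . '</a>';
        },
        $text
    );
    return $result ?? $text; // preg_replace_callback() returns null on error
}

$long = linkify('http://www.ttt.com.ar/hello_world');
$short = linkify('http://a.io/x');
echo $long, "\n", $short, "\n";
```

The href always keeps the full URL; only the visible label is cut.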
You can use a capturing group inside of a lookahead like this:
preg_replace(
// Hyphen escaped so the character class also compiles on PHP 7.3+ (PCRE2).
"/((?=(.{24}))[\w]+:\/\/[\w\-?&;#~=\.\/\#]+[\w\/])/i",
"<a target=\"_blank\" href=\"$1\">$2..</A>",
$string);
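This captures the entire URL in group 1 and, via the lookahead, the first 24 characters starting at the same position in group 2. Note that the lookahead needs 24 characters to be available: if fewer than 24 characters remain in the string, the URL will not match at all, and if the URL itself is shorter than 24 characters, group 2 will include whatever text follows it.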
This is bad practice. Regexes should not be used to parse or modify XML/HTML content in application code.
Suggestions:
Use a DOM parser to read and modify the value
Use parse_url() to get the protocol and domain name
Example:
$doc = new DOMDocument();
$doc->loadHTML(
'<a target="_blank" href="http://www.ttt.com.ar/hello_world">http://www.ttt.com.ar/hello_world</A>'
);
$link = $doc->getElementsByTagName('a')->item(0);
$url = parse_url($link->nodeValue);
$link->nodeValue = $url['scheme'] . '://' . $url['host'] . '/...';
echo $doc->saveHTML();
How can I replace a delimited string inside some text so that I get the string without the delimiters (the "pattern")?
For example, I am trying to replace %%some text%% with
<span class="spoiler">some text</span>
preg_replace("'%%[\w\s]+%%'siu",'<span class="spoiler">$0</span>',$description);
This will do what you are looking for:
$description = '%%some text%%';
$fixed_description = preg_replace("~%%([\w\s]+?)%%~siu",'<span class="spoiler">$1</span>',$description);
echo $fixed_description;
Output:
<span class="spoiler">some text</span>
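The difference between the question's attempt and this answer comes down to $0 versus $1: $0 is the whole match, delimiters included, while $1 is only what the parentheses captured. A small sketch (the sample text is made up):

```php
<?php
$description = 'Intro %%some text%% outro';

// $0 is the entire match, so the %% delimiters survive.
$whole = preg_replace('~%%([\w\s]+?)%%~u', '[$0]', $description);

// $1 is only the first capturing group, so the delimiters are dropped.
$group = preg_replace('~%%([\w\s]+?)%%~u', '[$1]', $description);

echo $whole, "\n"; // Intro [%%some text%%] outro
echo $group, "\n"; // Intro [some text] outro
```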
I have a string that contains markers which I need to replace with text from a database. The string itself is stored in a database, and the markers are auto-filled with data from a different part of the database.
$text = '<span data-field="la_lname" data-table="user_properties">
{Listing Agent Last Name}
</span>
<br>RE: The new offer<br>Please find attached....';
If I can find the data marker with:
strpos($text, 'la_lname');
can I use that to select everything in and between the <span> and </span> tags,
so that the new string looks like:
'Sommers<br>RE: The new offer<br>Please find attached....'
I thought I could explode the string on the <span> tags, but that opens up a lot of problems, as I need to keep the text intact and formatted as it is. I just want to insert the data and leave everything else untouched.
To get the text between two parts of a string, for example if you have
<span>SomeText</span>
and you want to get SomeText, I suggest using a function that returns whatever lies between the two delimiters you pass as parameters:
<?php
function getbetween($content,$start,$end) {
$r = explode($start, $content);
if (isset($r[1])){
$r = explode($end, $r[1]);
return $r[0];
}
return '';
}
$text = '<span>SomeText</span>';
$start = '<span>';
$end = '</span>';
$required_text = getbetween($text,$start,$end);
$full_line = $start.$required_text.$end;
$text = str_replace($full_line, 'WHAT TO REPLACE IT WITH HERE',$text);
You could try preg_replace or use a DOM Parser, which is far more useful for navigating HTML-like-structure.
I should add that while regular expressions should work just fine in this example, you may need to do more complex things in the future or traverse more intricate DOM structures for your replacements, so a DOM parser is the way to go in this case.
Using PHP Simple HTML DOM Parser
$html = str_get_html('<span data-field="la_lname" data-table="user_properties">{Listing Agent Last Name}</span><br>RE: The new offer<br>Please find attached....');
// find() with an index returns a single element; the property name is lowercase "innertext".
$html->find('span', 0)->innertext = 'New value of span';
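If you would rather not add a library, roughly the same thing can be sketched with the built-in DOMDocument. Here the whole span is replaced by the value, as the question asks. The $values array stands in for the database lookup and is purely hypothetical, as is the wrapper <div>.

```php
<?php
$text = '<span data-field="la_lname" data-table="user_properties">{Listing Agent Last Name}</span>'
    . '<br>RE: The new offer<br>Please find attached....';

// Hypothetical stand-in for the database lookup, keyed by data-field.
$values = ['la_lname' => 'Sommers'];

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML('<div>' . $text . '</div>'); // wrapper so the fragment can be extracted again

$xpath = new DOMXPath($doc);
foreach (iterator_to_array($xpath->query('//span[@data-field]')) as $span) {
    $field = $span->getAttribute('data-field');
    if (isset($values[$field])) {
        // Swap the whole <span> element for a plain text node.
        $span->parentNode->replaceChild($doc->createTextNode($values[$field]), $span);
    }
}

// Serialize only what is inside the wrapper.
$result = '';
foreach ($doc->getElementsByTagName('div')->item(0)->childNodes as $child) {
    $result .= $doc->saveHTML($child);
}
echo $result;
```

Spans whose data-field has no entry in $values are left as they are.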
How could I replace some markup in this format:
[a href="/my_page" style="font-size: 13px"]click me[/a]
to
<a href="/my_page" style="font-size: 13px">click me</a>
using preg_replace()?
I will need to allow for more attributes as well.
$s = '[a href="/my_page" style="font-size: 13px"]click me[/a]';
$ret = preg_replace('~\[([^\[\]]+)\]([^\[\]]++)\[/([^\[\]]++)\]~', '<\1>\2</\3>', $s);
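The pattern above converts any bracketed tag, including ones you may not want, and does not check that the closing tag matches the opening one. A stricter sketch, assuming a small whitelist of tag names (the list a|b|i is hypothetical) and double-quoted attributes only; attribute values are still not sanitized here:

```php
<?php
$s = '[a href="/my_page" style="font-size: 13px"]click me[/a] and [script]x[/script]';

// Hypothetical whitelist of tag names that may be converted.
$allowed = 'a|b|i';

// Group 1: tag name, group 2: zero or more name="value" attributes, group 3: content.
// \1 forces the closing tag to match the opening one.
$pattern = '~\[(' . $allowed . ')((?:\s+[a-z-]+="[^"\]]*")*)\](.*?)\[/\1\]~is';

$ret = preg_replace($pattern, '<$1$2>$3</$1>', $s);
echo $ret;
```

Tags outside the whitelist, such as [script] in the sample, stay as plain text.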
I have a string that contains some hyperlinks. I want to match only one particular link with a regex. I can't know whether the href or the class comes first; it may vary.
This is an example string:
<div class='wp-pagenavi'>
<span class='pages'>Page 1 of 8</span><span class='current'>1</span>
<a href='http://stv.localhost/channel/political/page/2' class='page'>2</a>
»eee<span class='extend'>...</span><a href='http://stv.localhost/channel/political/page/8' class='last'>lastן »</a>
<a class="cccc">xxx</a>
</div>
I want to select from the above string only the link that has the class nextpostslink.
So, the match in this example should return this -
»eee
This regex is the closest I could get -
/<a\s?(href=)?('|")(.*)('|") class=('|")nextpostslink('|")>.{1,6}<\/a>/
But it is selecting the links from the start of the string.
I think my problem is in the (.*) , but I can't figure out how to change this to select only the needed link.
I would appreciate your help.
It's much better to use a genuine HTML parser for this. Abandon all attempts to use regular expressions on HTML.
Use PHP's DOMDocument instead:
$dom = new DOMDocument;
$dom->loadHTML($yourHTML);
foreach ($dom->getElementsByTagName('a') as $link) {
$classes = explode(' ', $link->getAttribute('class'));
if (in_array('nextpostslink', $classes)) {
// $link has the class "nextpostslink"
}
}
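A complete run of the loop above might look like this. The sample markup is made up (including the extra class and the attribute order) and the $found array is only there to collect results for the demo.

```php
<?php
// Made-up sample: the second link carries the class we are looking for.
$yourHTML = "<div class='wp-pagenavi'>"
    . "<a href='http://stv.localhost/channel/political/page/8' class='last'>last</a>"
    . "<a class='nextpostslink extra' href='http://stv.localhost/channel/political/page/2'>next</a>"
    . "</div>";

libxml_use_internal_errors(true); // real-world HTML often triggers parser warnings
$dom = new DOMDocument;
$dom->loadHTML($yourHTML);

$found = [];
foreach ($dom->getElementsByTagName('a') as $link) {
    // Split on any whitespace so multiple classes are handled.
    $classes = preg_split('/\s+/', trim($link->getAttribute('class')));
    if (in_array('nextpostslink', $classes, true)) {
        $found[] = [
            'href' => $link->getAttribute('href'),
            'text' => $link->textContent,
        ];
    }
}
print_r($found);
```

Attribute order and quote style make no difference here, which is the main advantage over the regex attempts.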
Not sure if that's what you're after, but anyway: it's a bad idea to parse HTML with regex. Use an XPath implementation to reach the desired elements. The following XPath expression gives you all the 'a' elements whose class attribute contains "nextpostslink":
//a[contains(@class,"nextpostslink")]
There is loads of XPath info around. Since you didn't mention your programming language, here is a quick XPath tutorial using Java: http://www.ibm.com/developerworks/library/x-javaxpathapi/index.html
Edit:
php + xpath + html: http://dev.juokaz.com/php/web-scraping-with-php-and-xpath
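In PHP the XPath route could look roughly like this, using the built-in DOMXPath. The sample markup is made up, and the concat()/normalize-space() form is a common idiom for matching a whole class name rather than a substring, so "nextpostslink-old" is not picked up.

```php
<?php
// Made-up sample: one near-miss class and one real match.
$html = "<p><a class='nextpostslink-old' href='/old'>old</a>"
    . "<a href='/page/2' class='page nextpostslink'>next</a></p>";

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Pad the class list with spaces so only the whole word " nextpostslink " matches.
$query = '//a[contains(concat(" ", normalize-space(@class), " "), " nextpostslink ")]';

$hrefs = [];
foreach ($xpath->query($query) as $a) {
    $hrefs[] = $a->getAttribute('href');
}
print_r($hrefs);
```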
This would work in php:
/<a[^>]+href=(\"|')([^\"']*)('|\")[^>]+class=(\"|')[^'\"]*nextpostslink[^'\"]*('|\")[^>]*>(.{1,6})<\/a>/m
This is of course assuming that the class attribute always comes after the href attribute.
This is a code snippet:
$html = <<<EOD
<div class='wp-pagenavi'>
<span class='pages'>Page 1 of 8</span><span class='current'>1</span>
<a href='http://stv.localhost/channel/political/page/2' class='page'>2</a>
»eee<span class='extend'>...</span><a href='http://stv.localhost/channel/political/page/8' class='last'>lastן »</a>
<a class="cccc">xxx</a>
</div>
EOD;
$regexp = "/<a[^>]+href=(\"|')([^\"']*)('|\")[^>]+class=(\"|')[^'\"]*nextpostslink[^'\"]*('|\")[^>]*>(.{1,6})<\/a>/m";
$matches = array();
if(preg_match($regexp, $html, $matches)) {
echo "URL: " . $matches[2] . "\n";
echo "Text: " . $matches[6] . "\n";
}
I would however suggest first matching the link and then getting the url so that the order of the attributes doesn't matter:
<?php
$html = <<<EOD
<div class='wp-pagenavi'>
<span class='pages'>Page 1 of 8</span><span class='current'>1</span>
<a href='http://stv.localhost/channel/political/page/2' class='page'>2</a>
»eee<span class='extend'>...</span><a href='http://stv.localhost/channel/political/page/8' class='last'>lastן »</a>
<a class="cccc">xxx</a>
</div>
EOD;
$regexp = "/(<a[^>]+class=(\"|')[^'\"]*nextpostslink[^'\"]*('|\")[^>]*>(.{1,6})<\/a>)/m";
$matches = array();
if(preg_match($regexp, $html, $matches)) {
$link = $matches[0];
$text = $matches[4];
$regexp = "/href=(\"|')([^'\"]*)(\"|')/";
$matches = array();
// Search only inside the matched link, not the whole document.
if(preg_match($regexp, $link, $matches)) {
$url = $matches[2];
echo "URL: $url\n";
echo "Text: $text\n";
}
}
You could of course extend the regexp to match either variant (class first or href first), but it would be very long and I don't think it would improve performance.
Just as a proof of concept I created a regexp that doesn't care about the order:
/<a[^>]+(href=(\"|')([^\"']*)('|\")[^>]+class=(\"|')[^'\"]*nextpostslink[^'\"]*(\"|')|class=(\"|')[^'\"]*nextpostslink[^'\"]*(\"|')[^>]+href=(\"|')([^\"']*)('|\"))[^>]*>(.{1,6})<\/a>/m
The text will be in group 12 and the URL will be in either group 3 or group 10 depending on the order.
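Another way to ignore attribute order without writing both variants out is to use two lookaheads, one per attribute. This is only a sketch under a few assumptions: the attributes are quoted, no attribute value contains ">", and the sample links are made up.

```php
<?php
// Each lookahead scans the opening tag independently, so attribute order is irrelevant.
// Group 1: href value, group 2: link text.
$pattern = '~<a\b(?=[^>]*\sclass=["\'][^"\']*\bnextpostslink\b)'
    . '(?=[^>]*\shref=["\']([^"\']*)["\'])[^>]*>(.*?)</a>~is';

$samples = [
    "<a href='/page/2' class='nextpostslink'>next</a>", // href first
    '<a class="nextpostslink" href="/page/3">more</a>', // class first
    "<a href='/page/8' class='last'>last</a>",          // other class, no match
];

$results = [];
foreach ($samples as $sample) {
    $results[] = preg_match($pattern, $sample, $m) === 1 ? [$m[1], $m[2]] : null;
}
print_r($results);
```

The URL is always in group 1 and the text in group 2, whichever attribute comes first.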
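Since the question asks for a regex, here is one: <a\s[^>]*class=["|']nextpostslink["|'][^>]*>(.*)<\/a>
The order of the attributes doesn't matter, and it accepts both single and double quotes. (Inside the character class the | is a literal character, so ["'] would be enough, and (.*) is greedy, so use (.*?) if more than one link can follow on the same line.)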
Check the regex online: https://regex101.com/r/DX03KD/1/
I replaced the (.*) with [^'"]+ as follows:
<a\s*(href=)?('|")[^'"]+('|") class=('|")nextpostslink('|")>.{1,6}</a>
Note: I tried this with RegexBuddy, so I didn't need to escape the <, > or / characters.