$cont=htmlspecialchars(file_get_contents("https://myanimelist.net/anime/30276/One_Punch_Man"));
function getBetween($string, $start = "", $end = ""){
if (strpos($string, $start)) { // required if $start not exist in $string
$startCharCount = strpos($string, $start) + strlen($start);
$firstSubStr = substr($string, $startCharCount, strlen($string));
$endCharCount = strpos($firstSubStr, $end);
if ($endCharCount == 0) {
$endCharCount = strlen($firstSubStr);
}
return substr($firstSubStr, 0, $endCharCount);
} else {
return '';
}
}
$name=getBetween($cont,'title',' - MyAnimeList.net');
//$name=preg_replace('/[^a-zA-Z0-9 \p{L}]/m', '', $name);
preg_replace('/(*UTF8)[\>\<]/m', '', $name);
trim($name," ");
//$name=str_replace("gt", "", $name);
echo $name;
i want to find the text between title tags. how to do this?
for example in this page title contains 'One Punch Man - MyAnimeList.net' i want to get that
Just use string replace function:
$string = '<BoomBox>';
$string = str_replace('<', '', $string);
$string = str_replace('>', '', $string);
echo $string; // output: Boombox
http://php.net/manual/en/function.str-replace.php
You edited your answer, and we can now see you are dealing with XML/HTML. It's always better to work with the DOM classes. Never use regex! There is a famous Stack Overflow post explaining why never to parse html with regex. Try this solution instead:
<?php
$dom = new DOMDocument();
$dom->loadHTML('<title>BoomBox</title>');
echo $dom->getElementsByTagName('title')->item(0)->textContent;
http://php.net/manual/en/class.domdocument.php
http://php.net/manual/en/class.domnode.php
See it working here https://3v4l.org/EjPQd
You can use preg_replace();, or strip_tags();.
Example preg_replace();:
$str = '> One Punch Man';
$new = preg_replace('/[^a-zA-Z0-9 \p{L}]/m', '', $str);
echo $new;
Output: One Punch Man
Above example will only allow a-z, A-Z and 0-9. You can expand this.
Example strip_tags();:
$str = '<title> BoomBox </title>';
$another = strip_tags($str);
echo $another;
Output: BoomBox
Documentation:
http://php.net/manual/en/function.preg-replace.php // preg_replace();
http://php.net/manual/en/function.strip-tags.php // strip_tags();
You can also use a single call to str_replace with the ['<','>'] as the search argument:
$string = '<BoomBox>';
echo str_replace(['<', '>'], '', $string) . PHP_EOL;
// => Boombox
Or, you may use a regex with preg_replace (especially, if you plan on adding more restrictions for in-context matching to it):
echo preg_replace('~[<>]~', '', $string);
// => Boombox
See the PHP demo.
Related
I am using a WordPress plugin named Acronyms (https://wordpress.org/plugins/acronyms/). This plugin replaces acronyms with their description. It uses a PHP PREG_REPLACE function.
The issue is that it replaces the acronyms contained in a <pre> tag, which I use to present a source code.
Could you modify this expression so that it won't replace acronyms contained inside <pre> tags (not only directly, but in any moment)? Is it possible?
The PHP code is:
$text = preg_replace(
"|(?!<[^<>]*?)(?<![?.&])\b$acronym\b(?!:)(?![^<>]*?>)|msU"
, "<acronym title=\"$fulltext\">$acronym</acronym>"
, $text
);
You can use a PCRE SKIP/FAIL regex trick (also works in PHP) to tell the regex engine to only match something if it is not inside some delimiters:
(?s)<pre[^<]*>.*?<\/pre>(*SKIP)(*F)|\b$acronym\b
This means: skip all substrings starting with <pre> and ending with </pre>, and only then match $acronym as a whole word.
See demo on regex101.com
Here is a sample PHP demo:
<?php
$acronym = "ASCII";
$fulltext = "American Standard Code for Information Interchange";
$re = "/(?s)<pre[^<]*>.*?<\\/pre>(*SKIP)(*F)|\\b$acronym\\b/";
$str = "<pre>ASCII\nSometext\nMoretext</pre>More text \nASCII\nMore text<pre>More\nlines\nASCII\nlines</pre>";
$subst = "<acronym title=\"$fulltext\">$acronym</acronym>";
$result = preg_replace($re, $subst, $str);
echo $result;
Output:
<pre>ASCII</pre><acronym title="American Standard Code for Information Interchange">ASCII</acronym><pre>ASCII</pre>
It is also possible to use preg_split and keep the code block as a group, only replace the non-code block part then combine it back as a complete string:
function replace($s) {
return str_replace('"', '"', $s); // do something with `$s`
}
$text = 'Your text goes here...';
$parts = preg_split('#(<\/?[-:\w]+(?:\s[^<>]+?)?>)#', $text, null, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
$text = "";
$x = 0;
foreach ($parts as $v) {
if (trim($v) === "") {
$text .= $v;
continue;
}
if ($v[0] === '<' && substr($v, -1) === '>') {
if (preg_match('#^<(\/)?(?:code|pre)(?:\s[^<>]+?)?>$#', $v, $m)) {
$x = isset($m[1]) && $m[1] === '/' ? 0 : 1;
}
$text .= $v; // this is a HTML tag…
} else {
$text .= !$x ? replace($v) : $v; // process or skip…
}
}
return $text;
Taken from here.
I have make a try like this:
$string = "localhost/product/-/123456-Ebook-Guitar";
echo $string = substr($string, 0, strpos(strrev($string), "-/(0-9+)")-13);
and the output work :
localhost/product/-/123456 cause this just for above link with 13 character after /-/123456
How to remove all? i try
$string = "localhost/product/-/123456-Ebook-Guitar";
echo $string = substr($string, 0, strpos(strrev($string), "-/(0-9+)")-(.*));
not work and error sintax.
and i try
$string = "localhost/product/-/123456-Ebook-Guitar";
echo $string = substr($string, 0, strpos(strrev($string), "-/(0-9+)")-999);
the output is empty..
Assume there are no number after localhost/product/-/123456, then I will just trim it with below
$string = "localhost/product/-/123456-Ebook-Guitar";
echo rtrim($string, "a..zA..Z-"); // localhost/product/-/123456
Another non-regex version, but require 5.3.0+
$str = "localhost/product/-/123456-Ebook-Guitar-1-pdf/";
echo dirname($str) . "/" . strstr(basename($str), "-", true); //localhost/product/-/123456
Heres a more flexibility way but involve in regex
$string = "localhost/product/-/123456-Ebook-Guitar";
echo preg_replace("/^([^?]*-\/\d+)([^?]*)/", "$1", $string);
// localhost/product/-/123456
$string = "localhost/product/-/123456-Ebook-Guitar-1-pdf/";
echo preg_replace("/^([^?]*-\/\d+)([^?]*)/", "$1", $string);
// localhost/product/-/123456
This should match capture everything up to the number and remove everything afterward
regex101: localhost/product/-/123456-Ebook-Guitar
regex101: localhost/product/-/123456-Ebook-Guitar-1-pdf/
Not a one-liner, but this will do the trick:
$string = "localhost/product/-/123456-Ebook-Guitar";
// explode by "/"
$array1 = explode('/', $string);
// take the last element
$last = array_pop($array1);
// explode by "-"
$array2 = explode('-', $last);
// and finally, concatenate only what we want
$result = implode('/', $array1) . '/' . $array2[0];
// $result ---> "localhost/product/-/123456"
I have a webpage which includes a hyperlink as follows:
$name = "Hello World";
echo "<a href='page.php?name='". preg_replace(" ", "_", $name) "'> bla bla </a>"
This generates the following link successfully:
...page.php?name=Hello_World
in my page.php I try to reverse the operation:
if($_SERVER[REQUEST_METHOD] == "GET"){
$name = $_GET['name'];
//testing if the problem is with GET
echo $name;
//now here's the problem:
$string = preg_replace("_", " ", $name);
echo $string;
}
the $name echoes correctly but the $string is always null
I've tried all possible combinations like ~~ and // and [_] and \s and using $_GET directly like:
preg_replace("_", " ", $_GET['name']);
none of them worked.
This problem has burned most of my day.
Any help is appreciated.
preg_replace accepts a regular expression as it's first arguments. Neither " " nor "_" are valid regular expressions.
In this case you can use str_replace.
The preg_replace syntax is incorrect as pointed by #Halcyon, the following is correct:
$string = preg_replace('/_/', ' ', $name);
But for such a simple find/replace you can use str_replace instead:
$string = str_replace("_", " ", $name);
echo $string;
you can create a function like sefurl
function makesefurl($url=''){
$s=trim($url);
$tr = array('ş','Ş','ı','I','İ','ğ','Ğ','ü','Ü','ö','Ö','Ç','ç','(',')','/',':',',');
$eng = array('s','s','i','i','i','g','g','u','u','o','o','c','c','','','-','-','');
$s = str_replace($tr,$eng,$s);
$s = preg_replace('/&amp;amp;amp;amp;amp;amp;amp;.+?;/', '', $s);
$s = preg_replace('/\s+/', '-', $s);
$s = preg_replace('|-+|', '-', $s);
$s = preg_replace('/#/', '', $s);
$s = str_replace('.', '.', $s);
$s = trim($s, '-');
$s = htmlspecialchars(strip_tags(urldecode(addslashes(stripslashes(stripslashes(trim(htmlspecialchars_decode($s))))))));
}
echo "<a href='page.php?name='". makesefurl($name) "'> bla bla </a>";
and than you can convert it to before makesefurl function all you need to create another functin like decoder or encoder html command
I am using a WordPress plugin named Acronyms (https://wordpress.org/plugins/acronyms/). This plugin replaces acronyms with their description. It uses a PHP PREG_REPLACE function.
The issue is that it replaces the acronyms contained in a <pre> tag, which I use to present a source code.
Could you modify this expression so that it won't replace acronyms contained inside <pre> tags (not only directly, but in any moment)? Is it possible?
The PHP code is:
$text = preg_replace(
"|(?!<[^<>]*?)(?<![?.&])\b$acronym\b(?!:)(?![^<>]*?>)|msU"
, "<acronym title=\"$fulltext\">$acronym</acronym>"
, $text
);
You can use a PCRE SKIP/FAIL regex trick (also works in PHP) to tell the regex engine to only match something if it is not inside some delimiters:
(?s)<pre[^<]*>.*?<\/pre>(*SKIP)(*F)|\b$acronym\b
This means: skip all substrings starting with <pre> and ending with </pre>, and only then match $acronym as a whole word.
See demo on regex101.com
Here is a sample PHP demo:
<?php
$acronym = "ASCII";
$fulltext = "American Standard Code for Information Interchange";
$re = "/(?s)<pre[^<]*>.*?<\\/pre>(*SKIP)(*F)|\\b$acronym\\b/";
$str = "<pre>ASCII\nSometext\nMoretext</pre>More text \nASCII\nMore text<pre>More\nlines\nASCII\nlines</pre>";
$subst = "<acronym title=\"$fulltext\">$acronym</acronym>";
$result = preg_replace($re, $subst, $str);
echo $result;
Output:
<pre>ASCII</pre><acronym title="American Standard Code for Information Interchange">ASCII</acronym><pre>ASCII</pre>
It is also possible to use preg_split and keep the code block as a group, only replace the non-code block part then combine it back as a complete string:
function replace($s) {
return str_replace('"', '"', $s); // do something with `$s`
}
$text = 'Your text goes here...';
$parts = preg_split('#(<\/?[-:\w]+(?:\s[^<>]+?)?>)#', $text, null, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
$text = "";
$x = 0;
foreach ($parts as $v) {
if (trim($v) === "") {
$text .= $v;
continue;
}
if ($v[0] === '<' && substr($v, -1) === '>') {
if (preg_match('#^<(\/)?(?:code|pre)(?:\s[^<>]+?)?>$#', $v, $m)) {
$x = isset($m[1]) && $m[1] === '/' ? 0 : 1;
}
$text .= $v; // this is a HTML tag…
} else {
$text .= !$x ? replace($v) : $v; // process or skip…
}
}
return $text;
Taken from here.
I am using regular expression for getting multiple patterns from a given string.
Here, I will explain you clearly.
$string = "about us";
$newtag = preg_replace("/ /", "_", $string);
print_r($newtag);
The above is my code.
Here, i am finding the space in a word and replacing the space with the special character what ever i need, right??
Now, I need a regular expression that gives me patterns like
about_us, about-us, aboutus as output if i give about us as input.
Is this possible to do.
Please help me in that.
Thanks in advance!
And finally, my answer is
$string = "contact_us";
$a = array('-','_',' ');
foreach($a as $b){
if(strpos($string,$b)){
$separators = array('-','_','',' ');
$outputs = array();
foreach ($separators as $sep) {
$outputs[] = preg_replace("/".$b."/", $sep, $string);
}
print_r($outputs);
}
}
exit;
You need to do a loop to handle multiple possible outputs :
$separators = array('-','_','');
$string = "about us";
$outputs = array();
foreach ($separators as $sep) {
$outputs[] = preg_replace("/ /", $sep, $string);
}
print_r($outputs);
You can try without regex:
$string = 'about us';
$specialChar = '-'; // or any other
$newtag = implode($specialChar, explode(' ', $string));
If you put special characters into an array:
$specialChars = array('_', '-', '');
$newtags = array();
foreach ($specialChars as $specialChar) {
$newtags[] = implode($specialChar, explode(' ', $string));
}
Also you can use just str_replace()
foreach ($specialChars as $specialChar) {
$newtags[] = str_replace(' ', $specialChar, $string);
}
Not knowing exactly what you want to do I expect that you might want to replace any occurrence of a non-word (1 or more times) with a single dash.
e.g.
preg_replace('/\W+/', '-', $string);
If you just want to replace the space, use \s
<?php
$string = "about us";
$replacewith = "_";
$newtag = preg_replace("/\s/", $replacewith, $string);
print_r($newtag);
?>
I am not sure that regexes are the good tool for that. However you can simply define this kind of function:
function rep($str) {
return array( strtr($str, ' ', '_'),
strtr($str, ' ', '-'),
str_replace(' ', '', $str) );
}
$result = rep('about us');
print_r($result);
Matches any character that is not a word character
$string = "about us";
$newtag = preg_replace("/(\W)/g", "_", $string);
print_r($newtag);
in case its just that... you would get problems if it's a longer string :)