Get substring between two strings PHP - Reading HTML - php

I am having a ton of trouble finding a string between two strings.
This is the code I currently have:
<?
$html = file_get_contents('mywebsite');
$tags = explode('<',$html);
foreach ($tags as $tag)
{
// skip scripts
if (strpos($tag,'script') !== FALSE) continue;
// get text
$text = strip_tags('<'.$tag);
// only if text present remember
if (trim($text) != '') $texts[] = $text;
//print_r($text);
echo($text);
}
function get_string_between($string, $start, $end){
$string = " ".$string;
$ini = strpos($string,$start);
if ($ini == 0) return "";
$ini += strlen($start);
$len = strpos($string,$end,$ini) - $ini;
return substr($string,$ini,$len);
}
$fullstring = $text;
$parsed = get_string_between($fullstring, "tag1", "tag2");
print_r($parsed);
echo ($parsed);
?>
I think the problem happens on this line:
$fullstring = $text;
I am not entirely sure whether $text holds the stripped-down HTML from the loop above. When I run this code I get the stripped-out web page as I expect, but I get nothing between the tags I am setting.
Does anyone know why this might be happening or what I am missing?

I think it's because $text is reassigned on every pass through the foreach loop, so when you assign $text to $fullstring after the loop it holds only the text from the last tag, not the whole page. I don't fully understand what you are trying to do, but try this and see if it works:
$fullstring = "";
foreach ($tags as $tag){
#your code as usual
echo($text);
$fullstring = $fullstring.$text;
}
Then delete the $fullstring = $text; line.

You can use this:
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$fullstring = 'this is my [tag]dog[/tag]';
$parsed = get_string_between($fullstring, '[tag]', '[/tag]');
echo $parsed; // (result = dog)
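One caveat with the function above: when $end is not found, strpos() returns false and the computed length goes negative, so the result can be surprising. A minimal sketch of a stricter variant follows; the name between_or_null and the choice to return null on a missing delimiter are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: returns the text between $start and $end,
// or null when either delimiter cannot be found.
function between_or_null(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null; // start delimiter missing
    }
    $from += strlen($start);

    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null; // end delimiter missing
    }

    return substr($haystack, $from, $to - $from);
}

var_dump(between_or_null('this is my [tag]dog[/tag]', '[tag]', '[/tag]'));
var_dump(between_or_null('[tag]dog', '[tag]', '[/tag]'));
```

Because the comparisons use === false, a start delimiter at position 0 is handled without the leading-space trick.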

Related

PHP How to Replace string with empty space? [duplicate]

Input:
GUJARAT (24)
Expected Output:
24
String Format:
State Name (State Code)
How can I remove State Code with parentheses?
You can also use php explode:
$state = "GUJARAT (24)";
$output = explode( "(", $state );
echo trim( $output[0] ); // GUJARAT
$str = "GUJARAT (24)";
echo '<br />1.';
print_r(sscanf($str, "%s (%d)"));
echo '<br />2.';
print_r(preg_split('/[\(]+/', rtrim($str, ')')));
echo '<br />3.';
echo substr($str, strpos($str, '(')+1, strpos($str, ')')-strpos($str, '(')-1);
echo '<br />4.';
echo strrev(strstr(strrev(strstr($str, ')', true)), '(', true));
echo '<br />5.';
echo preg_replace('/(\w+)\s+\((\d+)\)/', "$2", $str);
If you know the stateCode length and it is fixed:
$state = "GUJARAT (24)";
$state = substr($state, strpos($state, "(") + 1, 2);
// $state now holds the two characters that follow "("
echo $state;
You can read more about substr and strpos in the PHP manual.
If the stateCode length is not fixed, you can use a regular expression:
$state = 'GUJARAT (124)';
preg_match('/(\d+)/', $state, $res);
$stateCode = $res[0];
var_dump($stateCode);
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$fullstring = 'this is my [tag]dog[/tag]';
$parsed = get_string_between($fullstring, '[tag]', '[/tag]');
echo $parsed; // (result = dog)
You can use the function preg_match:
$state = 'GUJARAT (24)';
$result = [];
preg_match('/[\(][\d]+[\)]/', $state, $result);
$stateCode = $result[0]; // (24)
$stateCode = substr($stateCode, 1, strlen($stateCode) - 2); // strip the parentheses, leaving 24
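The substr() step above can be avoided by letting the pattern capture only the digits. A small sketch, assuming the state code is always a run of digits inside one pair of parentheses; the function name state_code is hypothetical.

```php
<?php
// Hypothetical helper: returns the digits inside the first "(...)",
// or null when the string has no parenthesised number.
function state_code(string $state): ?string
{
    // Group 1 captures just the digits, without the parentheses.
    if (preg_match('/\((\d+)\)/', $state, $m) === 1) {
        return $m[1];
    }
    return null;
}

echo state_code('GUJARAT (24)'), PHP_EOL;
```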

Processing text in PHP finding a matching character

How can I process text that contains placeholder codes?
Suppose I have the text below:
Hello {::first_name::} {::last_name::},
How are you?
Your organisation is {::organisation::}
Any text between {:: and ::} should be evaluated to get its value.
I tried exploding the text into an array using a space as the delimiter, then checking each item for "{::". When it is found, I take the string between "{::" and "::}" and query the database for that field's value.
So basically these will be DB fields.
Below is the code I have tried:
$msg = "Hello {::first_name::} {::last_name::},
How are you?
Your organisation is {::organisation::}";
$msg_array = explode(" ", $msg);
foreach ($msg_array as $str) {
if (strpos($str, "{::") !== false) {
$field_str = get_string_between($str, "{::", "::}");
$field_value = $bean->$field_str; //Logic that gets the value of the field
$msgStr .= $field_value . " ";
} else {
$msgStr .= $str . " ";
}
}
function get_string_between($string, $start, $end)
{
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
Your script seems fine.
If you are looking for an alternative, you can try preg_match_all() together with str_replace(array, array, source):
<?php
$bean = new stdClass();
$bean->first_name = 'John';
$bean->last_name = 'Doe';
$bean->organisation = 'PHP Company';
$string = "Hello {::first_name::} {::last_name::}, How are you? Your organisation is {::organisation::}";
// find all placeholders
preg_match_all('/{::(.+?)::}/i', $string, $matches);
$placeholders = $matches[0];
//strings inside placeholders
$parts = $matches[1];
// return values from $bean by matching object property with strings inside placeholders
$replacements = array_map(function($value) use ($bean) {
// use trim() to remove unexpected space
return $bean->{trim($value)};
}, $parts);
echo $newstring = str_replace($placeholders, $replacements, $string);
Short format:
$string = "Hello {::first_name::} {::last_name::}, How are you? Your organisation is {::organisation::}";
preg_match_all('/{::(.+?)::}/i', $string, $matches);
$replacements = array_map(function($value) use ($bean) {
return $bean->{trim($value)};
}, $matches[1]);
echo str_replace($matches[0], $replacements, $string);
And if you prefer to use a function:
function holder_replace($string, $source = null) {
if (is_object($source)) {
preg_match_all('/{::(.+?)::}/i', $string, $matches);
$replacements = array_map(function($value) use ($source) {
return property_exists($source, trim($value)) ? $source->{trim($value)} : $value;
}, $matches[1]);
return str_replace($matches[0], $replacements, $string);
}
return $string;
};
echo holder_replace($string, $bean);
OUTPUT:
Hello John Doe, How are you? Your organisation is PHP Company
Or you can simply use str_replace function:
$data = "{:: string ::}";
echo str_replace("::}", "",str_replace("{::", "", $data));
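The placeholder substitution discussed in this thread can also be done in a single pass with preg_replace_callback(). The sketch below is only an illustration: the function name fill_placeholders is hypothetical, and a plain array stands in for the $bean object from the question. Unknown placeholders are left untouched rather than replaced.

```php
<?php
// Hypothetical helper: replaces each {::name::} with $values['name'].
function fill_placeholders(string $template, array $values): string
{
    return preg_replace_callback(
        '/\{::\s*(\w+)\s*::\}/',
        function (array $m) use ($values): string {
            // $m[0] is the whole placeholder, $m[1] the field name.
            return array_key_exists($m[1], $values) ? (string) $values[$m[1]] : $m[0];
        },
        $template
    );
}

$values = [
    'first_name'   => 'John',
    'last_name'    => 'Doe',
    'organisation' => 'PHP Company',
];
echo fill_placeholders(
    'Hello {::first_name::} {::last_name::}, your organisation is {::organisation::}',
    $values
), PHP_EOL;
```

Unlike splitting on spaces, this does not depend on where whitespace or punctuation falls around a placeholder.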

php listing specific file extension only

I want my code to list files with a specific extension only. I have a set of PHP files and I use this code to list all of them on a page.
What I need is to list one specific extension only: PHP files.
Right now my code lists every file in the same folder:
<?php
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$files1 = scandir(dirname(__FILE__));
?>
<?php
foreach($files1 as $myfile){
if($myfile!='.' && $myfile!='..' && $myfile!='index.php'){
$value = file_get_contents($myfile);
$fullstring = $value;
$parsed = get_string_between($fullstring, '$Cont1 = \'', '\'');
$parsed1 = get_string_between($fullstring, '$Con2 = \'', '\'');
?>
<?php
}
}
?>
Can someone please post an answer that edits this and shows me how to make it list PHP files only?
Hope this helps. It will only read files whose names end with php:
<?php
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$files1 = scandir(dirname(__FILE__));
?>
<?php
foreach($files1 as $myfile){
if (substr($myfile, -3) != "php")
continue;
if($myfile!='.' && $myfile!='..' && $myfile!='index.php'){
$value = file_get_contents($myfile);
$fullstring = $value;
$parsed = get_string_between($fullstring, '$Cont1 = \'', '\'');
$parsed1 = get_string_between($fullstring, '$Con2 = \'', '\'');
?>
<?php
}
}
?>
glob() is pretty much made to do this exactly, and it avoids regex in a loop, substringing on an arbitrary number of chars, etc.
foreach (glob("*.php") as $script) {
echo "$script size " . filesize($script) . PHP_EOL;
}
Something like this (untested):
$Dir = new DirectoryIterator($dir);
$iterator = new RegexIterator($Dir, '/\.php$/i', RegexIterator::MATCH);
foreach($iterator as $fileinfo) {
if ($fileinfo->isDot()) continue;
var_dump( $fileinfo );
}
Or even better
$Dir = new FilesystemIterator(__DIR__, FilesystemIterator::SKIP_DOTS | FilesystemIterator::UNIX_PATHS | FilesystemIterator::KEY_AS_PATHNAME );
$iterator = new RegexIterator($Dir, '/\.php$/i', RegexIterator::MATCH);
foreach($iterator as $fileinfo) {
if ($fileinfo->isDot()) continue;
var_dump( $fileinfo );
}
This second one lets you skip the dot entries ('.', '..') and automatically converts Windows-style \ separators to Unix-style / (mainly useful on Windows).
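If you would rather keep scandir(), comparing the real extension with pathinfo() avoids matching names that merely end in the letters "php". This is a rough sketch; the function name files_with_extension is hypothetical, and the case-insensitive comparison is an assumption about what is wanted.

```php
<?php
// Hypothetical helper: names of the regular files in $dir whose
// extension equals $ext (compared case-insensitively).
function files_with_extension(string $dir, string $ext): array
{
    $matches = [];
    foreach (scandir($dir) as $name) {
        $path = $dir . DIRECTORY_SEPARATOR . $name;
        // is_file() also filters out '.' and '..'.
        if (is_file($path)
            && strtolower(pathinfo($name, PATHINFO_EXTENSION)) === strtolower($ext)) {
            $matches[] = $name;
        }
    }
    return $matches;
}

print_r(files_with_extension(__DIR__, 'php'));
```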

echo hyperlinks in array

I have an array that looks like this:
[6625] => Trump class="mediatype"> href="/news/picture">Slideshow: [6628] => href="http://www.example.com/news/picture/god=USRTX1N84J">GOP [6630] => nation
I need to be able to pull out anything within href="" of the array and put into a new one.
I have tried:
<?php
$homepage = file_get_contents('http://www.example.com/');
$arr = explode(" ",$homepage);
function getStringInBetween($string, $start, $end){
$string = " " . $string;
$initial = strpos($string, $start);
if ($initial == 0) return "";
$initial += strlen($start);
$length = strpos($string, $end, $initial) - $initial;
return substr($string, $initial, $length);
}
echo getStringInBetween($arr[0], 'href="', '"')
?>
Try this code and adapt it to suit your needs:
<?php
$homepage = file_get_contents('http://www.example.com/');
$arr = explode(" ",$homepage);
function getStringInBetween($string, $start, $end){
$string = " " . $string;
$initial = strpos($string, $start);
if ($initial == 0) return "";
$initial += strlen($start);
$length = strpos($string, $end, $initial) - $initial;
return substr($string, $initial, $length);
}
foreach ($arr as $val) {
if (strpos($val, 'href') !== false) {
echo getStringInBetween($val, 'href="', '"');
}
}
?>
When I ran this example, it output google.com/hello.
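To put every href value into a new array, as the question asks, one option is a single preg_match_all() call over the whole page instead of splitting on spaces. This is a sketch that assumes double-quoted href attributes; the function name extract_hrefs is hypothetical, and a regex like this is fragile on arbitrary HTML.

```php
<?php
// Hypothetical helper: returns every double-quoted href value in $html.
function extract_hrefs(string $html): array
{
    // Group 1 captures the text between href=" and the next ".
    preg_match_all('/href="([^"]*)"/i', $html, $matches);
    return $matches[1];
}

$sample = '<a href="/news/picture">Slideshow</a> <a href="http://www.example.com/news">GOP</a>';
print_r(extract_hrefs($sample));
```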

php substring occurances between two strings in an html file

I have an HTML file as a source. It contains several instances of the following code:
<span itemprop="name">NAME</span>
where the NAME part is always something different.
How can I write PHP code that goes through the HTML, extracts all the names between "<span itemprop="name">" and "</span>", and puts them in an array?
I have tried this code, but it doesn't work:
$prev=$html;
for($i=0; $i<10; $i++){
$current = explode('<span itemprop="name">', $prev);
$cur = explode('</span>', $current[1]);
$names[] = $cur[0];
$prev = $current[2];
}
print_r($names);
A better way would probably be to use PHP's DOMDocument, Simple HTML DOM, or any other DOM library rather than the approach you planned.
Here is an example of working DOMDocument code:
$doc = new DOMDocument();
$doc->loadHTML('<html><body><span itemprop="name">1</span><span itemprop="name">2</span><span itemprop="name">3</span></body></html>');
$finder = new DomXPath($doc);
$nodes = $finder->query("//*[contains(@itemprop, 'name')]");
foreach($nodes as $node)
{
echo $node->nodeValue . '<br />';
}
Outputs:
1
2
3
I kind of feel bad for saying this, but you could use a regular expression:
preg_match_all('/<span itemprop="name">(.*?)<\/span>/i', $html, $matches);
var_dump($matches); // the captured names are in $matches[1]
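Wrapped in a function, the regular-expression approach returns the array of names directly. A minimal sketch, assuming the span markup is exactly as shown in the question; the function name span_names is hypothetical, and the s flag is added so that a name spanning several lines still matches.

```php
<?php
// Hypothetical helper: collects the text of every
// <span itemprop="name">...</span> into an array.
function span_names(string $html): array
{
    preg_match_all('/<span itemprop="name">(.*?)<\/span>/is', $html, $matches);
    // $matches[1] holds only the captured text, one entry per span.
    return array_map('trim', $matches[1]);
}

$html = '<span itemprop="name">Alice</span><p>x</p><span itemprop="name"> Bob </span>';
print_r(span_names($html));
```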
This function will get us the "NAME":
function getbetween($content,$start,$end) {
$r = explode($start, $content);
if (isset($r[1])){
$r = explode($end, $r[1]);
return $r[0];
}
return '';
}
This function will replace only the first occurrence:
<?php
function str_replace_once($search, $replace, $subject) {
$firstChar = strpos($subject, $search);
if($firstChar !== false) {
$beforeStr = substr($subject,0,$firstChar);
$afterStr = substr($subject, $firstChar + strlen($search));
return $beforeStr.$replace.$afterStr;
} else {
return $subject;
}
}
?>
Now the loop:
$start = '<span itemprop="name">';
$end = '</span>';
while (strpos($content, $start) !== false) { // strict check so a match at position 0 still counts
$name = getbetween($content, $start, $end);
$content = str_replace_once($start.$name.$end, '',$content);
echo $name.'<br>';
}
Use this function:
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$fullstring = 'this is my [tag]dog[/tag]';
$parsed = get_string_between($fullstring, '[tag]', '[/tag]');
echo $parsed; // (result = dog)
