using str_replace with file_get_contents

I am using file_get_contents to grab a page, but I need to replace some data in the contents of the page before echoing it.
I have this so far (this script runs on domain2.com):
<?php
$page = file_get_contents('http://domain.com/page.html');
str_replace('href="/','href="http://domain.com','$page');
echo $page;
?>
The problem is that when the page displays, some links on the domain.com page read:
<a href=/about.html>
When I output the page from my script, these links end up resolving against the wrong domain. I tried using str_replace to look for
href="/
and replace it with
href="http://www.domain.com/
But it's not working. Any clues?

Fixed it:
$pagefixed = str_replace("href=\"/","href=\"http://www.domain.com/","$page");
Thanks all

You'll either need to use a regular expression (preg_replace) or 2 str_replaces since quotes vary.
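As a rough illustration of the regular-expression route, the sketch below handles double-quoted, single-quoted, and unquoted href values in one pass. The function name absolutize_hrefs is hypothetical, and the base URL is simply the one from the question; treat it as a starting point rather than a complete HTML rewriter.

```php
<?php
// Hypothetical helper (not from the thread): prefix root-relative hrefs
// with the base URL used in the question.
function absolutize_hrefs(string $html): string
{
    // (["']?)  optional opening quote, captured so it can be put back
    // /(?!/)   a single leading slash; "//host/..." links are left alone
    $pattern = '~href=(["\']?)/(?!/)~i';
    return preg_replace($pattern, 'href=${1}http://www.domain.com/', $html);
}

// Works for the unquoted link from the question and for quoted links.
$unquoted = absolutize_hrefs('<a href=/about.html>');
$quoted   = absolutize_hrefs('<a href="/about.html">');
```

Unlike the snippet in the question, the return value is assigned; str_replace and preg_replace both return the modified string and leave their input untouched.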

Related

str_replace infected code with a wildcard

I have a website that has been infected with malware code. Here's an example:
<?php if(!isset($GLOBALS["\x61\156\x75\156\x61"])) { $ua=strto...algiyujsz-1; ?>
It's a rather large block of code, and I'm guessing str_replace is not working because of some of the characters in it. How would I go about replacing a string like the above via preg_replace (the ... being a wildcard)? I'm rather bad at regex and can't get it working. Or is there some way to get this working via str_replace, so I have a point of reference for the future?
Full code here: http://pastie.org/10084259
Thank you!

Solved my issue with '/<\?php if\(!isset\(\$GL(.+?)z-1; \?>/is' and preg_replace
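For reference, here is a minimal sketch of how that pattern might be applied. The function name strip_injected_block and the sample payload are made up for illustration; the real injected code is elided in the question and is not reproduced here.

```php
<?php
// Hypothetical wrapper around the pattern from the post above.
function strip_injected_block(string $source): string
{
    // (.+?) is lazy, so the match ends at the first "z-1;" plus closing tag.
    // The "s" flag lets "." cross newlines; "i" makes it case-insensitive.
    $pattern = '/<\?php if\(!isset\(\$GL(.+?)z-1; \?>/is';
    return preg_replace($pattern, '', $source);
}

// Stand-in payload with the same start and end as the real one (assumption).
$infected = '<?php if(!isset($GLOBALS["demo"])) { $placeholder = 1; $xyz-1; ?>'
          . "\n" . '<?php echo "hello"; ?>';
$clean = strip_injected_block($infected);
```

Because the lazy group stops at the first terminator, legitimate code after the injected block is left in place.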

preg_replace "%2520" with a space

I have a blog page on my website where a user edits a post by going to a URL like this: http://www.example.com/blog?edit=blog post here. The script used to replace the spaces with %20 like it should, but now it is replacing the spaces with %2520, and the script can't search the database because there is no post called blog20post20here. I was going to go down the path of preg_replace, so I tried this:
preg_replace("/%2520/"," ",$_GET['edit']);
But that didn't seem to work.
I have never used preg_replace() and have only just read up on it in the manual. If someone could point me down the right path or show me how to use preg_replace correctly, that would be awesome.

Sounds like you're double-escaping somewhere when generating the URLs. %25 is the encoding for the % character, so an already encoded %20 is being encoded a second time and turning into %2520.
As an aside, there are better ways to decode that URL (urldecode() for example), so perhaps preg_replace isn't really necessary...
EDIT: oh, and you should just use urlencode to generate the URL in the first place.
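The double-encoding effect is easy to reproduce. The sketch below uses rawurlencode rather than urlencode only because urlencode turns spaces into "+" instead of %20; the variable names are illustrative.

```php
<?php
$title = 'blog post here';

$once  = rawurlencode($title); // "blog%20post%20here"
$twice = rawurlencode($once);  // "blog%2520post%2520here" (the "%" became "%25")

// Decoding has to be applied as many times as encoding was.
$decodedOnce  = rawurldecode($twice);       // "blog%20post%20here"
$decodedTwice = rawurldecode($decodedOnce); // "blog post here"
```

Note that PHP already decodes the values in $_GET once, so the cleaner fix is usually to find and remove the second encoding step rather than to decode twice on the receiving side.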

For %2520
<?php echo urldecode(urldecode($_GET['edit'])); ?>
For %20
<?php echo urldecode($_GET['edit']); ?>

"catching" links in regex using php ignoring inline js

I'm stuck trying to write a regex in PHP that catches a link and its content from an HTML page (which I have no control over) and replaces it with a link of mine.
For example:
<a style="position:absolute;more_styles:more;" href="http://www.google.co.il/" class="something">This is the content</a>
Becomes:
<a style="position:absolute;more_styles:more;" href="my_function('http://www.google.co.il/')" class="something">This is the content</a>
This is the regex that I wrote:
$content = preg_replace('|<a(.*?)href=[\"\'](.*?)[\"\'][^>]*>(.*?)</a>|i','$3',$content);
This works well with all the links except links like:
<a href="http://google.co.il" onclick="if(MSIE_VER()>=4){this.style.behavior='url(#default#homepage)';this.setHomePage('http://www.google.co.il')}" class='brightgrey rightbar' style='font-size:12px'><b>Make me the home page!</b></a>
Obviously, the regexp stops at "MSIE_VER()>" because of the "[^>]*" part, and I get the wrong content when I use "$3".
I have tried almost every option to make this work, but no luck.
Any thoughts?
Thank you all in advance.

First of all, your code does something different from adding my_function: it replaces the entire link with the third capture group ($3), which is the link's content. There are several ways to achieve your stated goal (i.e. wrapping every href in my_function); the most pragmatic would be:
$content = preg_replace('|href=[\"\'](.*?)[\"\']|i',"href=\"my_function('$1')\"",$content);
If you need a more careful approach, then I would use
$content = preg_replace('|(<a.*?)href=[\"\'](.*?)[\"\'](.*?</a>)|i',"$1href=\"my_function('$2')\"$3",$content);
Last but not least, if what you actually need is to remove the tag rather than what you have described, let me know; there are a million ways to do it.
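A variation on the first pattern, sketched here as an assumption rather than as the answerer's code, uses a backreference so that the closing quote must match the opening one. The helper name wrap_hrefs is hypothetical; my_function is the name from the question.

```php
<?php
// Hypothetical helper: wrap every quoted href value in my_function('...').
function wrap_hrefs(string $html): string
{
    // (["'])  opening quote, captured as group 1
    // (.*?)   the URL, captured lazily as group 2
    // \1      the same quote character that opened the value
    $pattern = '/href=(["\'])(.*?)\1/i';
    return preg_replace($pattern, 'href="my_function(\'${2}\')"', $html);
}

// A link with inline JavaScript containing ">" is handled, because the
// pattern never looks past the href attribute itself.
$link = '<a href="http://google.co.il" onclick="if(MSIE_VER()>=4){go()}"><b>Hi</b></a>';
$out  = wrap_hrefs($link);
```

Matching only the attribute sidesteps the "[^>]*" problem from the question, at the cost of also rewriting href attributes on tags other than a.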

By default, .* will take everything it can; e.g. it takes the onclick argument, because the regex is still valid. Replace "." with [^\"], which tells the regexp to take everything except " (which cannot be in a URL):
$content = preg_replace('|<a(.*?)href=[\"\']([^"]*?)[\"\'][^>]*>(.*?)</a>|i','$3',$content);

PHP remove page name Regex - preg_replace

I have URLs like these (several similar ones):
images/image1/image1.jpg
images/images1/images2/image2.jpg
images/images2/images3/images4/image4.jpg
I have this regex, but I want it to strip away the image name from the string:
<?php $imageurlfolder = $pagename1;
$imageurlfolder = preg_replace('/[A-Za-z0-9]+.asp/', '', $pagename1);?>
The resulting string should look like the URLs above, e.g. images/images2/images3/images4/, but without the image4.jpg.
Hope you can help.
Thanks

For this particular purpose, the dirname() function would be sufficient:
<?php echo dirname('images/images2/images3/images4/image4.jpg'); ?>
Would return:
images/images2/images3/images4

I think you can use the dirname function
for instance (from that page)
dirname("/etc/passwd")
would print
/etc

A quite straightforward way to do it:
preg_replace("#(?<=/)[^/]+$#","",$your_string);
It will remove everything between the last / and the end of the string.
Edit: as many people pointed out, you can also use dirname, which might prove faster…
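The two approaches differ in one small detail worth seeing side by side: dirname() drops the trailing slash, while the regex keeps it. A short sketch using the path from the question (variable names are illustrative):

```php
<?php
$path = 'images/images2/images3/images4/image4.jpg';

// dirname() returns the parent directory without a trailing slash.
$viaDirname = dirname($path);

// The lookbehind regex removes only the last segment, so the slash stays.
$viaRegex = preg_replace('#(?<=/)[^/]+$#', '', $path);

// Appending a slash to the dirname() result gives the same string.
$same = ($viaDirname . '/') === $viaRegex;
```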

PHP Extract Text from Webpage

Is it possible to do something with PHP where I can set up a connection to a URL like http://en.wikipedia.org/wiki/Wiki and extract any words that start with a prefix like "Exa" or "ins", so that the resulting PHP page prints out all the words that it found? For example, with "Exa", the word "Example" would be printed out each time it found an instance of "Example". Same thing for words that start with "ins".

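$data = strip_tags(file_get_contents($url));
$matches = array();
// Find every word that starts with "Exa" or "ins"; \w* takes the rest of the word.
// preg_match_all collects all occurrences (preg_match stops after the first).
preg_match_all('/\b(?:Exa|ins)\w*/', $data, $matches);
// $matches[0] holds the full text of each match.
foreach ($matches[0] as $word) {
    echo "Match: '".$word."'\r\n";
}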
Probably something like this, though I'm not so sure about the regex, I haven't tested it yet...
Edit: I changed it, it should work now... (\B => \b and strip_tags to prevent HTML-classes from being matched).

I don't have a full answer with example to give you, but yes, you should be able to read the whole page into a string variable and then do normal string operations on it. It will read in all the HTML, so you will probably need to do a lot of regex to eliminate tags if you don't want them.

Read the page into a string using file_get_contents. Use one of the various string functions to examine the page.

Yes, this is possible. A potential approach would be to:
1. Use something like fopen (if allow_url_fopen is enabled; failing that, use cURL) to grab the external web page content.
2. Remove the (presumably not required) HTML tags via strip_tags.
3. Use strtok to tokenise and iterate over the remaining content, checking for whatever conditions you require.
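The last two steps might look roughly like the sketch below. It takes the HTML as a string so that it runs without network access; the function name words_with_prefixes and the delimiter set are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: return every word in $html that starts with one
// of the given prefixes (case-sensitive), in order of appearance.
function words_with_prefixes(string $html, array $prefixes): array
{
    $text   = strip_tags($html);          // drop the markup
    $delims = " \t\r\n.,;:!?()[]\"'";     // assumed word separators
    $found  = [];

    // strtok keeps internal state: pass the string once, then only delimiters.
    $token = strtok($text, $delims);
    while ($token !== false) {
        foreach ($prefixes as $prefix) {
            if (strncmp($token, $prefix, strlen($prefix)) === 0) {
                $found[] = $token;
                break;
            }
        }
        $token = strtok($delims);
    }
    return $found;
}

$html  = '<p>For Example, insert the <b>instance</b> here. Example!</p>';
$words = words_with_prefixes($html, ['Exa', 'ins']);
```

For a live page, the $html argument would come from file_get_contents($url) or a cURL request, as described in the first step.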
