I know how to do this in Ruby, but I want to do this in PHP. Grab a page and be able to parse stuff out of it.
Take a look at cURL. Knowing cURL and how to use it will help in many ways, as it is not specific to PHP. If you want something PHP-specific, however, you can use file_get_contents, which is the recommended way in PHP to read the contents of a file into a string.
$file = file_get_contents("http://google.com/");
How to parse it depends on what you are trying to do, but I'd recommend one of the XML libraries for PHP.
You could use fopen in read mode: fopen($url, 'r'); or more simply file_get_contents($url);. You could also use readfile(), but file_get_contents() is potentially more efficient and therefore recommended.
Note: these depend on configuration (reading remote URLs requires the allow_url_fopen setting; see the linked manual page) but will work on most setups.
For parsing, simplexml is enabled by default in PHP.
$xmlObject = simplexml_load_string($string);
// If the string was valid, you now have a fully functional xml object.
echo $xmlObject->username;
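Since most web pages are HTML rather than well-formed XML, simplexml_load_string() will often reject them. A minimal sketch of one alternative, assuming the bundled DOM extension is available: DOMDocument::loadHTML() tolerates sloppy markup. The $page string below is made up for illustration and stands in for whatever file_get_contents($url) would return.

```php
<?php
// Stand-in for a fetched page; in practice this might come from
// file_get_contents($url). The markup is invented for this sketch.
$page = '<html><head><title>Example page</title></head>'
      . '<body><a href="/one">One</a><p>Unclosed paragraph'
      . '<a href="/two">Two</a></body></html>';

$doc = new DOMDocument();

// Real-world HTML is rarely clean: collect libxml warnings internally
// instead of printing them, then discard them.
libxml_use_internal_errors(true);
$doc->loadHTML($page);
libxml_clear_errors();

// Pull out the page title.
$title = $doc->getElementsByTagName('title')->item(0)->textContent;

// Collect the href attribute of every link.
$links = [];
foreach ($doc->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

echo $title, "\n";
echo implode(', ', $links), "\n";
```

For pages that really are XML (feeds, API responses), the SimpleXML call above is usually the shorter route.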
It's funny, I had the opposite question when I started Rails development.
http://simplehtmldom.sourceforge.net/
I'm looking to parse an HTML file, make some small changes, and overwrite the current file with the updates. I was wondering if this is possible with simplehtmldom.
As of right now, I can access everything, but have no way of displaying the entirety of the HTML with all of the changes included. Is it only possible to grab specific values?
If so, what other methods could I use to accomplish this? I'm afraid the environment I have to work in is extremely limited due to security issues.
The simplehtmldom object can be used as a string, and will just contain the full contents of the modified document.
For example:
echo($html);
file_put_contents($filename, $html);
If you prefer an object-oriented method, you can use save():
$updated = $html->save();
or
$html->save($filename);
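If simplehtmldom cannot be installed in a restricted environment, roughly the same read, modify, and overwrite cycle can be sketched with PHP's bundled DOMDocument instead. This is a swapped-in technique, not the simplehtmldom API; the temporary file, the h1 element, and its id are all hypothetical and exist only so the sketch runs on its own.

```php
<?php
// Hypothetical file, created here only to make the sketch self-contained.
$filename = tempnam(sys_get_temp_dir(), 'page');
file_put_contents(
    $filename,
    '<!DOCTYPE html><html><body><h1 id="title">Old heading</h1></body></html>'
);

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep HTML parser warnings quiet
$doc->loadHTMLFile($filename);
libxml_clear_errors();

// Find the element to change and replace its text.
$xpath = new DOMXPath($doc);
$heading = $xpath->query('//h1[@id="title"]')->item(0);
$heading->nodeValue = 'Updated heading';

// saveHTML() returns the whole modified document as a string,
// which is then written back over the original file.
file_put_contents($filename, $doc->saveHTML());

$result = file_get_contents($filename);
echo $result;

unlink($filename);                  // clean up the demo file
```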
I'm working on a website that uses a lot of XML-files as data (150 in total and probably growing). Each page is an XML-file.
What I'm looking for is a way to look for a string through the XML-files. I'm not sure what programming language to use for this XML search engine.
I'm familiar with PHP, JavaScript, JQuery. So I'd prefer using those languages.
Thanks a bunch!
UPDATE: I'm looking for a solution that works quickly.
Ideally, the function returns the tagname that contains the searchstring.
If, for instance, the XML is as follows:
<article-1>This is a great story.</article-1>
If one would search for 'story', it would return 'article-1'.
I'm not quite sure on how to do this with a regular expression.
PHP can do this. Here's an example:
foreach (glob("{foldera/*.xml,folderb/*.xml}", GLOB_BRACE) as $filename) {
    $xml = simplexml_load_file($filename);
    // use regular expressions to find your string
}
You simply iterate through each file on your server using glob() with a foreach loop.
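To return the tag name that contains the search string, as the question asks, one possible sketch walks every text node with XPath instead of using a regular expression. The function name findTagsContaining, the temporary directory, and the sample file are assumptions made for illustration; DOMDocument is used here so the parent element of each text node is easy to reach.

```php
<?php
// Returns the names of the elements whose own text contains $needle.
function findTagsContaining(string $xmlFile, string $needle): array
{
    $doc = new DOMDocument();
    if (!@$doc->load($xmlFile)) {
        return [];                       // skip files that are not well-formed
    }

    $tags = [];
    $xpath = new DOMXPath($doc);
    // //text() selects every text node in the document.
    foreach ($xpath->query('//text()') as $textNode) {
        if (strpos($textNode->nodeValue, $needle) !== false) {
            $tags[] = $textNode->parentNode->nodeName;
        }
    }
    return array_values(array_unique($tags));
}

// Demo data in a throwaway directory (hypothetical file names).
$dir = sys_get_temp_dir() . '/xmlsearch_' . uniqid();
mkdir($dir);
file_put_contents(
    "$dir/page1.xml",
    '<page><article-1>This is a great story.</article-1>'
    . '<article-2>Nothing here.</article-2></page>'
);

$results = [];
foreach (glob("$dir/*.xml") as $filename) {
    $results[basename($filename)] = findTagsContaining($filename, 'story');
}
print_r($results);

// Clean up the demo files.
array_map('unlink', glob("$dir/*.xml"));
rmdir($dir);
```

Parsing 150 small files per search is likely acceptable, but if speed matters the results could be cached or the text indexed once up front.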
Sounds like a problem that could be solved with grep and regular expressions. Without knowing what string you're looking for it's not possible to say exactly what you should do, but reading some documentation on grep should get you started down the right path.
I'm currently working on a project that has me working with XML a lot. I have to take an XML response, decrypt each text node, and then do various tasks with the data. The problem I'm having is taking the response and processing each text node. Originally I was using the XMLToArray library, and that worked fine: I would change the XML into an array, then loop through the array and decrypt the values. However, some of the XML responses I'm dealing with have repeated tags, and the XMLToArray library only returns the last value for them.
Is there a good way to take an XML response, process all the text nodes, and easily put the values into an array that has a similar structure to the response?
Thanks in advance.
I would use SimpleXML.
Here's a small example of using it. It loads and parses XML from http://www.w3schools.com/xml/plant_catalog.xml and then outputs values of "COMMON" and "PRICE" tags of each "PLANT" tag.
$xml = simplexml_load_file('http://www.w3schools.com/xml/plant_catalog.xml');
foreach ($xml->PLANT as $plantNode) {
    echo $plantNode->COMMON, ' - ', $plantNode->PRICE, "\n";
}
If you have any problems with adapting it to your needs, just give an example of your XML so that we can help with it.
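For the original problem (transform every text node, keep repeated tags), here is a rough sketch using DOMDocument and XPath rather than SimpleXML. The sample response, the elementToArray helper, and the use of strtoupper as a stand-in for the real decryption routine are all assumptions, since the question does not show them.

```php
<?php
// Convert an element into a nested array. Every child name maps to a
// list, so repeated tags are all kept instead of overwriting each other.
function elementToArray(DOMElement $element)
{
    $children = [];
    foreach ($element->childNodes as $child) {
        if ($child instanceof DOMElement) {
            $children[$child->nodeName][] = elementToArray($child);
        }
    }
    // Leaf elements become their text; others become arrays of lists.
    return $children === [] ? $element->textContent : $children;
}

// Hypothetical response; note the repeated <item> tags.
$response = '<order><item><name>aaa</name></item>'
          . '<item><name>bbb</name></item><total>ccc</total></order>';

// Stand-in for the real decryption function (not shown in the question).
$decrypt = 'strtoupper';

$doc = new DOMDocument();
$doc->loadXML($response);

// Apply the transformation to every text node in place.
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//text()') as $textNode) {
    $textNode->nodeValue = $decrypt($textNode->nodeValue);
}

$data = elementToArray($doc->documentElement);
print_r($data);
```

With this layout the second item's name is reachable as $data['item'][1]['name'][0].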
All those XML-to-array libraries are a remnant of the days when PHP 4 forced you to write your own XML parser almost from scratch. Recent PHP versions ship a good set of XML libraries that do the hard work. I particularly recommend SimpleXML (for small files) and XMLReader (for large files). If you still find them complicated, you can try phpQuery.
You might want to give SimpleXML a try. Plus, it comes with PHP by default, so you don't need to install anything.
Check out SimpleXML, it may offer a bit more for what you are looking for.
I've been having problems writing the attributes of XML files using PHP on a Mac. The weird thing is that it works flawlessly on Windows, but when I run the script on a Mac, for some mysterious reason it keeps writing the attributes of the XML file with backslashes. This is the actual XML file that the script writes:
<stuff id=\"stuffid\"></stuff>
This is the PHP code, a really basic script:
$file = fopen("data.xml","w");
fwrite($file, $xml);
fclose($file);
Can anyone lend a hand? I've been looking for a solution to this all morning. I'm using MAMP, by the way.
If the XML is coming from an external source as a string, my best guess is that PHP is misconfigured. In this case it's probably the magic_quotes_gpc setting, which should be set to "Off".
You probably have magic quotes turned on. Try this:
// Only relevant before PHP 5.4, where magic quotes still exist.
if (get_magic_quotes_gpc()) {
    $xml = stripslashes($xml);
}
Learn more at http://php.net/manual/en/security.magicquotes.php
Have you considered using the DOMDocument class or SimpleXML class to construct your document?
It'll handle all this stuff for you...
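As a small sketch of that suggestion, assuming the element and attribute names from the question, DOMDocument can build the same markup without any hand-written quoting:

```php
<?php
$doc = new DOMDocument('1.0', 'UTF-8');

// Create <stuff> and let the library handle attribute quoting/escaping.
$stuff = $doc->createElement('stuff');
$stuff->setAttribute('id', 'stuffid');
$doc->appendChild($stuff);

// saveXML() returns the document as a string;
// $doc->save('data.xml') would write the same markup to a file.
$xml = $doc->saveXML();
echo $xml;
```

Note that an empty element is serialized in its self-closing form, which is equivalent XML. This only helps when the document is built in PHP; a string that arrives already escaped still has to be cleaned first.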
I'm a somewhat experienced PHP scripter, however I just dove into parsing XML and all that good stuff.
I just can't seem to wrap my head around why one would use a separate XML parser instead of just using the explode function, which seems to be just as simple. Here's what I've been doing (assuming there is a valid XML file at the path xml.php):
$contents = file_get_contents("xml.php");
$array1 = explode("<a_tag>", $contents);
$array2 = explode("</a_tag>", $array1[1]);
$data = $array2[0];
So my question is, what is the practical use for an XML parser if you can just separate the values into arrays and extract the data from that point?
Thanks in advance! :)
Excuse me for not going into details but for starters try parsing
$contents = '<a xmlns="urn:something">
<a_tag>
<b>..</b>
<related>
<a_tag>...</a_tag>
</related>
</a_tag>
<foo:a_tag xmlns:foo="urn:something">
<![CDATA[This is another <a_tag> element]]>
</foo:a_tag>
</a>';
with your explode-approach. When you're done we can continue with some trickier things ;-)
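To make the difference concrete, here is a rough comparison of the two approaches on that document. It is only a sketch: the variable names are mine, and DOMDocument stands in for "an XML parser" in general.

```php
<?php
$contents = '<a xmlns="urn:something">
<a_tag>
<b>..</b>
<related>
<a_tag>...</a_tag>
</related>
</a_tag>
<foo:a_tag xmlns:foo="urn:something">
<![CDATA[This is another <a_tag> element]]>
</foo:a_tag>
</a>';

// The explode approach from the question: it cuts at the nested
// <a_tag>, so the "data" is a broken fragment of the first element.
$array1 = explode("<a_tag>", $contents);
$array2 = explode("</a_tag>", $array1[1]);
$exploded = $array2[0];

// A namespace-aware parser finds all three a_tag elements in the
// urn:something namespace, including the prefixed foo:a_tag, and
// treats the CDATA section as plain text rather than markup.
$doc = new DOMDocument();
$doc->loadXML($contents);
$tags = $doc->getElementsByTagNameNS('urn:something', 'a_tag');

echo "explode gives: ", trim($exploded), "\n";
echo "parser finds ", $tags->length, " elements\n";
echo "last one contains: ", trim($tags->item(2)->textContent), "\n";
```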
In a nutshell, it's consistency. Before XML came into wide use there were numerous undocumented formats for keeping information in files. One of the motivators behind XML was to create a well-defined, standard document format. With this format in place, a general set of parsing tools could be developed that would work consistently on documents, so long as the documents adhered to that format.
In some specific cases, your example code will work. However, if the document changes
...
<!-- adding an attribute -->
<a_tag foo="bar">Contents of the Tag</a_tag>
...
...
<!-- adding a comment to the contents -->
<a_tag>Contents <!-- foobar --> of the Tag</a_tag>
...
Your parsing code will probably break. Code written using a correctly defined XML parser will not.
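As an illustration, the following sketch runs the same parser-based lookup against the original form and both changed forms. The <doc> wrapper element and the whitespace normalization are my additions; DOMDocument is just one of the available parsers.

```php
<?php
$variants = [
    '<doc><a_tag>Contents of the Tag</a_tag></doc>',
    '<doc><a_tag foo="bar">Contents of the Tag</a_tag></doc>',
    '<doc><a_tag>Contents <!-- foobar --> of the Tag</a_tag></doc>',
];

$results = [];
foreach ($variants as $xml) {
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    // The lookup is by element name, so attributes do not matter,
    // and textContent leaves out comment nodes.
    $text = $doc->getElementsByTagName('a_tag')->item(0)->textContent;

    // Collapse the extra whitespace left around the removed comment.
    $results[] = preg_replace('/\s+/', ' ', $text);
}

print_r($results);
```

All three variants yield the same text, whereas explode("<a_tag>", ...) would already miss the second one.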
XML parsers:
Handle encoding
May have xpath support
Allow you to easily modify and save the XML; append/remove child nodes, add/remove attributes, etc.
Don't need to load the whole file into memory (except for DOM parsers)
Know about namespaces
...
How would you explode the same file if a_tag had an attribute?
explode("<a_tag>" ... will work differently than explode("<a_tag attr='value'>" ..., after all.
XML Parsers understand the XML specification. Explode can only handle the simplest of cases, and will most likely fail in a lot of instances of that case.
Using a proven XML parsing method will make the code more maintainable and easier to read. It will also make it more easily adaptable should the schema change, and it can make it easier to determine error conditions. XPath and XSLT exist for a reason: they are proven ways to deal with XML data in a sensible, legible manner. I'd suggest you use whichever is applicable in your given situation. Remember, even if you think you're only writing code for one specific purpose, you never know what a piece of well-written code could evolve into.