I'm trying to display the latest additions to this NVD XML file:
http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-recent.xml
I can get all of them to list using the following code, but I'm only interested in displaying the most recent ten (from 2013 for the time being) and the XML file lists them in chronological order (starting in 2011).
<?php
$file= 'http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-recent.xml';
$xml = file_get_contents($file);
$sxe = new SimpleXMLElement($xml);
$ns = $sxe->getNamespaces(true);
echo "<b>Latest Vulnerabilities:</b><p>";
foreach($sxe->entry as $entry)
{
$vuln = $entry->children($ns['vuln']);
$href = $vuln->references->reference->attributes()->href;
echo "" . $vuln->{'cve-id'} . "<br>";
}
?>
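A SimpleXML element list is not a real PHP array, so you cannot slice it directly. You can still count it and index into it, so something like this, which skips everything before the last ten entries, should work for your needs: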
$file= 'http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-recent.xml';
$xml = file_get_contents($file);
$sxe = new SimpleXMLElement($xml);
$ns = $sxe->getNamespaces(true);
echo "<b>Latest Vulnerabilities:</b><p>";
$all = $sxe->entry;
$length = count($all);
$offset_start = $length - 10;
for($i = 0; $i < $length; $i++)
{
if($i >= $offset_start)
{
$entry = $all[$i];
$vuln = $entry->children($ns['vuln']);
$href = $vuln->references->reference->attributes()->href;
echo "" . $vuln->{'cve-id'} . "<br>";
}
}
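If you would rather not test every index, one alternative is to copy the entries into a plain array and take the tail with array_slice() and a negative offset. This is only a minimal sketch: the inline XML, its namespace URI, and the CVE IDs are made-up stand-ins for the real feed, and $wanted is set to 2 just to keep the sample short.

```php
<?php
// Made-up sample standing in for the NVD feed (not real data).
$xml = <<<XML
<nvd xmlns:vuln="http://example.test/vuln">
  <entry><vuln:cve-id>CVE-0000-0001</vuln:cve-id></entry>
  <entry><vuln:cve-id>CVE-0000-0002</vuln:cve-id></entry>
  <entry><vuln:cve-id>CVE-0000-0003</vuln:cve-id></entry>
  <entry><vuln:cve-id>CVE-0000-0004</vuln:cve-id></entry>
</nvd>
XML;

$sxe = new SimpleXMLElement($xml);
$ns  = $sxe->getNamespaces(true);

// Copy the entry elements into an ordinary PHP array.
$entries = [];
foreach ($sxe->entry as $entry) {
    $entries[] = $entry;
}

// A negative offset counts from the end, so this keeps the last $wanted entries.
$wanted = 2;
$latest = array_slice($entries, -$wanted);

$ids = [];
foreach ($latest as $entry) {
    $ids[] = (string) $entry->children($ns['vuln'])->{'cve-id'};
}
echo implode("<br>", $ids);
```

With the real feed you would set $wanted to 10. array_slice() simply returns the whole array when it holds fewer items than requested, so no extra bounds check is needed.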
Is it possible to convert just a selection of a HTML with multiple tables to JSON ?
I have this Table:
<div class="mon_title">2.11.2015 Montag</div>
<table class="info" >
<tr class="info"><th class="info" align="center" colspan="2">Nachrichten zum Tag</th></tr>
<tr class='info'><td class='info' colspan="2"><b><u></u> </b>
...
</table>
<p>
<table class="mon_list" >
...
</table>
And this PHP code to convert it into JSON:
function save_table_to_json ( $in_file, $out_file ) {
$html = file_get_contents( $in_file );
file_put_contents( $out_file, convert_table_to_json( $html ) );
}
function convert_table_to_json ( $html ) {
$document = new DOMDocument();
$document->loadHTML( $html );
$obj = [];
$jsonObj = [];
$th = $document->getElementsByTagName('th');
$td = $document->getElementsByTagName('td');
$thNum = $th->length;
$arrLength = $td->length;
$rowIx = 0;
for ( $i = 0 ; $i < $arrLength ; $i++){
$head = $th->item( $i%$thNum )->textContent;
$content = $td->item( $i )->textContent;
$obj[ $head ] = $content;
if( ($i+1) % $thNum === 0){
$jsonObj[++$rowIx] = $obj;
$obj = [];
}
}
// return the collected rows as a JSON string
return json_encode( $jsonObj );
}
save_table_to_json( 'heute_S.htm', 'heute_S.json' );
It converts both the table with class=info and the table with class=mon_list to JSON.
Is there any way to make it convert only the table with class=mon_list?
You can use XPath to search for the class, and then create a new DOM document that only contains the results of the XPath query. This is untested, but should get you on the right track.
It's also worth mentioning that you can use foreach to iterate over the node list.
$document = new DOMDocument();
$document->loadHTML( $html );
$xpath = new DomXPath($document);
$tables = $xpath->query("//*[contains(@class, 'mon_list')]");
$tableDom = new DomDocument();
$tableDom->appendChild($tableDom->importNode($tables->item(0), true));
$obj = [];
$jsonObj = [];
$th = $tableDom->getElementsByTagName('th');
$td = $tableDom->getElementsByTagName('td');
$thNum = $th->length;
$arrLength = $td->length;
$rowIx = 0;
for ( $i = 0 ; $i < $arrLength ; $i++){
$head = $th->item( $i%$thNum )->textContent;
$content = $td->item( $i )->textContent;
$obj[ $head ] = $content;
if( ($i+1) % $thNum === 0){
$jsonObj[++$rowIx] = $obj;
$obj = [];
}
}
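For what it's worth, here is a self-contained sketch of the same XPath idea that skips the second DOM document and runs the th/td queries relative to the matched table instead. The sample HTML and its column names are invented for illustration, and the concat()/normalize-space() predicate is a common idiom for matching a whole class token rather than any substring; treat it as one possible approach, not the only one.

```php
<?php
// Invented sample markup: one table to ignore, one table to convert.
$html = '<table class="info"><tr><th>Note</th></tr><tr><td>ignored</td></tr></table>'
      . '<table class="mon_list"><tr><th>Class</th><th>Room</th></tr>'
      . '<tr><td>5a</td><td>101</td></tr><tr><td>6b</td><td>202</td></tr></table>';

$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

// Match "mon_list" as a whole class token, not as a substring of another class.
$query = "//table[contains(concat(' ', normalize-space(@class), ' '), ' mon_list ')]";
$table = $xpath->query($query)->item(0);

// Header cells of the matched table only.
$headers = [];
foreach ($xpath->query('.//th', $table) as $th) {
    $headers[] = trim($th->textContent);
}

// One associative array per data row (rows that contain td cells).
$rows = [];
foreach ($xpath->query('.//tr[td]', $table) as $tr) {
    $row = [];
    foreach ($xpath->query('./td', $tr) as $i => $td) {
        $key = isset($headers[$i]) ? $headers[$i] : $i;
        $row[$key] = trim($td->textContent);
    }
    $rows[] = $row;
}

$json = json_encode($rows);
echo $json;
```

Passing the table node as the second argument to query() keeps every lookup scoped to that table, so cells from the class=info table never leak into the result.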
Another option is to check the class name with getAttribute(). Someone has written a function for this in a different answer:
function getElementsByClass(&$parentNode, $tagName, $className) {
$nodes=array();
$childNodeList = $parentNode->getElementsByTagName($tagName);
for ($i = 0; $i < $childNodeList->length; $i++) {
$temp = $childNodeList->item($i);
if (stripos($temp->getAttribute('class'), $className) !== false) {
$nodes[]=$temp;
}
}
return $nodes;
}
So far my script is working fine: it finds all the .htm files and outputs results. However, I'm using DOM to get the HTML title tag from each file, and that's where I can't get the title that belongs to the randomly chosen entry. (The image and .htm files share the same basenames, e.g. firstresult.htm has the picture firstresult.jpg.)
I hope the code below is enough to go on for an answer.
<?php
// loop through the images
$count = 0;
$filenamenoext = array();
foreach (glob("/mydirectory/*.htm") as $filename) {
$filenamenoext[$count] = basename($filename, ".htm");
$count++;
}
for ($i = 0; $i < 10; $i++) {
$random = mt_rand(1, $count - 1);
$cachefile = "$filename";
$contents = file($cachefile);
$string = implode($contents);
$doc = new DOMDocument();
@$doc->loadHTML($string);
$nodes = $doc->getElementsByTagName('title');
//get and display what you need:
$title = $nodes->item(0)->nodeValue;
echo '<img class="image" src="'.$filenamenoext[$random].'.jpg" " />"'.$title.'"<BR><BR>';
}
?>
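The problem is the $filename variable on the line $cachefile = "$filename";. It is only assigned by the foreach loop, and after the loop ends it still holds the last file that glob() returned, so every iteration loads that same file instead of the randomly chosen one.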
You should change it to
$cachefile = $filenamenoext[$random] . '.htm';
Also, it's cleaner to fill the array with array_push() and get its size with count() than to maintain a counter by hand. The code ends up shorter and more readable.
<?php
// collect the basenames of all .htm files
$filenamenoext = array();
foreach (glob("/mydirectory/*.htm") as $filename) {
array_push($filenamenoext, basename($filename, ".htm"));
}
for ($i = 0; $i < 10; $i++) {
// array indexes start at 0, so include 0 or the first file is never picked
$random = mt_rand(0, count($filenamenoext) - 1);
$cachefile = $filenamenoext[$random] . '.htm';
$contents = file($cachefile);
$string = implode($contents);
$doc = new DOMDocument();
@$doc->loadHTML($string);
$nodes = $doc->getElementsByTagName('title');
//get and display what you need:
$title = $nodes->item(0)->nodeValue;
echo '<img class="image" src="' . $filenamenoext[$random] . '.jpg" />"' . $title . '"<BR><BR>';
}
?>
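One thing the loop above does not guarantee is that the ten picks are different files, since mt_rand() can return the same index twice. If distinct picks matter, a rough sketch like the following shuffles the list once and takes the first few entries. pickRandomBasenames() is a hypothetical helper name and the paths are made up; in the real script they would come from glob().

```php
<?php
// Hypothetical helper: returns up to $howMany distinct basenames in random order.
function pickRandomBasenames(array $paths, $howMany)
{
    $names = array();
    foreach ($paths as $path) {
        $names[] = basename($path, '.htm');
    }
    shuffle($names);                         // random order, every name appears once
    return array_slice($names, 0, $howMany); // fewer names than asked for is fine
}

// Made-up paths standing in for glob("/mydirectory/*.htm").
$paths = array(
    '/mydirectory/first.htm',
    '/mydirectory/second.htm',
    '/mydirectory/third.htm',
);

$picked = pickRandomBasenames($paths, 2);
foreach ($picked as $name) {
    echo $name . '.htm goes with ' . $name . ".jpg\n";
}
```

Each returned basename can then be used exactly as $filenamenoext[$random] is used above, once for the .htm file and once for the .jpg.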
How can I find out my keyword position on Google with PHP ?
I tried this URL :
http://www.google.com/url?sa=t&source=web&ct=res&cd=7&url=http%3A%2F%2Fwww.example.com%2Fmypage.htm&ei=0SjdSa-1N5O8M_qW8dQN&rct=j&q=flowers&usg=AFQjCNHJXSUh7Vw7oubPaO3tZOzz-F-u_w&sig2=X8uCFh6IoPtnwmvGMULQfw
but I couldn't get any html source code.
How can I do this ?
I use this script that I wrote.
Because it relies on Google's HTML, which may change at any time, it is not reliable, but it does its job for now:
<?php
// Include the phpQuery library
// Download at http://code.google.com/p/phpquery/
include("phpQuery-onefile.php");
$country = "en";
$domain = "stackoverflow.com";
$keywords = "php google keyword rank checker";
$firstnresults = 50;
$rank = 0;
$urls = Array();
$pages = ceil($firstnresults / 10);
for($p = 0; $p < $pages; $p++){
$start = $p * 10;
$baseurl = "https://www.google.com/search?hl=".$country."&output=search&start=".$start."&q=".urlencode($keywords);
$html = file_get_contents($baseurl);
$doc = phpQuery::newDocument($html);
foreach($doc['#ires cite'] as $node){
$rank++;
$url = $node->nodeValue;
$urls[] = "[".$rank."] => ".$url;
if(stripos($url, $domain) !== false){
break 2;
}
}
}
print "Country: ".$country."\n";
print "Domain: ".$domain."\n";
print "Keywords: ".$keywords."\n";
print "Rank: ".$rank."\n";
print "First urls:\n";
print implode("\n", $urls)."\n";
?>
Maybe you should just use a class written for this purpose, for example this one on github
I have a PHP script that writes data to files in batches of 5000 table rows. When all the rows have been written to file(s) it should output the time taken to run the script. Instead what is happening is the script appears to be continually running on the browser and the output never appears but the file(s) exist, meaning the script has run. This only happens with large amounts of data. Any suggestions?
$startTime = time();
$ID = '123';
$productBatchLimit = 5000;
$products = new Products();
$countProds = $products->countShopKeeperProducts();
//limit the amount of products
//if ($countProds > $productBatchLimit){$countProds = $productBatchLimit; }
$counter = 1;
for ($i = 0; $i < $countProds; $i += $productBatchLimit) {
$xml_file = 'xml/products/'. $ID . '_'. $counter .'.xml';
$xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
$xml .= "<products>\n";
//create new file
$fh = fopen($xml_file, 'a');
fwrite($fh, $xml);
$limit = $productBatchLimit*$counter;
$prodList = $products->getProducts($i, $limit);
foreach ($prodList as $prod ){
$xml = "<product>\n";
foreach ($prod as $key => $value){
$value = functions::xml_entities($value);
$xml .= "<{$key}>{$value}</{$key}>\n";
}
$xml .= "</product>\n";
fwrite($fh, $xml);
}
$counter++;
fwrite($fh, '</products>');
fclose($fh);
}
//check to see when XML is fully formed
$validxml = XMLReader::open($xml_file);
$validxml->setParserProperty(XMLReader::VALIDATE, true);
if ($validxml->isValid()==true){
$endTime = time();
echo "Total time to generate results: ".($endTime - $startTime)." seconds. \n";
} else {
echo "Problem saving Products XML.\n";
}
I'm using this example to fetch links from a website :
http://www.merchantos.com/makebeta/php/scraping-links-with-php/
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");
for ($i = 0; $i < $hrefs->length; $i++) {
$href = $hrefs->item($i);
var_dump($href);
$url = $href->getAttribute('href');
echo "<br />Link stored: $url";
}
It works well and gets all the links, but I cannot get the actual 'title' (the link text) of each link. For example, if I have:
<a href="...">Google</a>
I want to be able to fetch the term 'Google' too.
I'm a little lost and quite new to XPath.
You are looking for the "nodeValue" of the Textnode inside the "a" node.
You can get that value with
$title = $href->firstChild->nodeValue;
Full working example:
<?php
// loadHTML() is an instance method, so create the document first
$dom = new DOMDocument();
$dom->loadHTML("<html><body><a href='www.test.de'>DONE</a></body></html>");
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");
for ($i = 0; $i < $hrefs->length; $i++) {
$href = $hrefs->item($i);
$url = $href->getAttribute('href');
$title = $href->firstChild->nodeValue;
echo "<br />Link stored: $url $title";
}
Prints:
Link stored: www.test.de DONE
Try this:
$link_title = $href->nodeValue;
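The two suggestions only differ when the link contains nested markup. As a small sketch (the sample HTML and URL are made up), firstChild->nodeValue returns just the text of the first child node, while nodeValue (or textContent) on the a element itself returns all of the text inside the link:

```php
<?php
$dom = new DOMDocument();
// Made-up sample: part of the link text sits inside a nested <b> element.
$dom->loadHTML('<html><body><a href="http://example.com/"><b>Go</b>ogle</a></body></html>');

$xpath = new DOMXPath($dom);
$links = array();
foreach ($xpath->query('/html/body//a') as $a) {
    $links[] = array(
        'url'   => $a->getAttribute('href'),
        'first' => $a->firstChild->nodeValue, // text of the first child node only
        'text'  => trim($a->textContent),     // all text inside the link
    );
}

foreach ($links as $link) {
    echo "Link stored: {$link['url']} first={$link['first']} text={$link['text']}\n";
}
// prints: Link stored: http://example.com/ first=Go text=Google
```

So for plain links either works, but the element's own nodeValue is the safer choice when the anchor may wrap other tags.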