I wrote some code that reads a table from another website and writes it on my site. Now I want to read only specific rows/columns and write those instead. The table is filled with weather data and refreshes every 5 minutes: there is a row for every five-minute interval containing temperature, humidity, sun radiation and so on. I only need the rows for full and half hours, and from each of those rows only the temperature column. For example, I need to find the row for, let's say, 05:00 and read/write only its temperature column; in this case the output would be: 05:00 12,5°C. And I need 48 values, because a day has 24 full hours plus another 24 half hours.
This is a part of my code:
<?php
$trazi = ':00';
$citaj = file('proba.txt');
foreach ($citaj as $linija) {
    if (strpos($linija, $trazi) !== false) {
        echo $linija;
    }
}

$traziURL = "somepage";
$stranica = file_get_contents($traziURL);
$tablica = '/(<table.*<\/table>)/s';
preg_match_all($tablica, $stranica, $zeit);
echo $zeit[0][0];

$ime = "proba.txt";
$table = fopen($ime, 'w') or die("Error!");
$podaci = $zeit[0][0];
fwrite($table, $podaci);
fclose($table);
?>
Some parts are missing, so there's a chance it won't run for you as-is, but it should give you the idea.
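As a side note, the strpos() filter above only catches full hours. One way to catch the half hours as well is a small regex; this is only a sketch, with made-up sample lines, since the real table layout isn't shown here:

```php
<?php
// Sketch only: the sample lines are invented; the real table rows would come
// from file('proba.txt') as in the code above.
$citaj = ["05:00 12,5", "05:05 12,6", "05:30 12,8", "06:00 13,0"];
$pola = [];
foreach ($citaj as $linija) {
    // keep only lines whose time component ends in :00 or :30
    if (preg_match('/\b\d{1,2}:(00|30)\b/', $linija)) {
        $pola[] = $linija;
    }
}
print_r($pola);   // the 05:05 line is filtered out
```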
I'm sure there are multiple other ways to do this, but I'd do it like this.
<?php
/**
 * @author Bart Degryse
 * @copyright 2013
 */
function getData() {
    //Get the html page
    $url = "http://www.essen-wetter.de/table.php";
    $content = file_get_contents($url);
    //Turn it into a dom document searchable by xpath
    $dom = new DOMDocument();
    $dom->loadHTML($content);
    $xpath = new DOMXPath($dom);
    //Get field names
    $query = "//tr/td[position()=1 and normalize-space(text()) = 'Zeit']";
    $entries = $xpath->query($query);
    $entry = $entries->item(0);
    $tr = $entry->parentNode;
    $fieldnames = array();
    foreach ($tr->getElementsByTagName("td") as $td) {
        $fieldnames[] = $td->textContent;
    }
    //Get field data for the :00 and :30 rows
    $query = "//tr/td[position()=1 and (substring-after(normalize-space(text()),':') = '00' or substring-after(normalize-space(text()),':') = '30')]";
    $entries = $xpath->query($query);
    $data = array();
    foreach ($entries as $entry) {
        $fieldvalues = array();
        $tr = $entry->parentNode;
        foreach ($tr->getElementsByTagName("td") as $td) {
            $fieldvalues[] = $td->textContent;
        }
        $data[] = array_combine($fieldnames, $fieldvalues);
    }
    //Return data set
    return $data;
}

//Gather the data
$data = getData();

//Do something with it
echo "<pre>\n";
foreach ($data as $row) {
    echo "Temperature at {$row['Zeit']} was {$row['Temperatur']}.\n";
}
echo "</pre><hr><pre>\n";
print_r($data);
echo "</pre>\n";
?>
If you're going to display the data on a UTF-8 compatible terminal or on a web page that's declared as UTF-8 encoded, this should do it.
If you want to use single-byte ISO-8859-1 encoding, however, you'll have to change these lines:
$fieldnames[] = $td->textContent;
$fieldvalues[] = $td->textContent;
into:
$fieldnames[] = utf8_decode($td->textContent);
$fieldvalues[] = utf8_decode($td->textContent);
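A small sketch of the difference, with a made-up temperature string. Note that utf8_decode() still works but is deprecated as of PHP 8.2; mb_convert_encoding() does the same job going forward:

```php
<?php
// "12,5°C" as a UTF-8 string: the degree sign takes two bytes
$utf8 = "12,5\u{00B0}C";
// convert to single-byte ISO-8859-1, where it takes one byte
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
echo strlen($utf8) . " bytes as UTF-8\n";
echo strlen($latin1) . " bytes as ISO-8859-1\n";
```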
Remark
Please note that while doing this is technically not that hard, legally you're on shaky ground. The data on that page is copyrighted and owned by Markus Wolter; using his data for your own purposes without his consent may constitute copyright infringement.
Okay. I've got a MySQL database full of users and their corresponding information. Each user is assigned an auto-incremented userId and a directory in a users folder when the account is created. Throughout the past few months many of these rows have been deleted (accounts removed) from the database. My problem is that when a row is removed from the database, the directory for that user remains. I now have possibly thousands of folders for users that will never be used.
I've got an array of useless directories. The problem I have is that there is a difference of 16 between the projected count and the count that will actually be deleted. Where is this difference coming from?
<?php
ini_set('display_errors', 1);
$pdo = new PDO('mysql:host=localhost;dbname=channel1_db', 'channel1_user', 'test123');
$data = $pdo->query('select user_id from accounts');
$data = $data->fetchAll(PDO::FETCH_ASSOC);

$alterIds = array();
foreach ($data as $id) {
    $alterIds[] = "user-" . $id['user_id'];
}
$idCount = count($alterIds);

$iterator = new DirectoryIterator(dirname(__FILE__));
$user_directories = array();
foreach ($iterator as $fileinfo) {
    $user_directories[] = $fileinfo->getFilename();
}
array_shift($user_directories); //remove "."
array_shift($user_directories); //remove ".."
$fileCount = count($user_directories);

$diff = $fileCount - $idCount;
$useless_directories = array_diff($user_directories, $alterIds);
$uselessCount = count($useless_directories);
$unhandled = $uselessCount - $diff;

//display totals
echo $fileCount . " Total Files";
echo "<br>";
echo $idCount . " Total Ids";
echo "<br>";
echo $diff . " projected useless";
echo "<br>";
echo $uselessCount . " Useless Files Results";
echo "<br>";
echo $unhandled . " Difference between projected and results";
echo "<br>";
?>
Reports...
19672 Total Files,
11038 Total Ids,
8634 Projected Useless,
8652 Useless Files Results,
18 difference between projected and actual
// scan the users directory
$directories = scandir('users');

// connect to the database
$mysqli = new mysqli('localhost', 'root', '', 'test');

$user_directories = array();
// get user directories from the database
$query = 'SELECT directory FROM user';
$results = $mysqli->query($query);
// populate the array with user directories
while ($row = $results->fetch_object()) {
    $user_directories[] = $row->directory;
}

$useless_directories = array();
// loop through the directories on disk
foreach ($directories as $directory) {
    // check if a directory is not in the user directories retrieved from the database
    if (!in_array(trim($directory), $user_directories)) {
        // check that it's a real directory; scandir() returns both directories and files
        if ($directory !== '.' && $directory !== '..' && is_dir('users/' . $directory)) {
            // populate useless_directories
            $useless_directories[] = $directory;
        }
    }
}

foreach ($useless_directories as $useless_directory) {
    // create an array of files in the directory
    $files = scandir('users/' . $useless_directory);
    foreach ($files as $file) {
        // delete any real files in the directory, skipping the dot entries
        if ($file === '.' || $file === '..') {
            continue;
        }
        unlink('users/' . $useless_directory . '/' . trim($file));
    }
    // remove the now-empty directory
    rmdir('users/' . $useless_directory);
}
These are my databases:
database.csv:
barcode, Name , Qty ,Code
123456 ,Rothmans Blue , 40 ,RB44
234567 ,Rothmans Red , 40 ,RB30
345678 ,Rothmans Green , 40 ,RB20
456789 ,Rothmans Purple, 40 ,RB10
567890 ,Rothmans Orange, 40 ,RB55
stocktakemain.csv:
barcode, Name , Qty ,Code,ScQty
123456 ,Rothmans Blue , 40 ,RB44, 1
234567 ,Rothmans Red , 40 ,RB30, 1
Process:
The website has an input scan that is posted to "barcode".
It will check that the barcode exists within 'database.csv'
IF the barcode exists and is NOT within the 'stocktakemain.csv', it will add it to 'stocktakemain.csv' with a ScQty of 1. Please see Section 2 within code below.
ELSE when an existing barcode within stocktakemain.csv is scanned, append 'stocktakemain.csv' with +1 (an addition of 1) to ScQty for that particular line.
The ELSE step above is the part that is not working.
Code:
function searchForBarcode($id, $array)
{
    foreach ($array as $key => $val) {
        if (in_array($id, $val)) {
            return $key;
        }
    }
    return null;
}

$post = $_POST["barcode"];

$dbcsv = fopen('databases/database.csv', 'r');
$csvArray = array();
while (!feof($dbcsv)) {
    $csvArray[] = fgetcsv($dbcsv);
}
fclose($dbcsv);

$searchf = searchForBarcode($post, $csvArray);
$result = $csvArray[$searchf];
$final = $result[+3];

if ($searchf !== NULL) {
    $stcsv = fopen('databases/stocktakemain.csv', 'r+');
    $stArray = array();
    while (!feof($stcsv)) {
        $stArray[] = fgetcsv($stcsv);
    }
    fclose($stcsv);

    $searchs = searchForBarcode($post, $stArray);
    if ($searchs === NULL) {
        $filew = 'databases/stocktakemain.csv';
        $write = file_get_contents($filew);
        $write .= print_r(implode(",", $result), true) . ",1\n";
        file_put_contents($filew, $write);
    } else {
        $filew = 'databases/stocktakemain.csv';
        $resultexisting = $stArray[$searchs];
        print_r($resultexisting);
        echo "<br/>";
        $getfilecont = file_get_contents($filew);
        $getfilecont = trim($getfilecont);
        $existing = explode(",", $getfilecont);
        $existing[4] = trim($existing[4]);
        ++$existing[4];
        print_r($existing);
        echo "<br/>";
        $writeto = print_r(implode(",", $existing), true);
        print_r($writeto);
        file_put_contents($filew, $writeto);
    }
}
Here are some conclusions I've drawn from reading your code:
The else block is executed if an item is scanned that is already in the stocktakemain.csv file
$searchs contains the index of the row of the item that was scanned
$stArray contains a 2D array of the stocktakemain.csv contents - the first index is the line number, starting at 0, and the next index is the column number
Based on this, I think you need to rewrite your else block to be something like:
$scQtyColumn = 4;
// what is the current quantity?
$scQty = intval($stArray[$searchs][$scQtyColumn]);
// update the quantity in the stocktakemain.csv contents array
$stArray[$searchs][$scQtyColumn] = $scQty + 1;
// write each line back to the file
$output = fopen('databases/stocktakemain.csv', 'w');
foreach ($stArray as $line) {
    fputcsv($output, $line);
}
fclose($output);
Could you try that out and see if it does the trick?
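Putting that together, here is an untested sketch of the whole read-update-write cycle, using fgetcsv()/fputcsv() throughout so quoting and line endings are handled consistently. The file name and the ScQty column index come from the question; bumpScQty() is a made-up helper name:

```php
<?php
// Sketch: add 1 to the ScQty column for one barcode, rewriting the whole file.
// Column 4 is ScQty per the question's CSV layout.
function bumpScQty(string $file, string $barcode, int $scQtyColumn = 4): bool
{
    $rows = [];
    $found = false;
    $in = fopen($file, 'r');
    while (($row = fgetcsv($in)) !== false) {
        if (isset($row[0]) && trim($row[0]) === $barcode) {
            // found the scanned barcode: increment its ScQty
            $row[$scQtyColumn] = (int) trim($row[$scQtyColumn]) + 1;
            $found = true;
        }
        $rows[] = $row;
    }
    fclose($in);

    // write every row back; fputcsv handles commas and quoting for us
    $out = fopen($file, 'w');
    foreach ($rows as $row) {
        fputcsv($out, $row);
    }
    fclose($out);
    return $found;
}
```

Called as bumpScQty('databases/stocktakemain.csv', $post), it returns false when the barcode isn't in the file yet, which is exactly the case where your first branch appends the new line with an ScQty of 1.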
I am currently trying to crawl a lot of data from a website, but I am struggling a bit with it. The site has an a-z index and a 1-20 page index, so my script has a bunch of loops and DOM handling in it. It managed to crawl and save about 10,000 rows on the first run, but now that I am at around 15,000 it only crawls about 100 rows per run.
This is probably because it has to skip the rows it has already inserted (I made a check for that). I can't think of an easy way to skip some pages, as the 1-20 index varies a lot (one letter has 18 pages, another only 2).
I was checking whether there already was a record with the given ID and inserting it if not. I assumed that would be slow, so now, before the script starts, I retrieve all rows and then check with in_array(), assuming that's faster. But it just won't work.
So my crawler navigates 26 letters, up to 20 pages per letter, and then up to 50 links per page; if you calculate it, that's a lot.
I thought of running it letter by letter, but that won't really work, as I am still stuck at "a" and can't just hop to "b" without missing records from "a".
I hope I have explained the problem well enough for someone to help me. My code looks roughly like this (I have removed some stuff here and there, but all the important parts should be here to give you an idea):
function in_array_r($needle, $haystack, $strict = false) {
    foreach ($haystack as $item) {
        if (($strict ? $item === $needle : $item == $needle) || (is_array($item) && in_array_r($needle, $item, $strict))) {
            return true;
        }
    }
    return false;
}

/* CONNECT TO DB */
mysql_connect()......

$qry = mysql_query("SELECT uid FROM tableName");
$all = array();
while ($row = mysql_fetch_array($qry)) {
    $all[] = $row;
} // Retrieving all the current database rows to compare later

foreach (range("a", "z") as $key) {
    for ($i = 1; $i < 20; $i++) {
        $dom = new DomDocument();
        $dom->loadHTMLFile("http://www.crawleddomain.com/".$i."/".$key.".htm");
        $finder = new DomXPath($dom);
        $classname = "table-striped";
        $nodes = $finder->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]");
        foreach ($nodes as $node) {
            $rows = $finder->query("//a[contains(@href, '/value')]", $node);
            foreach ($rows as $row) {
                $url = $row->getAttribute("href");
                $dom2 = new DomDocument();
                $dom2->loadHTMLFile("http://www.crawleddomain.com".$url);
                $finder2 = new DomXPath($dom2);
                $classname2 = "table-striped";
                $nodes2 = $finder2->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname2 ')]");
                foreach ($nodes2 as $node2) {
                    $rows2 = $finder2->query("//a[contains(@href, '/loremipsum')]", $node2);
                    foreach ($rows2 as $row2) {
                        $dom3 = new DomDocument();
                        //
                        // not so important variable declarations..
                        //
                        $dom3->loadHTMLFile("http://www.crawleddomain.com".$url);
                        $finder3 = new DomXPath($dom3);
                        //2 $finder3->query() right here
                        $query231 = mysql_query("SELECT id FROM tableName WHERE uid='$uid'");
                        $result = mysql_fetch_assoc($query231);
                        //Doing this to get category ID from another table, to insert with this row..
                        $id = $result['id'];
                        if (!in_array_r($uid, $all)) { // if not exist
                            mysql_query("INSERT INTO')"); // insert the whole bunch
                        }
                    }
                }
            }
        }
    }
}
$uid is not defined. Also, this query makes no sense:
mysql_query("INSERT INTO')");
You should turn on error reporting:
ini_set('display_errors',1);
error_reporting(E_ALL);
After your queries you should add or die(mysql_error()); so you can see what's failing.
Also, I might as well say it, if I don't someone else will. Don't use mysql_* functions. They're deprecated and will be removed from future versions of PHP. Try PDO.
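To make the PDO suggestion concrete, here is a minimal sketch of the crawler's lookup-then-insert step. SQLite in memory is used so the snippet runs standalone; for MySQL only the DSN changes (e.g. 'mysql:host=localhost;dbname=test'). The table and column names mirror the question's queries, and insertIfMissing() is a made-up helper:

```php
<?php
// In-memory SQLite so the sketch is self-contained; swap the DSN for MySQL.
$pdo = new PDO('sqlite::memory:', null, null, [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,  // throw instead of or-die()
]);
$pdo->exec('CREATE TABLE tableName (id INTEGER PRIMARY KEY, uid TEXT UNIQUE)');

// Prepared statements replace manual escaping of $uid in the query string.
function insertIfMissing(PDO $pdo, string $uid): bool
{
    $stmt = $pdo->prepare('SELECT id FROM tableName WHERE uid = :uid');
    $stmt->execute([':uid' => $uid]);
    if ($stmt->fetchColumn() !== false) {
        return false;                    // already crawled, skip it
    }
    $ins = $pdo->prepare('INSERT INTO tableName (uid) VALUES (:uid)');
    return $ins->execute([':uid' => $uid]);
}
```

Letting the UNIQUE index do the existence check this way is also far faster than scanning a PHP array with a recursive in_array() on every row.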
I am trying to get all the rows from a Google spreadsheet via a PHP/Zend script. This is the script I am using:
$service = Zend_Gdata_Spreadsheets::AUTH_SERVICE_NAME;
$client = Zend_Gdata_ClientLogin::getHttpClient('xxxxxxxxx', 'xxxxxxx', $service);
$spreadsheetService = new Zend_Gdata_Spreadsheets($client);

// Get spreadsheet key
$spreadsheetsKey = 'xxxxxxxxxxxxx';
$worksheetId = 'xxx';

// Get cell feed
$query = new Zend_Gdata_Spreadsheets_CellQuery();
$query->setSpreadsheetKey($spreadsheetsKey);
$query->setWorksheetId($worksheetId);
$cellFeed = $spreadsheetService->getCellFeed($query);

// Build an array of entries:
$ssRows = array();
$ssRow = array();
$titleRow = array();
$tableRow = 1;
foreach ($cellFeed as $cellEntry) {
    $row = $cellEntry->cell->getRow();
    $col = $cellEntry->cell->getColumn();
    $val = $cellEntry->cell->getText();
    // Add each row as a new array:
    if ($row != $tableRow) {
        array_push($ssRows, $ssRow);
        $ssRow = array();
        // Move to the next row / new array
        $tableRow = $row;
    }
    // Build the array of titles:
    if ($row == 1) {
        $titleRow[$col] = $val;
    }
    // Build the array of results with the title as the key:
    else {
        $key = $titleRow[$col];
        $ssRow[$key] = $val;
    }
}
// Pass the results array:
return array_reverse($ssRows);
This builds an array with MOST of the details from the spreadsheet, but it always misses the last entry. Can anyone see what I am doing wrong, or is there a better way to get all the data from the spreadsheet?
The form is a three-part form, based on different answers. On filling out one part, I want to display a URL back to the form with some details from the first part pre-filled, to make the second part faster to fill out. That all works fine; it is simply the missing last entry that is the major problem!
Thanks!
Your code works like this:
if (next_row) {
    data[] = current_row
    current_row = array();
}
if (first_row) {
    title_row logic
} else {
    add cell to current_row
}
So you only add the rows to your collector once you go to the next row. This will miss the last row because you'll miss that last transition.
The easy fix is to add array_push($ssRows, $ssRow); right after the foreach loop. You will need to add a check for zero rows, in which case that final push should be skipped.
Perhaps a more proper fix is to iterate by row, then by cell, rather than just by cell.
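The same row-grouping loop in miniature, detached from Zend (the cell triples are made up): rows are flushed only on the transition to the next row number, so the last row has to be flushed once more after the loop or it is lost.

```php
<?php
// [row, col, value] triples, in the order a cell feed would yield them.
$cells = [
    [1, 1, 'Name'], [1, 2, 'Age'],
    [2, 1, 'Ann'],  [2, 2, '34'],
    [3, 1, 'Bob'],  [3, 2, '41'],
];
$rows = [];
$current = [];
$lastRow = 1;
foreach ($cells as [$row, $col, $val]) {
    if ($row != $lastRow) {
        $rows[] = $current;      // flush the finished row
        $current = [];
        $lastRow = $row;
    }
    $current[$col] = $val;
}
if ($current) {
    $rows[] = $current;          // flush the final row; without this, Bob's row vanishes
}
print_r($rows);                  // 3 rows, including the last one
```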
I am trying to get a group of attributes from an XML file based on the ID. I have looked around and haven't found a solution that works.
Here is my XML:
<data>
<vico>2</vico>
<vis>
<vi>
<id>1</id>
<name>Vill1</name>
<att>2</att>
<hp>100</hp>
<xp>10</xp>
</vi>
<vi>
<id>2</id>
<name>Vill2</name>
<att>3</att>
<hp>120</hp>
<xp>12</xp>
</vi>
</vis>
</data>
What I am looking to do is create a script that takes the value of vico and does mt_rand(0,vico) (Already created this part of the code), and based on the number that comes up, it pulls that node.
For instance, if the number is 2, I would like it to pull all of the attributes of the node whose id in the XML file is 2. I am thinking I have to make the ID a parent of the other attributes, but I am not sure. For the life of me I can't figure this out. I also want to make sure this is even possible before I invest any more time.
I do realize this can be done very easily with MySQL, but I have chosen not to do it that way for portability, as I am working off a thumb drive. Any help would be greatly appreciated.
Working code for pulling based on ID:
$xml = simplexml_load_file("vi.xml")
    or die("Error: Cannot create object");
$vc = (int) $xml->vico;
$select = mt_rand(0, $vc);
if ($select == 0) {
    $select = $select + 1;
}
$result = $xml->xpath("/data/vis/vi/id[.='".$select."']/parent::*");
if ($result) {
    $node = $result[0];
    $name = $node->name;
    $att = $node->att;
    $hp = $node->hp;
    $xp = $node->xp;
} else {
    // nothing found for this id
}
Use XPath to retrieve the node whose id element matches your value:
$result = $xml->xpath("/data/vis/vi/id[.='".$select."']/parent::*");
if ($result) {
    $node = $result[0];
    $name = $node->name;
    $att = $node->att;
    $hp = $node->hp;
    $xp = $node->xp;
} else {
    // nothing found for this id
}
If you're using SimpleXML:
$name = $xml->vis->vi[$vico]->name;
$att = $xml->vis->vi[$vico]->att;
$hp = $xml->vis->vi[$vico]->hp;
$xp = $xml->vis->vi[$vico]->xp;
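One caveat with the SimpleXML indexing above: vi[$n] is zero-based, while the id values in the question's XML start at 1, so the rolled number has to be shifted by one (the XPath answer avoids this by matching on the id element itself). A quick check against the question's data:

```php
<?php
// The question's XML, inlined so the sketch runs standalone.
$source = '<data><vico>2</vico><vis>'
    . '<vi><id>1</id><name>Vill1</name><att>2</att><hp>100</hp><xp>10</xp></vi>'
    . '<vi><id>2</id><name>Vill2</name><att>3</att><hp>120</hp><xp>12</xp></vi>'
    . '</vis></data>';
$xml = simplexml_load_string($source);

$select = 2;                                    // the id we rolled
$name = (string) $xml->vis->vi[$select - 1]->name;   // index 1 holds id 2
echo $name . "\n";                              // Vill2
```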