In my spare time I wanted to build a website that tracks GPU prices on an e-commerce site. I am using PHP and the Simple HTML DOM library to parse the target HTML, and a cron job runs the script every hour.
(Yes, I know I could use Selenium or other tools to scrape data more efficiently, but I am doing it this way to challenge myself while learning.)
How it works: the script grabs the data and stores it in the database. Then, in another table, it compares the new data with what is already stored. When the new price of a GPU is the same as the latest price, it only updates the date and time. If the new price differs from the latest price, it moves the latest price to the old price and updates a few other columns.
The scraping code is written for one specific e-commerce website.
The variable placement is still a bit scattered because I was trying other things.
It grabs data every hour and logs a run time of 40-50 seconds on average, which I assume is the processing time.
My question is: how can I make the code more efficient than my current method?
This is the code to grab the data:
<?php
error_reporting(E_ALL ^ E_WARNING);
require_once 'simple_html_dom.php';
// Database variables here
// ...
try {
$conn = new PDO("mysql:host=$servername;$dbname", $username, $password);
// set the PDO error mode to exception
$conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// Get the URL List
$stmt = $conn->prepare("SELECT id,url FROM url_list");
$stmt->execute();
$url_list = $stmt->fetchAll(PDO::FETCH_COLUMN|PDO::FETCH_UNIQUE);
} catch(PDOException $e) {
echo "Connection failed: " . $e->getMessage();
}
// Scrap the data from a website then return as array
function get_gpu_info(string $targeturl, int $gpu_id)
{
$results = array();
$html = new simple_html_dom();
$html->load_file($targeturl);
if (!empty($html)) {
$div_class = $price = $stock = "";
$div_class = $html->find("#main-pdp-container", 0);
$out_of_stock = $html->find(".css-1igct5v-unf-quantity-editor__input[disabled]", 0);
$price = $div_class->find(".price", 0)->innertext;
$price_int = intval(preg_replace('/[^\d\,]+/', '', $price));
$stock = ($div_class->find(".css-1a29oke p b", 0)->innertext) ?: 0;
if (!empty($price)) {
$results = array(
'GPUID' => $gpu_id,
'PRICE' => $price,
'PRICEINT' => $price_int,
'STOCK' => $stock
);
} else {echo "Price not found";}
} else {echo "URL Not Found";}
return $results;
}
// Scrap every single data from the URL list found
$gpu_data = array_map('get_gpu_info', array_values($url_list), array_keys($url_list));
try {
$time = date("H:i:s");
$date = date("Y-m-d");
$stmt = $conn->prepare("INSERT INTO price_history (gpu_id, price, price_int, stock, update_time, update_date)
VALUES (:insert_gpu_id, :insert_price, :insert_price_int, :insert_stock, :insert_update_time, :insert_update_date)");
$stmt->bindParam(':insert_gpu_id', $insert_gpu_id);
$stmt->bindParam(':insert_price', $insert_price);
$stmt->bindParam(':insert_price_int', $insert_price_int);
$stmt->bindParam(':insert_stock', $insert_stock);
$stmt->bindParam(':insert_update_time', $time);
$stmt->bindParam(':insert_update_date', $date);
foreach ($gpu_data as $data => $val) {
$insert_gpu_id = $val['GPUID'];
$insert_price = $val['PRICE'];
$insert_price_int = $val['PRICEINT'];
$insert_stock = $val['STOCK'];
$stmt->execute();
$stmt2 = $conn->prepare("SELECT COUNT(gpu_id) FROM gpu_data WHERE gpu_id = :gpu_id");
$stmt2->bindValue(':gpu_id', $val['GPUID'], PDO::PARAM_INT);
$stmt2->execute();
$count = (int)$stmt2->fetchColumn();
if($count) {
$stmt4 = $conn->prepare("SELECT old_price, old_price_int, latest_price, latest_price_int, latest_update_time, latest_update_date FROM gpu_data WHERE gpu_id = :gpu_id");
$stmt4->bindParam(':gpu_id', $val['GPUID']);
$stmt4->execute();
$old_data = $stmt4->fetch(PDO::FETCH_ASSOC);
$old_price_int = $old_data['old_price_int'];
$old_latest_price_int = $old_data['latest_price_int'];
$old_price = $old_data['old_price'];
$get_date = $old_data['latest_update_date'];
$get_time = $old_data['latest_update_time'];
$combined_old_date_time = date('Y-m-d H:i:s', strtotime("$get_date $get_time"));
if($old_price_int == $insert_price_int) {
//print_r("Same price");
$stmt3 = $conn->prepare("UPDATE gpu_data SET
stock = :stock,
latest_update_time = :update_time,
latest_update_date = :update_date
WHERE gpu_id = :gpu_id");
} else {
//print_r("Different price");
$stmt3 = $conn->prepare("UPDATE gpu_data SET
old_price = :old_price,
old_price_int = :old_price_int,
old_datetime = :old_datetime,
latest_price = :price,
latest_price_int = :price_int,
stock = :stock,
latest_update_time = :update_time,
latest_update_date = :update_date
WHERE gpu_id = :gpu_id");
$stmt3->bindParam(':old_price', $old_price);
$stmt3->bindParam(':old_price_int', $old_price_int);
$stmt3->bindParam(':old_datetime', $combined_old_date_time);
$stmt3->bindParam(':price', $insert_price);
$stmt3->bindParam(':price_int', $insert_price_int);
print_r("Old price updated");
}
$stmt3->bindParam(':update_time', $time);
$stmt3->bindParam(':update_date', $date);
$stmt3->bindParam(':stock', $insert_stock);
$stmt3->bindParam(':gpu_id', $val['GPUID']);
$stmt3->execute();
//print_r("GPU Data with the same record found and has been updated");
} else {//print_r("ERROR: No GPU Data with that GPU ID has been found");
}
}
//print_r("Price record/s updated successfully");
} catch(PDOException $e) {
echo "Database error: " . $e->getMessage();}
$conn = null;
?>
Thanks in advance!
It's likely you're taking a lot of time to load each page you're scraping. Probably some pages are a lot slower than others. Try doing something like this, to time your load_file() operations, to figure that out.
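$loadStartTime = microtime(true); // float seconds; date() needs a format string and cannot be subtracted
$html->load_file($targeturl);
$loadEndTime = microtime(true);
echo $targeturl . ': ' . round($loadEndTime - $loadStartTime, 2) . ' seconds to load.' . PHP_EOL;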
Your dom-romping code looks straightforward enough.
It seems doubtful you have many thousands of rows in your table, so your database stuff should be fast enough.
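To see which pages are the slow ones, the timing idea above can be collected per URL and sorted. This is only a minimal sketch: `time_page_loads()` and the stand-in loader are hypothetical names, and the `example.com` URLs are placeholders. In the real scraper the loader would wrap the `load_file()` call.

```php
<?php
// Hypothetical helper: times a loader callable for each URL, slowest first.
function time_page_loads(array $urls, callable $loader): array
{
    $timings = [];
    foreach ($urls as $url) {
        $start = microtime(true);               // float seconds, sub-second resolution
        $loader($url);
        $timings[$url] = microtime(true) - $start;
    }
    arsort($timings);                           // sort descending by duration, keep URL keys
    return $timings;
}

// Stand-in loader so the sketch runs without network access. In the scraper this
// would be something like: function ($url) { (new simple_html_dom())->load_file($url); }
$fakeLoader = function (string $url): void {
    usleep($url === 'https://example.com/slow' ? 20000 : 1000);
};

$timings = time_page_loads(
    ['https://example.com/fast', 'https://example.com/slow'],
    $fakeLoader
);
foreach ($timings as $url => $seconds) {
    printf("%s: %.3f seconds to load\n", $url, $seconds);
}
```

If the per-page times add up to roughly the 40-50 seconds the cron job logs, the time is going into fetching pages rather than into the database work.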
I need some help.
Is there a way to do this in PDO? https://stackoverflow.com/a/1899508/6208408
Yes, I know I could switch to MySQL, but I use an MSSQL server and can't use MySQL. I tried a few things, but I'm not as good with PDO as I am with mysql... It's hard to find good examples of inserting arrays into a database with PDO. In short, I have PDO-based code connected to an MSSQL server.
Best regards, Joep
I tried this before:
//id
$com_id = $_POST['com_id'];
//array
$mon_barcode = $_POST['mon_barcode'];
$mon_merk = $_POST['mon_merk'];
$mon_type = $_POST['mon_type'];
$mon_inch = $_POST['mon_inch'];
$mon_a_date = $_POST['mon_a_date'];
$mon_a_prijs = $_POST['mon_a_prijs'];
$data = array_merge($mon_barcode, $mon_merk, $mon_type, $mon_inch, $mon_a_date, $mon_a_prijs);
try{
$sql = "INSERT INTO IA_Monitor (Com_ID, Barcode, Merk, Type, Inch, Aanschaf_dat, Aanschaf_waarde) VALUES (?,?,?,?,?,?,?)";
$insertData = array();
foreach($_POST['mon_barcode'] as $i => $barcode)
{
$insertData[] = $barcode;
}
if (!empty($insertData))
{
implode(', ', $insertData);
$stmt = $conn->prepare($sql);
$stmt->execute($insertData);
}
}catch(PDOException $e){
echo $sql . "<br>" . $e->getMessage();
}
$conn = null;
The code below should fix your problems.
$db_username='';
$db_password='';
$conn = new \PDO("sqlsrv:Server=localhost,1521;Database=testdb", $db_username, $db_password,[]);
//above added per @YourCommonSense's request to provide a complete example to a code fragment
if (isset($_POST['com_id'])) { //was com_id posted?
//id
$com_id = $_POST['com_id'];
//array
$mon_barcode = $_POST['mon_barcode'];
$mon_merk = $_POST['mon_merk'];
$mon_type = $_POST['mon_type'];
$mon_inch = $_POST['mon_inch'];
$mon_a_date = $_POST['mon_a_date'];
$mon_a_prijs = $_POST['mon_a_prijs'];
$sql = "INSERT INTO IA_Monitor (Com_ID, Barcode, Merk, Type, Inch, Aanschaf_dat, Aanschaf_waarde) VALUES (?,?,?,?,?,?,?)";
try {
$stmt = $conn->prepare($sql);
foreach ($mon_barcode as $i => $barcode) {
$stmt->execute([$com_id, $barcode, $mon_merk[$i], $mon_type[$i], $mon_inch[$i], $mon_a_date[$i], $mon_a_prijs[$i]]);
}
} catch (\PDOException $e) {
echo $sql . "<br>" . $e->getMessage();
}
}
$conn = null;
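If the rows should be saved all together or not at all, the loop above can be wrapped in a transaction. The following is a sketch under assumptions: it uses an in-memory SQLite database as a stand-in so it runs on its own, a shortened column list, and made-up form data in place of `$_POST`. The same `beginTransaction()` / `commit()` / `rollBack()` calls are part of PDO itself, so the pattern should carry over to the `sqlsrv` connection.

```php
<?php
// Stand-in connection (assumption): in-memory SQLite instead of the MSSQL server.
$conn = new PDO('sqlite::memory:');
$conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$conn->exec('CREATE TABLE IA_Monitor (Com_ID INTEGER, Barcode TEXT, Merk TEXT)');

// Hypothetical form data shaped like the parallel $_POST arrays.
$com_id      = 7;
$mon_barcode = ['A1', 'B2'];
$mon_merk    = ['Dell', 'LG'];

// Prepare once, execute once per row.
$stmt = $conn->prepare('INSERT INTO IA_Monitor (Com_ID, Barcode, Merk) VALUES (?,?,?)');

try {
    $conn->beginTransaction();
    foreach ($mon_barcode as $i => $barcode) {
        // The shared index $i ties together the values that belong to one row.
        $stmt->execute([$com_id, $barcode, $mon_merk[$i] ?? null]);
    }
    $conn->commit();
} catch (PDOException $e) {
    if ($conn->inTransaction()) {
        $conn->rollBack();                      // undo any rows inserted so far
    }
    echo 'Insert failed: ' . $e->getMessage();
}

$rowCount = (int) $conn->query('SELECT COUNT(*) FROM IA_Monitor')->fetchColumn();
echo $rowCount . " rows inserted\n";
```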
Updating is way too slow:
//class for Database
class MyDB extends SQLite3
{
function __construct()
{
$this->open('/database.db');
}
}
//new db - Object
$db = new MyDB();
if(!$db){
echo $db->lastErrorMsg();
} else {
echo "Opened database successfully\n";
}
This code updates the database and gives me begin_time and length:
//BEGIN
$db->exec('BEGIN;');
//prepare update:
$smt2 = $db->prepare("UPDATE users SET username = :username, full_name = :full_name, is_private = :is_private, is_follower = 1, updated_on = :time, was_follower = NULL WHERE user_id = :usernameId");
//bind parameter
$smt2->bindParam(':usernameId', $usernameId);
$smt2->bindParam(':username', $username);
$smt2->bindParam(':full_name', $full_name);
$smt2->bindParam(':is_private', $is_private);
$smt2->bindParam(':time', $time);
//Prepare second update
$smt3 = $db->prepare("UPDATE users SET followed_on = IfNull(followed_on, :time) WHERE user_id = :usernameId");
//bind parameter
$smt3->bindParam(':usernameId', $usernameId);
$smt3->bindParam(':time', $time);
try {
echo "start\n";
$time_begin = time();
echo $time_begin;
//LOOP
foreach ($followers as $follower) {
$usernameId = $follower->getUsernameId();
$username = $follower->getUsername();
$full_name = $follower->getFullName();
$ProfilPicUrl = $follower->getProfilePicUrl();
$is_private = $follower->isPrivate();
//must be 0, but at the moment it is just '' when false.
if(strcmp($is_private, 1) !== 0){
$is_private = 0;
}
$time = time();
//Function, which returns rows, how often an entry exists in db (can only be 0 or 1)
$existence = item_exists($db, 'users', 'user_id', $usernameId);
if($existence)
{
//EXECUTE first update
$smt2->execute();
if(!$smt2){
echo $db->lastErrorMsg();
}
//EXECUTE second update
$smt3->execute();
if(!$smt3){
echo $db->lastErrorMsg();
}
}
}
//COMMIT
$db->exec('COMMIT;');
//TIME
$time_diff = time() - $time_begin;
echo "END: ". $time_diff . "\n";
}
catch (Exception $e) {
echo $e->getMessage();
}
I already use "BEGIN", "COMMIT" and "prepare", but for an array with 10,300 entries it still takes 173 seconds, whereas inserting an array with 100,000 entries took 8 seconds! What makes the update statement in this code so slow?
I merged the two update statements into one:
$smt2 = $db->prepare("EXPLAIN UPDATE users SET followed_on = IfNull(followed_on, :time), username = :username, full_name = :full_name, is_private = :is_private, is_follower = 1, updated_on = :time, was_follower = NULL WHERE user_id = :usernameId");
It still takes 87 seconds.
Output of "EXPLAIN QUERY PLAN":
0|0|0|SCAN TABLE users
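"SCAN TABLE users" means SQLite reads the whole table for every single UPDATE, which suggests there is no index on user_id. The sketch below shows how the plan changes once such an index exists. It assumes an in-memory database and a cut-down `users` table as stand-ins, and the index name `idx_users_user_id` and the helper `update_plan()` are made up for illustration. If user_id is meant to be unique, declaring it as a primary key or with a UNIQUE constraint would have the same effect.

```php
<?php
// Stand-in database (assumption): in-memory instead of /database.db.
$db = new SQLite3(':memory:');
$db->exec('CREATE TABLE users (user_id INTEGER, username TEXT)');
$db->exec("INSERT INTO users (user_id, username) VALUES (1, 'a'), (2, 'b')");

// Hypothetical helper: returns the query plan text for an UPDATE filtered by user_id.
function update_plan(SQLite3 $db): string
{
    $result  = $db->query("EXPLAIN QUERY PLAN UPDATE users SET username = 'x' WHERE user_id = 1");
    $details = [];
    while ($row = $result->fetchArray(SQLITE3_ASSOC)) {
        $details[] = $row['detail'];            // human-readable plan step
    }
    return implode('; ', $details);
}

$planBefore = update_plan($db);                 // full table scan

// With the index, the row is located by lookup instead of by scanning.
$db->exec('CREATE INDEX IF NOT EXISTS idx_users_user_id ON users (user_id)');

$planAfter = update_plan($db);
echo "before: $planBefore\nafter:  $planAfter\n";
```

The per-row existence check (`item_exists()`) presumably filters on the same column, so it would likely benefit from the index as well.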
I'm making a simple website for a class, and I am trying to save information to my database. The error is not very specific and I do not know which part of my code I need to fix.
Error message:
check the manual that corresponds to your MariaDB server version for
the right syntax to use near ')' at line 2
My PHP code:
<?php
include 'mysqli.php' ;
$result = $con->query("select * from setList s
left join songTable t on s.SetList_ID = t.Song_ID
left join bands b on s.SetList_ID = b.Band_ID");
if ($_SERVER['REQUEST_METHOD'] == 'POST') {
$setList = $_POST['setlist'];
$venue = $_POST['venue'];
$date = $_POST['dateOfShow'];
$band= $_POST['band'];
$set = $result->fetch_object();
//error handling and form
try {
if (empty($setList) || empty($venue) || empty($date) || empty($band)) {
throw new Exception(
"All Fields Required");
}
if (isset($set)) {
$id = $set->SetList_ID;
$q = "update setList set SetList_Name = '$setList',
Venue = '$venue', Show_Date = $date, Band_Name = '$band')";
}
else{
$q = "insert setList (SetList_Name, Venue, Show_Date, Band_Name)
values ('$setList', '$venue', $date, '$band')";
}
$result = $con->query($q);
if (!$result) {
throw new Exception($con->error);
}
header('Location:my_set-lists.php');
} catch(Exception $e) {
echo '<p class ="error">Error: ' .
$e->getMessage() . '</p>';
}
}
?>
The error message tells you exactly where the problem is; you have an extra ). Replace
$q = "update setList set SetList_Name = '$setList',
Venue = '$venue', Show_Date = $date, Band_Name = '$band')";
// extra ) is here ---------------------------------------------^
With
$q = "update setList set SetList_Name = '$setList',
Venue = '$venue', Show_Date = $date, Band_Name = '$band'";
Note: your next query (starting insert setList) omits INTO. MySQL and MariaDB accept that, but INSERT INTO setList... is the usual form and is easier to read. A decent IDE (like PHPStorm) would catch syntax errors such as the extra ) for you.
Also, you are wide open to SQL injection. You really need to be using prepared statements.
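As a rough illustration of what prepared statements look like, here is a minimal sketch. It uses PDO with an in-memory SQLite database as a stand-in so it runs on its own; the table layout and the sample values are assumptions. With mysqli the equivalent calls are `prepare()`, `bind_param()` and `execute()`. The sketch also adds a WHERE clause to the UPDATE, since without one every row in the table is updated.

```php
<?php
// Stand-in connection (assumption): in-memory SQLite instead of MariaDB.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE setList (
    SetList_ID INTEGER PRIMARY KEY,
    SetList_Name TEXT, Venue TEXT, Show_Date TEXT, Band_Name TEXT
)');

// Hypothetical user input, including quotes that would break an interpolated query.
$setList = "Summer '24 Tour";
$venue   = "O'Malley's";
$date    = '2024-06-01';
$band    = 'The Examples';

// Values travel separately from the SQL text, so quoting is never an issue.
$insert = $pdo->prepare(
    'INSERT INTO setList (SetList_Name, Venue, Show_Date, Band_Name) VALUES (?, ?, ?, ?)'
);
$insert->execute([$setList, $venue, $date, $band]);
$id = (int) $pdo->lastInsertId();

// The WHERE clause limits the update to the one row being edited.
$update = $pdo->prepare('UPDATE setList SET Venue = ? WHERE SetList_ID = ?');
$update->execute(['New Venue', $id]);

$savedVenue = $pdo->query('SELECT Venue FROM setList')->fetchColumn();
echo $savedVenue . "\n";
```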
I am trying to replace previous entries in a MySQL database each time new data is available. I have the following PHP code, but it seems to add new entries each time. Please help, thanks.
I have tried using REPLACE, but it still does not work. Could anyone tell me what I am doing wrong?
<?php
header('Content-Type: application/json');
$data = json_decode(file_get_contents('php://input'), true);
$mysqli = new mysqli("localhost","dbuser","Pa55uu0Rd","iewdb");
if (mysqli_connect_errno())
{
echo json_encode(array('error' => 'Failed to connect to MySQL: ' . mysqli_connect_error() ));
return;
}
if(!$data)
{
echo json_encode(array('error' => 'Error input data'));
return;
}
$usernme = $data['usernme'];
$longitude = $data['longitude'];
$latitude = $data['latitude'];
$user = $mysqli->query("SELECT id FROM Users WHERE usernme = '$usernme' LIMIT 1");
$user_id = $user->fetch_object();
if(!$user_id)
{
$mysqli->query("INSERT INTO Users (usernme) VALUES ('$usernme');");
$user_id->id = $mysqli->insert_id;
}
if($longitude && $latitude)
{
$mysqli->query("REPLACE INTO Locations (User_id,Longitude, Latitude) VALUES ($user_id->id,$longitude,$latitude);");
}
$mysqli->close();
echo json_encode(array('user_id' => $user_id->id));
Use an UPDATE query, something like this:
UPDATE MyTable
SET User_id = 'USER_ID_VALUE', Longitude='LONGITUDE_VALUE', Latitude='LATITUDE_VALUE'
WHERE SomeOtherColumn LIKE '%PATTERN%'
Logic: instead of replacing the old entry, you can delete the old entries and then add fresh entries to the database, which should also be fine for performance.
That way you only have to write one DELETE and one INSERT query instead of 3 queries.
Here is my solution to the problem and it works just fine. I decided to go with UPDATE as you can see below as I thought it was tidiest, thanks for the help.
<?php
header('Content-Type: application/json');
//get parameters
$data = json_decode(file_get_contents('php://input'), true);
// Create connection
$mysqli = new mysqli("localhost","dbuser","Pa55w0rd","ewdb");
// Check connection
if (mysqli_connect_errno())
{
echo json_encode(array('error' => 'Failed to connect to MySQL: ' . mysqli_connect_error() ));
return;
}
if(!$data)
{
echo json_encode(array('error' => 'Error input data'));
return;
}
$usernme = $data['usernme'];
$longitude = $data['longitude'];
$latitude = $data['latitude'];
$user = $mysqli->query("SELECT id FROM Users WHERE usernme = '$usernme' LIMIT 1");
$user_id = $user->fetch_object();
if(!$user_id)
{
$mysqli->query("INSERT INTO Users (usernme) VALUES ('$usernme');");
$user_id = new stdClass(); // fetch_object() returned null, so create the holder object first
$user_id->id = $mysqli->insert_id;
$mysqli->query("INSERT INTO Locations (User_id) VALUES ($user_id->id);");
}
if($longitude && $latitude)
{
$mysqli->query("UPDATE Locations SET Longitude = $longitude, Latitude = $latitude WHERE User_id = $user_id->id;");
}
/* close connection */
$mysqli->close();
echo json_encode(array('user_id' => $user_id->id));
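On why REPLACE kept adding rows: REPLACE only removes an existing row when the new row collides with it on a PRIMARY KEY or UNIQUE index; otherwise it behaves like a plain INSERT. A likely cause, then, is that Locations has no unique key on User_id (an assumption, since the table definition is not shown). The sketch below demonstrates the behaviour with an in-memory SQLite database as a stand-in, which accepts the same REPLACE INTO syntax; the coordinates are made up.

```php
<?php
// Stand-in connection (assumption): in-memory SQLite instead of MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// The UNIQUE constraint is what lets REPLACE find and replace the existing row.
$pdo->exec('CREATE TABLE Locations (User_id INTEGER UNIQUE, Longitude REAL, Latitude REAL)');

$replace = $pdo->prepare(
    'REPLACE INTO Locations (User_id, Longitude, Latitude) VALUES (?, ?, ?)'
);
$replace->execute([1, 4.89, 52.37]);            // first position for user 1
$replace->execute([1, 5.12, 52.09]);            // newer position, same user

$rows = $pdo->query('SELECT User_id, Longitude, Latitude FROM Locations')
            ->fetchAll(PDO::FETCH_ASSOC);
echo count($rows) . " row(s)\n";
```

Without the UNIQUE constraint the second `execute()` would simply add a second row, which matches the behaviour described in the question.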
This is the query code:
if (isset($_POST['moduleAction']) && ($_POST['moduleAction'] == 'edit')) {
$date = date('Y-m-d H:i:s', time());
$stmt = $db->prepare('UPDATE todolist SET what = ?, priority = ?, added_on = ? WHERE id = ?');
$stmt->execute(array($what, $priority + 1, $date, $id));
}
My db connection:
<?php
try {
$db = new PDO('mysql:host=' . DB_HOST .';dbname=' . DB_NAME . ';charset=utf8mb4', DB_USER, DB_PASS);
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
} catch (Exception $e) {
showDbError('connect', $e->getMessage());
}
The query is not executed on the db. On another page in the same document I am executing queries against the same db without problems. I've tried executing it without a prepared statement, with double quotes, restarting the connection... nothing works.
Can anyone push me in the right direction?
EDIT
Setting variables:
$priorities = array('low','normal','high'); // The possible priorities of a todo
$formErrors = array(); // The encountered form errors
$id = isset($_GET['id']) ? (int) $_GET['id'] : 0; // The passed in id of the todo
$what = isset($_POST['what']) ? $_POST['what'] : ''; // The todo that was sent in via the form
$priority = isset($_POST['priority']) ? $_POST['priority'] : 'low'; // The priority that was sent in via the form
You should always check for errors; there are many reasons a query can fail to work as expected.
Here is one way to check:
if(isset($_POST['moduleAction']) && ($_POST['moduleAction'] == 'edit')) {
$date = date('Y-m-d H:i:s', time());
$query ='UPDATE todolist SET what = ?, priority = ?, added_on = ? WHERE id = ?';
if($stmt = $db->prepare($query)){
if($stmt->execute(array($what, $priority + 1, $date, $id))){
echo 'execute() successful';
if($stmt->rowCount() > 0){
echo 'Affected a row';
}else{
echo 'No row affected';
}
}else{
echo 'execute() error:';
die(implode(' ', $stmt->errorInfo())); // errorInfo() returns an array, so join it for output
}
}else{
echo 'prepare() error:';
die(implode(' ', $db->errorInfo())); // $db is the connection created in the question
}
}
Edit
One more thing: $priority + 1 seems a little odd.
After your update I see this line:
$priority = isset($_POST['priority']) ? $_POST['priority'] : 'low';
So you are trying to increment a string by 1?
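If the column is meant to hold a number from 1 to 3, one option is to translate the posted value explicitly before it reaches the query. This is only a sketch: that numbering scheme is an assumption, and `priority_to_int()` is a hypothetical helper. It accepts either a name from `$priorities` or a numeric index sent by the form, and falls back to 'low' for anything else.

```php
<?php
$priorities = ['low', 'normal', 'high'];        // same list as in the question

// Hypothetical helper: maps a posted priority to 1 (low) .. 3 (high).
function priority_to_int(string $priority, array $priorities): int
{
    // Case 1: the form sent an index such as "2".
    if (ctype_digit($priority) && isset($priorities[(int) $priority])) {
        return (int) $priority + 1;
    }
    // Case 2: the form sent a name such as "high"; unknown names fall back to 'low'.
    $index = array_search($priority, $priorities, true);
    return ($index === false ? 0 : $index) + 1;
}

echo priority_to_int('high', $priorities) . "\n";
```

The result can then be passed to `execute()` in place of `$priority + 1`.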
Anyway, what happened to traditional debugging?
$sql_debug = "UPDATE todolist
SET what = '$what', priority = '$priority', added_on = '$date'
WHERE id = $id";
echo "**************************************<br>";
echo $sql_debug."<br>";
echo "**************************************<br>";
error_log('sql = '.$sql_debug);
Take a look at the generated query,
and run it by hand to see what happens.
I think the update query is correct. Please check the date-time format; try this code:
if (isset($_POST['moduleAction']) && ($_POST['moduleAction'] == 'edit')) {
$date=date_create("2014-10-09");
date_time_set($date,13,24,46);
$datetime =date_format($date,"Y-m-d H:i:s");
$stmt = $db->prepare('UPDATE todolist SET what = ?, priority = ?, added_on = ? WHERE id = ?');
$stmt->execute(array($what, $priority + 1, $datetime, $id));
}
If your query is not executed and you get no error, I would say that something is wrong with this:
if(isset($_POST['moduleAction']) && ($_POST['moduleAction'] == 'edit')) {
Make sure moduleAction is set in your POST array and is really equal to 'edit'.
Hope this helps