I'm making a form with a text input box for an inventory list. I type everything into the form, and the script breaks each line down into fields to be stored in SQL (I'm not at the SQL part yet).
Input would look something like this:
5-1 1/2 black 90° sch 40 (have 10 of the sch 80 ones)
amount - size - fitting name (comments)
I have everything in its own variable now; I just need to remove the "(comments)" part from the fitting name. str_replace should do the trick, but it only seems to work some of the time, and I'm not sure why. The picture below shows that it only worked 2 times. As for my regex, I'm really bad at them, lol. Thanks for any help you can give with my little problem.
foreach ($components as $value) {
    $value=stripslashes($value);
    // get rid of empty lines
    if (empty($value)) { continue; }
    // will Split the amount of fittings from the value.
    $quantity = explode("-", $value);
    //Will split the name of the fitting and also give the size of the fittings.
    $test=preg_split('/([a-w]|[A-W])/', $quantity[1],2, PREG_SPLIT_DELIM_CAPTURE);
    //split the comments from the fitting name.
    $comments = explode("(", $test[2]);
    //removes the remaining ) from the left side
    $comments =preg_replace("/\)/", "" , $comments[1]);
    if(!empty($comments)) {
        $fitting_name = str_replace("($comments)","", $test[1].$test[2]);
    }else{
        $fitting_name = $test[1].$test[2];
    }
    // The table format is just to make sure everything is working right before inputting into SQL
    echo "
    <tr>
    <td>$value</td>
    <td>". $quantity[0] ."</td>
    <td>$test[0]</td>
    <td>$fitting_name</td>
    <td>$comments </td>
    </tr>
    <tr>
    ";
}
echo "</tr>
</table>";
}
Input
0-1/2 PP union
1- 1 1/2x1 1/4 copper PP fitting red
9- 1 1/2" copper PP 90°
5-2" copper PP 90° (have 10 of the sch 80 ones)
13-2" copper PP Tee (have 10 of the sch 80 ones)
10-1*1*3/4 copper PP tee
60- 3/4" PP cap and chain value
50 - 3/4" PP value (we only have 4 more)
19- 3/4" threaded cap and chain value
0-2" threaded value
0- 2 1/2" threaded value (have 10 of the sch 80 ones)
5- 3/4" black Street 90°
0 - 3/4 black union
0- 1" black union
0-1" black tee
1 - 1 1/2 black union
6-1 1/4 black cap
7-1 1/2" * 1" black bushing
3 - 1 1/2 black coupling
5-1 1/2 black 90° sch 40? (have 10 of the sch 80 ones)
4 - 3/8" rod 6'
4-3/8" rod 10'
6 - 1/2" rod 6'
0-5/8" rod 6"
0-5/8" rod 10'
0-3/4 rod 6'
2 - 1" rod 6'(have 10 of the sch 80 ones)
I would remove the comment first, before doing any other parsing, to clean up the input.
You can match it with a regex:
\s*\(.*?\)
\s* matches zero or more whitespace characters
\( matches a literal opening parenthesis
.*? matches any characters, as few as possible (lazy match)
\) matches a literal closing parenthesis
Now you can replace the match with an empty string:
<?php
$input = [
"0-1/2 PP union ",
"1- 1 1/2x1 1/4 copper PP fitting red",
"9- 1 1/2\" copper PP 90°",
"5-2\" copper PP 90° (have 10 of the sch 80 ones) ",
"13-2\" copper PP Tee (have 10 of the sch 80 ones) ",
"10-1*1*3/4 copper PP tee",
"",
"",
"60- 3/4\" PP cap and chain value ",
"50 - 3/4\" PP value (we only have 4 more)",
"19- 3/4\" threaded cap and chain value ",
"0-2\" threaded value ",
"0- 2 1/2\" threaded value (have 10 of the sch 80 ones) ",
"",
"",
"5- 3/4\" black Street 90°",
"0 - 3/4 black union ",
"0- 1\" black union ",
"0-1\" black tee ",
"1 - 1 1/2 black union ",
"6-1 1/4 black cap",
"7-1 1/2\" * 1\" black bushing",
"3 - 1 1/2 black coupling ",
"5-1 1/2 black 90° sch 40? (have 10 of the sch 80 ones) ",
"",
"",
"",
"4 - 3/8\" rod 6'",
"4-3/8\" rod 10'",
"6 - 1/2\" rod 6'",
"0-5/8\" rod 6\"",
"0-5/8\" rod 10'",
"0-3/4 rod 6'",
"2 - 1\" rod 6'(have 10 of the sch 80 ones) "
];
foreach ($input as $value)
{
    $newValue = preg_replace("#\s*\(.*?\)#", "", $value);
    echo $newValue . PHP_EOL;
}
Output:
0-1/2 PP union
1- 1 1/2x1 1/4 copper PP fitting red
9- 1 1/2" copper PP 90°
5-2" copper PP 90°
13-2" copper PP Tee
10-1*1*3/4 copper PP tee
60- 3/4" PP cap and chain value
50 - 3/4" PP value
19- 3/4" threaded cap and chain value
0-2" threaded value
0- 2 1/2" threaded value
5- 3/4" black Street 90°
0 - 3/4 black union
0- 1" black union
0-1" black tee
1 - 1 1/2 black union
6-1 1/4 black cap
7-1 1/2" * 1" black bushing
3 - 1 1/2 black coupling
5-1 1/2 black 90° sch 40?
4 - 3/8" rod 6'
4-3/8" rod 10'
6 - 1/2" rod 6'
0-5/8" rod 6"
0-5/8" rod 10'
0-3/4 rod 6'
2 - 1" rod 6'
Try it yourself
Please try changing the following line
$fitting_name = str_replace("($comments)","", $test[1].$test[2]);
to
$fitting_name = str_replace("(" . trim($comments).")","", $test[1].$test[2]);
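and see the effect. If an input line has trailing whitespace after the closing parenthesis, $comments keeps that whitespace once the ) is removed, so "($comments)" never matches the original text; trim() strips it. Please let us know the result.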
I want to scrape data from this website: http://demo.istat.it/bilmens2012gen/index02.html
On the left there's a web form that passes parameters to a PHP page, which in turn outputs the resulting HTML tables in a frame on the same page.
The first drop-down list has 107 cities and the second has 12 months, so I would have to run 1,284 queries manually to collect the data I want.
Any suggestions for automating this process?
I have used R and the rvest library to scrape static HTML tables, but since these tables are generated from the form parameters I don't know how to proceed. I wish I could pass each combination of parameters (like "city1", "month1"), retrieve the HTML, and later join the data.
This is a fairly straightforward scraping job. When you select buttons on the page, the browser just requests some html from the server and puts it into the main frame. The request is just encoded in the url in this format:
                                                     Province (1 - 107)  Period (1 - 12)
                                                              |                 |
                                                              v                 v
http://demo.istat.it/bilmens2012gen/query1.php?lingua=ita&Pro=1&allrp=4&periodo=1&submit=Tavola
So you can do this to get all the urls:
urls <- do.call("c",
  lapply(1:107,
    function(x) paste0("http://demo.istat.it/bilmens2012gen/",
                       "query1.php?lingua=ita&Pro=", x,
                       "&allrp=4&periodo=", 1:12,
                       "&submit=Tavola")
  )
)
Of course, you still need to scrape the data from these pages. Here's an example of a function that will get the data from each link:
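library(rvest)  # provides html_nodes(), html_table() and the %>% pipe

get_table <- function(url)
{
  # the data sits in the second <table> on the page
  df <- xml2::read_html(url) %>%
    html_nodes("table") %>%
    `[`(2) %>% html_table()
  df <- df[[1]]
  # rows whose first cell is "CodiceComune" delimit the data rows
  breaks <- which(df[,1] == "CodiceComune")
  output <- df[(breaks[1] + 2):(breaks[2] - 1),]
  # column names are built from the first two rows
  output <- setNames(output, paste(df[1,], df[2,]))
  for(i in 3:8) output[[i]] <- as.numeric(as.character(output[[i]]))
  dplyr::as_tibble(output)
}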
So I can get the first period of the first province like this:
get_table(urls[1])
#> # A tibble: 315 x 11
#> `CodiceComune T~ `Comuni Totale` `Popolazioneini~ `Nati Vivi Tota~ `Morti Totale`
#> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 001269 Strambino 6314 1 5
#> 2 001270 Susa 6626 2 10
#> 3 001271 Tavagnasco 812 0 1
#> 4 001272 Torino 869312 749 1011
#> 5 001273 Torrazza Piemo~ 2833 2 4
#> 6 001274 Torre Canavese 592 1 1
#> 7 001275 Torre Pellice 4514 4 8
#> 8 001276 Trana 3877 2 5
#> 9 001277 Trausella 132 0 1
#> 10 001278 Traversella 351 0 0
#> # ... with 305 more rows, and 6 more variables: `SaldoNaturale Totale` <dbl>, `Iscritti
#> # Totale` <dbl>, `Cancellati Totale` <dbl>, `Saldomigratorio e per altri motivi Totale` <chr>,
#> # `Unità inpiù/menodovute avariazioniterritoriali Totale` <chr>, `Popolazionefine periodo
#> # Totale` <chr>
Of course, you would want to set up a loop to get all the pages and glue the data frames together, perhaps like this:
result_list <- list()
for(i in seq_along(urls))
{
  cat("Getting url", i, "of", length(urls), "\n")
  result_list[[i]] <- get_table(urls[i])
}
result_df <- do.call(rbind, result_list)
Obviously I have not tested this as it is likely to take about an hour to download and process all the tables.
I want to remove duplicate words, but only if they are more than 4 characters long.
How can I achieve that? My current code also removes the duplicate "-" values.
Code:
$seoproducttitle = 'HP Chromebook Chromebook 11 G5 EE - 11.6 inch - Intel® Celeron® - 4LT18EA#ABH';
$productnamestring = $seoproducttitle;
$findseo = array('/\h+inch (?:(i[357])-\w+|\h+\w+)?/', '/(\w+)#\w+/');
$replaceseo = array('" $1', '$1');
$productnamingseo = preg_replace($findseo, $replaceseo, $productnamestring);
echo implode(' ', array_unique(explode(' ', $productnamingseo)));
This outputs: HP Chromebook 11 G5 EE - 11.6" Intel® Celeron® 4LT18EA
It should output: HP Chromebook 11 G5 EE - 11.6" - Intel® Celeron® - 4LT18EA
Or: Apple MacBook Air MacBook Air - 13.3 inch - Intel Core i5-8e - MRE82N/A
Should be: Apple MacBook Air - 13.3 inch - Intel Core i5-8e - MRE82N/A
EXAMPLE: http://sandbox.onlinephpfunctions.com/code/5bcaaf47ca97d6dee359802f2d71c2d889c0d091
Update
Based on comments from OP, the required regex is
/(^| )(.{4,}) (.*)\2/
This looks for a group of 4 or more characters preceded by either a space or the start of the line and followed by a space, some number of other characters and then the group repeated again. The regex is replaced by $1$2 $3 which effectively removes the duplicate string. A couple of examples:
$seoproducttitle = 'Apple MacBook Air MacBook Air - 13.3 inch - Intel Core i5-8e - MRE82N/A';
echo preg_replace('/(^| )(.{4,}) (.*)\2/', "$1$2 $3", $seoproducttitle) . "\n";
$seoproducttitle = 'HP Chromebook 11 G5 EE Chromebook - 11.6 inch - Intel® Intel® Celeron® - 4LT18EA#ABH 4LT18EA#ABH';
echo preg_replace('/(^| )(.{4,}) (.*)\2/', "$1$2 $3", $seoproducttitle) . "\n";
Output:
Apple MacBook Air - 13.3 inch - Intel Core i5-8e - MRE82N/A
HP Chromebook 11 G5 EE - 11.6 inch - Intel® Celeron® - 4LT18EA#ABH
Updated demo on 3v4l.org
Original Answer
You could use this regex:
\b([^ ]{4,})( |$)(.*)\1
It looks for a group of 4 or more non-blank characters, followed by a space or end-of-string, followed by some number of other characters and then the first group repeated. The regex is replaced by $1$3 which effectively removes the duplicate string. e.g.
$seoproducttitle = 'HP Chromebook 11 G5 EE Chromebook - 11.6 inch - Intel® Intel® Celeron® - 4LT18EA#ABH 4LT18EA#ABH';
echo preg_replace('/\b([^ ]{4,})( |$)(.*)\1/', "$1$3", $seoproducttitle);
Output:
HP Chromebook11 G5 EE - 11.6 inch - Intel® Celeron® - 4LT18EA#ABH
Demo on 3v4l.org
Computers only do what we tell them, so you first need to explain the process to yourself in plain language. Then translate that into code. If you're still having trouble at that point, you at least have a proper description of the problem to post on Stack Overflow.
$words = explode(' ', $productnamingseo);
// start with an empty list of words we've seen
$output = [];
// for every word
foreach($words as $word) {
    // if it's at least 4 chars long and we've already seen it
    if( mb_strlen($word) >= 4 && in_array($word, $output) ) {
        // debug: show omitted words
        // $output[] = str_repeat('X', mb_strlen($word));
        // skip it
        continue;
    }
    // otherwise, add it to the list of words we've already seen
    $output[] = $word;
}
var_dump(
    $productnamingseo,
    implode(' ', $output)
);
Basically, I'm trying to parse a stdClass object. Here's the code I use:
while($row = $data->fetch_assoc()){
    try{
        $products = $client->product_by_category_web_list($row['category_id'] , true, '', '2013-01-01', 0, 1500);
    }catch(SoapFault $e){
        echo "No Products under this Category";
    }
    $i = 0;
    foreach($product as $products->item){
        $cat_id = $row['category_id'];
        $id = $product->$i->id;
        $name = $product->$i->name;
        $desc = $product->$i->descrShort;
        $descLong = $product->$i->descrLong;
        $avail = $product->$i->availableToSell;
        $deliverable = $product->$i->deliverable;
        $itemWeight = $product->$i->itemWeight;
        $typelkp = $product->$i->typeLkp;
        $stkbrandid = $product->$i->stkBrandId;
        $dataVatId = $product->$i->dataVatId;
        $query = "INSERT INTO `products` (`product_id` , `product_name` , `product_short_desc` , `product_long_desc` , `product_available , product_deliverable` , `product_item_weight` , `type_id` , `brand_id` , `vat_id`)
        VALUES (? , ? , ? , ? , ? , ? , ? , ? , ? , ?)";
        $stmt = $mysql->prepare($query);
        $stmt->bind_param("isssiiiiii" , $id , $name , $desc , $descLong , $avail , $deliverable , $itemWeight , $typelkp , $stkbrandid , $dataVatId);
        $stmt->execute();
        $mctp = "INSERT INTO `products_categories` (`product_id` , `category_id` , `row_updated`)
        VALUES (? , ? , ?)";
        $match = $mysql->query($mctp);
        $match->bind_param("iii" , $id , $cat_id , 0);
        $match->execute();
        $i++;
    }
}
The SOAP call returns this:
object(stdClass)#1 (1) {
["item"]=>
array(2) {
[0]=>
object(stdClass)#3 (10) {
["id"]=>
int(79493)
["name"]=>
string(34) "Claud Butler Phobos Kids Bike 2014"
["descrShort"]=>
string(0) ""
["descrLong"]=>
string(1269) "
This year we have worked harder than ever to offer you a great range of quality junior bikes. All exclusively designed and tested in the UK with todays young riders and the popularity of cycling foremost in our minds.
A great bike is the key to a positive riding experience, enjoyment and reliability can be the largest factors when considering your new bike and with over 130 years of design and manufacturing experience we have become the brand of family cycling. Bikes have very much become a fashion item and we believe the kids choices are as important as Mum and Dads so we have a vast range of eye catching models to choose from. We also offer alloy and steel framed models to suit all budgets while maintaining great quality and performance at great value.
Frame: HiTen Steel
Fork: Rigid Steel
Headset: Steel with bearings
Bars: Steel
Stem: Steel Quill Type
Chainset: Steel
Front Brake: Alloy V Rear
Brake: Alloy V
Rims: Alloy
Front Hub: Steel
Rear Hub: Steel
Tyres: 20 x 1.95 Front, 20 x 1.95 Rear
Seatpost: Steel
"
["availableToSell"]=>
bool(true)
["deliverable"]=>
bool(true)
["itemWeight"]=>
float(20)
["typeLkp"]=>
int(1)
["stkBrandId"]=>
int(136)
["dataVatId"]=>
int(1)
}
[1]=>
object(stdClass)#4 (10) {
["id"]=>
int(64223)
["name"]=>
string(45) "DiamondBack Accomplice Black 20 Inch BMX 2012"
["descrShort"]=>
string(0) ""
["descrLong"]=>
string(1514) "
DiamondBack Accomplice Black 20 Inch BMX 2012 Introduction
If there were laws to prevent us from providing you with pro-level quality in a complete at a suspiciously low price, the new Accomplice would have us doing some hard time. Fortunately for both of us; there aren't. The bike has, however spent some time on the drawing board over the last year. In addition to being cleaned up and simplified, the Accomplice also bolsters a new frame with Affix Bush BB, Affix 9T cassette rear hub, Affix tyres and Butted tapered forks and bars! Built in Germany by the world famous KHE bmx experts.
Specification and Features of DiamondBack Accomplice Black 20 Inch BMX 2012
Diamondback Ambigram CRMO 3 piece crank on Affix Mid Bush BB (red ano) with Affix Orbis Alloy 25T sprocket
Rear U brake with soft compound pad and front caliper brake with alloy hinge levers
Front KHE Big "V" rim on Affix Ting Hub, rear Alienation Black Sheep rim on Affix 9T Hub with Affix 2.1 Tyres
Affix system stem with 2 piece butted handlebar 8" rise 28.8" width
KHE Exhib Project 2 saddle/post combo
Diamondback/KHE collaboration: Full 4130 Butted chromoly BMX frame with integrated head tube and Diamond "X" Brace, with full 4130 butted tapered chromoly BMX forks"
DiamondBack Accomplice Black 20 Inch BMX 2012 is perfect for:
Hitting the parks, jumps or a bit of street riding.
"
["availableToSell"]=>
bool(true)
["deliverable"]=>
bool(true)
["itemWeight"]=>
float(15)
["typeLkp"]=>
int(1)
["stkBrandId"]=>
int(215)
["dataVatId"]=>
int(1)
}
}
}
Whenever I use the foreach loop I get an error: "Invalid argument supplied for foreach()". I'm parsing it based on an answer I just found:
Here
I know that there's a Steam API allowing me to use data from Steam Community.
My question is, does anyone know if there's a Steam Market API?
For example, I want to get the current price of an item in the Steam Market.
I've googled and haven't found anything yet.
I'd be glad to have your help.
I could not find any documentation, but I use:
http://steamcommunity.com/market/priceoverview/?appid=730&currency=3&market_hash_name=StatTrak%E2%84%A2 M4A1-S | Hyper Beast (Minimal Wear)
to return a JSON.
At time of writing, it returns:
{"success":true,"lowest_price":"261,35€ ","volume":"11","median_price":"269,52€ "}
You can change the currency. 1 is USD, 3 is euro but there are probably others.
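A minimal Python sketch of building that request URL and reading the response. The helper names `build_price_url` and `parse_price` are hypothetical, the endpoint is undocumented so its parameters and response shape may change, and the price parser assumes the decimal-comma format of the euro sample above (it would misread a price such as "$63.83").

```python
import json
from urllib.parse import urlencode, quote

BASE = "https://steamcommunity.com/market/priceoverview/"

def build_price_url(appid, market_hash_name, currency=1):
    """Return the priceoverview URL with the item name percent-encoded."""
    query = urlencode(
        {"appid": appid, "currency": currency, "market_hash_name": market_hash_name},
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return f"{BASE}?{query}"

def parse_price(text):
    """Convert a price string such as '261,35€ ' to a float.

    Assumption: ',' is the decimal separator and '.' a thousands separator,
    as in the euro sample; other currencies format prices differently.
    """
    digits = "".join(ch for ch in text if ch.isdigit() or ch in ",.")
    return float(digits.replace(".", "").replace(",", "."))

# Sample response copied from the answer above; no network request is made here.
sample = '{"success":true,"lowest_price":"261,35€ ","volume":"11","median_price":"269,52€ "}'
data = json.loads(sample)

url = build_price_url(730, "StatTrak™ M4A1-S | Hyper Beast (Minimal Wear)", currency=3)
lowest = parse_price(data["lowest_price"])
median = parse_price(data["median_price"])
```

Encoding the item name this way avoids the raw spaces, pipe and parentheses in the hand-written URL, which some HTTP clients reject.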
A better option is the search API, which can return all the results for a game. This example uses PUBG, which only has 272 items; if your game has more, try changing the count parameter at the end:
https://steamcommunity.com/market/search/render/?search_descriptions=0&sort_column=default&sort_dir=desc&appid=578080&norender=1&count=500
I indexed the available currencies Steam uses for the argument
&currency=3
as:
1 : $63.83
2 : £46.85
3 : 52,--€
4 : CHF 56.41
5 : 4721,76 pуб.
6 : 235,09zł
7 : R$ 340,80
8 : ¥ 6,627.08
9 : 534,70 kr
10 : Rp 898 383.24
11 : RM257.74
12 : P3,072.66
13 : S$84.47
14 : ฿1,921.93
15 : 1.474.136,93₫
16 : ₩ 69,717.79
17 : 468,47 TL
18 : 2 214,94₴
19 : Mex$ 1,557.75
20 : CDN$ 99.09
21 : A$ 100.40
22 : NZ$ 107.55
23 : ¥ 505.96
24 : ₹ 5,733.04
25 : CLP$ 55.695,47
26 : S/.283.03
27 : COL$ 271.637,06
28 : R 1 193.49
29 : HK$ 606.83
30 : NT$ 2,189.42
31 : 293.64 SR
32 : 287.51 AED
Python dictionary with currency abbreviations and their codes:
currencies = {
"USD": 1, # United States dollar
"GBP": 2, # British pound sterling
"EUR": 3, # The euro
"CHF": 4, # Swiss franc
"RUB": 5, # Russian ruble
"PLN": 6, # Polish złoty
"BRL": 7, # Brazilian real
"JPY": 8, # Japanese yen
"NOK": 9, # Norwegian krone
"IDR": 10, # Indonesian rupiah
"MYR": 11, # Malaysian ringgit
"PHP": 12, # Philippine peso
"SGD": 13, # Singapore dollar
"THB": 14, # Thai baht
"VND": 15, # Vietnamese dong
"KRW": 16, # South Korean won
"TRY": 17, # Turkish lira
"UAH": 18, # Ukrainian hryvnia
"MXN": 19, # Mexican Peso
"CAD": 20, # Canadian dollar
"AUD": 21, # Australian dollar
"NZD": 22, # New Zealand dollar
"CNY": 23, # Chinese yuan
"INR": 24, # Indian rupee
"CLP": 25, # Chilean peso
"PEN": 26, # Peruvian sol
"COP": 27, # Colombian peso
"ZAR": 28, # South African rand
"HKD": 29, # Hong Kong dollar
"TWD": 30, # New Taiwan dollar
"SAR": 31, # Saudi riyal
"AED": 32 # United Arab Emirates dirham
}
To add to what others have said, the temporary ban on the JSON endpoint happens if you request 20 items from the server within a minute. If you're writing a script to request those links, add a three-second delay between requests.
Also, the ban only lasts for the remaining server-side minute (which may not be 60 seconds).
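The delay can be sketched in Python roughly like this. The function name `fetch_all` and its `fetch` and `sleep` parameters are hypothetical, and the three-second figure is simply the value suggested above, not a documented limit.

```python
import time

def fetch_all(urls, fetch, delay=3.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing `delay` seconds between requests."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay)  # no pause before the first request
        results.append(fetch(url))
    return results
```

Passing `sleep` in as a parameter is only there so the pacing can be checked without actually waiting; in a real script `fetch` would be whatever function downloads and decodes one URL.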
You can use SteamApis.com to acquire Steam market prices and item information. The data is returned in JSON. The service is not free but also not that expensive.
The documentation is available to view here. It has detailed information on what endpoints are available and what data is returned.
There is no such API for now, but this link may help you:
Get the price of an item on Steam Community Market with PHP and Regex
It's basically what you want, done with pure PHP DOM parsing instead of an API. The main drawback is that you may have to change your code if Steam updates its HTML markup.
A scraper script that maps the search results from https://steamcommunity.com/market/search?q= to an array of objects:
Array.from(document.querySelectorAll('a.market_listing_row_link')).map(item => {
  const itemInfo = item.children[0]
  return {
    isStatTrak: itemInfo.getAttribute('data-hash-name').startsWith('StatTrak™'),
    condition: itemInfo.getAttribute('data-hash-name').match(/.*\((.*)\)/)[1],
    priceUSD: Number(itemInfo.querySelector('.normal_price[data-price]').getAttribute('data-price')/100)
  }
})
It can be used with an iframe and the "weapon | skin name (condition)" search template.