I'm trying to override the user agent with the Emulation.setUserAgentOverride message.
I send the message with these parameters:
[userAgentOverride] => Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36
[acceptLanguage] => ru-RU
[platform] => Windows
And I get a strange error that I can't resolve:
[code] => -32602
[message] => Invalid parameters
[data] => Failed to deserialize params.userAgent - BINDINGS: mandatory field missing at position 175
The odd part is that the user agent string in this case is only 110 characters long, so I don't understand what position 175 refers to. Any ideas would be appreciated.
I simply got the parameter name wrong:
"userAgentOverride" has to be "userAgent".
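For reference, here is a minimal Python sketch of how the corrected message could be built. The helper name build_user_agent_override and the message id are made up for illustration, and how you actually send the JSON (WebSocket client, library wrapper, etc.) depends on your setup.

```python
import json


def build_user_agent_override(message_id, user_agent, accept_language=None, platform=None):
    """Build a DevTools protocol message for Emulation.setUserAgentOverride.

    Hypothetical helper: only the parameter names come from the protocol.
    """
    # "userAgent" is the mandatory field; sending "userAgentOverride" instead
    # is what triggers the "mandatory field missing" error.
    params = {"userAgent": user_agent}
    if accept_language is not None:
        params["acceptLanguage"] = accept_language
    if platform is not None:
        params["platform"] = platform
    return json.dumps({
        "id": message_id,
        "method": "Emulation.setUserAgentOverride",
        "params": params,
    })


message = build_user_agent_override(
    1,
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36",
    accept_language="ru-RU",
    platform="Windows",
)
print(message)
```

The reported position probably refers to an offset in the serialized message rather than in the user agent string, which would explain why it exceeds the string's length, but that is a guess.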
I'm just trying to get the title from this product page, but it keeps returning a 403 Forbidden error.
Warning: file_get_contents(https://www.brownsfashion.com/uk/shopping/jem-18k-yellow-gold-octogone-double-paved-ring-17648795): failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden in /Applications/AMPPS/www/get_prod.php on line 13
I tried adding a User-Agent header, but it still doesn't work. Maybe it isn't possible.
Code below:
<?php
include('simple_html_dom.php');
$context = stream_context_create(
array(
"http" => array(
"header" => "User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36"
)
)
);
echo file_get_contents("https://www.brownsfashion.com/uk/shopping/jem-18k-yellow-gold-octogone-double-paved-ring-17648795", false, $context);
?>
This website uses three anti-bot systems:
Riskified.
Forter.
Cloudflare.
They are used to prevent DoS/DDoS attacks and automated crawling, so you can't easily crawl the site with a simple HTTP request.
To get past them you need to use or simulate a real browser, for example with Selenium or Playwright.
Here is an example of crawling this website with Playwright and Python.
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # Launch a real (headless) WebKit browser instead of sending a bare HTTP request
    browser = p.webkit.launch(headless=True)
    baseurl = "https://www.brownsfashion.com/uk/shopping/jem-18k-yellow-gold-octogone-double-paved-ring-17648795"
    page = browser.new_page()
    page.goto(baseurl)
    # Wait for each element to appear; XPath matches attributes with @, not #
    title = page.wait_for_selector("//a[@data-test='product-brand']")
    name = page.wait_for_selector("//span[@data-test='product-name']")
    price = page.wait_for_selector("//span[@data-test='product-price']")
    print("Title: " + title.text_content())
    print("Name: " + name.text_content())
    print("Price: " + price.text_content())
    browser.close()
I hope I have been able to help you.
How can I validate a Shopify store's URL? Given a URL how can I know whether it is a valid URL or 404 page not found? I'm using PHP. I've tried using PHP get_headers().
<?php
$getheadersvalidurlresponse= get_headers('https://test072519.myshopify.com/products/test-product1'); // VALID URL
print_r($getheadersvalidurlresponse);
$getheadersinvalidurlresponse= get_headers('https://test072519.myshopify.com/products/test-product1451'); // INVALID URL
print_r($getheadersinvalidurlresponse);
?>
But for both valid and invalid URLs, I got the same response.
Array
(
[0] => HTTP/1.1 403 Forbidden
[1] => Date: Wed, 08 Jul 2020 13:27:52 GMT
[2] => Content-Type: text/html
[3] => Connection: close
..............
)
I'm expecting 200 OK status code for valid URL and 404 for invalid URL.
Can anyone please help me check whether a given Shopify URL is valid using PHP?
Thanks in advance.
This happens because Shopify distinguishes bot requests from genuine browser requests, partly to mitigate denial-of-service attacks. To get a meaningful HTTP response, send a User-Agent header that mimics a browser request.
As an improvement, you can make a HEAD request instead of a GET request (get_headers() uses GET by default, as mentioned in the examples), because here we only care about the response metadata, not the response body.
Snippet:
<?php
$opts = array(
'http'=>array(
'method'=> "HEAD",
'header'=> "User-agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36"
)
);
$headers1 = get_headers('https://test072519.myshopify.com/products/test-product1',0,stream_context_create($opts));
$headers2 = get_headers('https://test072519.myshopify.com/products/test-product1451',0,stream_context_create($opts));
echo "<pre>";
print_r($headers1);
print_r($headers2);
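If you ever need the same check outside PHP, the idea carries over directly. Below is a rough Python sketch using only the standard library; the function names build_head_request and fetch_status are made up for illustration, and whether the store answers 200/404 to this request still depends on its bot protection.

```python
import urllib.error
import urllib.request

# Browser-like User-Agent, copied from the PHP snippet above.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36"
)


def build_head_request(url):
    """Build a HEAD request that carries a browser-like User-Agent."""
    return urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": BROWSER_UA}
    )


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url (performs a network call)."""
    try:
        with urllib.request.urlopen(build_head_request(url), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is still available.
        return err.code
```

Calling fetch_status() on the two product URLs should then let you tell the valid page from the missing one by comparing the returned codes.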
I'm trying to log in to a website and send an SMS with Python. I'm using the requests module, passing the user name and password. After posting them, I can't tell whether the login has succeeded or not.
url='http://www.connectexpress.in'
headers={"User-Agent":"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36"}
s=requests.session() ; s.headers.update(headers)
r=s.get(url) ; soup=BeautifulSoup.BeautifulSoup(r.content)
VIEWSTATE=soup.find(id="__VIEWSTATE")['value']
#VIEWSTATEGENERATOR=soup.find(id="__VIEWSTATEGENERATOR")['value']
EVENTVALIDATION=soup.find(id="__EVENTVALIDATION")['value']
login_data={"__VIEWSTATE":VIEWSTATE,
"__EVENTVALIDATION":EVENTVALIDATION,
"txtUserName":usrNme,
"txtPassword":paswrd,
"submit":'Submit'}
r1=s.post(url, data=login_data)
This is the code I used. Could anyone give me some ideas?
I need to get the name of the requesting browser in my web app (for analytics).
In core PHP, $visitor_user_agent = $_SERVER['HTTP_USER_AGENT'] returns the string Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.130 Safari/537.36 when I use Chrome. I can then use preg_match('/Chrome/i', $visitor_user_agent) to check whether the browser is Chrome, but I'm not sure whether that is an efficient way to find the browser name.
I also found get_browser, but it is not giving me the browser name.
Is there a way in CakePHP 3 or core PHP to get the browser name?
This would return the user agent used for the request:
$this->request->header('User-Agent');
See the documentation of the Request object:
http://book.cakephp.org/3.0/en/controllers/request-response.html
You can get HTTP_USER_AGENT using the env() method:
$this->request->env('HTTP_USER_AGENT');
You can also define a custom detector:
$this->request->addDetector(
'chrome',
['env' => 'HTTP_USER_AGENT', 'pattern' => '/Chrome/i']
);
Then, in the controller, just use the is() method as follows:
if($this->request->is('chrome')) {
// do stuff for chrome
}
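One caveat with matching on /Chrome/ alone: other Chromium-based browsers (for example Edge and Opera) also include "Chrome" in their user agent, so the order of checks matters. Here is a language-neutral illustration written in Python, not CakePHP code; the pattern list and the function name detect_browser are my own assumptions and far from exhaustive.

```python
import re

# Ordered from most specific to least specific: Edge and Opera user agents
# also contain "Chrome" and "Safari", and Chrome's contains "Safari".
# This list is an illustrative assumption, not a complete catalogue.
BROWSER_PATTERNS = [
    ("Edge", re.compile(r"Edg")),
    ("Opera", re.compile(r"OPR/|Opera")),
    ("Chrome", re.compile(r"Chrome/")),
    ("Firefox", re.compile(r"Firefox/")),
    ("Safari", re.compile(r"Safari/")),
]


def detect_browser(user_agent):
    """Return a best-guess browser name for a User-Agent string."""
    for name, pattern in BROWSER_PATTERNS:
        if pattern.search(user_agent):
            return name
    return "Unknown"


chrome_ua = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/43.0.2357.130 Safari/537.36"
)
print(detect_browser(chrome_ua))
```

The same ordering idea applies if you register several detectors with addDetector() and test them one after another.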
I'm trying to book a hotel with the Expedia API
https://book.api.ean.com/
but the URL always returns a blank response.
Details of the REST/JSON request are here:
http://developer.ean.com/docs/hotels/version_3/book_reservation/
I've seen a few people experiencing the same problem and wondered if anyone knows what causes it.
https://book.api.ean.com/ean-services/rs/hotel/v3/res?cid=55505
&apiKey=xxx
&locale=en_US
&currencyCode=USD
&customerUserAgent=Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.83 Safari/535.11
&customerIpAddress=127.0.0.1
&room1=1,3
&room1FirstName=TestBooking
&room1LastName=TestBooking
&room1BedTypeId=13
&room1SmokingPreference=NS
&room2=1,5
&room2FirstName=TestBooking
&room2LastName=TestBooking
&room2BedTypeId=13
&room2SmokingPreference=NS
&email=xxx
&firstName=TestBooking
&lastName=TestBooking
&homePhone=TestBooking
&workPhone=TestBooking
&creditCardType=CA
&creditCardNumber=5401999999999999
&creditCardIdentifier=123
&creditCardExpirationMonth=11
&creditCardExpirationYear=2012
&address1=travelnow
&city=Bellevue
&stateProvinceCode=WA
&countryCode=US
&postalCode=98004
&customerSessionId=0ABAA871-3127-A913-6642-A1F86D902E2B
&hotelId=211540
&arrivalDate=12/10/2012
&departureDate=12/12/2012
&supplierType=E
&rateKey=d03a8d29-1df2-4436-81d6-6b37eb4dcb78
&roomTypeCode=352749
&rateCode=1279169
&chargeableRate=803.04
Including or excluding the minorrev doesn't seem to make much difference.
Try URL-encoding each query string parameter. For example, these values:
customerUserAgent=Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.83 Safari/535.11
arrivalDate=12/10/2012
should be URL-encoded as:
customerUserAgent=Mozilla%2F5.0%20(Windows%20NT%206.1)%20AppleWebKit%2F535.11%20(KHTML%2C%20like%20Gecko)%20Chrome%2F17.0.963.83%20Safari%2F535.11
arrivalDate=12%2F10%2F2012
Take a look at PHP's urlencode for more information.
Also, be sure to remove the whitespace between the query string parameters; I'm not sure whether that was a copy/paste issue.
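To show what the encoding step looks like in practice, here is a small Python sketch using the standard library (the same idea applies with urlencode/rawurlencode in PHP). The helper name build_query is made up, and only two of the booking parameters are included for brevity.

```python
from urllib.parse import quote, urlencode


def build_query(params):
    """Percent-encode every key and value and join them with '&'."""
    # quote_via=quote encodes spaces as %20; the default (quote_plus) uses "+".
    return urlencode(params, quote_via=quote)


params = {
    "customerUserAgent": (
        "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.11 "
        "(KHTML, like Gecko) Chrome/17.0.963.83 Safari/535.11"
    ),
    "arrivalDate": "12/10/2012",
}
query = build_query(params)
print(query)
```

Note that Python also percent-encodes the parentheses (%28 and %29), so the output differs slightly from the hand-encoded example above; both forms decode to the same value.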