list of urls in associative array - php

My string looks like this:
http://localhost/layerthemes/wp-content/uploads/2014/05/46430454_Subscription_XXL-4_mini.jpghttp://localhost/layerthemes/wp-content/uploads/2014/05/Eddy-Need-Remix-mp3-image.jpghttp://localhost/layerthemes/wp-content/uploads/2013/03/static-pages.png
How do I extract each URL into an array like this:
array(
0 => 'http://localhost/layerthemes/wp-content/uploads/2014/05/46430454_Subscription_XXL-4_mini.jpg',
1 => 'http://localhost/layerthemes/wp-content/uploads/2014/05/Eddy-Need-Remix-mp3-image.jpg',
2 => 'http://localhost/layerthemes/wp-content/uploads/2013/03/static-pages.png'
)
This is what I tried, to no avail:
$imgss = 'http://localhost/layerthemes/wp-content/uploads/2014/05/46430454_Subscription_XXL-4_mini.jpghttp://localhost/layerthemes/wp-content/uploads/2014/05/Eddy-Need-Remix-mp3-image.jpghttp://localhost/layerthemes/wp-content/uploads/2013/03/static-pages.png';
preg_match_all(
"#((?:[\w-]+://?|[\w\d]+[.])[^\s()<>]+[.](?:\([\w\d]+\)|(?:[^`!()\[\]{};:'\".,<>?«»“”‘’\s]|(?:[:]\d+)?/?)+))#",
$imgss
);
foreach($imgss as $imgs){
echo '<img src="'.$imgs.'" />';
}
Any help would be appreciated. Needless to say, I am very weak in PHP.
Thanks

If there are no spaces in the string, you can use:
$string = 'http://localhost/layerthemes/wp-content/uploads/2014/05/46430454_Subscription_XXL-4_mini.jpghttp://localhost/layerthemes/wp-content/uploads/2014/05/Eddy-Need-Remix-mp3-image.jpghttp://localhost/layerthemes/wp-content/uploads/2013/03/static-pages.png';
$string = str_replace( 'http', ' http', $string );
$array = array_filter( explode( ' ', $string ) );
print_r( $array );

Exploding is fine, but you should probably also validate the submitted links. I've put together the script below: it tells the user when links need to be on separate lines or separated by a space, validates each link, and builds a new array of valid links that you can then do something with.
<?php
if($_SERVER['REQUEST_METHOD'] == 'POST' && !empty($_POST['links'])){
//replace all \r\n and \n and space with , delimiter
$links = str_replace(array("\r\n", "\n", " "), ',', $_POST['links']);
//explode using ,
$links = explode(',', $links);
//validate links by going through the array
foreach($links as $link){
//does the link contain more than one http://
if(substr_count($link, 'http://') >1){
$error[] = 'Add each url on a new line or separate with a space.';
}else{
//does the link pass validation
if(!filter_var($link, FILTER_VALIDATE_URL)){
$error[] = 'Invalid url skipping: '.htmlentities($link);
}else{
//does the link contain http or https
$scheme = parse_url($link, PHP_URL_SCHEME);
if($scheme == 'http' || $scheme == 'https'){
//yes alls good, add to valid links array
$valid_links[] = $link;
}else{
$error[] = 'Invalid url skipping: '.htmlentities($link);
}
}
}
}
//show whats wrong
if(!empty($error)){
echo '
<pre>
'.print_r($error, true).'
</pre>';
}
//your valid links, do something with them
if(!empty($valid_links)){
echo '
<pre>
'.print_r($valid_links, true).'
</pre>';
}
}?>
<form method="POST" action="">
<textarea rows="2" name="links" cols="50"><?php echo (isset($_POST['links']) ? htmlentities($_POST['links']) : null);?></textarea><input type="submit" value="Submit">
</form>
Perhaps it will help.

How about:
$input = "http://localhost/layerthemes/wp-content/uploads/2014/05/46430454_Subscription_XXL-4_mini.jpghttp://localhost/layerthemes/wp-content/uploads/2014/05/Eddy-Need-Remix-mp3-image.jpghttp://localhost/layerthemes/wp-content/uploads/2013/03/static-pages.png";
$exploded = explode("http://", $input);
$result = array();
for ($i = 1; $i < count($exploded); ++$i)
{
$result[$i - 1] = "http://" . $exploded[$i];
}
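The same split can also be done in one step with a regular expression. The following is a minimal sketch, assuming every URL in the string starts with http:// or https:// and that neither scheme appears inside a path; split_urls is a hypothetical helper name, not something from the answers above.

```php
<?php
// Hypothetical helper: split a run of concatenated URLs into an array.
// Assumes each URL begins with "http://" or "https://".
function split_urls(string $input): array
{
    // The lookahead (?=...) matches the empty position just before each
    // scheme, so the string is cut there without consuming any characters.
    return preg_split('/(?=https?:\/\/)/', $input, -1, PREG_SPLIT_NO_EMPTY);
}

$imgss = 'http://localhost/layerthemes/wp-content/uploads/2014/05/46430454_Subscription_XXL-4_mini.jpghttp://localhost/layerthemes/wp-content/uploads/2014/05/Eddy-Need-Remix-mp3-image.jpghttp://localhost/layerthemes/wp-content/uploads/2013/03/static-pages.png';

foreach (split_urls($imgss) as $img) {
    echo '<img src="' . htmlspecialchars($img) . '" />' . "\n";
}
```

Because the lookahead consumes nothing, there is no need to glue the "http://" prefix back on afterwards.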

Here's an example, if you have control over this entire process.
Your form:
<form id="myform" method="POST">
</form>
Your javascript (using jquery):
<script>
var myurls = getUrls();
$('<input>').attr({
type: 'hidden',
name: 'myurls',
value: JSON.stringify(myurls),
}).appendTo('#myform');
// gathers your URLs (however you do this) and returns them as a javascript array
function getUrls() {
// just return this as a placeholder/example
return ["http://localhost/layerthemes/wp-content/uploads/2014/05/46430454_Subscription_XXL-4_mini.jpg", "http://localhost/layerthemes/wp-content/uploads/2014/05/Eddy-Need-Remix-mp3-image.jpg", "http://localhost/layerthemes/wp-content/uploads/2013/03/static-pages.png"];
}
</script>
Your PHP:
$myurls = json_decode($_POST['myurls']);
var_dump($myurls); // should be the array you sent
You could do this with AJAX too if you want. Or make the form automatically submit.
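On the PHP side it may be worth checking what actually arrived before using it. This is only a sketch: decode_url_list is a hypothetical helper, and the rule that only http and https URLs are kept is an assumption borrowed from the validation answer above.

```php
<?php
// Hypothetical helper: decode the JSON sent in the hidden "myurls" field
// and keep only entries that are well-formed http(s) URLs.
function decode_url_list(string $json): array
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded)) {
        return array(); // malformed JSON, or valid JSON that is not a list
    }
    $urls = array();
    foreach ($decoded as $candidate) {
        if (!is_string($candidate) || filter_var($candidate, FILTER_VALIDATE_URL) === false) {
            continue; // skip anything that is not a valid URL string
        }
        $scheme = strtolower((string) parse_url($candidate, PHP_URL_SCHEME));
        if ($scheme === 'http' || $scheme === 'https') {
            $urls[] = $candidate;
        }
    }
    return $urls;
}

// Fall back to an empty JSON list when the field is missing.
$raw = (isset($_POST['myurls']) && is_string($_POST['myurls'])) ? $_POST['myurls'] : '[]';
$myurls = decode_url_list($raw);
```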

Related

Print out the index of all occurrences of a string in an array

I have created a simple PHP script where the user types in a sentence (or sentences) and the script takes each word from the textarea and splits it up into an array. The user will also type in a word to search for in the array.
The script should look through the array and print out the index of all places where the word is.
This is what I have so far:
foreach($parts as $item) {
if ($item == $strName) {
$k = array_search($strName, array_values($parts));
print "$k\n";
}
}
However that only prints out the first index location of the string. So if the sentence I use is "The apple fell from the tree", it will just print out "0 0", which is the first time the word appears in the array (it also does the same if the index is 1, 2, 3, etc). Is there something that I have done wrong? Sorry if I didn't include enough information.
Entire code:
<form action="sida3.php" method="post">
Text: <textarea name="textarea"></textarea>
<br>
Search word: <input type="text" name="search">
<br>
<input type="submit" name="submit" value="Submit">
</form>
<?php
if(isset($_POST['submit'])){
$parts = explode(" ", $_POST['textarea']);
$strName = $_POST['search'];
print_r ($parts);
echo '<br>';
foreach($parts as $item) {
if ($item == $strName) {
print_r(array_keys($parts, $strName));
}
}
}
?>
Here you go
function search_array( $needle, $string ){
$haystack = explode( " ", $string);
$keys_found = array_keys( array_map( "strtolower", $haystack) , strtolower( $needle ), false);
if( !empty( $keys_found ) ):
return array(
"count" => count( $keys_found ),
"keys" => $keys_found
);
else:
return false;
endif;
}
var_dump( search_array("Every","Every good boy does fine, every good girl does fine") );
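For the question as asked (print every index where the word occurs), the loop is not needed at all, because array_keys already returns all matching keys when given a search value. A minimal sketch, assuming words are separated by single spaces and that matching should ignore case; word_positions is a hypothetical name.

```php
<?php
// Hypothetical helper: return every index at which $word occurs in
// $sentence, comparing case-insensitively and splitting on single spaces.
function word_positions(string $sentence, string $word): array
{
    $parts = array_map('strtolower', explode(' ', $sentence));
    // array_keys() with a search value returns all matching keys.
    return array_keys($parts, strtolower($word), true);
}

// "The" is at index 0 and "the" at index 4.
print_r(word_positions('The apple fell from the tree', 'the'));
```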

how to validate "provide at least 2 tags, separated by commas" from tag textfield in php?

HTML
<h3>Tags</h3>
<input <?php echo $err_st3; ?> type="text" name="tags" id="textfield"
placeholder="Example: tag, another tag, hello tagging" value="<?php echo @$tagsOK; ?>">
PHP
$tags=array();
$tagline = $_POST['tags'];
//TAGS
if(!empty($tagline)){
$tokens = str_replace(' ', '', $tagline);
$tags = explode(',',$tokens);
$tags = array_unique($tags);
foreach ($tags as $tag) {
if(preg_match("/^[0-9a-zA-Z]$/",$tag) === 0) {
// error
}
else{
echo $count_tags = $count_tags+1;
}
}
if($count_tags <= 1){
$error[]=" - Please provide at least 2 tags, separated by commas.";
$err_st3 = $st;
}
$tagsOK = implode(', ',$tags);
}
else{
$error[]=" - Please provide at least 2 tags, separated by commas.";
$err_st3 = $st;
}
When I enter the letters like "a, b", then it will be valid.
But it does not validate words like "vehicle, characters, scene"
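The pattern /^[0-9a-zA-Z]$/ has no quantifier, so it only matches tags that are exactly one character long. Either add a + after the character class, or drop preg_match and check the length instead:
strlen($tag) >= 1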
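The question's pattern accepts longer tags once a + quantifier is added to the character class. Here is a minimal sketch of the whole check, assuming tags must be purely alphanumeric and that spaces are stripped as in the question's code; parse_tags is a hypothetical helper name.

```php
<?php
// Hypothetical helper: split a comma-separated tag line into unique,
// alphanumeric tags. Spaces are stripped, as in the question's code.
function parse_tags(string $tagline): array
{
    $tags = array();
    foreach (explode(',', str_replace(' ', '', $tagline)) as $tag) {
        // The + quantifier allows one or more characters, not exactly one.
        if (preg_match('/^[0-9a-zA-Z]+$/', $tag) === 1) {
            $tags[] = $tag;
        }
    }
    // Drop duplicates and reindex from 0.
    return array_values(array_unique($tags));
}

$error = array();
$tags = parse_tags('vehicle, characters, scene');
if (count($tags) < 2) {
    $error[] = ' - Please provide at least 2 tags, separated by commas.';
}
```

Counting the returned array replaces the separate $count_tags counter from the question.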

Regular Expression not matching content in PHP

I am trying to scrape an ebay page such as this one: http://www.ebay.co.uk/sch/Cars-/9801/i.html?_nkw=vw+golf
Everything works great except that one of my regular expressions isn't matching the content, so the matches aren't being pushed to $linksArray. I have output the contents to make sure that what I am trying to match is in fact there, and it is. I then call print_r($linksArray), which should show all the matches, but it doesn't: it is an empty multidimensional array. You can see my live example here: http://www.mycommunity.co.za/marcksack/index.php
Here is my PHP code:
<?php
echo '<form method="POST">
<input type="text" id="url" name="url" size="120" value="' . (isset($_REQUEST["url"]) && !empty($_REQUEST["url"]) ? $_REQUEST["url"] : "") . '"/>
<input type="submit" value="Submit" />
</form>';
flush();
if (isset($_REQUEST["url"]) && !empty($_REQUEST["url"])) {
$url = $_REQUEST["url"];
$phones = array();
for ($page = 1; $page <= 1; $page++) {
// get page contents
$contents = file_get_contents($url . "&_pgn=" . $page);
echo(htmlentities($contents));
// find all links patterns
// HERE IS THE PROBLEM
$pattern = '/class="lvtitle"><a href="(.*)" class="vip"/';
$linksArray = array();
preg_match_all($pattern, $contents, $linksArray);
print_r($linksArray);
$links = $linksArray[0];
foreach($links as $link) {
$pureLink = str_replace("class=\"lvtitle\"><a href=\"", "", $link);
$pureLink = str_replace("\" class=\"vip\"", "", $pureLink);
// getting sub page contents
$subContents = file_get_contents($pureLink);
// find all links patterns
$subContents = str_replace(" ", "", $subContents);
$phonePattern = '/07[0-9]{9}/';
$phonesArray = array();
preg_match_all($phonePattern, $subContents, $phonesArray);
foreach($phonesArray[0] as $element) {
// check if phone not added previously to the phones array
if (!in_array($element, $phones)) {
// add it to the phones array
array_push($phones, $element);
echo $element . "<br />";
flush();
}
}
}
}
// print results
foreach($phones as $phone){
echo $phone."<br/>";
}
}
?>
So my question is: what am I doing wrong? Why are the matches not being pushed to my $linksArray variable? I really appreciate your help!
This regex works:
"/ class=\"lvtitle\"><a href=\"([^\"]*)\"  class=\"vip\"/"
A few issues with yours:
You were trying to capture the URL using (.*), which is greedy and will match as much of the line as it can. [^\"]* stops at the closing quote of the href attribute.
Your pattern was not matching at all because eBay has two spaces between the href and class attributes, where your pattern expects one.
Also, as has already been mentioned, you should use the API or DOMDocument for this. But in case you are curious, this is why it wasn't working. I hope that helps!
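For the DOMDocument route, a rough sketch might look like the following. The sample markup is a hypothetical stand-in shaped like the pattern in the question (an element with class lvtitle wrapping an a.vip link), not real eBay output, and extract_listing_links is an illustrative name.

```php
<?php
// Hypothetical helper: collect the href of every <a> that sits directly
// inside an element whose class list contains "lvtitle".
function extract_listing_links(string $html): array
{
    if ($html === '') {
        return array();
    }
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = '//*[contains(concat(" ", normalize-space(@class), " "), " lvtitle ")]/a[@href]';
    $links = array();
    foreach ($xpath->query($query) as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }
    return $links;
}

// Hypothetical sample markup, with two spaces before class="vip".
$sample = '<h3 class="lvtitle"><a href="http://www.ebay.co.uk/itm/example-1"  class="vip">VW Golf</a></h3>'
        . '<h3 class="lvtitle"><a href="http://www.ebay.co.uk/itm/example-2"  class="vip">VW Golf GTI</a></h3>';

print_r(extract_listing_links($sample));
```

Since the parser handles attribute spacing and ordering, the number of spaces between attributes no longer matters.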

strip defined character from string

I'm having an odd problem on my website. I have a script that records all searches and inserts the search words into the database. Since search engine robots started crawling my website, they make my script produce search keywords like "search keywords////////////////////////////////////////////////"
I want to strip those characters ( ////////// ) before they are stored in MySQL.
This is what I have:
$search=htmlspecialchars($_GET['load']);
$say=mysql_query("SELECT * FROM madvideo WHERE MATCH (baslik) AGAINST ('*$search*' IN BOOLEAN MODE)");
$saydim=mysql_num_rows($say);
$count = $saydim;
$page = !empty($_GET["page"]) ? intval($_GET["page"]) : 1;
$s = ($page-1)*$perpage;
$sayfasayisi=ceil($count/$perpage);
if(ayaral("Arananlar-Kaydet")=="1") {
$ekle=cevir($search);
@mysql_query("insert into tag (baslik,tr,tarih) values ('$search','$ekle',now()) "); }
The variable $search holds the search word, and I don't know which function to use to strip those ///////// characters.
EDIT: the code that creates the words is this:
$vtitle = str_replace("\r\n\r\n", ' ', $vtitle);
$words = explode(' ', $vtitle);
$k = count($words);
$k3 = ceil($k/3);
$new = array();
for ($i=0; $i<$k; $i+=$k3) {
$new[] = join(' ', array_slice($words,$i, $k3));
}
$tag1 = $new[0];
$tag2 = $new[1];
$tag3 = $new[2];
Use the MySQL TRIM function to remove the characters from the end of the string, as follows:
TRIM(TRAILING '/' FROM <string>)
@mysql_query("insert into tag (baslik,tr,tarih)
values (TRIM(TRAILING '/' FROM '$search'),'$ekle',now())");
To trim from both the beginning and the end, use the following:
TRIM(BOTH '/' FROM <string>)
@mysql_query("insert into tag (baslik,tr,tarih)
values (TRIM(BOTH '/' FROM '$search'),'$ekle',now())");
To remove all occurrences of the character, use the REPLACE function as follows:
REPLACE(<string>, '/', '')
@mysql_query("insert into tag (baslik,tr,tarih)
values (REPLACE('$search', '/',''),'$ekle',now())");
Hope it helps...
If the / characters appear always at the end of the $search, you can use rtrim using the second (optional) parameter:
$search = "search keywords////////////////////////////////////////////////";
$search = rtrim($search, '/ ');
echo $search; // prints 'search keywords'
EDIT:
Real example...
<?php
if(isset($_REQUEST['str'])) {
$search = $_REQUEST['str'];
$ser_chk = strpos($search, "/");
if ($ser_chk !== false) {
$search = str_replace("/", "", $search);
}
}
?>
<h1><?php print $search; ?></h1>
<form action="" method="post">
<input type="text" size="100" value="search keywords////////////////////////////////////////////////" name="str" />
<input type="submit" />
</form>
LINK TO TEST: http://simplestudio.rs/tsto.php
At least do:
$search = mysql_real_escape_string($search);
You can also check for those characters and, if they are found, replace them with an empty string.
$ser_chk = strpos($search, "/");
if ($ser_chk !== false) {
$search = str_replace("/", "", $search);
}
$regex = "/\//";
$string = "////search";
$string = preg_replace($regex, '', $string);
echo $string;
A flexible solution for repeated characters is a regular expression.
$output = preg_replace('/\/[\/]+/', '/', $input);
It will replace any run of two or more "/" with a single "/".
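Pulling the PHP-side suggestions together, the cleanup could live in one small function that runs before the value is stored. This is a sketch under the assumption that a slash is never a legitimate part of a search term; clean_search_term is a hypothetical name. It does not handle SQL escaping, which still needs mysql_real_escape_string as noted above, or a prepared statement.

```php
<?php
// Hypothetical helper: strip every "/" from a search term and tidy up
// the whitespace that is left behind.
function clean_search_term(string $search): string
{
    // Remove all slashes, wherever they appear.
    $search = str_replace('/', '', $search);
    // Collapse runs of whitespace into a single space.
    $search = preg_replace('/\s+/', ' ', $search);
    return trim($search);
}

$search = clean_search_term('search keywords////////////////////////////////////////////////');
echo $search, "\n";
```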

PHP Remove URL from string

If I have a string that contains a URL (for example's sake, we'll call it $url), such as:
$url = "Here is a funny site http://www.tunyurl.com/34934";
How do I remove the URL from the string?
The difficulty is that URLs might also show up without the http://, such as:
$url = "Here is another funny site www.tinyurl.com/55555";
There is no HTML present. How would I search for http or www and, if either exists, remove the text/numbers/symbols up to the first space?
I re-read the question, here is a function that would work as intended:
function cleaner($url) {
$U = explode(' ',$url);
$W =array();
foreach ($U as $k => $u) {
if (stristr($u,'http') || (count(explode('.',$u)) > 1)) {
unset($U[$k]);
return cleaner( implode(' ',$U));
}
}
return implode(' ',$U);
}
$url = "Here is another funny site www.tinyurl.com/55555 and http://www.tinyurl.com/55555 and img.hostingsite.com/badpic.jpg";
echo "Cleaned: " . cleaner($url);
Edit #2/#3 (I must be bored). Here is a version that verifies there is a TLD within the URL:
function containsTLD($string) {
preg_match(
"/\.(?:AC|AD|AE|AERO|AF|AG|AI|AL|AM|AN|AO|AQ|AR|ARPA|AS|ASIA|AT|AU|AW|AX|AZ|BA|BB|BD|BE|BF|BG|BH|BI|BIZ|BJ|BM|BN|BO|BR|BS|BT|BV|BW|BY|BZ|CA|CAT|CC|CD|CF|CG|CH|CI|CK|CL|CM|CN|CO|COM|COOP|CR|CU|CV|CX|CY|CZ|DE|DJ|DK|DM|DO|DZ|EC|EDU|EE|EG|ER|ES|ET|EU|FI|FJ|FK|FM|FO|FR|GA|GB|GD|GE|GF|GG|GH|GI|GL|GM|GN|GOV|GP|GQ|GR|GS|GT|GU|GW|GY|HK|HM|HN|HR|HT|HU|ID|IE|IL|IM|IN|INFO|INT|IO|IQ|IR|IS|IT|JE|JM|JO|JOBS|JP|KE|KG|KH|KI|KM|KN|KP|KR|KW|KY|KZ|LA|LB|LC|LI|LK|LR|LS|LT|LU|LV|LY|MA|MC|MD|ME|MG|MH|MIL|MK|ML|MM|MN|MO|MOBI|MP|MQ|MR|MS|MT|MU|MUSEUM|MV|MW|MX|MY|MZ|NA|NAME|NC|NE|NET|NF|NG|NI|NL|NO|NP|NR|NU|NZ|OM|ORG|PA|PE|PF|PG|PH|PK|PL|PM|PN|PR|PRO|PS|PT|PW|PY|QA|RE|RO|RS|RU|RW|SA|SB|SC|SD|SE|SG|SH|SI|SJ|SK|SL|SM|SN|SO|SR|ST|SU|SV|SY|SZ|TC|TD|TEL|TF|TG|TH|TJ|TK|TL|TM|TN|TO|TP|TR|TRAVEL|TT|TV|TW|TZ|UA|UG|UK|US|UY|UZ|VA|VC|VE|VG|VI|VN|VU|WF|WS|XN--0ZWM56D|XN--11B5BS3A9AJ6G|XN--80AKHBYKNJ4F|XN--9T4B11YI5A|XN--DEBA0AD|XN--G6W251D|XN--HGBK6AJ7F53BBA|XN--HLCJ6AYA9ESC7A|XN--JXALPDLP|XN--KGBECHTV|XN--ZCKZAH|YE|YT|YU|ZA|ZM|ZW)(?:$|\/)/i",
$string,
$M);
$has_tld = (count($M) > 0) ? true : false;
return $has_tld;
}
function cleaner($url) {
$U = explode(' ',$url);
$W =array();
foreach ($U as $k => $u) {
if (stristr($u,".")) { //only preg_match if there is a dot
if (containsTLD($u) === true) {
unset($U[$k]);
return cleaner( implode(' ',$U));
}
}
}
return implode(' ',$U);
}
$url = "Here is another funny site badurl.badone somesite.ca/worse.jpg but this badsite.com www.tinyurl.com/55555 and http://www.tinyurl.com/55555 and img.hostingsite.com/badpic.jpg";
echo "Cleaned: " . cleaner($url);
returns:
Cleaned: Here is another funny site badurl.badone but this and and
$string = preg_replace('/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i', '', $string);
Parsing text for URLs is hard and looking for pre-existing, heavily tested code that already does this for you would be better than writing your own code and missing edge cases. For example, I would take a look at the process in Django's urlize, which wraps URLs in anchors. You could port it over to PHP, and--instead of wrapping URLs in an anchor--just delete them from the text.
Thanks, Mike.
A small update, since the original returned a notice error:
$string = preg_replace('/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i', '', $string);
$url = "Here is a funny site http://www.tunyurl.com/34934";
$replace = 'http www .com .org .net';
$with = '';
$clean_url = clean($url,$replace,$with);
echo $clean_url;
function clean($url,$replace,$with) {
$replace = explode(" ",$replace);
$new_string = '';
$check = explode(" ",$url);
foreach($check AS $key => $value) {
foreach($replace AS $key2 => $value2 ) {
if (strpos( strtolower($value), strtolower($value2) ) !== false) {
$value = $with;
break;
}
}
$new_string .= " ".$value;
}
return $new_string;
}
You would need to write a regular expression to extract out the urls.
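A minimal sketch of such an expression, covering only the two cases in the question (text starting with http://, https:// or www., running up to the next space). It deliberately does not try to catch bare domains such as img.hostingsite.com, and strip_urls is a hypothetical name.

```php
<?php
// Hypothetical helper: remove anything that starts with http://, https://
// or www. and runs up to the next whitespace character.
function strip_urls(string $text): string
{
    $text = preg_replace('~\b(?:https?://|www\.)\S+~i', '', $text);
    // Collapse the gaps left where URLs were removed.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo strip_urls('Here is a funny site http://www.tunyurl.com/34934'), "\n";
echo strip_urls('Here is another funny site www.tinyurl.com/55555'), "\n";
```

Using ~ as the pattern delimiter avoids having to escape every slash in the URL scheme.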
