Does anyone know this "language"? - php

I need to parse this language in PHP, but I don't know what language it is or how to parse it.
Does anyone know what language it is?
And if it's not a language, can someone explain to me how to parse it?
Thank you very much
include "folder/file1.conf"
include "folder/file2.conf"
auth-mocked {
welcome = "Welcome"
login = "Login to continue:"
placeholder = "login"
button = "Login"
error = "Error:"
}
auth {
sso {
validation {
expected-uuid = "You need an UID"
}
session-not-found = "session was not found"
}
}
header {
company-name = "Company name"
help-popup {
title = "Need help?"
paragraph = "If you have any issue, you can contact your dedicated interlocutor:"
}
language-popup {
title = "Change language"
}
language = "Change language"
profile = "My profile"
terms-of-use = "Terms of use"
ao-documents = "Documents"
logout = "Logout"
user = "User"
}
black-panel {
common {
form = "You are currently filling the form:"
btn-i-understand = "Ok, thanks"
btn-link-view = "View"
}
}

I have finally created my own parser to get the label for each key.
function parseFile($file){
    $array = array();   // stack of the section names currently open
    $results = array(); // flattened "section.sub.key = value" lines
    $str = "";          // dotted path of the current section
    $lines = file('./generated_json/'.$file, FILE_IGNORE_NEW_LINES);
    foreach($lines as $line){
        $line = trim($line);
        if(substr($line, -1) === "{"){
            // "name {" opens a section: push its name
            array_push($array, trim(substr($line, 0, -1)));
            $str = implode(".", $array);
        } elseif($line === "}"){
            // "}" closes the innermost section (strpos() returned 0 here, which is falsy)
            array_pop($array);
            $str = implode(".", $array);
        } elseif(strpos($line, "=") !== false){
            // Split on the first "=" only, so a "=" inside the value survives
            $keyEx = explode("=", $line, 2);
            $key = trim($keyEx[0]);
            $value = trim($keyEx[1]);
            $parsed = ($str === "") ? $key : $str.".".$key;
            array_push($results, $parsed." = ".$value);
        }
    }
    return $results;
}
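For comparison, here is a minimal sketch of the same stack-based idea that works on a string instead of a file and returns an associative array of dotted key => label. The function name parseConfString is hypothetical, and it assumes the file only contains include lines, "name {" openers, "}" closers and key = "value" pairs, as in the sample above.

```php
<?php
// Hypothetical helper: flatten the nested blocks into "a.b.key" => "label".
function parseConfString(string $text): array
{
    $stack = [];   // names of the sections currently open
    $labels = [];  // dotted key => label
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        // Assumption: blank lines and include directives are simply skipped.
        if ($line === '' || strpos($line, 'include ') === 0) {
            continue;
        }
        if (substr($line, -1) === '{') {
            $stack[] = trim(substr($line, 0, -1));   // open a section
        } elseif ($line === '}') {
            array_pop($stack);                       // close the innermost section
        } elseif (strpos($line, '=') !== false) {
            list($key, $value) = explode('=', $line, 2);
            $path = implode('.', array_merge($stack, [trim($key)]));
            $labels[$path] = trim(trim($value), '"'); // drop surrounding quotes
        }
    }
    return $labels;
}
```

Calling parseConfString(file_get_contents($path)) would then let you look up a label directly by its dotted key.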

It may be a homemade file format, but here is a list of common file formats used for translation:
http://docs.translatehouse.org/projects/translate-toolkit/en/latest/formats/
If you don't find your file format in there, you could probably write a parser for it.

Related

How can I ban specific numbers from a .txt file?

I made a text file named ben.txt that contains some numbers, one per line (for example 123456). Whenever someone types !check 123456, they should get a message like "Number Banned".
I wrote some code, but it isn't working.
My code:
$been = file_get_contents(ben.txt);
$isbanned = false;
foreach ($been as $bb) {
if(strpos($message, "!sa $bb") ===0) $isbanned = true;
sendMessage($chatId, "<b>Number Banned!</b>");
return;
}
You can turn the contents of the file into a regular expression that matches any of the strings.
$ex_cont = file("ben.txt", FILE_IGNORE_NEW_LINES);
$isbanned = false;
$regex = '/^!sa (' . implode('|', array_map('preg_quote', $ex_cont)) . ')/';
if (preg_match($regex, $message)) {
$isbanned = true;
sendMessage($chatId, "<b>Number Banned!</b>");
}
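As an alternative sketch, the number can be pulled out of the message first and then compared exactly against the list. This avoids a prefix match such as "!sa 1234567" being treated as the banned 123456. The function name bannedNumberFromMessage is hypothetical, and the "!sa" command is taken from the code above.

```php
<?php
// Hypothetical helper: returns the banned number found in the message, or null.
function bannedNumberFromMessage(string $message, array $banned, string $command = '!sa'): ?string
{
    // Accept "<command> <digits>" with nothing else after the digits.
    $pattern = '/^' . preg_quote($command, '/') . '\s+(\d+)\s*$/';
    if (!preg_match($pattern, $message, $m)) {
        return null;
    }
    // Strict comparison against the lines read from the file.
    return in_array($m[1], $banned, true) ? $m[1] : null;
}
```

With $banned = file("ben.txt", FILE_IGNORE_NEW_LINES), a non-null result means the message should be answered with the "Number Banned!" text.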

Automatic change of spaces and periods to underscores

I would like to know if there is a way to stop my data from being automatically reformatted from ".data" to "_data" or from "data name" to "data_name".
if (!empty($_POST)) {
foreach ($_POST as $app_id=>$val) {
if ($app_id != "submit") {
$prof_rate = $_POST[$app_id];
$tokens = explode("-", $app_id);
$label = $tokens[0];
$applicationId = $tokens[1];
// insert query technology_results
echo $label.":".$prof_rate."<br/>";
}
}
} else {
$app_val_err = "Error Message!";
}
We can try using preg_replace to replace all spaces and dots with underscores:
$input = "some.name here";
$output = preg_replace("/[ .]/", "_", $input);
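For background, PHP itself rewrites dots and spaces in incoming field names to underscores when it fills $_POST. If the original names are needed, one option is to parse the raw request body by hand. The following is a minimal sketch, assuming a urlencoded (not multipart) form body; parseRawForm is a hypothetical helper name.

```php
<?php
// Hypothetical helper: parse "a=1&b=2" without PHP's field-name rewriting.
function parseRawForm(string $raw): array
{
    $fields = [];
    foreach (explode('&', $raw) as $pair) {
        if ($pair === '') {
            continue;
        }
        $parts = explode('=', $pair, 2);
        $name = urldecode($parts[0]);                        // keeps "." and " " as sent
        $value = isset($parts[1]) ? urldecode($parts[1]) : '';
        $fields[$name] = $value;
    }
    return $fields;
}

// In a request handler (assumption: the form is posted as application/x-www-form-urlencoded):
// $fields = parseRawForm(file_get_contents('php://input'));
```

Unlike $_POST, this sketch does not expand names such as item[] into nested arrays.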

Alter all a href links in php

I'm currently working on something where I need to add UTM tags to all links, and I have one or two minor issues I can't figure out.
This is the code I am using. The issue is that if a link already has a parameter like ?test=test, the code refuses to add the UTM tags.
The other issue is minor, and I'm not sure it makes sense to change: instead of me having to specify a URL, it would be neat if it added UTM tags to ALL a hrefs by default, without knowing the domain name.
I hope someone can help me out and push me in the right direction.
$url_modifier_domain = preg_quote('add-link.com');
$html_text = preg_replace_callback(
'#((?:https?:)?//'.$url_modifier_domain.'(/[^\'"\#]*)?)(?=[\'"\#])#i',
function($matches){
$url_modifier = 'utm=some&medium=stuff';
if (!isset($matches[2])) return $matches[1]."/?$url_modifier";
$q = strpos($matches[2],'?');
if ($q===false) return $matches[1]."?$url_modifier";
if ($q==strlen($matches[2])-1) return $matches[1].$url_modifier;
return $matches[1]."&$url_modifier";
},
$html);
Once the URLs are detected, you can use parse_url() and parse_str() to break each URL apart, add utm and medium, and rebuild it without caring too much about the content of the GET parameters or the hash:
$url_modifier_domain = preg_quote('add-link.com');
$html_text = preg_replace_callback(
'#((?:https?:)?//'.$url_modifier_domain.'(/[^\'"\#]*)?)(?=[\'"\#])#i',
function ($matches) {
$link = $matches[0];
if (strpos($link, '#') !== false) {
list($link, $hash) = explode('#', $link);
}
$res = parse_url($link);
$result = '';
if (isset($res['scheme'])) {
$result .= $res['scheme'].'://';
}
if (isset($res['host'])) {
$result .= $res['host'];
}
if (isset($res['path'])) {
$result .= $res['path'];
}
if (isset($res['query'])) {
parse_str($res['query'], $res['query']);
} else {
$res['query'] = [];
}
$res['query']['utm'] = 'some';
$res['query']['medium'] = 'stuff';
if (count($res['query']) > 0) {
$result .= '?'.http_build_query($res['query']);
}
if (isset($hash)) {
$result .= '#'.$hash;
}
return $result;
},
$html
);
As you can see, the code is longer but simpler.
Edit
I made some changes so that it searches for every href="xxx" inside the text. If the link is not from add-link.com, the script skips it; otherwise it tries to rebuild it as well as it can.
$html = 'blabla a
a
a
a
a
a
a
a
a
a
a
';
$url_modifier_domain = preg_quote('add-link.com');
$html_text = preg_replace_callback(
'/href="([^"]+)"/i',
function ($matches) {
$link = $matches[1];
// ignoring outer links
if(strpos($link,'add-link.com') === false) return 'href="'.$link.'"';
if (strpos($link, '#') !== false) {
list($link, $hash) = explode('#', $link);
}
$res = parse_url($link);
$result = '';
if (isset($res['scheme'])) {
$result .= $res['scheme'].'://';
} else if(isset($res['host'])) {
$result .= '//';
}
if (isset($res['host'])) {
$result .= $res['host'];
}
if (isset($res['path'])) {
$result .= $res['path'];
} else {
$result .= '/';
}
if (isset($res['query'])) {
parse_str($res['query'], $res['query']);
} else {
$res['query'] = [];
}
$res['query']['utm'] = 'some';
$res['query']['medium'] = 'stuff';
if (count($res['query']) > 0) {
$result .= '?'.http_build_query($res['query']);
}
if (isset($hash)) {
$result .= '#'.$hash;
}
return 'href="'.$result.'"';
},
$html
);
var_dump($html_text);
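A compact variant of the same idea is sketched below. It splits on "#" and "?" directly instead of calling parse_url(), so it also works for relative links and for any domain. The name addQueryParams is hypothetical, and the sketch assumes the href value is a plain URL (not HTML-encoded, so no &amp; inside it).

```php
<?php
// Hypothetical helper: merge extra query parameters into a single URL.
function addQueryParams(string $url, array $extra): string
{
    // Set the fragment aside so it can be re-attached at the very end.
    $hash = null;
    if (strpos($url, '#') !== false) {
        list($url, $hash) = explode('#', $url, 2);
    }
    // Separate the existing query string, if any, from the rest of the URL.
    $query = [];
    $base = $url;
    $qpos = strpos($url, '?');
    if ($qpos !== false) {
        $base = substr($url, 0, $qpos);
        parse_str(substr($url, $qpos + 1), $query);
    }
    // Existing parameters are kept; the extra ones win on a name clash.
    $query = array_merge($query, $extra);
    $result = $base . '?' . http_build_query($query);
    if ($hash !== null) {
        $result .= '#' . $hash;
    }
    return $result;
}
```

Inside the href callback above, the return line could then become 'href="'.addQueryParams($link, ['utm' => 'some', 'medium' => 'stuff']).'"'.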

Windows-1251 file inside UTF-8 site?

Hello everyone, Masters of Web Development :)
I have a piece of PHP script that fetches the last 10 played songs from my Winamp. This script is in a file (let's call it "lastplayed.php") which is included in my site with the PHP include function, inside a "div".
My site uses UTF-8 encoding. The problem is that some song titles are in Windows-1251 encoding, and on my site they are displayed like "������"...
Is there any known way to tell this div, with "lastplayed.php" included in it, to use Windows-1251 encoding?
Or any other suggestions?
P.S.: The file with the fetching script, a.k.a. "lastplayed.php", is converted to UTF-8, but if it is ASCII the result is the same. I also tried putting a meta tag with windows-1251 inside the head tag, but again nothing happens.
P.P.S.: The script that fetches Winamp's data (lastplayed.php):
<?php
/******
* You may use and/or modify this script as long as you:
* 1. Keep my name & webpage mentioned
* 2. Don't use it for commercial purposes
*
* If you want to use this script without complying to the rules above, please contact me first at: marty@excudo.net
*
* Author: Martijn Korse
* Website: http://devshed.excudo.net
*
* Date: 08-05-2006
***/
/**
* version 2.0
*/
class Radio
{
var $fields = array();
var $fieldsDefaults = array("Server Status", "Stream Status", "Listener Peak", "Average Listen Time", "Stream Title", "Content Type", "Stream Genre", "Stream URL", "Current Song");
var $very_first_str;
var $domain, $port, $path;
var $errno, $errstr;
var $trackLists = array();
var $isShoutcast;
var $nonShoutcastData = array(
"Server Status" => "n/a",
"Stream Status" => "n/a",
"Listener Peak" => "n/a",
"Average Listen Time" => "n/a",
"Stream Title" => "n/a",
"Content Type" => "n/a",
"Stream Genre" => "n/a",
"Stream URL" => "n/a",
"Stream AIM" => "n/a",
"Stream IRC" => "n/a",
"Current Song" => "n/a"
);
var $altServer = False;
function Radio($url)
{
$parsed_url = parse_url($url);
$this->domain = isset($parsed_url['host']) ? $parsed_url['host'] : "";
$this->port = !isset($parsed_url['port']) || empty($parsed_url['port']) ? "80" : $parsed_url['port'];
$this->path = empty($parsed_url['path']) ? "/" : $parsed_url['path'];
if (empty($this->domain))
{
$this->domain = $this->path;
$this->path = "";
}
$this->setOffset("Current Stream Information");
$this->setFields(); // setting default fields
$this->setTableStart("<table border=0 cellpadding=2 cellspacing=2>");
$this->setTableEnd("</table>");
}
function setFields($array=False)
{
if (!$array)
$this->fields = $this->fieldsDefaults;
else
$this->fields = $array;
}
function setOffset($string)
{
$this->very_first_str = $string;
}
function setTableStart($string)
{
$this->tableStart = $string;
}
function setTableEnd($string)
{
$this->tableEnd = $string;
}
function getHTML($page=False)
{
if (!$page)
$page = $this->path;
$contents = "";
$domain = (substr($this->domain, 0, 7) == "http://") ? substr($this->domain, 7) : $this->domain;
if (@$fp = fsockopen($domain, $this->port, $this->errno, $this->errstr, 2))
{
fputs($fp, "GET ".$page." HTTP/1.1\r\n".
"User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)\r\n".
"Accept: */*\r\n".
"Host: ".$domain."\r\n\r\n");
$c = 0;
while (!feof($fp) && $c <= 20)
{
$contents .= fgets($fp, 4096);
$c++;
}
fclose ($fp);
preg_match("/(Content-Type:)(.*)/i", $contents, $matches);
if (count($matches) > 0)
{
$contentType = trim($matches[2]);
if ($contentType == "text/html")
{
$this->isShoutcast = True;
return $contents;
}
else
{
$this->isShoutcast = False;
$htmlContent = substr($contents, 0, strpos($contents, "\r\n\r\n"));
$dataStr = str_replace("\r", "\n", str_replace("\r\n", "\n", $contents));
$lines = explode("\n", $dataStr);
foreach ($lines AS $line)
{
if ($dp = strpos($line, ":"))
{
$key = substr($line, 0, $dp);
$value = trim(substr($line, ($dp+1)));
if (preg_match("/genre/i", $key))
$this->nonShoutcastData['Stream Genre'] = $value;
if (preg_match("/name/i", $key))
$this->nonShoutcastData['Stream Title'] = $value;
if (preg_match("/url/i", $key))
$this->nonShoutcastData['Stream URL'] = $value;
if (preg_match("/content-type/i", $key))
$this->nonShoutcastData['Content Type'] = $value;
if (preg_match("/icy-br/i", $key))
$this->nonShoutcastData['Stream Status'] = "Stream is up at ".$value."kbps";
if (preg_match("/icy-notice2/i", $key))
{
$this->nonShoutcastData['Server Status'] = "This is <span style=\"color: red;\">not</span> a Shoutcast server!";
if (preg_match("/ultravox/i", $value))
$this->nonShoutcastData['Server Status'] .= " But an Ultravox Server";
$this->altServer = $value;
}
}
}
return nl2br($htmlContent);
}
}
else
return $contents;
}
else
{
return False;
}
}
function getServerInfo($display_array=null, $very_first_str=null)
{
if (!isset($display_array))
$display_array = $this->fields;
if (!isset($very_first_str))
$very_first_str = $this->very_first_str;
if ($html = $this->getHTML())
{
// parsing the contents
$data = array();
foreach ($display_array AS $key => $item)
{
if ($this->isShoutcast)
{
$very_first_pos = stripos($html, $very_first_str);
$first_pos = stripos($html, $item, $very_first_pos);
$line_start = strpos($html, "<td>", $first_pos);
$line_end = strpos($html, "</td>", $line_start) + 4;
$difference = $line_end - $line_start;
$line = substr($html, $line_start, $difference);
$data[$key] = strip_tags($line);
}
else
{
$data[$key] = $this->nonShoutcastData[$item];
}
}
return $data;
}
else
{
return $this->errstr." (".$this->errno.")";
}
}
function createHistoryArray($page)
{
if (!in_array($page, $this->trackLists))
{
$this->trackLists[] = $page;
if ($html = $this->getHTML($page))
{
$fromPos = stripos($html, $this->tableStart);
$toPos = stripos($html, $this->tableEnd, $fromPos);
$tableData = substr($html, $fromPos, ($toPos-$fromPos));
$lines = explode("</tr><tr>", $tableData);
$tracks = array();
$c = 0;
foreach ($lines AS $line)
{
$info = explode ("</td><td>", $line);
$time = trim(strip_tags($info[0]));
if (substr($time, 0, 9) != "Copyright" && !preg_match("/Tag Loomis, Tom Pepper and Justin Frankel/i", $info[1]))
{
$this->tracks[$c]['time'] = $time;
$this->tracks[$c++]['track'] = trim(strip_tags($info[1]));
}
}
if (count($this->tracks) > 0)
{
unset($this->tracks[0]);
if (isset($this->tracks[1]))
$this->tracks[1]['track'] = str_replace("Current Song", "", $this->tracks[1]['track']);
}
}
else
{
$this->tracks[0] = array("time"=>$this->errno, "track"=>$this->errstr);
}
}
}
function getHistoryArray($page="/played.html")
{
if (!in_array($page, $this->trackLists))
$this->createHistoryArray($page);
return $this->tracks;
}
function getHistoryTable($page="/played.html", $trackColText=False, $class=False)
{
$title_utf8 = mb_convert_encoding($trackArr ,"utf-8" ,"auto");
if (!in_array($page, $this->trackLists))
$this->createHistoryArray($page);
if ($trackColText)
$output .= "
<div class='lastplayed_top'></div>
<div".($class ? " class=\"".$class."\"" : "").">";
foreach ($this->tracks AS $title_utf8)
$output .= "<div style='padding:2px 0;'>".$title_utf8['track']."</div>";
$output .= "</div><div class='lastplayed_bottom'></div>
<div class='lastplayed_title'>".$trackColText."</div>
\n";
return $output;
}
}
// this is needed for those with a php version < 5
// the function is copied from the user comments @ php.net (http://nl3.php.net/stripos)
if (!function_exists("stripos"))
{
function stripos($haystack, $needle, $offset=0)
{
return strpos(strtoupper($haystack), strtoupper($needle), $offset);
}
}
?>
And the calling script outside the lastplayed.php:
include "lastplayed.php";
$radio = new Radio($ip.":".$port);
echo $radio->getHistoryTable("/played.html", "<b>Last played:</b>", "lastplayed_content");
If all of your source data is in windows-1251, you can use something like:
$title_utf8 = mb_convert_encoding($title, "utf-8", "Windows-1251");
and put that converted data in your HTML stream.
Since I'm only looking at docs, I'm not 100% sure that the source encoding alias is correct; you may want to try CP1251 if Windows-1251 doesn't work.
If your source data isn't reliably in 1251, you'll have to come up with a heuristic to guess, and use the same conversion method. mb_detect_encoding may help you.
You cannot change the encoding of just part of an HTML document, but you can certainly convert everything to UTF-8 easily enough.
The newer ID3 implementations have an encoding marker in their text frames:
$00 – ISO-8859-1 (ASCII).
$01 – UCS-2 in ID3v2.2 and ID3v2.3, UTF-16 encoded Unicode with BOM.
$02 – UTF-16BE encoded Unicode without BOM in ID3v2.4 only.
$03 – UTF-8 encoded Unicode in ID3v2.4 only.
Is it possible that your content is in UTF16?
Based on the code you've posted, it's not clear how $trackArr is defined, as it's not referenced elsewhere. It looks like you have several problems.
$title_utf8 = mb_convert_encoding($trackArr ,"utf-8" ,"auto");
"auto" expands to a list of encodings that does not include Windows-1251, so I'm not sure why you've used it. You really should use "Windows-1251". I have tried using "Windows-1251,utf-16" on a Mac with PHP installed, but autodetection fails to find a suitable encoding for a relatively short string, so it looks like you're going to have to be the one to guess.
But that code doesn't look like it has any reason to exist anyway, as you overwrite the values with your iteration:
foreach ($this->tracks AS $title_utf8)
$output .= "<div style='padding:2px 0;'>".$title_utf8['track']."</div>";
In each iteration, the variable $title_utf8 is assigned to the current track. What you probably want is something more like:
foreach ($this->tracks AS $current_track)
$output .= "<div style='padding:2px 0;'>".mb_convert_encoding($current_track['track'], "utf-8", "Windows-1251")."</div>";
mb_convert_encoding takes a string as the first argument, not an array or object, so you need to apply this encoding on each string that is not utf-8.
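A minimal sketch of that per-string conversion follows. The helper name toUtf8 is hypothetical. It assumes the mbstring extension is loaded and that anything that is not already valid UTF-8 is Windows-1251, which is a guess about the source data and not something the stream itself confirms.

```php
<?php
// Hypothetical helper: convert one title to UTF-8 only when it needs it.
function toUtf8(string $text, string $fallback = 'Windows-1251'): string
{
    // Valid UTF-8 is passed through untouched to avoid double conversion.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    // Assumption: every non-UTF-8 title uses the fallback encoding.
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}
```

In the loop above, this would be called as toUtf8($current_track['track']). A short Windows-1251 string can occasionally also be valid UTF-8, so the check is a heuristic, not a guarantee.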
Just to let you know that the latest version supports character encoding/decoding :-)

PHP Remove URL from string

If I have a string that contains a URL (for example's sake, we'll call it $url), such as:
$url = "Here is a funny site http://www.tunyurl.com/34934";
How do I remove the URL from the string?
The difficulty is that URLs might also show up without the http://, such as:
$url = "Here is another funny site www.tinyurl.com/55555";
There is no HTML present. How would I search for http or www and, if either is found, remove the text/numbers/symbols up to the first space?
I re-read the question, here is a function that would work as intended:
function cleaner($url) {
$U = explode(' ',$url);
$W =array();
foreach ($U as $k => $u) {
if (stristr($u,'http') || (count(explode('.',$u)) > 1)) {
unset($U[$k]);
return cleaner( implode(' ',$U));
}
}
return implode(' ',$U);
}
$url = "Here is another funny site www.tinyurl.com/55555 and http://www.tinyurl.com/55555 and img.hostingsite.com/badpic.jpg";
echo "Cleaned: " . cleaner($url);
Edit #2/#3 (I must be bored). Here is a version that verifies there is a TLD within the URL:
function containsTLD($string) {
preg_match(
"/(AC($|\/)|\.AD($|\/)|\.AE($|\/)|\.AERO($|\/)|\.AF($|\/)|\.AG($|\/)|\.AI($|\/)|\.AL($|\/)|\.AM($|\/)|\.AN($|\/)|\.AO($|\/)|\.AQ($|\/)|\.AR($|\/)|\.ARPA($|\/)|\.AS($|\/)|\.ASIA($|\/)|\.AT($|\/)|\.AU($|\/)|\.AW($|\/)|\.AX($|\/)|\.AZ($|\/)|\.BA($|\/)|\.BB($|\/)|\.BD($|\/)|\.BE($|\/)|\.BF($|\/)|\.BG($|\/)|\.BH($|\/)|\.BI($|\/)|\.BIZ($|\/)|\.BJ($|\/)|\.BM($|\/)|\.BN($|\/)|\.BO($|\/)|\.BR($|\/)|\.BS($|\/)|\.BT($|\/)|\.BV($|\/)|\.BW($|\/)|\.BY($|\/)|\.BZ($|\/)|\.CA($|\/)|\.CAT($|\/)|\.CC($|\/)|\.CD($|\/)|\.CF($|\/)|\.CG($|\/)|\.CH($|\/)|\.CI($|\/)|\.CK($|\/)|\.CL($|\/)|\.CM($|\/)|\.CN($|\/)|\.CO($|\/)|\.COM($|\/)|\.COOP($|\/)|\.CR($|\/)|\.CU($|\/)|\.CV($|\/)|\.CX($|\/)|\.CY($|\/)|\.CZ($|\/)|\.DE($|\/)|\.DJ($|\/)|\.DK($|\/)|\.DM($|\/)|\.DO($|\/)|\.DZ($|\/)|\.EC($|\/)|\.EDU($|\/)|\.EE($|\/)|\.EG($|\/)|\.ER($|\/)|\.ES($|\/)|\.ET($|\/)|\.EU($|\/)|\.FI($|\/)|\.FJ($|\/)|\.FK($|\/)|\.FM($|\/)|\.FO($|\/)|\.FR($|\/)|\.GA($|\/)|\.GB($|\/)|\.GD($|\/)|\.GE($|\/)|\.GF($|\/)|\.GG($|\/)|\.GH($|\/)|\.GI($|\/)|\.GL($|\/)|\.GM($|\/)|\.GN($|\/)|\.GOV($|\/)|\.GP($|\/)|\.GQ($|\/)|\.GR($|\/)|\.GS($|\/)|\.GT($|\/)|\.GU($|\/)|\.GW($|\/)|\.GY($|\/)|\.HK($|\/)|\.HM($|\/)|\.HN($|\/)|\.HR($|\/)|\.HT($|\/)|\.HU($|\/)|\.ID($|\/)|\.IE($|\/)|\.IL($|\/)|\.IM($|\/)|\.IN($|\/)|\.INFO($|\/)|\.INT($|\/)|\.IO($|\/)|\.IQ($|\/)|\.IR($|\/)|\.IS($|\/)|\.IT($|\/)|\.JE($|\/)|\.JM($|\/)|\.JO($|\/)|\.JOBS($|\/)|\.JP($|\/)|\.KE($|\/)|\.KG($|\/)|\.KH($|\/)|\.KI($|\/)|\.KM($|\/)|\.KN($|\/)|\.KP($|\/)|\.KR($|\/)|\.KW($|\/)|\.KY($|\/)|\.KZ($|\/)|\.LA($|\/)|\.LB($|\/)|\.LC($|\/)|\.LI($|\/)|\.LK($|\/)|\.LR($|\/)|\.LS($|\/)|\.LT($|\/)|\.LU($|\/)|\.LV($|\/)|\.LY($|\/)|\.MA($|\/)|\.MC($|\/)|\.MD($|\/)|\.ME($|\/)|\.MG($|\/)|\.MH($|\/)|\.MIL($|\/)|\.MK($|\/)|\.ML($|\/)|\.MM($|\/)|\.MN($|\/)|\.MO($|\/)|\.MOBI($|\/)|\.MP($|\/)|\.MQ($|\/)|\.MR($|\/)|\.MS($|\/)|\.MT($|\/)|\.MU($|\/)|\.MUSEUM($|\/)|\.MV($|\/)|\.MW($|\/)|\.MX($|\/)|\.MY($|\/)|\.MZ($|\/)|\.NA($|\/)|\.NAME($|\/)|\.NC($|\/)|\.NE($|\/)|\.NET($|\/)|\.NF($|\/)|\.NG($|\/)|\.
NI($|\/)|\.NL($|\/)|\.NO($|\/)|\.NP($|\/)|\.NR($|\/)|\.NU($|\/)|\.NZ($|\/)|\.OM($|\/)|\.ORG($|\/)|\.PA($|\/)|\.PE($|\/)|\.PF($|\/)|\.PG($|\/)|\.PH($|\/)|\.PK($|\/)|\.PL($|\/)|\.PM($|\/)|\.PN($|\/)|\.PR($|\/)|\.PRO($|\/)|\.PS($|\/)|\.PT($|\/)|\.PW($|\/)|\.PY($|\/)|\.QA($|\/)|\.RE($|\/)|\.RO($|\/)|\.RS($|\/)|\.RU($|\/)|\.RW($|\/)|\.SA($|\/)|\.SB($|\/)|\.SC($|\/)|\.SD($|\/)|\.SE($|\/)|\.SG($|\/)|\.SH($|\/)|\.SI($|\/)|\.SJ($|\/)|\.SK($|\/)|\.SL($|\/)|\.SM($|\/)|\.SN($|\/)|\.SO($|\/)|\.SR($|\/)|\.ST($|\/)|\.SU($|\/)|\.SV($|\/)|\.SY($|\/)|\.SZ($|\/)|\.TC($|\/)|\.TD($|\/)|\.TEL($|\/)|\.TF($|\/)|\.TG($|\/)|\.TH($|\/)|\.TJ($|\/)|\.TK($|\/)|\.TL($|\/)|\.TM($|\/)|\.TN($|\/)|\.TO($|\/)|\.TP($|\/)|\.TR($|\/)|\.TRAVEL($|\/)|\.TT($|\/)|\.TV($|\/)|\.TW($|\/)|\.TZ($|\/)|\.UA($|\/)|\.UG($|\/)|\.UK($|\/)|\.US($|\/)|\.UY($|\/)|\.UZ($|\/)|\.VA($|\/)|\.VC($|\/)|\.VE($|\/)|\.VG($|\/)|\.VI($|\/)|\.VN($|\/)|\.VU($|\/)|\.WF($|\/)|\.WS($|\/)|\.XN--0ZWM56D($|\/)|\.XN--11B5BS3A9AJ6G($|\/)|\.XN--80AKHBYKNJ4F($|\/)|\.XN--9T4B11YI5A($|\/)|\.XN--DEBA0AD($|\/)|\.XN--G6W251D($|\/)|\.XN--HGBK6AJ7F53BBA($|\/)|\.XN--HLCJ6AYA9ESC7A($|\/)|\.XN--JXALPDLP($|\/)|\.XN--KGBECHTV($|\/)|\.XN--ZCKZAH($|\/)|\.YE($|\/)|\.YT($|\/)|\.YU($|\/)|\.ZA($|\/)|\.ZM($|\/)|\.ZW)/i",
$string,
$M);
$has_tld = (count($M) > 0) ? true : false;
return $has_tld;
}
function cleaner($url) {
$U = explode(' ',$url);
$W =array();
foreach ($U as $k => $u) {
if (stristr($u,".")) { //only preg_match if there is a dot
if (containsTLD($u) === true) {
unset($U[$k]);
return cleaner( implode(' ',$U));
}
}
}
return implode(' ',$U);
}
$url = "Here is another funny site badurl.badone somesite.ca/worse.jpg but this badsite.com www.tinyurl.com/55555 and http://www.tinyurl.com/55555 and img.hostingsite.com/badpic.jpg";
echo "Cleaned: " . cleaner($url);
returns:
Cleaned: Here is another funny site badurl.badone but this and and
$string = preg_replace('/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i', '', $string);
Parsing text for URLs is hard and looking for pre-existing, heavily tested code that already does this for you would be better than writing your own code and missing edge cases. For example, I would take a look at the process in Django's urlize, which wraps URLs in anchors. You could port it over to PHP, and--instead of wrapping URLs in an anchor--just delete them from the text.
Thanks, Mike.
I updated it a bit, because it returned a notice error:
'/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i'
$string = preg_replace('/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i', '', $string);
$url = "Here is a funny site http://www.tunyurl.com/34934";
$replace = 'http www .com .org .net';
$with = '';
$clean_url = clean($url,$replace,$with);
echo $clean_url;
function clean($url,$replace,$with) {
$replace = explode(" ",$replace);
$new_string = '';
$check = explode(" ",$url);
foreach($check AS $key => $value) {
foreach($replace AS $key2 => $value2 ) {
if (stripos($value, $value2) !== false) {
$value = $with;
break;
}
}
$new_string .= " ".$value;
}
return $new_string;
}
You would need to write a regular expression to extract out the urls.
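A minimal sketch of such a regular expression is shown below, assuming that a URL is any run of non-space characters starting with http://, https:// or www. (the two cases from the question). The function name stripUrls is hypothetical, and bare hosts such as img.hostingsite.com are deliberately not covered.

```php
<?php
// Hypothetical helper: remove http(s):// and www. URLs from plain text.
function stripUrls(string $text): string
{
    // A URL is taken to run from the scheme or "www." up to the next whitespace.
    $withoutUrls = preg_replace('~\b(?:https?://|www\.)\S+~i', '', $text);
    // Collapse the double spaces left behind and trim the ends.
    return trim(preg_replace('/\s{2,}/', ' ', $withoutUrls));
}

echo stripUrls("Here is a funny site http://www.tunyurl.com/34934");
```

Punctuation directly attached to a URL, such as a trailing full stop, is removed together with it.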
