PHP - Whatsapp Web - Message format

How do I format a WhatsApp Web message?
I tried replacing tags like this:
$nl = "%0D%0A";
$space = "%20";
$MSG = nl2br($MSG);
$MSG = str_replace( array("<b>","<bold>","</b>","</bold>"), array("*","*","*","*"), $MSG);
$MSG = str_replace( array(" ","<br>","\n", "\r\n"), array($space,$nl,$nl,$nl), $MSG);
I also tried urlencode and htmlspecialchars, but nothing worked.
The message arrives at https://api.whatsapp.com/send?phone=XXX&text=MSG completely unformatted and full of escape sequences, like this:
%F0%9F%94%94%2A...

I found the error and am posting the fix here to help others.
// The message must be UTF-8 (utf8_encode() converts from ISO-8859-1)
$MSG = utf8_encode($MSG);
// WhatsApp URL patterns
$nl = "%0D%0A"; // newline
$space = "%20"; // space
// Replace some tags with WhatsApp markers
$MSG = str_replace( array("<b>","<bold>","</b>","</bold>"), array("*","*","*","*"), $MSG);
// Replace spaces and newlines with the WhatsApp format ("\r\n" must come before "\n")
$MSG = str_replace( array(" ","<br>","\r\n","\n"), array($space,$nl,$nl,$nl), $MSG);
I replace only bold, but you can add other tags. See:
https://faq.whatsapp.com/general/chats/how-to-format-your-messages/?lang=en
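As a possible alternative to hand-written %20 and %0D%0A replacements, the link can be built with rawurlencode(), which percent-encodes spaces, line breaks and UTF-8 bytes in one step. This is only a sketch: the function name buildWhatsappLink and the sample phone number are made up for illustration, and the input is assumed to be UTF-8 already.

```php
<?php
// Hypothetical helper (not part of any WhatsApp library): converts a small
// HTML subset to WhatsApp markers and builds an api.whatsapp.com send link.
function buildWhatsappLink(string $phone, string $html): string
{
    // Map bold tags to WhatsApp's asterisk marker (case-insensitive).
    $text = str_ireplace(['<b>', '</b>', '<bold>', '</bold>'], '*', $html);
    // Turn <br>, <br/> and <br /> into real newlines.
    $text = preg_replace('#<br\s*/?>#i', "\n", $text);
    // Normalise every kind of line ending to CRLF, as in the answer above.
    $text = preg_replace('/\r\n|\r|\n/', "\r\n", $text);
    // rawurlencode() emits %20 for spaces and %0D%0A for CRLF.
    return 'https://api.whatsapp.com/send?phone=' . rawurlencode($phone)
        . '&text=' . rawurlencode($text);
}

// Sample call with an invented phone number.
echo buildWhatsappLink('5511999999999', "<b>Hello</b> world<br>second line"), PHP_EOL;
```

The asterisks come out as %2A in the link; they are decoded back to * when the link is opened.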

Related

Swift Mailer - multiple recipients not working

I have a problem with Swift Mailer. It seems easy, but I've been struggling with it for hours.
I need to send an email to multiple recipients.
This is a string I begin with (the content of $email->getRecipients() ):
// Just an example
first@gmail.com, second@gmail.com
First I remove all non-visible characters:
$string = preg_replace('/[\x00-\x1F\x7F]/u', '', $email->getRecipients());
Then I add surrounding quotes:
$addQuotes = "'" . str_replace(",", "','", $string) . "'";
And remove empty spaces:
$recipients = str_replace(' ', '', $addQuotes);
Which gives me:
'first@gmail.com','second@gmail.com'
If I paste the string manually:
->setTo(['first@gmail.com','second@gmail.com'])
It works. But when I try to put variable like this:
$array = explode(',', $recipients);
...
->setTo($array)
Emails are not being sent.
When I do this:
->setTo([$recipients])
I get the error Address in mailbox given ['...','...'] does not comply with RFC 2822, 3.6.2.
I also tried:
foreach($array as $recipient) {
    $message->setTo($recipient);
    $this->get('mailer')->send($message);
}
Not working! But again, if I paste string directly, it works:
foreach($array as $recipient) {
    $message->setTo('first@gmail.com');
    $this->get('mailer')->send($message);
}
The code:
private function sendEmail($email)
{
    //$email->getRecipients() = 'first@gmail.com, second@gmail.com'
    $string = preg_replace('/[\x00-\x1F\x7F]/u', '', $email->getRecipients());
    $addQuotes = "'" . str_replace(",", "','", $string) . "'";
    $recipients = str_replace(' ', '', $addQuotes);
    $array = explode(',', $recipients);
    $message = \Swift_Message::newInstance()
        ->setSubject($email->getSubject())
        ->setFrom($email->getSender())
        ->setTo($array)
        ->setBody(
            $this->renderView(
                'Emails/default.html.twig',
                array('body' => $email->getBody())
            ),
            'text/html'
        );
    //foreach($array as $recipient) {
    //    $message->setTo($recipient);
    //    $this->get('mailer')->send($message);
    //}
    $this->get('mailer')->send($message);
}
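One likely cause, offered as a sketch rather than a confirmed diagnosis: when the addresses are pasted into the source, the single quotes are PHP syntax, but when they are added with str_replace they become part of each string, so every array element is literally 'first@gmail.com' with the quotes included. Splitting and trimming without adding quotes avoids that. The function name parseRecipients is hypothetical.

```php
<?php
// Hypothetical helper: turns "a@x.com, b@y.com" into a clean array of addresses.
function parseRecipients(string $raw): array
{
    // Strip control characters, as in the question.
    $clean = preg_replace('/[\x00-\x1F\x7F]/u', '', $raw);
    // Split on commas and trim surrounding whitespace from each piece.
    $parts = array_map('trim', explode(',', $clean));
    // Drop empty entries (e.g. from a trailing comma) and reindex.
    return array_values(array_filter($parts, function ($part) {
        return $part !== '';
    }));
}

$recipients = parseRecipients("first@gmail.com, second@gmail.com");
print_r($recipients);
```

The resulting array could then be passed to ->setTo($recipients) in place of $array.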

Replacing characters in MIME encoded emails

I am looking for a way to simply replace characters with their ASCII counterparts in MIME encoded emails. I've written preliminary code below, but it seems like the str_replace commands I'm using will keep on going forever to catch all possible combinations. Is there a more efficient way to do this?
<?php
$strings = "=?utf-8?Q?UK=20Defence=20=2D=20Yes=2C=20Both=20Labour=20and=20Tory=20Need=20To=20Be=20Very=20Much=20Clearer=20On=20Defence?=";
function decodeString($input){
    $space = array("=?utf-8?Q?","=?UTF-8?Q?", "=20","?=");
    $hyphen = array("=E2=80=93","=2D");
    $dotdotdot = "=E2=80=A6";
    $pound = "=C2=A3";
    $comma = "=2C";
    $decode = str_replace($space, ' ', $input);
    $decode = str_replace($hyphen, '-', $decode);
    $decode = str_replace($pound, '£', $decode);
    $decode = str_replace($comma, ',', $decode);
    $decode = str_replace($dotdotdot, '...', $decode);
    return $decode;
}
echo decodeString($strings);
?>

I figured it out - I have to pass $strings to the mb_decode_mimeheader() function.
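A minimal sketch of that call, assuming the mbstring extension is loaded and that UTF-8 is the desired output encoding (the header here is a shortened version of the one in the question):

```php
<?php
// mb_decode_mimeheader() converts to the internal encoding, so set it first.
mb_internal_encoding('UTF-8');

// Shortened sample header in RFC 2047 Q-encoding.
$header = "=?utf-8?Q?UK=20Defence=20=2D=20Yes=2C=20Both=20Labour?=";

// Decodes the =XX escapes and strips the =?utf-8?Q? ... ?= wrapper.
$decoded = mb_decode_mimeheader($header);
echo $decoded, PHP_EOL;
```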

PHP imap_fetchbody

I have been trying to fetch a message body, without success.
$body = imap_fetchbody($inbox, $email_id, 0);
Messages without attachments are fine and I get output, but messages with attachments
give complicated output in which both the HTML and the plain-text parts are encoded and wrapped in Content-Type headers, as in Gmail messages.

You can use the following code to get the plain text part of a multipart email body:
<?php
//get the body
$body = imap_fetchbody($inbox, $email_id, 0);
//parse the boundary separator
$matches = array();
preg_match('#Content-Type: multipart\/[^;]+;\s*boundary="([^"]+)"#i', $body, $matches);
//avoid an undefined-index notice when no boundary was found
$boundary = isset($matches[1]) ? $matches[1] : '';
$text = '';
if(!empty($boundary)) {
    //split the body into the boundary parts
    $emailSegments = explode('--' . $boundary, $body);
    //get the plain text part
    foreach($emailSegments as $segment) {
        if(stristr($segment, 'Content-Type: text/plain') !== false) {
            $text = trim(preg_replace('/Content-(Type|ID|Disposition|Transfer-Encoding):.*?\r\n/is', '', $segment));
            break;
        }
    }
}
echo $text;
?>

$body = imap_fetchbody($inbox, $email_id, 1.0);
This seems to be the only one that works for me. I think the first integer in the last parameter represents the section of the email: if it starts with a zero, it contains all the header information; if it starts with a one, it contains the message information. The second integer, after the period, selects a subsection of that section. So when I put zero it shows information, but when I put one or two it doesn't show anything for some emails.

This helped
$body = imap_fetchbody($inbox, $email_id, 1.1);
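Once a part has been fetched it may still be transfer-encoded. The sketch below shows one way to decode it, assuming the encoding number comes from the encoding property that imap_fetchstructure() reports for the part (3 for base64, 4 for quoted-printable). The function name decodePart is hypothetical.

```php
<?php
// Hypothetical helper: decodes a fetched body part according to the numeric
// transfer encoding reported by imap_fetchstructure().
function decodePart(string $data, int $encoding): string
{
    switch ($encoding) {
        case 3: // base64 (ENCBASE64)
            return base64_decode($data);
        case 4: // quoted-printable (ENCQUOTEDPRINTABLE)
            return quoted_printable_decode($data);
        default: // 7bit, 8bit, binary, other: return unchanged
            return $data;
    }
}

// Sample data standing in for what imap_fetchbody() might return.
echo decodePart('SGVsbG8=', 3), PHP_EOL;
```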

DOMDocument->saveHTML() vs urlencode with commercial at symbol (@)

Using DOMDocument(), I'm replacing links in a $message and adding some things, like [@MERGEID]. When I save the changes with $dom_document->saveHTML(), the links get "sort of" url-encoded. [@MERGEID] becomes %5B@MERGEID%5D.
Later in my code I need to replace [@MERGEID] with an ID. So I search for urlencode('[@MERGEID]') - however, urlencode() changes the commercial at symbol (@) to %40, while saveHTML() has left it alone. So there is no match - '%5B@MERGEID%5D' != '%5B%40MERGEID%5D'
Now, I know I can run str_replace('%40', '@', urlencode('[@MERGEID]')) to get what I need to locate the merge variable in $message.
My question is, what RFC spec is DOMDocument using, and why is it different from urlencode or even rawurlencode? Is there anything I can do about that to save a str_replace?
Demo code:
$message = '<a href="http://www.google.com?ref=abc" data-tag="thebottomlink">Google</a>';
$dom_document = new \DOMDocument();
libxml_use_internal_errors(true); //Suppress content errors
$dom_document->loadHTML(mb_convert_encoding($message, 'HTML-ENTITIES', 'UTF-8'));
$elements = $dom_document->getElementsByTagName('a');
foreach($elements as $element) {
    $link = $element->getAttribute('href'); //http://www.google.com?ref=abc
    $tag = $element->getAttribute('data-tag'); //thebottomlink
    if ($link) {
        $newlink = 'http://www.example.com/click/[@MERGEID]?url=' . $link;
        if ($tag) {
            $newlink .= '&tag=' . $tag;
        }
        $element->setAttribute('href', $newlink);
    }
}
$message = $dom_document->saveHTML();
$urlencodedmerge = urlencode('[@MERGEID]');
die($message . ' and url encoded version: ' . $urlencodedmerge);
//<a data-tag="thebottomlink" href="http://www.example.com/click/%5B@MERGEID%5D?url=http://www.google.com?ref=abc&amp;tag=thebottomlink">Google</a> and url encoded version: %5B%40MERGEID%5D

I believe that those two encodings serve different purposes. urlencode() encodes "a string to be used in a query part of a URL", while $element->setAttribute('href', $newlink); encodes a complete URL to be used as a URL.
For example:
urlencode('http://www.google.com'); // -> http%3A%2F%2Fwww.google.com
This is convenient for encoding the query part, but it cannot be used on <a href='...'>.
However:
$element->setAttribute('href', $newlink); // -> http://www.google.com
will properly encode the string so that it is still usable in href. It cannot encode @ because it cannot tell whether @ is part of the query or part of the userinfo or an email URL (for example: mailto:invisal@google.com or invisal@127.0.0.1).
Solution
Instead of using [@MERGEID], you can use @@MERGEID@@. Then, you replace that with your ID later. This solution does not even require urlencode.
If you insist on using urlencode, you can just use %40 instead of @. So, your code will look like this: $newlink = 'http://www.example.com/click/[%40MERGEID]?url=' . $link;
You can also do something like $newlink = 'http://www.example.com/click/' . urlencode('[@MERGEID]') . '?url=' . $link;
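The mismatch and the workaround from the question can be reproduced with plain strings, without DOMDocument. In this sketch the $html value is hard-coded to the form the question reports saveHTML() producing, and the ID 12345 is invented for illustration.

```php
<?php
$placeholder = '[@MERGEID]';

// urlencode() escapes the brackets and the at sign.
$encoded = urlencode($placeholder); // %5B%40MERGEID%5D

// Workaround from the question: restore the at sign so the needle
// matches what saveHTML() is reported to leave in the href.
$needle = str_replace('%40', '@', $encoded);

// Hard-coded stand-in for the saveHTML() output described in the question.
$html = '<a href="http://www.example.com/click/%5B@MERGEID%5D?url=x">Google</a>';

// Swap the encoded placeholder for a made-up ID.
$result = str_replace($needle, '12345', $html);
echo $result, PHP_EOL;
```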

urlencode function and rawurlencode are mostly based on RFC 1738. However, since 2005 the current RFC in use for URIs standard is RFC 3986.
On the other hand, The DOM extension uses UTF-8 encoding, which is based on RFC 3629 . Use utf8_encode() and utf8_decode() to work with texts in ISO-8859-1 encoding or Iconv for other encodings.
The generic URI syntax mandates that new URI schemes that provide for
the representation of character data in a URI must, in effect,
represent characters from the unreserved set without translation, and
should convert all other characters to bytes according to UTF-8, and
then percent-encode those values.
Here is a function that URL-encodes a string while leaving the RFC 3986 reserved characters unescaped.
<?php
function myUrlEncode($string) {
    $entities = array('%21', '%2A', '%27', '%28', '%29', '%3B', '%3A', '%40', '%26', '%3D', '%2B', '%24', '%2C', '%2F', '%3F', '%25', '%23', '%5B', '%5D');
    $replacements = array('!', '*', "'", "(", ")", ";", ":", "@", "&", "=", "+", "$", ",", "/", "?", "%", "#", "[", "]");
    return str_replace($entities, $replacements, urlencode($string));
}
?>
Update:
Since UTF8 has been used to encode $message:
$dom_document->loadHTML(mb_convert_encoding($message, 'HTML-ENTITIES', 'UTF-8'))
Use urldecode($message) when returning the URL without percents.
die(urldecode($message) . ' and url encoded version: ' . $urlencodedmerge);

The root cause of your problem has been very well explained from a technical point of view.
In my opinion, however, there is a conceptual flaw in your approach, and it created the situation that you are now trying to fix.
By processing your input $message through a DomDocument object, you have moved to a higher level of abstraction. It is wrong to manipulate as a unique plain string something that has been "promoted" to a HTML stream.
Instead of trying to reproduce DomDocument's behaviour, use the library itself to locate, extract and replace the values of interest:
$token = 'blah blah [@MERGEID]';
$message = '<a id="' . $token . '" href="' . $token . '"></a>';
$dom = new DOMDocument();
$dom->loadHTML($message);
echo $dom->saveHTML(); // now we have an abstract HTML document
// extract a raw value
$rawstring = $dom->getElementsByTagName('a')->item(0)->getAttribute('href');
// do the low-level fiddling
$newstring = str_replace($token, 'replaced', $rawstring);
// push the new value back into the abstract black box.
$dom->getElementsByTagName('a')->item(0)->setAttribute('href', $newstring);
// less code written, but works all the time
$rawstring = $dom->getElementsByTagName('a')->item(0)->getAttribute('id');
$newstring = str_replace($token, 'replaced', $rawstring);
$dom->getElementsByTagName('a')->item(0)->setAttribute('id', $newstring);
echo $dom->saveHTML();
As illustrated above, today we are trying to fix the problem when your token is inside a href, but one day we may want to search and replace the tag elsewhere in the document. To account for this case, do not bother making your low-level code HTML-aware.
(an alternative option would be not loading a DomDocument until all low-level replacements are done, but I am guessing this is not practical)
Complete proof of concept:
function searchAndReplace(DOMNode $node, $search, $replace) {
    if($node->hasAttributes()) {
        foreach ($node->attributes as $attribute) {
            $input = $attribute->nodeValue;
            $output = str_replace($search, $replace, $input);
            $attribute->nodeValue = $output;
        }
    }
    if(!$node instanceof DOMElement) { // this test needs double-checking
        $input = $node->nodeValue;
        $output = str_replace($search, $replace, $input);
        $node->nodeValue = $output;
    }
    if($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            searchAndReplace($child, $search, $replace);
        }
    }
}
$token = '<>&;[@MERGEID]';
$message = '<a/>';
$dom = new DOMDocument();
$dom->loadHTML($message);
$dom->getElementsByTagName('a')->item(0)->setAttribute('id', "foo$token");
$dom->getElementsByTagName('a')->item(0)->setAttribute('href', "http://foo#$token");
$textNode = new DOMText("foo$token");
$dom->getElementsByTagName('a')->item(0)->appendChild($textNode);
echo $dom->saveHTML();
searchAndReplace($dom, $token, '*replaced*');
echo $dom->saveHTML();

If you use saveXML() it won't mess with the encoding the way saveHTML() does:
PHP
//your code...
$message = $dom_document->saveXML();
EDIT: also remove the XML tag:
//this will add an xml tag, so just remove it
$message=preg_replace("/\<\?xml(.*?)\?\>/","",$message);
echo $message;
Output
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
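<html><body><a href="http://www.example.com/click/[@MERGEID]?url=http://www.google.com?ref=abc&amp;tag=thebottomlink" data-tag="thebottomlink">Google</a></body></html>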
Notice that both still correctly convert & to &amp;

Would it not make sense to just urlencode the original [@MERGEID] when saving it in the first place as well? Your search should then match without the need for the str_replace.
$newlink = 'http://www.example.com/click/'.urlencode('[@MERGEID]').'?url=' . $link;
I know this does not answer the first post of the question, but you cannot post code in comments as far as I can tell.

Need to validate "last tweet" script

Have been searching for a solution for hours.
My entire WordPress theme validates, except this script I'm using to receive the last tweet:
<?php
$twitterUsername = get_option('of_twitter_username');
$username = $twitterUsername; // Your twitter username.
$prefix = ""; // Prefix - some text you want displayed before your latest tweet.
$suffix = ""; // Suffix - some text you want displayed after your latest tweet.
$feed = "http://search.twitter.com/search.atom?q=from:" . $username . "&rpp=1";
function parse_feed($feed) {
    $stepOne = explode("<content type=\"html\">", $feed);
    $stepTwo = explode("</content>", $stepOne[1]);
    $tweet = $stepTwo[0];
    $tweet = str_replace("&lt;", "<", $tweet);
    $tweet = str_replace("&gt;", ">", $tweet);
    return $tweet;
}
$twitterFeed = file_get_contents($feed);
echo stripslashes($prefix) . parse_feed($twitterFeed) . stripslashes($suffix);
?>
The error, it seems, is:
$tweet = str_replace("&gt;", ">", $tweet);
Not sure how to fix this.
Thanks for any help.

Replace the two str_replace calls with:
$tweet = html_entity_decode($tweet);
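A short illustration of what that call does, using a made-up escaped fragment in place of the real feed content:

```php
<?php
// Sample escaped markup, standing in for the <content> body of the feed.
$escaped = '&lt;b&gt;hi&lt;/b&gt; &amp; bye';

// html_entity_decode() converts &lt;, &gt;, &amp; and other entities in one call.
$tweet = html_entity_decode($escaped);
echo $tweet, PHP_EOL;
```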

Maybe a simpler way (you don't need to parse anything) is to load http://search.twitter.com/search.json?q=from:the_username and call json_decode on the result.
Then you can get the last tweet easily.
