PHP - UTF-8 to UCS-2

Asked By Tamer Hatoum
08-Dec-09 12:02 PM

Hello,

I am looking for a function that converts UTF-8 to UCS-2. Can anybody help?

Regards...

re

08-Dec-09 12:32 PM

look here

<?php
// script from http://zizi.kxup.com/
// PHP port of JavaScript's unescape(): decodes %uXXXX, &#xXXXX; and &#NNN; escapes to UTF-8
function unescape($str) {
  $str = rawurldecode($str);
  // Split the input into escape tokens and single bytes (the /U flag makes .+ lazy)
  preg_match_all("/(?:%u.{4})|&#x.{4};|&#\d+;|.+/U", $str, $r);
  $ar = $r[0];
  foreach ($ar as $k => $v) {
    if (substr($v, 0, 2) == "%u")
      // pack("H4") yields two bytes, high byte first, so decode as big-endian
      $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("H4", substr($v, -4)));
    elseif (substr($v, 0, 3) == "&#x")
      $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("H4", substr($v, 3, -1)));
    elseif (substr($v, 0, 2) == "&#")
      // pack("n") writes a big-endian unsigned 16-bit value
      $ar[$k] = iconv("UCS-2BE", "UTF-8", pack("n", (int) substr($v, 2, -1)));
  }
  return join("", $ar);
}
?>

http://php.net/manual/en/function.iconv.php

You can convert character encoding using the following two options:

08-Dec-09 01:23 PM
  • The iconv extension can convert a string from one character encoding to another. The references you need are in the iconv section of the PHP manual.
  • Or use the mb_convert_encoding function, specifying the target encoding and then the source encoding:
        $cstr = mb_convert_encoding($str, "UCS-2", "UTF-8");
    The call above converts $str from UTF-8 to UCS-2.
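A minimal sketch of the second option, assuming the mbstring extension is loaded. The helper name utf8_to_ucs2_hex is made up for illustration and does not appear in the thread; "UCS-2BE" is spelled out so the byte order is explicit rather than left to a default.

```php
<?php
// Hypothetical helper (not from the thread): UTF-8 in, uppercase UCS-2 hex out.
function utf8_to_ucs2_hex($utf8)
{
    // Big-endian UCS-2: two bytes per character, high byte first.
    $ucs2 = mb_convert_encoding($utf8, 'UCS-2BE', 'UTF-8');
    return strtoupper(bin2hex($ucs2));
}

echo utf8_to_ucs2_hex('Hi'), "\n"; // 00480069
```

The hex form is only for inspection or for APIs that want hex text; the raw return value of mb_convert_encoding is the UCS-2 byte string itself.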

SMS WEB convert

09-Dec-09 03:26 AM

Hello,

I need this conversion for my web SMS service. The company I am dealing with makes me visit their website to convert the string, then copy the result into my own web page before sending it.

So I want to write my own converter.

I tried this:

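<?php

// Convert a UTF-8 string to a UCS-2 hex string.
function utf8toucs2hex($utf8)
{
    $utf8_hex = bin2hex($utf8);
    // Delegate to utf8hextoucs2hex(), defined below.
    return utf8hextoucs2hex($utf8_hex);
}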

// Convert hex-encoded UTF-8 bytes to hex-encoded UCS-2 (4 hex digits per character).
function utf8hextoucs2hex($str)
{
    $ucs2 = "";

    for ($i = 0; $i < strlen($str); $i += 2)
    {
        $char1hex = $str[$i].$str[$i+1];
        $char1dec = hexdec($char1hex);

        if ($char1dec < 128)
        {
            // 1-byte sequence (ASCII)
            $results = $char1hex;
        }
        else if ($char1dec < 224)
        {
            // 2-byte sequence: 110xxxxx 10xxxxxx
            $char2hex = $str[$i+2].$str[$i+3];
            $results = dechex(((hexdec($char1hex)-192)*64) + (hexdec($char2hex)-128));
            $i += 2;
        }
        else if ($char1dec < 240)
        {
            // 3-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx
            $char2hex = $str[$i+2].$str[$i+3];
            $char3hex = $str[$i+4].$str[$i+5];
            $results = dechex(((hexdec($char1hex)-224)*4096) + ((hexdec($char2hex)-128)*64) + (hexdec($char3hex)-128));
            $i += 4;
        }
        else
        {
            // 4-byte sequence. Not supported: UCS-2 only, so skip the character.
            $i += 6;
            continue;
        }

        // Left-pad each code unit to 4 hex digits
        $ucs2 .= str_pad($results, 4, '0', STR_PAD_LEFT);
    }

    return $ucs2;
}

$error = "";
if (isset($_POST['submit']) && !empty($_POST['Mobile']) && !empty($_POST['msg'])) {
    $user = "username";
    $password = "password";
    $api_id = "apID";
    $baseurl = "http://api.clickatell.com";
    $text = $_POST['msg'];
    $message_unicodehex = utf8toucs2hex($text);
    $to = $_POST['Mobile'];
    // auth call
    $url = "$baseurl/http/auth?user=$user&password=$password&api_id=$api_id&unicode=1";
    // do auth call
    $ret = file($url);
    $from = "QN";
    // split our response. return string is on first line of the data returned
    $sess = explode(":", $ret[0]);
    if ($sess[0] == "OK") {
        $sess_id = trim($sess[1]); // remove any whitespace

        $url = "$baseurl/http/sendmsg?session_id=$sess_id&to=$to&text=$message_unicodehex&from=$from";
        // do sendmsg call
        $ret = file($url);
        $send = explode(":", $ret[0]);
        if ($send[0] == "ID")
            $error = "success\nmessage ID: ". $send[1];
        else
            $error = 'send message failed <br>check the mobile number format to be without + and without 00 ( 9746026267)';
    } else {
        $error = "Authentication failure: ". $ret[0];
        exit();
    }
}
else $error = "Please fill in all the fields";
?>

With an English string, my converter returns the same result as their converter. With an Arabic string, however, my converter gives a different result from theirs.

What can I do?

Regards...
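One way to narrow down a mismatch like this is to decode a single Arabic letter by hand and compare it with what iconv produces. The sketch below does that for the two-byte case; decode_two_byte is a hypothetical helper, and it assumes the iconv extension is available.

```php
<?php
// Hypothetical helper: decode one two-byte UTF-8 sequence to its code point.
function decode_two_byte($bytes)
{
    $b1 = ord($bytes[0]); // lead byte, 110xxxxx
    $b2 = ord($bytes[1]); // continuation byte, 10xxxxxx
    return (($b1 & 0x1F) << 6) | ($b2 & 0x3F);
}

$meem = "\xD9\x85"; // Arabic letter meem, U+0645, as UTF-8 bytes
$manual = sprintf('%04X', decode_two_byte($meem));
$viaIconv = strtoupper(bin2hex(iconv('UTF-8', 'UCS-2BE', $meem)));

echo $manual, ' ', $viaIconv, "\n"; // 0645 0645
```

If the two values agree for every character in a message, the remaining difference is more likely in letter case, byte order, or how the request is built than in the decoding arithmetic.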

SOLVED
09-Dec-09 12:51 PM

Hello all,

I have solved the problem with this converter function:

function sms__unicode($message) {
  if (function_exists('iconv')) {
    $latin = @iconv('UTF-8', 'ISO-8859-1', $message);
    if (strcmp($latin, $message)) {
      $arr = unpack('H*hex', @iconv('UTF-8', 'UCS-2BE', $message));
      return strtoupper($arr['hex']) .'&unicode=1';
    }
  }
  return FALSE;
}

It works great with Arabic Unicode text.
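Note that the function above returns FALSE for plain ASCII input, so a caller has to handle both cases. A rough sketch of one way to do that follows; build_text_param is a hypothetical name, the text and unicode parameter names are taken from the URLs earlier in this thread, and none of it has been checked against the gateway itself.

```php
<?php
// Hypothetical wrapper: pick hex UCS-2 for non-ASCII text, plain text otherwise.
function build_text_param($message)
{
    // Same test as sms__unicode(): does the message survive a Latin-1 round unchanged?
    $latin = @iconv('UTF-8', 'ISO-8859-1', $message);
    if ($latin === false || $latin !== $message) {
        $hex = strtoupper(bin2hex(iconv('UTF-8', 'UCS-2BE', $message)));
        return 'text=' . $hex . '&unicode=1';
    }
    // ASCII-only message: send it as ordinary URL-encoded text.
    return 'text=' . rawurlencode($message);
}

// "marhaba" in Arabic, written as UTF-8 bytes: U+0645 U+0631 U+062D U+0628 U+0627
$arabic = "\xD9\x85\xD8\xB1\xD8\xAD\xD8\xA8\xD8\xA7";

echo build_text_param($arabic), "\n";  // text=06450631062D06280627&unicode=1
echo build_text_param('Hello'), "\n";  // text=Hello
```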

End of Post...

  tuyen replied to Tamer Hatoum
19-Nov-10 12:41 AM
Try this function: mb_convert_encoding($content, "UCS-2LE");
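"UCS-2LE" and "UCS-2BE" differ only in byte order, which matters once the bytes are turned into hex. The thread does not say which order the SMS gateway expects; the solution marked as solved used big-endian. A small comparison, assuming the mbstring extension is loaded:

```php
<?php
$text = 'A'; // U+0041

// Same character, two byte orders.
$be = bin2hex(mb_convert_encoding($text, 'UCS-2BE', 'UTF-8'));
$le = bin2hex(mb_convert_encoding($text, 'UCS-2LE', 'UTF-8'));

echo $be, "\n"; // 0041
echo $le, "\n"; // 4100
```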