
Good afternoon. I have a file with the following content:
Code |
<html> <head> <title>\xc3\xe0\xe7\xee\xe0\xed\xe0\xeb\xe8\xe7\xe0\xf2\xee\xf0\xfb \xf3\xed\xe8\xe2\xe5\xf0\xf1\xe0\xeb\xfc\xed\xfb\xe5 \xc3\xc0\xcd\xca-4. \xcc\xe5\xf2\xf0\xe0 \xd2\xe5\xeb\xe5\xea\xee\xec</title>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
<meta name="Description" content="\xc3\xe0\xe7\xee\xe0\xed\xe0\xeb\xe8\xe7\xe0\xf2\xee\xf0\xfb \xf3\xed\xe8\xe2\xe5\xf0\xf1\xe0\xeb\xfc\xed\xfb\xe5 \xc3\xc0\xcd\xca-4. \xcc\xe5\xf2\xf0\xe0 \xd2\xe5\xeb\xe5\xea\xee\xec - \xef\xee\xf1\xf2\xe0\xe2\xea\xe8 \xe8\xe7\xec\xe5\xf0\xe8\xf2\xe5\xeb\xfc\xed\xee\xe9 \xf2\xe5\xf5\xed\xe8\xea\xe8 \xef\xee \xe7\xe0\xe2\xee\xe4\xf1\xea\xe8\xec \xf6\xe5\xed\xe0\xec. \xdd\xea\xf1\xea\xeb\xfe\xe7\xe8\xe2\xed\xfb\xe9 \xe4\xe8\xeb\xe5\xf0.">
<meta name="Keywords" content="\xc3\xe0\xe7\xee\xe0\xed\xe0\xeb\xe8\xe7\xe0\xf2\xee\xf0\xfb \xf3\xed\xe8\xe2\xe5\xf0\xf1\xe0\xeb\xfc\xed\xfb\xe5 \xc3\xc0\xcd\xca-4, \xea\xe8\xef\xe8\xe0, \xe8\xe7\xec\xe5\xf0\xe8\xf2\xe5\xeb\xfc\xed\xfb\xe5 \xef\xf0\xe8\xe1\xee\xf0\xfb">
<meta name="Robots" content="all">
<link rel="stylesheet" href="style.css" type="text/css">
<script type="text/javascript" src="menu.js" ></SCRIPT> </head>
<body bgcolor="#FFFFFF" leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">
<table width="100%" border="0" cellspacing="0" cellpadding="0" align="center" height="100%" bgcolor="#FFFFFF"> <tr>
<td align="left" valign="middle" height="156">
<table width="100%" border="0" cellspacing="0" cellpadding="0" height="156"> <tr align="left">
<td valign="top" width="246" height="156"><a href="index.htm" ><img src="images/logo.gif" width="246" height="156" border="0" hspace="0" vspace="0" alt="\xca\xc8\xcf \xca\xce\xcd\xd2\xd0\xce\xcb\xdc\xcd\xce \xc8\xc7\xcc\xc5\xd0\xc8\xd2\xc5\xcb\xdc\xcd\xdb\xc5 \xcf\xd0\xc8\xc1\xce\xd0\xdb \xca\xc8\xcf\xe8\xc0"></a></td>
<td valign="top" width=156><img src="images/sfield.jpg" width="310" height="156" border="0" hspace= "0" vspace="0"></td>
<td valign="top" background="images/loop.gif" height="156">
<table width="100%" border="0" cellspacing="0" cellpadding="0" height="156"> <tr> <td align="left" valign="middle">
<table width="100%" border="0" cellspacing="0" cellpadding="0"> <tr valign="middle"> <td align="right"><img src="images/sfield2.jpg" width=208 height=155 border=0></td>
</tr> </table>
</td> </tr>
</table> </td> </tr>
|
There are many files like this, and each one contains information. How can I extract it as normal, readable text? Thanks in advance.
I tried something like this, but it hasn't worked so far:
Code |
<?php
/*
 * decode/cp1251.php
 * $Id: cp1251.php,v 1.1.4.2 2004/02/24 15:57:27 kink Exp $
 *
 * Copyright (c) 2003-2004 The SquirrelMail Project Team
 * Licensed under the GNU GPL. For full terms see the file COPYING.
 *
 * This file contains cp1251 decoding function that is needed to read
 * cp1251 encoded mails in non-cp1251 locale.
 *
 * Original data taken from:
 * ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1250.TXT
 *
 * Name: cp1251 to Unicode table
 * Unicode version: 2.0
 * Table version: 2.01
 * Table format: Format A
 * Date: 04/15/98
 * Contact: [email protected]
 *
 */
function charset_decode_cp1251 ($string) {
    // global $default_charset;

    //if (strtolower($default_charset) == 'windows-1251')
    //    return $string;

    /* Only do the slow convert if there are 8-bit characters */
    /* avoid using 0xA0 (\240) in ereg ranges. RH73 does not like that */
    //if (! ereg("[\200-\237]", $string) and ! ereg("[\241-\377]", $string) )
    //    return $string;

    $cp1251 = array(
        "\x80" => 'Ђ', "\x81" => 'Ѓ', "\x82" => '‚', "\x83" => 'ѓ',
        "\x84" => '„', "\x85" => '…', "\x86" => '†', "\x87" => '‡',
        "\x88" => '€', "\x89" => '‰', "\x8a" => 'Љ', "\x8b" => '‹',
        "\x8c" => 'Њ', "\x8d" => 'Ќ', "\x8e" => 'Ћ', "\x8f" => 'Џ',
        "\x90" => 'ђ', "\x91" => '‘', "\x92" => '’', "\x93" => '“',
        "\x94" => '”', "\x95" => '•', "\x96" => '–', "\x97" => '—',
        "\x98" => '�', "\x99" => '™', "\x9a" => 'љ', "\x9b" => '›',
        "\x9c" => 'њ', "\x9d" => 'ќ', "\x9e" => 'ћ', "\x9f" => 'џ',
        "\xa0" => ' ', "\xa1" => 'Ў', "\xa2" => 'ў', "\xa3" => 'Ј',
        "\xa4" => '¤', "\xa5" => 'Ґ', "\xa6" => '¦', "\xa7" => '§',
        "\xa8" => 'Ё', "\xa9" => '©', "\xaa" => 'Є', "\xab" => '«',
        "\xac" => '¬', "\xad" => '', "\xae" => '®', "\xaf" => 'Ї',
        "\xb0" => '°', "\xb1" => '±', "\xb2" => 'І', "\xb3" => 'і',
        "\xb4" => 'ґ', "\xb5" => 'µ', "\xb6" => '¶', "\xb7" => '·',
        "\xb8" => 'ё', "\xb9" => '№', "\xba" => 'є', "\xbb" => '»',
        "\xbc" => 'ј', "\xbd" => 'Ѕ', "\xbe" => 'ѕ', "\xbf" => 'ї',
        "\xc0" => 'А', "\xc1" => 'Б', "\xc2" => 'В', "\xc3" => 'Г',
        "\xc4" => 'Д', "\xc5" => 'Е', "\xc6" => 'Ж', "\xc7" => 'З',
        "\xc8" => 'И', "\xc9" => 'Й', "\xca" => 'К', "\xcb" => 'Л',
        "\xcc" => 'М', "\xcd" => 'Н', "\xce" => 'О', "\xcf" => 'П',
        "\xd0" => 'Р', "\xd1" => 'С', "\xd2" => 'Т', "\xd3" => 'У',
        "\xd4" => 'Ф', "\xd5" => 'Х', "\xd6" => 'Ц', "\xd7" => 'Ч',
        "\xd8" => 'Ш', "\xd9" => 'Щ', "\xda" => 'Ъ', "\xdb" => 'Ы',
        "\xdc" => 'Ь', "\xdd" => 'Э', "\xde" => 'Ю', "\xdf" => 'Я',
        "\xe0" => 'а', "\xe1" => 'б', "\xe2" => 'в', "\xe3" => 'г',
        "\xe4" => 'д', "\xe5" => 'е', "\xe6" => 'ж', "\xe7" => 'з',
        "\xe8" => 'и', "\xe9" => 'й', "\xea" => 'к', "\xeb" => 'л',
        "\xec" => 'м', "\xed" => 'н', "\xee" => 'о', "\xef" => 'п',
        "\xf0" => 'р', "\xf1" => 'с', "\xf2" => 'т', "\xf3" => 'у',
        "\xf4" => 'ф', "\xf5" => 'х', "\xf6" => 'ц', "\xf7" => 'ч',
        "\xf8" => 'ш', "\xf9" => 'щ', "\xfa" => 'ъ', "\xfb" => 'ы',
        "\xfc" => 'ь', "\xfd" => 'э', "\xfe" => 'ю', "\xff" => 'я'
    );

    $string = str_replace(array_keys($cp1251), array_values($cp1251), $string);
    return $string;
}

$f = fopen("x.html","r") or die ("Error");
while (!feof($f)) {
    $value = fgets($f, 4096);
    // print $value;
    // $s = charset_decode_cp1251( $value );
    $s = htmlentities( $value );
    // echo html_entity_decode($s, "Windows-1251", "utf8");
    echo html_entity_decode($s);
    // echo mb_convert_encoding($value, "Windows-1251", "UTF-16LE");
    // echo html_entity_decode($s, ENT_COMPAT, "utf8");
    // print $s2;
}
?>
|
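For comparison, here is a minimal sketch of the same conversion done with PHP's built-in converters instead of a hand-written table. It assumes the file really is Windows-1251, as its `charset=windows-1251` meta tag says, and that the desired output is UTF-8. The helper name `cp1251_to_utf8` is made up for this example, and it needs either the mbstring or the iconv extension.

```php
<?php
// Hypothetical helper (not part of the original post): convert a
// Windows-1251 byte string to UTF-8 with whichever extension is loaded.
function cp1251_to_utf8(string $bytes): string
{
    if (function_exists('mb_convert_encoding')) {
        // Argument order: data, target encoding, source encoding.
        return mb_convert_encoding($bytes, 'UTF-8', 'Windows-1251');
    }

    // Fallback: iconv takes source encoding first, then target.
    $converted = iconv('Windows-1251', 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException('Windows-1251 to UTF-8 conversion failed');
    }
    return $converted;
}

// "\xc3\xe0\xe7" are the first three bytes of the <title> in the sample file.
echo cp1251_to_utf8("\xc3\xe0\xe7"), "\n"; // prints "Газ"
```

A library converter reads each input byte once, so bytes produced by one substitution are never substituted again. A chain of `str_replace` calls gives no such guarantee when the replacement values are themselves multi-byte strings.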
Added after 5 minutes and 39 seconds
You won't believe it: Vingrad solved the problem. The forum replaced the symbols like Ђ with their Russian equivalents, I put that version of the table into my code, and now everything works.
Added after 7 minutes and 44 seconds
Funny how that turned out.
|
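Since the question mentions many such files and asks for plain readable text, a rough sketch of a batch version may help. It rests on several assumptions: every file is Windows-1251, the files end in `.html`, and only the text between tags is wanted. The function names `cp1251_html_to_text` and `convert_directory` are invented for illustration.

```php
<?php
// Hypothetical helper: turn one Windows-1251 HTML document into
// plain UTF-8 text.
function cp1251_html_to_text(string $html): string
{
    // Decode the bytes first so every later step works on UTF-8.
    $utf8 = function_exists('mb_convert_encoding')
        ? mb_convert_encoding($html, 'UTF-8', 'Windows-1251')
        : (string) iconv('Windows-1251', 'UTF-8', $html);

    // Pad every tag with a space so neighbouring cells do not fuse,
    // then drop the markup and decode entities such as &amp;.
    $text = strip_tags(str_replace('<', ' <', $utf8));
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');

    // Collapse the runs of whitespace left behind by the removed tags.
    $collapsed = preg_replace('/\s+/u', ' ', $text);
    return trim($collapsed ?? $text);
}

// Hypothetical helper: convert every *.html file in a directory and
// return an array of file name => extracted text.
function convert_directory(string $dir): array
{
    $result = [];
    foreach (glob($dir . '/*.html') ?: [] as $path) {
        $html = file_get_contents($path);
        if ($html !== false) {
            $result[basename($path)] = cp1251_html_to_text($html);
        }
    }
    return $result;
}
```

Calling `convert_directory('.')` would then return the text of each page in the current directory. Note that `strip_tags` discards attribute values, so the `Description` and `Keywords` strings inside the `<meta>` tags would be lost with this approach and would need separate handling.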