Ignore source filetype encoding of file_get_contents and/or convert to JSON encoding
When I load http://www.nydailynews.com/json/cmlink/NYDN.Local.Article.rss in a browser, the JSON content loads fine. But when I pull the content with file_get_contents, I get strange characters like this:
��Y�r��}OU�aV�@
I tried $contents = mb_convert_encoding(file_get_contents('http://www.nydailynews.com/cmlink/NYDN.Local.Article.rss'), 'HTML-ENTITIES', "UTF-8");
but that only returns an XML-style format, not the JSON I see in the browser.
Update:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.nydailynews.com/json/cmlink/NYDN.Local.Article.rss');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body as a string instead of printing it
curl_setopt($ch, CURLOPT_ENCODING, 'gzip'); // request gzip and let cURL decompress the response
$content = curl_exec($ch);
Try this:
$contents = file_get_contents('http://www.nydailynews.com/cmlink/NYDN.Local.Article.rss');
print_r(gzdecode($contents));
You can check this post for more information: why file_get_contents returning strange characters?
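If you are not sure whether the server compressed the response, one option is to check for the gzip magic bytes before calling gzdecode. Here is a minimal sketch of that idea; it assumes the zlib extension is loaded, and the helper name maybe_gunzip and the sample payload are made up for illustration, not taken from the feed. It runs offline on a locally compressed string rather than on the real URL.

```php
<?php
// Hypothetical helper: decode a body only if it looks gzip-compressed.
function maybe_gunzip(string $body): string
{
    // A gzip stream starts with the two magic bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    // Not gzip (or decoding failed): hand back the body unchanged.
    return $body;
}

// Offline demonstration with a made-up payload compressed locally.
$sample = '{"title":"example"}';
$compressed = gzencode($sample);
echo maybe_gunzip($compressed), PHP_EOL;
```

With a real request you would pass the result of file_get_contents to the helper instead of $compressed.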
You can try using DOMDocument to convert the encoding to utf-8:
$contents = file_get_contents("http://www.nydailynews.com/cmlink/NYDN.Local.Article.rss");
$dom = new DOMDocument();
if ($dom->loadXML($contents)) { // $contents is an XML document with iso-8859-1 encoding specified in the declaration
    $dom->encoding = 'utf-8'; // convert document encoding to UTF8
    return $dom->saveXML(); // return valid, utf8-encoded XML
}
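To see what the DOMDocument approach does without hitting the network, here is a small offline sketch. It assumes the dom extension is available; the sample XML string and the variable names are invented for illustration and are not the actual feed content.

```php
<?php
// Made-up sample: an XML document declared as ISO-8859-1,
// where "\xE9" is the single-byte Latin-1 encoding of "é".
$latin1Xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
           . "<rss><title>Caf\xE9</title></rss>";

$dom = new DOMDocument();
$utf8Xml = null;
if ($dom->loadXML($latin1Xml)) {
    // Changing the document encoding makes saveXML() serialize as UTF-8.
    $dom->encoding = 'UTF-8';
    $utf8Xml = $dom->saveXML();
}

echo $utf8Xml;
```

Note that this only works once the body is real XML, so a gzip-compressed response would still have to be decompressed first.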