Generating a sitemap with PHP

Acceml / 2203 reads

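For work reasons I recently needed to generate a sitemap.xml for a website. I searched Google and Baidu extensively but found no code that was both suitable and usable, so after thinking it over I decided to write it myself. It may have flaws, but it was written carefully, and I hope it helps those who come after me.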

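1. Why write your own script to generate sitemap.xml?

Many people will say that ready-made tools exist online: run a scan and you are done, so there is no need to write your own. That is true. But suppose the site is updated frequently: do I have to update the sitemap by hand every time? I am lazy, so is there a better approach? There certainly is: I can set up a scheduled task that refreshes the sitemap every night, and that is where a script becomes useful.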
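
As a rough sketch of the nightly-refresh idea, the snippet below shows a crontab line in a comment and a small staleness check that a wrapper script could run before regenerating. The function name `sitemapIsStale`, the paths, and the one-day threshold are assumptions made for illustration; they are not part of the original script.

```php
<?php
// Hypothetical crontab entry (paths are assumptions), running at 02:00 every night:
//   0 2 * * * /usr/bin/php /path/to/SiteMap.class.php >> /tmp/sitemap.log 2>&1

/**
 * Report whether the sitemap is missing or older than $maxAgeSeconds.
 * $now is injectable so the check can be exercised without waiting.
 */
function sitemapIsStale(string $path, int $maxAgeSeconds, ?int $now = null): bool
{
    if (!is_file($path)) {
        return true; // nothing has been generated yet
    }
    clearstatcache(true, $path); // avoid a cached modification time
    $now = $now ?? time();
    return ($now - filemtime($path)) > $maxAgeSeconds;
}
```

A wrapper could call `sitemapIsStale(__DIR__ . "/sitemap.xml", 86400)` and start the crawl only when it returns true, so an extra manual run during the day does no unnecessary work.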

2. File layout:
    Configuration file - config/config.ini.php
    Main sitemap file - SiteMap.class.php
3. Main file code

    <?php
    /**
     * @version 1.0
     */
    namespace Maweibinguo\SiteMap;
    class SiteMap
    {
        const SCHEMA = "http://www.sitemaps.org/schemas/sitemap/0.9";

        /**
         * @var webUrlList
         * @access public
         */
        public $webUrlList = array();

        /**
         * @var siteMapList
         * @access public
         */
        public $siteMapList = array();

        /**
         * @var isUseCookie
         * @access public
         */
        public $isUseCookie = false;

        /**
         * @var cookieFilePath
         * @access public
         */
        public $cookieFilePath = "";

        /**
         * @var xmlWriter
         * @access private
         */
        private $_xmlWriter = "";

        /**
         * init basic config
         *
         * @access public
         */
        public function __construct()
        {
            /* the leading backslash is required because this file declares a namespace */
            $this->_xmlWriter = new \XMLWriter();

            $result = $this->_enviromentTest();
        }

        /**
         * test the environment for the script
         *
         * @access private
         */
        private function _enviromentTest()
        {
            $sapiType = php_sapi_name();
            if( strtolower($sapiType) != "cli" ) {
                echo "The Script Must Run In Command Lines", "\n";
                exit();
            }
        }

        /**
         * load the configValue for generating sitemap by configname
         *
         * @param string $configName
         * @return string $configValue
         * @access public
         */
        public function loadConfig($configName)
        {
            /* init return value */
            $configValue = "";

            /* load config value */
            $configPath = __DIR__ . "/config/config.ini.php";
            if(file_exists( $configPath )) {
                require $configPath;
            } else {
                echo "Can not find config file", "\n";
                exit();    
            }
            $configValue = $$configName;

            /* return config value */
            return $configValue;
        }

        /**
         * generate sitemap.xml for the web
         *
         * @param array $siteMapList
         * @return bool $result
         * @access public
         */
        public function generateSiteMapXml($siteMapList)
        {
            /* init return result */
            $result = false;

            /* check the parameter */
            if( !is_array($siteMapList) || count($siteMapList) <= 0 ) {
                echo "The SiteMap Content Is Empty", "\n";
                exit();
            }

            /* make sure the target file exists and is writable */
            $siteMapPath = $this->loadConfig("SITEMAPPATH");
            if( !file_exists($siteMapPath) ) {
                touch($siteMapPath);
            }
            if( !is_writable($siteMapPath) ) {
                echo "Is Not Writeable", "\n";
                exit();
            }

            /* write the document */
            $this->_xmlWriter->openURI($siteMapPath);
            $this->_xmlWriter->startDocument("1.0", "UTF-8");
            $this->_xmlWriter->setIndent(true);
            $this->_xmlWriter->startElement("urlset");
            $this->_xmlWriter->writeAttribute("xmlns", self::SCHEMA);
            foreach($siteMapList as $siteMapItem) {
                $this->_xmlWriter->startElement("url");
                $this->_xmlWriter->writeElement("loc", $siteMapItem["Url"]);
                /* note: "title" is not an element of the sitemaps.org 0.9 schema */
                $this->_xmlWriter->writeElement("title", $siteMapItem["Title"]);
                /* the protocol expects lowercase changefreq values */
                $changefreq = !empty($siteMapItem["ChangeFreq"]) ? $siteMapItem["ChangeFreq"] : "daily";
                $this->_xmlWriter->writeElement("changefreq", strtolower($changefreq));
                $priority = !empty($siteMapItem["Priority"]) ? $siteMapItem["Priority"] : 0.5;
                $this->_xmlWriter->writeElement("priority", (string) $priority);
                $this->_xmlWriter->endElement();
            }
            $this->_xmlWriter->endElement();

            /* close the document and flush it to disk */
            $this->_xmlWriter->endDocument();
            $this->_xmlWriter->flush();
            $result = true;

            /* return result */
            return $result;
        }

        /**
         * start to send request to the target url, and get the response
         *
         * @param string $url
         * @return mixed $responseData
         * @access public
         */
        public function sendRequest($url)
        {
            /* init return value */
            $responseData = false;

            /* check the parameter */
            if( !filter_var($url, FILTER_VALIDATE_URL) ) {
                return $responseData;
            }
            $connectTimeOut = $this->loadConfig("CURLOPT_CONNECTTIMEOUT");
            if( $connectTimeOut === false ) {
                return $responseData;
            }
            $timeOut = $this->loadConfig("CURLOPT_TIMEOUT");
            if( $timeOut === false ) {
                return $responseData;
            }

            $handle = curl_init();
            curl_setopt($handle, CURLOPT_URL, $url);
            curl_setopt($handle, CURLOPT_HEADER, false);
            curl_setopt($handle, CURLOPT_AUTOREFERER, true);
            curl_setopt($handle, CURLOPT_RETURNTRANSFER , true);
            curl_setopt($handle, CURLOPT_CONNECTTIMEOUT, $connectTimeOut);
            curl_setopt($handle, CURLOPT_TIMEOUT, $timeOut);
            curl_setopt($handle, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; MSIE 5.01; Windows NT 5.0)" );
            $headersItem = array(    "Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
                                    "Connection: Keep-Alive"     );
            curl_setopt($handle, CURLOPT_HTTPHEADER, $headersItem);
            curl_setopt($handle, CURLOPT_FOLLOWLOCATION, 1);

            $cookieList = $this->loadConfig("COOKIELIST");
            $isUseCookie = $cookieList["IsUseCookie"];
            $cookieFilePath = $cookieList["CookiePath"];
            if($isUseCookie) {
                if(!file_exists($cookieFilePath)) {
                    /* create the cookie file without going through a shell */
                    touch($cookieFilePath);
                }
                curl_setopt($handle, CURLOPT_COOKIEFILE, $cookieFilePath);
                curl_setopt($handle, CURLOPT_COOKIEJAR, $cookieFilePath);
            }
            $responseData = curl_exec($handle);
            $httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
            if($httpCode != 200) {
                $responseData = false;
            }
            curl_close($handle);

            /* return response data */
            return $responseData;
        }

        /**
         * get the sitemap content of the url, it contains url, title, priority, changefreq
         *
         * @param string $url 
         * @access public
         */
        public function generateSiteMapList($url)
        {
            $content = $this->sendRequest($url);

            if($content !== false) {
                $tagsList = $this->_parseContent($content, $url);
                $urlItem = $tagsList["UrlItem"];
                $title = $tagsList["Title"];

                $siteMapItem = array(    "Url" => trim($url),
                                        "Title" => trim($title)    );
                $priority = $this->_calculatePriority($siteMapItem["Url"]);
                $siteMapItem["Priority"] = $priority;
                $changefreq = $this->_calculateChangefreq($siteMapItem["Url"]);
                $siteMapItem["ChangeFreq"] = $changefreq;

                $this->siteMapList[] = $siteMapItem;            
                foreach($urlItem as $nextUrl) {
                    if( !in_array($nextUrl, $this->webUrlList) ) {
                        $skipUrlList = $this->loadConfig("SKIP_URLLIST");
                        foreach($skipUrlList as $keyWords) {
                            if( stripos($nextUrl, $keyWords) !== false ) {
                                continue 2;
                            }
                        }
                        $this->webUrlList[] = $nextUrl;
                        echo $nextUrl, "\n";
                        $this->generateSiteMapList($nextUrl);
                    }
                }
            }
        }

        /**
         * get sitemaplist of the web
         *
         * @access public
         * @return array $siteMapList
         */
        public function getSiteMapList()
        {
            return $this->siteMapList;
        }

        /**
         * calculate the priority of the targeturl
         *
         * @param string $targetUrl
         * @return float $priority
         * @access private
         */
        private function _calculatePriority($targetUrl)
        {
            /* init priority */
            $priority = 0.5;

            /* calculate the priority */
            if( filter_var($targetUrl, FILTER_VALIDATE_URL) ) {
                $priorityList = $this->loadConfig("PRIORITYLIST");
                foreach($priorityList as $priorityKey => $priorityValue) {
                    if(stripos($targetUrl, $priorityKey) !== false) {
                        $priority = $priorityValue;
                        break;
                    }
                }
            }

            /* return priority */
            return $priority;
        }

        /**
         * calculate the changefreq of the targeturl
         *
         * @param string $targetUrl
         * @return string $changefreq
         * @access private
         */
        private function _calculateChangefreq($targetUrl)
        {
            /* init changefreq */
            $changefreq = "Daily";

            /* calculate the changefreq */
            if( filter_var($targetUrl, FILTER_VALIDATE_URL) ) {
                $changefreqList = $this->loadConfig("CHANGEFREQLIST");
                foreach($changefreqList as $changefreqKey => $changefreqValue) {
                    if(stripos($targetUrl, $changefreqKey) !== false) {
                        $changefreq = $changefreqValue;
                        break;
                    }
                }
            }

            /* return changefreq */
            return $changefreq;
        }

        /**
         * format url
         *
         * @param $url
         * @param $originUrl
         * @access private
         * @return $formatUrl
         */
        private function _formatUrl($url, $originUrl)
        {
            /* init url */
            $formatUrl = "";

            /* format url */
            if( !empty($url) && !empty($originUrl) ) {
                $badUrlItem = array(    "",
                                        "/",
                                        "javascript",
                                        "javascript:;"    );
                /* strip whitespace, fragment markers and stray quotes */
                $formatUrl = trim($url);
                $formatUrl = trim($formatUrl, "#");
                $formatUrl = trim($formatUrl, "'");
                $formatUrl = trim($formatUrl, "\"");
                if(stripos($formatUrl, "http") === false && !in_array($formatUrl, $badUrlItem)) {
                    if(strpos($formatUrl, "/") === 0) {
                        $domainName = $this->loadConfig("DOMAIN_NAME");    
                        $formatUrl = $domainName . trim($formatUrl, "/");
                    } else {
                        $formatUrl = substr( $originUrl, 0, strrpos($originUrl, "/") ) ."/". $formatUrl;
                    }
                } elseif( stripos($formatUrl, "http") === false && in_array($formatUrl, $badUrlItem) ) {
                    $formatUrl = "";
                }
            }

            /* return url */
            return $formatUrl;
        }

        /**
         * check domain is right
         * 
         * @param $url
         * @return $url
         * @access private
         */
        private function _checkDomain($url)
        {
            /* init url */
            $result = false;

            /* check domain */
            if($url) {
                $domainName = $this->loadConfig("DOMAIN_NAME");
                if( stripos($url, $domainName) === false ) {
                    return $result;
                }
                $result = true;
            }
        
            /* return url */
            return $result;
        }

        /**
         * parse the response content, so that we can get the urls
         *
         * @param string $content
         * @param string $originUrl
         * @return array $urlItem
         * @access public
         */
        public function _parseContent($content, $originUrl)
        {
            /* init return data */
            $tagsList = array();
            $urlItem = array();
            $title = "";

            /* start parse */
            if( !empty($content) && !empty($originUrl) ) {
                $domainName = $this->loadConfig("DOMAIN_NAME");

                /* get the attribute of href for <a> tags */
                /* the original pattern was stripped from this listing; the one below is a reconstruction */
                $regStrForTagA = "#<a\s[^>]*?href=[\"']?([^\"'\s>]+)#ium";
                if( preg_match_all($regStrForTagA, $content, $matches) ) {
                    $urlItem = array_unique($matches[1]);
                    foreach($urlItem as $urlKey => $url) {
                        $formatUrl = $this->_formatUrl($url, $originUrl);
                        if( empty($formatUrl) ) {
                            unset($urlItem[$urlKey]);
                            continue;
                        }

                        $result = $this->_checkDomain($formatUrl);
                        if($result === false) {
                            unset($urlItem[$urlKey]);
                            continue;
                        }
                        $urlItem[$urlKey] = $formatUrl;
                    }
                }

                $tagsList["UrlItem"] = $urlItem;

                /* get the title tags content */
                $regStrForTitle = "#<title>(.*?)</title>#ius";
                if( preg_match($regStrForTitle, $content, $matches) ) {
                    $title = $matches[1];
                }
                $tagsList["Title"] = $title;

            }

            /* return tagsList */
            return $tagsList;
        }
    }

    /* here is an example */

    $startTime = microtime(true);
    echo "/***********************************************************************/", "\n";
    echo "/*                    start to run {$startTime}                        */", "\n";
    echo "/***********************************************************************/", "\n\n";

    $siteMap = new SiteMap();
    $domain = $siteMap->loadConfig("DOMAIN_NAME");
    $siteMap->generateSiteMapList($domain);
    $siteMapList = $siteMap->getSiteMapList();
    $siteMap->generateSiteMapXml($siteMapList);

    $endTime = microtime(true);
    $takeTime = $endTime - $startTime;
    echo "/***********************************************************************/", "\n";
    echo "/*               Done, it took {$takeTime} seconds in total      */", "\n";
    echo "/***********************************************************************/", "\n";
?> 
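
For comparison, here is a minimal sketch of the XML-writing step on its own, building the document in memory rather than writing to a file. It is not the author's method: the function name `buildSitemapXml` is hypothetical, the item keys mirror the ones used above, and it deliberately writes only `loc`, `changefreq`, and `priority`, since the sitemaps.org 0.9 protocol defines no `title` element and lists `changefreq` values in lowercase.

```php
<?php
/**
 * Build a sitemap document in memory and return it as a string.
 * Each item is an array with "Url" and, optionally, "ChangeFreq" and "Priority".
 * (Hypothetical helper for illustration; not part of SiteMap.class.php.)
 */
function buildSitemapXml(array $items): string
{
    $writer = new XMLWriter();
    $writer->openMemory();
    $writer->setIndent(true);
    $writer->startDocument("1.0", "UTF-8");
    $writer->startElement("urlset");
    $writer->writeAttribute("xmlns", "http://www.sitemaps.org/schemas/sitemap/0.9");

    foreach ($items as $item) {
        $writer->startElement("url");
        // writeElement() escapes characters such as "&" in the URL.
        $writer->writeElement("loc", $item["Url"]);
        // Normalise the frequency to lowercase, defaulting to "daily".
        $writer->writeElement("changefreq", strtolower($item["ChangeFreq"] ?? "daily"));
        // Format the priority with one decimal place, defaulting to 0.5.
        $priority = number_format((float) ($item["Priority"] ?? 0.5), 1, ".", "");
        $writer->writeElement("priority", $priority);
        $writer->endElement(); // </url>
    }

    $writer->endElement(); // </urlset>
    $writer->endDocument();
    return $writer->outputMemory();
}
```

Returning a string makes the output easy to inspect or validate before it replaces the live sitemap.xml, for example with `file_put_contents()`.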
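4. Configuration file code

    <?php
    // NOTE: the beginning of this listing is missing. SiteMap.class.php also reads
    // $DOMAIN_NAME, $CURLOPT_CONNECTTIMEOUT, $CURLOPT_TIMEOUT and $SKIP_URLLIST,
    // so they must be defined in this file as well.

    $COOKIELIST = array(    "IsUseCookie" => true,
                            "CookiePath" => "/tmp/sitemapcookie"    );

    // where the sitemap file is saved
    $SITEMAPPATH = "./sitemap.xml";

    // set priority according to keywords in the link
    $PRIORITYLIST = array(    "product" => "0.8",
                            "device" => "0.6",
                            "intelligent" => "0.4",
                            "course" => "0.2"    );

    // set changefreq according to keywords in the link
    $CHANGEFREQLIST = array(    "product" => "Always",
                                "device" => "Hourly",
                                "intelligent" => "Daily",
                                "course" => "Weekly",
                                "login" => "Monthly",
                                "about" => "Yearly"    );

?>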
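
The two keyword tables are consulted in the same way: the first keyword that occurs in the URL wins, and a default applies when none matches. A small sketch of that lookup follows; the helper name `lookupByKeyword` and the example URL are assumptions for illustration, and the frequency values are written in lowercase as the sitemap protocol expects.

```php
<?php
/**
 * Return the value of the first keyword found in $url (case-insensitive),
 * or $default when no keyword matches. The order of $map decides ties.
 * (Hypothetical helper; the class above does this in two private methods.)
 */
function lookupByKeyword(string $url, array $map, $default)
{
    foreach ($map as $keyword => $value) {
        if (stripos($url, (string) $keyword) !== false) {
            return $value;
        }
    }
    return $default;
}

$priorityList   = array("product" => "0.8", "device" => "0.6");
$changefreqList = array("product" => "always", "device" => "hourly");

$url = "http://example.com/device/list.html"; // hypothetical URL
echo lookupByKeyword($url, $priorityList, "0.5"), "\n";     // prints 0.6
echo lookupByKeyword($url, $changefreqList, "daily"), "\n"; // prints hourly
```

Because matching is a plain substring test, a keyword such as "product" also matches a URL like /byproduct/, so short or common keywords are best placed later in the table or avoided.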
5. Getting the source package

Click to download the source code (extraction code: fc1c)

When reprinting, please cite this article's address: http://systransis.cn/yun/21739.html
