| 注册 | 忘记密码
奇技淫巧 - 阅读主题
<<  <  1  >  >>

IP地址->地理位置转换的测评

好(0) 差(0) 阅读(6184) 评论(0)
Wen 给 Wen 发消息 给 Wen 发email
作者头像
等级:◆◆◆◆◇◇◇

前些天做NetNest Stats统计系统,需要根据IP地址统计地理位置。

首先想到了把IP地址与地理位置的转换关系写入MySQL数据库再调用查询。前天在Exceed PHP Club看到有一个QQWry.php程序,其中的QQWry类可以直接读取纯真的IP数据库QQWry.dat文件,于是去找来了一个,读后觉得算法不大高明,于是自己又重写了一个ip2addr函数。

今天来个测评,MySQL vs QQWry类 vs ip2addr函数。测评程序如下:

<?php
require_once ("func_ip.php");
function EncodeIp($strDotquadIp) {
    $arrIpSep = explode('.', $strDotquadIp);
    if (count($arrIpSep) != 4) return 0;
    $intIp = 0;    
    foreach ($arrIpSep as $k => $v) $intIp += (int)$v * pow(256, 3 - $k);
    return $intIp;
}
function GetMicroTime() {
    list($msec, $sec) = explode(" ", microtime());
    return ((double)$msec + (double)$sec);
}

for ($i = 0; $i < 50; $i++) { // 产生50个随机ip地址
    $strIp = mt_rand(0, 255).".".mt_rand(0, 255).".".mt_rand(0, 255).".".mt_rand(0, 255);
    echo $strIp;
    echo "\r\n";
    $arrIp[$i] = $strIp;
}
// 准备MySQL数据库
$resConn = mysql_connect("localhost", "test", "");
mysql_select_db("test");
// MySQL测评开始
$dblTimeStart = GetMicroTime();
for ($i = 0; $i < 50; $i++) {
    $intIp = EncodeIp($arrIp[$i]);
    $strSql = "SELECT region, address FROM ip_data
        WHERE ipstart <= ".$intIp." AND ipend >= ".$intIp." LIMIT 1"
    ;
    $resResult = mysql_query($strSql);
    if (!($arrAddr = mysql_fetch_array($resResult))) $arrAddr = array(
        "region" => "(unknown)",
        "address" => "(unknown)"
    );
    echo $arrAddr["region"];
    echo "\r\n";
    echo $arrAddr["address"];
    echo "\r\n";
}
$dblTimeSQL = GetMicroTime() - $dblTimeStart;
// MySQL测评结束
mysql_close();
// 准备QQWry类
require_once ("QQWry.php");
$objQQWry = new QQWry();
// QQWry测评开始
$dblTimeStart = GetMicroTime();
for ($i = 0; $i < 50; $i++) {
    $objQQWry->qqwry($arrIp[$i]);
    echo $objQQWry->Country;
    echo "\r\n";
    echo $objQQWry->Local;
    echo "\r\n";
}
$dblTimeQQWry = GetMicroTime() - $dblTimeStart;
// QQWry测评结束
// ip2addr测评开始
$dblTimeStart = GetMicroTime();
for ($i = 0; $i < 50; $i++) {
    $intIp = EncodeIp($arrIp[$i]);
    $arrAddr = ip2addr($intIp);
    echo $arrAddr["region"];
    echo "\r\n";
    echo $arrAddr["address"];
    echo "\r\n";
}
$dblTimeIp2Addr = GetMicroTime() - $dblTimeStart;
// ip2addr测评结束
// 输出结果
echo $dblTimeSQL; echo "\r\n";
echo $dblTimeIp2Addr; echo "\r\n";
echo $dblTimeQQWry; echo "\r\n";
?>

测评环境:联想原装机,P4 2.8G超线程,256M内存,Windows XP Professional,PHP 4.4.0,MySQL 4.0.25-nt

测评两次分别得数据(四舍五入保留3位小数,单位皆为秒):

MySQL查询时间:27.626
ip2addr查询时间:0.088
QQWry查询时间:0.044

MySQL查询时间:19.394
ip2addr查询时间:0.111
QQWry查询时间:0.044

首先MySQL方法毫无疑问处于严重劣势,耗时多出两个数量级。至于后两者似乎ip2addr要弱于QQWry。但是此程序中是先测试ip2addr后测试QQWry,所以测试QQWry时IP数据文件已经被操作系统cache了。为了验证这一点,不妨把测评ip2addr与QQWry的程序段互换,测评2次得到结果(省略说明):

26.630
0.040
0.065

27.051
0.039
0.073

为了更清晰比较ip2addr与QQWry的优劣,为其各写一段程序进行测评。测评QQWry的程序:

<?php
function GetMicroTime() {
    list($msec, $sec) = explode(" ", microtime());
    return ((double)$msec + (double)$sec);
}

// 随机产生5000个ip地址
for ($i = 0; $i < 5000; $i++) {
    $strIp = mt_rand(0, 255).".".mt_rand(0, 255).".".mt_rand(0, 255).".".mt_rand(0, 255);
    $arrIp[$i] = $strIp;
}
require_once ("QQWry.php");
$objQQWry = new QQWry();
// 测评开始
$dblTimeStart = GetMicroTime();
for ($i = 0; $i < 5000; $i++) $objQQWry->qqwry($arrIp[$i]);
$dblTimeDuration = GetMicroTime() - $dblTimeStart;
echo $dblTimeDuration; echo "\r\n";
?>

测评ip2addr的程序:

<?php
require_once ("func_ip.php");
function EncodeIp($strDotquadIp) {
    $arrIpSep = explode('.', $strDotquadIp);
    if (count($arrIpSep) != 4) return 0;
    $intIp = 0;    
    foreach ($arrIpSep as $k => $v) $intIp += (int)$v * pow(256, 3 - $k);
    return $intIp;
}
function GetMicroTime() {
    list($msec, $sec) = explode(" ", microtime());
    return ((double)$msec + (double)$sec);
}

// 随机产生5000个ip地址
for ($i = 0; $i < 5000; $i++) {
    $strIp = mt_rand(0, 255).".".mt_rand(0, 255).".".mt_rand(0, 255).".".mt_rand(0, 255);
    $arrIp[$i] = $strIp;
}
// 测评开始
$dblTimeStart = GetMicroTime();
for ($i = 0; $i < 5000; $i++) {
    $intIp = EncodeIp($arrIp[$i]);
    $arrAddr = ip2addr($intIp);
}
$dblTimeDuration = GetMicroTime() - $dblTimeStart;
echo $dblTimeDuration; echo "\r\n";
?>

测评QQWry五次的数据如下:

4.696
4.581
4.517
4.656
4.556

而ip2addr的数据如下:

4.278
4.179
4.238
4.188
4.224

可见ip2addr获得本次测评的优胜,但是ip2addr仅仅比QQWry快一点点。对于少量IP地址的转换,ip2addr与QQWry表现应该差不多。不过MySQL表现实在太差了。稍后就用ip2addr改造NetNest Stats统计系统。

附:文件func_ip.php源代码:

<?php
if (!defined("IPFILE")) define ("IPFILE", "QQWry.dat");

function bin2dec($strBin) {
    $intLen = strlen($strBin);
    for (
        $i = 0, $intBase = 1, $intResult = 0;
        $i < $intLen; $i++, $intBase *= 256
    ) $intResult += ord($strBin{$i}) * $intBase;
    return $intResult;
}

// error code: 1-open file error; 2-data error;
function ip2addr($intIp) {
    $arrUnknown = array(
        "region" => "(unknown)",
        "address" => "(unknown)"
    );
    $fileIp = fopen(IPFILE, "rb");
    if (!$fileIp) return 1;
    $strBuf = fread($fileIp, 4);
    $intFirstRecord = bin2dec($strBuf);
    $strBuf = fread($fileIp, 4);
    $intLastRecord = bin2dec($strBuf);
    $intCount = floor(($intLastRecord - $intFirstRecord) / 7);
    if ($intCount < 1) return 2;
    $intStart = 0;
    $intEnd = $intCount;
    while ($intStart < $intEnd - 1) {
        $intMid = floor(($intStart + $intEnd) / 2);
        $intOffset = $intFirstRecord + $intMid * 7;
        fseek($fileIp, $intOffset);
        $strBuf = fread($fileIp, 4);
        $intMidStartIp = bin2dec($strBuf);
        if ($intIp == $intMidStartIp) {
            $intStart = $intMid;
            break;
        }
        if ($intIp > $intMidStartIp) $intStart = $intMid;
        else $intEnd = $intMid;
    }
    $intOffset = $intFirstRecord + $intStart * 7;
    fseek($fileIp, $intOffset);
    $strBuf = fread($fileIp, 4);
    $intStartIp = bin2dec($strBuf);
    $strBuf = fread($fileIp, 3);
    $intOffset = bin2dec($strBuf);
    fseek($fileIp, $intOffset);
    $strBuf = fread($fileIp, 4);
    $intEndIp = bin2dec($strBuf);
    if ($intIp < $intStartIp || $intIp > $intEndIp) return $arrUnknown;
    $intOffset += 4;
    while (($intFlag = ord(fgetc($fileIp))) == 1) {
        $strBuf = fread($fileIp, 3);
        $intOffset = bin2dec($strBuf);
        if ($intOffset < 12) return $arrUnknown;
        fseek($fileIp, $intOffset);
    }
    switch ($intFlag) {
        case 0:
            return $arrUnknown;
            break;
        case 2:
            $intOffsetAddr = $intOffset + 4;
            $strBuf = fread($fileIp, 3);
            $intOffset = bin2dec($strBuf);
            if ($intOffset < 12) return $arrUnknown;
            fseek($fileIp, $intOffset);
            while (($intFlag = ord(fgetc($fileIp))) == 2 || $intFlag == 1) {
                $strBuf = fread($fileIp, 3);
                $intOffset = bin2dec($strBuf);
                if ($intOffset < 12) return $arrUnknown;
                fseek($fileIp, $intOffset);
            }
            if (!$intFlag) return $arrUnknown;
            $arrAddr = array(
                "region" => chr($intFlag)
            );
            while (ord($c = fgetc($fileIp))) $arrAddr["region"] .= $c;
            fseek($fileIp, $intOffsetAddr);
            while (($intFlag = ord(fgetc($fileIp))) == 2 || $intFlag == 1) {
                $strBuf = fread($fileIp, 3);
                $intOffset = bin2dec($strBuf);
                if ($intOffset < 12) {
                    $arrAddr["address"] = "(unknown)";
                    return $arrAddr;
                }
                fseek($fileIp, $intOffset);
            }
            if (!$intFlag) {
                $arrAddr["address"] = "(unknown)";
                return $arrAddr;
            }
            $arrAddr["address"] = chr($intFlag);
            while (ord($c = fgetc($fileIp))) $arrAddr["address"] .= $c;
            return $arrAddr;
            break;
        default:
            $arrAddr = array("region" => chr($intFlag));
            while (ord($c = fgetc($fileIp))) $arrAddr["region"] .= $c;
            while (($intFlag = ord(fgetc($fileIp))) == 2 || $intFlag == 1) {
                $strBuf = fread($fileIp, 3);
                $intOffset = bin2dec($strBuf);
                if ($intOffset < 12) {
                    $arrAddr["address"] = "(unknown)";
                    return $arrAddr;
                }
                fseek($fileIp, $intOffset);
            }
            if (!$intFlag) {
                $arrAddr["address"] = "(unknown)";
                return $arrAddr;
            }
            $arrAddr["address"] = chr($intFlag);
            while (ord($c = fgetc($fileIp))) $arrAddr["address"] .= $c;
            return $arrAddr;
    }
}
?>

后续相关:PHP对GB编码动态转UTF-8编码的测评

Share/Save/Bookmark
最后修改:Wen 于 2005-09-02 12:49:43

发表于 2005-08-16 12:02:12
奇技淫巧 - 阅读主题
<<  <  1  >  >>

Valid XHTML 1.0 | Valid CSS2 | WAI-A WCAG 1.0

Copyright 2005-2018 WEN'S Horizon [32/0.036]