前些天做NetNest Stats统计系统,需要根据IP地址统计地理位置。
首先想到了把IP地址与地理位置的转换关系写入MySQL数据库再调用查询。前天在Exceed PHP Club看到有一个QQWry.php程序,其中的QQWry类可以直接读取纯真的IP数据库QQWry.dat文件,于是去找来了一个,读后觉得算法不大高明,于是自己又重写了一个ip2addr函数。
今天来个测评,MySQL vs QQWry类 vs ip2addr函数。测评程序如下:
<?php
require_once ("func_ip.php");
function EncodeIp($strDotquadIp) {
$arrIpSep = explode('.', $strDotquadIp);
if (count($arrIpSep) != 4) return 0;
$intIp = 0;
foreach ($arrIpSep as $k => $v) $intIp += (int)$v * pow(256, 3 - $k);
return $intIp;
}
function GetMicroTime() {
list($msec, $sec) = explode(" ", microtime());
return ((double)$msec + (double)$sec);
}for ($i = 0; $i < 50; $i++) { // 产生50个随机ip地址
$strIp = mt_rand(0, 255).".".mt_rand(0, 255).".".mt_rand(0, 255).".".mt_rand(0, 255);
echo $strIp;
echo "\r\n";
$arrIp[$i] = $strIp;
}
// 准备MySQL数据库
$resConn = mysql_connect("localhost", "test", "");
mysql_select_db("test");
// MySQL测评开始
$dblTimeStart = GetMicroTime();
for ($i = 0; $i < 50; $i++) {
$intIp = EncodeIp($arrIp[$i]);
$strSql = "SELECT region, address FROM ip_data
WHERE ipstart <= ".$intIp." AND ipend >= ".$intIp." LIMIT 1"
;
$resResult = mysql_query($strSql);
if (!($arrAddr = mysql_fetch_array($resResult))) $arrAddr = array(
"region" => "(unknown)",
"address" => "(unknown)"
);
echo $arrAddr["region"];
echo "\r\n";
echo $arrAddr["address"];
echo "\r\n";
}
$dblTimeSQL = GetMicroTime() - $dblTimeStart;
// MySQL测评结束
mysql_close();
// 准备QQWry类
require_once ("QQWry.php");
$objQQWry = new QQWry();
// QQWry测评开始
$dblTimeStart = GetMicroTime();
for ($i = 0; $i < 50; $i++) {
$objQQWry->qqwry($arrIp[$i]);
echo $objQQWry->Country;
echo "\r\n";
echo $objQQWry->Local;
echo "\r\n";
}
$dblTimeQQWry = GetMicroTime() - $dblTimeStart;
// QQWry测评结束
// ip2addr测评开始
$dblTimeStart = GetMicroTime();
for ($i = 0; $i < 50; $i++) {
$intIp = EncodeIp($arrIp[$i]);
$arrAddr = ip2addr($intIp);
echo $arrAddr["region"];
echo "\r\n";
echo $arrAddr["address"];
echo "\r\n";
}
$dblTimeIp2Addr = GetMicroTime() - $dblTimeStart;
// ip2addr测评结束
// 输出结果
echo $dblTimeSQL; echo "\r\n";
echo $dblTimeIp2Addr; echo "\r\n";
echo $dblTimeQQWry; echo "\r\n";
?>
测评环境:联想原装机,P4 2.8G超线程,256M内存,Windows XP Professional,PHP 4.4.0,MySQL 4.0.25-nt
测评两次分别得数据(四舍五入保留3位小数,单位皆为秒):
MySQL查询时间:27.626
ip2addr查询时间:0.088
QQWry查询时间:0.044
MySQL查询时间:19.394
ip2addr查询时间:0.111
QQWry查询时间:0.044
首先MySQL方法毫无疑问处于严重劣势,耗时多出两个数量级。至于后两者似乎ip2addr要弱于QQWry。但是此程序中是先测试ip2addr后测试QQWry,所以测试QQWry时IP数据文件已经被操作系统cache了。为了验证这一点,不妨把测评ip2addr与QQWry的程序段互换,测评2次得到结果(省略说明):
26.630
0.040
0.065
27.051
0.039
0.073
为了更清晰比较ip2addr与QQWry的优劣,为其各写一段程序进行测评。测评QQWry的程序:
<?php
function GetMicroTime() {
list($msec, $sec) = explode(" ", microtime());
return ((double)$msec + (double)$sec);
}// 随机产生5000个ip地址
for ($i = 0; $i < 5000; $i++) {
$strIp = mt_rand(0, 255).".".mt_rand(0, 255).".".mt_rand(0, 255).".".mt_rand(0, 255);
$arrIp[$i] = $strIp;
}
require_once ("QQWry.php");
$objQQWry = new QQWry();
// 测评开始
$dblTimeStart = GetMicroTime();
for ($i = 0; $i < 5000; $i++) $objQQWry->qqwry($arrIp[$i]);
$dblTimeDuration = GetMicroTime() - $dblTimeStart;
echo $dblTimeDuration; echo "\r\n";
?>
测评ip2addr的程序:
<?php
require_once ("func_ip.php");
function EncodeIp($strDotquadIp) {
$arrIpSep = explode('.', $strDotquadIp);
if (count($arrIpSep) != 4) return 0;
$intIp = 0;
foreach ($arrIpSep as $k => $v) $intIp += (int)$v * pow(256, 3 - $k);
return $intIp;
}
function GetMicroTime() {
list($msec, $sec) = explode(" ", microtime());
return ((double)$msec + (double)$sec);
}// 随机产生5000个ip地址
for ($i = 0; $i < 5000; $i++) {
$strIp = mt_rand(0, 255).".".mt_rand(0, 255).".".mt_rand(0, 255).".".mt_rand(0, 255);
$arrIp[$i] = $strIp;
}
// 测评开始
$dblTimeStart = GetMicroTime();
for ($i = 0; $i < 5000; $i++) {
$intIp = EncodeIp($arrIp[$i]);
$arrAddr = ip2addr($intIp);
}
$dblTimeDuration = GetMicroTime() - $dblTimeStart;
echo $dblTimeDuration; echo "\r\n";
?>
测评QQWry五次的数据如下:
4.696
4.581
4.517
4.656
4.556
而ip2addr的数据如下:
4.278
4.179
4.238
4.188
4.224
可见ip2addr获得本次测评的优胜,但是ip2addr仅仅比QQWry快一点点。对于少量IP地址的转换,ip2addr与QQWry表现应该差不多。不过MySQL表现实在太差了。稍后就用ip2addr改造NetNest Stats统计系统。
附:文件func_ip.php源代码:
<?php
if (!defined("IPFILE")) define ("IPFILE", "QQWry.dat");function bin2dec($strBin) {
$intLen = strlen($strBin);
for (
$i = 0, $intBase = 1, $intResult = 0;
$i < $intLen; $i++, $intBase *= 256
) $intResult += ord($strBin{$i}) * $intBase;
return $intResult;
}
// error code: 1-open file error; 2-data error;
function ip2addr($intIp) {
$arrUnknown = array(
"region" => "(unknown)",
"address" => "(unknown)"
);
$fileIp = fopen(IPFILE, "rb");
if (!$fileIp) return 1;
$strBuf = fread($fileIp, 4);
$intFirstRecord = bin2dec($strBuf);
$strBuf = fread($fileIp, 4);
$intLastRecord = bin2dec($strBuf);
$intCount = floor(($intLastRecord - $intFirstRecord) / 7);
if ($intCount < 1) return 2;
$intStart = 0;
$intEnd = $intCount;
while ($intStart < $intEnd - 1) {
$intMid = floor(($intStart + $intEnd) / 2);
$intOffset = $intFirstRecord + $intMid * 7;
fseek($fileIp, $intOffset);
$strBuf = fread($fileIp, 4);
$intMidStartIp = bin2dec($strBuf);
if ($intIp == $intMidStartIp) {
$intStart = $intMid;
break;
}
if ($intIp > $intMidStartIp) $intStart = $intMid;
else $intEnd = $intMid;
}
$intOffset = $intFirstRecord + $intStart * 7;
fseek($fileIp, $intOffset);
$strBuf = fread($fileIp, 4);
$intStartIp = bin2dec($strBuf);
$strBuf = fread($fileIp, 3);
$intOffset = bin2dec($strBuf);
fseek($fileIp, $intOffset);
$strBuf = fread($fileIp, 4);
$intEndIp = bin2dec($strBuf);
if ($intIp < $intStartIp || $intIp > $intEndIp) return $arrUnknown;
$intOffset += 4;
while (($intFlag = ord(fgetc($fileIp))) == 1) {
$strBuf = fread($fileIp, 3);
$intOffset = bin2dec($strBuf);
if ($intOffset < 12) return $arrUnknown;
fseek($fileIp, $intOffset);
}
switch ($intFlag) {
case 0:
return $arrUnknown;
break;
case 2:
$intOffsetAddr = $intOffset + 4;
$strBuf = fread($fileIp, 3);
$intOffset = bin2dec($strBuf);
if ($intOffset < 12) return $arrUnknown;
fseek($fileIp, $intOffset);
while (($intFlag = ord(fgetc($fileIp))) == 2 || $intFlag == 1) {
$strBuf = fread($fileIp, 3);
$intOffset = bin2dec($strBuf);
if ($intOffset < 12) return $arrUnknown;
fseek($fileIp, $intOffset);
}
if (!$intFlag) return $arrUnknown;
$arrAddr = array(
"region" => chr($intFlag)
);
while (ord($c = fgetc($fileIp))) $arrAddr["region"] .= $c;
fseek($fileIp, $intOffsetAddr);
while (($intFlag = ord(fgetc($fileIp))) == 2 || $intFlag == 1) {
$strBuf = fread($fileIp, 3);
$intOffset = bin2dec($strBuf);
if ($intOffset < 12) {
$arrAddr["address"] = "(unknown)";
return $arrAddr;
}
fseek($fileIp, $intOffset);
}
if (!$intFlag) {
$arrAddr["address"] = "(unknown)";
return $arrAddr;
}
$arrAddr["address"] = chr($intFlag);
while (ord($c = fgetc($fileIp))) $arrAddr["address"] .= $c;
return $arrAddr;
break;
default:
$arrAddr = array("region" => chr($intFlag));
while (ord($c = fgetc($fileIp))) $arrAddr["region"] .= $c;
while (($intFlag = ord(fgetc($fileIp))) == 2 || $intFlag == 1) {
$strBuf = fread($fileIp, 3);
$intOffset = bin2dec($strBuf);
if ($intOffset < 12) {
$arrAddr["address"] = "(unknown)";
return $arrAddr;
}
fseek($fileIp, $intOffset);
}
if (!$intFlag) {
$arrAddr["address"] = "(unknown)";
return $arrAddr;
}
$arrAddr["address"] = chr($intFlag);
while (ord($c = fgetc($fileIp))) $arrAddr["address"] .= $c;
return $arrAddr;
}
}
?>




User login

