DEV Community

Content Trends with PHP

lbonanomi
Internet loudmouth since 1996

Once upon a time I lived in a share house with a rotating cast of gnarly geeks and one cable modem, so I got pretty well acquainted with the pfSense firewall. One of the strange things about pfSense was that many of its system scripts had been rewritten in PHP; seeing rc.d scripts written in PHP made me appreciate the idea of a batteries-included scripting language.

One of the more oddly-shaped batteries in PHP is metaphone(), an algorithm similar to (but more precise than) the venerable soundex for generating approximate phonetic keys for strings. Because metaphone() implicitly ignores numerals and punctuation, it's a handy tool for fuzzy matching of text strings.
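To make that concrete, here is a minimal sketch of the effect on log-style text. The three sample lines and the variable names are made up for illustration; the only assumption about PHP itself is that metaphone() and soundex() skip non-alphabetic characters, which is how the built-in implementations behave as far as I know.

```php
<?php
// Two hypothetical syslog lines that differ only in timestamp, PID and IP digits.
$a = "May 24 13:52:29 guidance sshd[30083]: Did not receive identification string from 151.101.2.217";
$b = "May 24 13:57:02 guidance sshd[31120]: Did not receive identification string from 151.101.66.217";
// A line with different wording, for contrast.
$c = "May 24 13:52:24 guidance sshd[29567]: Connection closed by 151.101.2.217 [preauth]";

// Same wording, different digits: the keys should match.
var_dump(metaphone($a) === metaphone($b));

// Different wording: the keys should differ.
var_dump(metaphone($a) === metaphone($c));

// soundex() keeps only one letter plus three digits, so every line that
// starts with the same few words collapses into the same code.
var_dump(soundex($a) === soundex($c));
```

The last comparison is why soundex is too coarse for this job: its four-character code is used up before the line gets past the hostname.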

Here's a fun example of finding content trends in a system or application log:

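#!/usr/bin/env php
<?php
    // Usage: trend.php LOGFILE
    $keys = array();    // metaphone key => last line seen with that key
    $counts = array();  // metaphone key => number of lines with that key
    $total = 0;

    foreach (file($_SERVER['argv'][1]) as $line) {
        $total++;
        // Digits and punctuation don't affect the key, so lines that differ
        // only in timestamps, PIDs or IP addresses are counted together.
        $lphone = metaphone($line);
        $keys[$lphone] = trim($line);
        // Start from 0 the first time a key appears (avoids an undefined-index warning).
        $counts[$lphone] = ($counts[$lphone] ?? 0) + 1;
    }

    // Sort by count, highest first, keeping the metaphone keys.
    arsort($counts);

    $topten = array_slice($counts, 0, 10, true);

    foreach ($topten as $comm => $count) {
        print round((($count * 100) / $total), 2)."%\t\"".$keys[$comm]."\"\n";
    }
?>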

And this will let you get the "shape" of files like /var/log/messages:

guidance: ~/trend.php /var/log/messages

64.82%  "May 24 13:52:29 guidance sshd[30083]: Did not receive identification string from 151.101.2.217"
11.55%  "May 24 13:49:34 guidance sshd[31567]: Received disconnect from 151.101.194.217: 11: disconnected by user"
10.27%  "May 24 13:12:55 guidance sshd[175458]: Accepted publickey for root from 151.101.130.217 port 49141 ssh2"
2.84%   "May 24 13:52:24 guidance sshd[29567]: Connection closed by 151.101.2.217 [preauth]"
1.21%   "May 24 13:48:13 guidance sshd[174773]: Received disconnect from 151.101.194.217: 11: Closed due to user request. [preauth]"
1.11%   "May 24 13:44:13 guidance altsshd[125993]: Received disconnect from 10.126.138.57: 11: disconnected by user"
0.96%   "May 24 13:39:43 guidance altsshd[105719]: Accepted publickey for op from 151.101.130.217 port 44578 ssh2"
0.95%   "May 24 13:50:32 guidance sodad[173222]: sodad-ipc_temp (PID 173222) exiting"
0.95%   "May 24 13:51:01 guidance sodad[194844]: sodad (PID 194844) starting"
0.42%   "May 24 13:41:59 guidance sshd[115807]: Authorized to lbonanomi, krb5 principal bonanomi@DEV.TO (krb5_kuserok)"
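If you want to reuse or test the grouping outside of a one-off script, the same logic can be wrapped in a function. This is only a sketch: trend_shape() is a name I made up here, and the three input lines are invented sample data rather than real log output.

```php
<?php
// trend_shape() is a hypothetical helper, not part of the original script.
// It groups lines by metaphone() key and returns the $limit largest groups
// as [percentage, sample line] pairs, most common first.
function trend_shape(array $lines, int $limit = 10): array
{
    $samples = [];
    $counts = [];
    foreach ($lines as $line) {
        $key = metaphone($line);
        $samples[$key] = trim($line);               // last line seen for this key
        $counts[$key] = ($counts[$key] ?? 0) + 1;
    }
    arsort($counts);

    $total = count($lines);
    $rows = [];
    foreach (array_slice($counts, 0, $limit, true) as $key => $count) {
        $rows[] = [round($count * 100 / $total, 2), $samples[$key]];
    }
    return $rows;
}

// Invented sample data: three lines, two of which differ only in digits.
$rows = trend_shape([
    "sodad[194844]: sodad (PID 194844) starting",
    "sodad[194901]: sodad (PID 194901) starting",
    "sshd[29567]: Connection closed by 151.101.2.217 [preauth]",
]);

foreach ($rows as [$percent, $sample]) {
    print $percent."%\t\"".$sample."\"\n";
}
```

Note that the sample line shown for each group is simply the last one seen with that key, so the timestamps in the report are from the most recent occurrence.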
