Linux – Page 8 – JEB's Blog

Crashing RT 3.8 on Debian

Thanks to this post I tracked down a crash issue with FreezeThaw and RT. *sigh*

In essence, edit /etc/request_tracker3.8/RT_SiteConfig.pm, and add Set($WebSessionClass , 'Apache::Session::File');.

But RT still rocks!

Asus, where is the B204 w/Linux?

Dear Asus (pron: “a-sue-s”, not “ace-us”, apparently),

I’ve been looking for a small set-top home media player running Linux, with HDMI on board, capable of playing 1080p video. Looking around, I’m quite fond of the Eee Box B204, but I don’t want to pay for MS Windows. I like the Bluetooth for my remote keyboard/mouse/input, and the battery UPS is cute, at least until the battery loses its ability to charge and discharge.

I wanted to email you, but your asus.com web site lists telephone numbers only. It has taken you a long, long time to bring this product to market in the UK. You’re also selling different models in different markets, and your branding of the models is somewhat confusing: the B204 is “better/higher spec” than the B206 that was released at the same time?

Having said that, the on-line retailers I am Google-Frugal-Shopping searching through all seem to list the product, but have zero stock. Are there supply line issues? Is the product recalled? Or is it selling really well? Where are the Linux versions (sans-Microsoft-tax)? And why is your UK channel selling these at a considerable mark-up compared to the US retail prices? Currency does fluctuate, but I’m sure that UK£ and US$ are not on parity… yet.

Sincerely,
JEB

Debian 5.0 Lenny is out

As of around 23:00 UTC today, Debian GNU/Linux 5.0 is kind-of out and about. The official web site isn’t updated yet, and final CD images are being generated now, but the symlink of stable has come to rest on Lenny; testing is now Squeeze, and dear old 4.0 Etch is oldstable. For those not aware, the code names are all characters from Disney/Pixar’s Toy Story.

Lenny runs on 12 different CPU architectures, including the standard 32-bit i386 and its 64-bit equivalent (AMD64). It ships with kernel 2.6.26 and MySQL 5.0.51a (which has an interesting STATISTICS table in INFORMATION_SCHEMA).

Some of the gotchas that may come up:

Kernel 2.4 is dropped
Firmware for various devices may have been split out into separate packages; for example, on an HP BL460 blade system, you’ll need to install firmware-bnx2
Apache 1 has been dropped; use Apache 2

See this for a rough summary and the release news story for more.

Logging to MySQL in 3rd Normal Form

I’m at it again with my Log3NF! When last I did this, Debian’s Perl packages were in no shape for using MySQL stored procedures, but time has passed and everything is ready…

Any web server software, like Apache, can log requests that come in when people browse sites. Typically people record the accesses and do statistical analysis on it – to see visitor numbers, people stealing graphics, preferred browser versions of the visitors, where people are being linked-to from, etc. All of this data can be quite voluminous, and much of it is repetitive.

For a long time it has been possible to log this data to a simple flat MySQL (or other) database. However, most of those implementations use just one table to store all the fields of a log line. This means the data still has to be split apart for analysis.
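To make “split apart” concrete, here is a minimal sketch of pulling one combined-format line into its components. It is in Python rather than the stored procedure this post uses, and the regular expression, field names and sample line are illustrative assumptions, not part of Log3NF.

```python
import re

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_combined(line):
    """Return a dict of fields for one combined-format line, or None if it does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["status"] = int(fields["status"])
    # Apache writes "-" when no body bytes were sent
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return fields

# Made-up sample line using documentation-only addresses
sample = ('192.0.2.10 - - [14/Feb/2009:23:00:00 +0000] '
          '"GET /index.html HTTP/1.1" 200 5120 '
          '"http://example.com/" "Mozilla/5.0"')
print(parse_combined(sample))
```

Each key in the resulting dict corresponds to one of the components that gets its own table in the normalised design.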

So, what have I done? Well, I have written a bunch of table structures to handle each component of a standard “combined” log file, and a table that joins each of these components of a log line together. Plus I have written some table structures to hold summary data of this, so over time I can delete the original log entries and just keep the summaries. Then I have written some stored procedures to parse the incoming log entry and split it into these tables, and update the summary statistics. Here’s the main table that ties everything together – you’ll see it’s indexed in every way possible, so you can see the possibilities for reporting from it…

CREATE TABLE Access (
    ID            bigint unsigned auto_increment primary key,
    IPv4          int unsigned not null,
    Ident_ID      int unsigned,
    User_ID       int unsigned,
    At            datetime not null,
    Protocol_ID   tinyint unsigned,
    Method_ID     tinyint unsigned not null,
    Status_ID     tinyint unsigned not null,
    Path_ID       bigint unsigned,
    Referer_ID    bigint unsigned,
    UserAgent_ID  bigint unsigned,
    Bytes         int unsigned,
    Server_ID     smallint unsigned,
    Site_ID       smallint unsigned,
    Timezone_ID   tinyint unsigned not null,
    index index_IP(IPv4),
    index index_At(At),
    index index_Protocol_ID(Protocol_ID),
    index index_Method_ID(Method_ID),
    index index_Status_ID(Status_ID),
    index index_Path_ID(Path_ID),
    index index_Referer_ID(Referer_ID),
    index index_UserAgent_ID(UserAgent_ID),
    index index_Bytes(Bytes),
    index index_Server_ID(Server_ID),
    index index_Site_ID(Site_ID)
);

This supports having multiple web sites logging to it (think virtual hosting several sites) and server farms (multiple servers for big web sites, distributed global delivery).
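The core of the normalisation is a “look up the ID, or insert and then look it up” step for each component. The sketch below shows that idea only. It uses Python’s built-in sqlite3 as a stand-in for MySQL, and the cut-down Path and UserAgent tables plus the lookup_id and log_access helpers are hypothetical names, not the actual Log3NF procedure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Path      (ID INTEGER PRIMARY KEY, Path TEXT UNIQUE NOT NULL);
    CREATE TABLE UserAgent (ID INTEGER PRIMARY KEY, UserAgent TEXT UNIQUE NOT NULL);
    CREATE TABLE Access    (ID INTEGER PRIMARY KEY, At TEXT NOT NULL,
                            Path_ID INTEGER, UserAgent_ID INTEGER, Bytes INTEGER);
""")

def lookup_id(table, column, value):
    """Return the ID for value in a lookup table, inserting it on first sight."""
    # table and column come from this script, never from the log line, so
    # interpolating them is safe here; the value itself is always bound.
    conn.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)", (value,))
    row = conn.execute(f"SELECT ID FROM {table} WHERE {column} = ?", (value,)).fetchone()
    return row[0]

def log_access(at, path, agent, nbytes):
    """Store one access, replacing repeated strings with foreign keys."""
    conn.execute(
        "INSERT INTO Access (At, Path_ID, UserAgent_ID, Bytes) VALUES (?, ?, ?, ?)",
        (at, lookup_id("Path", "Path", path),
         lookup_id("UserAgent", "UserAgent", agent), nbytes),
    )

log_access("2009-02-14 23:00:00", "/index.html", "Mozilla/5.0", 5120)
log_access("2009-02-14 23:00:05", "/index.html", "Mozilla/5.0", 5120)
print(conn.execute("SELECT COUNT(*) FROM Path").fetchone()[0])    # distinct paths stored
print(conn.execute("SELECT COUNT(*) FROM Access").fetchone()[0])  # access rows stored
```

Two hits on the same path produce two Access rows but only one Path row, which is where the space saving on repetitive log data comes from.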

Next up, I wrote a small script to load a pre-existing access log using this stored procedure. But that’s rather slow, so I have written a “Log Handler” for Apache 2 with mod_perl 2. This means that as each access is performed, it is logged live to 3rd normal form in MySQL. The handler is very brief:

package JEB::Log3NFHandler;

use strict;
use warnings;

use Apache2::RequestRec ();
use Apache2::RequestUtil ();   # provides $r->dir_config
use Apache2::Connection ();    # provides $r->connection->remote_ip
use APR::Table ();             # provides $r->headers_in->get
use Apache2::Const -compile => qw(OK DECLINED);
use Apache::DBI;
use Time::Zone;

my $dbh;   # connection is kept between requests

sub handler {
    my $r = shift;

    # Connect on first use; database, user and password come from PerlSetVar
    $dbh = DBI->connect(
        'dbi:mysql:database=' . $r->dir_config("Log3NFDatabase"),
        $r->dir_config("Log3NFDatabaseUser"),
        $r->dir_config("Log3NFDatabasePassword") || ""
    ) unless $dbh;
    return Apache2::Const::DECLINED unless $dbh;

    my $sql = "call Log3NF(?, ?, ?, from_unixtime(?), ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)";
    my $sth = $dbh->prepare($sql);
    $sth->bind_param(1,  $r->connection->remote_ip);
    $sth->bind_param(2,  "-");                                     # Ident
    $sth->bind_param(3,  $r->user());
    $sth->bind_param(4,  $r->request_time());
    $sth->bind_param(5,  $r->protocol());
    $sth->bind_param(6,  $r->method());
    $sth->bind_param(7,  $r->status());
    $sth->bind_param(8,  $r->uri());
    $sth->bind_param(9,  $r->headers_in->get('Referer') || '-');   # Referer
    $sth->bind_param(10, $r->headers_in->get('User-Agent'));       # Useragent
    $sth->bind_param(11, $r->bytes_sent());                        # Bytes
    $sth->bind_param(12, $ENV{'SERVER_NAME'});                     # Server name
    $sth->bind_param(13, $r->hostname());                          # Site name
    # tz_local_offset()/60
    $sth->bind_param(14, "+0000");                                 # Timezone
    $sth->execute();
    $sth->finish;

    return Apache2::Const::OK;
}

1; # modules must return true

You’ll notice the Timezone set to “+0000”; while the TZ variable in mod_perl names a location (“Europe/London”), it doesn’t give an offset from GMT. I’m also always logging ident as “-”, since I can’t see how mod_perl makes that available. The database name, DB user and password are all taken from the Apache configuration file, via PerlSetVar directives.
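If an offset in seconds were available (the commented-out tz_local_offset() call hints at that route), turning it into the “+0000” style string is simple arithmetic. This is a minimal Python sketch of that conversion only; format_utc_offset is a hypothetical helper and is not part of the handler.

```python
def format_utc_offset(offset_seconds):
    """Format an offset from UTC, given in seconds, as +HHMM or -HHMM."""
    sign = "-" if offset_seconds < 0 else "+"
    hours, remainder = divmod(abs(offset_seconds), 3600)
    return f"{sign}{hours:02d}{remainder // 60:02d}"

print(format_utc_offset(0))       # +0000
print(format_utc_offset(3600))    # +0100
print(format_utc_offset(-19800))  # -0530
```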

With this data in 3rd normal form, viewing it means several joins, or making use of another of the newer facilities that saw daylight in MySQL 5.0: views. So a couple of views sit around to make this data easily accessible.
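As an illustration of what such a view does, the sketch below joins a cut-down Access table back to two lookup tables. It runs on Python’s sqlite3 rather than MySQL, and the AccessView name, the reduced column set and the sample rows are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Path   (ID INTEGER PRIMARY KEY, Path TEXT NOT NULL);
    CREATE TABLE Status (ID INTEGER PRIMARY KEY, Status INTEGER NOT NULL);
    CREATE TABLE Access (ID INTEGER PRIMARY KEY, At TEXT NOT NULL,
                         Path_ID INTEGER, Status_ID INTEGER, Bytes INTEGER);

    INSERT INTO Path   VALUES (1, '/index.html'), (2, '/missing.png');
    INSERT INTO Status VALUES (1, 200), (2, 404);
    INSERT INTO Access VALUES (1, '2009-02-14 23:00:00', 1, 1, 5120),
                              (2, '2009-02-14 23:00:07', 2, 2, 310);

    -- The view hides the joins, so reports can query readable columns directly.
    CREATE VIEW AccessView AS
        SELECT a.At, p.Path, s.Status, a.Bytes
        FROM Access a
        LEFT JOIN Path   p ON p.ID = a.Path_ID
        LEFT JOIN Status s ON s.ID = a.Status_ID;
""")

rows = conn.execute(
    "SELECT At, Path, Status, Bytes FROM AccessView ORDER BY At"
).fetchall()
for row in rows:
    print(row)
```

LEFT JOIN is used so that an access with a NULL foreign key (several columns in the main table allow NULL) still shows up in the view.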

With this data being stored as it happens, I wrote a CGI script to render this data – to give me some graphs of the last 5 minutes of activity, in real time. In fact, it’s dynamic, so I can zoom in to the last 5 minutes, or out to the last 800 minutes. This real-time analysis shows HTTP status codes, popular paths being requested (by hits and by bytes), plus per-minute hits and bytes.
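The per-minute figures boil down to truncating each timestamp to the minute and summing. Here is a rough Python sketch of that aggregation, assuming the accesses have already been fetched as (timestamp, bytes) pairs; the per_minute name and the sample data are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

def per_minute(accesses):
    """Sum hits and bytes per minute from (timestamp, bytes) pairs."""
    totals = defaultdict(lambda: [0, 0])  # minute -> [hits, bytes]
    for at, nbytes in accesses:
        minute = at.replace(second=0, microsecond=0)
        totals[minute][0] += 1
        totals[minute][1] += nbytes
    return {minute: tuple(v) for minute, v in sorted(totals.items())}

accesses = [
    (datetime(2009, 2, 14, 23, 0, 5), 5120),
    (datetime(2009, 2, 14, 23, 0, 40), 310),
    (datetime(2009, 2, 14, 23, 1, 2), 2048),
]
for minute, (hits, nbytes) in per_minute(accesses).items():
    print(minute.strftime("%H:%M"), hits, nbytes)
```

In the database itself the same result would more naturally come from a GROUP BY on the truncated timestamp.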

But there’s more… let’s do some analysis on where these hits are coming from. MaxMind distribute a free Country CSV database that shows roughly where all these IPs are coming from. We load this CSV into a normalised form, and start to integrate this into the live and summary tables…

… at least, that’s where I am up to now.
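For the curious, the country lookup itself is a range search on the integer form of the address, the same form the IPv4 column stores. The Python sketch below assumes each CSV row holds start IP, end IP, start as integer, end as integer, country code and country name; that layout, the sample rows (documentation-only address ranges with invented countries) and the helper names are all assumptions.

```python
import bisect
import csv
import io
import ipaddress

# Hypothetical rows in the assumed layout:
# start IP, end IP, start as integer, end as integer, country code, country name
SAMPLE_CSV = '''"192.0.2.0","192.0.2.255","3221225984","3221226239","GB","United Kingdom"
"198.51.100.0","198.51.100.255","3325256704","3325256959","US","United States"
'''

def load_ranges(text):
    """Return (starts, rows): sorted range starts plus matching (start, end, code) rows."""
    rows = sorted((int(r[2]), int(r[3]), r[4])
                  for r in csv.reader(io.StringIO(text)) if r)
    return [row[0] for row in rows], rows

def country_for(ip, starts, rows):
    """Return the country code whose range contains ip, or None if no range does."""
    n = int(ipaddress.IPv4Address(ip))
    # Find the last range that starts at or before n, then check its end
    i = bisect.bisect_right(starts, n) - 1
    if i >= 0 and rows[i][0] <= n <= rows[i][1]:
        return rows[i][2]
    return None

starts, rows = load_ranges(SAMPLE_CSV)
print(country_for("192.0.2.77", starts, rows))
print(country_for("203.0.113.9", starts, rows))
```

An address that falls in a gap between ranges returns None, so a report would need an “unknown” bucket.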

I’ve been looking at this approach since around 2002, when I had to perform all the normalisation in client-side Perl. But abstracting away the normalisation into the MySQL stored procedure makes this much neater, and less prone to inconsistencies (the client doesn’t have to update the main table and ensure it puts in the correct foreign keys).

I will put this code up for public consumption soon, so if you’re interested in 3rd normal form logging, drop me an email!

Open Source after the apocalypse

Looks like Red Hat’s CEO is making a point about Open Source being dominant in the wake of the global financial crisis. So what does this mean for the people who write, code, and distribute Free software?

Hiring and Keeping FLOSS Developers

Hiring open source developers is a tricky business. They need the freedom to contribute to their project, which is probably outside of your business goals. If they are unable to get enough time to pursue those goals, then several things happen:

Their open source project suffers through lack of time devoted to it
Their enthusiasm for their commercial goals suffers, since their open source effort is waning
The open source project may indirectly feed back into the company goals, so the company loses out on Free projects it could use commercially
The employee leaves, and you have to train a new employee (and pay recruiters, etc).

So, Google has a “20% time” programme, where staff can work on (approved) projects. This avoids the above conflict. Ergo, Google has some very enthusiastic and dedicated staff. And a lot of them.

Clearly there is a cost uplift, but the feedback loop does exist, if the management framework can understand it (and the purse strings can afford the 20% cost overhead of having enough staff to cover the workload).

Where do the current FLOSS developers end up?

Well, with a head start on a large number of technologies, and an understanding of the way projects come / go / survive / fork / die, established FLOSS practitioners end up… leading, from the front. Which means becoming management. Which means reporting, directing, and not actually doing the work, which for some is a motivation in itself.

So, shake the dust off and move away from the coal face. Start pointing at the seams from the back, and watch others try to tackle them.

Where will we start seeing Open Source in the Enterprise?

With major releases in the last few weeks of OpenOffice.org, and Gimp, and a slew of other projects, there is now a set of tools that cover the majority of what the corporate desktop actually needs. And these are on the Win32 platform, which the enterprise is comfortable with.

OpenOffice.org 3 can now read Microsoft Office 2007 file formats. Admittedly it’s “import” only, but it can save to Microsoft Office 2003 file formats, and that’s enough of a bridge. Besides, if someone wants to save to any other format, they can contribute to the project to make it happen.

Gimp is starting to look more polished. What it needs is more examples to get the users who think of Photoshop as a verb to understand that photo editing doesn’t start and end with one commercial product.

Getting Group Policy support for these projects on Win32 is a “nice to have”, but probably won’t happen (unless someone feels strongly enough about it to do it themselves), but moving this to the Linux environment should become easier.

Update 2008-10-22

A contrary opinion.
