Analysing SRS database usage

Heikki Lehvaslaiho heikki at ebi.ac.uk
Tue Nov 24 09:10:38 EST 1998


SRS managers,

You might have noticed that standard httpd log analysing programs are
of no use if you want to analyse SRS usage. The relevant requests call
wgetz with a long list of options and, to make the situation even worse,
they include session IDs, which make almost every request unique.

The attached Icarus script rewrites those requests that have a
database name among the wgetz options, replacing the whole request
with that name. For example:

cetus.ebi.ac.uk - - [24/Nov/1998:12:32:48 +0000] "GET
/srs5bin/cgi-bin/wgetz?-id+4ktgJ1ALlWR+-e+[EMBL-ID:'AB015367']
HTTP/1.0" 200 3140

becomes:

cetus.ebi.ac.uk - - [24/Nov/1998:12:32:48 +0000] "GET EMBL" 200 3140

Any log analyser program can then group and count these requests by
database.
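
For readers without Icarus at hand, the same rewrite can be sketched in
Python. This is only an illustration, not the attached script: the
function name simplify and both regular expressions are my own
assumptions, adapted from the dbhit and write rules in the attachment.
When a request names several databases, the sketch keeps the last one,
which is how I read the $name assignment in the script.

```python
import re

# Assumed pattern, adapted from the dbhit rule: an option -info, -e, -lib
# or -l, then '+' or '=', an optional '[', then the database name up to
# the next '-', '+' or space.
DB_OPTION = re.compile(r"[-+](?:info|e|lib|l)[+=]\[?([^-+ ]+)")

# Assumed pattern, adapted from the write rule: everything up to "GET ",
# the wgetz path, the option string, and the rest from the closing quote.
WGETZ_REQUEST = re.compile(r'^(.*GET )/srs5bin/cgi-bin/wgetz([^"]*)(".*)$')


def simplify(line: str) -> str:
    """Replace a wgetz request with the database name it refers to.

    Expects one log entry without its trailing newline. Entries that are
    not wgetz requests, or that name no database, are returned unchanged.
    """
    match = WGETZ_REQUEST.match(line)
    if match is None:
        return line
    prefix, options, rest = match.groups()
    names = DB_OPTION.findall(options)
    if not names:
        return line
    # Keep the last database name found (an assumption, see above).
    return f"{prefix}{names[-1]}{rest}"


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python logmod_sketch.py access_log
    with open(sys.argv[1]) as log:
        for entry in log:
            print(simplify(entry.rstrip("\n")))
```

Run on the example entry above (joined into a single line), simplify
returns the shortened "GET EMBL" entry shown.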

Yours,
	-Heikki
 
-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www2.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho          heikki at ebi.ac.uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________
-------------- next part --------------
#!/bin/env icarus
#
# logmod.i
#   Cleans SRS HTTPD log file entries of wgetz options and session ids
#   so that the use of databases can be summed up.
#
# Heikki Lehvaslaiho, EMBL-EBI
# heikki at ebi.ac.uk
#
# v.1.1  Nov 23 1998
#

if:($ArgN<2) {
   $Print:|Usage: logmod.i http_log_file [> modified_http_log_file]
   $Exit
}
$logfile = $Arg:2

$rules={

  hit:    ~ {$In:[file:text] $Out pre $Skip:0}
            ln? {$Wrt}
          ~
  dbhit:  ~ {$In:hit $Out}
            /.*GET /
            #note: ignores queries for number of links between databases
            ( /[-\+](info|e|lib|l)[\+=]\\[?([^-\+ ]+)/
              {$Wrt:[s:$2] $name=$2} |
              /./
            )+
          ~
  write:  ~ {$In:hit $Out pre $Request:dbhit }
            /(.*GET )\/srs5bin\/cgi-bin\/wgetz/ {$Print:$1} 
            /[^"]+(.*)/ {if:$name!='' $Print:"$name$1" else $Print:$Ct} |
            /.*/ {$Print:$Ct}
          ~
  #other
  ln:     ~ /[^\n]*\n/ ~
}

if:$TestMode {
  $job = $JobNew:[prod:$rules skip:" \r\n" fileName:$logfile]
  while:$JobHasInput:$job {
    $JobTokens:[$job name:write print:0] 
    $JobNext:$job
  }
}

