Remove Comments from IIS Logs

If you find Log Parser too slow (for example, when you’re dealing with big IIS logs) and you want to bulk import your logs into SQL Server instead, you’ll first have to remove the # comment lines from the log files. Microsoft has the PrepWebLog utility for this, but it seems to choke on files larger than 100 MB. You’ll also have to wrap it in a batch file to make it go through a whole directory of files.
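To make the target concrete, here is a minimal sketch of which lines get dropped. The sample is made up in the W3C extended log layout (the directive names are the usual IIS header lines, the values are placeholders), and `data_lines` is a hypothetical helper name used only for this illustration.

```perl
use strict;
use warnings;

# Made-up sample; the values are placeholders, not real log data.
my $sample = <<'LOG';
#Software: Microsoft Internet Information Services 6.0
#Version: 1.0
#Date: 2010-01-01 00:00:00
#Fields: date time cs-method cs-uri-stem sc-status
2010-01-01 00:00:01 GET /default.htm 200
2010-01-01 00:00:02 GET /missing.htm 404
LOG

# Return only the lines that do not start with '#'.
sub data_lines {
    my ($text) = @_;
    # split /^/m keeps each line's trailing newline intact
    return grep { !/^#/ } split /^/m, $text;
}

# Prints the two request lines and none of the header directives.
print data_lines($sample);
```

Note that the #Fields line is the only place the column names appear, so you need to know the column order from somewhere else once the comments are gone.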

I wrote a Perl script that’s relatively fast (faster than PrepWebLog) and it can crawl folders/subfolders recursively. Here it is:

# parse.pl
# example: 
#   parse c:\temp\logs\logs*\*.log
#
# Requirement: no spaces in the directory names and file names.
# This gets called via run.bat. 


sub getFileList 
{    
    # Returns the list of files (with full paths) that match a wildcard filter.
    # $_[0] = filter, e.g. getFileList ( "*.log" );
    # glob splits its argument on whitespace, hence the no-spaces requirement.
    my @files = glob( $_[0] );
    return @files;    
}


sub remove_comments
{
  # Copy a file to <filename>.txt, leaving out lines that start with #.
  # $_[0] = filename
  
  open (my $in, "<", $_[0]) 
      or die "in: $_[0]";
  open (my $out, ">", "$_[0].txt") 
      or die "out: $_[0]";

  while( my $line = <$in> )
  {
      print $out $line
          unless $line =~ /^#/;
  }

  close $in;
  close $out;
}


########## MAIN #############
$arg = $ARGV[0];

# Location of root directory of logs files
#$arg = 'c:\temp\logs\logs*\*.log';

# Double each backslash so glob treats it as a literal path separator
$arg =~ s/\\/\\\\/g;

# Loop through all the log files. 
for $file (getFileList ($arg))
{  
  print ( "Processing file $file ... \n" );    
  remove_comments( $file );  
}

The Perl script gets called via run.bat:

REM No spaces in directory and file names.
perl Parse.pl D:\statesites\W3SVC*\*.log
pause
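The script above finds its files through the wildcards in the path, so it only reaches as deep as the pattern you give it, and it can't handle spaces in names. If you need it to walk every subfolder regardless of depth, a variant built on the core File::Find module is one option. This is only a sketch under a few assumptions: `strip_comments` and `strip_tree` are names made up for this example, it picks up every file ending in .log under the root, and it writes the cleaned copy next to the original as .log.txt, the same as the original script.

```perl
#!/usr/bin/perl
# Sketch: recursive variant of the comment stripper using File::Find.
# strip_comments() and strip_tree() are hypothetical names, not part of parse.pl.
use strict;
use warnings;
use File::Find;

# Copy $path to "$path.txt", dropping every line that starts with '#'.
sub strip_comments {
    my ($path) = @_;
    open(my $in,  '<', $path)       or die "in: $path: $!";
    open(my $out, '>', "$path.txt") or die "out: $path.txt: $!";
    while (my $line = <$in>) {
        print {$out} $line unless $line =~ /^#/;
    }
    close $in;
    close $out or die "close: $path.txt: $!";
    return "$path.txt";
}

# Walk $root recursively and process every *.log file found.
# Returns the list of .txt files that were written.
sub strip_tree {
    my ($root) = @_;
    my @done;
    find(
        {
            no_chdir => 1,    # with no_chdir, $_ holds the full path
            wanted   => sub {
                return unless -f $_ && /\.log\z/i;
                print "Processing file $_ ...\n";
                push @done, strip_comments($_);
            },
        },
        $root
    );
    return @done;
}

# Run only when called directly with an argument,
# e.g.: perl parse_tree.pl D:\statesites
strip_tree($ARGV[0]) if !caller && @ARGV;
```

Because the path is passed to open as a single scalar and never goes through glob, spaces in directory or file names shouldn't be a problem here. The file name parse_tree.pl in the comment is a placeholder.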

Categories

IIS, Perl

Dan

Blog owner.
