Remove Comments from IIS Logs
If you think that Log Parser is a bit on the slow side (for example, when you’re dealing with big IIS logs) and you want to bulk import your logs into SQL Server instead, then you’ll have to remove the # comment lines from the log files first. Microsoft has the PrepWebLog Utility to do this, but it seems to choke on files larger than 100 MB. Also, it handles one file at a time, so you’ll have to wrap it in a batch file to get through a whole directory of files.
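I wrote a Perl script that’s relatively fast (faster than PrepWebLog) and can crawl through every folder and subfolder that matches a wildcard pattern. Each cleaned log is written next to the original with .txt appended to its name. Here it is: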
# parse.pl
# Example:
#   perl parse.pl c:\temp\logs\logs*\*.log
#
# Requirement: no spaces in the directory names and file names.
# This gets called via run.bat.

use strict;
use warnings;

sub getFileList {
    # Returns the list of files (with full paths) that match a filter.
    # Example of a filter: getFileList("*.log");
    my ($filter) = @_;
    my @files = glob($filter);
    return @files;
}

sub remove_comments {
    # Copies a file to "<filename>.txt", skipping lines that start with #.
    my ($filename) = @_;
    open(my $in,  "<", $filename)       or die "in: $filename: $!";
    open(my $out, ">", "$filename.txt") or die "out: $filename: $!";
    while (my $line = <$in>) {
        print $out $line unless $line =~ /^#/;
    }
    close $in;
    close $out;
}

########## MAIN #############
# Location of the log files, given as a wildcard pattern.
my $arg = $ARGV[0];
#$arg = 'c:\temp\logs\logs*\*.log';
die "usage: perl parse.pl <pattern>\n" unless defined $arg;

# Double the backslashes so glob does not treat them as escape characters.
$arg =~ s/\\/\\\\/g;

# Loop through all the log files.
for my $file (getFileList($arg)) {
    print "Processing file $file ... \n";
    remove_comments($file);
}
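The script relies on wildcard expansion, which is why directory and file names can't contain spaces and why only folders matching the pattern are visited. If that's a problem, one possible variant is sketched below using Perl's core File::Find module to walk a root directory to any depth. The helper name strip_comments_in_tree and the choice to match files ending in .log are my assumptions for illustration, not part of the script above, and I haven't timed it against big logs.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not part of the original script): walks $root
# recursively and writes a comment-free copy of every *.log file next to
# the original, with ".txt" appended. Returns the list of files processed.
sub strip_comments_in_tree {
    my ($root) = @_;
    my @processed;

    find(
        {
            no_chdir => 1,    # $_ holds the full path; no chdir into each folder
            wanted   => sub {
                my $path = $_;
                return unless -f $path && $path =~ /\.log$/i;

                open(my $in,  '<', $path)       or die "in: $path: $!";
                open(my $out, '>', "$path.txt") or die "out: $path: $!";
                while (my $line = <$in>) {
                    print $out $line unless $line =~ /^#/;
                }
                close $in;
                close $out;
                push @processed, $path;
            },
        },
        $root
    );
    return @processed;
}

# Run only when a root directory is given on the command line.
if (@ARGV) {
    print "Processed $_\n" for strip_comments_in_tree($ARGV[0]);
}
```

You would call it with a directory rather than a pattern, for example perl strip_tree.pl D:\statesites (the file name strip_tree.pl is made up here). Since the path is never passed through glob, spaces in folder names shouldn't matter.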
The Perl script gets called via run.bat:
REM No spaces in directory and file names.
perl Parse.pl D:\statesites\W3SVC*\*.log
pause