Parse valgrind suppressions.sh

From WxWiki

This is a script for parsing valgrind's output when using the --gen-suppressions=all option. See Creating a Valgrind suppression file for too much information.

#! /usr/bin/awk -f
# Copyright: 2022 David Hart
# Licence:   wxWindows licence
#
# A script to extract the actual suppression info from the output of (for example) valgrind --leak-check=full --show-reachable=yes --error-limit=no --gen-suppressions=all ./minimal
# The desired bits are between ^{ and ^} (including the braces themselves).
# The combined output should either be appended to /usr/lib/valgrind/default.supp, or placed in a .supp of its own
# If the latter, either tell valgrind about it each time with --suppressions=<filename>, or add that line to ~/.valgrindrc

# NB This script uses the |& operator, which I believe is gawk-specific. In case of failure, check that you're using gawk rather than some other awk

# The script looks for suppressions. When it finds one it stores it temporarily in an array,
# and also feeds it line by line to the external app 'md5sum' which generates a unique checksum for it.
# The checksum is used as an index in a different array. If an item with that index already exists the suppression must be a duplicate and is discarded.

BEGIN { suppression=0; md5sum = "md5sum" }
  # If the line begins with '{', it's the start of a suppression; so set the var and initialise things
  /^{/  {
           suppression=1;  i=0; next 
        }
  # If the line begins with '}' it's the end of a suppression
  /^}/  {
          if (suppression)
           { suppression=0;
             close(md5sum, "to")  # We've finished sending data to md5sum, so close that part of the pipe
             ProcessInput()       # Do the slightly-complicated stuff in functions
             delete supparray     # We don't want subsequent suppressions to append to it!
           }
     }
  # Otherwise, it's a normal line. If we're inside a suppression, store it, and pipe it to md5sum. Otherwise it's cruft, so ignore it
     { if (suppression)
         { 
            supparray[++i] = $0
            print |& md5sum
         }
     }


 function ProcessInput()
 {
    # Pipe the result from md5sum, then close it     
    md5sum |& getline result
    close(md5sum)
    # gawk can't cope with enormous ints like $result would be, so stringify it first by prefixing a definite string
    resultstring = "prefix"result

    if (! (resultstring in chksum_array) )
      { chksum_array[resultstring] = 0;  # This checksum hasn't been seen before, so add it to the array
        OutputSuppression()              # and output the contents of the suppression
      }
 }

 function OutputSuppression()
 {
  # A suppression is surrounded by '{' and '}'. Its data was stored line by line in the array  
  print "{"  
  for (n=1; n <= i; ++n)
    { print supparray[n] }
  print "}" 
 }
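
The seen-before test described in the script's comments can also be tried without gawk or md5sum. The following is only a simplified sketch, not the script above: it assumes any POSIX awk, and it uses the text of the suppression itself as the array index instead of a checksum. The sample input is made up for illustration.

```shell
# sample is a made-up stand-in for saved valgrind output
sample=$(cat <<'EOF'
==1== some valgrind chatter
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   fun:malloc
}
==1== more chatter
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   fun:malloc
}
{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   fun:calloc
}
EOF
)

unique=$(printf '%s\n' "$sample" | awk '
  /^\{/  { inside = 1; body = ""; next }       # start of a suppression
  /^\}/  { if (inside && !(body in seen)) {    # end: print it only if it is new
             seen[body] = 1
             printf "{\n%s}\n", body
           }
           inside = 0; next }
  inside { body = body $0 "\n" }               # collect the lines between the braces
')

printf '%s\n' "$unique"
```

Of the three suppressions in the sample, the second is a copy of the first, so only two are printed.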

Thanks for your contribution, which inspired me to create a variant with the following characteristics:

* Doesn't create a subprocess to compute the md5 for each suppression.
* Eliminates any suppression name line from the checksum, so you can freely name your suppression rules.
* Outputs the rules in a stable order (sorted by checksum) so that you can easily diff rules files.
* You can specify your existing rule base as input while running your program, so duplicates are picked-off from previous runs.
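
To illustrate the second point, here is a rough shell sketch of a name-independent checksum. It assumes the name is always the second line of a suppression, and that sed, md5sum and cut are available; the helper `key` and the rule name `my_leak_rule` are made up for the example. The checksums it prints are not claimed to match the ones the Perl script computes.

```shell
# key: checksum of a suppression read on stdin, with its second line (the name) removed
key() { sed '2d' | md5sum | cut -d' ' -f1; }

named=$(printf '{\n   my_leak_rule\n   Memcheck:Leak\n   fun:malloc\n}\n' | key)
unnamed=$(printf '{\n   <insert_a_suppression_name_here>\n   Memcheck:Leak\n   fun:malloc\n}\n' | key)
other=$(printf '{\n   my_leak_rule\n   Memcheck:Leak\n   fun:calloc\n}\n' | key)

[ "$named" = "$unnamed" ] && echo "same rule despite different names"
[ "$named" != "$other" ] && echo "different rule despite the same name"
```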

Usage:

  grindmerge <input >output
or, more cleverly:
  valgrind --gen-suppressions=all --show-leak-kinds=all --leak-check=full  foo 3>&1 1>&2 2>&3 | grindmerge -f oldrules >newrules

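The redirections swap stdout and stderr for valgrind (and your program), so that valgrind's report, which is written to stderr, goes down the pipe while your program's stdout stays out of it. (Note that anything your program writes to stderr also goes down the pipe, where it is discarded along with the rest of the non-suppression text.)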
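
The effect of the swap can be checked without valgrind. In this small sketch, `emit` is a made-up stand-in for "valgrind ... foo", and the command substitution plays the part of the pipe:

```shell
# emit is a hypothetical stand-in for "valgrind ... foo": it writes one line
# to stdout (the program's output) and one to stderr (valgrind's report).
emit() {
    echo "program output"
    echo "valgrind report" >&2
}

# Inside $( ), fd 1 is the capture pipe. The swap makes fd 2 point at that
# pipe and fd 1 point at the old stderr, which is sent to /dev/null here.
captured=$( { emit 3>&1 1>&2 2>&3; } 2>/dev/null )

echo "$captured"
```

Only the line written to stderr is captured; the line written to stdout ends up wherever stderr pointed before the swap.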

-f file lets you merge rules from an existing file while reading new data on stdin.

Even if you generate your rules with wildcards, keep the resulting database for future runs so that you don't have to re-analyze the same reports.

Enjoy.

#!/usr/bin/perl

# Copyright 2016 Timothe Litt litt at acm _ddot_ org
#
# May be freely used and copied providing this notice is retained
# No warranty, use at your own risk
#

# valgrind --gen-suppressions=all foo 3>&1 1>&2 2>&3 | grindmerge -f input >newrules

use warnings;
use strict;

use Digest::MD5 qw/md5_hex/;

my %known;
my $dups = 0;

sub parse {
    my $in = shift;

    while( <$in> ) {
        next unless( /^\{/ );

        my $block = $_;
        while( <$in> ) {
            if( /^\}/ ) {
                $block .= "}\n";
                last;
            }
            $block .= $_;
        }
        last unless( defined $block );
        if( $block !~ /\}\n/ ) {
            print STDERR ("Unterminated suppression at line $.\n" );
            last;
        }
        my $key = $block;
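        # Drop the second line (the suppression name) so it does not affect the checksum.
        # The brace is escaped, as in the other patterns; newer Perls object to some unescaped ones.
        $key =~ s/(\A\{[^\n]*\n)\s*[^\n]*\n/$1/;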
        my $sum = md5_hex( $key );
        if( exists $known{$sum} ) {
            $dups++;    # keep the first copy, so a name given in the -f file survives
        } else {
            $known{$sum} = $block;
        }
    }
}

if( @ARGV >= 2 && $ARGV[0] eq '-f' ) {
    if( open( my $db, '<', $ARGV[1] ) ) {
        parse( $db );
        close( $db );
    } else {
        print STDERR ("Open failed for $ARGV[1]: $!\n");
        exit 1;
    }
    print STDERR ("Read " . keys( %known ) . " suppressions from $ARGV[1]\n" );
}

parse( \*STDIN );

print $known{$_} foreach ( sort keys %known );

print STDERR ("Squashed $dups duplicate suppressions\n");