Trimming empty attributes

The “empty attribute” problem is mostly confined to Netscape/Sun directory servers. Even with schema checking turned on, these servers will let you violate schema and attribute syntax rules in all kinds of ways. Here are some ways to get at the offending entries.


Most directory admins like their servers to strictly enforce schema and schema syntax rules. It helps avoid future issues in re-loading directory entries into the same server and makes migration from one server to another much more efficient and trouble-free.

Unfortunately the original developers of the Netscape Directory succumbed to pressure from customers (or anticipated such pressure) and designed a very loose schema enforcement mechanism. It let you add attributes to an entry that weren’t allowed by any of the entry’s object classes, or that weren’t in the schema at all. It also allowed violations of defined attribute syntax (syntax enforcement was not even an option), like non-numeric character data in what are supposed to be numeric-only attributes. The violations allowed include adding attributes without any value at all. In later versions schema checking was turned on by default, but syntax checking was still turned off (probably because of issues replicating between older and newer directory server releases).

Note: there is a way to turn off schema and syntax checking in later Netscape-family directories. It is not recommended, but you can do it by changing “nsslapd-schemacheck” and “nsslapd-syntaxcheck” in your dse.ldif from “on” to “off” (“on” is the default for both in 389/Red Hat directories; Sun/Oracle DSEE leaves syntax checking turned off by default):

nsslapd-schemacheck: off
nsslapd-syntaxcheck: off

Keep in mind that this change should be made after bringing the directory service down (otherwise the automatic backup of the config will overwrite it). It takes effect when you bring the service back up. Again, I don’t recommend this, but realistically it may be necessary when you’ve got a ton of data and not much time to get it loaded. If you do wind up performing a data load like this, I strongly suggest you turn schema and syntax checking back on when you’re done. You will then run into errors when you try to edit the offending entries, but cleaning those up is usually a Day 2 requirement. If you’re running in a mixed environment I’d at least turn schema checking back on.
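Before restarting, it can help to confirm what the config file actually says. Here is a minimal sketch of that check. It runs against a hypothetical stand-in file created in a temp directory, because the real location of dse.ldif varies by product and instance; the sample contents and the variable names are my own assumptions.

```shell
# Hypothetical stand-in for an instance's dse.ldif; point this at your real file instead.
workdir=$(mktemp -d)
printf '%s\n' \
  'dn: cn=config' \
  'objectClass: top' \
  'nsslapd-schemacheck: off' \
  'nsslapd-syntaxcheck: off' > "$workdir/dse.ldif"

# Show both settings with line numbers (-i because attribute names are case-insensitive).
grep -inE '^nsslapd-(schema|syntax)check:' "$workdir/dse.ldif"

# Collect any check that is still switched off; "|| true" keeps a no-match from aborting under set -e.
still_off=$(grep -iE '^nsslapd-(schema|syntax)check:[[:space:]]*off' "$workdir/dse.ldif" || true)
if [ -n "$still_off" ]; then
  echo "Checks still disabled:"
  echo "$still_off"
fi
```

Running the same two greps after the data load is a cheap way to make sure both checks really got turned back on.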

Here are some grep and Perl regex operations that can help when you’re trying to move entries from a server that has this kind of non-compliant data to one that enforces schema rules (e.g. both OpenLDAP and 389 Directory strictly enforce schema rules even when schema checking is turned off).

In the examples below I’ve dumped my data to an LDIF file.

Here’s a grep to reveal all attributes without a value:

grep ':$' file.ldif
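If you want to try the pattern before pointing it at a real dump, here is a small sketch that runs it against a made-up LDIF file. The sample entry, the file name and the variable names are all hypothetical. This variant also allows trailing whitespace after the colon, which is my assumption about how some dumps write empty values, and it prints line numbers so you can find the offenders.

```shell
# Build a tiny sample LDIF (hypothetical data); printf keeps the trailing space intact.
workdir=$(mktemp -d)
printf '%s\n' \
  'dn: uid=jdoe,ou=East,ou=People,dc=example,dc=com' \
  'objectClass: top' \
  'uid: jdoe' \
  'cn: John Doe' \
  'description:' \
  'telephoneNumber: ' \
  'mail: jdoe@example.com' > "$workdir/sample.ldif"

# -n prints line numbers; [[:space:]]* also catches whitespace after the colon.
grep -n ':[[:space:]]*$' "$workdir/sample.ldif"

# -c counts matching lines instead of printing them.
empty_count=$(grep -c ':[[:space:]]*$' "$workdir/sample.ldif" || true)
echo "empty attribute lines: $empty_count"
```

One caveat: any line that merely ends in a colon matches too, including a legitimate value that happens to end with “:”, so eyeball the hits before deleting anything.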

Here’s a perl one-liner to find and delete every line containing an “empty attribute”:

perl -pi -e 's/^.+:\n//' file.ldif

Here is a perl script that can be used to accomplish this as well. It generates an LDIF file that can be executed against a running directory using ldapmodify.

#!/usr/bin/perl -w
# cleannullattrs.pl Clean out attributes with null values
use strict;
use Net::LDAP;
use Net::LDAP::Entry;

my $HOME = $ENV{'HOME'};
# admin.conf is expected to set these three package variables
our ($dirHost, $dirUsr, $dirPass);
require "$HOME/etc/admin.conf";

my @containers = (
   "ou=East,ou=People,dc=example,dc=com",
   "ou=West,ou=People,dc=example,dc=com",
   "ou=Europe,ou=People,dc=example,dc=com",
   "ou=Asia,ou=People,dc=example,dc=com",
   "ou=Customers,ou=People,dc=example,dc=com"
);
my $query = "(objectclass=excoorgperson)";
my $outfile = "$HOME/data/admin/cleannullattrs.ldif";
my $errfile = "$HOME/data/logs/cleannullattrs.log";

# new() returns undef on failure; bind() always returns a message object,
# so its result code has to be checked explicitly
my $ldap = Net::LDAP->new($dirHost, version => 3)
   or die "Cannot connect to $dirHost: $@";
my $mesg = $ldap->bind($dirUsr, password => $dirPass);
die "Bind failed: " . $mesg->error . "\n" if $mesg->code;

open LOGZ, ">", $errfile or die "Cannot open $errfile: $!";
open FH, ">", $outfile or die "Cannot open $outfile: $!";

my $time = localtime();
print LOGZ "$time\tBegin cleanup of attributes with null values\n";

foreach my $container (@containers) {

   $mesg = $ldap->search(
      base   => $container,
      scope  => 'sub',
      filter => $query,
   );
   if ($mesg->code) {
      print LOGZ "\tSearch of $container failed: " . $mesg->error . "\n";
      next;
   }

   while (my $entry = $mesg->shift_entry()) {
      my $dn = $entry->dn;
      my $uid = $entry->get_value('uid');
      $uid = '(no uid)' unless defined $uid && length $uid;

      foreach my $attrname ($entry->attributes()) {
         # Treat the attribute as null only if none of its values
         # contains a letter or a digit
         my @values = $entry->get_value($attrname);
         next if grep { defined($_) && /[A-Z0-9]/i } @values;

         if (lc($attrname) eq 'cn' || lc($attrname) eq 'uid') {
            print "\t$uid had null $attrname, entry will be deleted\n";
            print LOGZ "\t$uid had null $attrname, entry will be deleted\n";
            print FH "dn: $dn\n";
            print FH "changetype: delete\n";
            print FH "\n";
            last;   # the entry is going away, so skip its remaining attributes
         }
         else {
            print "\t$uid had null $attrname, attr will be deleted\n";
            print LOGZ "\t$uid had null $attrname, attr will be deleted\n";
            print FH "dn: $dn\n";
            print FH "changetype: modify\n";
            print FH "delete: $attrname\n";
            print FH "\n";
         }
      }
   }
}

$time = localtime();
print "$time\tEnd cleanup process\n";
print LOGZ "$time\tEnd cleanup process\n";

$ldap->unbind;
close FH;
close LOGZ;

__END__
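
Before feeding the generated file to ldapmodify it is worth a quick look at how many entries are about to be deleted outright. The sketch below does that count against a made-up sample of the script’s output; the sample records, file location and variable names are hypothetical. The ldapmodify line at the end is left as a comment because it assumes the OpenLDAP client tools and a placeholder host and bind DN, none of which come from the script above.

```shell
# Hypothetical sample of what cleannullattrs.ldif might contain.
workdir=$(mktemp -d)
printf '%s\n' \
  'dn: uid=jdoe,ou=East,ou=People,dc=example,dc=com' \
  'changetype: modify' \
  'delete: description' \
  '' \
  'dn: uid=asmith,ou=West,ou=People,dc=example,dc=com' \
  'changetype: delete' \
  '' \
  'dn: uid=bwong,ou=Asia,ou=People,dc=example,dc=com' \
  'changetype: modify' \
  'delete: telephoneNumber' \
  '' > "$workdir/cleannullattrs.ldif"

# Count whole-entry deletes versus single-attribute deletes.
entry_deletes=$(grep -c '^changetype: delete$' "$workdir/cleannullattrs.ldif" || true)
attr_deletes=$(grep -c '^changetype: modify$' "$workdir/cleannullattrs.ldif" || true)
echo "entries to delete: $entry_deletes, attributes to delete: $attr_deletes"

# List the DNs that will be removed entirely so they can be reviewed by hand.
grep -B1 '^changetype: delete$' "$workdir/cleannullattrs.ldif" | grep '^dn:' || true

# Once the numbers look sane, apply the file (OpenLDAP client syntax; host and bind DN are placeholders):
# ldapmodify -x -H ldap://ldap.example.com -D "cn=Directory Manager" -W -c -f cleannullattrs.ldif
```

The -c flag tells ldapmodify to keep going after an error, which is handy here because one bad record should not stop the rest of the cleanup.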