
In our previous post, we introduced four regular expressions for locating credit card numbers. Today, we’ve got a few more handy regexes for your data classification library. This time we’re targeting legal data.

Find “All Rights Reserved” NOT near your company name

Regular expression (replace acme with your company name):

\b(?!all rights reserved\W+(?:\w+\W+){1,10}?acme)all rights reserved\b

Use case: you want to find files within your organization that you do not own the rights to, and verify that they are being used in accordance with their license.
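If you want to try this pattern before loading it into your classification tool, here’s a minimal Python sketch. The function name `has_foreign_notice` and the sample strings are ours, made up for illustration, and we’ve assumed case-insensitive matching because the pattern itself is written in lowercase.

```python
import re

# "acme" is a stand-in for your own company name, as in the pattern above.
# re.IGNORECASE is our assumption: the pattern is lowercase, while real
# notices are usually written "All Rights Reserved".
PATTERN = re.compile(
    r"\b(?!all rights reserved\W+(?:\w+\W+){1,10}?acme)all rights reserved\b",
    re.IGNORECASE,
)


def has_foreign_notice(text):
    """Return True if text has an 'all rights reserved' notice that is
    not followed, within a few words, by the company name."""
    return PATTERN.search(text) is not None


print(has_foreign_notice("Copyright 2012 Initech. All rights reserved."))  # True
print(has_foreign_notice("All rights reserved by the Acme Corporation."))  # False
print(has_foreign_notice("Copyright Acme Inc. All rights reserved."))      # True
```

The third check is worth a look: the lookahead only inspects the text after the phrase, and it expects at least one word between “reserved” and the company name, so a notice that puts your company name first still gets flagged. Treat hits as candidates for review, not as proof that you don’t own the file.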

Find “attorney” near “client” near “privilege”

Regular expressions:

\battorney\W+(?:\w+\W+){1,10}?client\W+(?:\w+\W+){1,10}?privilege\b
\battorney\W+(?:\w+\W+){1,10}?privilege\W+(?:\w+\W+){1,10}?client\b
\bclient\W+(?:\w+\W+){1,10}?privilege\W+(?:\w+\W+){1,10}?attorney\b
\bclient\W+(?:\w+\W+){1,10}?attorney\W+(?:\w+\W+){1,10}?privilege\b
\bprivilege\W+(?:\w+\W+){1,10}?attorney\W+(?:\w+\W+){1,10}?client\b
\bprivilege\W+(?:\w+\W+){1,10}?client\W+(?:\w+\W+){1,10}?attorney\b

Use case: you want to find files that contain confidential information that should only be shared between an attorney and their client.
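Six hand-written orderings are easy to mistype, so here’s a hedged Python sketch that generates them instead. The helper names `proximity_patterns` and `mentions_privilege`, the `min_gap` and `max_gap` parameters, and the case-insensitive flag are our own assumptions for illustration; with the defaults, the generated patterns are the same six orderings with a gap of 1 to 10 words.

```python
import itertools
import re

TERMS = ("attorney", "client", "privilege")


def proximity_patterns(terms=TERMS, min_gap=1, max_gap=10):
    """Build one compiled regex per ordering of the terms.

    Consecutive terms must be separated by min_gap..max_gap other words.
    """
    gap = r"\W+(?:\w+\W+){%d,%d}?" % (min_gap, max_gap)
    return [
        re.compile(r"\b" + gap.join(order) + r"\b", re.IGNORECASE)
        for order in itertools.permutations(terms)
    ]


def mentions_privilege(text, patterns=None):
    """Return True if any ordering of the terms matches the text."""
    patterns = patterns or proximity_patterns()
    return any(p.search(text) for p in patterns)


print(mentions_privilege("The attorney told her client that the privilege applies."))  # True
print(mentions_privilege("attorney-client privilege"))                                 # False
print(mentions_privilege("attorney-client privilege",
                         proximity_patterns(min_gap=0)))                               # True
```

The last two lines show a design choice you’ll want to make on purpose: with a minimum gap of one word, the terms can’t sit right next to each other, so the bare phrase “attorney-client privilege” is missed. Setting `min_gap=0` catches it, at the cost of a few more hits to review.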

This should get you started, but remember, finding sensitive data is only the first step. In the “All Rights Reserved” example, once you find these files you need to interview the people who are using them to figure out whether you’re compliant. This can be quite a project if you don’t have an audit trail that can help you find the data owner. In the attorney-client privilege example, the next step would be to ensure that only the right people have access to the data. How do you know who the right people are? Your best bet is to ask the data owner.

Hmm, I’m sensing a pattern here.

 
