Use of find with regular expressions

I needed a clean: rule in a makefile that would wipe out, among other things, all the auxiliary files generated by LaTeX. The project also has subdirectories that may contain more of those files.
Apart from finding files by name, size, modification time and so on, the find utility can match paths against regular expressions. I will not go into the details here (which flavours of expression it accepts, etc.); I just want to point out two things:

  • That it comes in quite handy.
  • That the regex must match the complete path as reported by find, not just a part of it.

The second item is worth noticing. If you need to find, say, all the files under the current directory with an x in their name, you cannot use

$ find . -regex 'x'

which, if it were a Perl regex, would match anything containing an x (assuming the quotation marks stand for the regex delimiters). With find, that pattern only matches a path that is exactly ‘x’. What you need is to match the whole string, taking into account that every path find reports when started from . begins with ‘./’. Hence, ‘find the files under this directory with an x in their name’ is written

$ find . -regex '\./.*x.*'

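which means exactly: find all the files matching ‘start with dot-slash, then anything, an x and anything’, where anything can be of length 0. Since the regex is applied to the whole path, the x may sit anywhere in it, so a file inside a directory whose name contains an x matches too.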
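
To see the whole-path rule in action, here is a small sketch that builds a throwaway directory and runs both expressions. The scratch tree and the file names (tax.md, notes.md, sub/box.log) are made up for illustration; the only assumption about the system is a find that supports -regex, as GNU and BSD find do.

```shell
#!/bin/sh
# Build a scratch tree; every name in it is hypothetical.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/tax.md" "$demo/notes.md" "$demo/sub/box.log"

cd "$demo" || exit 1

# A bare 'x' has to match the entire path, so only a path that is
# literally "x" would qualify; nothing in this tree does.
bare=$(find . -regex 'x')

# Anchoring on './' and padding with '.*' matches any path containing an x.
padded=$(find . -regex '\./.*x.*' | LC_ALL=C sort)

echo "bare:   [$bare]"
echo "padded:"
echo "$padded"
```

The first search comes back empty, while the second lists ./sub/box.log and ./tax.md and leaves ./notes.md out.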

The clean: rule in my makefile now reads:

clean:
	find -E . -regex "\./.*(\.(aux|log|blg|bbl)|~)" -exec rm -f '{}' \;

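The -E option, available in BSD find (macOS, FreeBSD), means ‘use extended (modern) regular expressions’, the flavour most people use nowadays; it is what lets the ( | ) alternation work without backslashes. I prefer writing ‘\.’ for “real dots”: it looks cleaner to me, and an unescaped . matches any character, so ‘.log’ would also match the end of a name such as catalog.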
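
GNU find has no -E; there the same flavour is selected with -regextype posix-extended. The following is only a sketch, assuming GNU findutils is installed. The scratch tree and file names are invented for the demonstration, the pattern escapes the dot before the extension so that a file such as catalog is not caught, and -print stands in for the rm so that nothing is deleted.

```shell
#!/bin/sh
# Assumes GNU find; the scratch tree and file names are hypothetical.
demo=$(mktemp -d)
mkdir "$demo/chapters"
touch "$demo/paper.tex" "$demo/paper.aux" "$demo/paper.log" \
      "$demo/paper.tex~" "$demo/catalog" "$demo/chapters/intro.aux"

cd "$demo" || exit 1

# Match paths ending in .aux, .log, .blg or .bbl, or ending in a ~.
# -type f restricts the search to regular files; -print only lists them.
matches=$(find . -regextype posix-extended -type f \
    -regex '\./.*(\.(aux|log|blg|bbl)|~)' -print | LC_ALL=C sort)

echo "$matches"
```

Once the list looks right, replacing -print with -exec rm -f '{}' \; gives the destructive version. Listing first is a cheap safeguard for a rule that recurses into subdirectories.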

It has been quite a while, has it not?
