Accents have once again crept into my terminal (I still live in Spain and do some work for the Spanish Administration). This time a group of university professors had to create a lot of files and directories for a historical catalog, and even though I remember telling the coordinator not to use accents, spaces or any other funny characters, they did. I should have known better…
The problem, then, was to remove all spaces and non-ASCII characters from directory and file names.
After thinking about it a bit, I came up with the solution below. There may be a simpler way (using mkdir -p), but then you would have to remove a lot of files and carefully check that the copies had succeeded before rm'ing anything… Too much of a mess for me.
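find has a couple of options to restrict the depth of the search: -mindepth and -maxdepth. Depth 0 is the starting point itself (here, the pwd), depth 1 its direct children, and so on. Renaming one level at a time, from the top down, means that a parent directory already has its clean name when its children are listed.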
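To make the depth numbering concrete, here is a small sketch on a throwaway tree. The helper name list_depth and the directory names a/b/c are made up for the illustration; -mindepth and -maxdepth are not in POSIX, but both GNU and BSD find accept them.

```shell
#!/bin/sh
# list_depth is a hypothetical helper: list the directories found at
# exactly depth $2 below the starting directory $1.
list_depth() {
    (cd "$1" && find . -mindepth "$2" -maxdepth "$2" -type d)
}

# Throwaway tree with made-up names: a/b/c under a temporary directory.
tmp=$(mktemp -d) || exit 1
mkdir -p "$tmp/a/b/c"

depth0=$(list_depth "$tmp" 0)
depth1=$(list_depth "$tmp" 1)
depth2=$(list_depth "$tmp" 2)

echo "depth 0: $depth0"   # depth 0: .
echo "depth 1: $depth1"   # depth 1: ./a
echo "depth 2: $depth2"   # depth 2: ./a/b

rm -rf "$tmp"
```

Setting both options to the same number selects exactly one level of the tree.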
#!/bin/sh
# Yes, they USE spaces in filenames...
IFS='
'
# Notice that OS X has no seq command.
# I am sure they have not reached insanity YET, so 10 is a likely bound
for depth in 1 2 3 4 5 6 7 8 9 10 ; do
    # -mindepth/-maxdepth go first: GNU find warns when they follow -type
    find . -mindepth "$depth" -maxdepth "$depth" -type d | while read -r i ; do
        # It is better to know the ALLOWED set of chars, not leaving
        # it to the shell's fancy. Any unallowed item becomes an underscore
        j=$(printf '%s\n' "$i" | sed -e 's/[^a-zA-Z_0-9./]/_/g')
        # only move if source != destination
        if [ "$i" != "$j" ] ; then
            # for logging purposes, one can never be too careful:
            echo moving "$i" TO "$j"
            # this will actually not happen but just in case...
            if [ -e "$j" ] ; then
                echo "COLLISION: last move not done"
            else
                mv "$i" "$j"
            fi
        fi
    done
done
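To see what the substitution does before letting it loose on real data, it can be tried on a sample path. This is only a sketch: sanitize is a made-up helper name and the sample path is invented. Forcing LC_ALL=C is my own assumption, added so that ranges such as a-z cover ASCII letters only; the script above does not set it.

```shell
#!/bin/sh
# sanitize is a hypothetical helper wrapping the sed expression used in
# the script: every character outside the allowed set becomes "_".
# LC_ALL=C is an assumption, not part of the original script.
sanitize() {
    printf '%s\n' "$1" | LC_ALL=C sed -e 's/[^a-zA-Z_0-9./]/_/g'
}

# Invented sample path: spaces, parentheses and an ampersand.
sanitize './Catalog 1850 (draft)/maps & plans'
# prints ./Catalog_1850__draft_/maps___plans

# A name that is already clean comes back unchanged.
sanitize './ok_name.txt'
# prints ./ok_name.txt
```

With accented characters, the number of underscores produced for each one depends on the locale and on how the name is encoded, so it is worth checking the log before trusting the result.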
Two remarks:
- I prefer specifying the whole set of allowed characters because I was going to have to repeat the job in Perl (I had to edit some HTML files pointing to those directories), and the [:alnum:] class etc. would have complicated things because of the differences between Perl's and sh's regexes.
- I was practically certain that there would not be any collisions. In different circumstances, I would have logged a lot more information and prevented collisions using a counter.
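For the record, the counter idea from the second remark could look roughly like this. It is a sketch, not what I ran: unique_name is a made-up helper, and the _1, _2, … suffix scheme is just one possible choice.

```shell
#!/bin/sh
# unique_name is a hypothetical helper: print the first name that does
# not exist yet, trying NAME, then NAME_1, NAME_2, ...
unique_name() {
    candidate=$1
    n=0
    while [ -e "$candidate" ] ; do
        n=$((n + 1))
        candidate="${1}_$n"
    done
    printf '%s\n' "$candidate"
}

# Small demonstration in a throwaway directory (invented names).
tmp=$(mktemp -d) || exit 1
mkdir "$tmp/maps_plans" "$tmp/maps_plans_1"
unique_name "$tmp/maps_plans"   # prints the path ending in maps_plans_2
unique_name "$tmp/free"         # prints the path ending in free (no clash)
rm -rf "$tmp"
```

Inside the loop, the collision branch would then become j=$(unique_name "$j") followed by the mv, with both names written to the log.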
I should point out that the code above originally had:
for i in `find . -type d -mindepth $depth -maxdepth $depth` ; do
instead of the piped while loop you see above. Thanks to Pierre Gaston for his comment.
You reinvent the wheel: apt-get install convmv.
Hi, AC:
reinvented up to a point: I tried convmv on my MBPro and the "Almost all POSIX filesystems do not care about how filenames are encoded" (man convmv) problem crept up… So I ended up writing the above.
However, thanks for the pointer, I ought to have mentioned it.
Pedro.