The Archives

  • 12.Nov.11
    Unicode sorting in perl shell | pfortuny | (0)
    Just normalize. Assume file.txt is a list of Unicode words, one per line, encoded in UTF-8. Then cat file.txt | perl -CS -e 'use Unicode::Normalize; my @w; while (<>) { chomp; push @w, $_; } @w = sort { NFD($a) cmp NFD($b) } @w; print(join("\n", @w), "\n")' will output the sorted list (well, sorted according to the NFD normalization, which for Spanish is enough). The -CS switch makes perl decode stdin and encode stdout as UTF-8, so NFD sees characters rather than raw bytes.