String manipulation in Bash

I rely on Bash for much of my scripting, although it is usually not worth the effort when you need to do something complicated. However, we tend to think Bash's limits are lower than they really are.

In this post I am going to walk through the surprising number of string manipulation operations that Bash supports.

String Length

The syntax to get the length of a string is:

${#string}

For example:

$ string=abcdefghijklmnopqrstuvwxyz
$ echo ${#string}
26
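
A typical use of the length operator is input validation. Here is a minimal sketch; the `password` value and the `min_length` threshold are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical sample values, chosen only for this example.
password="s3cret"
min_length=8

# ${#password} expands to the number of characters in the variable.
if (( ${#password} < min_length )); then
    verdict="too short"
else
    verdict="ok"
fi

echo "${#password} characters: $verdict"
```

With these values the script reports 6 characters and the verdict "too short".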

Substring Extraction

${string:position}

Extracts the substring of $string that starts at $position.

$ string=abcdefghijklmnopqrstuvwxyz
$ echo ${string:1}
bcdefghijklmnopqrstuvwxyz

Note that string indices are zero-based.

To extract a substring of a specific length, use:

${string:position:length}

Extracts $length characters from $string, starting at $position.

For example:

$ echo ${string:5:3}
fgh

It is also possible to extract from the end of the string using a negative index value, but there is a caveat. This does not print the last 3 characters of the string; it prints the whole string:

$ echo ${string:-3}
abcdefghijklmnopqrstuvwxyz

This is because the following syntax:

${parameter:-default}

substitutes a default value when the parameter is unset or empty, without assigning anything to it. So in our case ${string:-3} simply expands to the value of string, because string has already been set (if it hadn't been set, the expansion would have produced 3).

To avoid this behaviour, wrap the negative position in parentheses or put a space before it, so that Bash no longer reads :- as the default-value operator.

$ echo ${string:(-3)}
xyz
$ echo ${string: -3}
xyz
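
To see the two behaviours side by side, here is a small sketch. The variable `missing` is a hypothetical name that is deliberately left unset; the other variable names are also just for illustration:

```shell
#!/usr/bin/env bash
string=abcdefghijklmnopqrstuvwxyz
unset missing                    # hypothetical variable, deliberately unset

fallback=${missing:-3}           # no space: default-value expansion, gives 3
unchanged=${string:-3}           # string is set, so the default is ignored
last_three=${string: -3}         # space before the minus: last three characters
two_of_five=${string: -5:2}      # negative position combined with a length

echo "$fallback"
echo "$unchanged"
echo "$last_three"
echo "$two_of_five"
```

The last line prints vw: the position counts five characters back from the end (vwxyz) and the length keeps the first two of them.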

The position and length arguments can be parameterized, that is, given as variables rather than as numeric constants.
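
As a quick sketch of what that looks like (the names `position` and `length` and their values are only illustrative), both arguments are evaluated as arithmetic expressions, so plain variable names and small calculations work as well as `$`-prefixed variables:

```shell
#!/usr/bin/env bash
string=abcdefghijklmnopqrstuvwxyz
position=5    # illustrative values
length=3

with_dollar=${string:$position:$length}      # explicit variable expansion
bare_names=${string:position:length}         # bare names are evaluated arithmetically
computed=${string:position+1:length*2}       # position 6, length 6

echo "$with_dollar"   # fgh
echo "$bare_names"    # fgh
echo "$computed"      # ghijkl
```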

Substring Removal

${string#substring}

Deletes the shortest match of $substring from the front of $string. Note that $substring is a glob pattern, so wildcards such as * and ? are allowed.

${string##substring}

Deletes the longest match of $substring from the front of $string.

For example:

$ string=abcABC123ABCdef
$ echo ${string#a*C}
123ABCdef
$ echo ${string##a*C}
def

To better understand the shortest and longest matches:

string=abcABC123ABCdef
#      |----|       shortest
#      |----------| longest
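
A common practical use of front removal is picking apart file paths. This is only a sketch, and the path below is a hypothetical example:

```shell
#!/usr/bin/env bash
path=/home/user/archive.tar.gz      # hypothetical path

filename=${path##*/}                # longest match of */ removed: archive.tar.gz
extension=${filename##*.}           # longest match of *. removed: gz
after_first_dot=${filename#*.}      # shortest match of *. removed: tar.gz
relative=${path#/}                  # only the leading slash removed

echo "$filename"
echo "$extension"
echo "$after_first_dot"
echo "$relative"
```

The difference between # and ## shows up with the double extension: the shortest match stops at the first dot, the longest one at the last dot.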

Substrings can also be deleted from the back of the string.

${string%substring}

Deletes the shortest match of $substring from the back of $string.

${string%%substring}

Deletes the longest match of $substring from the back of $string.

$ string=abcABC123ABC
$ echo ${string%B*C}
abcABC123A
$ echo ${string%%B*C}
abcA

That means:

string=abcABC123ABC
#                || shortest
#          |------| longest
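
Back removal is handy for getting the directory of a path or stripping file extensions. A minimal sketch, again with a hypothetical path:

```shell
#!/usr/bin/env bash
path=/home/user/archive.tar.gz      # hypothetical path

directory=${path%/*}                # shortest match of /* removed from the back
filename=${path##*/}                # front removal, to isolate the file name
stem=${filename%.*}                 # shortest match of .* removed: archive.tar
base=${filename%%.*}                # longest match of .* removed: archive

echo "$directory"
echo "$stem"
echo "$base"
```

For simple cases this gives much the same result as the dirname and basename commands, without starting an external process.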

Substring Replacement

${string/substring/replacement}

Replaces the first match of $substring with $replacement.

${string//substring/replacement}

Replaces all matches of $substring with $replacement.

$substring and $replacement may be either literal strings or variables.

$ string=abcABC123ABCabc
$ match=abc
$ new=xyz
$ echo ${string/$match/$new}
xyzABC123ABCabc
$ echo ${string//$match/$new}
xyzABC123ABCxyz
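
Two details are worth a quick sketch: an empty replacement deletes the matches, and an unquoted variable in the pattern position is treated as a glob. The variable names and values below are made up for illustration:

```shell
#!/usr/bin/env bash
title="my holiday photos 2024"      # hypothetical sample text

slug=${title// /_}                  # every space replaced
first_only=${title/ /_}             # only the first space replaced
no_spaces=${title// /}              # empty replacement deletes the matches

expr="2*3"
star='*'
as_glob=${expr/$star/x}             # unquoted: * matches the whole string
as_literal=${expr/"$star"/x}        # quoted: * is matched literally

echo "$slug"          # my_holiday_photos_2024
echo "$first_only"    # my_holiday photos 2024
echo "$no_spaces"     # myholidayphotos2024
echo "$as_glob"       # x
echo "$as_literal"    # 2x3
```

So when the text to replace comes from a variable and may contain characters like * or ?, quoting it inside the expansion is the safer choice.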

We can also anchor the replacement to the front or the back of the string using # and %.

${string/#substring/replacement}

If $substring matches the front end of $string, substitutes $replacement for $substring.

${string/%substring/replacement}

If $substring matches the back end of $string, substitutes $replacement for $substring.

$ string=abcABC123ABCabc
$ match=abc
$ new=xyz
$ echo ${string/#$match/$new}
xyzABC123ABCabc
$ echo ${string/%$match/$new}
abcABC123ABCxyz
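
Anchored replacement is useful for things like swapping a file extension or a URL scheme. A small sketch with hypothetical values:

```shell
#!/usr/bin/env bash
file=report.txt                     # hypothetical file name
url=http://example.com              # hypothetical URL

renamed=${file/%.txt/.md}           # .txt only matches at the end
secure=${url/#http:/https:}         # http: only matches at the start
untouched=${file/#.txt/.md}         # .txt is not at the start, so nothing changes

echo "$renamed"      # report.md
echo "$secure"       # https://example.com
echo "$untouched"    # report.txt
```

Because the match is anchored, a string that merely contains the pattern somewhere in the middle is left alone.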

Of course, if you need to do something more complicated and you still want to solve it with a shell script you can always use sed or awk :)
