Blog Archives

#Linux : extract substring in #Bash

One of the most important tasks in dealing with data is string manipulation. We already saw how to use awk and grep to efficiently sift through text files using command line tools instead of developing ad-hoc code. To step it up a notch, we can also do some heavier preprocessing of the data, such as selecting only the subset of information that matches a particular pattern, to ensure data coming out of our pipeline is of good quality.

In this case, we use a Bash feature called parameter expansion. Let’s assume we have the text data in a variable TEXT_LINE and an expression pattern to match (in file-name matching format), this is a summary of the possible expansion:

  • Delete shortest match of pattern from the beginning
    ${TEXT_LINE#pattern}
  • Delete longest match of pattern from the beginning
    ${TEXT_LINE##pattern}
  • Delete shortest match of pattern from the end
    ${TEXT_LINE%pattern}
  • Delete longest match of pattern from the end
    ${TEXT_LINE%%pattern}
  • Get substrings based on position using numbers
    ${TEXT_LINE:START:END}
  • Replace particular strings or patterns
    ${TEXT_LINE/pattern/replace}

So for example, to extract only the file name without the extension:

${TEXT_LINE%.*}

or to extract user name from an email:

${TEXT_LINE%%@*.*}

or extract the file name from an absolute path:

${TEXT_LINE##*/}

NOTE: You can’t combine two operations, instead you have to assign to an intermediate variable.

#Linux : How to Slice an Array in #Bash

A lot of things can be done using just the command line.
In fact, Bash shell scripting language is a touring-complete language, so anything can be done!

One important feature is the ability to slice an array (i.e. select a contiguous subset of elements of a collection).
So let’s say for example we stored the list of installed packages into a variable PACKAGE_LIST as a bash array:

#PACKAGE_LIST=(`dpkg -l | awk '{print $2}'`)

and for some reason we want to select elements from 4 to 10:

#PACKAGE_LIST=${PACKAGE_LIST[@]:3:10}

Let me explain. Here, we are using Bash parameter expansion:

  • The [@] following the array name returns the whole content of the array.
  • The :X:Y part is doing the slicing by taking a slice of length Y starting at position X. Note that if X is negative, that is we start at X elements from the end, we must put a space between the colon and the number.

#VMware Fusion: Script to Easily Install VMware Tools [OUTDATED]

If you run a linux guest VM, every time you update the kernel you need to reinstall VMwareTools for optimal performances.
After selecting Virtual Machine > Install VMware Tools you need to untar the archive and then run a script that ask you many question, etc.
This can be very tedious, so this is a little script that minimize typing:

#!/bin/bash
tar xzf /media/VMware\ Tools/VMwareTools-*.tar.gz -C /tmp
umount /media/VMware\ Tools
sudo /tmp/vmware-tools-distrib/vmware-install.pl -d
mkdir -pv ~/Desktop/VMwareShared
rm -v ~/Desktop/VMwareShared/*
if [ -d /mnt/hgfs ]
then
ln -sv /mnt/hgfs/* ~/Desktop/VMwareShared/
fi
vmware-user

VMware now recommends to use the open-vm-tools-desktop provided by the Linux distribution of your choice.

NOTE: -d option implies default answers to install script (most of the time they are ok)
NOTE2: the script create a directory on Desktop with all directories shared by the host system with the VM
NOTE3: this script has been tested only on Ubuntu 12.04 LTS
NOTE4: this script install native VMware Tools, if you want you can install open tools instead, but you can’t install both at the same time!