m.(A.TT)/er
Learning By Way Of Ignorance
Monday 01/01/18 02:36:02 PM

Lately I've stumbled upon a variety of tools that are included in most Linux distributions that I have never needed, was never made aware of or just never found.

One example would be comm :

$ whatis comm
comm                 (1)  - compare two sorted files line by line
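
As a quick illustration of what comm does, here is a minimal sketch. The file names and contents are made up for the example; the only real requirement is that both inputs are already sorted.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; comm expects both files to be sorted.
WorkDir=$(mktemp -d)
printf '%s\n' apple banana cherry > "${WorkDir}/left"
printf '%s\n' banana cherry date  > "${WorkDir}/right"

# -1 and -2 suppress the lines unique to the first and second file,
# leaving only the lines common to both.
Common=$(comm -12 "${WorkDir}/left" "${WorkDir}/right")
# -2 and -3 leave only the lines unique to the first file.
OnlyLeft=$(comm -23 "${WorkDir}/left" "${WorkDir}/right")

echo "common: ${Common}"
echo "only in left: ${OnlyLeft}"
rm -rf "${WorkDir}"
```

With no options at all, comm prints three columns: lines only in the first file, lines only in the second, and lines in both.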

Another [embarrassing] example is scp's ability to copy files/directories from one remote machine to another remote machine - in one step - without the files/directories ever being copied locally:

$ ssh host2 'ls -la /tmp/foolish'
total 8
drwxr-xr-x 2 mm   mm   4096 Jul  3 12:22 .
drwxrwxrwt 9 root root 4096 Jul  3 12:23 ..

$ scp host1:/etc/hosts host2:/tmp/foolish/

$ ssh host2 'ls -la /tmp/foolish'
total 12
drwxr-xr-x 2 mm   mm   4096 Jul  3 12:24 .
drwxrwxrwt 9 root root 4096 Jul  3 12:24 ..
-rw-r--r-- 1 mm   mm    221 Jul  3 12:24 hosts

Lacking a personal life and always seeking to incorporate new tools, I've developed a serious jones - a habit if you will - of listing packages and looking up binaries and scripts that I don't recognize (I have a crush on all packages with 'utils' in the name). I would also stumble on GNU gold while hanging out in /usr/bin or /usr/sbin. This would always result in me wanting to find which packages my new friends were members of and immediately sift through the contents.

Considering most of the hosts that I work...err...hang out with are based on Red Hat Enterprise Linux (or RHEL in ComputerNerdish), I would search through RPM package listings. In the past I have worked extensively with RPMs - I've managed hosts and their contents with rpm - I've managed Apt-RPM and Yum package repositories - I've built and currently build rpms. In fact, I recently finished an automated package build and distribution solution that works with Subversion and will be releasing it to the wild very soon. In spite of all this *braggadocio* I was under the impression that the only way to query which rpm a file or directory belonged to was by running rpm -qf /path/to/file and the only way to list the contents of that package was to run rpm -ql packagename. That is, until I incorporated some slick functions into my .bashrc that replaced the two step operation of looking up and listing packages and found myself listing the contents of the package rpm:

$ wrl rpm

Listing Contents of rpm-4.4.2.3-18.el5:

/bin/rpm
/etc/cron.daily/rpm
/etc/logrotate.d/rpm
/etc/rpm
/usr/bin/gendiff
/usr/bin/rpm2cpio
/usr/bin/rpmdb
/usr/bin/rpmquery
...

Doh! I didn't even have to look at rpmquery's usage and I knew all the work I had just finished was sorta Elisha Gray'd (except I didn't come to timely innovation). Nevertheless it was mildly fun and educational to reinvent the wheel (some of rpmquery's functionality) and so I thought I would share what I came up with.

Shell Functions As A Tool Test Drive

I spend a lot of time writing tools to automate or simplify stuff I make my minions (computers) do. I've found that I can non-intrusively test the functionality of a potential tool by incorporating it into my environment via functions in my .bashrc.

Let's look at the first two functions, which together look up the package name for any file, whether it is in my PATH or specified by full path.

From $HOME/.bashrc:

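function Validate {
    # A directory is accepted as-is.
    [ -d "$1" ] && Results="$1" && return 0
    # Declare separately so local does not mask the exit status of type.
    local ItsType
    ItsType=$(type -t "$1") || return 1
    # Only a real file on disk passes; aliases, keywords, functions and builtins fail.
    [[ "${ItsType}" == file ]] && Results=$(type -P "$1") && return 0
    return 1
}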

function Which_RPM {
    Validate "$1" && Target=${Results} || return 1
    Validate rpm  && Rpm=${Results}    || return 1
    # Ask rpm which package owns the validated file or directory.
    "${Rpm}" -qf "${Target}"
}

Validate

The first function is used to verify the existence, path and type of any non-builtin executables. This is used primarily for portability and secondarily for security. It serves as a functional resource (either directly or indirectly) for the rest of the functions covered here.
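
To make the type check concrete, here is a small sketch of how Bash's type -t classifies different kinds of names, which is what Validate relies on. The function name Greet is made up for the illustration, and the sketch assumes sort is installed somewhere in the PATH.

```shell
#!/usr/bin/env bash
# Greet is a hypothetical function, defined only for this demonstration.
function Greet {
    echo "hello"
}

# type -t prints a single word describing how Bash would interpret a name.
TypeOfFunction=$(type -t Greet)   # a shell function
TypeOfBuiltin=$(type -t cd)       # a shell builtin
TypeOfKeyword=$(type -t if)       # a reserved word
TypeOfFile=$(type -t sort)        # an executable found on disk via PATH

echo "Greet: ${TypeOfFunction}"
echo "cd:    ${TypeOfBuiltin}"
echo "if:    ${TypeOfKeyword}"
echo "sort:  ${TypeOfFile}"

# type -P forces a PATH search and prints the location of the file.
PathOfSort=$(type -P sort)
echo "sort lives at ${PathOfSort}"
```

Only the last kind of name, a file, gets a path back from type -P, which is why Validate treats everything else as a failure.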

Which_RPM

Unlike the first, the second function accomplishes one of the tasks that I set out to achieve; namely to return the package name for any base (i.e. not the full path) file name that happens to be in my $PATH. It will also return the package name for any file or directory that is specified by passing the full path on the command line. The first two operations utilize Validate to check the location of the argument passed on the command line and to check the location of the rpm executable. The last line combines the two to get the desired result.

One thing I want to point out is my naming convention for variables and functions.

Variables : Clear, purpose named, CamelCase styled, curly brace bound

Validated Commands : Variable name matches the command whose path it stores, with the first letter capitalized, curly brace bound

Functions : Clear, purpose named, CamelCase like, with underscored word separation

All caps variables reduce readability and therefore I avoid them. The use of underscores has been something I avoided in the past - they look terrible with lowercase and they waste keystrokes (this is solved later on). However I have found that capitalized, underscore separated names look pretty good. Finally, the underscores aid in distinguishing the quasi namespace hierarchy which nicely separates functions from other stuff (like variables born of my crazy variable naming convention).

$ Which_RPM monit
monit-5.1.1-1.el5.rf

$ Which_RPM /etc/postfix        
postfix-2.3.3-2.1.el5_2

The next functional goal is to be able to list the contents of the packages that Which_RPM returns.

Which_RPM_List

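function Which_RPM_List {
    Validate rpm && Rpm=${Results} || return 1
    # Iterate over the arguments one at a time; "$@" keeps each one intact.
    for SomeFiles in "$@"
    do
        # Look up the package for this one argument, skipping it on failure.
        WhichRPMResult=$(Which_RPM "${SomeFiles}") || continue
        echo -e "\nListing Contents of ${WhichRPMResult}:\n"
        "${Rpm}" -ql "${WhichRPMResult}"
    done
}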

The first thing to note is that although Which_RPM_List calls Which_RPM we still have to Validate the rpm executable. Bash has no mechanism or concept of inheritance, and because Which_RPM is called through command substitution it runs in a subshell, so the Rpm variable it sets never makes it back to the calling function. Normally the right solution to this problem is to declare a global variable for anything that is used or shared by more than one function. But I wanted to see if I could accomplish everything in referenceable, modular units. Also notice that the name of the function follows the standard of establishing a namespace. Let's see this function in action:

$ Which_RPM_List monit

Listing Contents of monit-5.1.1-1.el5.rf:

/etc/monit.conf
/etc/monit.d
/etc/rc.d/init.d/monit
/usr/bin/monit
/usr/share/doc/monit-5.1.1
/usr/share/doc/monit-5.1.1/CHANGES.txt
/usr/share/doc/monit-5.1.1/COPYING
/usr/share/doc/monit-5.1.1/LICENSE
/usr/share/doc/monit-5.1.1/README
/usr/share/doc/monit-5.1.1/README.DEVELOPER
/usr/share/doc/monit-5.1.1/README.SSL
/usr/share/man/man1/monit.1.gz
/var/lib/monit
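
Coming back to why Which_RPM_List has to Validate rpm for itself, here is a minimal sketch of the underlying behaviour, using made-up names (Set_Tool, Tool). A variable assigned inside a function is visible to the caller when the function is called directly, but not when it is called through command substitution, because $(...) runs in a subshell.

```shell
#!/usr/bin/env bash
# Set_Tool and Tool are hypothetical names used only for this sketch.
function Set_Tool {
    Tool="/bin/example"
    echo "done"
}

# Called through command substitution: the assignment happens in a subshell
# and is lost when the subshell exits.
Tool=""
Output=$(Set_Tool)
AfterSubstitution="${Tool}"

# Called directly: the assignment lands in the current shell.
Set_Tool > /dev/null
AfterDirectCall="${Tool}"

echo "after \$(Set_Tool): '${AfterSubstitution}'"
echo "after Set_Tool:    '${AfterDirectCall}'"
```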

At this point, I was fairly happy but I immediately realized that Which_RPM will not deal with more than one argument passed to it. Sure I could go back and change it so that it can deal with multiple files or dirs, returning multiple package names, but I won't. Out of laziness and curiosity I decided to create another function that will use Which_RPM to do its thing on an entire iterated list. Here's what we've got:

Which_RPM_Plural

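function Which_RPM_Plural {
    Validate basename && Basename=${Results} || return 1
    for SomeFiles in "$@"
    do
        # Skip anything that fails validation or that no package owns.
        WhichRPMResult=$(Which_RPM "${SomeFiles}") || continue
        # Use the validated basename rather than whatever the shell resolves.
        JustBase=$("${Basename}" "${SomeFiles}")
        echo "${JustBase} :: ${WhichRPMResult}"
    done
}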

Seems pretty straightforward, even if it is not as efficient as it could be. One of the things I wanted to make sure it could do was deal with everything in a directory being passed as well as Bash brace expandable paths:

$ Which_RPM_Plural /etc/profile.d/*
bash_completion.sh :: bash-completion-20060301-1.el5.rf
colorls.csh :: coreutils-5.97-23.el5_4.2
colorls.sh :: coreutils-5.97-23.el5_4.2
glib2.csh :: glib2-2.12.3-4.el5_3.1
glib2.sh :: glib2-2.12.3-4.el5_3.1
lang.csh :: initscripts-8.45.30-2.el5.centos
lang.sh :: initscripts-8.45.30-2.el5.centos
less.csh :: less-436-2.el5
less.sh :: less-436-2.el5
vim.csh :: vim-enhanced-7.0.109-6.el5
vim.sh :: vim-enhanced-7.0.109-6.el5
which-2.sh :: which-2.16-7

$ Which_RPM_Plural /sbin/{fsck,service,chkconfig}
fsck :: e2fsprogs-1.39-23.el5
service :: initscripts-8.45.30-2.el5.centos
chkconfig :: chkconfig-1.3.30.2-2.el5

Yeah, this is nice, except it's butt ugly. Never fear, another function is here:

function Which_RPM_Plural_Formatted {
    Validate column && Column=${Results} || return 1
    # column -t lines the fields up into a table.
    Which_RPM_Plural "$@" | "${Column}" -t
}

And tonight's winning lottery numbers are:

Which_RPM_Plural_Formatted

$ Which_RPM_Plural_Formatted /etc/profile.d/*
bash_completion.sh  ::  bash-completion-20060301-1.el5.rf
colorls.csh         ::  coreutils-5.97-23.el5_4.2
colorls.sh          ::  coreutils-5.97-23.el5_4.2
glib2.csh           ::  glib2-2.12.3-4.el5_3.1
glib2.sh            ::  glib2-2.12.3-4.el5_3.1
lang.csh            ::  initscripts-8.45.30-2.el5.centos
lang.sh             ::  initscripts-8.45.30-2.el5.centos
less.csh            ::  less-436-2.el5
less.sh             ::  less-436-2.el5
vim.csh             ::  vim-enhanced-7.0.109-6.el5
vim.sh              ::  vim-enhanced-7.0.109-6.el5
which-2.sh          ::  which-2.16-7

$ Which_RPM_Plural_Formatted /sbin/{fsck,service,chkconfig}
fsck       ::  e2fsprogs-1.39-23.el5
service    ::  initscripts-8.45.30-2.el5.centos
chkconfig  ::  chkconfig-1.3.30.2-2.el5

At this point I should be happy, but I am far from it. Why? Well, while I like the long, quasi namespaced, formatted function names, they make for terrible command line usage (terrible to type). We could set some aliases to shorter names OR we can bust out some more functions. YEAH! (obviously I am intentionally getting carried away):

function wr {
    Which_RPM "$@"
}

function wrl {
    Which_RPM_List "$@"
}

function wrpf {
    Which_RPM_Plural_Formatted "$@"
}
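
A short sketch of why these wrappers pass "$@" in quotes: it hands every argument through exactly as it was received, while an unquoted $@ splits arguments on whitespace. Count_Args and the two wrapper names are made up for the illustration.

```shell
#!/usr/bin/env bash
# Count_Args, Quoted_Wrapper and Unquoted_Wrapper are hypothetical helpers.
function Count_Args {
    echo "$#"
}

function Quoted_Wrapper {
    Count_Args "$@"
}

function Unquoted_Wrapper {
    Count_Args $@
}

# Two arguments, one of which contains a space.
QuotedCount=$(Quoted_Wrapper "/tmp/my file" /etc/hosts)
UnquotedCount=$(Unquoted_Wrapper "/tmp/my file" /etc/hosts)

echo "quoted:   ${QuotedCount} arguments"
echo "unquoted: ${UnquotedCount} arguments"
```

The quoted wrapper sees the two arguments it was given; the unquoted one sees three, because the path containing a space is split in two.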

This is where the quasi namespaced function names really pay off. I simply name each new substituting function with the lowercased first character of each word in the function name being substituted. Now to show off them shorty function names from the command line. But first, tab completion:

wrpf

$ wr
wr        write     wrjpgcom  wrl       wrpf      
$ wr

1 $ wrpf /etc/profile.d/*
bash_completion.sh  ::  bash-completion-20060301-1.el5.rf
colorls.csh         ::  coreutils-5.97-23.el5_4.2
colorls.sh          ::  coreutils-5.97-23.el5_4.2
glib2.csh           ::  glib2-2.12.3-4.el5_3.1
glib2.sh            ::  glib2-2.12.3-4.el5_3.1
lang.csh            ::  initscripts-8.45.30-2.el5.centos
lang.sh             ::  initscripts-8.45.30-2.el5.centos
less.csh            ::  less-436-2.el5
less.sh             ::  less-436-2.el5
vim.csh             ::  vim-enhanced-7.0.109-6.el5
vim.sh              ::  vim-enhanced-7.0.109-6.el5
which-2.sh          ::  which-2.16-7

Alas, thorough satisfaction is achieved except for the part where I found rpmquery already provides more or less everything I came up with:

rpmquery -f

$ rpmquery -f /sbin/{fsck,service,chkconfig}
e2fsprogs-1.39-23.el5
initscripts-8.45.30-2.el5.centos
chkconfig-1.3.30.2-2.el5

Bourne/Bash Have Hash

Thursday 08/04/11

For a while I have been using function Validate in order to ensure the existence of all non-builtin commands needed by a script. This also aided in making sure my scripts were slightly more secure by validating that the command I expect to be a binary is actually a binary and not an alias to a malicious script or even a malicious function established in the current shell. The benefits of function Validate don't exactly come without a tradeoff. While one of my goals will always be to maximize the usage of language builtins, some of my scripts make use of a fair number of non-builtin commands. For instance, Ambit uses the following non-builtin commands:

mkdir uniq sort sed cat touch host rm grep 
column head comm tail mv tr egrep logger

None of this stuff is 'exotic' or gleaned from packages that are not usually part of the typical base *NIX install, but it is still a large number of commands to validate. Luckily, the validation process happens only once and has proven to execute very fast. But...

As someone that has overlooked important information when learning a subject I have developed a habit of going back and re-reading material (over and over sometimes). Today I was looking through Bash's Parameter Expansion documentation when I stumbled on the hash builtin:

$ help hash
hash: hash [-lr] [-p pathname] [-dt] [name ...]
    For each NAME, the full pathname of the command is determined and
    remembered.  If the -p option is supplied, PATHNAME is used as the
    full pathname of NAME, and no path search is performed.  The -r
    option causes the shell to forget all remembered locations.  The -d
    option causes the shell to forget the remembered location of each NAME.
    If the -t option is supplied the full pathname to which each NAME
    corresponds is printed.  If multiple NAME arguments are supplied with
    -t, the NAME is printed before the hashed full pathname.  The -l option
    causes output to be displayed in a format that may be reused as input.
    If no arguments are given, information about remembered commands is 
    displayed.

Doh!
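
For what it's worth, here is a rough sketch of how hash could stand in for the existence check. Require_Commands is a made-up name, and this only confirms that each command can be found in the PATH; unlike Validate, it will not notice an alias or function shadowing the binary.

```shell
#!/usr/bin/env bash
# Require_Commands is a hypothetical helper, not part of any script above.
function Require_Commands {
    local Command
    for Command in "$@"
    do
        # hash searches PATH and remembers the location; it fails if the
        # command cannot be found.
        hash "${Command}" 2>/dev/null || { echo "missing: ${Command}" >&2; return 1; }
    done
}

if Require_Commands mkdir sort cat rm
then
    # -t prints the remembered full pathname of a hashed command.
    SortPath=$(hash -t sort)
    echo "sort is hashed at ${SortPath}"
fi
```

One call covers a whole list of commands, and every lookup is remembered by the shell afterwards.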


© AfterTheTweet.com 2018