Starbase: Frequently Asked Questions

Introducing Starbase

Q.1. What is Starbase?
Q.2. Why might I be interested in it?
Q.3. Where can Starbase be obtained?
Q.4. How much does it cost?

Getting Started with Starbase

Q.5. What do I need to know to get started?
Q.6. What is the format of a table file?
Q.7. What are the minimum commands I need to know to get started?
Q.8. What are operators that can be used with the row command?
Q.9. What is a regular expression? How do I use it?
Q.10. Are there any tutorials on Starbase and databases available?

Troubleshooting

Q.11. How do I find out about reported bugs in starbase programs?
Q.12. How can I tell the difference between spaces and TAB characters in my tables?
Q.13. Emacs keeps screwing up the TABs in my table! How can I make it behave?
Q.14. Whenever I try to use the column command, I get a "No such file or directory" error.

Introducing Starbase

Q.1. What is Starbase?

Starbase is a simple relational database system developed by John Roll that is specially suited for managing tables of astronomical data. It is made up of a collection of UNIX programs that make use of standard UNIX features and tools. The basic table-manipulating features are similar to those of the /rdb system (which Starbase does not require). Extra support for astronomical applications was implemented at the Smithsonian Astrophysical Observatory. The package also contains tools from the Star Link Project.

Q.2. Why might I be interested in it?

Starbase's simple design makes it particularly well suited for manipulating "small" tables (< 100,000 rows?) containing astronomical data. With a little preparation in advance, Starbase can also handle very large tables (hundreds of millions of rows).

Q.3. Where can Starbase be obtained?

Starbase can be downloaded from the Starbase home page.

Q.4. How much does it cost?

Nothing--it's free.

Getting Started with Starbase

Q.5. What do I need to know to get started?

Check Q.6 for a quick overview of table format and then Q.7 for an introduction to the most basic commands.

Starbase comes with HTML documentation. A good place to start is the main Starbase page (the starbase man page), which lists and briefly describes the programs that make up Starbase. The extensions that have been added at SAO are covered in the tawk extensions page.

Q.6. What is the format of a table file?

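A Starbase table is a plain ASCII file made up of a free-text header, a line of column names, a line of dashes that marks the start of the data, and then the data rows, with the columns on each of these lines separated by a single TAB character. Here's an example: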
Table1

This table is named Table1

This is a text comment in the header of a table.  This portion may 
ramble on and on but should not contain any line with ONLY dash (-)
and tab characters.  The dash line is the indication that the data
table is about to begin.

RA	Dec
--	---
0:0:0	0:0:0
12:00	-30.0
15	60:00:30.4
If you use an editor to create a table, it's a good idea to look it over to make sure that there is only one TAB character between adjacent columns in a row. To see where the TABs are, run cat -tv table | more, where table is the name of the file containing the table (see also Q.12).

See the format man page for more examples of tables.
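
One way to sidestep editor trouble with TABs is to generate the table from the shell. The following is only a sketch using standard UNIX tools (printf and awk), not a Starbase command: the helper tab_row and the file name demo.tbl are made up for illustration. It writes the example table and then checks that every data row has as many TAB-separated fields as the dash line.

```shell
# Hypothetical helper: print two values separated by exactly one TAB.
tab_row() { printf '%s\t%s\n' "$1" "$2"; }

# Write the example table to demo.tbl (an assumed file name).
{
  printf 'Table1\n\n'
  printf 'This table is named Table1\n\n'
  tab_row RA Dec
  tab_row -- ---
  tab_row 0:0:0 0:0:0
  tab_row 12:00 -30.0
  tab_row 15 60:00:30.4
} > demo.tbl

# Locate the dash line (every TAB-separated field is made of dashes only),
# then compare the field count of each following row against it.
report=$(awk -F'\t' '
  function dashes_only(   i) {
    if (NF == 0) return 0
    for (i = 1; i <= NF; i++) if ($i !~ /^-+$/) return 0
    return 1
  }
  !in_data && dashes_only() { in_data = 1; ncol = NF; next }
  in_data && NF != ncol     { bad++; print "line " NR ": " NF " fields, expected " ncol }
  END { print bad + 0, "inconsistent row(s)" }
' demo.tbl)
echo "$report"
```

Any row reported as inconsistent is a good candidate for a stray or missing TAB.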

Q.7. What are the minimum commands I need to know to get started?

Refer to the starbase man page for a short summary of all the starbase commands. For starters, though, here's a quick explanation of the three commands you might use the most.

justify

Before you do anything interesting with a table, you often want to just look at it. A good way to do this is with the justify command, which will neaten up a table so that all the columns are aligned. Try,
       justify < in.tbl | more
where in.tbl is the input table. Or,
       justify -i in.tbl | more
Tables are usually read into a starbase command either with the -i option or with the UNIX input redirection operator, <. The output is usually either redirected to a file (using >) or piped into another command like more or another starbase command.

row

The most common function of a database is to allow one to extract records that match certain conditions. With Starbase, this is accomplished with the row command. If the table looks like the one given in Q.6 and is called in.tbl, one can extract all rows where the RA is greater than or equal to 12 hours with:
    row 'RA>=12' < in.tbl > out.tbl
Or,
    row 'RA>11:59' < in.tbl | justify | more
See Q.8 for a summary of all the operators one can use to select rows of data. Check also the file format man page to see how one can define special table variables that can be used by the row command.

column

Often one is only interested in seeing certain columns from a table (especially if the rows are very long). Using the example table from Q.6, in.tbl, one can extract just the RA and Dec columns with:
    row 'RA>11:59' < in.tbl | column RA Dec > out.tbl
Or,
    column RA Dec < in.tbl | row 'RA>11:59' > out.tbl
Refer to question Q.14 if you get an error like column: RA: No such file or directory.

Other useful commands

The header and headeroff commands allow you to extract or remove the header portion from a table. compute allows you to calculate new values for a column from other values in the row. jointable is used to combine rows from two tables that have matching values in a column.

For a complete list, see the starbase man page.

Q.8. What are operators that can be used with the row command?

The row and compute commands make use of the operators supported by the UNIX awk command. Here's a list of those used for comparing two values:

Operation                 Operator   Example
Equal                     ==         x == 40, s == "string"
Greater than              >          x > 40, s > "string"
Greater than or equal     >=         x >= 40
Less than                 <          x < 40
Less than or equal        <=         x <= 40, s <= "string"
String contains           ~          s ~ /string/
String does not contain   !~         s !~ /string/
Logical AND               &&         x > 5 && y < 40
Logical OR                ||         s ~ /big/ || s ~ /large/
Logical NOT               !          ! x
Logical grouping          ( )        (x > 5 && x < 11) || (x == 0 && y == 1)
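
Because these operators come from awk, an expression can be tried out in plain awk before it is used in a row command. The sketch below does not call Starbase at all: check is a hypothetical shell function, and x, y and s are stand-in values passed to awk as variables rather than columns read from a table.

```shell
# Evaluate a Q.8-style expression in plain awk (not Starbase's row command).
# Arguments: $1 = expression, $2 = value of x, $3 = value of y, $4 = value of s.
check() {
  awk -v x="$2" -v y="$3" -v s="$4" \
    "BEGIN { if ($1) print \"selected\"; else print \"rejected\" }"
}

check 'x > 5 && y < 40'                         7 10 big
check 's ~ /big/ || s ~ /large/'                0 0  'large galaxy'
check '(x > 5 && x < 11) || (x == 0 && y == 1)' 0 1  none
check '! x'                                     3 0  none
```

The first three calls print "selected" and the last prints "rejected". Note that the expression is single-quoted on the command line so that the shell leaves characters such as &, | and ! alone.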

Q.9. What is a regular expression? How do I use it?

A regular expression is a way of representing a pattern of characters. Such patterns are used to search for substrings in a string of characters or to see if a string matches a particular pattern. In Starbase, one would use a regular expression with the row command to search for records in which the text in a column matches a certain pattern. For example, in
    row 'type ~ /gala/' < in.tbl > out.tbl
"gala" is the regular expression. This command returns all records in which the string in the "type" column contains the substring "gala". This includes galaxy, galaxies, and extragalactic.

The advantage of regular expressions is their ability to describe patterns symbolically via metacharacters. Here are some of the metacharacters you might use:
. matches any single character
* matches zero or more of the previous character
+ matches one or more of the previous character
? matches zero or one of the previous character
^ when at the beginning of a regular expression, matches the beginning of a value
$ when at the end of a regular expression, matches the end of a value
[...] matches any one of the characters between the brackets
[^...] matches any one character not among those between the brackets
\ escapes the special meaning of the next character, e.g. \. means a real period.
Here are some sample uses:
type ~ /proto.*env/ matches when type contains a substring beginning with "proto" and ending with "env", including "protostellar environment" and "protogalactic environment".
type ~ /[gG]ala/ matches "gala" or "Gala" appearing anywhere in type
title ~ /^Study/ matches when title begins with "Study"
type ~ /^[gG]alaxy$/ matches only when type is exactly "galaxy" or "Galaxy"
Note that some metacharacters are also interpreted by the UNIX shell. That's why it is a good idea to enclose search clauses in single quotes when using the row command.
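
To get a feel for a pattern before using it with row, it can be tested against a few sample values with plain awk, from which row takes its operators (see Q.8). This is a sketch only: matches is a hypothetical helper, the sample values are made up, and the pattern is handed to awk as a variable, so it is written without the surrounding slashes.

```shell
# Print the input lines that match a regular expression, using plain awk.
# $1 = regular expression; values are read from standard input, one per line.
matches() {
  awk -v re="$1" '$0 ~ re'
}

# Made-up sample values, one per line.
values='galaxy
Galaxy
galaxies
extragalactic
protostellar environment
star'

echo "--- gala"
printf '%s\n' "$values" | matches 'gala'
echo "--- ^[gG]alaxy\$"
printf '%s\n' "$values" | matches '^[gG]alaxy$'
echo "--- proto.*env"
printf '%s\n' "$values" | matches 'proto.*env'
```

The first pattern selects galaxy, galaxies and extragalactic; the second selects only galaxy and Galaxy; the third selects protostellar environment. The single quotes around each pattern keep the shell from interpreting *, $ and the brackets.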

Q.10. Are there any tutorials on Starbase and databases available?

Because of Starbase's similarity to the commercial /rdb package, the book UNIX Relational Database Management by Manis, Schaffer, and Jorgensen (Prentice Hall, ~$70) is often recommended. Chapters 1, 2, 3, 4, and 6 will bring an astronomer new to relational databases up to speed.

Troubleshooting

Q.11. How do I find out about reported bugs in starbase programs?

The most up-to-date list of bugs can be accessed from http://cfa-www.harvard.edu/~john/starbase/BUGS. The Starbase distribution also comes with a list of bugs in a file called BUGS.

Q.12. How can I tell the difference between spaces and TAB characters in my tables?

The most important thing to keep in mind when manipulating Starbase tables is to make sure that one and only one TAB character appears between each column. Normally, one only needs to worry about this when editing tables by hand. It's often easy to accidentally insert spaces where a TAB should go, or to insert multiple TABs where only one should go. (The Emacs editor by default will often replace spaces with TABs without your knowing it--see Q.13 for more details.)

Thus, if you've been editing a Starbase table by hand, you may wish to check it over to ensure that the TABs are where they belong. To do this you can use the UNIX cat command by specifying the "-vt" options:

    cat -vt in.tbl | more
TAB characters in the table will appear as "^I". You might find the output a bit of a jumble. If so, you might try sending the table through the justify command first:
    justify -i in.tbl | cat -vt | more
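
For longer tables, scanning the cat output by eye gets tedious. As a rough alternative, grep can list only the lines that look suspicious. The sketch below uses standard UNIX tools only; the file bad.tbl is a deliberately broken, made-up table, and the three patterns (a space before a TAB, a space after a TAB, two TABs in a row) are an assumption about what usually goes wrong, not an official Starbase check.

```shell
# Build a deliberately broken sample table (hypothetical file bad.tbl):
# line 4 has a space before the TAB, line 5 has two TABs in a row.
printf 'RA\tDec\n--\t---\n0:0:0\t0:0:0\n12:00 \t-30.0\n15\t\t60:00:30.4\n' > bad.tbl

# Hold a literal TAB character in a shell variable.
tab=$(printf '\t')

# List, with line numbers, every line containing a space next to a TAB
# or two consecutive TABs.
suspects=$(grep -n -e " $tab" -e "$tab " -e "$tab$tab" bad.tbl)
printf '%s\n' "$suspects"
```

Here grep reports lines 4 and 5. On a clean table it prints nothing and returns a non-zero exit status.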

Q.13. Emacs keeps screwing up the TABs in my table! How can I make it behave?

If you use a recent version of the Emacs editor to edit a Starbase table, you may find that it sometimes corrupts the TAB structure of the table. This is usually due to a default feature of the Emacs Text Mode: when you hit the TAB key, not only will Emacs insert a TAB character, it will also replace as many of the preceding spaces with TAB characters as possible while still keeping the same alignment.

To make your Emacs compatible with Starbase, you need to tell Emacs to insert only a single TAB when you hit the TAB key. This can be accomplished by placing the following into your .emacs file (normally in your home directory):

;; In Text Mode, make the TAB key insert a single literal TAB character.
(defun my-text-mode-hook ()
  (define-key text-mode-map "\t" 'self-insert-command))
(add-hook 'text-mode-hook 'my-text-mode-hook)

Q.14. Whenever I try to use the column command, I get a "No such file or directory" error.

If, when you issue the command:
    column RA < in.tbl > out.tbl
you get the error message column: RA: No such file or directory, you may need to adjust your command search path. Some UNIX systems come with another command called "column" (e.g. Linux has /usr/bin/column) that is not related to Starbase. To make sure the Starbase version gets called, place the directory containing the Starbase commands closer to the beginning of your search path. For example, if the Starbase commands are located in /usr/local/starbase/bin, you can (if you are using the C-shell) type:
    set path = (/usr/local/starbase/bin $path)
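
For Bourne-style shells (sh, bash, ksh, zsh) the equivalent would look roughly like the following sketch. It assumes the same example directory, /usr/local/starbase/bin, and uses the standard command -v to show which column the shell will now pick up.

```shell
# Put the Starbase directory (example location from above) first in the search path.
PATH=/usr/local/starbase/bin:$PATH
export PATH

# Show which "column" the shell would now run; print a note if none is found.
command -v column || echo "column: not found in PATH"
```

If the path shown is still /usr/bin/column, the Starbase directory is either misspelled or does not contain a column program. To make the change permanent, the two PATH lines can go into your shell's startup file.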