Sloan Digital Sky Survey at the CFA Data Release 7

Photometric Data

The PhotoPrim.db starbase file is a subset of the SDSS PhotoPrimary view of the entire photometry database. It has been sequentially indexed (i.e. physically sorted) by declination (the dec column) into 1/10 degree bins, and sorted by right ascension (the ra column) within each declination bin. The database is located in the directory /data/astrocat/SDSS-dr7.

One important improvement over DR6 is the photometric calibration, known as UBERCAL. See UBERCAL for more information on the algorithm.

There are over 357,000,000 rows in the database, out of over 585,000,000 in the full PhotObjAll set. The search program from the Starbase package makes obtaining region-delimited objects quick, even for a database of this size (about 260 GB).

The selected columns are:


To see this for yourself, use the Starbase program headline and execute:

  headline </data/astrocat/SDSS-dr7/PhotoPrim.db
Note that the column order may be different than that shown. This is unimportant as all Starbase references are to a column's name, not its relative position.

Full documentation on these columns is available at the SDSS site:


Note that the u, g, r, i, z magnitudes are the modelMag magnitudes from the SDSS view; to get a dereddened magnitude, subtract the corresponding extinction value. The petroMag magnitudes are recommended for use with galaxies and the psfMag magnitudes are recommended for stars.

Also, note that the colors, e.g. petroMag_u_g, are not in the original table view from Sloan, but were computed including the extinction corrections, when extracting the values to put in this database, e.g.:

  petroMag_u_g = (petroMag_u - extinction_u) - (petroMag_g - extinction_g)
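
As a minimal sketch of that arithmetic, the shell function below applies the same formula with plain awk rather than a Starbase tool. The function name dered_color and the sample magnitudes are illustrative assumptions, not values taken from the database.

```shell
# dered_color MAG1 EXT1 MAG2 EXT2
# Prints (MAG1 - EXT1) - (MAG2 - EXT2) to three decimals.
dered_color() {
  awk -v m1="$1" -v e1="$2" -v m2="$3" -v e2="$4" \
      'BEGIN { printf "%.3f\n", (m1 - e1) - (m2 - e2) }'
}

# Made-up values: petroMag_u, extinction_u, petroMag_g, extinction_g
dered_color 19.50 0.25 18.00 0.125     # prints 1.375
```

Because the stored color columns already include the extinction correction, no further subtraction is needed when using them directly.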

These columns have been indexed for use by search; all indices except dec.ra are secondary:


The primary index "dec.ra" being on RA and Dec means the database file itself is sorted by both RA and Dec, into bins of 0.1 degrees in Dec and by RA within each Dec bin, so that nearby objects are likely to be in the same disk block or at least nearby. Each secondary index consists of a sorted file of pointers to the indexed column. The search program uses these indices, if available, for fast access. See also the section below Searching the database.
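
The ordering described above can be sketched with standard tools. This is only an illustration: the binning rule (bin = floor(dec * 10)), the function name sort_dec_ra, and the three sample positions are assumptions, since the text states only the bin width and the sort order.

```shell
# sort_dec_ra: read headerless "ra<TAB>dec" lines on stdin and print them
# ordered by 0.1-degree Dec bin, then by RA within each bin.
sort_dec_ra() {
  tab=$(printf '\t')
  awk -F'\t' '{
    scaled = $2 * 10
    bin = int(scaled)
    if (bin > scaled) bin--          # int() truncates toward zero; make it a floor
    printf "%d\t%s\t%s\n", bin, $1, $2
  }' | LC_ALL=C sort -t "$tab" -k1,1n -k2,2n | cut -f2,3
}

# Three hypothetical positions (ra, dec in degrees)
printf '190.2\t50.01\n10.0\t50.12\n189.9\t50.09\n' | sort_dec_ra
```

The two positions in the Dec bin 50.0 to 50.1 come out in RA order even though their Dec values are not in order, which is what keeps neighbors on the sky close together in the file.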

Validating objects in the database

Objects must be validated before you can be sure what you're getting.

For example, if the type field is 3, the object is (supposed to be) a galaxy; type = 6 is a star and type = 8 is a sky exposure. The full type table, reproduced from the SDSS web site:

  Description   Name      Value
  ------------  --------  -----
  Unknown       UNK       0
  Cosmic Ray    CR        1
  Defect        DEFECT    2
  Galaxy        GALAXY    3
  Ghost         GHOST     4
  Known object  KNOWNOBJ  5
  Star          STAR      6
  Star trail    TRAIL     7
  Sky           SKY       8

Also, each object has associated flag bits from the reduction stage to allow selection or rejection on various criteria. These are 64-bit quantities in the original SDSS database. Since Starbase is an awk front end, it cannot handle these as numbers, so I have split them into two 32-bit fields, calling the low-order one flags and the high-order one flags2. One or more bits can then be tested using the bit-field functions and(), or(), not(), and xor().

For example, the 0x4 bit signifies an object too close to the edge of the frame, the 0x40000 bit means the object has one or more saturated pixels, while the 0x400000 bit means the sky level was bad. To test and reject objects with any of these conditions, the following pipeline fragment can be used:

  [...input... ] | row 'and(flags, 0x440004U) == 0' | [...output...]
NB: The U suffix forces the starbase programs to interpret the number as unsigned, which is what you need for bit logical operations on all 32 bits.
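
The mask 0x440004 is the bitwise OR of the three flag bits named above. The sketch below checks that with POSIX shell arithmetic instead of Starbase's and(); the variable names, the function name is_clean, and the sample flag words are illustrative assumptions.

```shell
# Bits quoted in the text
EDGE=$((0x4)); SATURATED=$((0x40000)); BADSKY=$((0x400000))
REJECT_MASK=$((EDGE | SATURATED | BADSKY))
printf 'mask = 0x%X\n' "$REJECT_MASK"      # prints mask = 0x440004

# is_clean FLAGS: succeed only if none of the rejected bits is set
is_clean() {
  [ $(( $1 & REJECT_MASK )) -eq 0 ]
}

is_clean 0x10    && echo "0x10: keep"
is_clean 0x40010 || echo "0x40010: reject (saturated bit set)"
```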

Since the 'stationary' bit is 0x1000000000, it is in the upper 32 bits, so the bit-field and() test would be

  and(flags2, 0x10U) == 0x10U
and has to be tested separately from the bits in flags.
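
To see where a given 64-bit flag lands after the split, the following sketch separates a value into its low and high 32-bit halves. It is not the script used to build the database; the function name split_flags is hypothetical, and it assumes a shell with 64-bit integer arithmetic (as bash and dash provide on common platforms) and a value below 2^63.

```shell
# split_flags FLAGS64: print the low 32 bits ("flags") and the high 32 bits
# ("flags2") as hexadecimal, separated by a space.
split_flags() {
  printf '0x%X 0x%X\n' $(( $1 & 0xFFFFFFFF )) $(( ($1 >> 32) & 0xFFFFFFFF ))
}

split_flags 0x1000000000      # prints 0x0 0x10: the 'stationary' bit is in flags2
```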

The full set of 64 flag bits is documented at the SDSS site.

Searching the database

The search and row commands from the Starbase package can be used to search for rows with specific objects or values. The URL for the full Starbase documentation is:
Briefly, search uses index files, if available, to quickly seek to matching rows in the database, while row is a front end to the awk variant mawk in which column names stand in for $1, $2, etc., so the user does not have to know the column order of the database.

To find all the objects with photometry near one or more coordinates, first create a small tab-separated Starbase table holding the RA and Dec of the center(s) of the region(s) to be searched. The example below uses one coordinate, but more are possible (the SDSS coordinates are all in degrees).

There are multiple ways to create a Starbase table. Two examples:

echo "190. 50." | fldtotable ra dec >/tmp/table

printf "ra\tdec\n-\t-\n190.\t50.\n" >/tmp/table

Both of these result in a table that looks like:
  ra      dec
  --      ---
  190.    50.
where the whitespace between the columns is a single tab character. Then, using this table /tmp/table, the command to search around the coordinate with a radius of 0.05 degrees is:
  search PhotoPrim.db -S2ddd ra dec 0.05 </tmp/table
Note that unless you are in the /data/astrocat/SDSS-dr7 subdirectory, you need to give the full path to PhotoPrim.db ; I am typing only the file name for clarity in this document.
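
For more than one search center, the same table layout can be generated by a small helper. This is a sketch only: the function name make_coord_table and the second coordinate pair are hypothetical, and it uses nothing beyond the shell's printf.

```shell
# make_coord_table RA1 DEC1 [RA2 DEC2 ...]
# Print a tab-separated table: header line, dash line, one row per pair.
make_coord_table() {
  printf 'ra\tdec\n'
  printf '%s\t%s\n' '--' '---'
  while [ "$#" -ge 2 ]; do
    printf '%s\t%s\n' "$1" "$2"
    shift 2
  done
}

make_coord_table 190. 50. 191.5 49.25
```

The output can be redirected to a file such as /tmp/table or piped straight into search in the same way as the single-coordinate table.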

A simple table can be made on the fly, piping it directly into search:

  printf "ra\tdec\n-\t-\n190.\t50.\n" | search PhotoPrim.db -S2ddd ra dec 0.05  
This returns a short table with 165 rows in just a fraction of a second. Searching linearly through the 285 GB would take approximately 11 hours, depending on the speed of your network link and machine.

To find all the objects with observed spectra near one or more coordinates, the output of a search such as the above is checked for rows where the column specObjID is non-zero. It happens that the 165 objects returned by the above search contain just a single object with a corresponding spectrum. It can be isolated using row on the above output:

  printf "ra\tdec\n-\t-\n190.\t50.\n"    |
  search PhotoPrim.db -S2ddd ra dec 0.05 |
  row 'specObjID != 0'                   |
  project specObjID                       
[N.B.: the above is a single command broken across lines for readability]
Here, the project program outputs only the columns given as its arguments, in this case giving

This result can be used to query the SpecPhoto.db database with the command:

  row <SpecPhoto.db 'specObjID == "359951516078964736"' |
  justify DR class subclass ra%.4f dec%.4f z z_err zwarning
Note that the awk statement inside the row command is using a string comparison, not a numeric one. The specObjID and objID columns are 64-bit numbers which are too big for starbase's underlying implementation of awk.
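
The effect can be demonstrated with any awk that stores numbers as double-precision floats, as mawk and gawk do by default (an assumption about the local awk, not a statement about Starbase internals). In the sketch below the function name compare_ids is hypothetical, and the second ID is invented by adding 1 to the real one.

```shell
# compare_ids A B: print two 0/1 results, numeric equality then string equality
compare_ids() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    numeric = ((a + 0) == (b + 0))     # both sides converted to doubles
    textual = ((a "") == (b ""))       # both sides compared as strings
    print numeric, textual
  }'
}

# The second ID is hypothetical: the real one plus 1
compare_ids 359951516078964736 359951516078964737     # prints 1 0
```

The numeric test reports the two different IDs as equal because a double cannot distinguish them, while the string test tells them apart.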

The SpecPhoto.db query above returns (where justify projects and formats a subset of the columns to display):

  DR   class   subclass     ra        dec      z         z_err     zwarning
  ---  ------  -----------  --------  -------  --------  --------  --------
  DR3  GALAXY  STARFORMING  189.9365  50.0143  0.083877  0.000009      0x0U
Note that the two databases may have slightly different values of RA and Dec for the object. In the above case, for example, the positions differ radially by 0.076 arcseconds.
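
A radial offset of this kind can be computed from two (RA, Dec) pairs with the haversine formula. The sketch below uses plain awk; the function name sep_arcsec is hypothetical and the sample pair is made up, so it does not reproduce the 0.076 arcsecond figure quoted above.

```shell
# sep_arcsec RA1 DEC1 RA2 DEC2 (all in degrees): print the separation in arcseconds
sep_arcsec() {
  awk -v ra1="$1" -v dec1="$2" -v ra2="$3" -v dec2="$4" 'BEGIN {
    rad = atan2(0, -1) / 180                       # degrees -> radians
    sdec = sin((dec2 - dec1) * rad / 2)
    sra  = sin((ra2 - ra1) * rad / 2)
    a = sdec * sdec + cos(dec1 * rad) * cos(dec2 * rad) * sra * sra
    d = 2 * atan2(sqrt(a), sqrt(1 - a))            # haversine formula, radians
    printf "%.3f\n", d / rad * 3600
  }'
}

# Hypothetical pair: same RA, Dec differing by 0.0005 degrees
sep_arcsec 189.9365 50.0143 189.9365 50.0148     # prints 1.800
```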

To print the RA and Dec in the conventional sexagesimal format instead of degrees, use the Starbase compute command as a filter to convert the RA to hours (i.e., divide by 15), then use the "@" format specifier with justify on the ra and dec columns (and see the column name change of 'zwarning' at the end):

  row <SpecPhoto.db 'specObjID == "359951516078964736"' |
  compute 'ra /= 15.'                                   |
  justify DR progName class subclass ra%.1@ dec%@ z%.3f z_err%.4f zwarning=zwarn
and the output is:
  DR   class   subclass     ra          dec       z      z_err  zwarn
  ---  ------  -----------  ----------  --------  -----  -----  -----
  DR3  GALAXY  STARFORMING  12:39:44.6  50:00:51  0.084  0.000   0x0U
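
For readers who want to check such a conversion by hand, the following sketch does the degrees-to-sexagesimal arithmetic in plain awk. It is not the implementation behind justify's "@" format; the function name to_sexagesimal, its "hours" argument, and the choice of rounding to a tenth of a second are assumptions.

```shell
# to_sexagesimal VALUE_DEG MODE
# MODE "hours" divides by 15 first (for RA); anything else keeps degrees (for Dec).
to_sexagesimal() {
  awk -v v="$1" -v mode="$2" 'BEGIN {
    if (mode == "hours") v /= 15
    sign = (v < 0) ? "-" : ""
    if (v < 0) v = -v
    tenths = int(v * 36000 + 0.5)        # total tenths of a second, rounded
    h = int(tenths / 36000)
    m = int((tenths % 36000) / 600)
    s = (tenths % 600) / 10
    printf "%s%02d:%02d:%04.1f\n", sign, h, m, s
  }'
}

to_sexagesimal 187.5 hours        # prints 12:30:00.0
to_sexagesimal 50.0143 degrees    # prints 50:00:51.5
```

Rounding the total before splitting it into fields avoids outputs such as 59:60.0 when the seconds round up.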

Getting Images of PhotoPrim Objects

getobjid is a tool to query the SDSS SkyServer for a JPEG image of the area around an object or coordinate, putting the image into a directory designated by the user.

For example, the command

   getobjid -d $HOME -h 200 -w 400 -e GL -q 'GA(15,18)' \
            -b 1. -f sdss.jpg -v -o 587731511532060708
(line broken into two for clarity) delivers an image of the object whose objid = 587731511532060708, with a label, a grid marking the center, and triangles marking galaxies with magnitudes between 15 and 18 in any of its bands.

For those interested in the details, see the documentation on the getobjid tool, below. The full set of options is documented at the SDSS help page:
