Sloan Digital Sky Survey at the CFA Data Release 6

Spectral Data

The database directory /data/astrocat/SDSS-dr6 contains the SpecObjAll.db and SpecPhoto.db tables, which hold subsets of the corresponding views of the SDSS spectral database. SpecObjAll.db contains all the rows (but not all the columns) of the SpecObjAll view. SpecPhoto.db takes its rows from the SpecPhoto view, which is restricted to SciencePrimary rows, i.e. it excludes rows that are flagged as poor quality, are duplicates from earlier data releases, or do not have specPrimary == 1. Both databases are indexed in RA and Dec for quicker searching for spectra near a particular coordinate.

The spectral reductions come from the Princeton/Sloan database, for which the Princeton group redid the Sloan reductions. The Princeton info page is:
There are 1,223,015 rows in the SpecPhoto.db table and 1,282,560 in the full set in SpecObjAll.db. These numbers from the Princeton data set are slightly different from those given by the Sloan server. The CFA database files are 489 MB and 512 MB, respectively.

The subset of columns is the same in both databases:

specObjID       objID           DR              progName
ra              dec             primTarget      secTarget       specPrimary
plate           mjd             fiberID         plqual          plsn2
class           subclass        objType         waveMin         waveMax         wcoverage
z               z_err           zwarning        RChi2           RChi2Diff
vdisp           vdisp_err       vdispChi2       vdispZ          vdispZ_err
firstRA         firstDec        firstWarning    firstFInt       firstFPeak      firstRMS
firstMaj        firstMin        firstFPA        firstSkyRMS     firstMatchDist  firstNmatch
tmassRa         tmassDec        firstErrMaj     tmassErrMin     tmassErrAng
tmassJ          tmassJIvar      tmassH          tmassHIvar      tmassK          tmassKIvar
tmassPhQual     tmassRdFlg      tmassBlflg      tmassCcFlg      tmassGalContam
tmassMpflg      tmassJdate      tmassDupSrc     tmassMatchDist  tmassNmatch
usnoRA          usnoDec         usnoRmag        usnoBmag        usnoMatchDist   unnoNmatch

The DR column gives the data release the row comes from, i.e., one of DR0, DR1, DR2, DR3, DR4 or DR6 (where I have renamed EDR to DR0).

Full information on the meaning of these columns is found at:
See also their information on known problems. More information is also available from the SDSS web site:

Searching the databases

The row and search commands from the Starbase package can be used to search for rows with specific objects or values. The URL for the full Starbase documentation is:
Briefly, row is a front-end for awk statements where the column names substitute for $1, $2, etc., so the user does not have to know the column order of the database. The command to pull all primary observations from the SpecObjAll.db table is:
  row 'specPrimary == 1 && primTarget != 0 && secTarget == 0' <SpecObjAll.db
Note that this does not eliminate the duplications among the data releases, so it is still somewhat larger than the SpecPhoto.db dataset.
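
To make the "column names instead of $1, $2" idea concrete, here is a minimal awk sketch of name-based row selection. It is not the Starbase implementation: select_eq is a hypothetical helper, the three-row table is made up, and the table layout (header line, dashed line, tab-separated rows) is assumed from the printf example in this section.

```shell
# Hypothetical helper, not part of Starbase: keep the rows where the
# column named $1 equals the value $2. Reads a tab-separated table
# (header line, dashed line, data rows) on standard input.
select_eq() {
    awk -F'\t' -v col="$1" -v val="$2" '
        NR == 1 {
            # map each column name to its field number
            for (i = 1; i <= NF; i++) idx[$i] = i
            if (!(col in idx)) {
                print "no such column: " col > "/dev/stderr"
                exit 1
            }
            print; next
        }
        NR == 2 { print; next }   # pass the dashed separator through
        $(idx[col]) == val        # the test, by name rather than by $N
    '
}

# Made-up table for illustration only.
printf 'specObjID\tspecPrimary\n-\t-\n101\t1\n102\t0\n103\t1\n' |
select_eq specPrimary 1
```

The real row command accepts full awk expressions; this sketch handles a single equality test only, which is enough to show the name-to-field mapping.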

To find all the objects with spectra near one or more coordinates, search takes a small Starbase tab-separated table containing the RA and Dec of the center(s) of the region to be searched. [see the discussion above] The example below uses one coordinate, but more are possible (the SDSS coordinates are all in degrees):

  printf "ra\tdec\n-\t-\n190.\t50.\n"     |             [this makes the table]
  search SpecObjAll.db -S2ddd ra dec 0.05 |             [this performs the search]
  justify ra%.4f dec%.4f class subclass z z_err objID  [project & format output]

The search happens to return a single object, a star-forming galaxy at z = 0.084. The output is:
  ra        dec      class   subclass     z      z_err  objID             
  --------  -------  ------  -----------  -----  -----  ------------------
  189.9365  50.0143  GALAXY  STARFORMING  0.084  0.000  587735430151929950
Since the objID is non-zero, the object is also in the photometric database PhotoPrim.db. To access that row, you could run row over the entire database, testing for the correct objID, but that would take hours. Instead, use search again and run row on its much smaller output:
  printf "ra\tdec\n-\t-\n190.\t50.\n"     |  [this makes the table]
  search PhotoPrim.db -S2ddd ra dec 0.05  |  [this performs the search]
  row 'objID == "587735430151929950"'     |  [select the object]
  justify ra%.4f dec%.4f r Err_r%.3f i Err_i%.3f isoA_r isoB_r isoPhi_r
  ra        dec      r       Err_r  i       Err_i  isoA_r  isoB_r  isoPhi_r
  --------  -------  ------  -----  ------  -----  ------  ------  --------
  189.9365  50.0143  17.272  0.007  16.996  0.008  21.982  17.175    170.37
The search returns 165 rows in less than a second, so the row command has a far smaller table to check.

Better still, since the objID column has a secondary index, the PhotoPrim database can be searched on it directly:

  search PhotoPrim.db -V objID 587735430151929950 | 
  justify ra%.4f dec%.4f r Err_r%.3f i Err_i%.3f isoA_r isoB_r isoPhi_r
and the output is the same as before.
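
As a sanity check on the coordinate search in this section, the separation between the search center (190, 50) and the returned object (189.9365, 50.0143) can be computed by hand. The sketch below assumes that the 0.05 argument to search is a radius in degrees on the sky; angsep is a hypothetical helper written for this illustration, not a Starbase command.

```shell
# Hypothetical helper: great-circle separation in degrees between two
# positions, using the haversine formula. All four inputs are in degrees.
angsep() {
    awk -v ra1="$1" -v dec1="$2" -v ra2="$3" -v dec2="$4" 'BEGIN {
        d2r  = atan2(0, -1) / 180                  # pi / 180
        sdec = sin((dec2 - dec1) * d2r / 2)
        sra  = sin((ra2 - ra1) * d2r / 2)
        h    = sdec * sdec + cos(dec1 * d2r) * cos(dec2 * d2r) * sra * sra
        printf "%.4f\n", 2 * atan2(sqrt(h), sqrt(1 - h)) / d2r
    }'
}

# Search center versus the object returned by the search.
angsep 190. 50. 189.9365 50.0143
```

The result is a little over 0.04 degrees, inside the assumed 0.05 degree radius. Note that the RA difference (0.0635 degrees) alone exceeds 0.05; it is the cos(dec) factor that brings the true separation under the limit.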

Joins between the spectral and photometric databases

The search program takes an input table, using one of its columns as values to search. The output rows can be joined with the input rows using the -j option to search.

For example, if the search in the SpecPhoto database is:

  printf "233.45. 54.75." | fldtotable ra dec | 
  search SpecPhoto.db -S2ddd ra dec 0.075     | 
  justify specObjID objID ra%.4f dec%.4f class z z_err  
then the output is:
  specObjID          objID             ra       dec     class  z      z_err
  ------------------ ----------------- -------- ------- ------ ------ --------
  173055145871933440 588007004725117130 233.4319 54.7478 GALAXY 0.1179 0.000042
  173613531751514112 588007004725117173 233.5103 54.7107 GALAXY 0.1531 0.000035
  173613531772485632 588007004725117186 233.5444 54.7246 GALAXY 0.1173 0.000028
This can be combined into a search on these objects in the PhotoPrim.db database, yielding columns from both. If a column name is duplicated between the two databases (other than the join field), the input database columns have "_1" appended to them and the searched database columns have "_2" appended. In the example below, "z" in SpecPhoto is the redshift, but "z" in PhotoPrim is a Z-filter magnitude. Other columns such as ra and dec also conflict.

This pipeline of commands:

  printf "233.45. 54.75." | fldtotable ra dec | 
  search SpecPhoto.db -S2ddd ra dec 0.075     | 
  search -j PhotoPrim.db objID                |
  compute 'ra_2 /= 15.'                       |
  justify ra_2=ra%.1@ dec_2=dec%.0@ class petroMag_r petroMag_i \
           petroMag_r_i z_1=z%.3f zwarning=zw 
  ra          dec       class  petroMag_r petroMag_i petroMag_r_i  z     zw
  ----------  --------  ------ ---------- ---------- ------------  ----- --
  15:33:43.6  54:44:52  GALAXY     17.455     16.944     0.499152  0.118  0
  15:34:02.3  54:42:38  GALAXY     17.294     16.852     0.430647  0.153  0
  15:34:10.6  54:43:28  GALAXY     17.565     17.17      0.38362   0.117  0
Note that the "z" from the input SpecPhoto.db table had been renamed to "z_1" because of the join conflict with "z" from PhotoPrim.db. The justify restored its original name and ignored the "z" magnitude from the PhotoPrim.db table, which became "z_2".
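
The renaming rule can be illustrated without touching the databases. The sketch below only manipulates column names. joined_header is a hypothetical helper, and it assumes (this is not stated above) that the join column appears once in the output and that the input-table columns come first; the column lists are shortened for the example.

```shell
# Hypothetical illustration of the "_1" / "_2" renaming rule; not Starbase code.
# $1 = join column
# $2 = input-table column names (space-separated)
# $3 = searched-table column names (space-separated)
joined_header() {
    awk -v key="$1" -v left="$2" -v right="$3" 'BEGIN {
        nleft  = split(left,  lcol, " ")
        nright = split(right, rcol, " ")
        for (i = 1; i <= nleft;  i++) inleft[lcol[i]]  = 1
        for (i = 1; i <= nright; i++) inright[rcol[i]] = 1
        out = ""
        # input-table columns: suffix "_1" on any name that also
        # appears in the searched table, except the join column
        for (i = 1; i <= nleft; i++) {
            name = lcol[i]
            if (name != key && (name in inright)) name = name "_1"
            out = out (out == "" ? "" : " ") name
        }
        # searched-table columns: skip the join column, suffix "_2" on clashes
        for (i = 1; i <= nright; i++) {
            name = rcol[i]
            if (name == key) continue
            if (name in inleft) name = name "_2"
            out = out " " name
        }
        print out
    }'
}

joined_header objID "specObjID objID ra dec z" "objID ra dec z petroMag_r"
```

With these shortened lists, ra, dec and z each come out twice, once with each suffix, which is why the justify step above has to ask for ra_2, dec_2 and z_1 explicitly.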

Obtaining the spectra

The spectra are stored in the /data/astrocat archive as multispec FITS files, one for each plate. The easiest way to extract one or more of them is to use getspecid.

The getspecid program uses specObjID values from one of three sources: on the command line, input on standard input as a Starbase table, or input as a simple list on standard input.

For example, the search of SpecPhoto.db above yields three objects. These can be extracted by piping that table into getspecid:
  printf "233.45. 54.75." | fldtotable ra dec | 
  search SpecPhoto.db -S2ddd ra dec 0.075     | 
  getspecid -v
which prints:
  Extract fiber   1 to ./spPlate-0614-53437-001.fits
  Extract fiber 353 to ./spPlate-0616-52374-353.fits
  Extract fiber 358 to ./spPlate-0616-52374-358.fits
while writing the files to the current directory. Without the "-v" option, it would silently write out the files. The FITS header is updated with the correct fiber RA and Dec, the source file name as "SPECFILE", the specObjID as "SPEC_ID" and the fiber ID as "FIBERID".

Sometimes it may be useful simply to extract the path to the file and fiber:

  printf "233.45. 54.75." | fldtotable ra dec | 
  search SpecPhoto.db -S2ddd ra dec 0.075     | 
  getspecid -n
which prints:
  /data/astrocat/SDSS-dr6/Spectro/spectro_DR6/0614/spPlate-0614-53437.fits     1
  /data/astrocat/SDSS-dr6/Spectro/spectro_DR2/0616/spPlate-0616-52374.fits   353
  /data/astrocat/SDSS-dr6/Spectro/spectro_DR2/0616/spPlate-0616-52374.fits   358
The following gives the details of the spectra storage hierarchy. For those who want more details on the program, see the discussion and examples below.

To create the full path to one of the FITS files, use the DR, plate, mjd, and fiberID values. Using, for example,  DR=="DR1", plate=="0649", mjd=="52201" and fiberid=="320", generates the full path which is
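
The directory layout can be inferred from the getspecid -n output shown earlier. The sketch below assumes that every release follows that same pattern (Spectro/spectro_DR/plate/spPlate-plate-mjd.fits), which has only been checked against the three paths printed above. spplate_path is a hypothetical helper, not an installed program.

```shell
# Hypothetical helper: build the path to a plate file.
# The pattern is inferred from the "getspecid -n" output, not from
# any documentation of the archive layout.
# $1 = DR (e.g. DR2)
# $2 = plate, zero-padded to four digits as in the file names
# $3 = MJD
spplate_path() {
    printf '/data/astrocat/SDSS-dr6/Spectro/spectro_%s/%s/spPlate-%s-%s.fits\n' \
        "$1" "$2" "$2" "$3"
}

# Reproduces one of the paths printed by "getspecid -n" above.
spplate_path DR2 0616 52374
```

The fiber ID is not part of the file name; it selects a row within the plate file.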

To actually get the spectrum for fiberID = 320 into its own file, one possibility is to use the imcopy command in IRAF, e.g. to get the 320th spectrum from one of these FITS files (see below for HDU explanation):
cl> imcopy file.fits[0][*,320] mydir/my.fits
(where "file.fits" represents the actual filename), or, using the funimage command from the FUNTOOLS package from a shell (single quotes prevent the shell from interpreting any characters as wildcards):
funimage file.fits'[0,*:*,320:320]' mydir/my.fits
The Princeton web site also has instructions on using IDL and SuperMONGO.
[ As of this writing, the SM links are broken due to a disk crash. Should be fixed ... soon? - WFW ]

In the Spectro subdirectory there is platelist.db, which lists every plate along with its data release and MJD.

Each plate FITS file has 7 HDUs, numbered 0 to 6, as documented at the Princeton web page.

   HDU #0 = Flux in units of 10^(-17) erg/s/cm^2/Ang [FLOAT]
   HDU #1 = Inverse variance (1/sigma^2) for the above [FLOAT]
   HDU #2 = AND mask [32-bit INT]
   HDU #3 = OR mask [32-bit INT]
   HDU #4 = Wavelength dispersion in pixels [FLOAT]
   HDU #5 = Plug-map structure from plPlugMapM file [BINARY FITS TABLE]
   HDU #6 = Average sky flux in units of 10^(-17) erg/s/cm^2/Ang [FLOAT]
Note that the spectra are in log-wavelength. To compute the wavelengths of each bin use the header values CRVAL1 and CD1_1. Below is an IDL code snippet contributed by Jenny Greene:
   pix1=sxpar(header, 'CRVAL1', OOPS)
   disp=sxpar(header, 'CD1_1', OOPS)
   for k=0,n_elements(respec)-1 do begin
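
The log-wavelength mapping described above can also be sketched in awk. The sketch assumes that CRVAL1 is the log10 of the wavelength in Angstroms at the first pixel and that CD1_1 is the log10 step per pixel, so that pixel k (counting from 0) has wavelength 10^(CRVAL1 + CD1_1*k). wavelengths is a hypothetical helper, and the numbers in the usage line are illustrative, not read from a real header.

```shell
# Hypothetical helper: print the 0-based pixel index and the wavelength
# for the first NPIX pixels of a log-wavelength spectrum.
# $1 = CRVAL1 (log10 of the wavelength at pixel 0)
# $2 = CD1_1  (log10 wavelength step per pixel)
# $3 = NPIX   (number of pixels to list)
wavelengths() {
    awk -v crval1="$1" -v cd1_1="$2" -v npix="$3" 'BEGIN {
        for (k = 0; k < npix; k++)
            printf "%d %.3f\n", k, 10 ^ (crval1 + cd1_1 * k)
    }'
}

# Illustrative values only; take the real ones from the FITS header.
wavelengths 3.5800 0.0001 5
```

Because the step is constant in log10, the spacing between neighboring pixels grows with wavelength while the ratio between them stays fixed.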
