Technical Information

The HDH has three key folders for managing data:

  • Originals: Contains original data. The spatial reference may be different between data sets and the quality may be questionable.
  • Working: Each folder in Working represents one data set that will be deployed to the HDH. The spatial reference will be WGS84, Geographic. All the data sets must be high quality.
  • Final: These are the data sets that are deployed on the LIVE server (i.e. they are only on TEST for testing). The folders are numbered by their database entry ID (i.e. they do not make sense without the web site to get the ID). The contents of Final are generated entirely from the content of the Working folder by a Python script. Each folder in Final will contain:
    • Zip file: this is the file that will be downloaded from the web site; it should contain only the final data and metadata.
    • Display.png: image to display on the info page. It should be about 500 pixels wide.
    • Thumb.png: thumbnail image that appears on the data set page. It should be about 150 pixels wide.
    • MapData: a folder with the data for the online map (i.e. a set of tiles).
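
The expected contents of a Final folder can be checked mechanically. The following is a minimal sketch, not the project's actual script: the function name `missing_final_items` is hypothetical, and it assumes the item names are exactly as listed above ("Display.png", "Thumb.png", "MapData") and that the download is the only ".zip" file in the folder.

```python
from pathlib import Path

# Names taken from the list above; exact spelling and case are an assumption.
REQUIRED_FILES = ("Display.png", "Thumb.png")
REQUIRED_DIRS = ("MapData",)


def missing_final_items(final_folder):
    """Return the expected items that are absent from one Final folder."""
    folder = Path(final_folder)
    missing = []
    for name in REQUIRED_FILES:
        if not (folder / name).is_file():
            missing.append(name)
    for name in REQUIRED_DIRS:
        if not (folder / name).is_dir():
            missing.append(name + "/")
    # The zip file name is not fixed in these notes, so accept any *.zip.
    if not any(p.is_file() for p in folder.glob("*.zip")):
        missing.append("*.zip")
    return missing
```

An empty list means the folder has everything listed above; anything else names what still has to be generated.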

Database:

  • DataGroups - appear as "Collections" on web site
  • DataSets - appear as "Data Sets" on web site
    • ForeignKey: DataGroupID

Process for adding data sets

  1. Originals are placed in an appropriate folder in "Originals"
  2. Data is QAQCed and placed in an appropriate folder in "Working" (this is a manual process)
    1. Each working folder should contain the following for one data set:
      1. A "TIFF", "IMG", or Shapefile
  3. Run the "CreateFinal" script
    1. First, change the InputPath to point to the new folder in "Working"
    2. Run the script to create the final folders in "Final"
      1. This folder will contain:
        1. Zipped file with the data
        2. "Thumb.png" file
        3. "Display.png" file
        4. "Pyramid" folder containing the data for CanvasMap
      2. Script adds a database entry in "TBL_DataSets"
        1. Title: match the file name
        2. WorkingFilePath: Path to the file in "Working"
      3. User returns to the web site, searches for the data based on the name, and edits the data set record.
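
The zip step of this process can be sketched as follows. This is an illustration under stated assumptions, not the "CreateFinal" script itself: the function name `create_final_zip` and its parameters are hypothetical, the data set ID is assumed to come from the new database record, and the zip is assumed to be named after the working folder. Generating "Thumb.png", "Display.png", and the map tiles requires imaging and GIS libraries and is left out.

```python
import zipfile
from pathlib import Path


def create_final_zip(working_folder, final_root, dataset_id):
    """Zip every file in one Working folder into Final/<dataset_id>/.

    A Shapefile consists of several sidecar files (.shp, .shx, .dbf, ...),
    so all files in the folder are added rather than a single one.
    """
    working = Path(working_folder)
    final = Path(final_root) / str(dataset_id)
    final.mkdir(parents=True, exist_ok=True)

    # Assumption: the zip takes its name from the working folder.
    zip_path = final / (working.name + ".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for item in sorted(working.iterdir()):
            if item.is_file():
                # Store files at the top level of the archive.
                archive.write(item, arcname=item.name)
    return zip_path
```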

Process for adding collections (data groups)

A data group can be either a number of georeferenced rasters or rasters located by a single point (like Flickr). Georeferenced rasters need to be converted to pyramids; point rasters will just appear as a thumbnail on the map.

  1. Originals are placed in a folder in "Originals".
  2. There will be multiple data sets in the folder. They will all be TIFF or all be Shapefiles.
  3. For fully-georeferenced data there can also be a CSV file with the following headings:
    1. Title
    2. Description
    3. Filename
  4. For "point" files the CSV file becomes a point-shapefile with points for each data set
  5. The rest of the process is the same as for data sets.

May need to be able to parse a CSV (or shapefile?) to update the database records without having to edit each one on the web.
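
Parsing such a CSV could look like the sketch below. It assumes the three headings listed above (Title, Description, Filename) appear in the first row; the function name `read_collection_csv` and the choice to key the result by Filename are illustrative only. Writing the values back to the database is not shown.

```python
import csv

# Headings from the collection CSV described above.
EXPECTED_HEADINGS = ("Title", "Description", "Filename")


def read_collection_csv(csv_file):
    """Read an open CSV file and return {Filename: {Title, Description}}."""
    reader = csv.DictReader(csv_file)
    headings = reader.fieldnames or []
    missing = [h for h in EXPECTED_HEADINGS if h not in headings]
    if missing:
        raise ValueError("CSV is missing headings: " + ", ".join(missing))

    records = {}
    for row in reader:
        records[row["Filename"]] = {
            "Title": row["Title"],
            "Description": row["Description"],
        }
    return records
```

When reading from disk, open the file with `newline=""` as the `csv` module recommends, for example `open(path, newline="")`.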

Integrity checking

  • Check for "final" folders without a database record
    • Delete the "final" folder, yes?
  • Check for database record without a "final" folder
    • If the "WorkingFilePath" points to a valid folder, generate the "Final" data
    • else delete the database record.
  • Check that the database content (on the web) actually matches the data, check that the data can be downloaded, unzipped, and loaded into ArcMap.
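
The first two checks amount to comparing two sets of IDs. A minimal sketch, assuming each folder in Final is named by its database ID and that the list of IDs has already been read from TBL_DataSets (the database query is not shown, and the function name `compare_final_to_database` is hypothetical). It only reports mismatches; deleting or regenerating anything is left to the operator.

```python
from pathlib import Path


def compare_final_to_database(final_root, database_ids):
    """Return (folders without a record, records without a folder).

    IDs are compared as strings, so results are sorted as text.
    """
    folder_ids = {p.name for p in Path(final_root).iterdir() if p.is_dir()}
    record_ids = {str(i) for i in database_ids}

    orphan_folders = sorted(folder_ids - record_ids)
    orphan_records = sorted(record_ids - folder_ids)
    return orphan_folders, orphan_records
```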

Roll Process

  • Creating the final folders will be tested first on TEST
  • Then, the folders that are ready in "Working" will be moved to LIVE and the script will be run on LIVE to create the "Final" folders and their associated entries in TBL_DataSets.
  • To compare the database:
    • Run the "Schema.bat" file on TEST and LIVE. This will create an "InvasiveSchema.txt" file in each server's "C:\Temp" folder.
    • Bring these files back to a workstation and run "WinDiff" on the files.
    • The WinDiff will show the differences. Make changes until there are no differences.
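
If WinDiff is not available, the same comparison can be scripted. This is a sketch of an alternative using Python's standard `difflib`, not part of the documented process; the function name and the TEST/LIVE labels in the output are illustrative.

```python
import difflib


def schema_differences(test_lines, live_lines):
    """Return unified-diff lines between the TEST and LIVE schema dumps.

    Both arguments are lists of lines without trailing newlines.
    An empty result means the two schemas match.
    """
    return list(
        difflib.unified_diff(
            test_lines,
            live_lines,
            fromfile="TEST/InvasiveSchema.txt",
            tofile="LIVE/InvasiveSchema.txt",
            lineterm="",
        )
    )
```

The line lists can be produced with `Path(...).read_text().splitlines()` on each copy of "InvasiveSchema.txt".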

Issues:

  • How are database records tied to the "Working" folder the data came from?
    • "WorkingFilePath" is added by the Python script (Seth to fix)
  • Do we need to have "RefX", "RefY", "RefWidth" and "RefHeight" in TBL_DataSets?
    • I'm not sure. The user might want to know the area for the data set but the pyramid will have this information so we can zoom to the area on the map
    • However, "point" files will not have a pyramid so we may want to at least add RefX and RefY (in geographic?)

To Do:

  • Delete all entries in TBL_DataSets
    • Need to talk about referential integrity
  • Referential integrity (Jim -> Tina and Seth)
  • Database compare? (Jim)
  • Database "rels" that reference tbl_datasets (Jim)
  • Setup portable with PostgreSQL, etc. (Tina)
  • Run script to add TBL_DataSets entries and folders in 3_Final
    • One folder from "Working" at a time (Seth)
  • Get CanvasMap "working" in HDH (initial)

Done:

  • Rename "3_Final" folder (backup?) (done)
  • Create new "3_Final" folder (done)
  • Organize "Originals" (done for now)
  • Organize "Working" (done)
    • Remove "display.png", "zips", "Thumbs", other stuff
    • Folder for each data set
    • Folders for data groups containing folders for data sets
  • Finish CanvasMap (Jim) (Done?)