Assignment: Formatting Files
Reformatting Files
One of the most common tasks that programmers will be asked to tackle is to take an existing file and either reformat it, fix a problem in it, or convert the values in it to another format. This assignment will give you the chance to do this for yourself and put together a number of concepts you've learned so far.
The U.S. Board on Geographic Names maintains a very large database of the names of cities, streams, valleys, and other physical features on the earth. You've been asked to create a file that can be loaded into ArcGIS. The columns of the table should contain nicely readable values so we can use them for symbology in ArcGIS. Download one of the files from their website (feel free to use California if desired) and produce a text file that has the following columns (in any order). You can filter the contents of the file so that only one feature class is output.
- Feature Name
- Feature Type
- Latitude in decimal degrees
- Longitude in decimal degrees
- Date in the original format
- Date in a nice human-readable format (e.g. 10th of October, 2011)
Download a file from the U.S. Board on Geographic Names web site and check which "delimiter" separates the values on each line of text. Use this delimiter to "split" each line of the file into "tokens". Then, select the tokens you want to include in your output file. Write out the file based on the columns above.
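The split-and-select step might be sketched as follows. This is only an illustration under stated assumptions: the pipe ("|") delimiter, the column names, and the sample row are placeholders, so check them against the file you actually download. The helper names split_line and select_columns are made up for this example.

```python
def split_line(line, delimiter="|"):
    """Split one line of text into tokens, dropping the trailing newline."""
    return line.rstrip("\r\n").split(delimiter)


def select_columns(tokens, header, wanted):
    """Return the tokens whose header names appear in wanted, in that order."""
    return [tokens[header.index(name)] for name in wanted]


# Hypothetical sample lines standing in for the downloaded file.
# The delimiter and the column names are assumptions, not the real layout.
sample = [
    "FEATURE_NAME|FEATURE_CLASS|PRIM_LAT_DEC|PRIM_LONG_DEC|DATE_CREATED\n",
    "Example Creek|Stream|40.1234|-123.5678|10/31/2011\n",
]

header = split_line(sample[0])
row = split_line(sample[1])
print(select_columns(row, header, ["FEATURE_NAME", "PRIM_LAT_DEC"]))
```

Looking tokens up by header name rather than by position keeps the script working if the column order in the download differs from what you expected.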
The overall structure of a script of this complexity is important for readability. Your Python file should have the following sections:
- File Header
- Any "imports" that are needed
- Functions
- The main script
Remember to add a header for each function and other appropriate comments.
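A skeleton with these four sections could look roughly like the sketch below. The file name, the header fields, and the clean_token function are placeholders invented for illustration, and wrapping the main script in a main() function is one possible choice rather than a requirement.

```python
"""
File:    reformat_names.py (hypothetical file name)
Author:  Your Name
Purpose: Reformat a geographic names file for loading into ArcGIS.
"""

# ---- Imports ----
import sys


# ---- Functions ----
def clean_token(text):
    """
    Remove leading and trailing whitespace from one token.
    Input:  text - the raw token read from the file
    Output: the token without surrounding whitespace
    """
    return text.strip()


# ---- Main script ----
def main():
    """Run the conversion; here it only demonstrates clean_token."""
    sys.stdout.write(clean_token("  Stream \n") + "\n")


if __name__ == "__main__":
    main()
```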
A Function to Convert Dates to Nice Human-Readable Forms
Create a Python script with a function that converts a US-formatted date (MM/DD/YYYY) into a nice, human-readable form such as: 31st of October, 2011. Call the function with at least 3 dates and print the results to make sure it works. Put this function just below the file header and imports, before any other code that might call it.
Tip: A pre-defined list is an easy and fast method for finding the text string that matches a given month number (e.g. matching "1" to January).
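One way the month-list tip could be applied is sketched below. The names MONTH_NAMES, ordinal_suffix, and nice_date are illustrative, and the sketch assumes every input is a well-formed MM/DD/YYYY string; a real data file may contain blank or malformed dates that need extra handling.

```python
# Pre-defined list: index 0 holds January, so month number m is at m - 1.
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def ordinal_suffix(day):
    """Return the English ordinal suffix (st, nd, rd, th) for a day number."""
    # 11, 12, and 13 are exceptions: 11th, 12th, 13th.
    if 11 <= day % 100 <= 13:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


def nice_date(us_date):
    """Convert 'MM/DD/YYYY' into a form like '31st of October, 2011'."""
    month_text, day_text, year_text = us_date.strip().split("/")
    month = int(month_text)
    day = int(day_text)
    return f"{day}{ordinal_suffix(day)} of {MONTH_NAMES[month - 1]}, {year_text}"


# Call the function with a few dates to check the results by eye.
for sample_date in ["10/31/2011", "01/02/1999", "12/11/1980"]:
    print(sample_date, "->", nice_date(sample_date))
```

Converting the month and day to integers also drops any leading zeros, so "01/02/1999" yields "2nd of January, 1999".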
Turn In:
- The input file you used or, if the file is huge, the download location for the file.
- The Python program you created to convert the file. The program should include:
  - A function to convert a date string to a nice human-readable string
- The final output file. Please make sure you can load the file into ArcGIS. This is both important for grading and to make sure your file is usable. Remember that ArcGIS is very limited in what it will accept as a header for tabular data. Only use the underscore ("_") for punctuation and keep the column names to 10 characters or fewer.
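Before loading the output into ArcGIS, the column names can be checked against the rules above with a small helper like the one sketched here. The function name is_valid_field_name is made up for this example, and the requirement that a name start with a letter is an added assumption, not something stated in this assignment.

```python
import re

# Letters, digits, and underscores only; at most 10 characters in total.
# Requiring a leading letter is an assumption made for this sketch.
FIELD_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,9}")


def is_valid_field_name(name):
    """Return True if name fits the header rules described above."""
    return FIELD_NAME_PATTERN.fullmatch(name) is not None


# Hypothetical column names for the output file.
for column in ["FEAT_NAME", "Feature Name", "LATITUDE_DD"]:
    print(column, is_valid_field_name(column))
```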
Extra Credit:
Add the following attributes to the file:
- Latitude in degrees, minutes, and seconds (including the appropriate punctuation and hemisphere letter)
- Longitude in degrees, minutes, and seconds
Note: ArcGIS cannot display the degree symbol and is not happy with the double quote symbol either. What would be a good approach to format geographic coordinates as attributes?
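One possible approach, offered as a sketch rather than the expected answer, is to replace the degree, minute, and second symbols with plain letters. The function name to_dms and the "d", "m", "s" output style are choices made for this illustration.

```python
def to_dms(decimal_degrees, is_latitude):
    """Format decimal degrees as a string like '40d 30m 00.00s N'."""
    if is_latitude:
        hemisphere = "N" if decimal_degrees >= 0 else "S"
    else:
        hemisphere = "E" if decimal_degrees >= 0 else "W"

    # Work in whole hundredths of a second so rounding can never
    # produce a value such as 60.00 seconds.
    total_hundredths = round(abs(decimal_degrees) * 360000)
    degrees, remainder = divmod(total_hundredths, 360000)
    minutes, remainder = divmod(remainder, 6000)
    seconds = remainder / 100

    return f"{degrees}d {minutes:02d}m {seconds:05.2f}s {hemisphere}"


print(to_dms(40.5, True))
print(to_dms(-123.25, False))
```

The sign of the input selects the hemisphere letter, so the formatted value itself never needs a minus sign.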