GSP 118 (318): GIS Programming

Shapefile Format

Introduction

We've already worked with CSV and TXT files, and these are highly recommended for point data because the formats are so easy to work with. Shapefiles are the most commonly used format for polygon and polyline data. However, shapefiles add some complexity if you need to read and write the spatial data within the "file".

First, remember that a shapefile is made up of several files. The spatial data is in the "shp" file and the attributes are in the "dbf" file. This is why the two types of data are often read and written in completely separate function calls. The other files within a shapefile are important but can largely be ignored when programming (just keep all the files together).

DBF Files

The DBF file is read as a series of "records", or rows of attribute values. Within each record, the values are read and written from left to right, in the order the fields appear. This is just like reading a row in a text file and is very straightforward.
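To make the idea of records concrete, here is a minimal sketch in Python that reads the header, the field descriptions, and the rows of a DBF file using only the standard library. The function names `read_dbf` and `make_sample_dbf` and the sample values are made up for illustration, and the sketch returns every value as text; real DBF files have more cases (dates, numeric conversion, character encodings) than this handles.

```python
import io
import struct


def read_dbf(stream):
    """Return (field_names, rows) from a binary DBF stream."""
    header = stream.read(32)
    # Bytes 4-7: record count, 8-9: header size, 10-11: record size (little-endian).
    num_records, header_size, record_size = struct.unpack("<IHH", header[4:12])

    # Field descriptors are 32 bytes each; a 0x0D byte ends the list.
    fields = []
    while True:
        first = stream.read(1)
        if first in (b"\r", b""):
            break
        descriptor = first + stream.read(31)
        name = descriptor[:11].split(b"\x00")[0].decode("ascii")
        length = descriptor[16]  # field width in bytes
        fields.append((name, length))

    stream.seek(header_size)
    rows = []
    for _ in range(num_records):
        record = stream.read(record_size)
        if record[:1] == b"*":  # a leading "*" marks a deleted row
            continue
        offset = 1  # skip the deletion flag
        row = []
        for _name, length in fields:  # values come left to right
            row.append(record[offset:offset + length].decode("ascii").strip())
            offset += length
        rows.append(row)
    return [name for name, _ in fields], rows


def make_sample_dbf():
    """Build a tiny two-field, two-row DBF in memory (made-up data)."""
    fields = [("NAME", "C", 10), ("POP", "N", 8)]
    rows = [("Alpha", "100"), ("Beta", "250")]
    header_size = 32 + 32 * len(fields) + 1
    record_size = 1 + sum(length for _, _, length in fields)
    data = struct.pack("<BBBBIHH20x", 3, 124, 1, 1,
                       len(rows), header_size, record_size)
    for name, ftype, length in fields:
        data += struct.pack("<11sc4xBB14x", name.encode("ascii"),
                            ftype.encode("ascii"), length, 0)
    data += b"\r"
    for name, pop in rows:
        data += b" " + name.ljust(10).encode("ascii") + pop.rjust(8).encode("ascii")
    return io.BytesIO(data)


names, rows = read_dbf(make_sample_dbf())
```

With a real file you would pass `open(path, "rb")` instead of the in-memory sample. In practice a shapefile library does this work for you, but the sketch shows why the attributes read so much like rows in a text file.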

Shapefiles with Points

Shapes that are points are read and written from the first feature in the file to the last, just as you would expect. There is a "multipoint" shapefile type but these are very rarely used.
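As a rough sketch of what reading points "from the first feature to the last" means at the file level, the Python below walks the records of a point "shp" file using only the standard library. The function names `read_point_shp` and `make_sample_shp` are made up for illustration, and the sample file is built in memory from invented coordinates. The sketch assumes the layout in the Esri technical description: a 100-byte file header followed by one record per point.

```python
import io
import struct


def read_point_shp(stream):
    """Return a list of (x, y) tuples from a binary point .shp stream."""
    header = stream.read(100)  # the main file header is always 100 bytes
    file_code = struct.unpack(">i", header[0:4])[0]     # big-endian, always 9994
    shape_type = struct.unpack("<i", header[32:36])[0]  # little-endian, 1 = point
    if file_code != 9994 or shape_type != 1:
        raise ValueError("not a point shapefile")

    points = []
    while True:
        record_header = stream.read(8)
        if len(record_header) < 8:
            break  # end of file
        # Record number and content length (in 16-bit words) are big-endian.
        _number, content_words = struct.unpack(">ii", record_header)
        content = stream.read(content_words * 2)
        record_type = struct.unpack("<i", content[:4])[0]
        if record_type == 1:  # type 0 is a null shape with no coordinates
            points.append(struct.unpack("<dd", content[4:20]))
    return points


def make_sample_shp(points):
    """Build a point .shp in memory from (x, y) tuples (made-up data)."""
    records = b""
    for number, (x, y) in enumerate(points, start=1):
        content = struct.pack("<idd", 1, x, y)
        records += struct.pack(">ii", number, len(content) // 2) + content
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    file_words = (100 + len(records)) // 2
    header = struct.pack(">i20xi", 9994, file_words)
    header += struct.pack("<ii4d32x", 1000, 1, min(xs), min(ys), max(xs), max(ys))
    return io.BytesIO(header + records)


pts = read_point_shp(make_sample_shp([(1.5, 2.5), (-3.0, 4.0)]))
```

Notice that the file mixes big-endian and little-endian integers, which is one reason most programs use a library instead of reading the bytes directly.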

Shapefiles with Polylines

Polylines are still read from the first feature to the last. There is a great deal of confusion at this point because polyline features can actually contain multiple "line strings", where each line string is a series of points. In other words, you could have a single "polyline" that represents a stream network, and each "line string" in the polyline would represent a reach in the stream. The line strings are then made up of a series of points.
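In the "shp" file, a polyline stores all of its points in one flat list, plus a list of indexes giving the first point of each line string (the Esri description calls these "parts"). The sketch below shows one way to split the flat list back into line strings. The function name `split_parts` and the stream network coordinates are made up for illustration.

```python
def split_parts(points, part_starts):
    """Split a flat point list into one list per line string."""
    # Each part runs from its start index up to the next part's start index.
    ends = list(part_starts[1:]) + [len(points)]
    return [points[start:end] for start, end in zip(part_starts, ends)]


# Made-up stream network: two reaches stored in one polyline feature.
points = [(0, 0), (1, 1), (2, 1), (5, 5), (6, 4)]
part_starts = [0, 3]
reaches = split_parts(points, part_starts)
```

Here `reaches` holds two lists, one with the first three points and one with the last two, so each reach can be drawn or measured on its own.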

Shapefiles with Polygons

Polygons are very similar to polylines in that they can contain multiple "polygons" within a polygon shape. This is especially useful for features such as the state of Hawaii, whose spatial data contains a separate "polygon" for each island.

The other source of confusion with polygons is that they can contain "holes". Lakes like Crater Lake in Oregon have islands within them, so the island would be represented by a "hole" in the polygon for Crater Lake. The really weird part is that Esri chose to "encode" holes through the direction of the points: the points of an outer boundary are stored in clockwise order, while the points of a hole are stored in counterclockwise order. This makes reading and writing polygons from shapefiles more challenging than expected.
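One common way to tell the direction of a ring's points is the "shoelace" formula for signed area, sketched below in Python. This is a standard geometry technique and not something taken from the Esri documentation itself. The function names `signed_area` and `is_hole` and the lake and island coordinates are made up for illustration, and the sketch assumes each ring repeats its first point at the end, as rings in a shapefile do.

```python
def signed_area(ring):
    """Shoelace formula: positive for counterclockwise, negative for clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_hole(ring):
    """In a shapefile, a counterclockwise ring is a hole."""
    return signed_area(ring) > 0


# Made-up lake with one island; each ring repeats its first point to close it.
lake = [(0, 0), (0, 10), (10, 10), (10, 0), (0, 0)]  # clockwise outer ring
island = [(4, 4), (6, 4), (6, 6), (4, 6), (4, 4)]    # counterclockwise hole
```

With these rings, `is_hole(lake)` is false and `is_hole(island)` is true. When writing a polygon, you can reverse a ring's point list if its direction does not match the role you want it to have.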

See the Esri documentation referenced below for more information.

Additional Resources

Esri Shapefile Technical Description