Point Data As Data Frames (using CSVs)
Reading and Displaying Data
The easiest way to manage spatial point data in R is as a data table. You can load such a table with a single line of code. Use the read.csv() function, as in the line below, to read a comma-separated values (CSV) file into R. In RStudio, take a look at the workspace (the Environment pane) and you'll see there is a new "TheData" object. Click on this object to see its contents.
TheData = read.csv('C:/ProjectsR/Clustering/TwoClusters.csv')
The sample below shows how to read a CSV file into R, remove any rows that contain missing values, and then plot the data.
## load data
TheData = read.csv('C:/ProjectsR/Clustering/TwoClusters.csv')
## remove any missing entries from the data
TheData = na.omit(TheData)
## plots the points as x,y data
plot(TheData$X, TheData$Y, main="Simple Plotting Example", xlab="Longitude", ylab="Latitude", pch=19)
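If the CSV file above is not available, the same read-and-clean steps can be tried on a small stand-in file. The sketch below is only an illustration: the file contents and the SampleFile name are made up, and the real TwoClusters.csv is assumed to have X and Y columns as in the sample above.

```r
## write a small, hypothetical CSV to a temporary file (not the real TwoClusters.csv)
SampleFile = tempfile(fileext = ".csv")
writeLines(c("X,Y", "-124.1,40.8", "-124.0,", "-123.9,40.9"), SampleFile)

## load data; the empty Y field in the second data row is read as NA
TheData = read.csv(SampleFile)
RowsBefore = nrow(TheData)

## remove any rows that contain missing values
TheData = na.omit(TheData)
RowsAfter = nrow(TheData)

print(RowsBefore)  # 3
print(RowsAfter)   # 2
```

Comparing the row count before and after na.omit() is a quick way to see how many records were dropped.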
The plot() function can be used to create graphs for a wide variety of data in R. The first two parameters are the x and y vectors; because we supplied two numeric vectors, plot() will automatically create a scatter plot.
The syntax on the last line where it says "main=..." is how you specify optional parameters for R functions: each one is given as name=value, and any parameter you leave out keeps its default. If you examine the documentation for the plot() function (type ?plot in the console), you'll see there are a large number of parameters that determine how the plots will look.
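To see the effect of optional parameters, the same points can be plotted with and without them. This is a minimal sketch using made-up X and Y vectors; formals() is used here only as one way to list the named parameters that plot.default() accepts.

```r
## hypothetical sample points
X = c(1, 2, 3, 4)
Y = c(2, 4, 3, 5)

## no optional parameters: default title, axis labels, and symbols
plot(X, Y)

## optional parameters given as name=value, in any order
plot(X, Y, pch=19, col="blue", main="Named Parameters", xlab="Longitude", ylab="Latitude")

## list the named parameters of the default plot method
ParameterNames = names(formals(graphics::plot.default))
print(ParameterNames)
```

Parameters such as pch and col do not appear in that list because they are passed through "..." to lower-level graphics functions; they are documented under ?par and ?points.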
Dimensions of Data
The read.csv() function will read data into a data frame object. You can get information on the data frame with the following functions.
NumberOfRows = nrow(TheDataFrame)
NumberOfColumns = ncol(TheDataFrame)
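As a quick check of what these functions return, here is a small sketch on a hand-built data frame. The column names and values are invented for illustration; dim() and names() are additional base R functions not mentioned above.

```r
## hypothetical data frame with 4 rows and 3 columns
TheDataFrame = data.frame(ID = 1:4,
                          X = c(1.5, 2.5, 3.5, 4.5),
                          Y = c(10, 20, 30, 40))

NumberOfRows = nrow(TheDataFrame)       # 4
NumberOfColumns = ncol(TheDataFrame)    # 3
Dimensions = dim(TheDataFrame)          # rows first, then columns
ColumnNames = names(TheDataFrame)       # "ID" "X" "Y"

print(Dimensions)
print(ColumnNames)
```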
Columns & Rows
You can manipulate the columns and rows in a data frame. A negative index removes the column or row at that position:
TheDataFrame=TheDataFrame[-1] # delete the first column
TheDataFrame=TheDataFrame[-1,] # delete the first row
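The two deletions above can be checked on a small example. The sketch below uses an invented data frame and also shows two common variations that are not in the lines above: deleting a column by name and keeping only the rows that match a condition.

```r
## hypothetical data frame with 4 rows and 3 columns
TheDataFrame = data.frame(ID = 1:4,
                          X = c(1.5, 2.5, 3.5, 4.5),
                          Y = c(10, 20, 30, 40))

NoFirstColumn = TheDataFrame[-1]    # drops ID, keeps X and Y
NoFirstRow = TheDataFrame[-1, ]     # drops the first row, keeps all columns

## delete a column by name instead of by position
NoID = TheDataFrame
NoID$ID = NULL

## keep only the rows where X is greater than 2
LargeX = TheDataFrame[TheDataFrame$X > 2, ]

print(names(NoFirstColumn))
print(nrow(NoFirstRow))
print(nrow(LargeX))
```

Note the comma: TheDataFrame[-1] indexes columns, while TheDataFrame[-1, ] indexes rows.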
Additional Resources
R and its add-on libraries provide other useful "na" tools for handling missing values:
- na.fail: Stop if any missing values are encountered
- na.omit: Drop any rows that contain a missing value in any column, without keeping track of where they were.
- na.exclude: Drop rows with missing values, but keep track of where they were (so that when you make predictions, for example, you end up with a vector whose length is that of the original response).
- na.pass: Take no action.
- na.tree.replace (library tree): For discrete variables, adds a new category called "NA" to replace the missing values.
- na.gam.replace (library gam): Operates on discrete variables like na.tree.replace(); for numerics, NAs are replaced by the mean of the non-missing entries.
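The difference between na.omit and na.exclude is easiest to see when fitting a model. The following is a minimal sketch with made-up data; lm() is used only as a convenient base R function that accepts an na.action argument, and na.pass and na.fail are called directly on the data frame.

```r
## hypothetical data with one missing response value
TheData = data.frame(X = c(1, 2, 3, 4, 5),
                     Y = c(2.1, NA, 6.2, 7.9, 10.1))

## na.omit: the incomplete row is dropped and forgotten
FitOmit = lm(Y ~ X, data = TheData, na.action = na.omit)
ResidualsOmit = residuals(FitOmit)

## na.exclude: the row is dropped for fitting, but its position is remembered
FitExclude = lm(Y ~ X, data = TheData, na.action = na.exclude)
ResidualsExclude = residuals(FitExclude)

print(length(ResidualsOmit))      # 4
print(length(ResidualsExclude))   # 5, with NA in position 2

## na.pass returns the data unchanged; na.fail stops with an error
Unchanged = na.pass(TheData)
FailResult = try(na.fail(TheData), silent = TRUE)
print(inherits(FailResult, "try-error"))   # TRUE
```

With na.exclude, the residuals line up with the rows of the original data, which makes it easier to attach them back to the data frame.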