Sunday, August 12, 2018

GIS5103 - GIS Programming - Final Project

As I had previously mentioned in my proposal and presentation, the script I set out to create was designed mainly for archaeology contract work. Basically, it would allow the user to obtain quick geometry calculations pertaining to the amount of various disturbances within an archaeological study area, or Area of Potential Effect (APE). The script would work in several distinct steps: first clipping all input polygons to the given APE, then creating an area column, calculating the acreage of the overlap for each intersection (while also calculating the acreage of the entire APE), printing those results, adding to each of the clipped polygons another column holding the total study area acreage along with a percentage-of-coverage column, performing a division calculation to determine the percentage of the APE that each clip occupies, and finally printing the second set of results. If all went smoothly in PythonWin, my intention was to revisit the later chapters and create a toolbox that would allow the user to run the script as a tool with manually defined parameters.
The first step of the operation was actually done outside of Python. I created a new MXD, imported a basemap, and zoomed in on a random area in Pennsylvania. After setting the coordinate system of the MXD to 'State Plane – Pennsylvania South (US feet)', I began creating hypothetical survey data to test my script against. I drew an APE polygon around an intersection, covering the area of several residential yards. Next, I created three classes of disturbance that stretched beyond and intersected my study area, and called them 'Wetland.shp', 'Excessive Slope.shp', and 'Modern Disturbance.shp'. (All of the shapefiles mentioned above were polygons. The Modern Disturbance file covered the asphalt road as well as all driveways and houses, the Wetland file contained two separate parts, and the slope was made up of just one large area to the south.) Once the shapefiles were finished, I exited ArcMap. My intention was to open and close it while testing my script, in order to avoid errors from editing the same file in two different programs at once. The script is, as of now, limited to three separate disturbance inputs. This was done simply to conserve time in the beginning; time permitting, I could go back and add more if necessary.
I began the script by importing all of the modules I thought might be necessary to run it: arcpy, os, fileinput, and string, plus env from arcpy. I set the workspace to my shapefile folder and enabled overwrite output. Within the same part of the script, I decided to define all of my most common and most necessary inputs. I defined my MXD file with a file path, shortened the study area to "clip_features", and defined "clip_feature1", "clip_feature2", and "clip_feature3" as my disturbance shapefiles. Because my first operation would be to clip the three, I also defined the paths of the initial clipped polygons and shortened them to "clipped_polygon1", "clipped_polygon2", and "clipped_polygon3". I thought the best way to troubleshoot the script would be to print a message at the beginning of each portion of the script, as well as printing "complete" at the end of each operation. That way, whenever I ran the code I would know exactly how much of it worked flawlessly and which part needed fixing.
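A trimmed-down sketch of that setup section is below; the folder paths are placeholders rather than my actual file locations:

```python
import arcpy
import os
import fileinput
import string
from arcpy import env

# Workspace and overwrite settings (path is a placeholder)
env.workspace = r"C:\GISProgramming\FinalProject\Data"
env.overwriteOutput = True

# Map document and the study area (APE) boundary
mxd = r"C:\GISProgramming\FinalProject\FinalProject.mxd"
clip_features = "APE.shp"

# Up to three disturbance inputs
clip_feature1 = "Wetland.shp"
clip_feature2 = "Excessive Slope.shp"
clip_feature3 = "Modern Disturbance.shp"

# Outputs of the initial clip step
clipped_polygon1 = "clipped_polygon1.shp"
clipped_polygon2 = "clipped_polygon2.shp"
clipped_polygon3 = "clipped_polygon3.shp"
```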
The first actual operation performed by the script would be to clip all of the disturbance shapefiles in order to isolate the overlap within the APE. I used the arcpy.Clip_analysis() function, passing each disturbance shapefile as the input feature and the APE as the clip feature. I did this individually for each of the three polygons, but added an "if" clause for inputs 2 and 3. The "if" condition was simply the existence of the given shapefile as defined. If it did not exist or was not defined, the script would print a blank space and move on. The idea was to allow the script to be run with fewer than three disturbance areas whenever necessary, without showing any errors. (I applied the same formula in all of the following steps whenever performing operations on the disturbance polygons.) After the clip operation completed, I checked my output folder in ArcCatalog to make sure that the new shapefiles were contained there. They were.
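The clip step looked roughly like this; I am assuming arcpy.Exists() as the existence check, since that is the standard way to test for a dataset:

```python
print "Clipping disturbance polygons to the APE..."

# The first disturbance is always clipped
arcpy.Clip_analysis(clip_feature1, clip_features, clipped_polygon1)

# Inputs 2 and 3 are optional: clip them only if the shapefile exists,
# so the script can run with fewer than three disturbance layers
if arcpy.Exists(clip_feature2):
    arcpy.Clip_analysis(clip_feature2, clip_features, clipped_polygon2)
else:
    print " "

if arcpy.Exists(clip_feature3):
    arcpy.Clip_analysis(clip_feature3, clip_features, clipped_polygon3)
else:
    print " "

print "complete"
```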
Next, I had to add my area columns to the APE and the clipped disturbances. I began by defining local variables to make my work easier: fieldName1 as "Area" and fieldName2 as "APE_Area". I added the "APE_Area" column to my APE shapefile using the arcpy.AddField_management() function. The column created was a Double with the precision set to 15 and the scale set to 10. This was necessary when dealing with a decimal point; because the areas we calculate at work are always in acres, a decimal value makes a big difference. I repeated the step individually (as in the clipping step above) for my disturbance inputs, but created the "Area" column instead of "APE_Area". I made this distinction because I would have to merge the two shapefiles later, and could not have the same column name twice with different values. My logic was that this would make my life easier when calculating the percentages of the area in question in a later step. I ran my script to test it, and opened my MXD to see whether the changes had been made. Thus far it ran smoothly, and the APE file as well as all of the initial clips contained the new column within their attribute tables.
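Assuming the precision and scale values above, the field-creation calls looked something like this:

```python
# Field names used throughout the rest of the script
fieldName1 = "Area"       # acreage of each clipped disturbance
fieldName2 = "APE_Area"   # acreage of the entire study area

# Add the APE acreage column to the study area shapefile
arcpy.AddField_management(clip_features, fieldName2, "DOUBLE",
                          field_precision=15, field_scale=10)

# Add the "Area" column to each clipped disturbance,
# using the same optional pattern as the clip step
arcpy.AddField_management(clipped_polygon1, fieldName1, "DOUBLE",
                          field_precision=15, field_scale=10)
if arcpy.Exists(clipped_polygon2):
    arcpy.AddField_management(clipped_polygon2, fieldName1, "DOUBLE",
                              field_precision=15, field_scale=10)
if arcpy.Exists(clipped_polygon3):
    arcpy.AddField_management(clipped_polygon3, fieldName1, "DOUBLE",
                              field_precision=15, field_scale=10)
```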
Within the same section of the script, I went back to perform the area calculations. As I stated before, at work we always deal in acres. I used the arcpy.CalculateField_management() function on each of the polygons, populating the "APE_Area" column in the APE shapefile with its acreage, and the "Area" column for each of the disturbance polygons. I made sure that the operation was performed on the clipped polygons instead of the originals, and added each of the operations directly below the corresponding arcpy.AddField_management() call from the previous paragraph. I saved my script and ran it, then re-opened my MXD and went to check my results. Thus far, all seemed to be in place and the columns were populated with the acreage totals to the decimal.
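One way to write that calculation is with the geometry keyword !shape.area@ACRES!, which converts the polygon area to acres directly from the shapefile's State Plane (US feet) units; I am showing only the first disturbance here, with the optional inputs following the same pattern as before:

```python
print "Calculating acreage..."

# Total APE acreage
arcpy.CalculateField_management(clip_features, fieldName2,
                                "!shape.area@ACRES!", "PYTHON_9.3")

# Acreage of each clipped disturbance (repeated for inputs 2 and 3
# inside the same arcpy.Exists() checks shown earlier)
arcpy.CalculateField_management(clipped_polygon1, fieldName1,
                                "!shape.area@ACRES!", "PYTHON_9.3")

print "complete"
```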
The fourth step of the script was simply to print the area results. This would be the first part of the code that allowed the user to obtain new information about the data they are working with. In order to display the results in a user-friendly manner, I used a search cursor. For the APE, I pointed the cursor at the "APE_Area" field and printed the value joined by two strings of text: "The APE has a total area of", (row), "acres." This way, Python would print out the acreage of each separate polygon feature within the shapefile in question. For the next three input files, I redefined the cursor to focus on the "Area" field instead, in order to obtain the acreage of the clipped polygons. Because the names and the order of the disturbances will vary from project to project, the message I printed was: "The first disturbance listed includes an area the size of", (row), "acres within the APE." I saved and ran my script, and all was well thus far. I was happy to see that both parts of the first disturbance were listed separately, each with its correct individual acreage.
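Written with the arcpy.da cursors (one possible form; my original may have used the older arcpy.SearchCursor), the printing section might look like this:

```python
# Print the total APE acreage, one line per polygon feature
with arcpy.da.SearchCursor(clip_features, [fieldName2]) as cursor:
    for row in cursor:
        print "The APE has a total area of", row[0], "acres."

# Print the acreage of each clipped disturbance; multi-part files
# (like the wetland) print one line per feature
with arcpy.da.SearchCursor(clipped_polygon1, [fieldName1]) as cursor:
    for row in cursor:
        print "The first disturbance listed includes an area the size of", \
              row[0], "acres within the APE."
```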
Next came the time to import the "APE_Area" field into my disturbance polygons. I began running into a bit of trouble here, as I had initially hoped to perform these calculations by calling values from various polygons. For example, I wanted to input a simple equation where the value of the "Area" field for each of the polygons would be divided by the value of "APE_Area" and multiplied by 100. I intended to use the result of this equation to populate a new field in the attribute table of the disturbance polygons, displaying the percentage of the APE taken up by each of these features. But after a very long time on Esri forums, and many attempts at troubleshooting, I was not able to figure out how to do this properly. It was then that I decided to use the Spatial Join function to create yet another set of shapefiles, which would contain both columns in their attribute tables (meaning each shapefile would contain an "Area" field displaying the acreage of its own features, as well as an "APE_Area" field displaying the acreage of the entire APE). Before doing this, however, I defined a few local variables once more and created "PerCofAPE". (This field was meant to display the result of the percentage equation I mentioned previously.)
I executed the arcpy.SpatialJoin_analysis() function on all three of my clipped disturbance polygons (combining each with my APE polygon) and called the outputs "OFC1_join.shp", "OFC2_join.shp", and "OFC3_join.shp" respectively. I ran another arcpy.AddField_management() operation to add the "PerCofAPE" column to each of the joined polygons as a Double, then saved and ran my script. When I opened ArcMap, the created polygons were in their correct place, and their attribute tables contained the individual columns with each of the desired numbers as well as the new, blank "PerCofAPE" field.
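The join-and-add-field step, sketched for the first disturbance (the other two follow the same arcpy.Exists() pattern):

```python
fieldName3 = "PerCofAPE"   # each disturbance's percentage of the APE

# The spatial join pulls the APE's "APE_Area" column into the same
# attribute table as the disturbance's own "Area" column
arcpy.SpatialJoin_analysis(clipped_polygon1, clip_features,
                           "OFC1_join.shp")

# Add the (still empty) percentage column to the joined shapefile
arcpy.AddField_management("OFC1_join.shp", fieldName3, "DOUBLE",
                          field_precision=15, field_scale=10)
```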

The last (and most difficult) operation in my script was to populate the "PerCofAPE" field with the result of the percentage equation. (As mentioned above, this would be ("Area")/("APE_Area")*100.) I figured that the easiest way to do this was to simply perform a field calculation on all of the joined shapes using the arcpy.CalculateField_management() function. It looked as though the equation would yield a result (in the form of a decimal, for now), but no matter what I did, I received an error message stating that the field in question was not nullable. This was extremely confusing, as I had made sure that all fields created throughout the script were designated as nullable. I spent a long time trying to troubleshoot the problem, but ultimately was not able to figure it out in time. The last remaining step in the script was supposed to simply print these percentages in Python in a legible manner. Unfortunately, no matter what I tried, the "PerCofAPE" field remained blank and not editable.
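For reference, the first call below is the kind of field calculation I was attempting; the arcpy.da.UpdateCursor sketch underneath is an alternative I did not get to test, which writes each row directly and sidesteps the field-calculator expression entirely:

```python
# The field calculation that kept failing with the "not nullable" error
arcpy.CalculateField_management("OFC1_join.shp", fieldName3,
                                "!Area! / !APE_Area! * 100", "PYTHON_9.3")

# Untested alternative: compute the percentage row by row
with arcpy.da.UpdateCursor("OFC1_join.shp",
                           ["Area", "APE_Area", "PerCofAPE"]) as cursor:
    for row in cursor:
        if row[1]:  # guard against a null or zero APE acreage
            row[2] = row[0] / row[1] * 100
            cursor.updateRow(row)
```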
Ultimately, this script proved to be a lot more challenging and time-consuming than I had anticipated. I had previously thought that simple calculations would be easy to figure out and troubleshoot, but in this particular case it proved to be too much for me. I am proud that I was able to achieve most of what I set out to do, and had I been able to get past this particular issue, I think the script would have worked well. I had also toyed with the idea of setting the script up as a user-friendly tool with parameters that could be set directly in ArcMap (as we did in our later modules). Unfortunately this was impossible, as I had gotten stuck on a crucial part of the script and could not move on to the final steps.


Thursday, August 9, 2018

GIS5265 - GIS Applications for Archaeology - Final Project


Camino de Mulas was a long-distance trade route stretching from the general location of San José, Costa Rica, down to David, Panama. The route was operational from around 1601 to 1730 CE, and was eventually destroyed by the building of the Pan-American Highway. While parts of it may still remain, the exact location of the route is no longer known. Historical sources provide us with a list of known locations that intersected the route at the height of its activity, and these can be used to recreate the possible location of the route by performing a series of Least Cost Path analyses in ArcGIS. While there are many unknown factors that may have affected the exact direction of the route, a good estimate of its location can be generated using basic known information and a little common-sense speculation regarding the current nature of the landscape.
I began my process by creating individual point shapefiles, placing them at the coordinates given to us in the lab document, and labeling each location with its provided name. This step gave me an idea of the size of the area as a whole, which would in turn help me locate an appropriate DEM raster. After retrieving a large DEM from the Earth Explorer website, I used the clip tool to decrease the size of the raster to reach just above the northernmost location and just below the southernmost location. Doing this would decrease processing times for all tool operations that would need to be run when finally creating a weighted overlay raster. Because the final result would require more data than elevation alone, I took time to track down a land classification shapefile for the general region of Central America. I clipped the shapefile and, after taking some time to classify the land cover categories by estimated difficulty of travel in the attribute table, used the Feature to Raster tool to convert it to a workable raster.
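In arcpy terms (I ran these from the toolboxes rather than as a script, so the file names and difficulty field below are placeholders), the two preparation steps amount to:

```python
import arcpy

# Clip the downloaded DEM to an extent just beyond the northernmost
# and southernmost stops ("study_extent.shp" is a placeholder template)
arcpy.Clip_management("earthexplorer_dem.tif", "#", "dem_clip.tif",
                      "study_extent.shp", "#", "ClippingGeometry")

# Convert the classified land cover polygons to a raster, using the
# travel-difficulty field added to the attribute table
# ("DIFFICULTY" and the 30 m cell size are assumptions)
arcpy.FeatureToRaster_conversion("landcover_clip.shp", "DIFFICULTY",
                                 "landcover_ras.tif", 30)
```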
Once I had two workable rasters, I began my work to create a weighted overlay raster. I started by creating a new toolbox and, within it, a new model. Much like in the first part of the final project, I went into the model's properties and replicated the values set up in the previous step (Workspace, Raster Analysis, and Processing Extent). I dragged in the Slope tool and used the provided standard (0.00000912) for my Z factor. The input raster was, of course, the DEM downloaded from Earth Explorer, and the output raster was set to be created in a whole new folder designated for the second part of the project. I added the Reclassify tool to the model and connected the two operations together, recreating the same setup for my slope raster as in the first part of the lab (10 grade categories).
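The scripted equivalent of those two model steps would look roughly like this; the class breaks in the remap are illustrative, not my exact values:

```python
import arcpy
from arcpy.sa import Slope, Reclassify, RemapRange

arcpy.CheckOutExtension("Spatial")
arcpy.env.workspace = r"C:\GISApps\CaminoDeMulas\Part2"  # placeholder

# Slope from the clipped Earth Explorer DEM, with the provided Z factor
slope = Slope("dem_clip.tif", "DEGREE", 0.00000912)

# Reclassify the slope into 10 grade categories (breaks are examples)
reclass = Reclassify(slope, "VALUE",
                     RemapRange([[0, 2, 1], [2, 4, 2], [4, 6, 3],
                                 [6, 8, 4], [8, 10, 5], [10, 15, 6],
                                 [15, 20, 7], [20, 30, 8], [30, 45, 9],
                                 [45, 90, 10]]))
reclass.save("reclass_slope.tif")
```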
After successfully running the tool, my Reclass raster was created. Now it was time to combine the factors from both rasters into a Weighted Overlay. I dragged the Weighted Overlay tool into my model, set the evaluation scale from 1 to 10 by 1 as previously instructed, and aligned the evaluation scales so that they matched up (this time I set the NoValue field to 9 instead of 1, in an attempt to give water more weight). I added the Land Cover raster and weighted the previously defined classes as follows: easy – 1, medium – 4, hard – 9. I set the ratio of influence at 80/20, favoring the Reclass raster, and ran the tool.
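The same overlay can be expressed with the Spatial Analyst WOTable; the raster names and the land cover class values (1, 2, 3 standing for easy, medium, hard) are assumptions filling in for my model settings:

```python
import arcpy
from arcpy.sa import WeightedOverlay, WOTable, RemapValue

arcpy.CheckOutExtension("Spatial")

# 80/20 influence favoring the reclassified slope; evaluation scale
# runs from 1 to 10 by 1, with NoData pushed to 9 to penalize water
wo_table = WOTable([["reclass_slope.tif", 80, "VALUE",
                     RemapValue([[1, 1], [2, 2], [3, 3], [4, 4],
                                 [5, 5], [6, 6], [7, 7], [8, 8],
                                 [9, 9], [10, 10], ["NODATA", 9]])],
                    ["landcover_ras.tif", 20, "VALUE",
                     RemapValue([[1, 1], [2, 4], [3, 9],
                                 ["NODATA", 9]])]],
                   [1, 10, 1])
overlay = WeightedOverlay(wo_table)
overlay.save("weighted_80_20.tif")
```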
As I examined my overlay product, I noticed that the entire water region showed up within the easy-to-medium range. This could be problematic when predicting the least cost path: the lab instructions specify that Camino de Mulas was a mule route, and it should not include any sort of water travel. I decided to retrace my steps and give the Land Cover raster more weight in order to contrast the results at the end. I made another weighted overlay raster, this time favoring the Reclass raster only 60/40, to compare alongside the first raster. My hope was that giving more weight to the land cover (which did not contain any part of the ocean) would help divert the least cost route away from the water, for more accurate results.
I moved on to the next step and took the time to create all of the least cost paths between the points. Because using a single model gave me many problems in the previous part of the lab, I decided to calculate each Cost Distance and Least Cost Path individually. This is where I renamed my location point files by adding an order number after each name, which made it easier to work in sequence. I used the Cost Distance tool on my origin point and created a new backlink raster, then used those products to calculate the least cost path between the first two points. I followed these steps for each individual path, for both of my weighted rasters, until the entirety of the Camino de Mulas was mapped.
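Each leg of the route boils down to a Cost Distance / Cost Path pair; the file names here are placeholders for my numbered stop shapefiles:

```python
import arcpy
from arcpy.sa import CostDistance, CostPath

arcpy.CheckOutExtension("Spatial")

# Cost surface = one of the two weighted overlay rasters
cost_surface = "weighted_60_40.tif"

# Accumulated-cost and backlink rasters out from the origin stop
cost_dist = CostDistance("stop_01.shp", cost_surface,
                         out_backlink_raster="backlink_01.tif")
cost_dist.save("costdist_01.tif")

# Least cost path from stop 1 to stop 2; repeated for every leg
path = CostPath("stop_02.shp", "costdist_01.tif", "backlink_01.tif",
                "BEST_SINGLE")
path.save("path_01_02.tif")
```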
At the very end, I created two new line features and traced over the least cost paths in order to make them more legible at a large scale. I added a shapefile of Central America's road systems to account for any overlap, created a new data frame, and set up my map layout.
As I examined the contrast between the two different weighted rasters, it seemed that the 60/40 raster is likely the more accurate of the two. I conclude this based on the direction the path takes around the area of Paso Real, Boruca, and Palmar Sur in the 80/20 weighted raster. After approaching this area along the coastline (presumably weighed down by the water), the path cuts across Palmar Sur, moves north to Boruca, then to Paso Real, and cuts back to Palmar Sur. This seems counterproductive, and the historical document provided clearly states that the railway section near Palmar Sur ought to be the third stop of the above-mentioned locations. The least cost paths generated using the 60/40 weighted overlay raster seem to provide a much more reasonable vision of the direction that Camino de Mulas most likely took.


Sunday, August 5, 2018

GIS5103 - GIS Programming - Module 11

From the very start of this course, the very idea of coding and writing scripts was a bit intimidating. I had not taken a computer-related class since sixth grade, and though I enjoyed doing GIS work, I did not think I would be able to grasp the finer workings of its programs. However, after completing some three or four labs, I began to feel that working with Python was coming more and more easily. In a way, it was a little bit like learning another language: the more one uses the language, the better one's grammar and understanding become.
Overall, I really love the fact that I was able to dive into the "logic" of the tools used in ArcGIS. Knowing common Python commands, it is now easier to understand tools I had never used before. Knowing how many of them are constructed allows me to visualize how pre-made tools may be scripted, and how they may be modified into custom tools when needed. With a better understanding of code and scripting, it truly feels like there are very few limitations when using ArcMap.
The most useful exercises for me were the tool-creation modules at the end of the semester. From my personal experience, there are many people within the fields of archaeology and environmental science who are not classically trained GIS users, but who use ArcGIS either way to perform simple tasks related to their work. Being able to create and modify script tools allows me to help these GIS users in a way that is useful to them, but does not require them to spend a lot of time training.

Wednesday, August 1, 2018

GIS5103 - GIS Programming - Module 10

This week's lab focused on creating a custom tool based on an already existing script. This was done mostly within ArcCatalog, in a blank ArcMap project. The necessary parameters could be set by pulling up the script properties within the correct toolbox.



The script had to be modified in order to make the tool compatible with whatever inputs the GIS user chooses each time the tool is run. All file paths and feature classes have to be generalized and defined as tool parameters, read within the script using the arcpy.GetParameterAsText() function.
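A generic sketch of that pattern (this shows the general idea rather than the module script itself; the parameter indices match the order the parameters are defined on the tool's properties, starting at 0):

```python
import arcpy

# Hard-coded paths are replaced with parameters the user fills in
# on the tool dialog
input_features = arcpy.GetParameterAsText(0)
clip_boundary = arcpy.GetParameterAsText(1)
output_fc = arcpy.GetParameterAsText(2)

arcpy.Clip_analysis(input_features, clip_boundary, output_fc)

# arcpy.AddMessage() replaces print so output appears in the
# tool's progress dialog
arcpy.AddMessage("Clip complete: " + output_fc)
```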