Wednesday, October 15, 2025

Special Topics in GIS - Module 6 - Scale Effect and Spatial Data Aggregation


The sixth and final module in Special Topics in GIS covered the topics of scale effect and spatial data aggregation. Understanding how we represent geographic data is crucial in GIS. In this lab, we explored scale effects on vector data, resolution effects on raster data, and how these concepts connect to the issue of gerrymandering.

Vector data (points, lines, and polygons) can appear very different depending on the scale. At a small scale (zoomed out), features may look simplified or smoothed. At a larger scale (zoomed in), more detail is visible, which can reveal complexities or errors not seen before. This affects how spatial relationships are interpreted, and can even change analysis results when aggregating or comparing areas.

Raster data stores information in grid cells. Resolution refers to the size of these cells: high resolution means smaller cells and more detail, while low resolution uses larger cells, which may miss important features or patterns. Resolution impacts the accuracy of measurements like land cover, elevation, or temperature, especially when zooming or resampling.
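To make the resolution effect concrete, here is a small illustrative Python sketch (toy values, not lab data) that block-averages a fine grid into a coarser one; note how a single high-value cell gets diluted at the lower resolution:

```python
def coarsen(grid, factor=2):
    """Aggregate a square grid by averaging factor x factor blocks,
    mimicking a drop in raster resolution."""
    n = len(grid)
    out = []
    for i in range(0, n, factor):
        row = []
        for j in range(0, n, factor):
            block = [grid[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

fine = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],   # a single high-value cell (e.g., a small pond)
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
coarse = coarsen(fine)
print(coarse)  # the 9 is diluted to 2.25 in the coarser cell
```

The distinct feature survives only as a weaker average, which is exactly how small features or patterns can disappear when resolution is reduced.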

The final topic we explored was gerrymandering: the manipulation of political district boundaries to benefit a specific party or group. It often results in bizarrely shaped districts that dilute or over-concentrate voters, undermining fair representation. In this part of the lab, the Modifiable Areal Unit Problem (MAUP) was explored in the context of political districts. According to Esri, MAUP refers to a type of statistical bias that can arise during spatial analysis of aggregated data, where applying the same analysis to the same data yields different results depending on how the data are grouped or zoned.

One way to measure gerrymandering is through compactness. A common metric is the Polsby-Popper score, which compares a district's area to its perimeter (4π × area / perimeter²). A lower score suggests a less compact (and potentially gerrymandered) shape. Below is a screenshot of North Carolina's District 12, which had the lowest Polsby-Popper score in the continental U.S. according to my calculations. Its irregular shape suggests it may have been drawn with intent beyond geographic or community boundaries, a potential sign of gerrymandering.



By examining the geometry of voting districts and understanding how scale and resolution affect spatial data, we can better identify and challenge distortions in political representation.
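As a quick illustration, the Polsby-Popper score (4π·A/P²) can be sketched in a few lines of Python; the shapes and values below are hypothetical, not the district data from the lab:

```python
import math

def polsby_popper(area, perimeter):
    """Polsby-Popper compactness: 4*pi*A / P^2.
    A circle scores 1.0; scores near 0 indicate highly irregular shapes."""
    return 4 * math.pi * area / perimeter ** 2

# A unit circle (area pi, perimeter 2*pi) is perfectly compact:
print(round(polsby_popper(math.pi, 2 * math.pi), 3))       # 1.0
# A long, thin 10 x 0.1 rectangle scores far lower:
print(round(polsby_popper(10 * 0.1, 2 * (10 + 0.1)), 3))   # 0.031
```

In practice the area and perimeter would come from the district polygon geometry, but the scoring logic is this simple ratio.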

Overall, this lab was a great experience highlighting how the way we structure and visualize spatial data through scale, resolution, and boundary choices can deeply influence analysis and real-world outcomes. From population patterns to political fairness, understanding these effects is essential for responsible GIS work and informed decision-making.



Wednesday, October 1, 2025

Special Topics in GIS - Module 5 - Surfaces - Surface Interpolation

The fifth module in Special Topics in GIS introduced the topic of surface interpolation. In this lab, we explored the use of several interpolation techniques to create continuous surfaces of water quality across Tampa Bay. Interpolation is valuable because it allows us to estimate values between sampling points and better visualize spatial patterns. However, each technique approaches the problem differently and produces distinct results.

Thiessen polygons assign each location to the nearest sample point, which is simple to apply but results in abrupt boundaries that don’t reflect smooth changes in water quality. Inverse Distance Weighting (IDW) provides a more gradual surface, giving greater weight to nearby samples and reducing the blocky appearance of Thiessen. Spline goes further by fitting a smooth, curved surface through the data, producing a visually appealing result but sometimes creating unrealistic peaks or sinks in areas with clustered high values or sparse sampling. These differences highlight the importance of choosing an interpolation method that matches both the data characteristics and the purpose of the analysis. 
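To illustrate how IDW weights nearby samples more heavily, here is a minimal Python sketch; the sample points and values are hypothetical, not the Tampa Bay data:

```python
def idw(x, y, samples, power=2):
    """Inverse Distance Weighting estimate at (x, y) from
    samples given as (sx, sy, value) tuples."""
    num = den = 0.0
    for sx, sy, v in samples:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        if d2 == 0:
            return v            # exactly on a sample point
        w = 1.0 / d2 ** (power / 2)
        num += w * v
        den += w
    return num / den

# Hypothetical water-quality samples as (x, y, value):
samples = [(0, 0, 10.0), (4, 0, 20.0), (0, 4, 30.0)]
print(round(idw(2, 2, samples), 2))  # equidistant from all three -> 20.0
```

Raising the `power` parameter makes nearby points dominate more strongly, which is one way the same data can yield visibly different surfaces.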

Below is a screenshot of my results using the Spline Tension technique:




Overall, this exercise showed that while all interpolation methods can create useful surfaces, their assumptions and behaviors vary widely. Understanding these differences is key to interpreting the results and making informed choices for an analysis. This lab gave me a stronger understanding of how interpolation works and why the method you choose matters. It was interesting to see how the same water quality data could look so different depending on the approach.

Tuesday, September 23, 2025

Special Topics in GIS - Module 4 - Surfaces - TINs and DEMs

The fourth module in Special Topics in GIS introduced the topics of creating, editing, and analyzing TINs and DEMs.  The goal was to explore how different elevation data models represent terrain and how these representations can be used in spatial analysis.

One portion of this week's lab involved using a DEM to develop a 3D ski run suitability map. Three critical terrain variables were calculated: elevation, slope, and aspect. These layers were reclassified to reflect suitability for downhill skiing. For example, higher elevations scored more favorably, slopes between 20° and 45° were rated highly suitable, and north-facing aspects were given priority. The reclassified layers were then combined in a weighted overlay, where elevation was weighted most heavily, followed by slope and aspect.
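The weighted-overlay step can be sketched in a few lines of Python; the weights and cell scores below are illustrative assumptions, not the actual lab parameters:

```python
def weighted_overlay(scores, weights):
    """Combine reclassified suitability scores with weights that
    sum to 1.0, as in a per-cell raster weighted overlay."""
    return sum(s * w for s, w in zip(scores, weights))

# Hypothetical weights: elevation heaviest, then slope, then aspect.
weights = [0.5, 0.3, 0.2]
# One cell's reclassified scores for (elevation, slope, aspect):
cell = [9, 7, 4]
print(round(weighted_overlay(cell, weights), 2))  # 9*0.5 + 7*0.3 + 4*0.2 = 7.4
```

A real weighted overlay applies this same calculation to every cell across the reclassified rasters.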

The result was a ski run suitability map (shown below), which highlights the best areas for potential ski development. Areas in darker colors represent more favorable conditions, while lighter areas represent less suitable terrain. This exercise not only illustrated how elevation data can be modeled differently with TINs and DEMs, but also how those models support real-world decision making when combined with spatial analysis.




Another portion of the lab provided a point feature class that was used to create a TIN model. Contour lines (100 m) were then visualized by modifying the symbology. Next, using the Spline tool, the point feature class was used to create a set of DEM-based contour lines. The two sets of contour lines were then analyzed and compared. Below is a screenshot with the DEM-based contour lines depicted in blue and the TIN-based contour lines in grey.



Overall, I found this week's module very helpful for my understanding of elevation models. We had touched on some of these elevation models and tools in previous coursework, but being able to apply them to a real-world analysis was very beneficial.

Wednesday, September 17, 2025

Special Topics in GIS - Module 3 - Data Quality - Assessment

The third module in Special Topics in GIS continued the focus on spatial data quality. This week’s task was an exercise in assessing the quality of road networks. We were asked to apply similar methodologies to those introduced in the assigned readings, such as Haklay (2010), How Good is Volunteered Geographical Information? A Comparative Study of OpenStreetMap and Ordnance Survey Datasets. The assignment consisted of conducting an accuracy assessment for completeness using two road centerline datasets: one compiled by the Jackson County GIS department and the other from TIGER (2000).

The analysis began by calculating the total length of each dataset for the entire county. To ensure valid length comparisons, both road layers were first projected into the same coordinate system with meter units.

Next, using a provided grid that divided the county into 5 × 5 meter cells, the roads were intersected with the grid using the Pairwise Intersect tool. This step ensured that road segments were split at grid boundaries and attributed to the correct cells. The total road length for each cell was then calculated for both datasets using the Summary Statistics tool.

The results were then joined back to the grid layer so that each cell contained length values from both datasets. A percent difference field was added, and a formula was applied to compute relative completeness. This allowed me to identify and count how many cells favored the Jackson County centerlines versus the TIGER centerlines.

Finally, a choropleth map was created to display spatial patterns of completeness. Symbology highlighted cells where one dataset contained more road length than the other, as well as a neutral class for near-equal differences.

The visualization illustrates the percent difference in road length between the TIGER roads dataset and the county street centerlines. The calculation was based on the formula:

% π‘‘π‘–π‘“π‘“π‘’π‘Ÿπ‘’π‘›π‘π‘’ = (π‘‘π‘œπ‘‘π‘Žπ‘™ π‘™π‘’π‘›π‘”π‘‘β„Ž π‘œπ‘“ π‘π‘’π‘›π‘‘π‘’π‘Ÿπ‘™π‘–π‘›π‘’π‘  − π‘‘π‘œπ‘‘π‘Žπ‘™ π‘™π‘’π‘›π‘”π‘‘β„Ž π‘œπ‘“ 𝑇𝐼𝐺𝐸𝑅 π‘…π‘œπ‘Žπ‘‘π‘ )/(π‘‘π‘œπ‘‘π‘Žπ‘™ π‘™π‘’π‘›π‘”π‘‘β„Ž π‘œπ‘“ π‘π‘’π‘›π‘‘π‘’π‘Ÿπ‘™π‘–π‘›π‘’π‘ ) ×100%

Cells where TIGER roads contained more total length than the county centerlines are symbolized in shades of red and orange, while cells where county centerlines were longer are shown in shades of green. This highlights areas where one dataset is more complete than the other.

Below is my final map layout:



Tuesday, September 9, 2025

Special Topics in GIS - Module 2 - Data Quality - Standards

The second module in Special Topics in GIS focused on data quality standards with an exercise on determining the horizontal positional accuracy of two road networks in the city of Albuquerque, New Mexico. Our findings were to be reported in accordance with the National Standard for Spatial Data Accuracy (NSSDA). We were provided two polyline shapefiles that represented road centerlines from the city of Albuquerque and StreetMap USA as well as a mosaic of orthophotos of the study area.

The NSSDA standard requires at least 20 test points within the study area, with each point separated by at least one-tenth of the diagonal distance of the study area. The NSSDA value reports accuracy as the Root Mean Square Error (RMSE) scaled to the 95% confidence level. To help meet these requirements, I used the Split tool to divide the study area into four quadrants and then bookmarked each quadrant, which reduced the need to zoom in and out to check the spacing of my points.

The next step was to find "good" intersections that contained data from both the Albuquerque and StreetMap USA datasets. After identifying these intersections, it was time to determine what I judged to be the "true" reference points at each intersection using the orthophotos. Below is a screenshot showing my reference, or "true," locations of the intersections according to the orthophotos.




The next step was to use the Add XY Coordinates tool to determine X and Y coordinates for the points in all three datasets. I then exported the three attribute tables to Excel files using the Table To Excel tool. Following the NSSDA horizontal accuracy statistic worksheet, the X and Y coordinates of the independent (true) points from the orthophotos were compared to those of the test datasets, and calculations were completed to determine the accuracy statistics for the two test datasets. Below are the results from my worksheet for the StreetMap USA NSSDA value calculations:


The final column of the table shows the calculated squared error distance. The values are summed and then averaged. The NSSDA horizontal accuracy is calculated by multiplying the Root Mean Square Error (RMSE) by 1.7308. Below are my final accuracy statements for each of the two datasets.

Tested 13.01 ft (3.96 m) horizontal accuracy at 95% confidence level for the Albuquerque Streets data set.

Using the National Standard for Spatial Data Accuracy, the Albuquerque Streets data set tested to 13.01 feet (3.96 m) horizontal accuracy at 95% confidence level.

Tested 312.95 ft (95.38 m) horizontal accuracy at 95% confidence level for the Street Map USA data set.

Using the National Standard for Spatial Data Accuracy, the Street Map USA data set tested to 312.95 feet (95.38 m) horizontal accuracy at 95% confidence level.
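The RMSE-to-NSSDA conversion described above can be sketched in Python; the error distances below are made up for illustration, not taken from my worksheet:

```python
import math

def nssda_horizontal(errors):
    """NSSDA horizontal accuracy from radial error distances:
    RMSE of the errors multiplied by 1.7308 (95% confidence)."""
    rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))
    return 1.7308 * rmse

# Hypothetical test-point error distances in feet:
errors = [5.0, 7.5, 6.0, 9.0, 4.5]
print(round(nssda_horizontal(errors), 2))
```

A real assessment would use the 20+ squared error distances from the worksheet, but the final scaling step is exactly this multiplication.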



Tuesday, September 2, 2025

Special Topics in GIS - Module 1 - Calculating Metrics for Spatial Data Quality

The first module for Special Topics in GIS covered aspects of spatial data quality with a focus on defining and understanding the difference between precision and accuracy. According to the International Organization for Standardization's (ISO) document 3534-1, accuracy can be defined as the "closeness of agreement between a test result and the accepted reference value". This document also defines precision as the "closeness of agreement between independent test results obtained under stipulated conditions" (ISO, 2007).

In Part A of the lab assignment, the precision and accuracy metrics of provided data were determined. When determining precision, a distance (in meters) that accounts for 68% of the repeated observations was calculated.  When determining accuracy, the average waypoint was measured from an accepted reference point. Below is my map product showing projected waypoints, the average location, and circular buffers corresponding to 50%, 68%, and 95% precision estimates. A "true" reference point was later added to determine a horizontal distance to the established average waypoint location.



Horizontal accuracy refers to how close a measured GPS position (or the mean of many positions) is to the true location on the ground. It is typically reported as the distance between the GPS-derived position and a known reference point. 

Horizontal precision, on the other hand, describes how tightly repeated GPS measurements cluster together, regardless of whether they are centered on the true location. Precision is often expressed as the radius within which a certain percentage of positions (e.g., 68% or 95%) fall.
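The distinction can be sketched in Python; the GPS fixes and reference point below are hypothetical, not my lab data:

```python
import math

def horizontal_accuracy(mean_pt, ref_pt):
    """Distance from the average waypoint to the 'true' reference point."""
    return math.dist(mean_pt, ref_pt)

def horizontal_precision(points, mean_pt, pct=0.68):
    """Rough radius containing at least pct of the repeated
    observations around the mean position."""
    dists = sorted(math.dist(p, mean_pt) for p in points)
    return dists[math.ceil(pct * len(dists)) - 1]

# Hypothetical repeated GPS fixes (meters, local coordinates):
fixes = [(0, 1), (1, 0), (2, 2), (3, 1), (1, 3)]
mean_pt = (sum(x for x, _ in fixes) / 5, sum(y for _, y in fixes) / 5)
ref_pt = (0.0, 0.0)
print(round(horizontal_accuracy(mean_pt, ref_pt), 2))  # 1.98
```

Accuracy compares the mean to the reference point; precision looks only at how the fixes cluster around their own mean.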

My horizontal precision (68%) was 4.5 m and my horizontal accuracy was 3.25 m, a difference of 1.25 m. I would not consider this a significant difference because the accuracy distance sits within the 68% precision radius. For vertical accuracy, my mean waypoint elevation was 28.54 m, while the mean elevation of the "true" reference point was 22.58 m. This is roughly a 5.96 m difference, which I would consider significant, at least in some applications.


In Part B of the lab assignment, the RMSE metric was calculated, along with a cumulative distribution function (CDF). The CDF describes the probability of a random variable taking on a given value or less, showing the complete error distribution instead of a few selected metrics. For this portion we were provided another dataset and used Excel for the analysis. Here we calculated the minimum, maximum, mean, median, root mean square, and the 68th, 90th, and 95th percentiles. The final portion of the lab consisted of displaying the dataset using a cumulative distribution function (CDF) graph, which is displayed below.
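For illustration, an empirical CDF like the one graphed can be computed in a few lines of Python (hypothetical error values, not the lab dataset):

```python
def empirical_cdf(errors):
    """Empirical CDF: for each sorted error value, the fraction of
    observations less than or equal to it."""
    s = sorted(errors)
    n = len(s)
    return [(v, (i + 1) / n) for i, v in enumerate(s)]

# Hypothetical positional errors in meters:
errors = [2.1, 0.8, 3.5, 1.2, 2.9]
for value, prob in empirical_cdf(errors):
    print(f"{value:.1f} m -> {prob:.0%}")
```

Plotting these (value, probability) pairs produces the stepped curve seen in a CDF graph, and the percentile metrics can be read directly off it.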



Overall, I really learned a lot in this lab and had the opportunity to brush up on my Excel skills, which I have not used for a while. I am looking forward to building on what I learned in this module.



Wednesday, August 7, 2024

Applications in GIS - Module 6 - Suitability & Least Cost Analysis

In Module 6, we learned about Suitability and Least Cost Path Analysis. We were introduced to performing suitability analysis using both vector and raster analysis tools. We prepared our data for suitability analysis using different approaches, such as Boolean and scoring, and adjusted specific parameters using scoring and weighting. Additionally, we performed least-cost path and corridor analysis using cost surfaces.

In Scenario 2, our task was to perform a suitability analysis for a land developer. We analyzed several variables including proximity to roads, slope, proximity to rivers, and land cover type. These variables were reclassified and ranked based on the value of each cell in the raster. The raster layers were then combined using the overlay tool. Finally, we were required to create a map layout comparing the results from the two alternatives. Below is my final product.



