Rick's GIS Portfolio: Special Topics in GIS - Module 3 - Data Quality

Wednesday, September 17, 2025

Special Topics in GIS - Module 3 - Data Quality - Assessment

The third module in Special Topics in GIS continued the focus on spatial data quality. This week’s task was to assess the quality of road networks. We were asked to apply methodologies similar to those introduced in the assigned readings, such as Haklay (2010), "How Good is Volunteered Geographical Information? A Comparative Study of OpenStreetMap and Ordnance Survey Datasets." The assignment consisted of a completeness assessment using two road centerline datasets: one compiled by the Jackson County GIS department and the other from TIGER (2000).

The analysis began by calculating the total length of each dataset for the entire county. To ensure valid length comparisons, both road layers were first projected into the same coordinate system with meter units.
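As a rough illustration of why the projection matters, here is a minimal sketch in plain Python (not the ArcGIS Pro workflow I actually used). It assumes each road is a list of (x, y) vertices already in a projected coordinate system with meter units, so that straight-line distances between vertices are real ground lengths. The function names and the sample roads are made up for the example.

```python
import math

def polyline_length(coords):
    """Sum of straight-line distances between consecutive vertices (meters)."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def total_length_km(polylines):
    """Total length of a road layer in kilometers."""
    return sum(polyline_length(p) for p in polylines) / 1000.0

# Hypothetical roads in projected meters, not taken from either dataset.
sample_roads = [
    [(0.0, 0.0), (3000.0, 4000.0)],                 # one segment, 5 km
    [(0.0, 0.0), (1000.0, 0.0), (1000.0, 2000.0)],  # two segments, 3 km
]

print(total_length_km(sample_roads))
```

If the vertices were still in degrees of latitude and longitude, the same arithmetic would produce numbers with no consistent ground meaning, which is why both layers need a common projected system first.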

Next, both road layers were intersected with a provided grid that divided the county into 5 × 5 meter cells, using the Pairwise Intersect tool. This step ensured that road segments were split at grid boundaries and attributed to the correct cells. The total road length for each cell was then calculated for both datasets using the Summary Statistics tool.
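The idea behind the intersect-then-summarize step can be sketched in plain Python. This is not what Pairwise Intersect or Summary Statistics do internally; it is a simplified stand-in that assumes a square grid with its origin at (0, 0), roads as lists of (x, y) vertices in projected units, and a hypothetical `cell_size` parameter. Each segment is cut wherever it crosses a grid line, and each piece is credited to the cell that contains its midpoint.

```python
import math
from collections import defaultdict

def length_by_cell(polylines, cell_size):
    """Return {(col, row): road length}, splitting segments at grid lines."""
    totals = defaultdict(float)
    for coords in polylines:
        for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
            dx, dy = x1 - x0, y1 - y0
            seg_len = math.hypot(dx, dy)
            if seg_len == 0:
                continue
            # Collect parameters t in [0, 1] where the segment crosses a grid line.
            cuts = {0.0, 1.0}
            for start, delta in ((x0, dx), (y0, dy)):
                if delta == 0:
                    continue  # segment runs parallel to these grid lines
                lo, hi = sorted((start, start + delta))
                k = math.ceil(lo / cell_size)
                while k * cell_size < hi:
                    t = (k * cell_size - start) / delta
                    if 0.0 < t < 1.0:
                        cuts.add(t)
                    k += 1
            ts = sorted(cuts)
            for t0, t1 in zip(ts, ts[1:]):
                tm = (t0 + t1) / 2  # the piece's midpoint decides its cell
                col = math.floor((x0 + tm * dx) / cell_size)
                row = math.floor((y0 + tm * dy) / cell_size)
                totals[(col, row)] += (t1 - t0) * seg_len
    return dict(totals)

# Hypothetical road crossing three cells of a grid with cell_size = 100.
per_cell = length_by_cell([[(50.0, 20.0), (250.0, 20.0)]], cell_size=100.0)
print(per_cell)
```

Without the split, a road that crosses a cell boundary would have its whole length credited to a single cell, which would distort the per-cell comparison.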

The results were then joined back to the grid layer so that each cell contained length values from both datasets. A percent difference field was added, and a formula was applied to compute relative completeness. This allowed me to count the cells in which the Jackson County centerlines had more total length than the TIGER centerlines, and vice versa.

Finally, a choropleth map was created to display spatial patterns of completeness. Symbology highlighted cells where one dataset contained more road length than the other, with a neutral class for cells where the two lengths were nearly equal.

The visualization illustrates the percent difference in road length between the TIGER roads dataset and the county street centerlines. The calculation was based on the formula:

% difference = (total length of centerlines − total length of TIGER Roads) / (total length of centerlines) × 100%
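Here is a small sketch of the formula applied per cell, again in plain Python rather than the field calculator. The function names and sample lengths are hypothetical. One assumption worth flagging: the formula is undefined when a cell has no county centerlines, so the sketch returns `None` in that case instead of dividing by zero.

```python
def percent_difference(centerline_len, tiger_len):
    """(centerlines - TIGER) / centerlines * 100, or None if undefined."""
    if centerline_len == 0:
        return None  # assumption: treat cells with no county roads as undefined
    return (centerline_len - tiger_len) / centerline_len * 100.0

def count_more_complete(cells):
    """Count cells by which dataset has more total road length."""
    counts = {"county": 0, "tiger": 0, "equal": 0}
    for centerline_len, tiger_len in cells:
        if centerline_len > tiger_len:
            counts["county"] += 1
        elif tiger_len > centerline_len:
            counts["tiger"] += 1
        else:
            counts["equal"] += 1
    return counts

# Hypothetical (centerline length, TIGER length) pairs, one per grid cell.
sample_cells = [(1200.0, 900.0), (800.0, 1000.0), (500.0, 500.0), (0.0, 300.0)]

print([percent_difference(c, t) for c, t in sample_cells])
print(count_more_complete(sample_cells))
```

With this sign convention, a positive value means the county centerlines have more length in that cell and a negative value means TIGER does.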

Cells where TIGER roads contained more total length than the county centerlines are symbolized in shades of red and orange, while cells where county centerlines were longer are shown in shades of green. This highlights areas where one dataset is more complete than the other.
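The color logic can be sketched as a simple classifier. The width of the neutral class is an assumption for illustration (`neutral_band=5.0` is a hypothetical threshold, not the class break used on my map), and the class labels are made up for the example.

```python
def classify(pct_difference, neutral_band=5.0):
    """Assign a map class from a percent difference value.

    neutral_band is a hypothetical threshold in percentage points.
    """
    if pct_difference is None:
        return "undefined"          # no county centerlines in the cell
    if pct_difference > neutral_band:
        return "county longer"      # green shades
    if pct_difference < -neutral_band:
        return "TIGER longer"       # red and orange shades
    return "near equal"             # neutral class

# Hypothetical percent difference values for four cells.
for value in (25.0, -25.0, 2.0, None):
    print(value, classify(value))
```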

Below is my final map layout:
