GIS 5007 Module 4: Data Classification

April 12, 2025

Module 4 of Computer Cartography focused on the ways data can be classified in cartography and how the choice of classification changes what a map communicates to its intended audience. For this exercise, we mapped the senior citizen population (age 65 and above) of Miami-Dade County, Florida, using 2010 data from the U.S. Census Bureau.

We made maps using the Equal Interval, Natural Breaks (Jenks), Quantile, and Standard Deviation classification methods. We aimed for a design that was visually balanced and easy to interpret in both the non-normalized and normalized maps. The color gradient shows the size of the senior citizen population: the lightest hue means the lowest concentration and the darkest hue the highest.

Equal Interval: This method divides the total data range by the number of classes, so every class covers an equal subrange. The class widths are uniform, but subtleties in the data can be lost: the breaks depend only on the low and high end points, not on how the values are distributed between them. In this lab, each class spanned 15.83 in the percentage map (Map 1, non-normalized) and 2639 in the map normalized by square mile (Map 2). Most of the data fell within the first class, 0-15.83 in Map 1 and 0-2639 in Map 2. As a result, most of each map is drawn in a single class, which makes it look as if few senior citizens live across a large area.
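To make the Equal Interval arithmetic concrete, here is a minimal Python sketch. The sample percentages, the choice of five classes, and the function names are illustrative assumptions, not the lab data or the GIS software's own routine.

```python
def equal_interval_breaks(values, num_classes):
    """Return the upper bound of each class for equal-width classes."""
    low, high = min(values), max(values)
    width = (high - low) / num_classes
    return [low + width * i for i in range(1, num_classes + 1)]


def class_index(value, breaks):
    """Return the index of the first class whose upper bound covers value."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1  # guard against floating-point rounding at the top


# Hypothetical tract percentages: most values are low, two are high.
sample = [0.0, 3.0, 5.0, 7.0, 9.0, 11.0, 12.0, 14.0, 15.0, 35.0, 80.0]
breaks = equal_interval_breaks(sample, 5)

counts = [0] * len(breaks)
for value in sample:
    counts[class_index(value, breaks)] += 1

print(breaks)  # upper bound of each class
print(counts)  # number of values that fall in each class
```

With these made-up values, nine of the eleven observations land in the first class, which mirrors the single dominant class seen on the Equal Interval maps.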

Quantile: This method sorts the data values and assigns the same number of values to each class. Depending on how the values are distributed, this can distort the map and what the data appears to say. Because the method spreads observations evenly across classes, class ranges can be very uneven, which conceals nuances in the data. It gives the appearance of high variety, but it does not represent the data accurately in this lab exercise: note the large data range covered by the dark green class (indicating the highest concentration of senior citizens) in both the non-normalized (Map 1) and normalized (Map 2) maps.
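The Quantile idea can be sketched in a few lines of Python. The sample values, the four classes, and the function name are assumptions for illustration, and GIS software may handle ties and uneven counts differently.

```python
def quantile_breaks(values, num_classes):
    """Return the largest value in each class when every class holds
    (as nearly as possible) the same number of values.

    Assumes len(values) >= num_classes.
    """
    ordered = sorted(values)
    n = len(ordered)
    breaks = []
    for i in range(1, num_classes + 1):
        last_index = (i * n) // num_classes - 1  # last value placed in class i
        breaks.append(ordered[last_index])
    return breaks


# Hypothetical values: twelve observations, two of them much larger than the rest.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 40, 80]
breaks = quantile_breaks(sample, 4)
print(breaks)
```

Each class holds three values here, but the top class stretches from 10 to 80 while the others each span only two units. That is the same effect as the wide dark green class on the Quantile maps.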

Standard Deviation: This method shows how much data values vary from the mean by setting class breaks at multiples of the standard deviation added to or subtracted from the mean. The classes nearest the mean cover most of the surface area on our maps (in this case, -0.50 to 0.50 std deviation in the non-normalized map, and < -0.50 std deviation in the normalized map). This method conceals variation near the mean but can reveal potential outliers in the data.
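As a rough sketch of the Standard Deviation method, the Python below converts each value to a z-score (its distance from the mean in standard deviations) and bins it at -1.5, -0.5, 0.5, and 1.5. The sample values, the break offsets, and the use of the population standard deviation are assumptions; the GIS software may compute these differently.

```python
import statistics


def z_scores(values):
    """Express each value as a number of standard deviations from the mean."""
    mean = statistics.mean(values)
    std_dev = statistics.pstdev(values)  # population standard deviation (assumed)
    return [(value - mean) / std_dev for value in values]


def std_dev_class(z, offsets=(-1.5, -0.5, 0.5, 1.5)):
    """Return the class index: 0 is below the lowest break, len(offsets) is above the highest."""
    for index, upper in enumerate(offsets):
        if z < upper:
            return index
    return len(offsets)


# Hypothetical values with mean 5 and standard deviation 2.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
scores = z_scores(sample)
classes = [std_dev_class(z) for z in scores]
print(scores)
print(classes)
```

Class 2 is the band from -0.5 to 0.5 standard deviations. Five of the eight made-up values fall in it, so values near the mean share one color while the value 9 stands alone in the top class.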

Natural Breaks (Jenks): This method is based on the data’s natural groupings. Similar values are grouped together so that differences between classes are as large as possible and differences within each class are as small as possible. This emphasizes where clusters of similar data values are located. Outliers can be concealed because they are grouped with the values nearest to them.
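The goal behind Natural Breaks can be illustrated with a brute-force search in Python. This is not the optimized Jenks algorithm that GIS software uses. It simply tries every way to split the sorted values into contiguous classes and keeps the split with the smallest total within-class squared deviation. The sample values and function names are assumptions, and the approach is only practical for very small lists.

```python
from itertools import combinations


def sum_squared_dev(group):
    """Sum of squared deviations of a group's values from the group mean."""
    mean = sum(group) / len(group)
    return sum((value - mean) ** 2 for value in group)


def natural_breaks(values, num_classes):
    """Return the contiguous grouping of sorted values with the lowest
    total within-class squared deviation (exhaustive search)."""
    ordered = sorted(values)
    n = len(ordered)
    best_cost, best_groups = None, None
    # Each combination picks the positions where one class ends and the next begins.
    for cuts in combinations(range(1, n), num_classes - 1):
        edges = (0, *cuts, n)
        groups = [ordered[a:b] for a, b in zip(edges, edges[1:])]
        cost = sum(sum_squared_dev(group) for group in groups)
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups


# Hypothetical values that form three obvious clusters.
sample = [1, 2, 3, 10, 11, 12, 30, 31, 32]
groups = natural_breaks(sample, 3)
print(groups)
```

For this made-up list the search puts the breaks in the gaps between the three clusters, which is the behavior the Natural Breaks description refers to.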

We created two maps, each with a data frame for every classification method listed above: one map using non-normalized data (Map 1, in this case percentage data) and one map using normalized data (Map 2, in this case the number of individuals per square mile of each census tract).
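The two measures mapped here come from simple arithmetic on each census tract, sketched below in Python. The tract records and field names are hypothetical and are not taken from the lab dataset.

```python
def percent_seniors(tract):
    """Share of the tract's population that is age 65 and above, in percent."""
    return 100.0 * tract["seniors"] / tract["population"]


def seniors_per_sq_mi(tract):
    """Number of residents age 65 and above per square mile of tract area."""
    return tract["seniors"] / tract["area_sq_mi"]


# Hypothetical tracts: same senior count, different population and area.
tracts = [
    {"name": "A", "seniors": 500, "population": 2000, "area_sq_mi": 0.5},
    {"name": "B", "seniors": 500, "population": 5000, "area_sq_mi": 5.0},
]

for tract in tracts:
    print(tract["name"], percent_seniors(tract), seniors_per_sq_mi(tract))
```

Both made-up tracts have 500 seniors, yet they differ in percentage (25 versus 10) and differ far more in density (1000 versus 100 per square mile), which is why the two maps can tell different stories.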

Map 1: Percentage of Population Age 65 And Above in Miami-Dade County, Florida (Census Tracts 2010) using non-normalized data.

Map 2: Individuals Age 65 And Above in Miami-Dade County, Florida (Census Tracts 2010) using normalized data - indicates senior citizen population by square mile area.

If this data were presented to an audience, it should be shown as the normalized map (seniors per square mile) classified with Natural Breaks. This makes it easier for a layperson to visualize and understand what the data represents. The percentage maps make it seem as if there are many more individuals age 65 and older within an area. The Quantile and Equal Interval methods place data according to the number of values and the number of classes chosen, which can mislead the audience. The Standard Deviation method is not the best display for this data because the audience would need some understanding of statistics to read the map correctly.

A. Starling's GIS Blog