Dasymetric Mapping and Areal Interpolation
Details of the Technique
(Mennis, J. and Hultgren, T., 2006.
Intelligent dasymetric mapping and its
application to areal interpolation. Cartography and
Geographic Information Science, 33(3): 179-194.)
Intelligent dasymetric mapping (IDM) takes as input
count data mapped to a set of source zones and a categorical
ancillary data set, and redistributes the data to a set of target
zones formed from the intersection of the source and ancillary
zones. Data are redistributed based on a combination of areal
weighting and the relative densities of ancillary classes. Consider
a source zone s and an ancillary zone z where z
is associated with ancillary class c. Target zone t
is defined as an area of overlap of s and z. The
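estimated count for a given target zone is calculated as

$$\hat{y}_t = y_s \, \frac{A_t \hat{D}_c}{\sum_{t \in s} A_t \hat{D}_c} \tag{1}$$

where $\hat{y}_t$ is the estimated count of target zone t, $y_s$ is the
count of source zone s, $A_t$ is the area of target zone t, the sum
runs over all target zones within s, and $\hat{D}_c$ is
the estimated density of ancillary class c.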
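The redistribution step can be sketched in a few lines of Python. This is a minimal illustration, assuming the class densities are already known. It weights each target zone by its area times the density of its ancillary class and normalizes within the source zone. The function name `redistribute`, the tuple layout, and the numbers are hypothetical and are not taken from the paper or from any IDM software.

```python
def redistribute(source_count, targets, density):
    """Split one source zone's count among its target zones (Equation 1).

    targets: list of (target_id, area, ancillary_class) tuples, all lying
             inside the same source zone (hypothetical layout).
    density: dict mapping ancillary class -> estimated density.
    """
    # Weight of each target zone: its area times the density of its class.
    weights = {tid: area * density[cls] for tid, area, cls in targets}
    total = sum(weights.values())
    if total == 0:
        # Every class in this source zone has zero density: nothing to weight by.
        raise ValueError("no inhabited target zone in this source zone")
    return {tid: source_count * w / total for tid, w in weights.items()}


# Illustrative numbers only: a source zone with a count of 1000 that overlaps
# an urban target zone (area 2) and a forest target zone (area 8).
density = {"urban": 400.0, "forest": 25.0}
targets = [("t1", 2.0, "urban"), ("t2", 8.0, "forest")]
estimates = redistribute(1000, targets, density)
```

Because the weights are normalized within each source zone, the target zone estimates sum to the source zone's original count.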
The value of $\hat{D}_c$ may
be set by the analyst if the analyst has a priori knowledge of the
density value for that class. Alternatively, the analyst may choose to derive
the data density for any ancillary class by sampling a subset of the
total source zones that may be associated with that ancillary
class. The analyst has three options for the sampling method
employed. The ‘containment’ method selects those source zones that
are wholly contained within an individual ancillary class. The
‘centroid’ method selects those source zones that have their
centroids contained within an individual ancillary class. The
‘percent cover’ method allows the user to set a threshold percentage
value and then selects those source zones whose area of occupation
by a single ancillary class is equal to or exceeds that threshold.
Once a sample of source zones has been selected as representative of
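a particular ancillary class, $\hat{D}_c$ may
be calculated as

$$\hat{D}_c = \frac{\sum_{s=1}^{m} y_s}{\sum_{s=1}^{m} A_s} \tag{2}$$

where $y_s$ and $A_s$ are the count and area of sampled source zone s,
and m is the number of sampled
source zones associated with ancillary class c.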
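The three sampling options and the density calculation can be illustrated as follows. This is a sketch under stated assumptions: each source zone is a plain dictionary whose `cover` entry (fraction of the zone's area occupied by each class) and `centroid_class` entry are taken as precomputed by a GIS overlay, and Equation 2 is read as total sampled count divided by total sampled area. All names, keys, and values are hypothetical.

```python
def select_samples(zones, cls, method="percent_cover", threshold=0.8):
    """Return the source zones taken as representative of ancillary class `cls`.

    Each zone is a dict with hypothetical keys:
      'count', 'area', 'cover' (class -> fraction of the zone's area),
      'centroid_class' (class found at the zone's centroid).
    """
    if method == "containment":
        # Zone lies wholly within the class.
        return [z for z in zones if z["cover"].get(cls, 0.0) >= 1.0]
    if method == "centroid":
        # Zone's centroid falls within the class.
        return [z for z in zones if z["centroid_class"] == cls]
    if method == "percent_cover":
        # Class occupies at least `threshold` of the zone's area.
        return [z for z in zones if z["cover"].get(cls, 0.0) >= threshold]
    raise ValueError(f"unknown sampling method: {method}")


def class_density(samples):
    """Equation 2: total sampled count divided by total sampled area."""
    total_area = sum(z["area"] for z in samples)
    if total_area == 0:
        return None  # the class went unsampled
    return sum(z["count"] for z in samples) / total_area


zones = [
    {"count": 900, "area": 2.0, "cover": {"urban": 1.0},
     "centroid_class": "urban"},
    {"count": 700, "area": 2.0, "cover": {"urban": 0.85, "forest": 0.15},
     "centroid_class": "urban"},
    {"count": 60, "area": 4.0, "cover": {"urban": 0.1, "forest": 0.9},
     "centroid_class": "forest"},
]
d_contained = class_density(select_samples(zones, "urban", "containment"))
d_cover = class_density(select_samples(zones, "urban", "percent_cover", 0.8))
d_forest = class_density(select_samples(zones, "forest", "containment"))
```

In this toy data the containment method samples only the first zone for the urban class, while an 80 percent cover threshold also admits the second. No zone is wholly forest, so the forest class goes unsampled under containment and `class_density` returns `None`.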
Note that even when
an analyst chooses to derive the density of most of the ancillary
classes by sampling, there may be one or two ancillary classes to
which the analyst knows that no data should be distributed. In the
case where one or more ancillary classes are assigned a data density
of zero by the analyst, the term $A_t$ in
Equation 1 refers only to the areas of target zones associated with
ancillary classes that are inhabited, i.e. for which a data density
of zero has not been enforced by the analyst. Likewise, the term
$\hat{D}_c$ in
Equation 2 refers only to the densities of ancillary classes that
are inhabited. In addition, the term $A_s$ in
Equation 2 is replaced by the area of the source zone occupied by
inhabited ancillary classes.
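The effect of the inhabited-area adjustment on the sampled density can be seen in a short Python sketch. It assumes each sampled source zone records the fraction of its area covered by each ancillary class. The function names, dictionary keys, and values are hypothetical.

```python
def inhabited_area(zone, zero_classes):
    """Area of a source zone not covered by classes preset to zero density."""
    excluded = sum(frac for cls, frac in zone["cover"].items()
                   if cls in zero_classes)
    return zone["area"] * (1.0 - excluded)


def class_density_inhabited(samples, zero_classes):
    """Sampled density using inhabited area in place of total zone area."""
    area = sum(inhabited_area(z, zero_classes) for z in samples)
    if area == 0:
        return None
    return sum(z["count"] for z in samples) / area


samples = [
    {"count": 600, "area": 4.0, "cover": {"urban": 0.75, "water": 0.25}},
    {"count": 300, "area": 1.0, "cover": {"urban": 1.0}},
]
d_urban = class_density_inhabited(samples, zero_classes={"water"})
d_naive = sum(z["count"] for z in samples) / sum(z["area"] for z in samples)
```

Excluding the water-covered quarter of the first zone raises the urban density estimate relative to the naive calculation over total area, because the same count is spread over a smaller inhabited area.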
To account for
spatial variation in the relationship between data density and
ancillary class, IDM can incorporate an additional data set of
region zones, where the data density for each ancillary class is
calculated separately for each individual region. There is also the
possibility that a particular ancillary class may go unsampled,
which can occur with either the containment or the percent cover
sampling method. In this case, the unsampled class’s density is
estimated using ‘refined’ areal weighting. First, the count
assigned to each target zone associated with an unsampled class is
estimated based on the previously estimated densities of the other
ancillary classes that occupy that target zone’s host source zone.
For instance, consider a source zone that overlaps multiple
ancillary zones. Some ancillary zones are associated with an
ancillary class that has gone unsampled, denoted ancillary class
u, whose density estimate is therefore unknown. The other
ancillary zones are associated with an ancillary class whose density
estimate is known, denoted ancillary class k, because it was
derived from sampling or assigned a preset density value by the
analyst. The count of a target zone associated with u is
calculated as

$$\hat{y}_{tu} = \left( y_s - \sum_{t \in s} \hat{D}_k A_{tk} \right) \frac{A_{tu}}{\sum_{t \in s} A_{tu}} \tag{3}$$

where $\hat{y}_{tu}$ is
the estimated count of the target zone associated with u,
$y_s$ is the count of the host source zone s, $\hat{D}_k$ is
the estimated density of k, $A_{tk}$ is
the area of the target zone associated with k, and $A_{tu}$ is
the area of the target zone associated with u. Note that $\hat{y}_{tu}$ is
a temporary estimate, used only to estimate the density of the
ancillary class whose density estimate is unknown, and is not the
‘final’ estimated count for that target zone. Once the value of
$\hat{y}_{tu}$ is
found, the estimated density of ancillary class u can be
calculated using the formula

$$\hat{D}_u = \frac{\sum_{t=1}^{p} \hat{y}_{tu}}{\sum_{t=1}^{p} A_{tu}} \tag{4}$$

where $\hat{D}_u$ is
the estimated density of u and p is the number of
target zones in the entire data set associated with u.
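The two-step estimate for an unsampled class can be sketched as below. This is an illustration under several assumptions, not the authors' implementation. It assumes exactly one unsampled class, it assumes that the remainder of a source zone's count is shared among that zone's unsampled target zones in proportion to area, and it clamps a negative remainder to zero. The clamp is a choice made for this sketch only. The function name and data layout are hypothetical.

```python
def unsampled_density(source_zones, density, unsampled):
    """Estimate the density of class `unsampled` (Equations 3 and 4).

    source_zones: list of dicts with 'count' and 'targets', where 'targets'
                  is a list of (area, ancillary_class) tuples
                  (hypothetical layout).
    density: known densities for every class other than `unsampled`.
    """
    total_count = 0.0
    total_area = 0.0
    for zone in source_zones:
        u_areas = [a for a, c in zone["targets"] if c == unsampled]
        if not u_areas:
            continue  # this source zone contains no unsampled target zone
        # Count already accounted for by classes whose density is known.
        known = sum(a * density[c] for a, c in zone["targets"]
                    if c != unsampled)
        # Remainder attributed to the unsampled class.
        # Clamping at zero is an assumption of this sketch.
        remainder = max(zone["count"] - known, 0.0)
        for a in u_areas:
            # Temporary count for each unsampled target zone (Equation 3).
            total_count += remainder * a / sum(u_areas)
            total_area += a
    # Density over all unsampled target zones in the data set (Equation 4).
    return total_count / total_area if total_area else None


source_zones = [
    {"count": 500, "targets": [(1.0, "urban"), (2.0, "wetland")]},
    {"count": 130, "targets": [(0.25, "urban"), (3.0, "wetland")]},
]
d_wetland = unsampled_density(source_zones, {"urban": 400.0}, "wetland")
```

In the example, the urban class accounts for 400 and 100 of the two source counts, leaving 100 and 30 for the wetland target zones, whose combined area is 5.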