8.7.3. Internal Geocoding

Geocoding is a complex function that depends on having the right data in place. Cartographica has a powerful geocoding capability, but to use it, you must first gather data. Most street data is accurate as of the last census update and may be more recent, depending on the government's efforts to modernize and improve the accuracy of the data, or on your commercial source. Keep in mind that if you are trying to geocode an old address, accuracy may suffer if the street data is too new, because street realignments, renamings, renumberings, and removals may have changed the landscape since the address was recorded.

Compared with the geocoding services, internal geocoding is much stricter in how it interprets data. You should expect to use highly regularized address data that matches the TIGER/Line files; for example, "100 A St SE" in DC will not match "100 A St" using TIGER/Line data.
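
Because matching is strict, it can help to regularize addresses before importing them. The following Python sketch is one possible way to split an address into the components that street layers typically store separately. It is not part of Cartographica; the function name parse_address and the abbreviation tables are hypothetical and would need to be extended to match your data.

```python
import re

# Hypothetical lookup tables; extend them to match your street layer.
STREET_TYPES = {"STREET": "St", "ST": "St", "AVENUE": "Ave", "AVE": "Ave",
                "ROAD": "Rd", "RD": "Rd"}
DIRECTIONS = {"N", "S", "E", "W", "NE", "NW", "SE", "SW"}

def parse_address(text):
    """Split a simple US street address into regularized components."""
    # Drop periods ("S.E." becomes "SE") and treat commas as separators.
    cleaned = re.sub(r"\.", "", text).replace(",", " ")
    tokens = cleaned.upper().split()
    parts = {"number": None, "prefix": "", "name": "", "type": "", "suffix": ""}
    if tokens and tokens[0].isdigit():
        parts["number"] = int(tokens.pop(0))
    if tokens and tokens[-1] in DIRECTIONS:
        parts["suffix"] = tokens.pop()
    if tokens and tokens[-1] in STREET_TYPES:
        parts["type"] = STREET_TYPES[tokens.pop()]
    # Only treat a leading direction as a prefix if a street name remains.
    if len(tokens) > 1 and tokens[0] in DIRECTIONS:
        parts["prefix"] = tokens.pop(0)
    parts["name"] = " ".join(tokens).title()
    return parts
```

With this sketch, "100 A Street, S.E." and "100 A St SE" produce the same components, while "100 A St" produces an empty suffix, which illustrates why the two forms in the example above are treated as different addresses.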

First, determine the area of your search. Generally, you should restrict geocoding to a small number of counties to reduce the amount of work your computer must do. However, if you must cover a large area, or if your area is undefined, there are geocoding services that can do bulk geocoding for a small fee.

Preparing to geocode with TIGER data

  1. Locate the TIGER files on the Census Bureau web site. First, go to the TIGER/Line Shapefiles page, then select the particular year you're interested in. To use these files, you will need at least the Edges layer for the area that you're working with.

  2. Locate the appropriate files on the server and download the files for the state, county, or counties that you are interested in using for geocoding. After unzipping the files, choose File  >  Import Vector Data and then select the edges.shp file for import. This will import the edge data for this area.

  3. If you need to geocode more than one census area, you may either merge the two layers into a single layer and use the new layer as the geocoding layer, or run the geocoder twice, once with each layer selected.

  4. The next step is to configure the geocoder: choose Tools  >  Geocoder Options. If you are using standard TIGER/Line shapefiles, Cartographica should set up the geocoder automatically. In most cases, both the layer and the field names are chosen for you. If you are using a file from a different source, or of a different vintage, you may need to set up the individual field names. At a minimum, Cartographica requires Numbers From and Numbers To (for either side of the street or both), the Street Name, and the Street Type (Rd, Ave, etc.). Each of the other entries (prefix and suffix, zip codes, state, and city) is optional but will increase accuracy. In the case of City or State, you may type in a value to match the City or State section of the address.

However, you are not required to use TIGER files for geocoding. All that is needed is a line layer carrying the minimum set of attributes described above. Any line layer can be selected as the Geocoding layer.
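
To illustrate why the Numbers From and Numbers To attributes matter, the following Python sketch shows address-range interpolation, a common technique for placing a house number along a street segment. This is an assumption about how such data is typically used, not a description of Cartographica's internal method; the function name interpolate_address and its parameters are hypothetical, and the segment is simplified to a straight line between two points.

```python
def interpolate_address(number, start, end, from_num, to_num):
    """Estimate the position of a house number along a straight segment.

    start and end are (x, y) tuples for the segment endpoints;
    from_num and to_num are the house numbers at those endpoints.
    Returns None when the number falls outside the segment's range.
    """
    low, high = min(from_num, to_num), max(from_num, to_num)
    if not low <= number <= high:
        return None
    if from_num == to_num:
        # A single-number range: place the point at the midpoint.
        fraction = 0.5
    else:
        fraction = (number - from_num) / (to_num - from_num)
    x = start[0] + fraction * (end[0] - start[0])
    y = start[1] + fraction * (end[1] - start[1])
    return (x, y)
```

For example, on a segment running from (0, 0) to (100, 0) with a range of 100 to 200, house number 150 would be placed halfway along the segment, while 250 would not match the segment at all.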

Configuring Internal Geocoding

  1. Choose Tools  >  Geocoder Options.

    The Geocoder Options window appears.

    Figure 8.7. Geocoder Options


  2. From the Geocode using layer menu, select the layer that has the attributed lines in it.

  3. Each of the geocoding fields in this dialog provides a more specific way to find the data.

    For each field in the geocoding layer that you want to match to addresses, select an appropriate Geocoding Fields entry and choose the corresponding field. Generally, the more fields that you use, the more precise the geocoding will be. However, there are times when avoiding certain data fields is appropriate due to inaccurate data or the format of the original addresses.

    The most important fields are Street Name, Street Type (ave, rd, street, etc.), and Numbers From and Numbers To on the left and right sides of the street (indicated by the left and right columns). If you have at least these, your data will probably geocode reasonably well.

    If you have Zip Codes for the left and right sides, those can help make things noticeably faster when working with large sets of lines.

  4. If you have data for only one state and the data does not contain a state column, type in the state abbreviation in the State field to automatically exclude any addresses not in that state.

  5. Once you have selected all of the geocoding options, click OK.
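
The note in step 3 about Zip Codes improving speed can be illustrated with a short Python sketch. The idea, assumed here for illustration and not confirmed as Cartographica's implementation, is that a ZIP code lets a geocoder examine only a small subset of the line segments. The field names zip_left and zip_right and both function names are hypothetical.

```python
from collections import defaultdict

def build_zip_index(segments):
    """Group segments by the ZIP codes on their left and right sides."""
    index = defaultdict(list)
    for seg in segments:
        # A set avoids adding a segment twice when both sides share a ZIP.
        for zip_code in {seg.get("zip_left"), seg.get("zip_right")}:
            if zip_code:
                index[zip_code].append(seg)
    return index

def candidates(index, segments, zip_code=None):
    """Return the segments worth checking for an address."""
    if zip_code and zip_code in index:
        return index[zip_code]
    # Without a usable ZIP code, every segment must be checked.
    return segments
```

With an index like this, an address that includes a ZIP code is compared against only the segments in that ZIP code, while an address without one falls back to a scan of the whole layer.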

Once you have set up geocoding, you can use it with the Import Tabular Data or Acquire Database Data features. Each of these has options to read the data as addresses and then geocode them.