04 Making Data
Tutorial 4 focuses on approaches to making data. You will complete tutorial 4 in conjunction with Assignment 2.
For tutorial 4 you will select and complete one of the following three exercises. (You are of course welcome to complete more than one but only one is required.) We suggest reading assignment 2, considering what you are interested in working on for the assignment, then selecting the tutorial that is most relevant:
-
Option 1: Making Data from Archives which covers methods for georeferencing maps and creating vector data through tracing features (aka digitizing)
-
Option 2: Making Data from Field Observations which covers methods of creating vector data through a georeferenced survey instrument
-
Option 3: Making Data from Text/Addresses which covers methods of creating data through geocoding addresses or place names (see below)
Deliverables
- Option 1: Map composition showing of digitized features from an archival map, overlaid on the archival map
- Option 2: Map composition of the data collected through your survey that conveys one of the unique fields you collected data on
- Option 3: Map composition showing geocoded addresses for cellular antennas
Making Data from Text/Addresses
In this exercise you will learn the basic steps of geocoding, and how to execute them in QGIS.
Geocoding refers to methods of assigning geographic information to textual references to place, such as addresses or place names. This process involves matching textual data to reference data that contains geographic information.
The primary reference data you will typically use today comes from large scale mapping projects like Open Street Maps or Google Maps that have amassed large datasets of named spatial features. You may also encounter projects or examples where reference data had to be constructed or takes another form such as when working with historic information.
In this exercise you will geocode point locations of permits for work on cellular antennas in NYC using the Nominatim API. This is an open-source geocoding API which allows you to query the data collected and maintained by Open Street Map. You will use QGIS to access this service and as such you should be familiar with the usage policy.
Data
- Please download NYC Department of Buildings (DOB) records for requests to perform work on cellular antennas from NYC Open Data here.
- You may also want to download the metadata for this dataset which is contained in the attached excel file
Exercise
Adding data
Open a new QGIS project.
Add the antenna permits dataset to your project with Layer
>Add Layer
>Add Delimited Text Layer
. For the geometry definition select No Geometry (Attribute Table Only)
.
Once you have added this dataset take a look at the attribute table. Notice that there are several fields related to the address for each cellular antenna location but no coordinates corresponding to those addresses.
Constructing an address field
There is a field containing a numeric value for each borough (aka county) that the antenna is located in. In order to have a correctly formatted address to geocode we must create a new text field that has the borough name corresponding to each numeric value.
Use the Field Calculator
(icon looks like an abacus) to create a new field with county names that correspond to each borough code.
Note: the metadata for this antenna permits dataset contains a data dictionary for the borough codes.
The geocoding tool requires that you provide it a single field with a complete address for each record in the dataset. We will concatenate the fields that each contain a part of the antenna address into a single address field. Again use the Field Calculator
to create a new field.
Geocoding
Prior to geocoding our data using the newly created address field we will filter the full dataset of over 6,000 records to reduce the number of requests we send to the Nominatim API (speeding up processing times and reducing the strain on Nominatim’s resources of this brief tutorial).
Using the select by expression tool select some subset of antennas in NYC. The example below shows a selection for all of the antennas in Staten Island.
Next find Batch Nominatim geocoder
tool by searching for it in the Processing Toolbox
.
Open Batch Nominatim geocoder
and follow the prompts to select the parameters you want to use in querying this API. Make sure to check the box next to Selected features only
and then specify the field containing addrses information.
Note: when doing this on your own it is a good practice to select a handful of features to geocode as a test before running a query on a large number of addresses.
You may create a temporary layer rather than initially saving to file. Then select Run
.
Once the tool finishes running you should see your geocoded points representing permit applications to install or maintain cellular antennas. The tool should also specify if any addresses were not found.
Add XYZ layer for context
To view your geocoded results in context you can add map tiles to your QGIS project by adding an XYZ Tile Layer
. Map tiles are a type of spatial data that provide a raster representation of a location at varying levels of detail depending on the zoom level (think Google Maps).
In the Browser Panel, right-click XYZ Tiles
and select New Connection
. Give your new connection a name like “Google Satellite”, and paste the following URL into the URL field:
http://mt0.google.com/vt/lyrs=s&hl=en&x={x}&y={y}&z={z}
Drag the new XYZ Tiles layer you have just connected to into your map window to add it to your project.
Option 3: Deliverable
Create a map composition showing cellular antenna locations for some portion of NYC. Upload it to canvas.