The Fallacies of Free Data
As a GIS data library publisher, you might expect us to favor paid subscriptions for maintained data libraries. You would be right, and there are some important reasons for this behind the scenes. Customers often try at first to compile the data themselves, but are frustrated by the variety of formats and coordinate systems, integration problems, cybersecurity issues and lack of seamlessness. Moreover, data schemas can drift from month to month, making data updates a Sisyphean task.
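To make the schema-drift problem concrete, here is a minimal sketch of how one might compare the field list of a downloaded layer against last month's copy. The field names, types and the `diff_schema` helper are hypothetical and chosen only for illustration; they do not describe any particular county's data or WhiteStar's own tooling.

```python
def diff_schema(previous, current):
    """Compare two {field_name: field_type} mappings and report drift."""
    shared = set(previous) & set(current)
    return {
        # Fields that appear only in the newer download
        "added": sorted(set(current) - set(previous)),
        # Fields that disappeared since the previous download
        "removed": sorted(set(previous) - set(current)),
        # Fields present in both downloads whose declared type changed
        "retyped": sorted(n for n in shared if previous[n] != current[n]),
    }


# Hypothetical field lists for the same parcel layer in consecutive months.
march = {"PARCEL_ID": "str", "OWNER": "str", "ACRES": "float"}
april = {"PARCEL_ID": "int", "OWNER_NAME": "str", "ACRES": "float"}

drift = diff_schema(march, april)
print(drift)
```

A renamed field shows up here as one removal plus one addition, so even a simple report like this still needs a person to decide whether `OWNER` and `OWNER_NAME` are the same attribute.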
We are completely in favor of government organizations publishing all of their data as widely and as freely as possible. However, we find that download methodologies come and go with disturbing frequency, particularly as resources are redeployed during the pandemic. WhiteStar spends time and effort keeping track of more than 3,142 US counties and county-equivalents as well as tens of federal government websites. We want customers to be able to reliably consume GIS data in a predictable format, coordinate system and database schema.
We also cheerfully research issues in the authoritative source data we provide. Customers often want to know the history and data collection processes used to compile the data. For example, why are attribute fields in well data not fully populated in some cases? How do the X Y coordinates within the data relate to longitude/latitude? How was the raster map georeferenced and to which base? Customers love having access to WhiteStar personnel willing to liaise with authoritative source data providers. Do you have better data internally? We can get that incorporated into the master data for you so you do not have to manage it.
Public data often have issues of integration across jurisdictions as well. For example, a state’s authority stops at the border and may not cleanly transition to the adjacent state(s). As shown in the accompanying graphic, we find gaps and overlaps in land survey data that must be inspected and resolved before the GIS data can be used for project map generation.
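The kind of check involved can be sketched in a few lines. This is a deliberately simplified illustration that assumes each jurisdiction's coverage is an axis-aligned box given as `(xmin, ymin, xmax, ymax)` in a shared coordinate system. Real survey polygons are irregular and would normally be compared with a geometry library; the extents and function names below are hypothetical.

```python
def overlap_area(a, b):
    """Area shared by two boxes given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    # A negative width or height means the boxes do not intersect.
    return max(width, 0.0) * max(height, 0.0)


def gap_width(west, east):
    """Width of the uncovered strip between a western and an eastern box."""
    # Zero when the boxes touch or overlap along the shared border.
    return max(east[0] - west[2], 0.0)


# Hypothetical survey extents for three adjacent jurisdictions.
state_a = (0.0, 0.0, 10.0, 5.0)
state_b = (9.5, 0.0, 20.0, 5.0)    # reaches 0.5 units into state_a
state_c = (20.25, 0.0, 30.0, 5.0)  # starts 0.25 units east of state_b

print(overlap_area(state_a, state_b))  # 2.5
print(gap_width(state_b, state_c))     # 0.25
```

Any non-zero result flags a border that needs inspection before the layers can be treated as seamless.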
For better or for worse, governmental organizations publish data in formats ranging from PDF files to Access tables to Esri File Geodatabases. WhiteStar knows that customers want to consume data in a curated and consistent format, using a consistent coordinate system, and ideally as a web stream that can easily be added to any GIS or CAD system.
Robert C. White, Jr.
President and CEO
WhiteStar Corporation