Creating new shapefiles and features#

Now, that we learned how to read and iterate over shapefiles, make selections based on attribute values and apply spatial filters, we can now move towards a new, important field: the creating of geometries, features and entire files

Creating a new vector file#

Creating a new vector file will always require some standardized steps:

  1. Define a driver. This determines the type of vector file you want to create. We have worked a lot with shapefiles so for this example, but you can also create kml files, gpkg files, etc. What is a nice feature is to create a file in memory, and not on the disc. This is helpful, e.g., in case you want to create temporary vector files or test files. For this we use the driver 'memory' (in comparison to ESRI Shapefile for a physical shapefile

  2. Create the file itself, the driver determines which type of file will be created.

  3. Create a layer - following the general structure of a vector file, we need to have a layer in which all features are stored. Here, we also define the definition of the layer, which determines, whether it is a point, line or polygon file. In this step, we also define the spatial reference (i.e., projection)

Let’s do these things step by step. First, we need to call again the necessary libraries

from osgeo import ogr, osr
driver = ogr.GetDriverByName('memory')
outDS = driver.CreateDataSource('')

With regards to the spatial reference of the layer, we introduce here another functionality: for many projections, there is a defined EPSG code. Check out the Wikipedia page for a more detailed description of the ESPG codes. ogr offers the functionality of creating a spatial reference based on such an EPSG code. What is needed, is a two-step-process: (1) Creation of an empty spatial reference, (2) populate the empty spatial reference by importing from the osr libraries using the method .ImportFromEPSG(). Alternatives are .ImportFromWkt() or .SetWellKnownGeogCS("WGS84"):

srs = osr.SpatialReference()

We create a polygon shapefile here, use ogr.wkbPoint or ogr.wkbLineString if you want to create a point/line file

outlyr = outDS.CreateLayer("", srs, geom_type=ogr.wkbPolygon) 

Now, we have created a shapefile inside memory including a layer, that can be populated with features. However, the current layer does not carry any attributes. Below is an example for adding an attribute field id that hypothetically can carry an integer value. Once this is done, we also have to retrieve the layer definition through .GetLayerDefn(). This is an important because every feature we add needs to receive the information on whether it is a polygon/line/point layer.

idField = ogr.FieldDefn("id", ogr.OFTInteger)
defn = outlyr.GetLayerDefn() # needed to create the new features in the next step

All done now! Yet, the shapefile is still stored in memory. In order to flush it to our disc, e.g., to visualize it inside QGIS, I have provided a small function/tool that copies the shapefile from memory to disc. The function carries two arguments: (1) the shapefile that is currently stored in memory, and (2) path, which is a string indicating the full path of the desired file. At the moment, the function does this only for shapefiles. Can you imagine of an updated function, that can recognize the what type of vector file you would want to have (e.g., kml, gpkg, ESRI Shapefile, etc.) and creates the output accordingly?

Important to consider

Think a bit about in which type of situations you would want to create a shapefile inside the memory and in which situations you would want to have a shapefile that is stored on disc. Have in mind, that the process of writing something from memory to the physical disc is a relatively slow process which you generally want to avoid.

def CopySHPDisk(shape, outpath):
    drvV = ogr.GetDriverByName('ESRI Shapefile')
    outSHP = drvV.CreateDataSource(outpath)
    lyr = shape.GetLayer()
    outSHP.CopyLayer(lyr, 'lyr')

Creating features and geometries#

Now that we know how to create a new vector layer, it is time populate this layer with some features. This includes (a) the features including the attributes, and (b) the geometries associated with it. We will do this below using examples for points, lines and polygons - this is to exemplify that the key workflow is in all cases the same except for some small detailes. Let’s create in a first three shapefiles inside memory

pts = driver.CreateDataSource('')
ptsLyr = pts.CreateLayer("", srs, geom_type=ogr.wkbPoint)
idField = ogr.FieldDefn("id", ogr.OFTInteger)
ptsDefn = ptsLyr.GetLayerDefn()
line = driver.CreateDataSource('')
lineLyr = line.CreateLayer("", srs, geom_type=ogr.wkbLineString)
idField = ogr.FieldDefn("id", ogr.OFTInteger)
lineDefn = lineLyr.GetLayerDefn()
pols = driver.CreateDataSource('')
polsLyr = pols.CreateLayer("", srs, geom_type=ogr.wkbPolygon)
idField = ogr.FieldDefn("id", ogr.OFTInteger)
polsDefn = polsLyr.GetLayerDefn()

The next step is to create features. As you remember from the general structure of a vector data file, a feature consists of the geometry and a number of attributes. The attributes themselves are defined by the layer (see cells above), meanign that every new feature created has by default these attributes - though, they will contain the value None by default and need to be updated. In general, the strategy is:

  1. En empty feature, that imports the general definition of the layer

  2. A function that modifies the attribute \(\rightarrow\) in our case only the attribute id

  3. The geometry, that is added to the feature

  4. The creation of the feature inside the layer

Point features#

For a point feature it looks as follows. Below we create a single point and control our creation using the already well known function .GetFeatureCount()

# Definition of a random starting coordinate
x = 13.356061147056309
y = 52.50885105569029
# Create the empty feature, get definition of the layer
pntFeat = ogr.Feature(ptsDefn)
# Modify the attribute
pntFeat.SetField("id", 1)
# Define a point coordinate
pntGEOM = ogr.Geometry(ogr.wkbPoint)
pntGEOM.AddPoint(x, y)
# Set the geometry into the feature
# add the feature to layer

# Check functionality

Line features#

The creation of a line feature is similar. Here, we create a line consisting of three points, all 0.05° apart from the starting point in east direction (now without comments):

# Definition of a random step size
step = 0.05
lineFeat = ogr.Feature(lineDefn)
lineFeat.SetField("id", 1)
lineGEOM = ogr.Geometry(ogr.wkbLineString)
lineGEOM.AddPoint(x, y)
lineGEOM.AddPoint(x + step, y)
lineGEOM.AddPoint(x + step + step, y)


Polygon features#

The case of polygons is a bit more tricky. While it looks similar to a line feature (i.e., a sequential collection of points) the difference here is (a) we create a LinearRing as a first geometry, which we afterwards copy into a second geometry, (b) the first and the last coordinate have to be identical so that the ring is closed. Here, we create a square, with the origin in our xy location, and with size of 0.05° x 0.05°:

polFeat = ogr.Feature(polsDefn)
polFeat.SetField("id", 1)
ring = ogr.Geometry(ogr.wkbLinearRing)
ring.AddPoint(x, y)
ring.AddPoint(x+step, y)
ring.AddPoint(x+step, y-step)
ring.AddPoint(x, y-step)
ring.AddPoint(x, y)
poly = ogr.Geometry(ogr.wkbPolygon)


With this, you have now the knowledge and basic code examples to create new shapefiles and populate them. You know how to do this using point, line and polygon files. This will equip you well for the exercise of this week! Again, make sure you not only read this chapter, but also run the code lines inside your own notebook. Play around with different coordinate systems to get familiar with this functionality; and, as already mentioned previously, have a look at other sources: