16

This question may expose my ignorance of programming, but I'm curious about how people are using different Python data structures within ArcPy.

This page lists the data structures in Python. I understand how lists can be implemented in GIS (list of feature classes, list of feature types, list of data frames, etc). I understand how sets can be used as well (to remove duplicates). How are people implementing tuples, dictionaries, and other data structures within ArcPy? Also, are there other examples of lists and sets which I haven't listed?

Furthermore, no doubt, people are creating custom classes in ArcPy. Under what circumstances and situations do you require these? Can you provide examples? Is anyone creating custom classes which inherit from the built-in arcpy classes?

I don't require answers to all of these questions; I'm just curious how people are using Python in GIS and which workflows require these customizations.

1
  • 4
    Interesting question but this does not have a definitive answer. Should be a community wiki. Commented Sep 28, 2012 at 2:19

5 Answers

14

Many arcpy functions that take multiple inputs accept Python list objects.

For example, the Dissolve_management function accepts a list of field names to dissolve on:

arcpy.Dissolve_management("taxlots", "C:/output/output.gdb/taxlots_dissolved",
    ["LANDUSE", "TAXCODE"], "", "SINGLE_PART", "DISSOLVE_LINES")

A tuple can be used in place of a list when you do not need to modify the order or number of elements, as tuples are immutable. They are a useful data structure for heterogeneous but related pieces of data, such as the elements of a timestamp or the coordinates of a point. You will often see lists of tuples, where a tuple serves as a distinct record with a fixed number of attributes, while the list could easily change size, be re-ordered (sorted), etc. See this StackOverflow question for more on the uses of lists vs. tuples.
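As a minimal sketch of the list-of-tuples pattern described above: each tuple is a fixed-shape record and the surrounding list stays free to grow or be sorted. The record layout `(point_id, x, y)` and the coordinate values are assumptions made up for illustration, not taken from any real dataset.

```python
# Each tuple is one fixed-shape record: (point_id, x, y).
# The layout and values are hypothetical, for illustration only.
points = [
    (3, 6690.4, 15602.1),
    (1, -10049.3, 10856.7),
    (2, 5834.9, 7909.4),
]

# The list can be re-ordered or grown; the individual records stay immutable.
points.sort(key=lambda record: record[0])  # sort by point_id
points.append((4, 6111.2, 7317.0))

# Tuple unpacking gives each attribute a readable name.
for point_id, x, y in points:
    print(point_id, x, y)

ids_in_order = [record[0] for record in points]
```

Trying to assign to `points[0][1]` would raise a `TypeError`, which is the point: the record shape is protected while the collection remains flexible.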

A dictionary can be used as a fast lookup table to cache a relatively small but frequently-used set of key-value pairs into memory. I saw an interesting example of this on the ArcGIS forums: http://forums.arcgis.com/threads/55099-Update-cursor-with-joined-tables-work-around-w-dictionaries

Their use of a dictionary instead of a join sped up their calculation from 3.5 hours to 15 minutes.

A simpler example: suppose you have a million address records whose state attribute holds the abbreviation (CA), but for display purposes you want to spell out the proper name (California). A dictionary keyed on the abbreviation can serve as the lookup table when populating a full state name field.
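The state-name lookup could be sketched roughly as follows. The dictionary contents, the `full_state_name` helper, and the row layout are all assumptions for illustration; in a real script the rows would more likely come from an update cursor (for example `arcpy.da.UpdateCursor`) rather than a plain list.

```python
# Hypothetical lookup table; a real one would cover every state.
STATE_NAMES = {"CA": "California", "OR": "Oregon", "WA": "Washington"}


def full_state_name(abbreviation, lookup=STATE_NAMES):
    """Return the spelled-out name, or the input unchanged if it is unknown."""
    key = (abbreviation or "").strip().upper()
    return lookup.get(key, abbreviation)


# Stand-in for cursor rows: [address, state abbreviation, full state name].
rows = [
    ["123 Main St", "CA", None],
    ["9 Elm Ave", "or", None],
    ["1 Oak Rd", "ZZ", None],
]
for row in rows:
    row[2] = full_state_name(row[1])
```

Using `dict.get` with a fallback means an unexpected abbreviation passes through unchanged instead of raising a `KeyError` partway through a million-row update.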

I have not found a need to write a class in Python for use in arcpy myself, but that's not to say there isn't such a use case. A class might be useful when you have a set of closely-related functions (behaviors) that operate on some input (data), and you want to be able to use those data and behaviors in an object-oriented way, but this is more likely going to be business-logic specific and not related to arcpy.

7

Blah238 covers this topic well, so I will just add a couple of examples from my own work. I develop a lot of airport data, and one of the things I have to do regularly is read the surveyed centerline points of a runway in order along its length. You'd think that these points would be in order (in the GIS database) already, but they rarely are. The centerline points occur every 10 feet along the centerline and are flanked on either side by two other rows of survey points spaced 10 feet apart. You get the picture: a plethora of points ... and usually all mixed in together database-wise. With what I am doing in my scripts, it is usually easiest to just select out the centerline points by attributes (or spatially if need be), read the coordinates for each, and dump the results into a Python list. I can then sort, pop, reverse, etc. the list however I need, and it's fast.
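One way the "dump into a list, then sort" step might look is sketched below. The answer does not say how the points are ordered, so sorting by projected distance from one runway end toward the other is an assumption, as are the function name `sort_along_line` and the sample coordinates.

```python
import math


def sort_along_line(points, start, end):
    """Sort (x, y) points by their projected distance from start toward end."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("start and end must be different points")
    ux, uy = dx / length, dy / length  # unit direction of the runway

    def station(point):
        # Dot product with the unit direction gives distance along the line.
        return (point[0] - start[0]) * ux + (point[1] - start[1]) * uy

    return sorted(points, key=station)


# Hypothetical centerline points, read from the database in arbitrary order.
centerline = [(30.0, 0.0), (10.0, 0.0), (0.0, 0.0), (20.0, 0.0)]
ordered = sort_along_line(centerline, start=(0.0, 0.0), end=(30.0, 0.0))
```

Once the points are in an ordinary list, `ordered.reverse()` or `ordered.pop()` covers the other operations mentioned above without touching the database again.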

Likewise, I use Python dictionaries extensively (probably far more than some would approve of). I have to create sets of 3D unit vectors for each runway end at an airport, and I access these constantly within a script and do this in many of my scripts. I keep many other sets of regularly accessed data in dictionaries, too. Like lists, they are fast and flexible. Highly recommended.
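A rough sketch of keeping per-runway-end unit vectors in a dictionary follows. The runway end identifiers, the coordinates, and the `unit_vector` helper are hypothetical; the answer does not show how its vectors are actually built.

```python
import math


def unit_vector(from_point, to_point):
    """Return the 3D unit vector pointing from from_point to to_point."""
    dx = to_point[0] - from_point[0]
    dy = to_point[1] - from_point[1]
    dz = to_point[2] - from_point[2]
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    if length == 0:
        raise ValueError("points must be distinct")
    return (dx / length, dy / length, dz / length)


# Hypothetical runway end coordinates (x, y, z), keyed by runway end name.
runway_ends = {"09": (0.0, 0.0, 100.0), "27": (3000.0, 0.0, 100.0)}

# Compute each vector once, then look it up by key wherever it is needed.
unit_vectors = {
    "09": unit_vector(runway_ends["09"], runway_ends["27"]),
    "27": unit_vector(runway_ends["27"], runway_ends["09"]),
}
```

Later code can then write `unit_vectors["09"]` instead of recomputing the vector or remembering a list index.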

As far as classes go, like Blah238, I haven't found a need to create any. There are probably a few cases where a class would be preferred in my scripts, but I really haven't been able to identify those places. Someone with more programming experience would probably find them quickly.

5

I too love dictionaries - use 'em all the time. This method gets some spatial reference properties and stores it all in a dict:

def get_coord_sys(self, in_dataset):
    """Get and return info on dataset coord sys/projection"""
    spatial_ref = arcpy.Describe(in_dataset).spatialReference
    # Get spatial ref props and put in dictionary
    spat_ref_dict = {}
    spat_ref_dict["name"] = spatial_ref.name
    spat_ref_dict["type"] = spatial_ref.type
    spat_ref_dict["gcs_code"] = spatial_ref.GCSCode
    spat_ref_dict["gcs_name"] = spatial_ref.GCSName
    spat_ref_dict["pcs_code"] = spatial_ref.PCSCode
    spat_ref_dict["pcs_name"] = spatial_ref.PCSName
    return spat_ref_dict

This method snippet extracts point geometries from two feature classes; I then use the geometries later on to do some trig:

def build_fields_of_view(self):
    """For all KOPs in a study area, build left, right, center FoV triangles"""
    # Assumes `import os` and `import arcpy` at module level.
    fcs = {os.path.join(self.gdb, "WindFarmArray"): [], os.path.join(self.gdb, "KOPs"): []}
    # Build a dict of WTG and KOP point records, keyed by feature class path.
    # Each record is [OBJECTID, X, Y], so the result looks like:
    #  {'<gdb>/KOPs': [[1, -10049.2697098718, 10856.699451165374],
    #                  [2, 6690.4377855260946, 15602.12386816188]],
    #   '<gdb>/WindFarmArray': [[1, 5834.9321158060666, 7909.3822339441513],
    #                           [2, 6111.1759513214511, 7316.9684107396561]]}
    for fc_path, records in fcs.items():
        rows = arcpy.SearchCursor(fc_path, "", self.sr)
        for row in rows:
            geom = row.shape
            point = geom.getPart()
            oid = row.getValue("OBJECTID")  # avoid shadowing the built-in id()
            records.append([oid, point.X, point.Y])

    kops = fcs[os.path.join(self.gdb, "KOPs")]  # KOP array
    wtgs = fcs[os.path.join(self.gdb, "WindFarmArray")]  # WTG array

A LOT of what I am currently working on involves extracting the coordinates and attributes from vector feature classes and rasters so the data can be pushed into another piece of software that doesn't even know what GIS data is. So, I use lists and dictionaries a lot for this.

4
  • Thanks for the answer. Why is a dictionary a better choice than another data structure in these cases? Commented Oct 2, 2012 at 4:52
  • I just like to be able to call my values by my keys. Commented Oct 2, 2012 at 13:19
  • 2
    Another reason dicts may be preferable is that key lookups are fast: a dict is hash-based, so finding a value by its key takes roughly constant time, whereas finding an item in a very long list means scanning it entry by entry. Commented Jan 2, 2013 at 22:13
  • @gotanuki True, and if a big list never needs to change, a tuple is worth considering instead: it is immutable and slightly more compact than a list, although searching it is still a linear scan. Commented Jan 3, 2013 at 14:04
2

I read this while putting together an answer and had to make some edits.

I'm no Python expert, but I think the idea behind using classes is that you can instantiate an object that has a bunch of methods ready to go which relate to the data structure, while also centralizing your methods. There are also some variable-scope benefits with classes versus modules; the above link gets to this point somewhat.

I have a class called featureLayer (probably not pythonic-ly named...still learning). I can do

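import sys
import arcpy

sys.path.append(r"\\Path\To\Scripts")
import gpFuncs as gpF  # my own module, found via the path added above
fc = arcpy.GetParameterAsText(0)
featureLayer = gpF.featureLayer(fc)
# featureid and first_and_last are defined elsewhere in the script
points = featureLayer.featureVerticesToPoints(featureid, "", first_and_last)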

The definition to do this is a class method which just iterates the features, parts and vertices. Then I can turn my points object into a featureLayer instance and do other stuff which my class has.

I think that, if built correctly, classes should encapsulate functionality. For example, soon I'll start refactoring so that I have a featureLayer class with the methods and attributes that all feature layers have. Then I'll inherit from it to build a featureLayerStrict class that inherits all of featureLayer's attributes and methods but instantiates with a specific geometry type, like polygon.
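The inheritance plan described above might be sketched like this. The class names `FeatureLayer` and `PolygonFeatureLayer`, their attributes, and the `describe` method are hypothetical stand-ins, not the author's actual gpFuncs code. The geometry type is passed in directly so the sketch runs without arcpy; in a real script it could plausibly be read from `arcpy.Describe(path).shapeType`.

```python
class FeatureLayer(object):
    """Behaviour shared by every feature layer (hypothetical sketch)."""

    def __init__(self, path, geometry_type):
        self.path = path
        self.geometry_type = geometry_type

    def describe(self):
        return "{0} ({1})".format(self.path, self.geometry_type)


class PolygonFeatureLayer(FeatureLayer):
    """A stricter layer that refuses anything other than polygons."""

    required_geometry_type = "Polygon"

    def __init__(self, path, geometry_type):
        if geometry_type != self.required_geometry_type:
            raise ValueError(
                "expected {0}, got {1}".format(
                    self.required_geometry_type, geometry_type
                )
            )
        super(PolygonFeatureLayer, self).__init__(path, geometry_type)


parcels = PolygonFeatureLayer("parcels", "Polygon")
summary = parcels.describe()  # inherited from FeatureLayer
```

The subclass adds only the validation; everything else is inherited, which is the encapsulation benefit the answer is after.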

1
0

I work mainly in VB.NET but find myself using Python and arcpy more and more. In VB I like to use Enums, as they make the code clearer to read. Earlier versions of Python did not implement Enums, so a common hack was to create a class exposing some properties; a bunch of examples are discussed on Stack Overflow. It now looks like the latest version of Python implements these, which is discussed here.
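For comparison, here is a small sketch of both approaches, assuming Python 3.4 or later for the standard-library `enum` module. The `GeometryType` names and values are illustrative choices, not something defined by arcpy.

```python
from enum import Enum


class GeometryType(Enum):
    """Enum-based version (standard library, Python 3.4+)."""
    POINT = "Point"
    POLYLINE = "Polyline"
    POLYGON = "Polygon"


class GeometryTypeLegacy(object):
    """Older workaround: plain class attributes used as named constants."""
    POINT = "Point"
    POLYLINE = "Polyline"
    POLYGON = "Polygon"


def is_areal(geometry_type):
    """Return True only for the polygon member."""
    return geometry_type is GeometryType.POLYGON


# An Enum can be looked up by value, and rejects values it does not know.
parsed = GeometryType("Polygon")
```

The Enum version raises a `ValueError` for an unknown value such as `GeometryType("Circle")`, whereas the class-attribute workaround offers no such check.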
