How to work with GraphQL in Python


Organizations from Netflix to Shopify to PayPal consider GraphQL to be the de facto standard for providing an external application programming interface (API). It’s the most recent step in the advancement of communication and service composition, building upon lessons learned from previous efforts like REST and SOAP.

GraphQL offers a wide range of benefits, including:

  • A higher level of flexibility than REST or SOAP
  • The ability to deliver nested data in a single call, with no need to fetch data1 before you can run query2, which makes GraphQL ideal for low-bandwidth applications such as phone apps and IoT devices.

If you are looking to add a layer of interoperability to your project, GraphQL is well worth a look, especially if you expect to support connectivity from apps and/or devices.

As you would expect, GraphQL’s query language can perform basic manipulation of resources using operations (like create, update, and delete) that you might be familiar with from REST APIs. But its syntax provides a great deal of flexibility that’s very hard to achieve with approaches like REST. And GraphQL APIs are very easy to work with using Python.

In this tutorial, we will explore ActiveState’s GraphQL API through the GQL 3 GraphQL Client for Python.

Before You Start: Install The GraphQL Python Environment

To follow along with the code in this article, you can download and install our pre-built GraphQL environment, which contains a version of Python 3.9 and the packages used in this post, along with already resolved dependencies!

In order to download this ready-to-use Python environment, you will need to create an ActiveState Platform account. Just use your GitHub credentials or your email address to register. Signing up is easy and it unlocks the ActiveState Platform’s many benefits for you!

Alternatively, you can use our State Tool to install this runtime environment.


For Windows users, run the following at a CMD prompt to automatically download and install our CLI, the State Tool, along with the GraphQL runtime into a virtual environment:

powershell -Command "& $([scriptblock]::Create((New-Object Net.WebClient).DownloadString('https://platform.activestate.com/dl/cli/install.ps1'))) -activate-default Pizza-Team/GraphQL"

For Linux or Mac users, run the following to automatically download and install our CLI, the State Tool, along with the GraphQL runtime into a virtual environment:

sh <(curl -q https://platform.activestate.com/dl/cli/install.sh) --activate-default Pizza-Team/GraphQL

Getting Started with GraphQL APIs

One of GraphQL’s many benefits is that it is self-documenting, which is a big deal given the disparity between some APIs’ documentation and their actual implementation. It has its limits (like most things), but it works quite well overall.

ActiveState provides a quickstart tutorial so that you can get up and running with its GraphQL API in no time. Since a diagram is worth a thousand words, let’s start by querying the API in order to get a schema that is compatible with this excellent GraphQL Visualizer:

query IntrospectionQuery {
  __schema {
    queryType {
      name
    }
    mutationType {
      name
    }
    subscriptionType {
      name
    }
    types {
      ...FullType
    }
    directives {
      name
      description
      args {
        ...InputValue
      }
    }
  }
}

fragment FullType on __Type {
  kind
  name
  description
  fields(includeDeprecated: true) {
    name
    description
    args {
      ...InputValue
    }
    type {
      ...TypeRef
    }
    isDeprecated
    deprecationReason
  }
  inputFields {
    ...InputValue
  }
  interfaces {
    ...TypeRef
  }
  enumValues(includeDeprecated: true) {
    name
    description
    isDeprecated
    deprecationReason
  }
  possibleTypes {
    ...TypeRef
  }
}

fragment InputValue on __InputValue {
  name
  description
  type {
    ...TypeRef
  }
  defaultValue
}

fragment TypeRef on __Type {
  kind
  name
  ofType {
    kind
    name
    ofType {
      kind
      name
      ofType {
        kind
        name
      }
    }
  }
}

You can run the query directly in the ActiveState GraphQL sandbox and paste the results into the visualizer to get a directed graph diagram of the available types.

Introspection is a key concept of the GraphQL standard, since it provides a mechanism for getting the actual query capabilities and limits of an API. For more information, check out the detailed examples in the GraphQL documentation.
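
To make the idea concrete, here is a minimal sketch of how an introspection result could be turned into the edges of such a diagram, using only the standard library. The sample payload is a hand-written, heavily trimmed stand-in with the same shape as a real response (the Project and Commit types and their fields are assumptions for illustration, not the actual ActiveState schema), and unwrap and type_edges are hypothetical helper names.

```python
def unwrap(type_ref):
    """Follow ofType links through NON_NULL/LIST wrappers to the named type."""
    while type_ref.get("ofType"):
        type_ref = type_ref["ofType"]
    return type_ref["name"]


def type_edges(introspection):
    """Return (owner_type, field_name, target_type) triples for each field."""
    edges = []
    for gql_type in introspection["__schema"]["types"]:
        # Skip built-in introspection types such as __Schema and __Type
        if gql_type["name"].startswith("__"):
            continue
        # Scalars and enums report "fields": null, hence the fallback to []
        for field in gql_type.get("fields") or []:
            edges.append((gql_type["name"], field["name"], unwrap(field["type"])))
    return edges


# Hand-written sample shaped like an introspection response (illustrative only)
sample = {
    "__schema": {
        "types": [
            {
                "kind": "OBJECT",
                "name": "Project",
                "fields": [
                    {
                        "name": "name",
                        "type": {"kind": "SCALAR", "name": "String", "ofType": None},
                    },
                    {
                        "name": "commit",
                        "type": {
                            "kind": "NON_NULL",
                            "name": None,
                            "ofType": {"kind": "OBJECT", "name": "Commit", "ofType": None},
                        },
                    },
                ],
            },
            {"kind": "SCALAR", "name": "String", "fields": None},
            {"kind": "OBJECT", "name": "__Type", "fields": []},
        ]
    }
}

for owner, field, target in type_edges(sample):
    print(f"{owner}.{field} -> {target}")
```

Each triple corresponds to one arrow in the diagram: the type that owns the field, the field name, and the type the field points to.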

Querying in Python with GraphQL

Now that we can actually see what kind of data is available, let’s build some basic queries and explore some projects. Our target organization will be the famous Pizza-Team, which has many interesting projects in different languages. The following code queries all of Pizza-Team’s projects and returns only their names and descriptions:

import pandas as pd
from gql import Client, gql
from gql.dsl import DSLSchema, DSLQuery, dsl_gql
from gql.transport.requests import RequestsHTTPTransport

# Set up the transport layer
transport = RequestsHTTPTransport(
    url="https://platform.activestate.com/sv/mediator/api",
    verify=True,
    retries=3,
)
client = Client(transport=transport, fetch_schema_from_transport=True)

# Standard GraphQL query
query = gql(
    """
    {
      projects(org: "Pizza-Team") {
        ... on Project {
          name
          description
        }
      }
    }
    """
)

# Execute the query and normalize the JSON response payload
prjs = pd.json_normalize(client.execute(query)["projects"])
prjs.head()

This code contains some interesting chunks, so let’s break it down:

  • The first section defines a transport to communicate with the API. GQL supports HTTP (synchronous or asynchronous) and WebSockets transports; in this case, the synchronous HTTP transport is based on the Requests library. Because the client fetches the schema from the transport (fetch_schema_from_transport=True), it can validate each query against that schema before sending it. You can also specify a number of retries to perform in the event of a communication failure.
  • The second section provides the query. As you can see, the org parameter is required, and only the name and description attributes are returned for each project.
  • The last part of the code runs the query, which returns a JSON object that’s parsed using the Pandas json_normalize function in order to get a DataFrame.
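
Under the hood, an HTTP transport wraps the query in a JSON document and POSTs it to the endpoint. The following sketch, which uses only the standard library, builds such a request without sending it. Note that build_graphql_request is a hypothetical helper, not part of GQL, and the exact headers that GQL sends may differ.

```python
import json
import urllib.request

API_URL = "https://platform.activestate.com/sv/mediator/api"

QUERY = """
{
  projects(org: "Pizza-Team") {
    ... on Project {
      name
      description
    }
  }
}
"""


def build_graphql_request(url, query, variables=None):
    """Build (but do not send) a POST request carrying a GraphQL query as JSON."""
    payload = {"query": query}
    # Variables are optional; include them only when the caller supplies some
    if variables is not None:
        payload["variables"] = variables
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request(API_URL, QUERY)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it. A GraphQL server answers with a JSON document whose data key holds the result, which is the part that the code above feeds into json_normalize.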

Domain Specific Languages for Querying GraphQL

Domain Specific Languages (DSL) are specialized tools that are tightly coupled to a context. In this case, you can think of each GraphQL schema as the seed of a DSL. Fortunately, the GQL client library can generate a simple DSL based on the schema that was retrieved by the client.

You can rebuild the previous query without using the long string format as follows:

ds = DSLSchema(client.schema)

query = dsl_gql(
    DSLQuery(
        ds.Query.projects(org="Pizza-Team").select(
            ds.Project.name,
            ds.Project.description,
        )
    )
)

prjs = pd.json_normalize(client.execute(query)["projects"])
prjs.head()

The code is fairly simple: 

  1. Create an instance of a DSL schema based on the previously retrieved client schema. 
  2. Use the constructors for the required operation (DSLQuery, DSLMutation, or DSLSubscription). 
  3. Build the body of the query by navigating through the schema using the auto-generated attributes.

Parameterizable Queries

The DSL approach has other advantages, such as the ability to customize queries through parameters. To illustrate the point, let’s build a dependency graph for two of Pizza-Team’s projects: AutoML-Tools and Social-Distancing.

First, we build a query with the DSL that takes the organization as an argument and returns the source dependencies of each project, along with their versions:

# Build the query using the DSL
query = dsl_gql(
    DSLQuery(
        ds.Query.projects(org="Pizza-Team").select(
            ds.Project.name,
            ds.Project.description,
            # Notice the nested structure of the query
            ds.Project.commit.select(
                ds.Commit.commit_id,
                ds.Commit.sources.select(
                    ds.Source.name,
                    ds.Source.version,
                ),
            ),
        )
    )
)

prjs = pd.json_normalize(client.execute(query)["projects"])

# Apply a transformation to count the number of dependencies
prjs["num_dependencies"] = prjs.apply(lambda x: len(x["commit.sources"]), axis=1)
prjs.head()

There are some interesting things in this snippet, so let’s break down the main bits:

The first section creates a query using the aforementioned DSL, but it uses a nested graph model this time. The query:

  1. Selects the associated commit for each project
  2. Obtains the commit’s id
  3. Returns the name and version of each source that’s linked to the commit

The second section uses the DataFrame to count the number of dependencies by applying a simple lambda function to the commit.sources column.
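
To go from that table to the dependency graph for the two projects mentioned earlier, the response can also be walked directly. The sketch below assumes the response has the shape requested by the query above (projects, each with a commit that lists its sources). The sample_response payload, including the package names, versions, and commit ids, is made up for illustration, and dependency_edges is a hypothetical helper.

```python
def dependency_edges(response, project_names):
    """Return (project, "package==version") edges for the selected projects."""
    wanted = set(project_names)
    edges = []
    for project in response["projects"]:
        if project["name"] not in wanted:
            continue
        # A project without a commit contributes no edges
        commit = project.get("commit") or {}
        for source in commit.get("sources") or []:
            edges.append((project["name"], f'{source["name"]}=={source["version"]}'))
    return edges


# Made-up payload shaped like the result of the nested query (illustrative only)
sample_response = {
    "projects": [
        {
            "name": "AutoML-Tools",
            "description": None,
            "commit": {
                "commit_id": "hypothetical-id-1",
                "sources": [
                    {"name": "pandas", "version": "1.2.3"},
                    {"name": "numpy", "version": "1.20.1"},
                ],
            },
        },
        {
            "name": "Social-Distancing",
            "description": None,
            "commit": {
                "commit_id": "hypothetical-id-2",
                "sources": [{"name": "opencv", "version": "4.5.1"}],
            },
        },
        {"name": "Other-Project", "description": None, "commit": None},
    ]
}

edges = dependency_edges(sample_response, ["AutoML-Tools", "Social-Distancing"])
for project, dependency in edges:
    print(f"{project} -> {dependency}")
```

With a live client, the dictionary returned by client.execute(query) would take the place of sample_response.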

Other Considerations

In addition to queries, GraphQL also supports two other types of operations:

  • Mutations are the classic “create, update, and delete” operations. 
  • Subscriptions are connections that listen to changes in server data. They’re usually based on the WebSockets protocol, and allow you to build pipelines based on events that occur within the data.

The synchronous HTTP transport used in these examples is a good starting point, but given the increasing scale of data managed in modern applications, you will probably have to move to an asynchronous approach. Fortunately, the GQL library supports asynchronous HTTP transport in addition to WebSockets.

Conclusions – Programmatic Queries Using GraphQL

There are huge advantages to working with GraphQL, including:

  • Flexibility for developers to retrieve only the information they require
  • Self-documenting API graphs
  • Payload reduction
  • Quick data retrieval

In addition, the ability to use DSLs to map GraphQL schemas to native code provides a smooth way to increase your development speed. Offering GraphQL as part of your applications will help promote interoperability, and you can also take advantage of implementations offered by service providers. 

The ActiveState Platform exposes its API using GraphQL, making it easy to query its catalog of Python packages for things like vulnerability information and open source licenses. Coming soon, you’ll also be able to use the API to query for:

  • Software Bill of Materials (SBOM): a complete list of all the dependencies in your project, along with each dependency’s name, supplier/author, version, license, timestamp, and dependency relationship, all of which can be used to satisfy US Government SBOM requirements.

