Organizations from Netflix to Shopify to PayPal consider GraphQL to be the de facto standard for providing an external application programming interface (API). It’s the most recent step in the advancement of communication and service composition, building upon lessons learned from previous efforts like REST and SOAP.
GraphQL offers a wide range of benefits, including:
- A higher level of flexibility than REST or SOAP
- The ability to deliver nested data in a single call, with no need to fetch data1 just so you can run query2, which makes GraphQL ideal for low-bandwidth applications like phone apps and IoT devices
If you are looking to add a layer of interoperability to your project, GraphQL is well worth a look, especially if you expect to support connectivity from apps and/or devices.
As you would expect, GraphQL’s query language supports basic manipulation of resources through operations like create, update, and delete, which you may know from REST APIs. But its syntax also provides a degree of flexibility that’s very hard to achieve with REST. And GraphQL APIs are very easy to work with using Python.
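To make “nested data in a single call” a little more concrete, here is a minimal sketch of what a GraphQL request typically looks like on the wire: an HTTP POST whose JSON body carries the query text under a `query` key, plus optional `variables`. The snippet only builds that body with the standard library and sends nothing. The field names mirror the queries used later in this tutorial, and `build_payload` is a hypothetical helper, not part of any library.

```python
import json

# One query document asks for each project AND, nested inside it, the
# commit and that commit's sources. Field names mirror the queries used
# later in this tutorial.
NESTED_QUERY = """
{
  projects(org: "Pizza-Team") {
    ... on Project {
      name
      commit {
        sources { name version }
      }
    }
  }
}
"""


def build_payload(query, variables=None):
    """Serialize a GraphQL request body (hypothetical helper)."""
    body = {"query": query}
    if variables:
        body["variables"] = variables
    return json.dumps(body)


payload = build_payload(NESTED_QUERY)
print(payload)
```

A REST client would usually need one request for the project list and further requests per project for the commit and its sources. Here the whole selection travels in one body.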
In this tutorial, we will explore ActiveState’s GraphQL API through the GQL 3 GraphQL Client for Python.
Before You Start: Install The GraphQL Python Environment
To follow along with the code in this article, you can download and install our pre-built GraphQL environment, which contains a version of Python 3.9 and the packages used in this post, along with already resolved dependencies!
In order to download this ready-to-use Python environment, you will need to create an ActiveState Platform account. Just use your GitHub credentials or your email address to register. Signing up is easy and it unlocks the ActiveState Platform’s many benefits for you!
Alternatively, you can use our CLI, the State Tool, to install this runtime environment.
For Windows users, run the following at a CMD prompt to automatically download and install the State Tool along with the GraphQL runtime into a virtual environment:
powershell -Command "& $([scriptblock]::Create((New-Object Net.WebClient).DownloadString('https://platform.activestate.com/dl/cli/install.ps1'))) -activate-default Pizza-Team/GraphQL"
For Linux or Mac users, run the following to automatically download and install our CLI, the State Tool, along with the GraphQL runtime into a virtual environment:
sh <(curl -q https://platform.activestate.com/dl/cli/install.sh) --activate-default Pizza-Team/GraphQL
Getting Started with GraphQL APIs
One of GraphQL’s many benefits is that it is self-documenting, which is a big deal given the disparity between some APIs’ documentation and their actual implementation. It has its limits (like most things), but it works quite well overall.
ActiveState provides a quickstart tutorial so that you can get up and running with their GraphQL API in no time. Since a diagram is worth a thousand words, let’s start by querying the API for a schema in a format that the GraphQL Visualizer can read:
    query IntrospectionQuery {
      __schema {
        queryType { name }
        mutationType { name }
        subscriptionType { name }
        types { ...FullType }
        directives {
          name
          description
          args { ...InputValue }
        }
      }
    }

    fragment FullType on __Type {
      kind
      name
      description
      fields(includeDeprecated: true) {
        name
        description
        args { ...InputValue }
        type { ...TypeRef }
        isDeprecated
        deprecationReason
      }
      inputFields { ...InputValue }
      interfaces { ...TypeRef }
      enumValues(includeDeprecated: true) {
        name
        description
        isDeprecated
        deprecationReason
      }
      possibleTypes { ...TypeRef }
    }

    fragment InputValue on __InputValue {
      name
      description
      type { ...TypeRef }
      defaultValue
    }

    fragment TypeRef on __Type {
      kind
      name
      ofType {
        kind
        name
        ofType {
          kind
          name
          ofType {
            kind
            name
          }
        }
      }
    }
You can run the query directly in the ActiveState GraphQL sandbox and paste the results into the visualizer to get a directed graph diagram of the available types.
Introspection is a key concept of the GraphQL standard, since it provides a mechanism for getting the actual query capabilities and limits of an API. For more information, check out the detailed examples in the GraphQL documentation.
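If you would rather inspect an introspection result in Python than in a diagram, the following sketch walks the `types` list and renders each field’s nested `TypeRef` in the familiar `[Type]!` notation. It assumes the result has the shape requested by the query above (the part under `data`). The `SAMPLE` dictionary is invented for illustration and is not the actual ActiveState schema, and both helper functions are hypothetical.

```python
def render_type(ref):
    """Render a nested TypeRef from an introspection result as text."""
    if ref is None:
        # The wrapper is nested deeper than the introspection query asked for.
        return "?"
    kind = ref["kind"]
    if kind == "NON_NULL":
        return render_type(ref.get("ofType")) + "!"
    if kind == "LIST":
        return "[" + render_type(ref.get("ofType")) + "]"
    return ref["name"]


def summarize_schema(introspection):
    """Map each non-introspection type name to {field name: rendered type}."""
    summary = {}
    for t in introspection["__schema"]["types"]:
        if t["name"].startswith("__"):
            continue  # skip built-in introspection types such as __Type
        fields = t.get("fields") or []  # scalars report fields as null
        summary[t["name"]] = {f["name"]: render_type(f["type"]) for f in fields}
    return summary


# Invented sample, trimmed to the keys the helpers read.
SAMPLE = {"__schema": {"types": [
    {"kind": "OBJECT", "name": "Query", "fields": [
        {"name": "projects", "type": {
            "kind": "NON_NULL", "name": None, "ofType": {
                "kind": "LIST", "name": None, "ofType": {
                    "kind": "OBJECT", "name": "Project", "ofType": None}}}},
    ]},
    {"kind": "OBJECT", "name": "Project", "fields": [
        {"name": "name", "type": {
            "kind": "NON_NULL", "name": None, "ofType": {
                "kind": "SCALAR", "name": "String", "ofType": None}}},
    ]},
    {"kind": "SCALAR", "name": "String", "fields": None},
    {"kind": "OBJECT", "name": "__Type", "fields": []},
]}}

summary = summarize_schema(SAMPLE)
for type_name, fields in summary.items():
    print(type_name, fields)
```

Note that the `TypeRef` fragment only unwraps three levels of `ofType`, so a more deeply wrapped type would be cut off; the sketch marks that case with `?` rather than failing.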
Querying in Python with GraphQL
Now that we can actually see what kind of data is available, let’s build some basic queries and explore some projects. Our target organization will be the famous Pizza-Team, which has many interesting projects in different languages. The following code queries all of Pizza-Team’s projects and returns only their names and descriptions:
    import json
    import pandas as pd
    from gql import Client, gql
    from gql.dsl import DSLSchema, DSLQuery, dsl_gql
    from gql.transport.requests import RequestsHTTPTransport

    # Set a transport layer
    transport = RequestsHTTPTransport(
        url="https://platform.activestate.com/sv/mediator/api",
        verify=True,
        retries=3,
    )
    client = Client(transport=transport, fetch_schema_from_transport=True)

    # Standard GraphQL query
    query = gql(
        """
        {
          projects(org: "Pizza-Team") {
            ... on Project {
              name
              description
            }
          }
        }
        """
    )

    # Execute query and normalize the JSON response payload
    prjs = pd.json_normalize(client.execute(query)['projects'])
    prjs.head()
This code contains some interesting chunks, so let’s break it down:
- The first section defines a transport to communicate with the API. GQL can use either HTTP (synchronous or asynchronous) or WebSockets. In this case, the synchronous HTTP transport is based on the Requests library. Because fetch_schema_from_transport is set to True, the client retrieves the schema from the server and can validate each query against it before sending. You can also specify a number of retries to perform in the event of a communication failure.
- The second section provides the query. As you can see, the org parameter is required, and only the name and description attributes are returned for each project.
- The last part of the code runs the query, which returns a JSON object that’s parsed using the Pandas json_normalize function in order to get a DataFrame.
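The `retries=3` argument above hands retrying over to the transport. To illustrate only the general idea (this is not how GQL or Requests implement it), here is a minimal standard-library sketch of retrying a failing call. `call_with_retries` and `flaky_query` are hypothetical names, and the failure is simulated rather than coming from a real network.

```python
import time


def call_with_retries(func, retries=3, delay=0.0, exceptions=(ConnectionError,)):
    """Call func(); on a listed exception, retry up to `retries` more times."""
    attempt = 0
    while True:
        try:
            return func()
        except exceptions:
            if attempt >= retries:
                raise  # out of retries: surface the last error
            attempt += 1
            time.sleep(delay * attempt)  # simple linear back-off


# Stand-in for a flaky network call: fails twice, then succeeds.
calls = {"count": 0}


def flaky_query():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return {"projects": [{"name": "demo"}]}


result = call_with_retries(flaky_query, retries=3)
print(result, "after", calls["count"], "attempts")
```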
Domain Specific Languages for Querying GraphQL
Domain-specific languages (DSLs) are specialized tools that are tightly coupled to a context. In this case, you can think of each GraphQL schema as the seed of a DSL. Fortunately, the GQL client library can generate a simple DSL based on the schema that was retrieved by the client.
You can rebuild the previous query without using the long string format as follows:
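    # Build a DSL schema from the schema the client fetched while running the previous query
    ds = DSLSchema(client.schema)

    # Same query as before, expressed with attributes instead of a string
    query = dsl_gql(
        DSLQuery(
            ds.Query.projects(org="Pizza-Team").select(
                ds.Project.name,
                ds.Project.description,
            )
        )
    )
    prjs = pd.json_normalize(client.execute(query)['projects'])
    prjs.head()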
The code is fairly simple:
- Create an instance of a DSL schema based on the previously retrieved client schema.
- Use the constructors for the required operation (DSLQuery, DSLMutation, or DSLSubscription).
- Build the body of the query by navigating through the schema using the auto-generated attributes.
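To get a feel for how chained Python calls can stand in for query text, here is a toy sketch of a query builder. It is not how GQL implements DSLSchema: the `Field` class is hypothetical, it knows nothing about the schema, it omits the `... on Project` fragment, and it only handles simple scalar arguments.

```python
import json


class Field:
    """A tiny, hypothetical query builder: one node in a selection tree."""

    def __init__(self, name, **args):
        self.name = name
        self.args = args
        self.children = []

    def select(self, *fields):
        # Accept plain strings for leaf fields or Field objects for nested ones.
        for f in fields:
            self.children.append(f if isinstance(f, Field) else Field(f))
        return self

    def render(self):
        text = self.name
        if self.args:
            # json.dumps quotes strings the way GraphQL expects for simple scalars.
            pairs = ", ".join(f"{k}: {json.dumps(v)}" for k, v in self.args.items())
            text += f"({pairs})"
        if self.children:
            text += " { " + " ".join(c.render() for c in self.children) + " }"
        return text


projects = Field("projects", org="Pizza-Team").select("name", "description")
query_text = "{ " + projects.render() + " }"
print(query_text)
```

The real DSL goes further: because it is generated from the schema, it can reject a field that does not exist before the query ever reaches the server.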
Parameterizable Queries
The DSL approach has other advantages, such as the ability to customize queries through parameters. To illustrate the point, let’s build a dependency graph for two of Pizza-Team’s projects: AutoML-Tools and Social-Distancing.
First, we build a query with the DSL that takes the organization as an argument and returns each project’s source dependencies with their versions:
    # Build the query using the DSL
    query = dsl_gql(
        DSLQuery(
            ds.Query.projects(org="Pizza-Team").select(
                ds.Project.name,
                ds.Project.description,
                # Notice the nested structure of the query
                ds.Project.commit.select(
                    ds.Commit.commit_id,
                    ds.Commit.sources.select(
                        ds.Source.name,
                        ds.Source.version,
                    ),
                ),
            )
        )
    )
    prjs = pd.json_normalize(client.execute(query)['projects'])

    # Apply a transformation to count the number of dependencies
    prjs['num_dependencies'] = prjs.apply(lambda x: len(x['commit.sources']), axis=1)
    prjs.head()
There are some interesting things in this snippet, so let’s break down the main bits:
The first section creates a query using the aforementioned DSL, but this time it uses a nested graph model that:
- Selects the associated commit for each project
- Obtains the commit’s ID
- Returns the name and version of each source that’s linked to the commit
The second section uses the DataFrame to count the number of dependencies by applying a simple lambda function to the commit.sources column.
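The commit.sources column name comes from flattening: nested dictionaries become dotted column names, while lists are kept as single cell values. The sketch below approximates that behavior with the standard library so you can see where the column comes from. The sample records are invented to resemble the shape of the response and are not real API output, and `flatten` is a hypothetical helper rather than the Pandas implementation.

```python
def flatten(record, parent="", sep="."):
    """Flatten nested dicts into dotted keys; lists are left untouched."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Invented sample shaped like the nested response (not real API output).
projects = [
    {"name": "demo-a", "commit": {"commit_id": "abc", "sources": [
        {"name": "requests", "version": "2.26.0"},
        {"name": "pandas", "version": "1.3.0"},
    ]}},
    {"name": "demo-b", "commit": {"commit_id": "def", "sources": []}},
]

rows = [flatten(p) for p in projects]
for row in rows:
    # Same idea as the lambda above: count the entries in the sources list.
    row["num_dependencies"] = len(row["commit.sources"])
    print(row["name"], row["num_dependencies"])
```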
Other Considerations
In addition to queries, GraphQL also supports two other types of operations:
- Mutations are the classic “create, update, and delete” operations.
- Subscriptions are connections that listen to changes in server data. They’re usually based on the WebSockets protocol, and allow you to build pipelines based on events that occur within the data.
The synchronous HTTP transport used in these examples is a good starting point, but given the increasing scale of data managed in modern applications, you will probably have to move to an asynchronous approach. Fortunately, the GQL library supports asynchronous HTTP transport in addition to WebSockets.
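As a rough illustration of why an asynchronous approach helps, the sketch below issues several queries concurrently with asyncio instead of one after another. `fake_query` is a stand-in that sleeps rather than performing network I/O; in a real program it would be replaced by an awaitable call on an asynchronous GraphQL client. All names here are hypothetical.

```python
import asyncio


async def fake_query(project, delay=0.01):
    """Stand-in for an awaitable GraphQL call; sleeps instead of doing I/O."""
    await asyncio.sleep(delay)
    return {"project": project, "ok": True}


async def fetch_all(projects):
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(fake_query(p) for p in projects))


results = asyncio.run(fetch_all(["AutoML-Tools", "Social-Distancing"]))
print([r["project"] for r in results])
```

Because the waits overlap, the total time stays close to that of the slowest single call rather than the sum of all of them.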
Conclusions – Programmatic Queries Using GraphQL
There are huge advantages to working with GraphQL, including:
- Flexibility for developers to retrieve only the information they require
- Self-documenting API graphs
- Payload reduction
- Quick data retrieval
In addition, the ability to use DSLs to map GraphQL schemas to native code provides a smooth way to increase your development speed. Offering GraphQL as part of your applications will help promote interoperability, and you can also take advantage of implementations offered by service providers.
The ActiveState Platform exposes its API using GraphQL, making it easy to query its catalog of Python packages for things like vulnerability information and open source licenses. Coming soon, you’ll also be able to use the API to query for:
- Software Bill of Materials (SBOM) – a complete list of all the dependencies in your project, along with each dependency’s name, supplier/author, version, license, timestamp, and dependency relationship – all of which can be used to satisfy US Government SBOM requirements.
Next Steps:
- Install our GraphQL Python environment for Windows, Linux or macOS and try it out by using it to query/work with some GraphQL APIs like:
- The ActiveState Platform GraphQL API
- List of GraphQL APIs you can also try