Datastore#

Modules#

Shortcut methods for getting set up with Google Cloud Datastore.

You’ll typically use these to get started with the API:

>>> from google.cloud import datastore
>>>
>>> client = datastore.Client()
>>> key = client.key('EntityKind', 1234)
>>> key
<Key('EntityKind', 1234), project=...>
>>> entity = datastore.Entity(key)
>>> entity['answer'] = 42
>>> entity
<Entity('EntityKind', 1234) {'answer': 42}>
>>> query = client.query(kind='EntityKind')

The main concepts with this API are:

  • Client which represents a project (string) and namespace (string) bundled with a connection and has convenience methods for constructing objects with that project / namespace.
  • Entity which represents a single entity in the datastore (akin to a row in relational database world).
  • Key which represents a pointer to a particular entity in the datastore (akin to a unique identifier in relational database world).
  • Query which represents a lookup or search over the rows in the datastore.
  • Transaction which represents an all-or-none transaction and enables consistency when race conditions may occur.
class google.cloud.datastore.Batch(client)[source]#

Bases: object

An abstraction representing a collected group of updates / deletes.

Used to build up a bulk mutation.

For example, the following snippet of code will put the two save operations and the delete operation into the same mutation, and send them to the server in a single API request:

>>> from google.cloud import datastore
>>> client = datastore.Client()
>>> batch = client.batch()
>>> batch.put(entity1)
>>> batch.put(entity2)
>>> batch.delete(key3)
>>> batch.commit()

You can also use a batch as a context manager, in which case commit() will be called automatically if its block exits without raising an exception:

>>> with batch:
...     batch.put(entity1)
...     batch.put(entity2)
...     batch.delete(key3)

By default, no updates will be sent if the block exits with an error:

>>> with batch:
...     do_some_work(batch)
...     raise Exception()  # rolls back
Parameters:client (google.cloud.datastore.client.Client) – The client used to connect to datastore.
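The commit-on-success, discard-on-error behaviour described above can be tried out without a Datastore connection. The following is a minimal sketch under stated assumptions, not the library's implementation: SketchBatch, its store dictionary and the tuple-based mutations are hypothetical names introduced only for illustration.

```python
class SketchBatch:
    """Hypothetical stand-in that mimics the documented Batch life cycle."""

    def __init__(self, store):
        self.store = store        # plain dict playing the role of the datastore
        self.mutations = []       # pending (operation, key, value) tuples
        self.status = "initial"

    def begin(self):
        if self.status != "initial":
            raise ValueError("Batch already begun.")
        self.status = "in progress"

    def put(self, key, value):
        self._require_in_progress()
        self.mutations.append(("put", key, value))

    def delete(self, key):
        self._require_in_progress()
        self.mutations.append(("delete", key, None))

    def commit(self):
        self._require_in_progress()
        # Apply every accumulated mutation in a single step.
        for operation, key, value in self.mutations:
            if operation == "put":
                self.store[key] = value
            else:
                self.store.pop(key, None)
        self.status = "finished"

    def rollback(self):
        self._require_in_progress()
        self.mutations.clear()
        self.status = "aborted"   # an aborted batch cannot be used again

    def _require_in_progress(self):
        if self.status != "in progress":
            raise ValueError("Batch is not in progress.")

    def __enter__(self):
        self.begin()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            self.commit()
        else:
            self.rollback()
        return False              # never swallow the caller's exception


store = {"k3": "old"}

# Clean exit: both mutations are applied at commit time.
with SketchBatch(store) as batch:
    batch.put("k1", {"answer": 42})
    batch.delete("k3")

# Failing block: nothing is written.
failed = SketchBatch(store)
try:
    with failed:
        failed.put("k2", {"answer": 0})
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass
```

After the first block the store holds only k1; after the second, k2 is absent and the batch is marked as aborted.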
begin()[source]#

Begins a batch.

This method is called automatically when entering a with statement; however, it can be called explicitly if you don’t want to use a context manager.

Overridden by google.cloud.datastore.transaction.Transaction.

Raises:ValueError if the batch has already begun.
commit()[source]#

Commits the batch.

This is called automatically upon exiting a with statement; however, it can be called explicitly if you don’t want to use a context manager.

Raises:ValueError if the batch is not in progress.
current()[source]#

Return the topmost batch / transaction, or None.

delete(key)[source]#

Remember a key to be deleted during commit().

Parameters:key (google.cloud.datastore.key.Key) – the key to be deleted.
Raises:ValueError if the batch is not in progress, if key is not complete, or if the key’s project does not match ours.
mutations#

Getter for the changes accumulated by this batch.

Every batch is committed with a single commit request containing all the work to be done as mutations. Inside a batch, calling put() with an entity, or delete() with a key, builds up the request by adding a new mutation. This getter returns the mutations that have been built up so far.

Return type:iterable
Returns:The list of datastore_pb2.Mutation protobufs to be sent in the commit request.
namespace#

Getter for namespace in which the batch will run.

Return type:str
Returns:The namespace in which the batch will run.
project#

Getter for project in which the batch will run.

Return type:str
Returns:The project in which the batch will run.
put(entity)[source]#

Remember an entity’s state to be saved during commit().

Note

Any existing properties for the entity will be replaced by those currently set on this instance. Already-stored properties which do not correspond to keys set on this instance will be removed from the datastore.

Note

Property values which are “text” (‘unicode’ in Python2, ‘str’ in Python3) map to ‘string_value’ in the datastore; values which are “bytes” (‘str’ in Python2, ‘bytes’ in Python3) map to ‘blob_value’.

When an entity has a partial key, calling commit() sends it as an insert mutation and the key is completed. On return, the key for the entity passed in is updated to match the key ID assigned by the server.

Parameters:entity (google.cloud.datastore.entity.Entity) – the entity to be saved.
Raises:ValueError if the batch is not in progress, if entity has no key assigned, or if the key’s project does not match ours.
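The text/bytes note above can be expressed as a small type check. This is a sketch for Python 3 only; value_field_for is a hypothetical helper name, and the real library handles many more value types than the two shown here.

```python
def value_field_for(value):
    """Return the datastore field name a text or bytes value maps to."""
    if isinstance(value, str):
        # "text" in Python 3
        return "string_value"
    if isinstance(value, bytes):
        # "bytes" in Python 3
        return "blob_value"
    raise TypeError("this sketch only covers text and bytes values")


text_field = value_field_for("caf\u00e9")
blob_field = value_field_for(b"\x00\x01")
```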
rollback()[source]#

Rolls back the current batch.

Marks the batch as aborted (can’t be used again).

Overridden by google.cloud.datastore.transaction.Transaction.

Raises:ValueError if the batch is not in progress.
class google.cloud.datastore.Client(project=None, namespace=None, credentials=None, _http=None, _use_grpc=None)[source]#

Bases: google.cloud.client.ClientWithProject

Convenience wrapper for invoking APIs/factories with a project.

>>> from google.cloud import datastore
>>> client = datastore.Client()
Parameters:
  • project (str) – (Optional) The project to pass to proxied API methods.
  • namespace (str) – (Optional) namespace to pass to proxied API methods.
  • credentials (Credentials) – (Optional) The OAuth2 Credentials to use for this client. If not passed (and if no _http object is passed), falls back to the default inferred from the environment.
  • _http (Session) – (Optional) HTTP object to make requests. Can be any object that defines request() with the same interface as requests.Session.request(). If not passed, an _http object is created that is bound to the credentials for the current object. This parameter should be considered private, and could change in the future.
  • _use_grpc (bool) – (Optional) Explicitly specifies whether to use the gRPC transport (via GAX) or HTTP. If unset, falls back to the GOOGLE_CLOUD_DISABLE_GRPC environment variable. This parameter should be considered private, and could change in the future.
allocate_ids(incomplete_key, num_ids)[source]#

Allocate a list of IDs from a partial key.

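Parameters:
  • incomplete_key (google.cloud.datastore.key.Key) – The partial key to use as the root for the allocated IDs.
  • num_ids (int) – The number of IDs to allocate.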
Return type:

list of google.cloud.datastore.key.Key

Returns:

The (complete) keys allocated with incomplete_key as root.

Raises:

ValueError if incomplete_key is not a partial key.

batch()[source]#

Proxy to google.cloud.datastore.batch.Batch.

current_batch#

Currently-active batch.

Return type:google.cloud.datastore.batch.Batch, or an object implementing its API, or NoneType (if no batch is active).
Returns:The batch/transaction at the top of the batch stack.
current_transaction#

Currently-active transaction.

Return type:google.cloud.datastore.transaction.Transaction, or an object implementing its API, or NoneType (if no transaction is active).
Returns:The transaction at the top of the batch stack.
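The difference between current_batch and current_transaction comes down to what sits on top of the batch stack. The sketch below models that rule with hypothetical classes (DemoBatch, DemoTransaction, DemoClient and its push/pop helpers are illustrative names, not the library's API).

```python
class DemoBatch:
    """Hypothetical placeholder for a batch."""


class DemoTransaction(DemoBatch):
    """Hypothetical placeholder for a transaction, which is a kind of batch."""


class DemoClient:
    def __init__(self):
        self._stack = []

    def push(self, batch):
        self._stack.append(batch)

    def pop(self):
        return self._stack.pop()

    @property
    def current_batch(self):
        # Topmost batch or transaction, or None when the stack is empty.
        return self._stack[-1] if self._stack else None

    @property
    def current_transaction(self):
        # Only reported when the topmost element is itself a transaction.
        top = self.current_batch
        return top if isinstance(top, DemoTransaction) else None


client = DemoClient()
xact = DemoTransaction()
inner = DemoBatch()

client.push(xact)
client.push(inner)
batch_on_top = client.current_batch
transaction_while_batch_on_top = client.current_transaction

client.pop()
transaction_after_pop = client.current_transaction
```

While the plain batch is on top, no current transaction is reported even though one is further down the stack.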
delete(key)[source]#

Delete the key in the Cloud Datastore.

Note

This is just a thin wrapper over delete_multi(). The backend API does not make a distinction between a single key or multiple keys in a commit request.

Parameters:key (google.cloud.datastore.key.Key) – The key to be deleted from the datastore.
delete_multi(keys)[source]#

Delete keys from the Cloud Datastore.

Parameters:keys (list of google.cloud.datastore.key.Key) – The keys to be deleted from the Datastore.
get(key, missing=None, deferred=None, transaction=None, eventual=False)[source]#

Retrieve an entity from a single key (if it exists).

Note

This is just a thin wrapper over get_multi(). The backend API does not make a distinction between a single key or multiple keys in a lookup request.

Parameters:
  • key (google.cloud.datastore.key.Key) – The key to be retrieved from the datastore.
  • missing (list) – (Optional) If a list is passed, the key-only entities returned by the backend as “missing” will be copied into it.
  • deferred (list) – (Optional) If a list is passed, the keys returned by the backend as “deferred” will be copied into it.
  • transaction (Transaction) – (Optional) Transaction to use for read consistency. If not passed, uses current transaction, if set.
  • eventual (bool) – (Optional) Defaults to strongly consistent (False). Setting True will use eventual consistency, but cannot be used inside a transaction or will raise ValueError.
Return type:

google.cloud.datastore.entity.Entity or NoneType

Returns:

The requested entity if it exists.

Raises:

ValueError if eventual is True and in a transaction.

get_multi(keys, missing=None, deferred=None, transaction=None, eventual=False)[source]#

Retrieve entities, along with their attributes.

Parameters:
  • keys (list of google.cloud.datastore.key.Key) – The keys to be retrieved from the datastore.
  • missing (list) – (Optional) If a list is passed, the key-only entities returned by the backend as “missing” will be copied into it. If the list is not empty, an error will occur.
  • deferred (list) – (Optional) If a list is passed, the keys returned by the backend as “deferred” will be copied into it. If the list is not empty, an error will occur.
  • transaction (Transaction) – (Optional) Transaction to use for read consistency. If not passed, uses current transaction, if set.
  • eventual (bool) – (Optional) Defaults to strongly consistent (False). Setting True will use eventual consistency, but cannot be used inside a transaction or will raise ValueError.
Return type:

list of google.cloud.datastore.entity.Entity

Returns:

The requested entities.

Raises:

ValueError if one or more of keys has a project which does not match our project.

Raises:

ValueError if eventual is True and in a transaction.
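The missing argument works as an output list: the caller passes an empty list and inspects it afterwards. A minimal sketch of that calling convention follows, using a plain dict as the backend. The function names are hypothetical, the sketch records bare keys rather than key-only entities, and raising ValueError for a non-empty list is an assumption (the text above only says that an error will occur).

```python
def get_multi_sketch(store, keys, missing=None):
    """Look up keys in a dict, reporting absent ones through ``missing``."""
    if missing is not None and missing != []:
        # Assumption: the error type is ValueError.
        raise ValueError("missing must be None or an empty list")
    found = []
    for key in keys:
        if key in store:
            found.append(store[key])
        elif missing is not None:
            missing.append(key)
    return found


def get_sketch(store, key, missing=None):
    """Thin wrapper over get_multi_sketch for a single key."""
    results = get_multi_sketch(store, [key], missing=missing)
    return results[0] if results else None


store = {"a": {"x": 1}}
missing = []
found = get_multi_sketch(store, ["a", "b"], missing=missing)
single = get_sketch(store, "zzz")
```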

key(*path_args, **kwargs)[source]#

Proxy to google.cloud.datastore.key.Key.

Passes our project.

put(entity)[source]#

Save an entity in the Cloud Datastore.

Note

This is just a thin wrapper over put_multi(). The backend API does not make a distinction between a single entity or multiple entities in a commit request.

Parameters:entity (google.cloud.datastore.entity.Entity) – The entity to be saved to the datastore.
put_multi(entities)[source]#

Save entities in the Cloud Datastore.

Parameters:entities (list of google.cloud.datastore.entity.Entity) – The entities to be saved to the datastore.
Raises:ValueError if entities is a single entity.
query(**kwargs)[source]#

Proxy to google.cloud.datastore.query.Query.

Passes our project.

Using query to search a datastore:

>>> query = client.query(kind='MyKind')
>>> query.add_filter('property', '=', 'val')

Using the query iterator

>>> query_iter = query.fetch()
>>> for entity in query_iter:
...     do_something(entity)

or manually page through results

>>> query_iter = query.fetch(start_cursor=cursor)
>>> pages = query_iter.pages
>>>
>>> first_page = next(pages)
>>> first_page_entities = list(first_page)
>>> query_iter.next_page_token is None
True
Parameters:kwargs (dict) – Parameters for initializing an instance of Query.
Return type:Query
Returns:A query object.
transaction()[source]#

Proxy to google.cloud.datastore.transaction.Transaction.

class google.cloud.datastore.Entity(key=None, exclude_from_indexes=())[source]#

Bases: dict

Entities are akin to rows in a relational database.

An entity stores the actual instance of data.

Each entity is officially represented with a Key; however, you might create an entity with only a partial key (that is, a key with a kind, and possibly a parent, but without an ID). In such a case, the datastore service will automatically assign an ID to the partial key.

Entities in this API act like dictionaries with extras built in that allow you to delete or persist the data stored on the entity.

Entities are mutable and act like a subclass of a dictionary. This means you could take an existing entity and change the key to duplicate the object.

Use get() to retrieve an existing entity:

>>> client.get(key)
<Entity('EntityKind', 1234) {'property': 'value'}>

You can then set values on the entity just like you would on any other dictionary.

>>> entity['age'] = 20
>>> entity['name'] = 'JJ'

However, not all types are allowed as a value for a Google Cloud Datastore entity. The following basic types are supported by the API:

In addition, three container types are supported:

  • list
  • Entity
  • dict (will just be treated like an Entity without a key or exclude_from_indexes)

Each entry in a list must be one of the value types (basic or container) and each value in an Entity must as well. In this case an Entity as a container acts as a dict, but also has the special annotations of key and exclude_from_indexes.

And you can treat an entity like a regular Python dictionary:

>>> sorted(entity.keys())
['age', 'name']
>>> sorted(entity.items())
[('age', 20), ('name', 'JJ')]

Note

When saving an entity to the backend, values which are “text” (unicode in Python2, str in Python3) will be saved using the ‘string_value’ field, after being encoded to UTF-8. When retrieved from the back-end, such values will be decoded to “text” again. Values which are “bytes” (str in Python2, bytes in Python3) will be saved using the ‘blob_value’ field, without any decoding / encoding step.

Parameters:
  • key (google.cloud.datastore.key.Key) – Optional key to be set on entity.
  • exclude_from_indexes (tuple of string) – Names of fields whose values are not to be indexed for this entity.
kind#

Get the kind of the current entity.

Note

This relies entirely on the google.cloud.datastore.key.Key set on the entity. That means that we’re not storing the kind of the entity at all, just the properties and a pointer to a Key which knows its Kind.
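To make the note above concrete, here is a small sketch of a dict subclass whose kind is read from its key rather than stored among its properties. DemoKey and DemoEntity are hypothetical stand-ins written for this illustration, not the library classes.

```python
class DemoKey:
    """Hypothetical key holding just a kind and an optional ID or name."""

    def __init__(self, kind, id_or_name=None):
        self.kind = kind
        self.id_or_name = id_or_name


class DemoEntity(dict):
    """Hypothetical entity: a dict plus ``key`` and ``exclude_from_indexes``."""

    def __init__(self, key=None, exclude_from_indexes=()):
        super().__init__()
        self.key = key
        self.exclude_from_indexes = set(exclude_from_indexes)

    @property
    def kind(self):
        # The kind is never stored on the entity; it is looked up on the key.
        return self.key.kind if self.key is not None else None


entity = DemoEntity(key=DemoKey("EntityKind", 1234), exclude_from_indexes=("notes",))
entity["age"] = 20
entity["name"] = "JJ"
```

The properties behave like ordinary dictionary items, while kind stays an attribute derived from the key.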

class google.cloud.datastore.Key(*path_args, **kwargs)[source]#

Bases: object

An immutable representation of a datastore Key.

To create a basic key directly:

>>> from google.cloud.datastore import Key
>>> Key('EntityKind', 1234, project=project)
<Key('EntityKind', 1234), project=...>
>>> Key('EntityKind', 'foo', project=project)
<Key('EntityKind', 'foo'), project=...>

Though typical usage comes via the key() factory:

>>> client.key('EntityKind', 1234)
<Key('EntityKind', 1234), project=...>
>>> client.key('EntityKind', 'foo')
<Key('EntityKind', 'foo'), project=...>

To create a key with a parent:

>>> client.key('Parent', 'foo', 'Child', 1234)
<Key('Parent', 'foo', 'Child', 1234), project=...>
>>> client.key('Child', 1234, parent=parent_key)
<Key('Parent', 'foo', 'Child', 1234), project=...>

To create a partial key:

>>> client.key('Parent', 'foo', 'Child')
<Key('Parent', 'foo', 'Child'), project=...>
Parameters:
  • path_args (tuple of string and integer) – May represent a partial (odd length) or full (even length) key path.
  • kwargs (dict) – Keyword arguments to be passed in.

Accepted keyword arguments are

  • namespace (string): A namespace identifier for the key.
  • project (string): The project associated with the key.
  • parent (Key): The parent of the key.

The project argument is required unless it has been set implicitly.
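The odd-length / even-length rule for path_args can be illustrated on plain tuples. The helpers below are hypothetical and operate on a flat path such as ('Parent', 'foo', 'Child', 1234); they are a sketch of the path arithmetic only and ignore project and namespace.

```python
def is_partial(flat_path):
    """An odd-length path ends in a kind that has no ID or name yet."""
    return len(flat_path) % 2 == 1


def completed(flat_path, id_or_name):
    """Return a new path with the final ID or name filled in."""
    if not is_partial(flat_path):
        raise ValueError("Only a partial key can be completed.")
    if not isinstance(id_or_name, (str, int)):
        raise ValueError("id_or_name must be a string or an integer.")
    return flat_path + (id_or_name,)


def parent(flat_path):
    """Drop the last path element; a single-element path has no parent."""
    last_element_len = 1 if is_partial(flat_path) else 2
    return flat_path[:len(flat_path) - last_element_len] or None


partial_path = ("Parent", "foo", "Child")
full_path = completed(partial_path, 1234)
```

Tuples are used so that each helper returns a new path and leaves its input unchanged, in keeping with the immutability of keys.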

completed_key(id_or_name)[source]#

Creates new key from existing partial key by adding final ID/name.

Parameters:id_or_name (str or integer) – ID or name to be added to the key.
Return type:google.cloud.datastore.key.Key
Returns:A new Key instance with the same data as the current one and an extra ID or name added.
Raises:ValueError if the current key is not partial or if id_or_name is not a string or integer.
flat_path#

Getter for the key path as a tuple.

Return type:tuple of string and integer
Returns:The tuple of elements in the path.
classmethod from_legacy_urlsafe(urlsafe)[source]#

Convert urlsafe string to Key.

This is intended to work with the “legacy” representation of a datastore “Key” used within Google App Engine (a so-called “Reference”). This assumes that urlsafe was created within an App Engine app via something like ndb.Key(...).urlsafe().

Parameters:urlsafe (bytes or unicode) – The base64 encoded (ASCII) string corresponding to a datastore “Key” / “Reference”.
Return type:Key
Returns:The key corresponding to urlsafe.
id#

ID getter. Based on the last element of path.

Return type:int
Returns:The (integer) ID of the key.
id_or_name#

Getter. Based on the last element of path.

Return type:int (if id) or string (if name)
Returns:The last element of the key’s path if it is either an id or a name.
is_partial#

Boolean indicating if the key has an ID (or name).

Return type:bool
Returns:True if the last element of the key’s path does not have an id or a name.
kind#

Kind getter. Based on the last element of path.

Return type:str
Returns:The kind of the current key.
name#

Name getter. Based on the last element of path.

Return type:str
Returns:The (string) name of the key.
namespace#

Namespace getter.

Return type:str
Returns:The namespace of the current key.
parent#

The parent of the current key.

Return type:google.cloud.datastore.key.Key or NoneType
Returns:A new Key instance, whose path consists of all but the last element of current path. If the current key has only one path element, returns None.
path#

Path getter.

Returns a copy so that the key remains immutable.

Return type:list of dict
Returns:The (key) path of the current key.
project#

Project getter.

Return type:str
Returns:The key’s project.
to_legacy_urlsafe()[source]#

Convert to a base64-encoded, URL-safe string for App Engine.

This is intended to work with the “legacy” representation of a datastore “Key” used within Google App Engine (a so-called “Reference”). The returned string can be used as the urlsafe argument to ndb.Key(urlsafe=...). The base64 encoded values will have padding removed.

Note

The string returned by to_legacy_urlsafe is equivalent, but not identical, to the string returned by ndb.

Return type:bytes
Returns:A bytestring containing the key encoded as URL-safe base64.
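The "URL-safe base64 with padding removed" encoding can be reproduced with the standard library. The sketch below shows only the encoding step on arbitrary bytes; the payload is a placeholder, not a real serialized key, and both helper names are hypothetical.

```python
import base64


def encode_unpadded(raw):
    """URL-safe base64 encode ``raw`` and strip the trailing '=' padding."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=")


def decode_unpadded(urlsafe):
    """Reverse encode_unpadded, accepting either bytes or text."""
    if isinstance(urlsafe, str):
        urlsafe = urlsafe.encode("ascii")
    # Restore the padding so the length is a multiple of four again.
    padding = b"=" * (-len(urlsafe) % 4)
    return base64.urlsafe_b64decode(urlsafe + padding)


raw = b"\xfb\xff\x00ke"          # placeholder payload, five bytes
encoded = encode_unpadded(raw)
decoded = decode_unpadded(encoded)
```

Accepting both bytes and text on the decoding side mirrors the "bytes or unicode" parameter type documented for from_legacy_urlsafe().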
to_protobuf()[source]#

Return a protobuf corresponding to the key.

Return type:entity_pb2.Key
Returns:The protobuf representing the key.
class google.cloud.datastore.Query(client, kind=None, project=None, namespace=None, ancestor=None, filters=(), projection=(), order=(), distinct_on=())[source]#

Bases: object

A Query against the Cloud Datastore.

This class serves as an abstraction for creating a query over data stored in the Cloud Datastore.

Parameters:
  • client (google.cloud.datastore.client.Client) – The client used to connect to Datastore.
  • kind (str) – The kind to query.
  • project (str) – (Optional) The project associated with the query. If not passed, uses the client’s value.
  • namespace (str) – (Optional) The namespace to which to restrict results. If not passed, uses the client’s value.
  • ancestor (Key) – (Optional) key of the ancestor to which this query’s results are restricted.
  • filters (tuple[str, str, str]) – Property filters applied by this query. The sequence is (property_name, operator, value).
  • projection (sequence of string) – fields returned as part of query results.
  • order (sequence of string) – field names used to order query results. Prepend - to a field name to sort it in descending order.
  • distinct_on (sequence of string) – field names used to group query results.
Raises:

ValueError if project is not passed and no implicit default is set.
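The order convention (a leading - means descending) can be sketched locally on a list of dicts. This is only an illustration of the naming rule; apply_order is a hypothetical helper, and the real sorting is performed by the Datastore backend.

```python
def apply_order(rows, order):
    """Sort dicts by the given field names; a '-' prefix sorts descending."""
    result = list(rows)
    # Sort by the least significant field first. list.sort() is stable,
    # so earlier fields in ``order`` end up taking precedence.
    for field in reversed(order):
        descending = field.startswith("-")
        name = field[1:] if descending else field
        result.sort(key=lambda row: row[name], reverse=descending)
    return result


people = [
    {"name": "Ann", "age": 30},
    {"name": "Bob", "age": 50},
    {"name": "Cy", "age": 30},
]
ordered = apply_order(people, ["-age", "name"])
ordered_names = [row["name"] for row in ordered]
```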

add_filter(property_name, operator, value)[source]#

Filter the query based on a property name, operator and a value.

Expressions take the form of:

.add_filter('<property>', '<operator>', <value>)

where property is a property stored on the entity in the datastore and operator is one of OPERATORS (i.e., =, <, <=, >, >=):

>>> from google.cloud import datastore
>>> client = datastore.Client()
>>> query = client.query(kind='Person')
>>> query.add_filter('name', '=', 'James')
>>> query.add_filter('age', '>', 50)
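Parameters:
  • property_name (str) – The name of the property to filter on.
  • operator (str) – One of =, <, <=, >, >=.
  • value – The value to filter on.
Raises:

ValueError if operator is not one of the specified values, or if a filter names '__key__' but passes an invalid value (a key is required).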
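The (property_name, operator, value) triples can be evaluated locally to see what a set of filters selects. The sketch below is an in-memory approximation, not how the backend executes queries: SKETCH_OPERATORS and matches are hypothetical names, and it assumes that multiple filters are combined with a logical AND.

```python
import operator

# Hypothetical mapping from the documented operator strings to Python functions.
SKETCH_OPERATORS = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def matches(entity, filters):
    """Return True if the dict-like entity satisfies every filter."""
    for property_name, op, value in filters:
        if op not in SKETCH_OPERATORS:
            raise ValueError("Invalid operator: %r" % (op,))
        if property_name not in entity:
            return False
        if not SKETCH_OPERATORS[op](entity[property_name], value):
            return False
    return True


people = [
    {"name": "James", "age": 60},
    {"name": "James", "age": 40},
    {"name": "Sally", "age": 70},
]
filters = [("name", "=", "James"), ("age", ">", 50)]
selected = [person for person in people if matches(person, filters)]
```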

ancestor#

The ancestor key for the query.

Return type:Key or None
Returns:The ancestor for the query.
distinct_on#

Names of fields used to group query results.

Return type:sequence of string
Returns:The “distinct on” fields set on the query.
fetch(limit=None, offset=0, start_cursor=None, end_cursor=None, client=None, eventual=False)[source]#

Execute the Query; return an iterator for the matching entities.

For example:

>>> from google.cloud import datastore
>>> client = datastore.Client()
>>> query = client.query(kind='Person')
>>> query.add_filter('name', '=', 'Sally')
>>> list(query.fetch())
[<Entity object>, <Entity object>, ...]
>>> list(query.fetch(1))
[<Entity object>]
Parameters:
  • limit (int) – (Optional) limit passed through to the iterator.
  • offset (int) – (Optional) offset passed through to the iterator.
  • start_cursor (bytes) – (Optional) cursor passed through to the iterator.
  • end_cursor (bytes) – (Optional) cursor passed through to the iterator.
  • client (google.cloud.datastore.client.Client) – (Optional) client used to connect to datastore. If not supplied, uses the query’s value.
  • eventual (bool) – (Optional) Defaults to strongly consistent (False). Setting True will use eventual consistency, but cannot be used inside a transaction or will raise ValueError.
Return type:

Iterator

Returns:

The iterator for the query.

filters#

Filters set on the query.

Return type:tuple[str, str, str]
Returns:The filters set on the query. The sequence is (property_name, operator, value).
key_filter(key, operator='=')[source]#

Filter on a key.

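Parameters:
  • key (google.cloud.datastore.key.Key) – The key to filter on.
  • operator (str) – (Optional) One of =, <, <=, >, >=. Defaults to =.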
keys_only()[source]#

Set the projection to include only keys.

kind#

Get the Kind of the Query.

Return type:str
Returns:The kind for the query.
namespace#

This query’s namespace.

Return type:str or None
Returns:The namespace assigned to this query.
order#

Names of fields used to sort query results.

Return type:sequence of string
Returns:The order(s) set on the query.
project#

Get the project for this Query.

Return type:str
Returns:The project for the query.
projection#

Field names returned by the query.

Return type:sequence of string
Returns:Names of fields in query results.
class google.cloud.datastore.Transaction(client)[source]#

Bases: google.cloud.datastore.batch.Batch

An abstraction representing datastore Transactions.

Transactions can be used to build up a bulk mutation and ensure all or none succeed (transactionally).

For example, the following snippet of code will put the two save operations (either insert or upsert) into the same mutation, and execute those within a transaction:

>>> with client.transaction():
...     client.put_multi([entity1, entity2])

Because it derives from Batch, Transaction also provides put() and delete() methods:

>>> with client.transaction() as xact:
...     xact.put(entity1)
...     xact.delete(entity2.key)

By default, the transaction is rolled back if the transaction block exits with an error:

>>> with client.transaction():
...     do_some_work()
...     raise SomeException  # rolls back
Traceback (most recent call last):
  ...
SomeException

If the transaction block exits without an exception, it will commit by default.

Warning

Inside a transaction, automatically assigned IDs for entities will not be available at save time! That means, if you try:

>>> with client.transaction():
...     entity = datastore.Entity(key=client.key('Thing'))
...     client.put(entity)

entity won’t have a complete key until the transaction is committed.

Once you exit the transaction (or call commit()), the automatically generated ID will be assigned to the entity:

>>> with client.transaction():
...     entity = datastore.Entity(key=client.key('Thing'))
...     client.put(entity)
...     print(entity.key.is_partial)  # There is no ID on this key.
...
True
>>> print(entity.key.is_partial)  # There *is* an ID.
False

If you don’t want to use the context manager you can initialize a transaction manually:

>>> transaction = client.transaction()
>>> transaction.begin()
>>>
>>> entity = datastore.Entity(key=client.key('Thing'))
>>> transaction.put(entity)
>>>
>>> transaction.commit()
Parameters:client (google.cloud.datastore.client.Client) – the client used to connect to datastore.
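The warning about automatically assigned IDs can be reproduced with a small stand-in in which IDs come from a local counter instead of the server. Everything in this sketch is hypothetical (SketchTransaction, the dict-based entity and the is_partial helper); it only illustrates that the key stays partial until commit time.

```python
import itertools


def is_partial(key_path):
    """An odd-length path has a kind but no ID or name yet."""
    return len(key_path) % 2 == 1


class SketchTransaction:
    """Hypothetical stand-in; IDs come from a local counter, not a server."""

    def __init__(self):
        self._pending = []
        self._next_id = itertools.count(1)

    def put(self, entity):
        # Only remember the entity; no ID is assigned at this point.
        self._pending.append(entity)

    def commit(self):
        # Partial keys are completed only when the transaction commits.
        for entity in self._pending:
            if is_partial(entity["key"]):
                entity["key"] += (next(self._next_id),)
        self._pending.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            self.commit()
        return False


with SketchTransaction() as xact:
    thing = {"key": ("Thing",), "answer": 42}
    xact.put(thing)
    partial_inside = is_partial(thing["key"])   # still no ID here

partial_after = is_partial(thing["key"])        # completed by commit()
```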
begin()[source]#

Begins a transaction.

This method is called automatically when entering a with statement; however, it can be called explicitly if you don’t want to use a context manager.

Raises:ValueError if the transaction has already begun.
commit()[source]#

Commits the transaction.

This is called automatically upon exiting a with statement; however, it can be called explicitly if you don’t want to use a context manager.

This method has necessary side-effects:

  • Sets the current transaction’s ID to None.
current()[source]#

Return the topmost transaction.

Note

If the topmost element on the stack is not a transaction, returns None.

Return type:google.cloud.datastore.transaction.Transaction or None
Returns:The current transaction (if any are active).
id#

Getter for the transaction ID.

Return type:str
Returns:The ID of the current transaction.
rollback()[source]#

Rolls back the current transaction.

This method has necessary side-effects:

  • Sets the current transaction’s ID to None.