The developers' guide to Insight data

Our in-depth guide to using Insight data.

Before you start

Things you need to know:

  • Less technical information is also available
    This is a developers' guide to using our Insight data; for a shorter, less technical introduction to Insight data see our guide for non-developers.

What is Insight data? 

Insight data is storable information related to your customers’ purchasing histories and habits. It allows you to store collections of data against your contacts or account via our API.

So they’re additional contact data fields, extending the contact data fields we already store?

Yes, but the main difference is that these fields are transaction-based. Any additional keyed data you have for a contact can be stored as Insight data, with very little restriction on the type of data. As a broad rule, you can store anything that is serialisable to JSON.

Just as you can currently segment upon contact data fields such as gender, age and geography, Insight data gives you the ability to segment upon types of items purchased, the regularity of purchases and amount spent on purchases, for instance. As ever, the quality of your Insight data will dictate how well you can segment address books and personalise your campaigns and offers.

Dataset examples

Here are some examples of datasets that could be stored as Insight data:

  • For an auction site, a list of all bids a user has made and if they succeeded or not, including the date and time of the bid, the amount of the bid and the category and name of the item bid on.
  • For a travel agent, a list of destinations a user has visited via bookings on the site, including the number of bookings for the destination (solo, couple, family, etc.), the amount spent and the type of accommodation.
  • A list of a user’s likes and dislikes.

The key point is that many different types of Insight data can be stored for each contact, and structured in a way of your choosing.

What can I do with this data?

Once Insight data is stored against your contacts, you can use it to write queries to segment your contacts against this data and create new address books. This will enable you to send targeted, personalised campaigns based upon your contacts’ transactional history and habits. Using the above examples, you will be able to run segmentations based on bids made, countries visited and likes and dislikes.

For more on segmenting Insight data, read our article on Insight data segmentation.

How is Insight data stored?

This section provides information and guidelines on storing your Insight data. It is quite technical, so if you are not a developer you will probably want a developer to read it instead, or at least to help you understand it.

Keys are used to refer to individual pieces of data

At its heart, Insight data is a key-value pair storage mechanism; you store a piece of data against a contact, using a key to refer to it. If you need to update or delete a piece of data at a later date, you use the key as a reference to retrieve the piece of data and alter it.

Restrictions on keys

Keys must:

  • Be unique (it is up to the user of the API to ensure uniqueness)

Keys mustn't:

  • Begin with a number
  • Exceed 255 characters in length

Keys can only contain the following characters:

  • Alphanumeric (a-z, A-Z, 0-9)
  • Hyphen ("-")
  • Underscore ("_")
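The key rules above can be checked before any data is sent. The following is a minimal sketch in Python, not part of the Insight data API: the names KEY_PATTERN, is_valid_key and find_duplicate_keys are illustrative, and treating an empty key as invalid is an assumption.

```python
import re

# Letters, digits, hyphen and underscore only; the first character must not be
# a digit; 255 characters at most. Rejecting an empty key is an assumption.
KEY_PATTERN = re.compile(r"[A-Za-z_-][A-Za-z0-9_-]{0,254}")


def is_valid_key(key: str) -> bool:
    """Return True if key meets the character, first-character and length rules."""
    return KEY_PATTERN.fullmatch(key) is not None


def find_duplicate_keys(keys: list[str]) -> set[str]:
    """Return the keys that occur more than once (uniqueness is the caller's job)."""
    seen: set[str] = set()
    duplicates: set[str] = set()
    for key in keys:
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return duplicates
```

For example, is_valid_key("order-1001") is accepted, while "1001-order" is rejected because it begins with a number.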

Data is stored against a contact or as account-scoped data

A contact identifier (email address or ID) is submitted with each Insight data document uploaded. If the identifier used is "account" rather than an email address, then the record will be stored as account-scoped Insight data (AccountInsight data). This data can be used for advanced personalisation and viewed by clicking the Product recommendations tab in the 'Insight data' area.

JSON data representation

All data to be stored must be serialisable as JSON. This means complex object structures can be used to represent data. For example, you could represent the contents of a user’s shopping basket like this:

{
  "basketId": 4858,
  "datePurchased": "2013-03-19T17:22:54.8042",
  "productPurchased": [{
    "name": "a dvd",
    "cost": 10.99,
    "isOnSale": false
  }, {
    "name": "a book",
    "cost": 2.99,
    "isOnSale": true
  }]
}

There are a few restrictions and extensions to the JSON format as outlined below.

Restrictions on collection names

Valid collection names can only contain the following characters:

  • Alphanumeric (a-z, A-Z, 0-9)
  • Hyphen ("-")
  • Underscore ("_")

Furthermore, collection names can't:

  • Begin with a number
  • Exceed 255 characters in length
  • Have exactly the same name as any other collection, even if the collections are differently scoped (e.g. one collection is contact-scoped and another collection is account-scoped (AccountInsight))

Restrictions and extensions on values

  • String values are restricted to 1000 characters in length.
  • Date time values are also supported via a subset of the ISO 8601 standard (e.g. 2013-03-19T17:22:55.804Z).
  • All items in a list must be of the same type. A list may contain only Booleans, only strings, only numbers, only objects or only lists, but not a mixture.

See the JSON specification for all other value types.
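The string-length and list rules above can be checked on a document before upload. This is a minimal sketch in Python under stated assumptions: json_kind and check_value are illustrative names, date time values are treated as ordinary strings, and null is treated as a type of its own.

```python
def json_kind(value):
    """Name the JSON type of a Python value (bool is tested before number)."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "list"
    if isinstance(value, dict):
        return "object"
    if value is None:
        return "null"
    raise TypeError(f"not a JSON value: {value!r}")


def check_value(value, path="$"):
    """Return a list of rule violations found anywhere inside value."""
    problems = []
    if isinstance(value, str):
        if len(value) > 1000:
            problems.append(f"{path}: string longer than 1000 characters")
    elif isinstance(value, list):
        kinds = {json_kind(item) for item in value}
        if len(kinds) > 1:
            problems.append(f"{path}: list mixes types {sorted(kinds)}")
        for index, item in enumerate(value):
            problems.extend(check_value(item, f"{path}[{index}]"))
    elif isinstance(value, dict):
        for name, item in value.items():
            problems.extend(check_value(item, f"{path}.{name}"))
    return problems
```

An empty result means no violation of these two rules was found; it does not guarantee that an upload will be accepted.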

A note on date time formatting

Not all ISO 8601 formats are supported (for example the string "2013" would not be treated as a date). The following are valid for this implementation.

   Complete date plus hours, minutes and seconds:
      YYYY-MM-DDThh:mm:ssTZD (e.g. 1997-07-16T19:20:30+01:00)
   Complete date plus hours, minutes, seconds and a decimal fraction of a second:
      YYYY-MM-DDThh:mm:ss.sTZD (e.g. 1997-07-16T19:20:30.45+01:00)


     YYYY = four-digit year
     MM   = two-digit month (01=January, etc.)
     DD   = two-digit day of month (01 through 31)
     hh   = two digits of hour (00 through 23) (am/pm NOT allowed)
     mm   = two digits of minute (00 through 59)
     ss   = two digits of second (00 through 59)
     s    = one or more digits representing a decimal fraction of a second
     TZD  = time zone designator (Z or +hh:mm or -hh:mm)

All times without a time zone designator will be treated as UTC.
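The two supported formats can be recognised with a regular expression. The sketch below is one possible client-side check in Python, not the parser the service itself uses; parse_insight_datetime is an illustrative name. It follows the rule above that a missing time zone designator means UTC.

```python
import re
from datetime import datetime, timedelta, timezone

# Date, "T", time, optional fraction of a second, optional time zone designator.
DATE_TIME = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d+))?"
    r"(Z|[+-]\d{2}:\d{2})?",
    re.ASCII,
)


def parse_insight_datetime(text: str) -> datetime:
    """Parse one of the two supported formats into a timezone-aware datetime."""
    match = DATE_TIME.fullmatch(text)
    if match is None:
        raise ValueError(f"not a supported date time: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    fraction, zone = match.group(7), match.group(8)
    # datetime stores microseconds, so pad or truncate the fraction to six digits.
    microsecond = int((fraction or "0")[:6].ljust(6, "0"))
    if zone is None or zone == "Z":
        tz = timezone.utc  # no designator is treated as UTC
    else:
        sign = 1 if zone[0] == "+" else -1
        offset = timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
        tz = timezone(sign * offset)
    return datetime(year, month, day, hour, minute, second, microsecond, tzinfo=tz)
```

A string such as "2013" fails the pattern and raises ValueError, matching the note that it would not be treated as a date.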

JSON data is stored in schema-bound collections

While you can store any JSON structure as Insight data, it is not strictly schemaless. Insight data requires that you store similarly structured data in a collection. Each collection may have a different schema, allowing you to store a variety of different data against contacts.

The schema of a collection is defined ‘on the fly’

Schemas are defined implicitly upon uploading data via the API for the first time. Furthermore, schemas are extendable but not editable.

For example, let’s say you’ve just uploaded the following object into a collection you’ve created called “preferences”:

{
  "likes": {
    "animals": ["dogs"],
    "numbers": [1, 2, 7]
  },
  "dislikes": {
    "animals": ["rats"],
    "numbers": [13]
  }
}


For all subsequent uploads, the preferences collection will expect an object with two properties, “likes” and “dislikes”, each of which refers to an object with a property called “animals”, which is a list of strings, and a property called “numbers”, which is a list of numbers.
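To make the idea of an implicit schema concrete, the sketch below derives a type description from a first upload. It is an illustration in Python only: describe is a hypothetical helper, and the way the service represents schemas internally is not documented here.

```python
def describe(value):
    """Return a nested description of the JSON types found in value."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    if isinstance(value, list):
        # Lists hold a single type, so the first item stands for all of them.
        return [describe(value[0])] if value else []
    if isinstance(value, dict):
        return {name: describe(item) for name, item in value.items()}
    raise TypeError(f"not a JSON value: {value!r}")


preferences = {
    "likes": {"animals": ["dogs"], "numbers": [1, 2, 7]},
    "dislikes": {"animals": ["rats"], "numbers": [13]},
}
schema = describe(preferences)
```

Here schema describes “likes” and “dislikes” as objects whose “animals” property is a list of strings and whose “numbers” property is a list of numbers.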

Types are immutable

Once you’ve defined a named property, its type can’t be changed. For example, if you upload this JSON object to define your schema for a collection:

{
  "id": "1",
  "name": "Tom Bloggs"
}

You couldn’t then upload the following object:

{
  "id": 2,
  "name": {
    "first": "john",
    "second": "smith"
  }
}

Why? Because the “id” property has been changed from a string to a number and the “name” property has been changed from a string to an object.
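This rule can be mirrored on the client side to catch type changes before an upload is attempted. The following is a minimal sketch in Python under stated assumptions: record_types and type_name are illustrative names, the schema is kept as a flat mapping of dotted property paths to type names, and list contents are not inspected. Properties that have not been seen before are simply recorded.

```python
def type_name(value):
    """Name the JSON type of a Python value (bool is tested before number)."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "list"
    if isinstance(value, dict):
        return "object"
    return "null" if value is None else type(value).__name__


def record_types(schema: dict, document: dict, prefix: str = "") -> dict:
    """Return a copy of schema updated with the property types in document.

    Raises ValueError if document gives a known property a different type.
    """
    updated = dict(schema)
    for name, value in document.items():
        path = f"{prefix}{name}"
        kind = type_name(value)
        known = updated.setdefault(path, kind)
        if known != kind:
            raise ValueError(f"{path} is already defined as {known}, not {kind}")
        if isinstance(value, dict):
            updated = record_types(updated, value, prefix=f"{path}.")
    return updated
```

With the first object above recorded, passing the second object raises ValueError because “id” changes from a string to a number.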

Objects are extendable

Objects are the only mutable part of a schema, because they are extendable. Properties can be added to an object at any time, although each new property is then bound by the rule above. For example, if this object was uploaded to define a collection’s schema:

{
  "fullName": "John Bloggs"
}

Then uploading this object would extend the collection's schema to include the two new properties, “firstName” and “lastName”:

{
  "fullName": "John Bloggs",
  "firstName": "John",
  "lastName": "Bloggs"
}

Bulk import Insight data

As standard, you will be adding your data on a single-transaction-at-a-time basis – for instance, at the point of online checkout.

Import using the app

To import using the app, check out the article Import insight data.

Import using the API

To get you up and running, it is highly likely that you’ll want to make an initial bulk import of all your historical data. This can be done using the REST API methods Bulk add transactional data to contacts and Get transactional data import status.

It is also possible to use the SOAP API methods ImportTransactionalData and GetTransactionalDataImportProgress. However, our SOAP API is now deprecated and we can no longer offer support for it, so we advise you to use the REST API methods.

You can perform multiple bulk imports at a time for an account. Typically we'd suggest five as a maximum, for optimal speed and performance.
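One way to respect the suggested maximum of five is to cap concurrency on the client. The sketch below, in Python, shows only the throttling: start_import is a hypothetical caller-supplied function that would submit one bulk import and wait for it to finish, and no real API call is shown.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_IMPORTS = 5  # the suggested ceiling for one account


def run_imports(batches, start_import):
    """Call start_import(batch) for every batch, at most five at a time.

    start_import is hypothetical: it stands for whatever code submits one bulk
    import and polls its status. Results are returned in the order of batches.
    """
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_IMPORTS) as pool:
        return list(pool.map(start_import, batches))
```

Because the pool has five workers, a sixth import only starts once one of the first five has returned.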

Plan ahead

Given the nature of Insight data storage, you will want to spend some time thinking about how your Insight data needs to work for you. A little bit of planning will go a long way! What sort of data do you want to store? How might you want to extend this data as you move forward?

The point to remember is that you don’t want to become backed into a corner by initiating a schema that will prove constrictive. For example, you will probably want to avoid entering your product ID as an integer. This won’t give you very easily identifiable information and you won’t be able to change it once it is uploaded. To change it, you would need to start again with a new schema and re-upload all of your data to conform to it. This is definitely something to be avoided!

Data schema

Set out below are the standard order and product Insight data schemas as commonly used by our integrations. These schemas are still extendable, however. Certain attributes are mandatory for using our product recommendations feature. These are indicated with a green tick.

Order Insight data (Contact insight) schema

Mandatory attributes for product recommendations

Please note that an order collection must be named orders. A green tick (✓) next to an attribute indicates that it's mandatory for our product recommendations feature.

Attribute                   Type
id                          string
order_total (✓)             numerical
payment                     string
delivery_method             string
delivery_total              numerical
*currency (✓)               string
order_status                string
email                       string
quote_id                    string
purchase_date (✓)           date
billing_address             array
  billing_address_1         string
  billing_address_2         string
  billing_city              string
  billing_country           string
  billing_postcode          string
delivery_address            array
  delivery_address_1        string
  delivery_address_2        string
  delivery_city             string
  delivery_country          string
  delivery_postcode         string
products (✓)                array
  name (✓)                  string
  **price (✓)               numerical
  sku (✓)                   string
  qty (✓)                   numerical
order_subtotal (✓)          numerical
base_subtotal_incl_tax      numerical
***discount_amount          numerical
couponCode                  string

*The currency attribute must be one that is supported. You can find a list of supported currency codes and how to format them in the Currency conversion article.
**Price: this price should correspond to the item's unit price.
*** The discount amount that's deducted from the order total. This should be a positive value.
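As an illustration of the order schema, the sketch below builds a small order document and checks it for the attributes marked as mandatory for product recommendations. It is written in Python; the helper name and all values (ids, SKUs, amounts, the GBP currency code) are made up for the example.

```python
# Attributes marked as mandatory for product recommendations in the table above.
MANDATORY_ORDER_FIELDS = ("order_total", "currency", "purchase_date", "products", "order_subtotal")
MANDATORY_PRODUCT_FIELDS = ("name", "price", "sku", "qty")


def missing_mandatory_fields(order: dict) -> list[str]:
    """List the mandatory attributes absent from an order document."""
    missing = [field for field in MANDATORY_ORDER_FIELDS if field not in order]
    for index, product in enumerate(order.get("products", [])):
        missing.extend(
            f"products[{index}].{field}"
            for field in MANDATORY_PRODUCT_FIELDS
            if field not in product
        )
    return missing


# Hypothetical order; price is the unit price, as the schema notes require.
order = {
    "id": "1001",
    "order_total": 26.97,
    "currency": "GBP",
    "purchase_date": "2013-03-19T17:22:55Z",
    "delivery_total": 2.00,
    "products": [
        {"name": "a dvd", "price": 10.99, "sku": "DVD-001", "qty": 2},
        {"name": "a book", "price": 2.99, "sku": "BOOK-001", "qty": 1},
    ],
    "order_subtotal": 24.97,
}
```

The check covers presence only; it does not verify types or whether the currency code is supported.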

Product Insight data (AccountInsight) schema

Mandatory attributes for product recommendations

Please note that a product collection must be named with a prefix of catalog_. A green tick (✓) next to an attribute indicates that it's mandatory for our product recommendations feature.

Attribute                   Type
id                          string
*parent_id                  string
name (✓)                    string
price (✓)                   numerical
specialPrice                numerical
price_incl_tax              numerical
specialPrice_incl_tax       numerical
url (✓)                     string
sku (✓)                     string
stock                       numerical
**type                      string
status                      string
image_path (✓)              string
***categories               array of objects, e.g. [{"id": "123"}, {"id": "456"}]
  id                        string

*The parent_id field is used to indicate whether the product is linked to a parent (or configurable) product. The value will be the ID of that parent product. If the product is standalone or is itself a parent product, the value can be blank or remain as the parent ID respectively.

**The type field is used to define the kind of product (in terms of hierarchy). Some possible values include 'Configurable', 'Bundle', 'Variant' or 'Virtual'. The possible values will be dictated by your ecommerce platform.

*** The categories field provides a way to link a product to a category or a set of categories. Using the categories field requires you to synchronise a separate categories_* Insight collection, as described further below. The categories field must be an array of objects, each including an “id” field with a string value. The ids provided need to match category ids available in the categories collection.

Categories Insight data schema

Categories Insight collections can be used to store the categories of your products, as defined in your ecommerce store or ERP system. They allow you to segment contacts based on the product categories they purchase, or apply filters to your catalog or product recommendations.

The categories collection must be named with the prefix of categories_ and have a suffix matching the product catalog it relates to.
For example:

  • For the catalog catalog_SnowYo_Retail you should create categories_SnowYo_Retail

A green tick (✓) next to an attribute indicates that it's mandatory for our product recommendations feature.

Attribute                   Type
id (✓)                      string
name (✓)                    string
*parent_id                  string


*parent_id can be used to create a parent-child hierarchy between categories. For example, if 'Clothing' has the 'id':'123', 'Dresses' ('id':'456') could have 'parent_id' set to '123'. If a category has no defined parent, you can use 'parent_id':'0'.
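The naming rule and the id matching between a catalog and its categories collection can be checked before synchronising. This is a minimal sketch in Python; both function names are illustrative, and the product id "p1" and category id "789" are made up.

```python
def categories_collection_for(catalog_name: str) -> str:
    """Derive the categories collection name from a catalog collection name."""
    prefix = "catalog_"
    if not catalog_name.startswith(prefix):
        raise ValueError(f"catalog names must start with {prefix!r}: {catalog_name!r}")
    return "categories_" + catalog_name[len(prefix):]


def unknown_category_ids(products: list[dict], categories: list[dict]) -> set[str]:
    """Return category ids referenced by products but absent from categories."""
    known = {category["id"] for category in categories}
    referenced = {
        entry["id"]
        for product in products
        for entry in product.get("categories", [])
    }
    return referenced - known


categories = [
    {"id": "123", "name": "Clothing", "parent_id": "0"},
    {"id": "456", "name": "Dresses", "parent_id": "123"},
]
products = [{"id": "p1", "categories": [{"id": "456"}, {"id": "789"}]}]
```

With this data, unknown_category_ids(products, categories) reports "789", which would need adding to the categories collection or removing from the product.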
