Reading from HubSpot entities
Prerequisite
A HubSpot object you would like to read from. You will need the object name such as contact or task. The following table shows the supported entities.
Supported entities for source:
Entity | API version | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partioning |
---|---|---|---|---|---|---|
Campaigns | v1 | No | Yes | No | Yes | No |
Companies | v3 | Yes | Yes | Yes | Yes | Yes |
Contacts | v3 | Yes | Yes | Yes | Yes | Yes |
Contact Lists | v1 | No | Yes | No | Yes | No |
Deals | v3 | Yes | Yes | Yes | Yes | Yes |
CRM Pipeline (Deal Pipelines) | v1 | No | No | No | Yes | No |
Email Events | v1 | No | Yes | No | Yes | No |
Calls | v3 | Yes | Yes | Yes | Yes | Yes |
Notes | v3 | Yes | Yes | Yes | Yes | Yes |
Emails | v3 | Yes | Yes | Yes | Yes | Yes |
Meetings | v3 | Yes | Yes | Yes | Yes | Yes |
Tasks | v3 | Yes | Yes | Yes | Yes | Yes |
Postal Mails | v3 | Yes | Yes | Yes | Yes | Yes |
Custom Objects | v3 | Yes | Yes | Yes | Yes | Yes |
Forms | v2 | No | No | No | Yes | No |
Owners | v3 | No | Yes | No | Yes | No |
Products | v3 | Yes | Yes | Yes | Yes | Yes |
Tickets | v3 | Yes | Yes | Yes | Yes | Yes |
Workflows | v3 | No | No | No | Yes | No |
Associations | v4 | Yes | No | No | Yes | No |
Associations Labels | v4 | No | No | No | Yes | No |
Example:
hubspot_read = glueContext.create_dynamic_frame.from_options( connection_type="hubspot", connection_options={ "connectionName": "connectionName", "ENTITY_NAME": "contact", "API_VERSION": "v3" }
HubSpot entity and field details:
HubSpot API v4:
Entity | API version | Field | Data type | Supported operators |
---|---|---|---|---|
Association Label | v4 | category | String | N/A |
typeId | Integer | N/A | ||
label | String | N/A | ||
Associations | from | Struct | N/A | |
id | String | "=" | ||
to | List | N/A |
Note
For the Associations
object, to fetch associations between two objects, you need to provide the 'from Id' (the ID of the first object) via a mandatory filter while creating an AWS Glue job. If you want to fetch associations for multiple from IDs in that case, you have to provide multiple IDs in the where
clause. For example: for fetching Associations
for contact IDs '1' and '151', you need to provide a filter as where id=1 AND id=151
.
HubSpot API v3:
Entity | Field | Data type | Supported operators |
---|---|---|---|
Owner | firstName | String | N/A |
lastName | String | N/A | |
createdAt | DateTime | N/A | |
archived | Boolean | N/A | |
teams | List | N/A | |
id | String | N/A | |
userId | Integer | N/A | |
String | N/A | ||
updatedAt | DateTime | N/A | |
Workflow | name | String | N/A |
id | Integer | N/A | |
type | String | N/A | |
enabled | Boolean | N/A | |
insertedAt | Long | N/A | |
updatedAt | Long | N/A | |
contactListIds | Struct | N/A | |
personaTagIds | List | N/A |
For the following entities, HubSpot provides endpoints to fetch metadata dynamically, so that operator support is captured at the datatype level for each entity.
Note
DML_STATUS
is a virtual field added on every record at runtime to determine its
status (CREATED/UPDATED).
Entity | Data type | Supported operators |
---|---|---|
Contact | Integer | "=, !=, <, >, >=, <=" |
Long | "=, !=, <, >, >=, <=" | |
String | "=, !=, LIKE" | |
Date | N/A | |
DateTime | "between" | |
Boolean | "=" | |
List | N/A | |
Struct | N/A | |
Company | Integer | "=, !=, <, >, >=, <=" |
Long | "=, !=, <, >, >=, <=" | |
String | "=, !=, LIKE" | |
Date | N/A | |
DateTime | "between" | |
Boolean | "=" | |
List | N/A | |
Struct | N/A | |
Deal | Integer | "=, !=, <, >, >=, <=" |
Long | "=, !=, <, >, >=, <=" | |
String | "=, !=, LIKE" | |
Date | N/A | |
DateTime | "between" | |
Boolean | "=" | |
List | N/A | |
Struct | N/A | |
Ticket | Integer | "=, !=, <, >, >=, <=" |
Long | "=, !=, <, >, >=, <=" | |
String | "=, !=, LIKE" | |
Date | N/A | |
DateTime | "between" | |
Boolean | "=" | |
List | N/A | |
Struct | N/A | |
Product | Integer | "=, !=, <, >, >=, <=" |
Long | "=, !=, <, >, >=, <=" | |
String | "=, !=, LIKE" | |
Date | N/A | |
DateTime | "between" | |
Boolean | "=" | |
List | N/A | |
Struct | N/A | |
Custom Object | Integer | "=, !=, <, >, >=, <=" |
Long | "=, !=, <, >, >=, <=" | |
String | "=, !=, LIKE" | |
Date | N/A | |
DateTime | "between" | |
Boolean | "=" | |
List | N/A | |
Struct | N/A | |
Call | Integer | "=, !=, <, >, >=, <=" |
Long | "=, !=, <, >, >=, <=" | |
String | "=, !=, LIKE" | |
Date | N/A | |
DateTime | "between" | |
Boolean | "=" | |
List | N/A | |
Struct | N/A | |
Integer | "=, !=, <, >, >=, <=" | |
Long | "=, !=, <, >, >=, <=" | |
String | "=, !=, LIKE" | |
Date | N/A | |
DateTime | "between" | |
Boolean | "=" | |
List | N/A | |
Struct | N/A | |
Meeting | Integer | "=, !=, <, >, >=, <=" |
Long | "=, !=, <, >, >=, <=" | |
String | "=, !=, LIKE" | |
Date | N/A | |
DateTime | "between" | |
Boolean | "=" | |
List | N/A | |
Struct | N/A | |
Note | Integer | "=, !=, <, >, >=, <=" |
Long | "=, !=, <, >, >=, <=" | |
String | "=, !=, LIKE" | |
Date | N/A | |
DateTime | "between" | |
Boolean | "=" | |
List | N/A | |
Struct | N/A | |
Task | Integer | "=, !=, <, >, >=, <=" |
Long | "=, !=, <, >, >=, <=" | |
String | "=, !=, LIKE" | |
Date | N/A | |
DateTime | "between" | |
Boolean | "=" | |
List | N/A | |
Struct | N/A | |
Postal Mail | Integer | "=, !=, <, >, >=, <=" |
Long | "=, !=, <, >, >=, <=" | |
String | "=, !=, LIKE" | |
Date | N/A | |
DateTime | "between" | |
Boolean | "=" | |
List | N/A | |
Struct | N/A |
HubSpot API v2:
Entity | Field | Data type | Supported operators |
---|---|---|---|
Form | portalId | Integer | N/A |
guid | String | N/A | |
name | String | N/A | |
method | String | N/A | |
cssClass | String | N/A | |
redirect | String | N/A | |
submitText | String | N/A | |
notifyRecipients | String | N/A | |
createdAt | Long | N/A | |
updatedAt | Long | N/A | |
ignoreCurrentValues | Boolean | N/A | |
deletable | Boolean | N/A | |
inlineMessage | Boolean | N/A | |
captchaEnabled | Boolean | N/A | |
cloneable | Boolean | N/A | |
formFieldGroups | List | N/A | |
editable | Boolean | N/A | |
deletedAt | Integer | N/A | |
themeName | String | N/A | |
parentId | Integer | N/A | |
style | String | N/A | |
isPublished | Boolean | N/A | |
publishAt | Integer | N/A | |
unpublishAt | Integer | N/A | |
publishedAt | Integer | N/A | |
kickbackEmailWorkflowId | String | N/A | |
kickbackEmailsJson | Integer | N/A | |
customUid | String | N/A | |
createMarketableContact | Boolean | N/A | |
editVersion | Integer | N/A | |
thankYouMessageJson | String | N/A | |
themeColor | String | N/A | |
alwaysCreateNewCompany | Boolean | N/A | |
internalUpdatedAt | Long | N/A | |
businessUnitId | Integer | N/A | |
portableKey | String | N/A | |
paymentSessionTemplateIds | List | N/A | |
selectedExternalOptions | List | N/A |
HubSpot API v1:
Entity | Field | Data type | Supported operators |
---|---|---|---|
Campaign | id | Integer | N/A |
appId | Integer | N/A | |
appName | String | N/A | |
lastUpdatedTime | Long | N/A | |
Contact_List | dynamic | Boolean | N/A |
name | String | N/A | |
portalId | Integer | N/A | |
createdAt | Long | N/A | |
listId | Integer | N/A | |
updatedAt | Long | N/A | |
ListType | String | N/A | |
filters | List | N/A | |
authorId | Integer | N/A | |
metaData | Struct | N/A | |
archived | Boolean | N/A | |
ilsFilterBranch | String | N/A | |
filterIds | List | N/A | |
limitExempt | Boolean | N/A | |
internal | Boolean | N/A | |
readOnly | Boolean | N/A | |
parentId | Integer | N/A | |
Email_Event | id | String | N/A |
type | String | N/A | |
recipient | String | N/A | |
portalId | Integer | N/A | |
appId | Integer | N/A | |
appName | String | N/A | |
emailCampaignId | Long | N/A | |
attempt | Integer | N/A | |
created | Long | N/A | |
sentBy | Struct | N/A | |
smtpId | String | N/A | |
response | String | N/A | |
subject | String | N/A | |
cc | List | N/A | |
bcc | List | N/A | |
replyTo | List | N/A | |
from | String | N/A | |
dropReason | String | N/A | |
dropMessage | String | N/A | |
browser | Struct | N/A | |
userAgent | String | N/A | |
duration | Long | N/A | |
location | Struct | N/A | |
filteredEvent | Boolean | N/A | |
deviceType | String | N/A | |
suppressedReason | String | N/A | |
suppressedMessage | String | N/A | |
CRM_Pipeline | pipelineId | String | N/A |
createdAt | Long | N/A | |
updatedAt | Long | N/A | |
objectType | String | N/A | |
label | String | N/A | |
displayOrder | Integer | N/A | |
active | Boolean | N/A | |
stages | List | N/A | |
objectTypeId | String | N/A | |
default | Boolean | N/A |
Partitioning queries
You can provide the additional Spark options PARTITION_FIELD
, LOWER_BOUND
, UPPER_BOUND
, and NUM_PARTITIONS
if you want to utilize concurrency in Spark. With these parameters, the original query would be split into NUM_PARTITIONS
number of sub-queries that can be executed by Spark tasks concurrently.
PARTITION_FIELD
: the name of the field to be used to partition the query.LOWER_BOUND
: an inclusive lower bound value of the chosen partition field.For the DateTime field, we accept the value in ISO format.
Examples of valid value:
“2024-01-01T10:00:00.115Z"
UPPER_BOUND
: an exclusive upper bound value of the chosen partition field.NUM_PARTITIONS
: the number of partitions.
The following table describes the entity partitioning field support details:
Entity name | Partitioning fields | Data type |
---|---|---|
contact | hs_object_id | Long |
createdate, lastmodifieddate | DateTime | |
company | hs_object_id | Long |
createdate, hs_lastmodifieddate | DateTime | |
deal | hs_object_id | Long |
createdate, hs_createdate, hs_lastmodifieddate | DateTime | |
ticket | hs_object_id | Long |
createdate, hs_lastmodifieddate | DateTime | |
product | hs_object_id | Long |
createdate, hs_lastmodifieddate | DateTime | |
custom_object | hs_object_id | Long |
createdate, hs_lastmodifieddate | DateTime | |
call | hs_object_id | Long |
createdate, hs_lastmodifieddate | DateTime | |
hs_object_id | Long | |
createdate, hs_lastmodifieddate | DateTime | |
meeting | hs_object_id | Long |
createdate, hs_lastmodifieddate | DateTime | |
note | hs_object_id | Long |
createdate, hs_lastmodifieddate | DateTime | |
task | hs_object_id | Long |
createdate, hs_lastmodifieddate | DateTime | |
postal_mail | hs_object_id | Long |
createdate, hs_lastmodifieddate | DateTime |
Example:
hubspot_read = glueContext.create_dynamic_frame.from_options( connection_type="hubspot", connection_options={ "connectionName": "connectionName", "ENTITY_NAME": "company", "API_VERSION": "v3", "PARTITION_FIELD": "hs_object_id" "LOWER_BOUND": "50" "UPPER_BOUND": "16726619290" "NUM_PARTITIONS": "10" }