ASAPP customers can securely retrieve application data for internal consumption via S3, using credentials dedicated to this download function.

If you’re retrieving data from ASAPP’s AI Services, visit this guide instead: Retrieving Data from AI Services

The following provides details on how you can retrieve data from ASAPP.

  • Data Retrieval Prerequisites: Describes prerequisites for secure data exchange. This includes key exchange and authentication, as well as ASAPP’s standards for bucket naming and S3 folder structure.
  • Downloading Data from S3: Describes ASAPP’s conventions for constructing an S3 path to exported data and some general data formatting conventions for ASAPP exports. Additionally, provides a sample download script.

Data Retrieval Prerequisites

ASAPP will share data with you via a set of S3 buckets. These S3 buckets will be encrypted at rest using AES256 and access-protected behind AWS authentication.

At the customer’s option, the individual download files can be encrypted using a scheme mutually agreed upon between ASAPP and the customer. This is not usually required, but in the event that your enterprise requires ASAPP export files to be doubly encrypted, contact your customer success manager to work out the details.

Authentication Credentials

Each run environment (typically preprod and prod) will be allocated a dedicated set of AWS credentials that are authorized to read from that environment’s respective S3 endpoint. These credentials will be different from other AWS credentials ASAPP might have provided you.

ASAPP will provide credentials to you within a GPG encrypted file. To initiate the credential exchange, you must provide ASAPP with a public GPG key that we may use to encrypt the files containing your credentials. It’s safe to send your public GPG key to ASAPP using any available channel. Please do NOT provide ASAPP with your private GPG key.

GitHub provides one of many good tutorials on GPG key generation: https://help.github.com/en/articles/generating-a-new-gpg-key
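
For reference, key generation and public-key export with GnuPG look roughly like the following sketch (the email address and output filename are placeholders):

# generate a new key pair, following the interactive prompts
gpg --full-generate-key
# export the corresponding public key in ASCII-armored form to share with ASAPP
gpg --armor --export you@example.com > asapp_public_key.asc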

Once you’ve provided ASAPP with your public GPG key, we will provide your credentials in a file hosted on S3 that you will retrieve using an expiring HTTPS link. ASAPP’s standard practice is to have those links expire after 24 hours.

The file will be encrypted using your public GPG key. Once you decrypt the provided file using your private GPG key, your credentials will be contained within a tab-delimited file with the following structure:

id     secret      bucket     sub-folder (if any)
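
Decryption is a single GPG command; a minimal sketch (the filenames are illustrative) follows:

# decrypt the credentials file with the private key matching the public key you shared
gpg --output credentials.tsv --decrypt asapp_credentials.tsv.gpg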

Please let us know when you’ve successfully retrieved and decrypted your credentials file.

Your Dedicated S3 Buckets

Once you’ve securely received credentials that allow you to read from your dedicated S3 buckets, ASAPP will provide you with URIs representing the set of S3 buckets from which you may securely download data files.

ASAPP can make a broad range of data available to you. You’ll work with your Customer Success and Implementation teams to schedule specific feeds for export.

Each separate feed will have a unique S3 endpoint and, potentially, a unique strategy for data partitioning. While the specific details vary according to the feed at hand, ASAPP follows a consistent set of conventions for naming S3 buckets, paths, and partitions.

The following provides details about the conventions.

Bucket Naming

For clarity, ASAPP names buckets using the following convention:

s3://asapp-{env}-{company_name}-exports-{aws-region}

| Key | Description |
| --- | --- |
| env | Environment (e.g., prod, preprod, test) |
| company_name | Your company name, without spaces |
| aws-region | us-east-1. Note: this is the region currently supported for your ASAPP implementations. |

For example, an S3 bucket set up to host exports of pre-production data from ACME would be named:

s3://asapp-preprod-acme-exports-us-east-1

Folder Structure

Within the named bucket, ASAPP uses the following conventions to construct an S3 path that corresponds to the name of the exported feed and to partitions of the exported feed:

s3://BUCKET_NAME/FEED_NAME/version=VERSION_NUMBER/format=FORMAT_NAME/dt=DATE/hr=HOUR/mi=MINUTE/DATAFILE(S)

| S3 path element | Description |
| --- | --- |
| BUCKET_NAME | Described above. |
| FEED_NAME | The name of the table, extract, feed, etc. |
| version | The version of the feed at hand. Changes whenever the schema, the meaning of a column, etc., changes in a way that could break existing integrations. |
| format | The format of the exported data. Almost always, this will be JSON Lines.* |
| dt | The YYYY-MM-DD formatted date corresponding to the exported data. |
| hr | (optional) The HH formatted hour corresponding to the exported data. |
| mi | (optional) The MM formatted minute corresponding to the exported data. Note: this may be either the exact minute for a minutely partitioned export, or the first minute of a bucket representing a range of minutes. For example, 00, 15, 30, 45 would be the mi values for four partitions across a single hour; the 00 partition would contain all data from the start of the hour up to but not including the 15th minute. |
| DATAFILE(S) | The filename or filenames of the exported feed partition. See File Naming below for additional information about file naming. |

*By default, and by very strong preference, ASAPP will output files in JSON Lines format. JSON Lines is robust, unambiguous, and will save us all from the heartbreak of CSV. The format is described at http://jsonlines.org.

Note that hr and mi partitions will not always apply. In the event that a particular feed is not partitioned on hour or minute, the hr and/or mi subfolders will not be present for that feed.
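
Putting these conventions together, a full path to an hourly partition of the convos_delta_ended feed in ACME's pre-production environment might look like the following (the date is illustrative, and the filename follows the File Naming conventions described next):

s3://asapp-preprod-acme-exports-us-east-1/convos_delta_ended/version=1/format=jsonl/dt=2019-01-20/hr=03/convos_delta_ended.jsonl.gz

A feed partitioned only by date would omit the hr= segment entirely.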

File Naming

File names that correspond to an exported feed partition will have names in the following form:

{FEED_NAME}{FORMAT}{SPLIT_NUMBER}.{COMPRESSION}.{ENCRYPTION}

| File name element | Description |
| --- | --- |
| FEED_NAME | The feed name from which this partition is exported. |
| FORMAT | .jsonl |
| SPLIT_NUMBER | (optional) In the event that a particular partition's export needs to be split across multiple physical files to accommodate file size constraints, each split file will be suffixed with a dot followed by a two-digit incrementing sequence. If the whole partition fits in a single file, no SPLIT_NUMBER will be present in the file name. |
| COMPRESSION | (optional) .gz will be appended to the file name if the file is gzip compressed. |
| ENCRYPTION | (optional) In the atypical case where a file written to the S3 store is doubly encrypted, the filename will have a .enc extension. |
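
As an illustration, the following hypothetical filenames (the split numbering is assumed) would all follow these conventions for the convos_delta_ended feed:

convos_delta_ended.jsonl            # single-file partition, uncompressed
convos_delta_ended.jsonl.00.gz      # one split of a multi-file partition, gzip compressed
convos_delta_ended.jsonl.00.gz.enc  # the same split, doubly encrypted (atypical)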

Downloading Data from S3

General Data Formatting Notes

  • ASAPP export files are UTF-8 encoded.
  • Control characters are escaped.
  • Files are formatted with Unix-style line endings.

Verifying the Data Export is Complete

Upon completing the upload for a particular partition, ASAPP will upload an EMPTY file named _SUCCESS to the same path as the uploaded file or files. This _SUCCESS file acts as a flag indicating that the upload for the associated partition is complete. A _SUCCESS file will be written even if there is no available data selected for export for the partition at hand.

Until the _SUCCESS file is uploaded, ASAPP’s export/upload is in progress and you should not import the associated data file. You should check for this file before downloading any data partition.
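
A minimal shell sketch of this check (the bucket, feed, and local path are illustrative) might look like:

# only download the partition once its _SUCCESS flag exists
PARTITION=s3://asapp-preprod-acme-exports-us-east-1/convos_delta_ended/version=1/format=jsonl/dt=2019-01-20/hr=03/
if aws s3 ls "${PARTITION}_SUCCESS" > /dev/null; then
    aws s3 cp --recursive "$PARTITION" /path/to/put/files/
else
    echo "Export still in progress; retry later."
fi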

Download Example

The example below assumes a shell terminal with Python 2.7+ and pip installed.

# install aws cli (assumes python)
pip install awscli
# configure your S3 credentials if not already done
aws configure 
# pull down all the files for 2019-01-20 for the convos_delta_ended export
# for a company named `acme` to your local drive in pre-production
aws s3 cp --recursive s3://asapp-preprod-acme-exports-us-east-1/convos_delta_ended/version=1/format=jsonl/dt=2019-01-20/ /path/to/put/files/
# you should see some files now in the local path.
# NOTE: if the export is hourly (like the above) you will see subfolders
ls -ltr /path/to/put/files/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=01/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=08/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=04/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=03/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=00/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=02/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=06/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=07/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=05/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=09/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=10/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=12/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=11/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=16/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=15/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=13/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=14/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=19/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=18/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=22/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=17/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=23/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=21/
drwxr-xr-x   3 myuser  mygroup    96 Jan 28 08:58 hr=20/
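
If you only need a single hourly partition, you can scope the same command to the hr= subfolder (the path is illustrative):

# pull down just hour 03 of the same export
aws s3 cp --recursive s3://asapp-preprod-acme-exports-us-east-1/convos_delta_ended/version=1/format=jsonl/dt=2019-01-20/hr=03/ /path/to/put/files/hr=03/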

Deduping S3 Exports Data

All tables need to be deduped in order to keep only the most recent data and to discard older rows that are duplicated as part of the export process.

To dedupe a table, take the latest instance and the latest run per partition key, with some exceptions as noted in the reference table below.

Example

To dedupe the table convos_metrics, use this query:

SELECT *
FROM (
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY {{ primary_key }} ORDER BY {{ logical_timestamp }} DESC, {{ insertion_timestamp }} DESC) AS row_idx
    FROM convos_metrics
) deduped
WHERE row_idx = 1

We partition by the primary key for the table and order by {{ logical_timestamp }} DESC in the subquery so that the most recent record sorts first. Selecting only rows where row_idx = 1 then pulls the latest information we have for each issue_id.

Deduping Guide by Table

The reference table below lists, for each table, the fields to partition by and to order by. Tables with an empty Order By column are the exceptions noted above.

Deduping Reference

| Table | Partition By | Order By |
| --- | --- | --- |
| admin_activity | company_id, rep_id, status_description, status_start_ts | status_end_ts DESC |
| agent_journey_rep_event_frequency | rep_id, event_type, company_marker, instance_ts | instance_ts DESC |
| autopilot_flow | company_id, issue_id, form_start_ts | instance_ts DESC |
| cobrowsing | issue_id, cob_start_ts | instance_ts DESC |
| convos_intents | issue_id | first_utterance_ts DESC |
| convos_intents_ended | issue_id | first_utterance_ts DESC |
| convos_metadata | issue_id | last_event_ts DESC |
| convos_metadata_ended | issue_id | conversation_end_ts DESC |
| convos_metrics | issue_id | first_utterance_ts DESC |
| convos_metrics_ended | issue_id | first_utterance_ts DESC |
| convos_summary_tags | issue_id, company_id, summary_tag_presented | instance_ts DESC |
| csid_containment | company_id, csid | instance_ts DESC |
| csid_containment_1d | company_id, csid | instance_ts DESC |
| customer_feedback | issue_id, company_subdivision, company_segments, last_rep_id, question, instance_ts | instance_ts DESC |
| customer_params | event_id, param_key | instance_ts DESC |
| dim_queues | queue_key | |
| export_row_counts | export_date, export_job, export_interval | |
| fact_convos_missed_ends | company_id, issue_id, conversation_end_ts | instance_ts DESC |
| flow_completions | issue_id | negation_event_ts, success_event_ts DESC |
| flow_detail | issue_id, event_ts | event_ts DESC |
| intents | code, name, intent_type | |
| issue_callback_3d | issue_id | instance_ts DESC |
| issue_omnichannel | company_id, issue_id, third_party_customer_id | instance_ts DESC |
| issue_queues | issue_id, queue_id, enter_queue_ts | instance_ts DESC |
| issue_sentiment | issue_id, created_ts | instance_ts DESC |
| issue_session_merge | company_id, issue_id, session_id | instance_ts DESC |
| issue_type | company_id, customer_id, issue_id | instance_ts DESC |
| knowledge_base | issue_id, company_id, article_id, event_ts | event_ts DESC |
| live_agent_opportunities | issue_id, customer_id, opportunity_ts | instance_ts DESC |
| queue_check_ins | issue_id, customer_id, check_in_ts | instance_ts DESC |
| queue_membership_wfm | instance_ts, company_id, rep_id, queue_id | instance_ts DESC |
| queue_wfm | instance_ts, queue_name, rep_id | instance_ts DESC |
| quick_reply_buttons (deprecated) | final_intent_code, quick_reply_button_text, escalated_to_chat, escalation_requested, quick_reply_button_index | instance_ts DESC |
| reps | rep_id | created_ts DESC |
| rep_activity | company_id, instance_ts, rep_id, status_id, in_status_starting_ts | instance_ts DESC |
| rep_assignment_disposition | company_id, issue_id, rep_id, rep_assigned_ts | rep_assigned_ts DESC |
| rep_attributes | rep_attribute_id, rep_id, created_ts | created_ts DESC |
| rep_augmentation | issue_id | instance_ts DESC |
| rep_convos | issue_id, rep_id, issue_assigned_ts | instance_ts DESC |
| rep_hierarchy | subordinate_agent_id, superior_agent_id | |
| rep_utilized | instance_ts, rep_id, desk_mode | instance_ts DESC |
| sms_events | sms_flow_id | |
| transfers | issue_id, rep_id, timestamp_req | instance_ts DESC |
| utterances | issue_id, sender_id, created_ts | instance_ts DESC |
| voice_intents | issue_id, company_id | instance_ts DESC |

Schema Adjustments

You should enable automated schema-evolution detection and identify any changes using export_docs.yaml, which is generated each day and delivered via the S3 feed. By incorporating this check into your workflows, you can detect schema adjustments proactively and keep your imports running without interruption.
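
A minimal sketch of such a check follows (the S3 location of export_docs.yaml and the local paths are assumptions; confirm the actual path with your ASAPP team):

# compare today's export_docs.yaml against the copy saved by the previous run
aws s3 cp s3://asapp-preprod-acme-exports-us-east-1/export_docs.yaml ./export_docs.yaml.today
if [ -f ./export_docs.yaml.previous ] && ! diff -q ./export_docs.yaml.previous ./export_docs.yaml.today > /dev/null; then
    echo "Schema change detected: review export_docs.yaml before importing."
fi
mv ./export_docs.yaml.today ./export_docs.yaml.previous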

Export Schema

ASAPP provides detailed reference documentation for the primary schema of the exported data files it can make available to customers.

Visit Retrieving Data from AI Services to learn more about how to export AI Services data using the File Exporter API.

The schema is available as a stand-alone webpage.

Workforce Management (WFM) Data

In addition to the primary data schema (above), ASAPP also offers separate workforce management data for use in intraday workforce management applications.

The data is structured in two tables and published every 20 to 30 minutes. This refresh cadence is intended to support continuous monitoring use cases rather than historical reporting.

Each table’s metrics are calculated for a 15-minute window.

For example, the tables published at 1:35 AM would reflect queue or rep activity from 1:00 AM to 1:15 AM.

queue_wfm

For each queue in your ASAPP implementation, this table contains queue-related metrics during a 15-minute period:

  • Total assignments
  • Customer queue abandons
  • Average handle time
  • Average wrap time
  • Average customer wait time in queue
  • Customer wait times under different SLA thresholds
  • Reps in the queue

queue_membership_wfm

For each queue in your ASAPP implementation, this table contains rep-level information about each rep's availability and status during a 15-minute period (or the portion of that 15 minutes during which the rep was logged in):

  • Linear logged-in time
  • Linear available time
  • Linear busy time
  • Total linear utilized time
  • Linear utilized time while available
  • Linear utilized time while busy

WFM data is not available by default. Reach out to your ASAPP account team to request these additional data tables and to discuss your use case.
