S3 is the supported mechanism for ongoing data transmissions, though it can also be used for one-time transfers where needed. ASAPP customers can transmit the following types of data to S3:

  • Call center data attributes
  • Conversation transcripts from messaging or voice interactions
  • Recorded call audio files
  • Sales records with attribution metadata

Getting Started

Your Target S3 Buckets

ASAPP will provide you with a set of S3 buckets to which you may securely upload your data files, as well as a dedicated set of credentials authorized to write to those buckets. See the next section for more on those credentials.

For clarity, ASAPP names buckets using the following convention:

s3://asapp-\{env\}-\{company_name\}-imports-\{aws-region\}

| Key | Description |
| --- | --- |
| env | Environment (prod, pre_prod, test) |
| company_name | The company name: acme, duff, stark_industries, etc. Note: company name should not contain spaces. |
| aws-region | us-east-1. Note: this is the current region supported for your ASAPP instance. |

So, for example, an S3 bucket set up to receive pre-production data from ACME would be named:

s3://asapp-pre_prod-acme-imports-us-east-1

S3 Target for Historical Transcripts

ASAPP has a distinct target location for sending historical transcripts for AI Services and will provide an exclusive access folder to which transcripts should be uploaded. The S3 bucket location follows this naming convention:

asapp-customers-sftp-\{env\}-\{aws-region\}

Values for env and aws-region are set in the same way as above. As an example, an S3 bucket to receive transcripts for use in production is named:

asapp-customers-sftp-prod-us-east-1

See the Historical Transcript File Structure section for more information on how to format transcript files for transmission.

Encryption

ASAPP ensures that the data you write to your dedicated S3 buckets is encrypted in transit using TLS/SSL and encrypted at rest using AES256.
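
As a minimal sketch, the AWS CLI already transfers files over HTTPS (TLS) by default, and you can additionally request AES256 server-side encryption on upload; the file name below is a placeholder and the bucket reuses the ACME example above:

# the AWS CLI uploads over HTTPS by default; --sse requests AES256 encryption at rest
aws s3 cp ./your_file.jsonl s3://asapp-pre_prod-acme-imports-us-east-1/ --sse AES256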

Your Dedicated Export AWS Credentials

ASAPP will provide you with a set of AWS credentials that allow you to securely upload data to your designated S3 buckets. (Since you need write access in order to upload data to S3, you’ll need to use a different set of credentials than the read-only credentials you might already have.)

In order for ASAPP to securely send credentials to you, you must provide ASAPP with a public GPG key that we can use to encrypt a file containing those credentials.

GitHub provides one of many good tutorials on GPG key generation here: https://help.github.com/en/articles/generating-a-new-gpg-key.

It’s safe to send your public GPG key to ASAPP using any available channel. Please do NOT provide ASAPP with your private key.
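
For reference, exporting an ASCII-armored public key with GnuPG looks roughly like the following; the email address is a placeholder for whatever identity is on your key:

# list your keys to find the identity you want to export
gpg --list-keys
# export only the PUBLIC key in ASCII-armored form and share that file with ASAPP
gpg --armor --export you@example.com > your_public_key.asc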

Once you’ve provided ASAPP with your public GPG key, we’ll forward to you an expiring https link pointing to an S3-hosted file containing credentials that have permissions to write to your dedicated S3 target buckets.

The file itself will be encrypted using your public GPG key. Once you decrypt the provided file using your private GPG key, your credentials will be contained within a tab-delimited file with the following structure:

id     secret      bucket     sub-folder (if any)
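
A minimal decryption sketch with GnuPG, assuming the downloaded file is named asapp_credentials.tsv.gpg (the actual file name will differ):

# decrypt with your private key; gpg will prompt for your key's passphrase
gpg --output asapp_credentials.tsv --decrypt asapp_credentials.tsv.gpg
# review the tab-delimited id, secret, bucket, and sub-folder values
cat asapp_credentials.tsv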

Data File Formatting and Preparation

General Requirements:

  • Files should be UTF-8 encoded.
  • Control characters should be escaped.
  • You may provide files in CSV or JSONL format, but we strongly recommend JSONL where possible. (CSV files are simply more fragile to parse.)
  • If you send a CSV file, ASAPP recommends that you include a header. Otherwise, your CSV must provide columns in the exact order listed below.
  • When providing a CSV file, you must provide an explicit null value (as the unquoted string NULL) for missing or empty values; see the example after this list.
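
For illustration only, here is the same record in each format, with an explicit NULL for a missing CSV value (field names are drawn from the call center schema below; the values are made up):

CSV (header row included):

customer_id,conversation_id,call_start,call_end
347bdddb-d3a1-45fc-bbcd-dbd3a175fc1c,NULL,2020-01-03T20:02:13Z,2020-01-03T20:12:43Z

JSONL (one record per line; missing values may simply be null):

{"customer_id": "347bdddb-d3a1-45fc-bbcd-dbd3a175fc1c", "conversation_id": null, "call_start": "2020-01-03T20:02:13Z", "call_end": "2020-01-03T20:12:43Z"}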

Call Center Data File Structure

The table below shows the required fields to include in your uploaded call center data.

| FIELD NAME | REQUIRED? | FORMAT | EXAMPLE | NOTES |
| --- | --- | --- | --- | --- |
| customer_id | Yes | String | 347bdddb-d3a1-45fc-bbcd-dbd3a175fc1c | External User ID. This is a hashed version of the client ID. |
| conversation_id | No | String | 21352352 | If filled in, should map to ASAPP’s system. May be empty if the customer has not had a conversation with ASAPP. |
| call_start | Yes | Timestamp | 2020-01-03T20:02:13Z | ISO 8601 formatted UTC timestamp. Time/date call is received by the system. |
| call_end | Yes | Timestamp | 2020-01-03T20:02:13Z | ISO 8601 formatted UTC timestamp. Time/date call ends. Note: duration of call should be Call End - Call Start. |
| call_assigned_to_agent | No | Timestamp | 2020-01-03T20:02:13Z | ISO 8601 formatted UTC timestamp. The date/time the call was answered by the agent. |
| customer_type | No | String | Wireless Premier | Customer account classification by client. |
| survey_offered | No | Bool | true/false | Whether a survey was offered or not. |
| survey_taken | No | Bool | true/false | When a survey was offered, whether it was completed or not. |
| survey_answer | No | String |  | Survey answer. |
| toll_free_number | No | String | 888-929-1467 | Client phone number (toll free number) used to call in that allows for tracking different numbers, particularly ones referred directly by SRS. If websource or click to call, the web campaign is passed instead of TFN. |
| ivr_intent | No | String | Power Outage | Phone pathing logic for routing to the appropriate agent group or providing self-service resolution. Could be multiple values. |
| ivr_resolved | No | Bool | true/false | Caller triggered a self-service response from the IVR and then disconnected. |
| ivr_abandoned | No | Bool | true/false | Caller disconnected without receiving a self-service response from the IVR or being placed in a live agent queue. |
| agent_queue_assigned | No | String | Wireless Sales | Agent group/agent skill group (aka queue name). |
| time_in_queue | No | Integer | 600 | Seconds caller waits in queue to be assigned to an agent. |
| queue_abandoned | No | Bool | true/false | Caller disconnected after being assigned to a live agent queue but before being assigned to an agent. |
| call_handle_time | No | Integer | 650 | Call duration in seconds from call assignment event to call disconnect event. |
| call_wrap_time | No | Integer | 30 | Duration in seconds from call disconnect event to end of agent wrap event. |
| transfer | No | String | Sales Group | Agent queue name if call was transferred. NA or Null value for calls not transferred. |
| disposition_category | No | String | Change plan | Categorical outcome selection from agent. Alternatively, could be a category like ‘Resolved’, ‘Unresolved’, ‘Transferred’, ‘Referred’. |
| disposition_notes | No | String |  | Notes from agent regarding the disposition of the call. |
| transaction_completed | No | String | Upgrade Completed, Payment Processed | Name of transaction type completed by call agent on behalf of customer. Could contain multiple delimited values. May not be available for all agents. |
| caller_account_value | No | Decimal | 129.45 | Current account value of customer. |
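
To make the schema concrete, a single call center record as one JSONL line might look like the following (all values are illustrative):

{"customer_id": "347bdddb-d3a1-45fc-bbcd-dbd3a175fc1c", "conversation_id": "21352352", "call_start": "2020-01-03T20:02:13Z", "call_assigned_to_agent": "2020-01-03T20:03:13Z", "call_end": "2020-01-03T20:14:03Z", "customer_type": "Wireless Premier", "survey_offered": true, "survey_taken": false, "agent_queue_assigned": "Wireless Sales", "time_in_queue": 60, "call_handle_time": 650, "call_wrap_time": 30, "disposition_category": "Change plan", "caller_account_value": 129.45}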

Historical Transcript File Structure

ASAPP accepts uploads for historical conversation transcripts for both voice calls and chats.

The fields described below must be the columns in your uploaded .CSV table.

Each row in the uploaded .CSV table should correspond to one sent message.

| FIELD NAME | REQUIRED? | FORMAT | EXAMPLE | NOTES |
| --- | --- | --- | --- | --- |
| conversation_externalId | Yes | String | 3245556677 | Unique identifier for the conversation |
| sender_externalId | Yes | String | 6433421 | Unique identifier for the sender of the message |
| sender_role | Yes | String | agent | Supported values are ‘agent’, ‘customer’ or ‘bot’ |
| text | Yes | String | Happy to help, one moment please | Message from sender |
| timestamp | Yes | Timestamp | 2022-03-16T18:42:24.488424Z | ISO 8601 formatted UTC timestamp |
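
As an illustration, the first few rows of such a .CSV file might look like this (identifiers and messages are made up):

conversation_externalId,sender_externalId,sender_role,text,timestamp
3245556677,9988221,customer,I need to change my flight,2022-03-16T18:42:12.104511Z
3245556677,6433421,agent,"Happy to help, one moment please",2022-03-16T18:42:24.488424Z
3245556677,6433421,agent,"Sure, I can rebook that for you",2022-03-16T18:43:02.201873Z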

Proper transcript formatting and sampling ensure the data is usable for model training. Please ensure transcripts conform to the following:

Formatting

  • Each utterance is clearly demarcated and sent by one identified sender
  • Utterances are in chronological order and complete, from beginning to very end of the conversation
  • Where possible, transcripts include the full content of the conversation rather than an abbreviated version. For example, in a digital messaging conversation:

Full:

Agent: Choose an option from the list below
Agent: (A) 1-way ticket (B) 2-way ticket (C) None of the above
Customer: (A) 1-way ticket

Abbreviated:

Agent: Choose an option from the list below
Customer: (A)

Sampling

  • Transcripts are from a wide range of dates to avoid seasonality effects; random sampling over a 12-month period is recommended
  • Transcripts mimic the production conversations on which models will be used - same types of participants, same channel (voice, messaging), same business unit
  • There are no duplicate transcripts

Transmitting Transcripts to S3

Historical transcripts are sent to a distinct S3 target separate from other data imports.

Please refer to the S3 Target for Historical Transcripts section for details.
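
A sketch of a transcript upload with the AWS CLI; the folder name is a placeholder for the dedicated folder ASAPP provides you, and the bucket follows the naming convention above:

# upload a historical transcript file to the dedicated transcripts bucket
aws s3 cp ./historical_transcripts.csv s3://asapp-customers-sftp-prod-us-east-1/your-dedicated-folder/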

Sales Methods & Attribution Data File Structure

The table below shows the required fields to be included in your uploaded sales methods and attribution data.

| FIELD NAME | REQUIRED? | FORMAT | EXAMPLE | NOTES |
| --- | --- | --- | --- | --- |
| transaction_id | Yes | String | 1d71dce2-a50c-11ea-bb37-0242ac130002 | An identifier which is unique within the customer system to track this transaction. |
| transaction_time | Yes | Timestamp | 2007-04-05T14:30:05.123Z | ISO 8601 formatted UTC timestamp. Used to detect potential duplicates and to attribute the transaction to the right period of time. |
| transaction_value_one_time | No | Float | 65.25 | Single value of initial purchase. |
| transaction_value_recurring | No | Float | 7.95 | Recurring value of subscription purchase. |
| customer_category | No | String | US | Custom category value per client. |
| customer_subcategory | No | String | wireless | Custom subcategory value per client. |
| external_customer_id | No | String | 34762720001 | External User ID. This is a hashed version of the client ID. To attribute to ASAPP metadata, one of these (Customer ID or Conversation ID) is required. |
| issue_id | No | String | 1E10412200CC60EEABBF32 | If filled in, should map to ASAPP’s system. May be empty if the customer has not had a conversation with ASAPP. To attribute to ASAPP metadata, one of these (Customer ID or Conversation ID) is required. |
| external_session_id | Yes | String | 1a09ff6d-3d07-45dc-8fa9-4936bfc4e3e5 | External session ID so we can track a customer. |
| product_category | No | String | Wireless Internet | Category of product purchased. |
| product_subcategory | No | String | Broadband | Subcategory of product purchased. |
| product_name | No | String | Broadband Gold Package | The name of the product. |
| product_id | No | String | WI-BBGP | The identifier of the product. |
| product_quantity | Yes | Integer | 1 | A number indicating the quantity of the product purchased. |
| product_value_one_time | No | Float | 60.00 | Value of the product for a one-time purchase. |
| product_value_recurring | No | Float | 55.00 | Value of the product for a recurring purchase. |
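
For reference, a single sales transaction as one JSONL line might look like this (all values are illustrative):

{"transaction_id": "1d71dce2-a50c-11ea-bb37-0242ac130002", "transaction_time": "2007-04-05T14:30:05.123Z", "transaction_value_one_time": 65.25, "external_customer_id": "34762720001", "external_session_id": "1a09ff6d-3d07-45dc-8fa9-4936bfc4e3e5", "product_category": "Wireless Internet", "product_name": "Broadband Gold Package", "product_id": "WI-BBGP", "product_quantity": 1, "product_value_one_time": 60.00}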

Uploading Data to S3

At a high level, uploading your data is a three step process:

  1. Build and format your files for upload, as detailed above.
  2. Construct a “target path” for those files following the convention in the section “Constructing your Target Path” below.
  3. Signal the completion of your upload by writing an empty _SUCCESS file to your “target path”, as described in the section “Signaling that your upload is complete” below.

Constructing your target path

ASAPP’s automation will use the full S3 path of your upload when deciding how to process your data file. The path is formatted as follows:

s3://BUCKET_NAME/FEED_NAME/version=VERSION_NUMBER/format=FORMAT_NAME/dt=DATE/hr=HOUR/mi=MINUTE/DATAFILE_NAME(S)

The following table details the convention that ASAPP follows when handling uploads:

| Path component | Description |
| --- | --- |
| BUCKET_NAME | Your dedicated import bucket, e.g. asapp-prod-umbrella-corp-imports-us-east-1 |
| FEED_NAME | The name of the data feed being uploaded, e.g. call_center_issues |
| VERSION_NUMBER | The version of the feed’s schema, e.g. 1 |
| FORMAT_NAME | The file format of the upload (csv or jsonl); Snapshot uploads use snapshot_{type}, as described in Incremental and Snapshot Modes below |
| DATE / HOUR / MINUTE | The UTC date (dt=YYYY-MM-DD) the data covers, optionally partitioned further by hour (hr=) and minute (mi=) |
| DATAFILE_NAME(S) | The data file(s) being uploaded, plus the empty _SUCCESS marker file |

Signaling that Your Upload Is Complete

Upon completing a data upload, you must upload an EMPTY file named _SUCCESS to the same path as your uploaded file, as a flag that indicates your data upload is complete. Until this file is uploaded, ASAPP will assume that the upload is in progress and will not import the associated data file.

As an example, let’s say you’re uploading one day of call center data in a set of files.
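
Once all the data files and the _SUCCESS marker are uploaded, the target path would contain objects like the following (bucket, feed, and file names are illustrative):

s3://asapp-prod-acme-imports-us-east-1/call_center_issues/version=1/format=jsonl/dt=2019-01-20/part-0001.jsonl
s3://asapp-prod-acme-imports-us-east-1/call_center_issues/version=1/format=jsonl/dt=2019-01-20/part-0002.jsonl
s3://asapp-prod-acme-imports-us-east-1/call_center_issues/version=1/format=jsonl/dt=2019-01-20/_SUCCESS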

Incremental and Snapshot Modes

You may provide data to ASAPP as either Incremental or Snapshot data. The value you provide in the format field discussed above tells ASAPP whether to treat the data as Incremental or Snapshot data.

When importing data using Incremental mode, ASAPP will append the given data to the existing data imported for that FEED_NAME. When you specify Incremental mode, you are telling ASAPP that the data uploaded for a given date is for that day only. If you use the value dt=2018-09-02 in your constructed path, you are indicating that the file contains records from 2018-09-02 00:00:00 UTC → 2018-09-02 23:59:59 UTC.

When importing data using Snapshot mode, ASAPP will replace any existing data for the indicated FEED_NAME with the contents of the uploaded file. When you specify Snapshot mode, ASAPP treats the uploaded data as a complete record from “the time history started” until the end of that particular day. A date of 2018-09-02 means the data effectively includes everything from 1970-01-01 00:00:00 UTC → 2018-09-02 23:59:59 UTC.
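
To make the distinction concrete, the two modes might be signaled with target paths like these (bucket and feed names are illustrative; the snapshot folder follows the format=snapshot_{type} convention noted below):

# Incremental: the files contain only records from 2018-09-02
s3://asapp-prod-acme-imports-us-east-1/call_center_issues/version=1/format=jsonl/dt=2018-09-02/
# Snapshot: the files contain all records up to the end of 2018-09-02
s3://asapp-prod-acme-imports-us-east-1/call_center_issues/version=1/format=snapshot_jsonl/dt=2018-09-02/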

Other Upload Notes and Tips

  1. Make sure the structure for the imported file (whether columnar or json formatted) matches the current import standards (see below for details)
  2. Data imports are scheduled daily, 4 hours after UTC midnight (for the previous day’s data)
  3. In the event that you upload historical data (i.e., from older dates than are currently in the system), please inform your ASAPP team so a complete re-import can be scheduled.
  4. Snapshot data must go into a format=snapshot_{type} folder.
  5. Providing a Snapshot allows you to provide all historical data at once.  In effect, this reloads the entire table rather than appending data as in the non-snapshot case.

Upload Example

The example below assumes a shell terminal with Python and pip installed.

# install aws cli (assumes python)
pip install awscli
# configure your AWS credentials if not already done
aws configure
# push the files for 2019-01-20 for the call_center_issues import
# for a company named `umbrella-corp` from your local drive to the production bucket
aws s3 cp /location/of/your/file.csv s3://asapp-prod-umbrella-corp-imports-us-east-1/call_center_issues/version=1/format=csv/dt=2019-01-20/
aws s3 cp _SUCCESS s3://asapp-prod-umbrella-corp-imports-us-east-1/call_center_issues/version=1/format=csv/dt=2019-01-20/
# you should see some files now in the s3 location
aws s3 ls s3://asapp-prod-umbrella-corp-imports-us-east-1/call_center_issues/version=1/format=csv/dt=2019-01-20/
    file.csv
    _SUCCESS