Digital Measurement SEI Streaming Content Viewing
From Engineering Client Portal
Revision as of 18:07, 25 June 2024
This interface specification depicts the file schema required to supply Nielsen with response-level Viewing data in a “Server-to-Server” relationship for incorporation into the Nielsen One Content and/or Nielsen One Ads products.
This document describes the log file exposures and is to be used in conjunction with the Digital Measurement Content Audit Beacon.
Delivery Specifications
Files should be delivered at a fixed cadence to a dedicated S3 bucket, using the following prefix and object naming conventions and data formats.
- All prefix labels and file names should be lowercase except for app ids.
- Supported formats are:
  - Apache Parquet format with snappy compression (preferred format);
  - Text files with UTF-8 encoding in JSON Lines format.
- All data files have extensions to indicate the file format (.json | .parquet).
- Data files can be partitioned into multiple parts with a minimum size of 256 MB.
- Data can be delivered separately in multiple splits, if needed due to organizational, technical or privacy requirements. This allows S3 access to be permissioned separately, data to be processed independently, and data to be persisted in partitions, within the limits of the SEI system.
S3 Bucket and Prefix Naming Convention
useast1-nlsn-w-dig-sei-<partnerid>-feeds-<env>/<filetype>/<split>/yyyy/mm/dd/hh/<object>
Name | Description |
---|---|
partnerid | Abbreviation provided by Nielsen for each provider or publisher |
env | test or prod |
filetype | exposure, dcr-exposure, dar-exposure, audio-exposure, etc., where: |
split | A separate data split; can be by platform (e.g. ios, browser, android, ctv), by country (us, ca, jp, etc.), by publisher, by team, etc., or “all” (if data is provided in one split). |
yyyy/mm/dd/hh | |
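The prefix convention above can be sketched as a small helper. This is a minimal illustration only: the function name `build_prefix` is hypothetical, and the use of UTC for the `yyyy/mm/dd/hh` path segment is an assumption, since the table does not state the time zone for the prefix.

```python
from datetime import datetime, timezone


def build_prefix(partnerid: str, env: str, filetype: str, split: str, hour: datetime) -> str:
    """Return '<bucket>/<filetype>/<split>/yyyy/mm/dd/hh' for one delivery hour.

    Hypothetical helper. Assumes the date/hour path segment is expressed in UTC.
    """
    if env not in ("test", "prod"):
        raise ValueError("env must be 'test' or 'prod'")
    if hour.tzinfo is None:
        raise ValueError("hour must be a timezone-aware datetime")
    hour_utc = hour.astimezone(timezone.utc)
    bucket = f"useast1-nlsn-w-dig-sei-{partnerid}-feeds-{env}"
    # All prefix labels are lowercase per the delivery specification.
    return f"{bucket}/{filetype}/{split}/{hour_utc:%Y/%m/%d/%H}".lower()


if __name__ == "__main__":
    print(build_prefix("acme", "prod", "dar-exposure", "ios",
                       datetime(2022, 5, 25, 11, tzinfo=timezone.utc)))
```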
Object Naming Convention
<partnerid>_<filetype>_<intid>_<appid>_<starttime>_<endtime>.[json|parquet]
Name | Description |
---|---|
partnerid | Abbreviation provided by Nielsen for each provider or publisher |
filetype | exposure, dcr-exposure, dar-exposure, audio-exposure, etc., where: |
intid | integration id: unique identifier provided by Nielsen |
appid | Nielsen-provided server application identifier |
starttime | start date and hour of the data in the file in UTC |
endtime | end date and hour (not inclusive) of the data in the file in UTC |
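As an illustration of the object naming pattern, the sketch below assembles a file name for one hourly file. The function name `build_object_name` is hypothetical. The `yyyymmddhh` timestamp layout and the optional four-digit part suffix are assumptions inferred from the delivery example in this document, not from the pattern itself.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def build_object_name(partnerid: str, filetype: str, intid: str, appid: str,
                      start: datetime, ext: str = "parquet",
                      part: Optional[int] = None) -> str:
    """Build '<partnerid>_<filetype>_<intid>_<appid>_<starttime>_<endtime>.<ext>'.

    Hypothetical helper. Assumes hourly files and a yyyymmddhh timestamp layout.
    """
    if ext not in ("json", "parquet"):
        raise ValueError("ext must be 'json' or 'parquet'")
    if start.tzinfo is None:
        raise ValueError("start must be a timezone-aware datetime")
    start_utc = start.astimezone(timezone.utc)
    end_utc = start_utc + timedelta(hours=1)  # endtime is not inclusive
    fields = [
        partnerid.lower(),
        filetype.lower(),
        intid.lower(),
        appid,  # app ids keep their original case
        f"{start_utc:%Y%m%d%H}",
        f"{end_utc:%Y%m%d%H}",
    ]
    stem = "_".join(fields)
    if part is not None:
        stem += f"_{part:04d}"  # part suffix as seen in the delivery example
    return f"{stem}.{ext}"
```

A call with `part=0` and `ext="json"` yields a name ending in `_0000.json`, matching the shape of the partitioned files in the delivery example.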
Success File
An empty _SUCCESS file should be provided to indicate that data delivery for a particular hour and split is completed (even if there is no data for that particular hour and split).
Manifest File
A manifest should be provided which contains metadata related to the uploaded file(s). The manifest is a text file in .json format that implements the AWS unload manifest file format. It has the same name as the data file, but with the _manifest suffix.
Example:
{
  "entries": [
    {"url":"s3://bucket/prefix/0000_object_00.snappy.parquet", "mandatory":true},
    {"url":"s3://bucket/prefix/0001_object_00.snappy.parquet", "mandatory":true},
    {"url":"s3://bucket/prefix/0002_object_00.snappy.parquet", "mandatory":true},
    {"url":"s3://bucket/prefix/0003_object_00.snappy.parquet", "mandatory":true}
  ],
  "meta": {
    "schema_version": "S2SV1.7.0",
    "accreditation_status": "0",
    "start_time": "1710154800",
    "end_time": "1710158399",
    "record_count": 31337
  }
}
The manifest contains a complete list of all files provided for a given period and split, as well as metadata (a.k.a. the header), which should include the following attributes:
Parameter | Description | Required | Specified | Format / Example | Type |
---|---|---|---|---|---|
schema_version | Schema Version | Yes | Nielsen | S2SV1.7.0 | String |
accreditation_status | Accreditation Status | No | Client | MRC = 1 | String |
data_start_time | Data Start Time (min) | Yes | Client | Format: 32-bit unsigned int Unix time in seconds. Example: | String |
data_end_time | Data End Time (max) | Yes | Client | Format: 32-bit unsigned int Unix time in seconds. Example: | String |
record_count | Number of records in data file | Yes | Client | Example: 31337 | Number |
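A manifest with this shape could be generated along the following lines. This is a sketch, not a reference implementation: `build_manifest` is a hypothetical name, and the meta key names (`start_time`, `end_time`) follow the JSON example above, whereas the attribute table lists `data_start_time` and `data_end_time`, so the exact keys should be confirmed before use.

```python
import json
from typing import Iterable, Optional


def build_manifest(urls: Iterable[str], schema_version: str, start_time: int,
                   end_time: int, record_count: int,
                   accreditation_status: Optional[str] = None) -> str:
    """Return manifest JSON text listing every data file plus the meta header.

    Hypothetical helper; key names are taken from the JSON example (assumption).
    """
    meta = {
        "schema_version": schema_version,
        # Times are Unix seconds, serialized as strings as in the example.
        "start_time": str(start_time),
        "end_time": str(end_time),
        "record_count": record_count,
    }
    if accreditation_status is not None:  # optional attribute
        meta["accreditation_status"] = str(accreditation_status)
    manifest = {
        "entries": [{"url": url, "mandatory": True} for url in urls],
        "meta": meta,
    }
    return json.dumps(manifest, indent=2)


if __name__ == "__main__":
    print(build_manifest(
        ["s3://bucket/prefix/0000_object_00.snappy.parquet"],
        "S2SV1.7.0", 1710154800, 1710158399, 31337, accreditation_status="0"))
```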
Data delivery example for hour 11 (11:00:00 AM UTC to 11:59:59 AM UTC):
useast1-nlsn-w-dig-sei-acme-feeds-prod/dar-exposure/ios/2022/05/25/11/acme_dtvr-exposure_a9ddf15ea054ea415718767ea6_P47C2495B-BBBA-56DE-BE99-14758F92F034_2024031111_2024031112_0000.json
useast1-nlsn-w-dig-sei-acme-feeds-prod/dar-exposure/ios/2022/05/25/11/acme_dtvr-exposure_a9ddf15ea054ea415718767ea6_P47C2495B-BBBA-56DE-BE99-14758F92F034_2024031111_2024031112_0001.json
useast1-nlsn-w-dig-sei-acme-feeds-prod/dar-exposure/ios/2022/05/25/11/acme_dtvr-exposure_a9ddf15ea054ea415718767ea6_P47C2495B-BBBA-56DE-BE99-14758F92F034_2024031111_2024031112_0002.json
useast1-nlsn-w-dig-sei-acme-feeds-prod/dar-exposure/ios/2022/05/25/11/acme_dtvr-exposure_a9ddf15ea054ea415718767ea6_P47C2495B-BBBA-56DE-BE99-14758F92F034_2024031111_2024031112_0003.json
useast1-nlsn-w-dig-sei-acme-feeds-prod/dar-exposure/ios/2022/05/25/11/acme_dtvr-exposure_a9ddf15ea054ea415718767ea6_P47C2495B-BBBA-56DE-BE99-14758F92F034_2024031111_2024031112_manifest
useast1-nlsn-w-dig-sei-acme-feeds-prod/dar-exposure/ios/2022/05/25/11/_SUCCESS
SLA
The files must be delivered into the proper S3 bucket within 3 hours of the start of that hourly viewing file interval. For example, files from 1:00 AM to 2:00 AM must be delivered before 4 AM.
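The SLA arithmetic can be sketched as follows. The helper names are hypothetical, and treating the deadline as exclusive ("before 4 AM") is an assumption based on the example in the paragraph above.

```python
from datetime import datetime, timedelta, timezone

SLA_WINDOW = timedelta(hours=3)  # measured from the start of the hourly interval


def delivery_deadline(interval_start: datetime) -> datetime:
    """Latest moment (exclusive) by which the hourly file must be delivered."""
    return interval_start + SLA_WINDOW


def is_on_time(interval_start: datetime, delivered_at: datetime) -> bool:
    """True if the delivery happened strictly before the deadline (assumption)."""
    return delivered_at < delivery_deadline(interval_start)


if __name__ == "__main__":
    start = datetime(2024, 3, 11, 1, 0, tzinfo=timezone.utc)  # 1:00 AM interval
    print(delivery_deadline(start))  # 4:00 AM on the same day
```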
Accuracy of Measurement
The reported “wall clock” time of viewing needs to be within 25 seconds of the actual viewing time.
The reported content “reference” time needs to match the actual content that was played out to within plus/minus 10 seconds (for clarity: start and end times can each be off by up to 10 seconds, so the combined under/over reporting for each individual viewing segment should be no greater than 20 seconds).
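The two tolerances above can be expressed as simple checks. This is a sketch for illustration: the function names are hypothetical, all times are assumed to be in seconds, and the limits are treated as inclusive.

```python
WALL_CLOCK_TOLERANCE_S = 25  # reported vs. actual wall clock time of viewing
REFERENCE_TOLERANCE_S = 10   # per boundary (start or end) of a viewing segment


def wall_clock_ok(reported: int, actual: int) -> bool:
    """True if the reported wall clock time is within 25 seconds of actual."""
    return abs(reported - actual) <= WALL_CLOCK_TOLERANCE_S


def reference_segment_ok(reported_start: int, reported_end: int,
                         actual_start: int, actual_end: int) -> bool:
    """True if both segment boundaries are within 10 seconds of actual.

    When each boundary is within 10 seconds, the combined under/over
    reporting for the segment cannot exceed 20 seconds.
    """
    start_error = abs(reported_start - actual_start)
    end_error = abs(reported_end - actual_end)
    return start_error <= REFERENCE_TOLERANCE_S and end_error <= REFERENCE_TOLERANCE_S
```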
CTV, Mirroring, and Casting
For apps native to the OTT device (e.g. a streaming app downloaded to and viewed on an Apple TV), the audit ping should fire from the OTT device, and Viewing data should reside in the OTT Viewing file.
For mirroring, where video playback occurs on both the mobile device and the OTT device, only one Viewing file row is necessary. If possible to determine, set "secondscr":"MIR" and include the row in the respective mobile platform Viewing file.
For a casting scenario where content is controlled via the mobile device, but displayed only on the OTT device such as an AppleTV or Chromecast, an audit ping must fire from the mobile device before the casting occurs and at the end of playback from the mobile device only. Viewing data resides in the respective mobile platform Viewing file.
{...
"sessionid":"ABC",
"streamid":"DEF",
"position":[
{
"referencestart":"xxxxxxx123",
"referenceend":"xxxxxxx183",
"playheadstart":"0",
"playheadend":"60"
},
{
"referencestart":"xxxxxxx243",
"referenceend":"xxxxxxx303",
"playheadstart":"120",
"playheadend":"180"
}
]
}
{...
"sessionid":"ABC",
"streamid":"DEF",
"position":[
{
"referencestart":"xxxxxxx183",
"referenceend":"xxxxxxx243",
"playheadstart":"60",
"playheadend":"120"
}
],
"secondscr":"OTT"
}
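One way to read the two example rows above is that they share a sessionid and streamid and together cover a single playback, with the middle segment flagged as `"secondscr":"OTT"`. The sketch below combines the position segments of such rows into one timeline ordered by playhead and checks that it has no gaps. The function names are hypothetical and this interpretation of the rows is an assumption.

```python
from typing import Dict, List, Optional, Tuple

Segment = Tuple[int, int, Optional[str]]  # (playheadstart, playheadend, secondscr)


def merged_timeline(rows: List[Dict]) -> List[Segment]:
    """Collect position segments from all rows, ordered by playhead start."""
    segments: List[Segment] = []
    for row in rows:
        for pos in row["position"]:
            segments.append((int(pos["playheadstart"]),
                             int(pos["playheadend"]),
                             row.get("secondscr")))
    segments.sort(key=lambda seg: seg[0])
    return segments


def is_contiguous(segments: List[Segment]) -> bool:
    """True if every segment starts where the previous one ended."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(segments, segments[1:]))


if __name__ == "__main__":
    rows = [
        {"sessionid": "ABC", "streamid": "DEF", "position": [
            {"playheadstart": "0", "playheadend": "60"},
            {"playheadstart": "120", "playheadend": "180"}]},
        {"sessionid": "ABC", "streamid": "DEF", "secondscr": "OTT", "position": [
            {"playheadstart": "60", "playheadend": "120"}]},
    ]
    timeline = merged_timeline(rows)
    print(timeline, is_contiguous(timeline))
```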
Schema Parameter Definitions
Parameter | Description | Required | Format / Example |
---|---|---|---|
apn | Application name | Yes | Format: alphanumeric. Example: |
apv | Application version | Yes | Format: alphanumeric. Example: |
sessionid | Unique, client-generated value that represents the start of a user session. "Session" is defined as continuous (flexible) interaction with an application that may span multiple streams. | Yes | Format: alphanumeric. Example: Random GUID: |
streamid | ID for every new instance of exposure to a different asset | Yes, if no sessionid | Format: alphanumeric. Example: Random GUID: |
streamended | Stream is known to have ended in this file | Yes | Format: integer. Example: |
publisher_user_id | Publisher-specific user ID (must remain persistent indefinitely) | No | Format: alphanumeric. Example: Salted, hashed user ID: |
provider_user_id | Provider-specific user ID (must remain persistent indefinitely) | No | Format: alphanumeric. Example: Salted, hashed user ID: |
assetid | In-house id used for a video asset (TMS ID if available) | Yes | Format: alphanumeric. Example: |
nielsen_id3_tag | Encrypted Nielsen ID3 Tag. Contains SID (Source Identifier) codes (PC - Program Content & FD - Final Distributor). | Optional | Format: alphanumeric. Example: Note: Only Data Tags should be included; INFO Tags should not. An INFO Tag is characterized by the following CID prefix: X100zdCIGeIlgZnkYj6UvQ== Example INFO Tag (not desired): |
gracenote_id | Gracenote TMS ID (if available) should be passed for all telecasted content. | Required if id3 not provided | Format: 14-character string, normally leading with 2 alpha characters ('EP', 'MV', 'SH' or 'SP'), followed by 12 numbers. Example: |
station_id | Gracenote station ID that identifies the station | Required if id3 not provided | Format: alphanumeric |
program_name | Name of program | Required if id3 not provided | Format: alphanumeric, max 25 characters; no special characters. Example: See below for more specific guidance on Program Name. |
network_affiliate | Network affiliation of a station | Required if id3 not provided | Format: alphanumeric. Example: |
channel_id | ID of channel | Required if id3 not provided | Format: integer |
channel_name | Name of channel | Required if id3 not provided | Format: alphanumeric |
callsign | FCC-assigned unique identifier for a transmitter station | Required if id3 not provided | Format: alphanumeric |
dma | Designated Market Area where viewing occurred | Yes | 501 |
ad_load_flag | Linear or dynamic ad load | Yes | Format: integer; |
ad_support_flag | Intended method of monetizing the content | Yes | Format: integer; Note: set to " |
position | Array of contiguous content viewing. For viewing gaps of less than 1 second, the gap can be smoothed over. See below for additional details on position array parameters. | Yes | Format: "position": [{"referencestart":"[timestamp]", "referenceend":"[timestamp]", "playheadstart":"[playhead position]", "playheadend":"[playhead position]"}] |
referencestart | Wall clock reference start time | Yes | Format: Unix timestamp in 32-bit unsigned int in seconds. Example: |
referenceend | Wall clock reference end time | Yes | Format: Unix timestamp in 32-bit unsigned int in seconds. Example: |
playheadstart | Content position start time | Yes | Format: Unix timestamp in 32-bit unsigned int in seconds or integer indexed from 0 (typical for VOD). Example: |
playheadend | Content position end time | Yes | Format: Unix timestamp in 32-bit unsigned int in seconds or integer indexed from 0 (typical for VOD). Example: |
viewedads | Array of ads viewed by client. See below for additional details on viewedads array parameters. | Yes | Format: "viewedads": [{"adpod":"0", "adposition":"0", "adid":"0", "isdarid":"0", "adstart":"utc", "adend":"utc", "advisible":"0", "adfocus":"0", "adaudio":"0"}] |
adpod | AdPod sequence number. Increment for each AdPod. If the same AdPod is played out twice (due to rewind), still increment the AdPod sequence to reflect the sequence in which the AdPods are played. | Yes | Format: integer. Example: |
adposition | Ad sequence number. Increment each time a new Ad is encountered. If the same Ad is played out twice (due to rewind), still increment the Ad sequence to reflect the sequence in which the Ads are played. | Yes | Format: integer. Example: |
adid | The unique identifier for this Ad | No | Format: alphanumeric. In-house AdId, Industry AdID, AdNetwork ID or DAR placement ID |
isdarid | Is the AdId being passed a Nielsen DAR placement ID | Required if AdID | Format: integer |
adstart | Wall clock reference start time | Yes | Format: Unix timestamp in 32-bit unsigned int in seconds |
adend | Wall clock reference end time | Yes | Format: Unix timestamp in 32-bit unsigned int in seconds |
device_id | Mobile Ad ID (IDFA, ADID), Connected Device ID | Yes, if available | A487421B-XXXX-YYYY-8343-E3BBB66E44F2 |
hem_sha256 | SHA-256 hashed email. Note: email normalization rules applied before hashing | Strongly Preferred | 55C06A30DAA5D5F382FDEB8C702EC57875CC9D91A7C78BB620053FD81DC… |
luid | Living Unit ID - Experian Household ID | Yes, for CTV if available | B0EOFEDgD |
uoo | User opt out flag for demographic measurement | Yes | Format: integer |
xdua | Device HTTP User Agent string | Yes | Format: alphanumeric. Example: |
xff | IP address | Yes | Format: xxx.xxx.xxx.xxx |
psudo_id_sha256 | Hashed Device User Agent string + IP address | No | 421c76d77563afa1914846b010bd164f395bd34c2102e5e99e0cb9cf173… |
The following 4 parameters become mandatory if Device User Agent String (UAS) is not available | | | |
device_platform | Device platform (mobile, desktop, connected device) | Required if no UAS | DSK, MBL, OTT |
device_type | Device type for connected devices | Required if no UAS | AMN, APL, DVD, GGL, PSX, RKU, STB, STV, XBX |
os_group | Operating system of mobile devices. All other devices should be NA | Required if no UAS | IOS, DROID, NA |
device_group | Type of device: phone, tablet, desktop, set top box (CTV or OTT), unknown | Required if no UAS | DSK, PHN, TAB, STV, DVD, PMP, OTHER, STB, XBX, PSX, AMN, APL, GGL, RKU, UNWN |
The following 4 parameters become mandatory if IP Address is not available | | | |
robotic_flag | Flag indicating IAB bot rule if IP address present in IAB bot list | Required if no IP | Format: integer |
zip_code | ZIP code where viewing occurred | Required if no IP | 10001 |
country | Country ISO 3166 ALPHA-2 | Required if no IP | US, CA, etc. |
Note: All parameters are case sensitive.
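A few of the format rules in the table lend themselves to a pre-delivery check. The sketch below is illustrative only: `validate_record` is a hypothetical helper, it covers just three rules, and it makes two assumptions that the table does not settle: that gracenote_id always follows the "normal" 2-letter plus 12-digit layout, and that spaces are permitted in program_name.

```python
import re
from typing import Dict, List

# Assumption: strict form of the "normally" 2 alpha + 12 numbers layout.
GRACENOTE_ID_RE = re.compile(r"[A-Za-z]{2}[0-9]{12}")
# Assumption: letters, digits and spaces are allowed; max 25 characters.
PROGRAM_NAME_RE = re.compile(r"[A-Za-z0-9 ]{1,25}")
UINT32_MAX = 2**32 - 1


def validate_record(record: Dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    errors: List[str] = []

    gracenote_id = record.get("gracenote_id")
    if gracenote_id is not None and not GRACENOTE_ID_RE.fullmatch(gracenote_id):
        errors.append("gracenote_id: expected 2 letters followed by 12 digits")

    program_name = record.get("program_name")
    if program_name is not None and not PROGRAM_NAME_RE.fullmatch(program_name):
        errors.append("program_name: expected alphanumeric, max 25 characters")

    # Wall clock reference times are Unix seconds that fit a 32-bit unsigned int.
    for index, pos in enumerate(record.get("position", [])):
        for key in ("referencestart", "referenceend"):
            value = str(pos.get(key, ""))
            if not (value.isascii() and value.isdigit() and int(value) <= UINT32_MAX):
                errors.append(f"position[{index}].{key}: expected 32-bit Unix seconds")

    return errors
```

Returning a list of problems rather than raising on the first one lets a delivery job log every issue in a record in one pass.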