Digital Measurement SEI Static Content Viewing

From Engineering Client Portal

Revision as of 21:29, 21 October 2024 by Admin (talk | contribs)

This interface specification describes the file schema required to supply Nielsen with response-level viewing data in a “Server-to-Server” relationship for incorporation into the Nielsen One Content and/or Nielsen One Ads products.

This document describes the log file exposures, to be used in conjunction with:

Delivery Specifications

Files should be delivered at a fixed cadence to a dedicated S3 bucket, following the prefix and object naming conventions and the data formats outlined in this spec.

  • All prefix labels and file names should be lowercase except for app ids.
  • Supported formats are:
    • Text files with UTF-8 encoding in JSON Lines format;
    • Apache Parquet format with Snappy compression.
  • All data files must carry an extension indicating the file format (.json | .parquet).
  • A data file can be partitioned into multiple parts with a minimum size of 256 MB.
  • Data can be delivered separately in multiple splits if needed for organizational, technical, or privacy reasons. Splits make it possible to grant S3 access separately, to process data independently, and to persist data in partitions, within the limits of the SEI system.
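
As an illustration of the JSON Lines option above, the following is a minimal sketch using only the Python standard library. The function name write_json_lines and the sample records are hypothetical; the sketch only shows the one-object-per-line, UTF-8 layout and does not cover Parquet output or part sizing.

```python
import io
import json


def write_json_lines(records, stream):
    """Write one compact JSON object per line; return the number of records written."""
    count = 0
    for record in records:
        # ensure_ascii=False keeps non-ASCII characters as-is instead of \u escapes,
        # so the file content is plain UTF-8 once the stream is encoded.
        stream.write(json.dumps(record, ensure_ascii=False, separators=(",", ":")))
        stream.write("\n")
        count += 1
    return count


# In-memory usage; a real delivery would open the .json file with encoding="utf-8".
buffer = io.StringIO()
written = write_json_lines(
    [{"apn": "BestAppIOS", "apv": "21.5"}, {"apn": "BestAppIOS", "apv": "21.6"}],
    buffer,
)
```

The returned count can be reused as the record_count value reported alongside the data.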

S3 Bucket and Prefix Naming Convention

useast1-nlsn-w-dig-sei-<partnerid>-feeds-<env>/<filetype>/<split>/yyyy/mm/dd/hh/<object>

Name Description
partnerid Abbreviation provided by Nielsen for each provider or publisher
env test or prod
filetype exposure, dcr-exposure, dar-exposure, audio-exposure, etc., where:
  • exposure is reserved for multi-product files
  • <product>-exposure is for single product files, where <product> is [dar|dcr|dtvr|ctvc|audio]
split a separate data split, can be by platform (ex.: ios, browser, android, ctv), by country (us, ca, jp, etc.), by publisher, by team, etc. or “all” (if data is provided in one split).
yyyy/mm/dd/hh
  • yyyy - year
  • mm - month
  • dd - day of the month, zero-padded (example: 01, 02, ..., 31)
  • hh - hour, zero-padded (example: 00, 01, ..., 23)
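
The prefix convention above can be sketched as a small helper. This is an illustration only: the function name build_prefix is hypothetical, and it assumes the yyyy/mm/dd/hh components are expressed in UTC, like the start and end times in the object name.

```python
from datetime import datetime, timezone

BUCKET_TEMPLATE = "useast1-nlsn-w-dig-sei-{partnerid}-feeds-{env}"


def build_prefix(partnerid, env, filetype, split, hour_start):
    """Return '<bucket>/<filetype>/<split>/yyyy/mm/dd/hh' for one delivery hour."""
    if env not in ("test", "prod"):
        raise ValueError("env must be 'test' or 'prod'")
    if hour_start.tzinfo is None:
        raise ValueError("hour_start must be timezone-aware")
    # Assumption: the path components are in UTC.
    hour_utc = hour_start.astimezone(timezone.utc)
    bucket = BUCKET_TEMPLATE.format(partnerid=partnerid.lower(), env=env)
    # Prefix labels are lowercase per the delivery rules.
    return f"{bucket}/{filetype.lower()}/{split.lower()}/{hour_utc:%Y/%m/%d/%H}"


prefix = build_prefix(
    "acme", "prod", "dar-exposure", "ios",
    datetime(2024, 3, 11, 11, tzinfo=timezone.utc),
)
```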

S3 Object Naming Convention

<partnerid>_<filetype>_<intid>_<appid>_<starttime>_<endtime>.[json|parquet]

Name Description
partnerid Abbreviation provided by Nielsen for each provider or publisher
filetype exposure, dcr-exposure, dar-exposure, audio-exposure, etc., where:
  • exposure is reserved for multi-product files
  • <product>-exposure is for single product files, where <product> is [dar|dcr|dtvr|ctvc|audio]
intid integration id: unique identifier provided by Nielsen
appid Nielsen-provided server application identifier
starttime start date and hour of the data in the file in UTC
  • Example format: yyyymmddhh
endtime end date and hour (exclusive) of the data in the file, in UTC
  • Example format: yyyymmddhh
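
A minimal sketch of the object name pattern above follows. The function name build_object_name is hypothetical, and the sample identifiers are copied from the delivery example in this document. The sketch covers only the pattern as written, not any per-part suffix used when a file is split into parts.

```python
from datetime import datetime, timezone


def build_object_name(partnerid, filetype, intid, appid, start, end, extension):
    """Return '<partnerid>_<filetype>_<intid>_<appid>_<starttime>_<endtime>.<ext>'."""
    if extension not in ("json", "parquet"):
        raise ValueError("extension must be 'json' or 'parquet'")
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    # starttime and endtime use yyyymmddhh in UTC; endtime is exclusive.
    start_utc = start.astimezone(timezone.utc)
    end_utc = end.astimezone(timezone.utc)
    return (
        f"{partnerid}_{filetype}_{intid}_{appid}"
        f"_{start_utc:%Y%m%d%H}_{end_utc:%Y%m%d%H}.{extension}"
    )


name = build_object_name(
    "acme",
    "dar-exposure",
    "a9ddf15ea054ea415718767ea6",
    "P47C2495B-BBBA-56DE-BE99-14758F92F034",
    datetime(2024, 3, 11, 11, tzinfo=timezone.utc),
    datetime(2024, 3, 11, 12, tzinfo=timezone.utc),
    "json",
)
```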

Success File

An empty _SUCCESS file should be provided to indicate that data delivery for a particular hour and split is complete, even if there is no data for that hour and split.

Manifest File

A manifest should be provided which contains metadata related to the uploaded file(s). The manifest is a text file in .json format that implements the AWS unload manifest file format. It has the same name as the data file, but has the _manifest suffix.

Example:

 {
   "entries": [
     {"url":"s3://bucket/prefix/0000_object_00.snappy.parquet", "mandatory":true},
     {"url":"s3://bucket/prefix/0001_object_00.snappy.parquet", "mandatory":true},
     {"url":"s3://bucket/prefix/0002_object_00.snappy.parquet", "mandatory":true},
     {"url":"s3://bucket/prefix/0003_object_00.snappy.parquet", "mandatory":true}
   ],

   "meta": {
     "schema_version": "S2SV1.7.0",
     "accreditation_status": "0",
     "start_time": "1710154800",
     "end_time": "1710158399",
     "record_count": 31337
   }
 }

The manifest contains a complete list of all files provided for a given period and split, as well as metadata (a.k.a. header), which should include the following attributes:

Parameter | Description | Required | Specified by | Format / Example | Type
schema_version | Schema Version | Yes | Nielsen | S2SV1.7.0 | String
accreditation_status | Accreditation Status | No | Client | MRC = 1 | String
data_start_time | Data Start Time (min) | Yes | Client | Format: 32-bit unsigned int Unix time in seconds. Example: 1710154800 | String
data_end_time | Data End Time (max) | Yes | Client | Format: 32-bit unsigned int Unix time in seconds. Example: 1710158399 | String
record_count | Number of records in data file | Yes | Client | Example: 31337 | Number
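
The manifest structure can be assembled as in the following sketch. The function name build_manifest is hypothetical. The meta key names start_time and end_time follow the JSON example above; the attribute table names them data_start_time and data_end_time, so the exact keys should be confirmed with Nielsen before use.

```python
import json


def build_manifest(urls, schema_version, start_time, end_time, record_count,
                   accreditation_status=None):
    """Return a manifest dict listing every delivered file plus the meta header."""
    meta = {
        "schema_version": schema_version,
        # Assumption: key names mirror the JSON example; times are Unix seconds
        # serialized as strings, matching the String type in the attribute table.
        "start_time": str(start_time),
        "end_time": str(end_time),
        "record_count": record_count,
    }
    if accreditation_status is not None:
        meta["accreditation_status"] = str(accreditation_status)
    return {
        "entries": [{"url": url, "mandatory": True} for url in urls],
        "meta": meta,
    }


manifest = build_manifest(
    ["s3://bucket/prefix/0000_object_00.snappy.parquet",
     "s3://bucket/prefix/0001_object_00.snappy.parquet"],
    schema_version="S2SV1.7.0",
    start_time=1710154800,
    end_time=1710158399,
    record_count=31337,
    accreditation_status=0,
)
manifest_text = json.dumps(manifest, indent=2)
```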

Data delivery example for hour 11 (11:00:00 AM UTC to 11:59:59 AM UTC):

useast1-nlsn-w-dig-sei-acme-feeds-prod/dar-exposure/ios/2024/03/11/11/acme_dar-exposure_a9ddf15ea054ea415718767ea6_P47C2495B-BBBA-56DE-BE99-14758F92F034_2024031111_2024031112_0000.json

useast1-nlsn-w-dig-sei-acme-feeds-prod/dar-exposure/ios/2024/03/11/11/acme_dar-exposure_a9ddf15ea054ea415718767ea6_P47C2495B-BBBA-56DE-BE99-14758F92F034_2024031111_2024031112_0001.json

useast1-nlsn-w-dig-sei-acme-feeds-prod/dar-exposure/ios/2024/03/11/11/acme_dar-exposure_a9ddf15ea054ea415718767ea6_P47C2495B-BBBA-56DE-BE99-14758F92F034_2024031111_2024031112_0002.json

useast1-nlsn-w-dig-sei-acme-feeds-prod/dar-exposure/ios/2024/03/11/11/acme_dar-exposure_a9ddf15ea054ea415718767ea6_P47C2495B-BBBA-56DE-BE99-14758F92F034_2024031111_2024031112_0003.json

useast1-nlsn-w-dig-sei-acme-feeds-prod/dar-exposure/ios/2024/03/11/11/acme_dar-exposure_a9ddf15ea054ea415718767ea6_P47C2495B-BBBA-56DE-BE99-14758F92F034_2024031111_2024031112_manifest

useast1-nlsn-w-dig-sei-acme-feeds-prod/dar-exposure/ios/2024/03/11/11/_SUCCESS

SLA

The files must be delivered into the proper S3 bucket within 3 hours of the start of that hourly viewing file interval. For example, files from 1:00 AM to 2:00 AM must be delivered before 4 AM.
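
The SLA rule can be expressed as a short check. This is a sketch under the reading that "within 3 hours of the start" means strictly before the start time plus 3 hours, as in the 1:00 AM to 4 AM example; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

SLA_WINDOW = timedelta(hours=3)


def delivery_deadline(interval_start):
    """Return the latest allowed delivery time for an hourly interval."""
    return interval_start + SLA_WINDOW


def is_within_sla(interval_start, delivered_at):
    """True if the file landed strictly before the deadline."""
    return delivered_at < delivery_deadline(interval_start)


start = datetime(2024, 3, 11, 1, 0, tzinfo=timezone.utc)
deadline = delivery_deadline(start)  # 2024-03-11 04:00 UTC
on_time = is_within_sla(start, datetime(2024, 3, 11, 3, 59, tzinfo=timezone.utc))
```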

Accuracy of Measurement

The reported “wall clock” time of viewing needs to be within 25 seconds of the actual viewing time.

The reported content “reference” time needs to match the actual content that was played out to within plus or minus 10 seconds. For clarity: start and end times can each be off by up to 10 seconds, so the combined under- or over-reporting for each individual viewing segment should be no greater than 20 seconds.
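
A worked example of the content tolerance: the sketch below checks each endpoint of a segment against the 10-second limit and computes the resulting duration error. The function names and the sample values are hypothetical.

```python
REFERENCE_TOLERANCE_S = 10


def endpoints_within_tolerance(reported_start, reported_end,
                               actual_start, actual_end,
                               tolerance=REFERENCE_TOLERANCE_S):
    """True if both the start and the end are within the per-endpoint tolerance."""
    return (abs(reported_start - actual_start) <= tolerance
            and abs(reported_end - actual_end) <= tolerance)


def duration_error(reported_start, reported_end, actual_start, actual_end):
    """Positive means over-reporting, negative means under-reporting (seconds)."""
    return (reported_end - reported_start) - (actual_end - actual_start)


# Actual viewing covers content seconds 100..400; the report says 90..410.
ok = endpoints_within_tolerance(90, 410, 100, 400)
error = duration_error(90, 410, 100, 400)  # 20 seconds of over-reporting
```

Each endpoint is off by exactly 10 seconds, so the segment is still acceptable and sits at the 20-second combined limit.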

Schema Parameter Definitions

Parameter Description Required Format / Example
apn Application name Yes

Format: alphanumeric

Example: BestAppIOS

apv Application version Yes

Format: alphanumeric

Example: 21.5

sessionid Unique, client generated value that represents the start of a user session. "Session" is defined as continuous (flexible) interaction with an application that may span multiple streams. Yes

Format: alphanumeric

Example: Random GUID: cdcde33c-b62f-4f17-a9c8-0db4f78e09d6

streamid ID for every new instance of exposure to a different asset Yes, if no sessionid

Format: alphanumeric

Example: Random GUID: d7a909f1-5e77-4af7-8a9b-f2…

streamended Stream is known to have ended in this file Yes

Format: integer

Example:

  • 1 (stream continues in subsequent file),
  • 2 (stream closed)
publisher_user_id Publisher-specific user ID (must remain persistent indefinitely) No

Format: alphanumeric

Example: Salted, hashed user ID:

8f434346648f6b96d9dda901c5…

assetid In-house id used for an asset Yes

Format: alphanumeric

Example: VID123456789

position Array of contiguous content viewing.

For viewing gaps of less than 1 second, the gap can be smoothed over.

See below for additional details on position array parameters.

Yes

Format:

"position": [{
 "referencestart":"[timestamp]",
 "referenceend":"[timestamp]",
 "playheadstart":"[playhead position]",
 "playheadend":"[playhead position]"
}]
referencestart Wall clock reference start time Yes, if no "cln"

Format: Unix timestamp in 32-bit unsigned int in seconds

Example: 1577858505

referenceend Wall clock reference end time Yes, if no "cln"

Format: Unix timestamp in 32-bit unsigned int in seconds

Example: 1577858775

playheadstart Content position start time Yes, if no "cln"

Format: Unix timestamp in 32-bit unsigned int in seconds or integer indexed from 0 (typical for VOD)

Example: 1577858515

playheadend Content position end time Yes, if no "cln"

Format: Unix timestamp in 32-bit unsigned int in seconds or integer indexed from 0 (typical for VOD)

Example: 1577858785

cln Cumulative time spent viewing in seconds Yes if no position array
device_id Mobile Ad ID (IDFA, ADID), Connected Device ID Yes, if available

Example: A487421B-XXXX-YYYY-8343-E3BBB66E44F2

hem_sha256 SHA-256 hashed email (note: email normalization rules are applied before hashing) Strongly preferred

Example: 55C06A30DAA5D5F382FDEB8C702EC57875CC9D91A7C78BB620053FD81DC…
uoo User opt out flag for demographic measurement Yes

Format: integer

  • 0 = not opt-out
  • 1 = opt-out
xdua Device HTTP User Agent string Yes

Format: alphanumeric

Example: Apple-iPhone1C2/801.293

xff IP address Yes

Format: xxx.xxx.xxx.xxx

psudo_id_sha256 Hashed Device User Agent string + IP address No 421c76d77563afa1914846b010bd164f395bd34c2102e5e99e0cb9cf173…
The following 4 parameters become mandatory if Device User Agent String (UAS) is not available
device_platform Device platform (mobile, desktop, connected device) Required if no UAS DSK, MBL, OTT
device_type Device type for connected devices Required if no UAS AMN, APL, DVD, GGL, PSX, RKU, STB, STV, XBX
os_group Operating system of mobile devices. All other devices should be NA Required if no UAS IOS, DROID, NA
device_group Type of device: phone, tablet, desktop, set top box (CTV or OTT), unknown Required if no UAS DSK, PHN, TAB, STV, DVD, PMP, OTHER, STB, XBX, PSX, AMN, APL, GGL, RKU, UNWN
The following parameters become mandatory if IP Address is not available
robotic_flag Flag indicating IAB bot rule if IP address present in IAB bot list Required if no IP Format: integer
  • 0 = not bot,
  • 1 = bot
zip_code ZIP code where viewing occurred Required if no IP 10001
country Country ISO 3166 ALPHA-2 Required if no IP US, CA, etc.
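
The gap smoothing allowed for the position array (viewing gaps of less than 1 second) can be sketched as follows. The function name smooth_positions is hypothetical, and the sketch assumes a gap qualifies only when both the wall-clock gap and the playhead gap are under the threshold; the spec does not state how the playhead should be treated.

```python
def smooth_positions(segments, max_gap=1):
    """Merge adjacent segments whose wall-clock and playhead gaps are both under max_gap seconds."""
    merged = []
    for seg in sorted(segments, key=lambda s: s["referencestart"]):
        if merged:
            prev = merged[-1]
            ref_gap = seg["referencestart"] - prev["referenceend"]
            play_gap = seg["playheadstart"] - prev["playheadend"]
            # Assumption: both timelines must be nearly contiguous to merge.
            if 0 <= ref_gap < max_gap and 0 <= play_gap < max_gap:
                prev["referenceend"] = seg["referenceend"]
                prev["playheadend"] = seg["playheadend"]
                continue
        merged.append(dict(seg))  # copy so the caller's input is not mutated
    return merged


raw = [
    {"referencestart": 1577858505.0, "referenceend": 1577858600.2,
     "playheadstart": 0.0, "playheadend": 95.2},
    {"referencestart": 1577858600.7, "referenceend": 1577858775.0,
     "playheadstart": 95.7, "playheadend": 270.0},
    {"referencestart": 1577858900.0, "referenceend": 1577859000.0,
     "playheadstart": 270.0, "playheadend": 370.0},
]
smoothed = smooth_positions(raw)
```

Here the first two segments are half a second apart and collapse into one entry, while the third starts more than two minutes later and stays separate. Values would still need to be rounded to integer seconds before delivery.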

Note: All parameters are case sensitive.
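
The conditional requirements in the table above can be checked before delivery with a sketch like the one below. The function name missing_conditional_fields is hypothetical, and only the conditional rules are covered (User Agent fallback, IP fallback, position versus cln, sessionid versus streamid), not the unconditional ones. Key names are matched exactly because parameters are case sensitive.

```python
UAS_FALLBACK_FIELDS = ("device_platform", "device_type", "os_group", "device_group")
IP_FALLBACK_FIELDS = ("robotic_flag", "zip_code", "country")


def missing_conditional_fields(record):
    """Return the conditionally required fields that are absent from a record."""
    missing = []
    if not record.get("xdua"):
        # No User Agent string: the device descriptors become mandatory.
        missing += [f for f in UAS_FALLBACK_FIELDS if f not in record]
    if not record.get("xff"):
        # No IP address: bot flag and geography become mandatory.
        # Presence is tested with "in" because robotic_flag may legitimately be 0.
        missing += [f for f in IP_FALLBACK_FIELDS if f not in record]
    if not record.get("position") and "cln" not in record:
        missing.append("position or cln")
    if not record.get("sessionid") and not record.get("streamid"):
        missing.append("sessionid or streamid")
    return missing


complete = {
    "sessionid": "cdcde33c-b62f-4f17-a9c8-0db4f78e09d6",
    "xdua": "Apple-iPhone1C2/801.293",
    "xff": "192.0.2.10",
    "position": [{"referencestart": 1577858505, "referenceend": 1577858775,
                  "playheadstart": 1577858515, "playheadend": 1577858785}],
}
partial = {"streamid": "stream-1", "cln": 120, "device_platform": "OTT"}
gaps = missing_conditional_fields(partial)
```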