[FEATURE] Add json_object constructor function to PPL #3208

@acarbonetto

Description

Is your feature request related to a problem?

As part of the RFC to add JSON functions, a json_object function would be useful for constructing JSON objects from scalars, multi-valued fields, and complex values (nested objects and arrays).

What solution would you like?

To consider: should the function return JSON objects or JSON-encoded strings?

  • Returning JSON objects reduces the number of serialize/deserialize steps in the plugin, which should slightly improve performance. Using JSON objects is also compatible with the AWS Athena SQL language, opensearch-spark PPL, and Splunk.
  • Returning JSON-encoded strings reduces the complexity of the language, and removes the need for the user to call to_json_string or cast to String in order to output fields or apply string operations.
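
The trade-off between the two options can be illustrated with a small Python model. This is only a sketch of the idea, not the plugin's implementation: the helper names `json_object_as_value` and `json_object_as_string` are hypothetical, and a Python `dict` stands in for a JSON object type.

```python
import json

# Hypothetical model of option 1: json_object returns a structured value.
def json_object_as_value(key, value):
    return {key: value}

# Hypothetical model of option 2: json_object returns a JSON-encoded string.
def json_object_as_string(key, value):
    return json.dumps({key: value}, separators=(",", ":"))

# Option 1: nested calls compose directly; serialize once at output time.
nested = json_object_as_value("outer", json_object_as_value("inner", 123.45))
out_a = json.dumps(nested, separators=(",", ":"))

# Option 2, naive nesting: the inner result is a string, so it is embedded
# as an escaped string rather than as a nested object.
inner = json_object_as_string("inner", 123.45)
naive_b = json_object_as_string("outer", inner)

# Option 2 with real nesting: the inner string has to be parsed again first,
# which is the extra deserialize step mentioned above.
fixed_b = json_object_as_string("outer", json.loads(inner))

print(out_a)    # {"outer":{"inner":123.45}}
print(naive_b)  # {"outer":"{\"inner\":123.45}"}
print(fixed_b)  # {"outer":{"inner":123.45}}
```

With structured values, nesting composes without re-parsing; with encoded strings, each nested call needs a parse step to avoid double encoding, but the top-level result is immediately usable as a string.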

Tasks to complete:

  • Introduce the json_object(key, scalar_value [, key, scalar_value]*) -> json_object function, where scalar_value is a scalar expression.
  • Introduce the json_object(key, value [, key, value]*) -> json_object function, where value may be another json_object.
  • Introduce the json_object(key, value [, key, value]*) -> json_object function, where value can be a field expression (for multi-valued fields, take the first value).
  • Introduce the json_object(key, value [, key, value]*) -> json_object function, where value can be a json_array.

### `JSON_OBJECT`

**Description**

`json_object(<key>, <value>[, <key>, <value>]...)` returns a JSON object from key-value pairs.

**Argument type:**
- A \<key\> must be a STRING.
- A \<value\> can be a scalar, another JSON object, or a JSON array. Note: a field passed as a value is treated as single-valued. Use `json_array` to construct an array value from a multi-valued field.

**Return type:** JSON Object

Example:

    os> source=people | eval result = json_object('key', 123.45) | fields result
    fetched rows / total rows = 1/1
    +------------------+
    | result           |
    +------------------+
    | {"key":123.45}   |
    +------------------+

    os> source=people | eval result = json_object('outer', json_object('inner', 123.45)) | fields result
    fetched rows / total rows = 1/1
    +------------------------------+
    | result                       |
    +------------------------------+
    | {"outer":{"inner":123.45}}   |
    +------------------------------+

    os> source=people | eval result = json_object('array_doc', json_array(123.45, "string", true, null)) | fields result
    fetched rows / total rows = 1/1
    +------------------------------------------------+
    | result                                         |
    +------------------------------------------------+
    | {"array_doc":[123.45, "string", true, null]}   |
    +------------------------------------------------+
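
The semantics proposed above can be modeled in a few lines of Python. This is a rough sketch under stated assumptions, not the plugin's implementation: a `dict` stands in for the JSON object type, a `list` for the result of `json_array`, and a `tuple` for a multi-valued field (of which only the first value is kept, as described in the task list). The error types and the compact output format are also assumptions.

```python
import json

def json_array(*values):
    # Model a JSON array as a Python list.
    return list(values)

def json_object(*args):
    # Arguments are alternating key/value pairs.
    if len(args) % 2 != 0:
        raise ValueError("json_object expects key/value pairs")
    result = {}
    for key, value in zip(args[0::2], args[1::2]):
        if not isinstance(key, str):
            raise TypeError("json_object keys must be strings")
        # Assumption: a tuple models a multi-valued field; keep the first value.
        if isinstance(value, tuple):
            value = value[0] if value else None
        result[key] = value
    return result

def to_json_string(obj):
    # Compact encoding with no whitespace after separators (an assumption).
    return json.dumps(obj, separators=(",", ":"))

print(to_json_string(json_object("key", 123.45)))
# {"key":123.45}
print(to_json_string(json_object("outer", json_object("inner", 123.45))))
# {"outer":{"inner":123.45}}
print(to_json_string(json_object("array_doc", json_array(123.45, "string", True, None))))
# {"array_doc":[123.45,"string",true,null]}
print(to_json_string(json_object("tags", ("a", "b"))))
# {"tags":"a"}
```

Note that this sketch prints arrays without spaces after commas, whereas the third example above shows spaces; the exact output formatting is left to the implementation.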

What alternatives have you considered?

N/A

Do you have any additional context?

opensearch-project/opensearch-spark#780 - PR to add feature to opensearch-spark PPL

Metadata

Assignees: No one assigned

Labels: PPL (Piped processing language), enhancement (New feature or request)

Status: Done

Milestone: No milestone