Since the top-level Folders array is not an array if it contains a single item, we first need to turn it into an array if it isn't one already. In jq, we can do that with
.Folders |= (if type == "array" then . else [.] end)
The general idea of the rest of the transformation is to duplicate the upper-level data, i.e. some of the key+value pairs from the top-most level and the Id and Path key from each Folders element, into the lowest level, the FolderItems elements. We can then convert each FolderItems element into a CSV record.
To avoid keys with duplicate names, we also need to rename the Folders element's Id key to FolderId (and rename Path from the same level to FolderPath, for consistency).
The data from the top-most level that we want can be picked out and transferred down in an internal variable using
pick(.CreationTime, .Id, .UserId, .ClientIPAddress) as $record
This will create $record as
{
"CreationTime": "2024-02-06T12:13:14",
"Id": "abcdabcd-1234-1234-5555-888888888888",
"UserId": "[email protected]",
"ClientIPAddress": "5.5.5.5"
}
... for the first JSON object. Not that we get the keys in the order that we want them in the final CSV output.
We can then extract only the Folders elements with .Folders[] and pick out the Id and Path for each. Since we want to rename these, we can't use pick() again, so we need to be a bit more hands-on:
.Folders[] | { FolderId: .Id, FolderPath: .Path } as $folder
We may then use .FolderItems[] to get a set of FolderItems elements, which we can add $record and $folder in front of:
.FolderItems[] | $record + $folder + .
As a single jq expression:
.Folders |= (if type == "array" then . else [.] end) |
pick(.CreationTime, .Id, .UserId, .ClientIPAddress) as $record |
.Folders[] | { FolderId: .Id, FolderPath: .Path } as $folder |
.FolderItems[] | $record + $folder + .
The result, given the data in the question:
{
"CreationTime": "2024-02-06T12:13:14",
"Id": "abcdabcd-1234-1234-5555-888888888888",
"UserId": "[email protected]",
"ClientIPAddress": "5.5.5.5",
"FolderId": "EEEEEEEE",
"FolderPath": "\\Outbox",
"InternetMessageId": "<[email protected]>",
"SizeInBytes": 12345
}
{
"CreationTime": "2024-02-06T12:13:14",
"Id": "abcdabcd-1234-1234-5555-888888888888",
"UserId": "[email protected]",
"ClientIPAddress": "5.5.5.5",
"FolderId": "EEEEEEEE",
"FolderPath": "\\Outbox",
"InternetMessageId": "<[email protected]>",
"SizeInBytes": 11122
}
(etc.)
I would personally convert this set of JSON objects to CSV using Miller:
$ jq '.Folders |= (if type == "array" then . else [.] end) | pick(.CreationTime, .Id, .UserId, .ClientIPAddress) as $record | .Folders[] | { FolderId: .Id, FolderPath: .Path } as $folder | .FolderItems[] | $record + $folder + .' file | mlr --j2c cat
CreationTime,Id,UserId,ClientIPAddress,FolderId,FolderPath,InternetMessageId,SizeInBytes
2024-02-06T12:13:14,abcdabcd-1234-1234-5555-888888888888,[email protected],5.5.5.5,EEEEEEEE,\Outbox,<[email protected]>,12345
2024-02-06T12:13:14,abcdabcd-1234-1234-5555-888888888888,[email protected],5.5.5.5,EEEEEEEE,\Outbox,<[email protected]>,11122
2024-02-06T12:13:14,abcdabcd-1234-1234-5555-888888888888,[email protected],5.5.5.5,EEEEEEEE,\Outbox,<[email protected]>,88888
2024-02-06T12:13:14,abcdabcd-1234-1234-5555-888888888888,[email protected],5.5.5.5,FFFFFFFFFFFFFFFFFAB,\Inbox,<[email protected]>,44444
2024-02-06T12:13:14,abcdabcd-1234-1234-5555-888888888888,[email protected],5.5.5.5,FFFFFFFFFFFFFFFFFAB,\Inbox,<[email protected]>,100000
2024-02-06T12:13:14,abcdabcd-1234-1234-5555-888888888888,[email protected],5.5.5.5,FFFFFFFFFFFFFFFFFAB,\Inbox,<[email protected]>,109000
2024-02-06T12:13:14,abcdabcd-1234-1234-5555-888888888888,[email protected],5.5.5.5,FFFFFFFFFFFFFFFFFAB,\Inbox,<[email protected]>,22000
2024-02-06T12:13:14,abcdabcd-1234-1234-5555-888888888888,[email protected],5.5.5.5,FFFFFFFFFFFFFFFFFAB,\Inbox,<[email protected]>,333333
2024-02-06T20:00:00,abcdabcd-1234-1234-6666-9999999999999,[email protected],7.7.7.7,12341234,\Temp,<[email protected]>,77777
2024-02-06T20:00:00,abcdabcd-1234-1234-6666-9999999999999,[email protected],7.7.7.7,12341234,\Temp,<[email protected]>,888888
2024-02-06T20:00:00,abcdabcd-1234-1234-6666-9999999999999,[email protected],7.7.7.7,12341234,\Temp,<[email protected]>,99999
To do the conversion to CSV using jq in place of Miller, replace the mlr command with something like
jq -s -r '(first|keys|@csv), map([.[]]|@csv)[]'
This would pick the keys from the first input object as the CSV headers and would CSV-format all values into one record per object.