Build the per-owner RawFileTable CSV (provider file inventory, post-archive safe).

For each station, reads raw.owner/record.json and emits one row per provider file (one per ComponentID × FileID pair). Handles both storage states:

raw.owner/record.json present on disk (not yet archived).
raw.owner.tar.gz archive (record.json is streamed to stdout with tar -xzOf, so nothing is extracted to disk).
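
The two states can be handled by a single reader. The following is a minimal sketch, not the package's implementation: readRecordJson is a hypothetical helper, and both the archive member path "raw.owner/record.json" and the presence of a system tar on the PATH are assumptions.

```r
# Hypothetical helper: return the record.json text for one station directory.
# Assumes the archive stores the file as "raw.owner/record.json" and that a
# system `tar` accepting -xzOf is available on the PATH.
readRecordJson <- function(station.dir) {
  json.path <- file.path(station.dir, "raw.owner", "record.json")
  if (file.exists(json.path)) {
    # State 1: not yet archived, so read the file straight from disk.
    return(paste(readLines(json.path, warn = FALSE), collapse = "\n"))
  }
  archive <- file.path(station.dir, "raw.owner.tar.gz")
  if (!file.exists(archive)) return(NULL)
  # State 2: -O streams the single member to stdout; no file is extracted.
  out <- system2(
    "tar",
    c("-xzOf", shQuote(archive), shQuote("raw.owner/record.json")),
    stdout = TRUE, stderr = FALSE
  )
  paste(out, collapse = "\n")
}
```

The returned string can then be parsed with jsonlite::fromJSON(); a NULL result signals that neither state was found for that station.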

Usage

buildRawFileTable(path.records, path.index, owners = NULL)

Arguments

path.records: Absolute path to the records root. Required – no default.
path.index: Absolute path to the index root where per-owner CSVs are written. Required – no default.
owners: Character vector of OwnerIDs. NULL (the default) scans all owners.

Value

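The per-owner row counts, returned invisibly. The function is called mainly for its side effect of writing RawFileTable.<Owner>.csv under path.index.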

Details

Schema:

OwnerID, EventID, StationID, ComponentID, FileID,
  NP, dt, Fs, Units, HP, LP, isArray

PGA is intentionally not emitted here. Pre-parse PGAs from record.json are in heterogeneous provider units; canonical post-parse PGAs (in mm/s^2) live in RawIntensityTable.<Owner>.csv. Canonical-schema fields that a provider record does not supply are emitted as typed NA columns, so the output schema stays the same for every owner.
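
The typed-NA rule can be illustrated with a short sketch. Both schema and fillSchema are hypothetical names, and the column types chosen here are assumptions for illustration; the package's actual types may differ.

```r
# Assumed canonical types for the provider fields (illustrative only).
schema <- list(
  ComponentID = NA_character_, FileID = NA_character_,
  NP = NA_integer_, dt = NA_real_, Fs = NA_real_,
  Units = NA_character_, HP = NA_real_, LP = NA_real_
)

# Hypothetical helper: turn one provider file entry (a named list) into a
# one-row data.frame that has every schema column, each with its schema type.
fillSchema <- function(entry, schema) {
  row <- lapply(names(schema), function(nm) {
    value <- entry[[nm]]
    # Absent fields fall back to the typed NA template.
    if (is.null(value) || length(value) == 0L) return(schema[[nm]])
    # Present fields are coerced to the template's storage type.
    storage.mode(value) <- typeof(schema[[nm]])
    value
  })
  names(row) <- names(schema)
  as.data.frame(row, stringsAsFactors = FALSE)
}

# A provider entry that lacks Fs, Units, HP and LP still yields all columns.
partial <- list(ComponentID = "H1", FileID = "H1.txt", NP = 4, dt = 0.01)
str(fillSchema(partial, schema))
```

Because every row carries the same columns with the same types, rows from different providers can be stacked without per-owner special cases.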

isArray is TRUE when nComponentID > 3, that is, when more than three ComponentIDs are present (a heuristic kept from the legacy convention).
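
As a rough sketch of this heuristic, assuming nComponentID is the number of distinct ComponentIDs counted per station (the grouping is an assumption, and the toy data below is invented for illustration):

```r
# Toy rows for two stations: S1 has three components, S2 has four.
rows <- data.frame(
  StationID   = c("S1", "S1", "S1", "S2", "S2", "S2", "S2"),
  ComponentID = c("H1", "H2", "UP", "C1", "C2", "C3", "C4"),
  stringsAsFactors = FALSE
)

# Count distinct ComponentIDs per station (named integer vector).
nComponentID <- vapply(
  split(rows$ComponentID, rows$StationID),
  function(x) length(unique(x)),
  integer(1)
)

# Broadcast the per-station count back to the rows and apply the threshold.
rows$isArray <- unname(nComponentID[rows$StationID] > 3)
rows
```

Under this reading, a conventional three-component station (two horizontals and one vertical) is never flagged, while any station with a fourth component is treated as an array.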

Examples

root <- file.path(tempdir(), "gmsp-raw-file-example")
index <- file.path(tempdir(), "gmsp-raw-file-index")
unlink(c(root, index), recursive = TRUE)
dir.create(file.path(root, "AAA", "E1", "S1", "raw.owner"),
           recursive = TRUE)
dir.create(index)
record <- list(
  Event = list(EventID = "E1"),
  Station = list(StationID = "S1"),
  Record = list(
    list(ComponentID = "H1", FileID = "H1.txt", NP = 4, dt = 0.01,
         Fs = 100, Units = "cm", HP = NA, LP = NA),
    list(ComponentID = "H2", FileID = "H2.txt", NP = 4, dt = 0.01,
         Fs = 100, Units = "cm", HP = NA, LP = NA),
    list(ComponentID = "UP", FileID = "UP.txt", NP = 4, dt = 0.01,
         Fs = 100, Units = "cm", HP = NA, LP = NA)
  )
)
jsonlite::write_json(
  record,
  file.path(root, "AAA", "E1", "S1", "raw.owner", "record.json"),
  auto_unbox = TRUE
)
suppressMessages(buildRawFileTable(root, index, owners = "AAA"))
data.table::fread(file.path(index, "RawFileTable.AAA.csv"))
#>    OwnerID EventID StationID ComponentID FileID    NP    dt    Fs  Units     HP
#>     <char>  <char>    <char>      <char> <char> <int> <num> <int> <char> <lgcl>
#> 1:     AAA      E1        S1          H1 H1.txt     4  0.01   100     cm     NA
#> 2:     AAA      E1        S1          H2 H2.txt     4  0.01   100     cm     NA
#> 3:     AAA      E1        S1          UP UP.txt     4  0.01   100     cm     NA
#>        LP isArray
#>    <lgcl>  <lgcl>
#> 1:     NA   FALSE
#> 2:     NA   FALSE
#> 3:     NA   FALSE