Show HN: Parqeye – A CLI tool to visualize and inspect Parquet files

(github.com)

61 points | by kaushiksrini 5 hours ago

11 comments

bigshik 53 minutes ago
Nice work—this hits a real pain point with Parquet. My main use case is debugging partitioned datasets on S3 with schema drift and skew, where I care about: which files/partitions have schema mismatches, weird row-group stats (all-null, out-of-range, huge skew), and doing that via metadata only.
Right now parqeye looks mainly single-file focused. Do you have plans for a “dataset mode” that takes a dir/S3 prefix and surfaces per-file/row-group summaries (row counts, min/max, null %, schema diffs vs a reference file) using just Parquet stats so it scales to tens of GB? Or do you see parqeye intentionally staying a single-file inspector?
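For what it's worth, the "schema diffs vs a reference file" part is cheap once the schemas are in hand. Here is a minimal sketch, not anything parqeye does today: it assumes each file's schema has already been read from the Parquet footer into a plain column-name-to-type mapping (for example via something like `pyarrow.parquet.read_schema`), and the function names and file paths are made up for illustration.

```python
from typing import Dict, List

# Hypothetical representation: column name -> type name, as read from a file footer.
Schema = Dict[str, str]


def diff_schema(reference: Schema, candidate: Schema) -> List[str]:
    """List the ways `candidate` differs from `reference`."""
    problems = []
    for name, ref_type in reference.items():
        if name not in candidate:
            problems.append(f"missing column: {name}")
        elif candidate[name] != ref_type:
            problems.append(f"type mismatch: {name} ({ref_type} -> {candidate[name]})")
    for name in candidate:
        if name not in reference:
            problems.append(f"extra column: {name}")
    return problems


def drifted_files(reference: Schema, schemas_by_file: Dict[str, Schema]) -> Dict[str, List[str]]:
    """Return only the files whose schema differs from the reference."""
    report = {}
    for path, schema in schemas_by_file.items():
        problems = diff_schema(reference, schema)
        if problems:
            report[path] = problems
    return report


if __name__ == "__main__":
    # Made-up example data; real schemas would come from the file footers.
    reference = {"id": "int64", "ts": "timestamp[us]"}
    schemas = {
        "part-0.parquet": {"id": "int64", "ts": "timestamp[us]"},
        "part-1.parquet": {"id": "string", "extra": "double"},
    }
    for path, problems in drifted_files(reference, schemas).items():
        print(path, problems)
```

Since only footers would be read to build the mappings, the cost scales with the number of files rather than the data size, which is the property I'm after.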
kylebarron 1 hour ago
Looks great!
Another project that looks very similar was released in the last few days: https://github.com/raulcd/datanomy
papers1010 3 hours ago
It’s crazy how long we’ve gone without a tool like this. This is huge. Thank you for finally building this!
- 0cf8612b2e1e 2 hours ago
  It is really incredible how poor the parquet tooling has been for years. The cornerstone of data engineering, yet just inspecting a file is needlessly clunky.
lolive 3 hours ago
Can DuckDB be included in the tool, so you can run queries directly from the UI? [that would avoid opening DBeaver whenever you need that kind of feature]
- lolive 3 hours ago
  Hu huuum... https://harlequin.sh/
  - mrasong 2 hours ago
    This tool actually feels pretty solid too.
jspanos2 1 hour ago
This is very impressive. Look forward to using this
banga 3 hours ago
Looks like a nice tool, but it failed for me when reading a GeoParquet file created using DuckDB.
swety101 48 minutes ago
Such a cool idea!! So helpful
dionian 25 minutes ago
tried it out. love it.
lolive 3 hours ago
Apart from some visual glitches, this is an INSTANT BUY!
Note: must the Windows binary really be 78 MB?
- ch2026 1 hour ago
  CLIs are bulky
WorldPeas 5 hours ago
thank you so much! this was an annoyance of mine for so long. edit: any chance you make a brew package? if you'd like I'd be happy to PR it in.
- kaushiksrini 5 hours ago
  yep! it’s available as a homebrew tap — you can install it with: `brew install kaushiksrini/parqeye/parqeye`
  - dacox 32 minutes ago
    awesome! i was just looking at a bucket full of parquet files from last year trying to recall some things about them.
    i tried to install with brew, but it told me my cli tools were "too out of date". Never seen that before! and also just upgraded.
    Will try again tomorrow
  - WorldPeas 5 hours ago
    wondrous.