  1. Inspect Parquet from command line - Stack Overflow

How do I inspect the content of a Parquet file from the command line? The only option I see now is $ hadoop fs -get my-path local-file followed by $ parquet-tools head local-file | less. I would like to avoid …

  2. How can I write a parquet file using Spark (pyspark)?

I'm pretty new to Spark and I've been trying to convert a DataFrame to a Parquet file, but I haven't had success yet. The documentation says that I can use the write.parquet function to …

  3. Unable to infer schema when loading Parquet file

The documentation for Parquet says the format is self-describing, and the full schema was available when the Parquet file was saved. What gives? Using Spark 2.1.1. Also fails in 2.2.0. …

  4. How to read a Parquet file into Pandas DataFrame?

How do I read a modestly sized Parquet dataset into an in-memory Pandas DataFrame without setting up cluster computing infrastructure such as Hadoop or Spark? This is only a …

  5. Using pyarrow how do you append to parquet file? - Stack Overflow

Generally speaking, Parquet datasets consist of multiple files, so you append by writing an additional file into the directory where the data lives. It would be useful to have the …

  6. What are the pros and cons of the Apache Parquet format …

    Apr 24, 2016 · Parquet has gained significant traction outside of the Hadoop ecosystem. For example, the Delta Lake project is being built on Parquet files. Arrow is an important project …

  7. indexing - Index in Parquet - Stack Overflow

Basically, Parquet has added two new structures to the Parquet layout: the Column Index and the Offset Index. Below is a more detailed technical explanation of what they solve and how. Problem …

  8. How to view Apache Parquet file in Windows? - Stack Overflow

Jun 19, 2018 · What is Apache Parquet? Apache Parquet is a binary file format that stores data in a columnar fashion. Data inside a Parquet file is similar to an RDBMS-style table where …

  9. How to handle null values when writing to parquet from Spark

Asked 7 years, 7 months ago; modified 4 years, 1 month ago; viewed 81k times.

  10. Methods for writing Parquet files using Python? - Stack Overflow

    Oct 5, 2015 · I'm having trouble finding a library that allows Parquet files to be written using Python. Bonus points if I can use Snappy or a similar compression mechanism in conjunction …