How to show full column content in a Spark Dataframe?

Apache Spark, Dataframe, Spark Csv, Output Formatting

Apache Spark Problem Overview


I am using spark-csv to load data into a DataFrame. I want to do a simple query and display the content:

val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("my.csv")
df.registerTempTable("tasks")
val results = sqlContext.sql("select col from tasks")
results.show()

The col seems truncated:

scala> results.show();
+--------------------+
|                 col|
+--------------------+
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-06 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:21:...|
|2015-11-16 07:21:...|
|2015-11-16 07:21:...|
+--------------------+

How do I show the full content of the column?

Apache Spark Solutions


Solution 1 - Apache Spark

results.show(20, false) will not truncate. Check the source

20 is the default number of rows displayed when show() is called without any arguments.
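
To see why the cells in the question end in "...", here is a minimal pure-Python sketch of the truncation rule that show() appears to apply by default: a cell longer than 20 characters is cut to its first 17 characters plus "...". The function name truncate_cell and the sample timestamp are illustrative assumptions, not part of Spark's API or of the question's data.

```python
def truncate_cell(value: str, width: int = 20) -> str:
    """Mimic how show() shortens a cell when truncation is on.

    width <= 0 means no truncation (what passing false amounts to).
    """
    if width <= 0 or len(value) <= width:
        return value
    if width < 4:
        # Too narrow for an ellipsis: just cut the string.
        return value[:width]
    return value[: width - 3] + "..."


# Hypothetical full timestamp; the question only shows the truncated form.
sample = "2015-11-16 07:15:30.123"
print(truncate_cell(sample))           # truncated, as in results.show()
print(truncate_cell(sample, width=0))  # full value, as in results.show(20, false)
```

The second call corresponds to passing false: with no width limit the cell is printed as-is.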

Solution 2 - Apache Spark

If you call results.show(false), the results will not be truncated.

Solution 3 - Apache Spark

The code below (PySpark) shows all rows without truncating any column:

df.show(df.count(), False)

Solution 4 - Apache Spark

The other solutions are good. If these are your goals:

  1. No truncation of columns,
  2. No loss of rows,
  3. Fast and
  4. Efficient

These two lines are useful ...

    df.persist
    df.show(df.count.toInt, false) // Scala; in Python: df.show(df.count(), False)

With the DataFrame persisted, the two actions, count and show, run faster and more efficiently, because persist (or cache) keeps the interim DataFrame in the executors instead of recomputing it for each action. See more about persist and cache: https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-rdd-caching.html
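
For PySpark, the same persist-count-show pattern can be wrapped in a small helper. This is only a sketch: show_all is a hypothetical name, df is assumed to be a pyspark.sql.DataFrame small enough to print in full, and the unpersist() call at the end is an addition that the answer above does not mention.

```python
def show_all(df):
    """Print every row of a PySpark DataFrame without truncating columns.

    show_all is a hypothetical helper name; df is assumed to be a
    pyspark.sql.DataFrame small enough to print in full.
    """
    df.persist()  # keep the computed rows around for both actions
    try:
        row_count = df.count()              # first action: fills the cache
        df.show(row_count, truncate=False)  # second action: reads the cache
    finally:
        df.unpersist()  # release the cached data once printed
    return row_count
```

Calling show_all(results) would print the whole result set; for large DataFrames, printing every row is rarely what you want, so a fixed row count is usually the safer choice.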

Solution 5 - Apache Spark

Use results.show(20, False) in Python, or results.show(20, false) in Java/Scala.

Solution 6 - Apache Spark

In Pyspark we can use

df.show(truncate=False) displays the full content of the columns without truncation.

df.show(5, truncate=False) displays the full content of the first five rows.

Solution 7 - Apache Spark

In C#, setting Option("truncate", false) on the console sink keeps the output from being truncated.

StreamingQuery query = spark
                    .Sql("SELECT * FROM Messages")
                    .WriteStream()
                    .OutputMode("append")
                    .Format("console")
                    .Option("truncate", false)
                    .Start();

Solution 8 - Apache Spark

The following answer applies to a Spark Structured Streaming application.

By setting the "truncate" option to false, you can tell the output sink to display the full column.

val query = out.writeStream
          .outputMode(OutputMode.Update())
          .format("console")
          .option("truncate", false)
          .trigger(Trigger.ProcessingTime("5 seconds"))
          .start()

Solution 9 - Apache Spark

results.show(false) will show you the full column content.

By default, show displays 20 rows; passing a number before false changes how many rows are shown.

Solution 10 - Apache Spark

results.show(20,false) did the trick for me in Scala.

Solution 11 - Apache Spark

Within Databricks you can visualize the dataframe in a tabular format. With the command:

display(results)

Solution 12 - Apache Spark

Try df.show(20,False)

Note that 20 is also the default number of rows, so omitting the count still shows 20 rows. How much of the DataFrame Spark has to compute depends on the query plan, not on whether the count is written out.

Solution 13 - Apache Spark

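Try this command (PySpark) to show all rows. Columns are still truncated unless you also pass False as the second argument:

df.show(df.count(), False)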

Solution 14 - Apache Spark

This worked for me in PySpark:

df.show(truncate=0)
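
truncate=0 works because, as far as I can tell from PySpark's show(), the argument is reduced to a column width before it is handed to the JVM: True becomes 20, while False and 0 both become 0, which means no truncation. A rough sketch of that mapping follows; effective_width is a made-up name for illustration, not a PySpark function.

```python
def effective_width(truncate=True) -> int:
    """Return the cell width implied by a show(truncate=...) argument."""
    if isinstance(truncate, bool) and truncate:
        return 20         # default truncation width
    return int(truncate)  # False -> 0, 0 -> 0, 30 -> 30


for arg in (True, False, 0, 30):
    print(repr(arg), "->", effective_width(arg))
```

Under this reading, df.show(truncate=30) would cut cells at 30 characters rather than 20.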

Solution 15 - Apache Spark

I use this Chrome extension, which works pretty well:

https://userstyles.org/styles/157357/jupyter-notebook-wide

Solution 16 - Apache Spark

Try this in scala:

df.show(df.count.toInt, false)

The show method accepts an Int and a Boolean, but df.count returns a Long, so a cast is required.

Solution 17 - Apache Spark

PYSPARK

In the code below, df is the name of the DataFrame. The first parameter shows all rows in the DataFrame dynamically rather than hardcoding a numeric value. The second parameter, set to False, displays the full column contents.

df.show(df.count(),False)

SCALA

In the code below, df is the name of the DataFrame. The first parameter shows all rows in the DataFrame dynamically rather than hardcoding a numeric value. The second parameter, set to false, displays the full column contents.

df.show(df.count().toInt,false)

Solution 18 - Apache Spark

In PySpark, remember:

  • to display data from a DataFrame, use the show(truncate=False) method.
  • to display data from a streaming DataFrame (Structured Streaming), use writeStream.format("console").option("truncate", False).start().

Hope this helps someone.
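
The streaming case can be sketched as a small PySpark helper. The name start_console_query is hypothetical, stream_df is assumed to be a streaming DataFrame (for example one created through spark.readStream), and the "append" output mode is borrowed from the C# answer above rather than required.

```python
def start_console_query(stream_df):
    """Start a console sink that prints full column values.

    start_console_query is a hypothetical helper; stream_df is assumed to be
    a streaming pyspark.sql.DataFrame (for example from spark.readStream).
    """
    return (
        stream_df.writeStream
        .outputMode("append")
        .format("console")
        .option("truncate", False)  # same switch as show(truncate=False)
        .start()
    )
```

The returned object is the running query, so a caller would typically follow up with query.awaitTermination() to keep the application alive.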

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | tracer | View Question on Stackoverflow
Solution 1 - Apache Spark | TomTom101 | View Answer on Stackoverflow
Solution 2 - Apache Spark | Narendra Parmar | View Answer on Stackoverflow
Solution 3 - Apache Spark | MoeChen | View Answer on Stackoverflow
Solution 4 - Apache Spark | codeaperature | View Answer on Stackoverflow
Solution 5 - Apache Spark | Deepak Babu P R | View Answer on Stackoverflow
Solution 6 - Apache Spark | RaHuL VeNuGoPaL | View Answer on Stackoverflow
Solution 7 - Apache Spark | Baglay Vyacheslav | View Answer on Stackoverflow
Solution 8 - Apache Spark | farrellw | View Answer on Stackoverflow
Solution 9 - Apache Spark | Chetan Tamballa | View Answer on Stackoverflow
Solution 10 - Apache Spark | SKA | View Answer on Stackoverflow
Solution 11 - Apache Spark | Ignacio Alorre | View Answer on Stackoverflow
Solution 12 - Apache Spark | Djihane AKROUM | View Answer on Stackoverflow
Solution 13 - Apache Spark | epic_last_song | View Answer on Stackoverflow
Solution 14 - Apache Spark | onemanarmy | View Answer on Stackoverflow
Solution 15 - Apache Spark | KeepLearning | View Answer on Stackoverflow
Solution 16 - Apache Spark | Pritesh Kumar | View Answer on Stackoverflow
Solution 17 - Apache Spark | Sarath KS | View Answer on Stackoverflow
Solution 18 - Apache Spark | ngenne | View Answer on Stackoverflow