
I have a DataFrame in Spark and I want to count the number of columns in it. I know how to count the number of rows in a column, but I want to count the number of columns.

// toDF requires the SparkSession implicits to be in scope
import spark.implicits._

val df1 = Seq(
    ("spark", "scala", "2015-10-14", 10, "rahul"),
    ("spark", "scala", "2015-10-15", 11, "abhishek"),
    ("spark", "scala", "2015-10-16", 12, "Jay"),
    ("spark", "scala", null, 13, "Kiran"))
  .toDF("bu_name", "client_name", "date", "patient_id", "patient_name")
df1.show()

Can anybody tell me how to get the column count of this DataFrame? I am using the Scala language.

6 Answers


To count the number of columns, simply do:

df1.columns.size
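
With the df1 from the question this returns 5. columns is just an Array[String] of the column names, so the call only inspects the schema and triggers no Spark job. A quick sketch of what you would see in the spark-shell:

df1.columns        // Array(bu_name, client_name, date, patient_id, patient_name)
df1.columns.size   // res1: Int = 5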

2 Comments

Or df1.columns.length, which seems to be faster
@JohannGoulley: They should be equivalent: stackoverflow.com/questions/22966705/…

In Python, the following code worked for me:

print(len(df.columns))



df1.columns accesses the list of column names. All you have to do is count the number of items in that list, so

len(df1.columns)

works. To obtain the overall shape (rows and columns) in a single variable, we can do:

rows = df1.count()
columns = len(df1.columns)
size = (rows, columns)
print(size)
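
For the Scala API, an equivalent sketch (assuming the df1 from the question):

val rows = df1.count()            // row count: runs a Spark job over the data
val columns = df1.columns.length  // column count: reads the schema only
val size = (rows, columns)
println(size)                     // (4,5) for the example DataFrame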



The length of the mutable indexed sequence returned by df.columns also works:

df.columns.length



To count the columns of a Spark DataFrame:

len(df1.columns)

and to count the number of rows of a DataFrame:

df1.count()



In PySpark you can just use result.select("your column").count(), but be aware that this counts the rows of the selected column, not the number of columns.
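
To make the distinction concrete, a small Scala sketch (assuming the df1 from the question):

df1.columns.size              // 5: column count, schema only, no job runs
df1.count()                   // 4: row count, runs a Spark job
df1.select("date").count()    // 4: still a row count, just over one column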

