Using Spark DataFrames for large-scale data science

When we first open sourced Spark, we aimed to provide a simple API for distributed data processing in general-purpose programming languages (Java, Python, Scala). Spark enabled distributed data processing through functional transformations on distributed collections of data (RDDs). This was an incredibly powerful API: tasks that used to take thousands of lines of code to express could be reduced to dozens.
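
To make the RDD model concrete, here is a minimal Scala sketch of the functional-transformation style described above. The dataset and word-count logic are illustrative assumptions, not taken from the original post; the sketch assumes a local Spark installation.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddSketch {
  def main(args: Array[String]): Unit = {
    // Assumed local configuration; app name and master are placeholders.
    val conf = new SparkConf().setAppName("rdd-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // An RDD is a distributed collection; transformations are expressed
    // functionally and evaluated lazily until an action (collect) is called.
    val words = sc.parallelize(Seq("spark", "rdd", "dataframe", "spark"))
    val counts = words
      .map(w => (w, 1))        // pair each word with a count of 1
      .reduceByKey(_ + _)      // sum counts per distinct word

    counts.collect().foreach(println)
    sc.stop()
  }
}
```

The same word count that would take substantial boilerplate in a hand-written MapReduce job fits in a couple of transformations here, which is the reduction in code size the paragraph refers to.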
