
Notebook Workflows: The Easiest Way to Implement Apache Spark Pipelines. Today we are excited to announce Notebook Workflows in Databricks. Notebook Workflows is a set of APIs that allow users to chain notebooks together using the standard control structures of the source programming language (Python, Scala, or R) to build production pipelines.
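The chaining pattern the announcement describes can be sketched in plain Python. Here `run_notebook` is a hypothetical stand-in for Databricks' `dbutils.notebook.run` (which takes a notebook path, a timeout, and a parameter map and returns the called notebook's exit value); the notebook paths and parameters are illustrative, not from the source.

```python
def run_notebook(path, timeout_seconds, params):
    # Hypothetical stub for dbutils.notebook.run: in Databricks this would
    # execute the target notebook and return the value it passed to
    # dbutils.notebook.exit().
    print(f"running {path} with {params}")
    return "OK"

def pipeline(date):
    # Chain notebooks together using ordinary Python control flow.
    status = run_notebook("/pipelines/ingest", 3600, {"date": date})
    if status != "OK":
        raise RuntimeError(f"ingest step failed: {status}")

    # Standard control structures apply: retry the transform step once.
    for attempt in range(2):
        status = run_notebook("/pipelines/transform", 3600, {"date": date})
        if status == "OK":
            break
    return status

pipeline("2016-08-30")
```

In a real Databricks workspace the stub would be replaced by `dbutils.notebook.run`, and each called notebook would signal success or failure via `dbutils.notebook.exit`.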


Scala vs. Python for Apache Spark, from @dezyreonline


Cloud Machine Learning Platforms vs. Apache Spark Solutions - Hadoop360

Predicting Airbnb Listing Prices with Scikit-Learn and Apache Spark | MapR

Comprehensive Introduction to Apache Spark RDDs & DataFrames (using PySpark). Industry estimates suggest that we are creating more than 2.5 quintillion bytes of data every year.

7 Steps to Mastering Apache Spark 2.0. Not a week goes by without a mention of Apache Spark in a blog, news article, or webinar on Spark's impact in the big data landscape.