Get your hands dirty with distributed tools. During these two hours we’ll take a quick look at how a dataset can be processed in a distributed way, right up to being exposed as a web service. The tools we’ll use for this are Spark, Cassandra, Akka HTTP and the Spark Notebook. Basic (conceptual) knowledge of these tools is not required, but it is welcome. Your takeaway from this workshop will be a Docker image that lets you replay the whole thing at home or at work (don’t forget the sunglasses to add even more to the cool effect). Oh! And you’ll also have a better idea of why and how these tools can be chained together for general-purpose, yet data-oriented, work.
Setup instructions can be found in this PDF
Please download this PDF now as the link will expire after the conference.
Note that this workshop requires a lengthy setup process.
Workshop: Mind blown: Crafting a Distributed Data Science Pipeline using Spark, Cassandra, Akka and the Spark Notebook
Andy is a mathematician-turned-distributed computing entrepreneur. Besides running Skills Matter's Spark (and other) courses, Andy has also participated in many projects using Spark, Cassandra, and other distributed technologies, in a range of fields including geospatial, IoT, automotive and smart cities. Andy is the creator of the Spark Notebook, the only reactive and fully Scala notebook for Apache Spark.
Xavier started his career as a researcher in Experimental Physics, where he also focused on data processing. Further down the road, he took part in projects in finance, genomics, and software development for academic research. During that time, he worked on time series and on the prediction of biological molecular structures and interactions, and applied Machine Learning methodologies. He developed solutions to manage and process data distributed across data centres. He founded and now works at Data Fellas, a company dedicated to distributed computing and advanced analytics, leveraging Scala, Spark, and other distributed technologies.