Lynn Langit shares lessons learned and cloud data pipeline patterns via examples from work she’s doing with CSIRO Bioinformatics Australia. The team there, led by Dr. Denis Bauer, is analyzing a number of large genomic datasets.
First, Lynn examines real-time analysis with cloud-based solutions. Keeping runtime constant is challenging for problems that vary in complexity, such as genome engineering. The CSIRO GT-Scan2 tool addresses this by recruiting additional AWS Lambda functions on demand as the complexity increases. It is built on AWS services using a serverless microservices pattern.
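The idea of recruiting more functions as complexity grows can be illustrated with a minimal local sketch. This is not GT-Scan2's code: the names `plan_batches`, `score_batch` and `fan_out` are hypothetical, the GC-fraction score is a placeholder for real target scoring, and a thread pool stands in for Lambda invocations. In an actual AWS deployment, the `invoke` callable would presumably wrap something like boto3's Lambda client `invoke` call instead.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_batches(targets, batch_size):
    """Split the work into fixed-size batches; more targets means more batches."""
    return [targets[i:i + batch_size] for i in range(0, len(targets), batch_size)]


def score_batch(batch):
    """Hypothetical stand-in for one Lambda invocation.

    Scores each target sequence by its GC fraction (a placeholder metric,
    not the scoring GT-Scan2 actually uses).
    """
    return {t: sum(c in "GC" for c in t) / len(t) for t in batch}


def fan_out(targets, batch_size=2, invoke=score_batch):
    """Run one worker per batch so wall-clock time stays roughly flat as input grows.

    Returns the merged scores and the number of workers that were used.
    """
    batches = plan_batches(targets, batch_size)
    if not batches:
        return {}, 0
    results = {}
    # One worker per batch: the pool size scales with the size of the problem.
    with ThreadPoolExecutor(max_workers=len(batches)) as pool:
        for partial in pool.map(invoke, batches):
            results.update(partial)
    return results, len(batches)
```

Because the number of workers is derived from the number of batches, a larger query simply fans out wider rather than running longer, which is the property the abstract attributes to the serverless design.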
Next, Lynn demos a Jupyter notebook that shows how genomic research can use Apache Spark to massively parallelize the generation of random forests and identify disease genes efficiently. She discusses the pipeline’s use of VariantSpark, an open-source library written by the team at CSIRO.
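Random forests parallelize well because each tree is trained independently on its own bootstrap sample and feature subset. The sketch below illustrates that idea with the Python standard library only. It does not use Spark or the VariantSpark API: a thread pool stands in for Spark executors, each "tree" is a single-split stump, features are assumed to be 0/1 variant calls, and all function names are hypothetical.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_errors(rows, labels, feature):
    """Misclassifications when each side of a 0/1 split predicts its majority label."""
    errors = 0
    for value in (0, 1):
        side = [labels[i] for i in range(len(rows)) if rows[i][feature] == value]
        if side:
            errors += len(side) - max(Counter(side).values())
    return errors


def train_stump(job):
    """Train one single-split tree; return the index of the feature it chose."""
    rows, labels, mtry, seed = job
    rng = random.Random(seed)  # per-tree seed keeps results reproducible across threads
    n = len(rows)
    sample = [rng.randrange(n) for _ in range(n)]  # bootstrap: sample rows with replacement
    boot_rows = [rows[i] for i in sample]
    boot_labels = [labels[i] for i in sample]
    # Each tree only considers a random subset of mtry features.
    candidates = sorted(rng.sample(range(len(rows[0])), mtry))
    return min(candidates, key=lambda f: split_errors(boot_rows, boot_labels, f))


def rank_features(rows, labels, n_trees=100, mtry=3, workers=4):
    """Train stumps in parallel and rank features by how often they were chosen."""
    jobs = [(rows, labels, mtry, seed) for seed in range(n_trees)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        votes = Counter(pool.map(train_stump, jobs))
    return [feature for feature, _ in votes.most_common()]
```

Calling `rank_features(rows, labels)` on a case/control dataset returns feature indices ordered by how many trees selected them, a crude proxy for variable importance. Since no tree depends on another, the same structure maps naturally onto a cluster, where each executor would train its share of the forest.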
VariantSpark can analyze 3,000 samples with 80 million features in under 30 minutes. This pipeline enables real-time diagnosis by finding similar patients. This platform is contributing to motor neuron disease research (publicized by the Ice Bucket Challenge) in Australia.
Building Genomics Pipelines with AWS Lambda and Apache Spark
Lynn Langit
Lynn Langit is an independent cloud architect and developer who works on genomic-scale cloud pipelines. She is also an author for LinkedIn Learning, where she has created 25 courses on cloud topics. For her technical education work, she has been recognized as an AWS Community Hero, Google Cloud Developer Expert and Microsoft Regional Director.