Gain the basic tools and knowledge you need to solve real problems and understand the most recent and advanced NLP topics in this workshop with Leonardo De Marchi.
Extracting knowledge from text data has always been one of the most researched topics in machine learning, but only recently have we witnessed breakthroughs that put NLP in the spotlight. Much information is stored in unstructured data, like text, which is extremely important in many different fields, from finance to social media and e-commerce.
In this course, you will cover natural language processing fundamentals, such as preprocessing techniques and embeddings. Each topic is followed by practical coding examples in Python that show how to apply the theory to real use cases.
Learn how to:
- Understand NLP basics
- Create an NLP pipeline to preprocess the data using Python
- Perform topic modelling
- Use Python libraries for NLP tasks, in particular NLTK, Gensim and GloVe
- Leverage transfer learning and text embeddings to perform NLP classification
Starts at 10:00 AM BST (10:00 AM UTC)
Our team is happy to discuss other options with you.
Contact us at email@example.com and mention ref:
Private tuition and large-group discounts are also available. Find out more here.
Who should take this workshop?
This course is designed for data scientists, data analysts and software engineers who want to start working with NLP without treating it like a black box.
Python is used in all exercises, so some Python knowledge is required. Machine learning knowledge is beneficial but not required.
This workshop will include the following lessons:
- Theory: Familiarize yourself with NLP fundamentals and text preprocessing to prepare the data for our models. We will go through the main steps, such as removing stopwords, stemming, one-hot encoding, and more.
- Exercise: Apply text preprocessing methods on a simple dataset.
- Outcome: You will be able to apply the appropriate methodology to preprocess text.
- Theory: We will see what LDA (Latent Dirichlet Allocation) is and how it can help extract information from documents. We will also try different clustering techniques and implement non-negative matrix factorization (NMF).
- Exercise: Apply topic modelling techniques to a simple text.
- Outcome: You will be able to extract the main information from documents using topic modelling techniques.
- Theory: We will learn how text can be represented numerically and how a classifier can use this representation. We will use TF-IDF (term frequency-inverse document frequency) and experiment with a couple of supervised learning models.
- Exercise: Build an NLP pipeline to perform classification.
- Outcome: You will be able to solve a text classification problem end to end.
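As a rough illustration of the preprocessing and text representation lessons above, the sketch below chains stopword removal, stemming, one-hot encoding and TF-IDF weighting in plain Python. It is not the workshop's own material: the stopword list, the `naive_stem` suffix stripper and the sample sentences are assumptions made for this example, and in practice a library such as NLTK would supply a fuller stopword list and a proper stemmer.

```python
import math
from collections import Counter

# Tiny illustrative stopword list (an assumption; NLTK ships a fuller one).
STOPWORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in", "on"}


def naive_stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stopwords, then stem."""
    tokens = [t for t in text.lower().split() if t.isalpha()]
    return [naive_stem(t) for t in tokens if t not in STOPWORDS]


def one_hot(term, vocabulary):
    """Return a vector with a single 1 at the term's position in the vocabulary."""
    return [1 if v == term else 0 for v in vocabulary]


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Made-up sample documents, for illustration only.
docs = [
    "the cats are sleeping on the sofa",
    "the dogs are barking in the garden",
    "cats and dogs are playing",
]
tokens = [preprocess(d) for d in docs]
vocabulary = sorted({t for doc in tokens for t in doc})
weights = tf_idf(docs)
```

In this toy corpus, a term that appears in only one document (such as `sleep`) receives a higher weight than a term shared by two documents (such as `cat`), which is the property a downstream classifier relies on.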
Introduction to Deep Learning in NLP
- Theory: Understand word embeddings, how they work and how to use them. We will go through the main concepts behind word embeddings and see some practical examples using the Gensim library.
- Exercise: Use Python deep learning libraries to create an NLP pipeline for sentiment analysis.
- Outcome: You will be able to use word embeddings to perform text classification tasks.
- We will briefly introduce the most recent developments of deep learning in NLP. In particular, we will see how to leverage BERT and ELMo and their pre-trained models to solve NLP problems.
- Outcome: You will be able to understand the theory behind sequential models and apply it to practical problems.
- In this part, we will see some techniques to generate and summarize text. We will explore methods like variational autoencoders (VAEs) and generative adversarial networks (GANs) and, of course, the most famous of them all, the GPT models from OpenAI. We will also see how to generate text using Python.
- Outcome: You will have a good understanding of generative models and how to use them to generate text in Python.
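To give a feel for the word-embedding lesson in this section, here is a minimal sketch of how similarity between embedded words is commonly measured with cosine similarity. The three-dimensional vectors are hypothetical values invented for this example; real embeddings (for instance, ones loaded through Gensim) typically have hundreds of dimensions and are learned from data.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration only.
embeddings = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, table):
    """Return the other word in the table whose vector is closest to `word`."""
    return max(
        (w for w in table if w != word),
        key=lambda w: cosine_similarity(table[word], table[w]),
    )


nearest_to_king = most_similar("king", embeddings)
```

With these made-up vectors, `king` lands closer to `queen` than to `apple`, which mirrors the idea that words used in similar contexts end up with similar vectors.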