The project started with my professor and me trying to build a database by web-scraping a website, motleyfool.com. The site carries all sorts of company-specific news, including transcripts of quarterly earnings calls. Our objective was to scrape as many transcripts as we could from the website and build our database from them.
We were able to achieve this with ease using F#, and I would love to share the experience. I used to web scrape with other languages such as Python, but using F# for this task was surprisingly simple and effective. In the end, we scraped close to 20,000 earnings call transcripts, or approximately 1 GB of text data. It is worth mentioning that F#'s asynchronous methods drastically improved the speed at which the transcripts were parsed.
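To give a feel for the asynchronous speed-up mentioned above, here is a minimal sketch of concurrent page downloads in F#. It is not the code from the project: the helper names (`downloadWith`, `tryFetch`, `fetchAll`) and the concurrency limit are assumptions made for illustration, and the actual scraper may be structured differently.

```fsharp
open System.Net.Http

// Hypothetical downloader: fetch one URL as text with a shared HttpClient.
let downloadWith (client: HttpClient) (url: string) : Async<string> =
    client.GetStringAsync(url) |> Async.AwaitTask

// Wrap a download so a failed request yields None instead of aborting the batch.
let tryFetch (download: string -> Async<string>) (url: string) =
    async {
        try
            let! html = download url
            return Some (url, html)
        with _ ->
            return None
    }

// Run all downloads concurrently, with at most maxParallel in flight at once.
// Results come back in the same order as the input URLs, minus the failures.
let fetchAll (maxParallel: int) (download: string -> Async<string>) (urls: string list) =
    let jobs = urls |> List.map (tryFetch download)
    Async.Parallel(jobs, maxParallel)
    |> Async.RunSynchronously
    |> Array.choose id
```

With an `HttpClient` named `client` and a list of transcript URLs named `urls`, a call such as `fetchAll 8 (downloadWith client) urls` would return an array of `(url, html)` pairs. Passing the downloader in as a function keeps the concurrency logic easy to try out without touching the network.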
Now that we had our dataset, we proceeded to explore it using standard NLP classification algorithms such as naive Bayes. To guide us on this journey, we used Mathias Brandewinder's F# textbook, Machine Learning Projects for .NET Developers.
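As a rough illustration of the kind of classifier mentioned above, the following is a small multinomial naive Bayes sketch with Laplace smoothing in F#. It is not taken from the project or from the textbook: the tokenizer, the function names, and the "positive"/"negative" labels used to try it out are all assumptions.

```fsharp
// Split text into lowercase word tokens (a deliberately simple tokenizer).
let tokenize (text: string) =
    text.ToLowerInvariant().Split(
        [| ' '; ','; '.'; '\n' |],
        System.StringSplitOptions.RemoveEmptyEntries)

// Train on (label, text) pairs; returns a function that labels a new text.
let train (samples: (string * string) list) =
    let vocabulary = samples |> Seq.collect (snd >> tokenize) |> Set.ofSeq
    let groups =
        samples
        |> List.groupBy fst
        |> List.map (fun (label, docs) ->
            // Word counts across all documents carrying this label.
            let counts = docs |> Seq.collect (snd >> tokenize) |> Seq.countBy id |> Map.ofSeq
            let total = counts |> Map.toSeq |> Seq.sumBy snd
            let prior = float (List.length docs) / float (List.length samples)
            label, prior, counts, total)
    fun (text: string) ->
        // Words never seen in training carry no information, so drop them.
        let tokens = tokenize text |> Array.filter (fun t -> Set.contains t vocabulary)
        groups
        |> List.maxBy (fun (_, prior, counts, total) ->
            // Summing logs avoids underflow; the +1 is Laplace smoothing.
            let logLikelihood =
                tokens
                |> Array.sumBy (fun t ->
                    let c = counts |> Map.tryFind t |> Option.defaultValue 0
                    log (float (c + 1) / float (total + vocabulary.Count)))
            log prior + logLikelihood)
        |> fun (label, _, _, _) -> label
```

Calling `train` on a list of labelled snippets gives back a classifier function, so `let classify = train samples` followed by `classify "record growth"` returns the label with the highest score for that text.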
https://aexsalomao.github.io/ConferenceCalls/TranscriptParsing
Antonio Salomao
Finance student and F# enthusiast