A large part of the world's knowledge is stored as natural-language text, but using it effectively is still a major challenge. Natural Language Processing (NLP) techniques provide the basis for harnessing this huge amount of data and converting it into a useful source of knowledge for further processing. NLP draws on concepts from computer science, artificial intelligence and formal linguistics to analyse natural language, with the aim of deriving meaningful and useful information from text.
Information Extraction (IE) is the first step of this process. It attempts to make the text's semantic structure explicit by analysing the text and identifying mentions of semantically defined entities and the relationships between them. These relationships can then be recorded in a database, either to search for a particular relationship or to infer additional information from the explicitly stated facts. Moreover, once "basic" data structures such as tokens, events, relationships, and references have been extracted from the text, they can be enriched with new sources of knowledge such as ontologies (ConceptNet 5, WordNet, DBpedia, domain-specific ontologies) or further processed using services like AlchemyAPI.
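As a rough illustration of the idea of extracting relationships and recording them for later search, here is a minimal Python sketch. It is not how GraphAware NLP or StanfordNLP work: the regular expression, the relation names and the helper functions (`extract_triples`, `build_graph`, `find`) are all hypothetical, and an in-memory dictionary stands in for a graph database.

```python
import re
from collections import defaultdict

# Hypothetical toy pattern: matches sentences such as "Alice works at Acme".
# Real IE pipelines rely on parsers and trained models, not a single regex.
PATTERN = re.compile(
    r"(?P<subj>[A-Z][a-z]+) (?P<rel>works at|lives in) (?P<obj>[A-Z][A-Za-z0-9]+)"
)


def extract_triples(text):
    """Return (subject, RELATION, object) triples found by the toy pattern."""
    return [
        (m.group("subj"), m.group("rel").replace(" ", "_").upper(), m.group("obj"))
        for m in PATTERN.finditer(text)
    ]


def build_graph(triples):
    """Store triples as an adjacency list: subject -> [(relation, object), ...]."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
    return graph


def find(graph, relation):
    """Return all (subject, object) pairs connected by the given relation."""
    return sorted(
        (subj, obj)
        for subj, edges in graph.items()
        for rel, obj in edges
        if rel == relation
    )


text = "Alice works at Acme. Bob works at Acme. Alice lives in London."
triples = extract_triples(text)
graph = build_graph(triples)
print(find(graph, "WORKS_AT"))
```

Once the relationships are stored as nodes and edges, searching for a particular relationship becomes a simple traversal, which is the kind of query a graph database is designed to answer.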
This session will highlight Neo4j as a viable tool in an NLP ecosystem, demonstrating that it not only offers a suitable model for representing such complex data but also provides efficient ways of navigating it. Dr. Negro will talk about features that allow the creation of advanced services on top of text analysis: recommendations, trend discovery, and finding influencers. In particular, the GraphAware NLP project will be presented as an example in this direction. It is an open source Neo4j plugin that integrates NLP processing capabilities (provided by StanfordNLP and other NLP software) and existing ontology data sources (such as ConceptNet 5 and WordNet), leveraging the power of Neo4j as a backend engine.
Mining and Searching text with Graph Databases
Alessandro is a long-time member of the graph community and the main author of the first-ever recommendation engine based on Neo4j. At GraphAware, he specialises in recommendation engines, graph-aided search, and NLP. He has recently built an application using Neo4j and Elasticsearch aimed at personalising search results, using several machine learning algorithms, natural language processing and an ontology hierarchy. Before joining the team, Alessandro gained over 10 years of experience in software development and spoke at many prominent conferences, such as JavaOne. Alessandro holds a Ph.D. in Computer Science from the University of Salento.