Data Science and Natural Language Processing (NLP) Projects

These projects use Machine Learning, Deep Learning, and Data Science techniques, and each is connected to Natural Language Processing (NLP) in some way. Most of the techniques transfer to other domains, since Machine Learning and Deep Learning methods are not specific to NLP.

I have divided the projects into two groups:

  1. Data Science Projects: built with Python and R. Many of these are related to NLP.
  2. Core NLP Projects: Java-based projects on core NLP topics.

Data Science Projects

| Area | Description | Keywords | Links | Year |
| --- | --- | --- | --- | --- |
| Data Science, Deep Learning, NLP | Deep Learning based spam filter on the Enron + SpamAssassin public datasets, with a comparison against SVM, Random Forest, and XGBoost models | Topics: Deep Learning, NLP, Machine Learning, TF-IDF, Random Forest, XGBoost, SVM. Development: Python, TensorFlow, NumPy, Pandas, Scikit | Report, Code | 2017 |
| Data Science, Deep Learning, NLP | Semantic Role Labeling (SRL) system using Deep Learning: a bidirectional LSTM network in the Paddle framework | Topics: NLP, Deep Learning. Development: Python, Paddle, NumPy, Scikit | Report, Code | 2017 |
| Unsupervised Clustering for text | Implementations of a few clustering algorithms for text, with tests on a subset of a Wikipedia pages dataset: Nearest Neighbour Search; K-means and K-means++; Latent Dirichlet Allocation (mixed membership modeling) for text data | Topics: Machine Learning, K-means, K-means++, Nearest Neighbour Search, LDA. Development: Python, NumPy, GraphLab | Code, Reports | 2017 |
| NLP, Machine Learning | Semantic Role Labeling (SRL) system using Machine Learning: a pipeline of multiple logistic regression models. Also includes trainable models for part-of-speech tagging and dependency parsing | Topics: NLP, Machine Learning. Development: Java, LibLinear, Play Framework, Docker, Web Services | Code, Report | 2016 |
| Deep Learning | Neural network implemented from scratch | Topics: Deep Learning. Development: Python, NumPy, Pandas, Scikit | Code | 2017 |
| Data Science | Projects and tasks related to The Data Scientist’s Toolbox | Topics: Predictive analytics, Reproducible research, Statistical Inference. Development: R | Reports | 2015 |
| Data Science | Compound interest calculator built with R and Shiny, a web application framework for R. A short report is available under Links | Topics: Data Products, R Application. Development: R, Shiny | Demo, Code, Reports | 2015 |
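As an illustration of the arithmetic behind a compound interest calculator like the one in the last row, here is a minimal Python sketch. It is not the code of the R/Shiny application. The function name `compound_interest` and its parameters are hypothetical, and it assumes the standard formula A = P * (1 + r / n) ** (n * t) with a fixed annual rate and no additional deposits.

```python
def compound_interest(principal, annual_rate, years, periods_per_year=12):
    """Return the final balance under periodic compounding.

    Hypothetical helper for illustration only. Uses the standard formula
    A = P * (1 + r / n) ** (n * t), where P is the principal, r the annual
    rate as a fraction, n the compounding periods per year, t the years.
    """
    if periods_per_year <= 0:
        raise ValueError("periods_per_year must be positive")
    growth_per_period = 1 + annual_rate / periods_per_year
    return principal * growth_per_period ** (periods_per_year * years)


if __name__ == "__main__":
    # 1000 at 5% per year, compounded once a year for 10 years.
    balance = compound_interest(1000.0, 0.05, 10, periods_per_year=1)
    print(f"Final balance: {balance:.2f}")  # prints "Final balance: 1628.89"
```

Passing the rate as a fraction (0.05 rather than 5) keeps the function a direct transcription of the formula; a user-facing app would typically convert from a percentage input first.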

At a broader level, I have worked on the following core topics in NLP:

  • Syntactic and Semantic Analysis using Syntactic Parse Trees, Dependency Parsing, Linguistic Knowledge
  • Relation Extraction, Semantic Role Labeling
  • Named Entity Recognition (NER), Knowledge Graphs, Ontologies
  • Discourse Analysis of text at the Sentence, Paragraph, and Document Level
  • Question Answering, Question Generation, Natural Language Generation (NLG)

| Area | Description | Keywords | Links | Year |
| --- | --- | --- | --- | --- |
| NLP Core | Various modules from an NLP pipeline | Topics: NLP, Syntax Trees, Named Entity Recognition (NER), Discourse Analysis, Linguistics, WordNet. Development: Java | Demo | 2016 |
| Grammar Analysis | Grammar analysis of English sentences using syntactic rules based on English grammar. The system is designed to be generic, using only standard English grammar rules | Topics: NLP, Syntax Trees, Named Entity Recognition (NER), Discourse Analysis, Linguistics, WordNet. Development: Java | Demo | 2016 |
| Google Chrome Extension | Text Analyzer extension that provides grammatical and semantic information for selected English text directly in the Google Chrome browser | Topics: Google Chrome Extension. Development: JavaScript, Ajax, HTML | Report, Code | 2015 |
| GATE Plugins | Creating a plugin for GATE | Topics: NLP, Information Extraction. Development: JavaScript, Ajax, HTML | Code | 2014 |
| NLP Pipeline | Information Extraction system that can perform NLP tasks such as Named Entity Recognition, Sentence Simplification, and Relation Extraction | Topics: NLP, Information Extraction. Development: JavaScript, Ajax, HTML | Code | 2013 |