aggregating news with artificial intelligence and lots of love.

amirothman/news_aggregator

Clustering News With Artificial Intelligence And Lots of Love

This is an experimental project for clustering news articles. The technologies used include:

Text modelling:

  • word2vec (gensim)
  • doc2vec (gensim)
  • fastText
  • LDA (gensim)

Database:

  • Redis
  • MongoDB

Web back-end:

  • Flask

Approximate nearest-neighbour search:

  • Annoy

Notes

Document processing is currently slow: roughly 10 minutes for about 3,000 articles.

Installation

git clone https://github.com/amirothman/news_aggregator

Install any missing dependencies with:

pip install <package-name>
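
To find out which dependencies are missing before reaching for pip, a small check like the following may help. It is only a sketch: the module names in `ASSUMED_MODULES` are guesses based on the technologies listed above, not names taken from the project, and the name passed to `pip install` can differ from the import name.

```python
from importlib.util import find_spec

# Assumption: these are the import names of the libraries listed in this README.
# They are not taken from the project itself, so adjust them as needed.
ASSUMED_MODULES = ["gensim", "fasttext", "redis", "pymongo", "flask", "annoy"]


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]
```

Calling `missing_modules(ASSUMED_MODULES)` returns the names that still need installing; an empty list means every listed module was found.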

Create the following empty directories if they do not exist:

model
corpus
similarity_index
textfiles
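
The four directories can also be created from Python. This is a minimal sketch, assuming they belong at the root of the cloned repository; the helper name `ensure_directories` is hypothetical and not part of the project.

```python
from pathlib import Path

# Directory names exactly as listed in this README.
REQUIRED_DIRS = ["model", "corpus", "similarity_index", "textfiles"]


def ensure_directories(root="."):
    """Create each required directory under root and return the names that were created."""
    created = []
    for name in REQUIRED_DIRS:
        path = Path(root) / name
        if not path.exists():
            created.append(name)
        # exist_ok=True makes the call a no-op for directories that are already there.
        path.mkdir(parents=True, exist_ok=True)
    return created
```

Running `ensure_directories()` from the repository root leaves existing directories and their contents untouched, so it is safe to call more than once.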

Launching the web server

With Flask:

FLASK_APP=webapp.py flask run

With Gunicorn:

gunicorn -b 0.0.0.0:5000 webapp:app
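
In the Gunicorn command, `webapp:app` means "the object named `app` inside the module `webapp`". Gunicorn expects that object to be a WSGI callable, which a Flask application object is. The sketch below illustrates the convention with a bare WSGI function from the standard library only; it is a stand-in for illustration, not the project's actual `webapp.py`, and the file name `webapp_sketch.py` and the response text are made up.

```python
# webapp_sketch.py (hypothetical file name, not the project's webapp.py)


def app(environ, start_response):
    """Minimal WSGI callable: answers every request with 200 and a short text body."""
    body = b"news_aggregator placeholder\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # WSGI requires an iterable of byte strings as the response body.
    return [body]
```

Saved under that name, the sketch would be served with `gunicorn -b 0.0.0.0:5000 webapp_sketch:app`, mirroring the command above.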
