Python package that helps you easily retrieve complete web articles.
- Python 3.5+
- Newspaper3k
- API Key from NewsApi or API Key from GNews
pip3 install newsdatascraper
from newsdatascraper import Scraper
# Fetch articles on a topic
new_scraper = Scraper('mock-api-key')
articles = new_scraper.fetch_all_articles(query='two sigma', pageSize = 10)
"""
Two APIs are supported: NewsApi and GNews.
To select the API, set the mode argument to either 'NEWSPAPER' or 'GNEWS'.
"""
new_scraper = Scraper('mock-api-key', mode = 'GNEWS')
articles = new_scraper.fetch_all_articles(query='two sigma', pageSize = 10,
                                          dateFrom = "2019-08-04", dateTo = "2019-08-10")
#To access individual articles and their properties
first_article = articles.articles[0]
print(first_article.content)
#We also provide helper functions to serialize the data
articles.toCsv('test.csv')
articles.toPickle('test.pickle')
articles.toJson()
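Building on the `articles.articles` list and the `content` property used above, a small helper can gather the text of every fetched article. This is a minimal sketch and not part of the package: `collect_contents` is a hypothetical name, and it assumes that some articles may have an empty or missing `content` (for example, if a page could not be downloaded).

```python
def collect_contents(result):
    """Return the non-empty `content` strings from `result.articles`.

    Hypothetical helper, not part of newsdatascraper. It relies only on
    the `articles` list and the `content` property shown in the usage
    above, and assumes `content` may be empty or None for some articles.
    """
    contents = []
    for article in result.articles:
        # getattr guards against objects that lack a `content` attribute.
        text = getattr(article, "content", None)
        if text:
            contents.append(text)
    return contents
```

With the `articles` object from the usage example, `collect_contents(articles)` would return a list of strings, one per article that has text.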
Please check the rate limits of the API you use to determine your preferred usage.
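One way to stay under a rate limit is to space out consecutive requests. The sketch below is an illustration only: `Throttle` is a hypothetical helper that is not provided by the package, and the interval you pass in is an assumption that should be derived from your API plan's documented limits.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls.

    Hypothetical helper, not part of newsdatascraper. `min_interval` is in
    seconds; `clock` and `sleep` are injectable so the class can be
    exercised without real waiting.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # time of the previous call, if any

    def wait(self):
        """Sleep just long enough to keep calls `min_interval` apart."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

A typical use would be to create one `Throttle` and call its `wait()` method immediately before each `fetch_all_articles` call in a loop over queries.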
Run formatter
black .
Run linter
pylama -o setup.cfg .
Run tests
pytest
Run tests + code coverage
sh ./scripts/generate_coverage.sh