newspaper

12,365 6 492 MIT
0.2.8 (4 years ago) Dec 28 2012 96.7 thousand (month)

newspaper is a Python package that allows developers to easily extract text, images, and videos from articles on the web.

It is designed to be fast, easy to use, and compatible with a wide variety of websites. It extracts the main article content together with metadata such as the title, authors, and publication date, and it supports articles in several languages.

newspaper includes an HTTP client for downloading articles, and it can also ingest pre-scraped HTML documents.

Example Use


from newspaper import Article

# Create a new article object
article = Article('https://www.example.com/article')

# Download the article
article.download()

# Parse the article
article.parse()

# Print the article text
print(article.text)

# Print the article title
print(article.title)

# Print the article authors
print(article.authors)

# Print the article publication date
print(article.publish_date)
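
The description above mentions that newspaper can also ingest pre-scraped HTML instead of using its own HTTP client. The following is a minimal sketch of that workflow, not a definitive recipe. It assumes that `Article.download()` accepts an `input_html` argument, as in the 0.2.x releases, so check this against the version you have installed. `SAMPLE_HTML` and `parse_prescraped` are hypothetical names introduced only for illustration.

```python
# Sketch: parse HTML that was fetched elsewhere, skipping newspaper's HTTP client.
# SAMPLE_HTML and parse_prescraped are hypothetical names used for illustration.

SAMPLE_HTML = """<html>
<head><title>Sample headline</title></head>
<body><article><h1>Sample headline</h1>
<p>First paragraph of the pre-scraped article body.</p>
<p>Second paragraph of the pre-scraped article body.</p>
</article></body></html>"""


def parse_prescraped(html, url):
    """Return a parsed Article built from already-downloaded HTML."""
    # Imported lazily so this module loads even where newspaper is not installed.
    from newspaper import Article

    article = Article(url)
    # Assumption: download() accepts input_html (as in the 0.2.x releases)
    # and uses it instead of making a network request.
    article.download(input_html=html)
    article.parse()
    return article
```

Calling `parse_prescraped(SAMPLE_HTML, 'https://www.example.com/article')` should return an article object whose `text`, `title`, `authors`, and `publish_date` attributes can be read exactly as in the example above. The URL is still passed in because the `Article` constructor expects one, even though no request is made in this sketch.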

Alternatives / Similar


1,349 2020.1.16 (3 years ago) Dec 14 2008
722 1.4.0 (3 months ago) Jul 17 2019
2,206 0.8.1 (2 years ago) Jun 30 2011
728 0.14.0 (2 months ago) Oct 27 2015
3,007 0.11.0 (3 months ago) Oct 20 2013
9,316 1.1.9 (4 years ago) Aug 24 2018
70 2.0.7 (2 months ago) Dec 11 2020

Other Languages

2,028 v1.1.3 (1 year, 9 months ago) Apr 20 2016