Skip to content

NewsAutoAnalysisUKR scrapes, analyzes, and summarizes Ukraine War news articles using Python libraries and OpenAI's GPT. It offers visualizations like word clouds and frequency charts, providing in-depth media coverage insights.

License

Notifications You must be signed in to change notification settings

Op27/NewsAutoAnalysisUKR

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NewsAutoAnalysisUKR

NewsAutoAnalysisUKR is a tool designed to scrape, analyze, and summarize news articles focusing on the War in Ukraine. By leveraging advanced Python libraries and OpenAI's GPT models, this application provides insightful visualizations such as word clouds and frequency analysis charts, alongside comprehensive summaries that offer a deeper understanding of the conflict's coverage in the media.

Demo.mp4

Features

  • Article Scraping: Automatically scrapes news articles from specified sources.
  • Data Analysis: Performs text analysis to identify the most frequent terms and generates word clouds and bar charts.
  • Summarization: Utilizes OpenAI's GPT models to generate concise summaries of the collected articles.
  • Customization: Offers users the ability to customize the analysis and summarization processes via editable configuration files (prompt.txt), enhancing the relevance and precision of the outputs for varied use cases.
  • Automation: Streamlines the entire process from data collection to analysis and summarization, requiring minimal user input.

Quick Start Guide

Prerequisites

Ensure you have Python 3.x installed on your system. This project depends on several Python libraries, including requests, beautifulsoup4, wordcloud, matplotlib, python-docx, and openai. You can install these using pip:

pip install requests beautifulsoup4 wordcloud matplotlib python-docx openai

Installation

  1. Clone the repository to your local machine:

    git clone https://github.com/Op27/NewsAutoAnalysisUKR.git
  2. Navigate to the project directory:

    cd NewsAutoAnalysisUKR
  3. Install the required Python packages:

    pip install -r requirements.txt

    Note: This command ensures that all the necessary Python packages are installed and up to date. If you have previously installed the required packages, this command will verify your installation, making it a safe operation to perform.

Usage

To run the tool, execute the main script from the command line:

python main.py

Obtaining OpenAI API Key

For summarization features in NewsAutoAnalysisUKR, an API key from OpenAI is required. This key enables interaction with OpenAI's GPT models to generate summaries. To obtain an API key, users must create an account on the OpenAI platform and follow the instructions to register for API access. Please be aware that OpenAI's services might incur costs depending on the usage volume, so it's advisable to review their pricing structure. After obtaining your API key, ensure to secure it properly and not to share it publicly or with unauthorized users.

Customization

Configuring the prompt.txt for OpenAI

NewsAutoAnalysisUKR utilizes OpenAI's GPT models for generating summaries of the scraped news articles. The tool sends a customizable prompt to the GPT model, which is defined in the prompt.txt file. Users can modify this file to tailor the analysis and summaries generated by the tool according to their specific requirements or interests.

Editing prompt.txt

  1. Locate the prompt.txt file in the root directory of the project.
  2. Open the file in a text editor of your choice.
  3. Edit the prompt according to your needs. The content of this file will be used as the basis for the GPT model's text generation, so consider what instructions or context you want to provide to achieve the desired output.
  4. Save your changes to the file before running the analysis.

This flexibility allows users to direct the focus of the summaries, making the tool versatile across a range of use cases. Whether you're interested in highlighting certain aspects of the news articles or analyzing from a specific perspective, customizing the prompt.txt file enables personalized analysis outputs.

How It Works

Scraping: The tool first scrapes news articles from pre-defined URLs.
Analysis: Analyzes the text to generate visualizations and identify key themes.
Summarization: Summarizes the content using GPT, producing a coherent overview of the main points.

Disclaimer

Web Scraping

Web scraping, as utilized by NewsAutoAnalysisUKR for analyzing publicly available news articles, is legal within certain boundaries. However, it's critical to note that personal data is protected under GDPR in the European Union and by similar privacy laws worldwide. Users of NewsAutoAnalysisUKR should ensure not to scrape personal data unless they have a legitimate reason, in accordance with applicable privacy laws. We encourage users to familiarize themselves with and adhere to the terms of service of any source websites, including but not limited to BBC, when using this tool.

Responsibility

The use of NewsAutoAnalysisUKR and any actions or consequences resulting from its application are solely the responsibility of the user. The project owners and contributors do not assume any legal liability or responsibility for the manner in which the tool is used or for ensuring compliance with applicable laws. Users are responsible for using NewsAutoAnalysisUKR in a manner that is consistent with all relevant legal requirements and regulations that apply to their specific circumstances.

Contributing

Contributions to NewsAutoAnalysisUKR are welcome! If you have suggestions for improvements or new features, please feel free to:

  • Open an issue to discuss what you would like to change.
  • Fork the repository and submit a pull request with your changes.

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

NewsAutoAnalysisUKR scrapes, analyzes, and summarizes Ukraine War news articles using Python libraries and OpenAI's GPT. It offers visualizations like word clouds and frequency charts, providing in-depth media coverage insights.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages