NewsAutoAnalysisUKR is a tool designed to scrape, analyze, and summarize news articles focusing on the War in Ukraine. By leveraging advanced Python libraries and OpenAI's GPT models, this application provides insightful visualizations such as word clouds and frequency analysis charts, alongside comprehensive summaries that offer a deeper understanding of the conflict's coverage in the media.
Demo.mp4
- Article Scraping: Automatically scrapes news articles from specified sources.
- Data Analysis: Performs text analysis to identify the most frequent terms and generates word clouds and bar charts.
- Summarization: Utilizes OpenAI's GPT models to generate concise summaries of the collected articles.
- Customization: Offers users the ability to customize the analysis and summarization processes via editable configuration files (prompt.txt), enhancing the relevance and precision of the outputs for varied use cases.
- Automation: Streamlines the entire process from data collection to analysis and summarization, requiring minimal user input.
Ensure you have Python 3.x installed on your system. This project depends on several Python libraries, including requests
, beautifulsoup4
, wordcloud
, matplotlib
, python-docx
, and openai
. You can install these using pip:
pip install requests beautifulsoup4 wordcloud matplotlib python-docx openai
-
Clone the repository to your local machine:
git clone https://github.com/Op27/NewsAutoAnalysisUKR.git
-
Navigate to the project directory:
cd NewsAutoAnalysisUKR
-
Install the required Python packages:
pip install -r requirements.txt
Note: This command ensures that all the necessary Python packages are installed and up to date. If you have previously installed the required packages, this command will verify your installation, making it a safe operation to perform.
To run the tool, execute the main script from the command line:
python main.py
For summarization features in NewsAutoAnalysisUKR, an API key from OpenAI is required. This key enables interaction with OpenAI's GPT models to generate summaries. To obtain an API key, users must create an account on the OpenAI platform and follow the instructions to register for API access. Please be aware that OpenAI's services might incur costs depending on the usage volume, so it's advisable to review their pricing structure. After obtaining your API key, ensure to secure it properly and not to share it publicly or with unauthorized users.
NewsAutoAnalysisUKR utilizes OpenAI's GPT models for generating summaries of the scraped news articles. The tool sends a customizable prompt to the GPT model, which is defined in the prompt.txt file. Users can modify this file to tailor the analysis and summaries generated by the tool according to their specific requirements or interests.
- Locate the prompt.txt file in the root directory of the project.
- Open the file in a text editor of your choice.
- Edit the prompt according to your needs. The content of this file will be used as the basis for the GPT model's text generation, so consider what instructions or context you want to provide to achieve the desired output.
- Save your changes to the file before running the analysis.
This flexibility allows users to direct the focus of the summaries, making the tool versatile across a range of use cases. Whether you're interested in highlighting certain aspects of the news articles or analyzing from a specific perspective, customizing the prompt.txt file enables personalized analysis outputs.
Scraping: The tool first scrapes news articles from pre-defined URLs.
Analysis: Analyzes the text to generate visualizations and identify key themes.
Summarization: Summarizes the content using GPT, producing a coherent overview of the main points.
Web scraping, as utilized by NewsAutoAnalysisUKR for analyzing publicly available news articles, is legal within certain boundaries. However, it's critical to note that personal data is protected under GDPR in the European Union and by similar privacy laws worldwide. Users of NewsAutoAnalysisUKR should ensure not to scrape personal data unless they have a legitimate reason, in accordance with applicable privacy laws. We encourage users to familiarize themselves with and adhere to the terms of service of any source websites, including but not limited to BBC, when using this tool.
The use of NewsAutoAnalysisUKR and any actions or consequences resulting from its application are solely the responsibility of the user. The project owners and contributors do not assume any legal liability or responsibility for the manner in which the tool is used or for ensuring compliance with applicable laws. Users are responsible for using NewsAutoAnalysisUKR in a manner that is consistent with all relevant legal requirements and regulations that apply to their specific circumstances.
Contributions to NewsAutoAnalysisUKR are welcome! If you have suggestions for improvements or new features, please feel free to:
- Open an issue to discuss what you would like to change.
- Fork the repository and submit a pull request with your changes.
This project is licensed under the MIT License - see the LICENSE file for details.