ContextBridge-Semantic-Internal-Link-Tool is a Python script that improves site structure and user experience by suggesting internal links between pages. It uses natural language processing and machine learning to analyze page content, semantics, and context, and recommends the most relevant internal links.
- Utilizes GPT-3.5 for accurate product classification into categories and subcategories
- Employs OpenAI's text-embedding-ada-002 model to generate high-quality content embeddings
- Calculates multi-dimensional similarity scores using cosine similarity and TF-IDF
- Identifies potential internal linking opportunities based on content relevance, semantic similarity, and category coherence
- Generates context-aware anchor texts for suggested links using GPT-3.5
- Outputs comprehensive results to an Excel file for easy analysis and implementation
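
To make the text-based similarity feature concrete, here is a minimal sketch of TF-IDF weighting followed by cosine similarity, written in plain Python so it runs without extra packages. The sample titles and the function names `tfidf_vectors` and `cosine` are assumptions made for illustration and are not taken from the script. scikit-learn, which is in the install list, provides `TfidfVectorizer` and `cosine_similarity` for the same job.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using a smoothed IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical H1 titles, made up for this example.
titles = [
    "red running shoes for men",
    "blue running shoes for women",
    "stainless steel kitchen knife set",
]
vectors = tfidf_vectors(titles)
sim_shoes = cosine(vectors[0], vectors[1])      # titles share several terms
sim_unrelated = cosine(vectors[0], vectors[2])  # titles share no terms
```

The two shoe titles share terms and score above zero, while the knife title shares none with the first and scores exactly zero. Whitespace tokenization is a simplification; a real run would likely also strip punctuation and stop words.
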
- Python 3.7+
- OpenAI API key
- Clone this repository:
  git clone https://github.com/yourusername/ContextBridge-Semantic-Internal-Link-Tool.git
  cd ContextBridge-Semantic-Internal-Link-Tool
- Install the required packages:
  pip install pandas numpy openai scikit-learn nltk openpyxl
- Set up your OpenAI API key:
  - Replace 'your-api-key-here' in the script with your actual OpenAI API key.
- Prepare your input data:
  - Create an Excel file named page_metadata.xlsx with columns 'url' and 'h1' containing your page URLs and H1 titles.
- Run the script:
  python contextbridge.py
- The script will process your data and generate an Excel file named internal_linking_results.xlsx with the suggested internal linking opportunities.
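
Before a run, it can help to check the two inputs the steps above depend on: the API key and the column names of the input file. The sketch below is one possible pre-flight check, not part of the script. The helper names `get_api_key` and `missing_columns` are hypothetical. Reading the key from the `OPENAI_API_KEY` environment variable is an assumed alternative to editing the script, which as described expects the key to be pasted in.

```python
import os

REQUIRED_COLUMNS = ("url", "h1")  # column names stated in the setup steps


def get_api_key(env=None):
    """Return the key from OPENAI_API_KEY, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key


def missing_columns(columns):
    """List the required columns that are absent from `columns`."""
    present = {str(name).strip() for name in columns}
    return [name for name in REQUIRED_COLUMNS if name not in present]
```

For a well-formed input file, `missing_columns(pandas.read_excel("page_metadata.xlsx").columns)` should return an empty list.
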
- The script loads the page data from the Excel file.
- It uses GPT-3.5 to classify each product into a main category and subcategory.
- Embeddings are generated for each page's content using OpenAI's embedding model.
- The script calculates similarities between pages using these embeddings and text-based similarity metrics.
- It then identifies potential linking opportunities based on content similarity, embedding similarity, and category relevance.
- For each potential link, it generates an appropriate anchor text using GPT-3.5.
- Finally, it saves all the results, including source and target URLs, categories, similarity scores, and suggested anchor texts, to an Excel file.
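
The selection step in the list above can be sketched as follows. This is a simplified illustration under stated assumptions, not the script's actual logic: it keeps a pair of pages only when both share a category and their embedding cosine similarity reaches a threshold, and it leaves out the text-based score. The function name `link_candidates`, the threshold of 0.8, and the toy three-dimensional vectors are all hypothetical.

```python
import math
from itertools import permutations


def cosine(u, v):
    """Cosine similarity of two dense vectors given as sequences of floats."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def link_candidates(pages, min_similarity=0.8):
    """Return (source_url, target_url, score) for ordered page pairs that
    share a category and whose embeddings are at least `min_similarity` alike."""
    results = []
    for source, target in permutations(pages, 2):
        if source["category"] != target["category"]:
            continue
        score = cosine(source["embedding"], target["embedding"])
        if score >= min_similarity:
            results.append((source["url"], target["url"], round(score, 3)))
    # Strongest suggestions first.
    return sorted(results, key=lambda row: row[2], reverse=True)


# Toy data: real text-embedding-ada-002 vectors have 1536 dimensions.
pages = [
    {"url": "/shoes/red", "category": "Shoes", "embedding": [0.9, 0.1, 0.0]},
    {"url": "/shoes/blue", "category": "Shoes", "embedding": [0.8, 0.2, 0.0]},
    {"url": "/knives/set", "category": "Kitchen", "embedding": [0.0, 0.1, 0.9]},
]
candidates = link_candidates(pages)
```

With this toy data, only the two shoe pages link to each other, once in each direction. Comparing every ordered pair grows quadratically with the number of pages, so a large site would need a cheaper nearest-neighbor lookup.
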
This script makes multiple API calls to OpenAI, which may incur costs. Make sure you understand the pricing and have appropriate usage limits set up in your OpenAI account.
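
A rough request count can help when setting usage limits. The sketch below assumes one classification request and one embedding request per page, plus one anchor-text request per suggested link. The script may batch requests differently, so treat the result as an estimate. The function name `estimate_api_calls` and the sample figures are hypothetical.

```python
def estimate_api_calls(num_pages, links_per_page):
    """Rough request count, assuming one request per page for each of
    classification and embedding, plus one per suggested link."""
    classification = num_pages
    embeddings = num_pages
    anchor_texts = num_pages * links_per_page
    return {
        "classification": classification,
        "embeddings": embeddings,
        "anchor_texts": anchor_texts,
        "total": classification + embeddings + anchor_texts,
    }


# Example: 500 pages with an assumed 3 suggested links per page.
estimate = estimate_api_calls(num_pages=500, links_per_page=3)
```

Under these assumptions, anchor-text generation accounts for most of the requests, so capping the number of suggestions per page is the most direct way to limit usage.
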
Contributions, issues, and feature requests are welcome. Feel free to check the issues page if you want to contribute.