Website Link Scanner & HTML Tree Generator is a Python-based tool designed to scan any given website, extract all the internal links , and output them in a structured HTML tree format. This is perfect for web crawlers, SEOs, and developers who need to analyze website structure or map out all available links within a webpage.
- Website Scanning: Scans any URL to find all the internal links .
- HTML Tree Generation: Generates a hierarchical HTML tree representing the structure of links.
- Support for Any Website: Works on any public website, provided it's accessible and follows standard HTML structures.
- Output Flexibility: Can save output as HTML files for easy viewing or processing.
- Error Handling: Gracefully handles errors like broken links, timeout issues, or unsupported HTML formats.
To run the Website Link Scanner, you'll need to have Python 3.10 or higher installed on your machine. Follow these steps:
- Clone this repository:
git clone git@github.com:HosseinDahaei/Website-Scanner.git
cd Website-Scanner
- Install the required dependencies:
pip install -r requirements.txt
- Run the script:
python main.py https://example.com
After running the script with the URL of your choice, the program will scan the provided website and output the results as an HTML tree.
The program will generate an HTML file (or print to stdout if preferred) that contains the links in a tree format like this:
<ul>
<li>https://example.com/
<ul>
<li>https://example.com/page1</li>
<li>https://example.com/page2</li>
</ul>
</li>
</ul>
- Python: Core programming language used for link scanning and tree generation.
- Requests: For fetching the HTML content of web pages.
- BeautifulSoup (bs4): For parsing the HTML and extracting links.
- HTML/CSS: For generating the visual output of the tree.
Check out the example_output file to see what the generated HTML tree might look like.
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Here's how you can help:
- Fork the repository.
- Create a new branch (
git checkout -b feature/my-new-feature
). - Commit your changes (
git commit -am 'Add some feature'
). - Push to the branch (
git push origin feature/my-new-feature
). - Create a new Pull Request.
If you have any questions, suggestions, or issues, feel free to open an issue or contact me directly at dahaeehossein@gmail.com.
Happy scanning! 🚀