Add HTML feature #314

Open
KingAkeem opened this issue Oct 14, 2023 · 20 comments

Comments

@KingAkeem
Member

The HTML feature hasn't been implemented yet.

The feature will operate on a single URL. The user should be able to pass a flag (--html) to
output the HTML of that specific webpage. If they also pass the --save flag, the HTML should be saved to a .html file.
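A minimal sketch of the behaviour described above, not TorBot's actual implementation. The `-u`, `--html` and `--save` flags come from this thread; treating `--save` as a plain on/off switch is an assumption, and `html_filename`, `handle_html` and the injected `fetch` callable are hypothetical names used only for illustration.

```python
import argparse
from pathlib import Path
from urllib.parse import urlparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="torbot")
    parser.add_argument("-u", "--url", required=True, help="single URL to operate on")
    parser.add_argument("--html", action="store_true", help="output the page's HTML")
    # Assumption: --save is a boolean switch here.
    parser.add_argument("--save", action="store_true", help="save output to a file")
    return parser


def html_filename(url: str) -> str:
    """Derive a file name from the URL's host (hypothetical naming scheme)."""
    host = urlparse(url).hostname or "page"
    return f"{host}.html"


def handle_html(args, fetch, out_dir: Path = Path(".")):
    """Fetch the page and either print it or write it to a .html file.

    `fetch` is any callable mapping a URL to its HTML text, for example a
    thin wrapper around an HTTP client's GET request.
    """
    html = fetch(args.url)
    if args.save:
        path = out_dir / html_filename(args.url)
        path.write_text(html, encoding="utf-8")
        return path
    print(html)
    return None
```

Passing the fetcher in as a callable keeps the flag handling separate from the network client, so the logic can be exercised without Tor running.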

@Soham-Thodge

I would like to try to solve this issue @KingAkeem

@KingAkeem
Member Author

Just assigned it to you @kronos2003, it's all yours. Let me know if you need any help.

@Soham-Thodge

I'm a bit new, so I've made some changes to the file. Can you let me know how I can run the tool to ensure the function works as intended?

@KingAkeem
Member Author

KingAkeem commented Oct 14, 2023

We use poetry to manage dependencies so you'll need to install it first.
https://python-poetry.org/docs/#installing-with-the-official-installer

If you're already familiar with Python virtual environments, the requirements.txt file is also up to date, so you could create a virtual env and install the dependencies with pip install -r requirements.txt. Either option will work.

Once that's done, follow these examples:
https://github.com/DedSecInside/TorBot?tab=readme-ov-file#installation

The main file is named __main__, meaning the application can be run using the directory name without needing to include the file name as well.
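To illustrate the point about `__main__`, here is a small self-contained sketch. It builds a throwaway directory named `demo_app` (a hypothetical name, unrelated to TorBot) containing a `__main__.py`, then runs the directory itself with the current Python interpreter.

```python
import subprocess
import sys
from pathlib import Path


def run_directory_as_program(root: Path) -> str:
    """Create root/demo_app/__main__.py and execute the directory."""
    pkg = root / "demo_app"
    pkg.mkdir()
    (pkg / "__main__.py").write_text(
        'print("hello from __main__")\n', encoding="utf-8"
    )
    # Passing the directory path makes Python look for __main__.py inside it.
    result = subprocess.run(
        [sys.executable, str(pkg)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

The same mechanism is what lets a project be started by pointing Python at its directory rather than at a specific file.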

@pavankalyan767

I would like to take up this issue. Since it is my first open source contribution, please guide me.

@PSNAppz
Member

PSNAppz commented Oct 15, 2023

@pavankalyan224847 This is currently assigned to @kronos2003. Could you take a look at the other open issues?

@pavankalyan767

Yeah, no problem. I'll look into other issues.

@Soham-Thodge

@KingAkeem I managed to get the dependencies installed in a venv and tried to run the updated main.py, but it's showing a getaddrinfo() error.

@KingAkeem
Member Author

KingAkeem commented Oct 15, 2023

Which version of Python are you using? Can you post the error and the command you were running? Try to give as much detail as possible.

@Soham-Thodge

I'll post the error in a few hours, as I've just logged off.
As for the Python version, I'm using 3.11.4.

@KingAkeem
Member Author

OK, I won't be able to do much until you post the error, since it sounds like a configuration issue on your machine. Some things you can check yourself are:

  1. Do you have Tor running? If not you can use the --disable-socks5 flag to run without it.
  2. Do you have it configured correctly? Check the .env for the expected host and port.
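Both checks above can be scripted. The sketch below is only an illustration: `parse_env` is a simplified reader for KEY=VALUE lines and `is_listening` attempts a plain TCP connection. The variable names `SOCKS5_HOST` and `SOCKS5_PORT` in the usage note are assumptions, since the thread does not say what the .env keys are called.

```python
import socket


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, after reading the .env contents with `parse_env`, calling `is_listening(values["SOCKS5_HOST"], int(values["SOCKS5_PORT"]))` would show whether a proxy is reachable at the configured address. Tor's SOCKS listener defaults to port 9050.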

@Soham-Thodge

(HTML) C:\Users\pc\TorBot\torbot>python main.py -u http://torlinksge6enmcyyuxjpjkoouw4oorgdgeo7ftnq3zodj7g2zxi3kyd.onion/
Traceback (most recent call last):
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_urlparse.py", line 348, in normalize_port
    port_as_int = int(port)
                  ^^^^^^^^^
ValueError: invalid literal for int() with base 10: 'None'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\pc\TorBot\torbot\__main__.py", line 158, in <module>
    run(arg_parser, version)
  File "C:\Users\pc\TorBot\torbot\__main__.py", line 89, in run
    with httpx.Client(timeout=60, proxies=socks5_proxy if not args.disable_socks5 else None) as client:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_client.py", line 670, in __init__
    proxy_map = self._get_proxy_map(proxies, allow_env_proxies)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_client.py", line 228, in _get_proxy_map
    proxy = Proxy(url=proxies) if isinstance(proxies, (str, URL)) else proxies
            ^^^^^^^^^^^^^^^^^^
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_config.py", line 333, in __init__
    url = URL(url)
          ^^^^^^^^
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_urls.py", line 113, in __init__
    self._uri_reference = urlparse(url, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_urlparse.py", line 246, in urlparse
    parsed_port: typing.Optional[int] = normalize_port(port, scheme)
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_urlparse.py", line 350, in normalize_port
    raise InvalidURL(f"Invalid port: {port!r}")
httpx.InvalidURL: Invalid port: 'None'

@KingAkeem
Member Author

Are you using the latest version of dev? There is no main.py currently.

@Soham-Thodge

I started working on the latest build, forking it just 3 days prior, so I'm not sure about the non-existence of main.py.

@KingAkeem
Member Author

Are you running the command from the root directory or from within the torbot directory? It's possible that the .env file cannot be found.

Try running the program from the root directory based on the example in the README.

@Soham-Thodge

I'm running the program from the torbot directory where the main.py file is located

@KingAkeem
Member Author

Try running the program from the root directory using torbot/main.py. The address information cannot be read from the .env file, which is in the root directory. There's a ticket to switch the environment variables to CLI flags, which will resolve this issue, but for now you'll need to run it from the root directory.
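The sketch below shows one plausible way a missing .env leads to the `Invalid port: 'None'` error in the traceback. It assumes the proxy URL is built by formatting the looked-up values into a string, which the thread does not confirm. `proxy_url_from`, `find_env_file` and the `HOST`/`PORT` keys are hypothetical.

```python
from pathlib import Path
from typing import Optional


def proxy_url_from(values: dict) -> str:
    """Format a proxy URL; missing keys become the literal text 'None'."""
    return f"socks5://{values.get('HOST')}:{values.get('PORT')}"


def find_env_file(start: Path, name: str = ".env") -> Optional[Path]:
    """Search start and each parent directory for the env file."""
    for directory in [start, *start.parents]:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

With no values loaded, `proxy_url_from({})` yields `socks5://None:None`, a URL whose port is the string `None`. Searching parent directories, as `find_env_file` does, is one way to make the lookup independent of the directory the command is run from.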

@KingAkeem
Member Author

Any updates?

@Soham-Thodge

I've been working on it, but it will take some time to find and root out the exact error in the updated files.

@KingAkeem
Member Author

I've updated the program to use CLI flags instead of the .env file for the SOCKS configuration. If you update your branch with the latest version, it may resolve your issue.
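As a rough illustration of configuring SOCKS through CLI flags rather than a .env file, consider the sketch below. Only `--disable-socks5` appears in this thread; the `--host` and `--port` flag names, their defaults, and `proxy_from_args` are assumptions and may not match what the project actually added.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="torbot")
    # Hypothetical flag names; defaults match Tor's usual local SOCKS listener.
    parser.add_argument("--host", default="127.0.0.1", help="SOCKS5 proxy host")
    parser.add_argument("--port", type=int, default=9050, help="SOCKS5 proxy port")
    parser.add_argument("--disable-socks5", action="store_true",
                        help="run without the SOCKS5 proxy")
    return parser


def proxy_from_args(args: argparse.Namespace) -> Optional[str]:
    """Return the proxy URL, or None when the proxy is disabled."""
    if args.disable_socks5:
        return None
    return f"socks5://{args.host}:{args.port}"
```

Because `--port` is parsed with `type=int` and has a default, a missing or non-numeric port is rejected by argparse up front instead of surfacing later as an invalid URL.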


4 participants