Adding proxies to my program

by FortniteSucks - 12 March, 2022 - 03:52 AM
#1 - FortniteSucks
I have a program I'm working on that checks Minecraft names to see whether they're available, using an API.
However, I know the API is limited to one request per IP, per second, so I figured adding proxies to my program was the way to go, and I thought it would be easy.
I was wrong. I think? Can someone point me in the right direction on how to do this, or what I should learn before attempting it?
#2 - UberFuck
(This post was last modified: 12 March, 2022 - 08:39 AM by UberFuck. Edited 1 time in total.)
You'll probably want to use the requests module's proxy support, or alternatively aiohttp-socks if you're doing things asynchronously.

Scraped proxies are usually going to be slow, experience timeouts, and are often dead before you even start processing.  Scraped proxy servers are typically overloaded with requests and get so bogged down they can't relay all the traffic efficiently.  This means you really need to check any scraped proxies before doing extended processing with them, and wrap all your HTTP requests in try...except blocks with retry attempts.  If you only have a small number of names to check, I'd recommend trying webshare.io for their 10 free proxies.
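
Here's a minimal sketch of that check-then-retry pattern; the test URL, timeouts, and function names are placeholders I made up, not anything from a specific library:
Code:
import requests

TEST_URL = 'https://api.ipify.org'  # any fast, reliable endpoint works as a liveness check

def proxy_is_alive(proxy_url, timeout=5):
    """Return True if the proxy can complete a simple request in time."""
    proxies = {'http': proxy_url, 'https': proxy_url}
    try:
        return requests.get(TEST_URL, proxies=proxies, timeout=timeout).ok
    except requests.RequestException:
        return False

def get_with_retries(url, proxies, retries=3, timeout=10):
    """Wrap the request in try...except and retry a few times before giving up."""
    for attempt in range(retries):
        try:
            return requests.get(url, proxies=proxies, timeout=timeout)
        except requests.RequestException:
            if attempt == retries - 1:
                raise  # out of retries, let the caller decide what to do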

Because of the inherent delays of using proxies, for this to be faster than 1 request per second you'll probably want/need some sort of parallel processing or multithreading.  My go-to for parallelizing synchronous code is joblib.  If you want to use joblib for parallel processing w/ a progress bar, I'd recommend taking a look at this solution using tqdm.  I've been re-using that same chunk of code for the last two years on several projects and it's quite handy.
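
In case the link dies, here's the gist of that tqdm + joblib trick; check_name and names are placeholders for whatever your program defines:
Code:
import contextlib
import joblib
from tqdm import tqdm

@contextlib.contextmanager
def tqdm_joblib(tqdm_object):
    # Temporarily patch joblib's completion callback so finished batches tick the bar
    class TqdmBatchCompletionCallback(joblib.parallel.BatchCompletionCallBack):
        def __call__(self, *args, **kwargs):
            tqdm_object.update(n=self.batch_size)
            return super().__call__(*args, **kwargs)

    old_callback = joblib.parallel.BatchCompletionCallBack
    joblib.parallel.BatchCompletionCallBack = TqdmBatchCompletionCallback
    try:
        yield tqdm_object
    finally:
        joblib.parallel.BatchCompletionCallBack = old_callback
        tqdm_object.close()

# usage - check_name() and names come from your own code
with tqdm_joblib(tqdm(total=len(names))):
    results = joblib.Parallel(n_jobs=10, backend='threading')(
        joblib.delayed(check_name)(n) for n in names
    )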
#3 - FortniteSucks
(12 March, 2022 - 04:08 AM)UberFuck Wrote:
You'll probably want to use the requests module's proxy support, or alternatively aiohttp-socks if you're doing things asynchronously.

Scraped proxies are usually going to be slow, experience timeouts, and are often dead before you even start processing.  Scraped proxy servers are typically overloaded with requests and get so bogged down they can't relay all the traffic efficiently.  This means you really need to check any scraped proxies before doing extended processing with them, and wrap all your HTTP requests in try...except blocks with retry attempts.  If you only have a small number of names to check, I'd recommend trying webshare.io for their 10 free proxies.

Because of the inherent delays of using proxies, for this to be faster than 1 request per second you'll probably want/need some sort of parallel processing or multithreading.  My go-to for parallelizing synchronous code is joblib.  If you want to use joblib for parallel processing w/ a progress bar, I'd recommend taking a look at this solution using tqdm.  I've been re-using that same chunk of code for the last two years on several projects and it's quite handy.

Thank you so much for your time. I have gone through and looked at all of these libraries, but I would have no idea where to start implementing this stuff; I'm just not experienced enough yet. I understand everything you're saying in the second paragraph, but I don't fully follow the first and third. I know this is a lot to ask, but could you maybe dumb it down a bit?
#4 - UberFuck
(13 March, 2022 - 02:29 AM)FortniteSucks Wrote:
Thank you so much for your time. I have gone through and looked at all of these libraries, but I would have no idea where to start implementing this stuff; I'm just not experienced enough yet. I understand everything you're saying in the second paragraph, but I don't fully follow the first and third. I know this is a lot to ask, but could you maybe dumb it down a bit?

In that case, I would focus on just running in parallel and not worry about asynchronous processing.

In my first paragraph, I'm just saying you'll probably want to use the requests module.  There are other modules you can use, but requests is the most common, and it can handle proxies with or without authentication.  If you don't have it installed, you can install it with pip install requests.  To work with SOCKS proxies you'll need an additional dependency, installed with pip install requests[socks].

In code, the usage is as follows:
Code:
import requests

# HTTP proxies:
# myHttpProxies = {'http': 'http://user:pass@host:port', 'https': 'http://user:pass@host:port'}
# SOCKS proxies use the socks5:// scheme (this is what needs requests[socks]):
mySocksProxies = {'http': 'socks5://user:pass@host:port', 'https': 'socks5://user:pass@host:port'}

resp = requests.get('http://yoursite.com', proxies=mySocksProxies)
if resp.ok:
    print(resp.text)


In paragraph 3, I'm saying you should be using parallel processing.  Instead of having a for loop iterating over each account and checking sequentially, you want to divide the workload so multiple accounts can be checked at the same time.

For example, a normal request to check one account might take half a second round trip without a proxy, but 2 seconds with one.  Say you have 10 accounts to check:
  • If you were to make 10 requests in parallel using proxies, you would get all responses back in 2 seconds. 
  • If you were to loop through these sequentially without a proxy, it would take 5 seconds. 
  • If you were to execute sequentially with proxies, it would take 20 seconds.

There are a few ways to implement parallelism in your code.  You could use built-in libraries such as multiprocessing, or the threading module for multithreading, but I recommend joblib since it supports different backends (i.e. multiprocessing, threads, and more).
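
To tie it together, here's roughly what your checker could look like with joblib; the API URL is made up, so swap in whatever endpoint you're actually calling:
Code:
from joblib import Parallel, delayed
import requests

def check_name(name, proxies):
    # NOTE: placeholder URL - substitute the real availability API here
    try:
        resp = requests.get(f'https://api.example.com/users/{name}',
                            proxies=proxies, timeout=10)
        return name, resp.status_code
    except requests.RequestException:
        return name, None  # a dead proxy or timeout shouldn't kill the whole run

proxies = {'http': 'socks5://user:pass@host:port',
           'https': 'socks5://user:pass@host:port'}
names = ['name1', 'name2', 'name3']

# the threading backend fits here since the work is network-bound, not CPU-bound
results = Parallel(n_jobs=10, backend='threading')(
    delayed(check_name)(n, proxies) for n in names
)
print(results)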

Hope this clarifies my earlier post.
