Scrapy Proxy Integration
This guide may be outdated. For an up-to-date guide please see our documentation.
What is Scrapy?
Scrapy is a Python framework for web crawling and scraping, which allows users to extract structured data from websites. It is open-source, fast, and extensible. Scrapy can be used for various purposes, such as data mining, monitoring, and automated testing.
Scrapy integration with Bright Data proxies
Open your preferred IDE and start a new Scrapy project by running the following in the command line:
scrapy startproject <project_name>
This creates a new folder with the project name. Inside that folder, open a Python file to hold your spider code.
- Go to your Bright Data Control Panel and click the ‘Proxies & Scraping Infra’ icon
- Create a new proxy zone by clicking ‘Add’, choosing a network type, configuring the proxy, and clicking ‘Save’
- Under your proxy zone’s ‘Access parameters’ tab, you will find your ‘USERNAME’ and ‘PASSWORD’ values
- In your Scrapy spider code file, set the ‘proxy’ value in the request’s meta parameter to the following, using the ‘USERNAME’ and ‘PASSWORD’ values from the previous step: “http://USERNAME:[email protected]:33335”
- For example:
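import scrapy


class BrightdatascrapyexampleSpider(scrapy.Spider):
    name = "BrightDataScrapyExample"

    def start_requests(self):
        request = scrapy.Request(url="http://example.com", callback=self.parse)
        # Route this request through the proxy zone (USERNAME and PASSWORD come from 'Access parameters')
        request.meta['proxy'] = "http://USERNAME:[email protected]:33335"
        yield request

    def parse(self, response):
        # Print the raw response body to confirm the request went through
        print(response.body)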
Then run the following command in your command line:
scrapy runspider <Pythonfilename.py>
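If the USERNAME or PASSWORD value contains characters that have a special meaning in a URL (such as ‘@’ or ‘:’), pasting it directly into the proxy string can produce a URL that is parsed incorrectly. A minimal sketch of one way to handle this is to percent-encode the credentials before assembling the string. The helper name build_proxy_url and the host proxy.example.com are illustrative assumptions, not Bright Data values; use the host and port shown for your own zone.

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port):
    """Assemble an HTTP proxy URL with percent-encoded credentials.

    Hypothetical helper for illustration; not part of Scrapy or Bright Data.
    """
    # safe="" makes quote() encode every reserved character, including "@", ":" and "/"
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"http://{user}:{secret}@{host}:{port}"


if __name__ == "__main__":
    # Placeholder values only; substitute the values for your own zone
    print(build_proxy_url("USERNAME", "p@ss:word", "proxy.example.com", 33335))
```

The returned string can then be assigned to request.meta['proxy'] in start_requests. This relies on Scrapy decoding percent-encoded credentials when it builds the proxy authorization header, which its built-in HttpProxyMiddleware is understood to do; it is worth confirming against the Scrapy version you use.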
How To Use Bright Data Proxy Manager With Scrapy
- Create a proxy zone, as in the direct integration above
- Install the Proxy Manager
- Click ‘add new port’ and configure it for your use case
- In your Scrapy spider code file, set the ‘proxy’ value in the request’s meta parameter to the following: “http://IP:PORTNUMBER”
- For IP, use 127.0.0.1 (localhost) if the Proxy Manager is installed on your own machine. If it is installed on an external server, use that server’s IP address
- For PORTNUMBER, use the port you created in the Proxy Manager. Ports are numbered 24XXX, and the default first port is 24000
- For example:
import scrapy


class BrightdatascrapyexampleSpider(scrapy.Spider):
    name = "BrightDataScrapyExample"

    def start_requests(self):
        request = scrapy.Request(url="http://example.com", callback=self.parse)
        # Route this request through the local Proxy Manager port
        request.meta['proxy'] = "http://127.0.0.1:24000"
        yield request

    def parse(self, response):
        # Print the raw response body to confirm the request went through
        print(response.body)
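Setting request.meta['proxy'] by hand becomes repetitive once a spider issues many requests. One possible alternative, sketched below, is a small downloader middleware that fills in the proxy for any request that does not already have one. The class name DefaultProxyMiddleware and the setting name PROXY_URL are hypothetical names chosen for this example; from_crawler, process_request, and crawler.settings.get are standard Scrapy hooks.

```python
from types import SimpleNamespace


class DefaultProxyMiddleware:
    """Hypothetical downloader middleware that applies one proxy to all requests."""

    def __init__(self, proxy_url):
        self.proxy_url = proxy_url

    @classmethod
    def from_crawler(cls, crawler):
        # PROXY_URL is an assumed custom setting; it falls back to the
        # local Proxy Manager address used in the example above.
        return cls(crawler.settings.get("PROXY_URL", "http://127.0.0.1:24000"))

    def process_request(self, request, spider):
        # Keep any proxy that a spider has already set on the request
        request.meta.setdefault("proxy", self.proxy_url)
        # Returning None lets Scrapy continue handling the request normally
        return None


if __name__ == "__main__":
    # Stand-in object with a .meta dict, used here instead of a real scrapy.Request
    fake_request = SimpleNamespace(meta={})
    DefaultProxyMiddleware("http://127.0.0.1:24000").process_request(fake_request, spider=None)
    print(fake_request.meta["proxy"])
```

To enable a middleware like this, it would be registered in the project's DOWNLOADER_MIDDLEWARES setting. If the proxy URL carries credentials, an order number below 750 (the default position of Scrapy's built-in HttpProxyMiddleware) should ensure the proxy value is in place before that middleware reads it.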
⚠️ Important note: If you are using Bright Data’s Residential Proxies, Web Unlocker, or SERP API, you need to install an SSL certificate to enable end-to-end secure connections to your target website(s). For installation instructions, see https://docs.brightdata.com/general/account/ssl-certificate#installation-of-the-ssl-certificate.