Global and Domain Concurrency limit implemented. Updated README

Musab Gültekin
2019-06-09 11:53:40 +03:00
parent a9aaf86df3
commit d967555b62
4 changed files with 80 additions and 9 deletions

@@ -1,5 +1,5 @@
# Geziyor
-Geziyor is a fast web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
+Geziyor is a blazing fast web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
[![GoDoc](https://godoc.org/github.com/geziyor/geziyor?status.svg)](https://godoc.org/github.com/geziyor/geziyor)
[![report card](https://goreportcard.com/badge/github.com/geziyor/geziyor)](http://goreportcard.com/report/geziyor/geziyor)
@@ -8,9 +8,24 @@ Geziyor is a fast web crawling and web scraping framework, used to crawl website
- 1.000+ Requests/Sec
- Caching
- Automatic Data Exporting
- Limit Concurrency Global/Per Domain
- Automatic response decoding to UTF-8
-## Example
+## Usage
Simplest usage
```go
geziyor.NewGeziyor(geziyor.Options{
    StartURLs: []string{"http://api.ipify.org"},
    ParseFunc: func(r *geziyor.Response) {
        fmt.Println(r.Doc.Text())
    },
}).Start()
```
Export all quotes and authors to out.json file.
```go
geziyor := NewGeziyor(Opt{
StartURLs: []string{"http://quotes.toscrape.com/"},