Musab Gültekin
|
d3bdaf6240
|
Added documentation and tests for request.Meta
|
2021-05-30 10:43:54 +03:00 |
|
Musab Gültekin
|
7d2fe57bab
|
Added error logging for HTML parser.
|
2019-12-11 13:55:38 +03:00 |
|
Musab Gültekin
|
9b8a3837bd
|
Added response joinURL test and updated chromedp.
|
2019-09-13 14:34:29 +03:00 |
|
Musab Gültekin
|
0e5230eac8
|
Remote endpoint support added for js rendered requests. Geziyor is beta now.
|
2019-08-05 15:14:47 +03:00 |
|
Musab Gültekin
|
32077d8433
|
Updated docs for rendered requests
|
2019-07-26 16:40:42 +03:00 |
|
Musab Gültekin
|
dfabcb84fd
|
JSON renamed to JSONLine. JSON List support added.
|
2019-07-14 03:30:59 +03:00 |
|
Musab Gültekin
|
d19465c44a
|
Robotstxt metrics added.
|
2019-07-08 14:51:54 +03:00 |
|
Musab Gültekin
|
90d2be2210
|
Caching policies added.
We used the httpcache library to implement this. As it could not support different policies, I mostly copied and modified it.
|
2019-07-07 12:18:40 +03:00 |
|
Musab Gültekin
|
42faa92ece
|
Robots.txt support implemented
|
2019-07-06 16:18:03 +03:00 |
|
Musab Gültekin
|
2cab68d2ce
|
Middlewares refactored to multiple files in middleware package.
Extractors removed as they add complexity to the scraper, both in learning and in development.
|
2019-07-04 21:04:29 +03:00 |
|
Musab Gültekin
|
b355a566cf
|
Added more tests and refactored exporter tests. Added code coverage badge.
|
2019-07-02 14:53:06 +03:00 |
|
Musab Gültekin
|
4ab7cfd904
|
Exporter and Extractor interfaces moved to its own package for simplicity of main Geziyor package
|
2019-07-02 13:22:23 +03:00 |
|
Musab Gültekin
|
c0dd0393e6
|
Maximum redirection option added. Performance improvement on exports. Duplicate requests only checked on GET requests.
|
2019-07-01 15:44:28 +03:00 |
|
Musab Gültekin
|
fb5b4e3406
|
README updated according to new package names
|
2019-06-30 22:21:36 +03:00 |
|
Musab Gültekin
|
bd6466a5f2
|
http package renamed to client to reduce confusion
|
2019-06-29 14:18:31 +03:00 |
|
Musab Gültekin
|
1e109c555d
|
Request and response moved to http package
|
2019-06-29 13:36:39 +03:00 |
|
Musab Gültekin
|
276b248ebb
|
Synchronized requests support added. Benchmarks added.
|
2019-06-28 17:28:16 +03:00 |
|
Musab Gültekin
|
b000581c3d
|
Extractors implemented. Exporters name simplified. README Updated for extracting data. Removed go 1.11 support
|
2019-06-28 13:00:30 +03:00 |
|
Musab Gültekin
|
92e7cfefec
|
Fixed README Doc.
|
2019-06-22 13:13:33 +03:00 |
|
Musab Gültekin
|
a64a262554
|
HTTP Client can be changed now. Docs updated.
|
2019-06-22 13:12:05 +03:00 |
|
Musab Gültekin
|
7bc782400c
|
Expvar metrics support added. Metrics refactored to its own package.
|
2019-06-21 21:37:25 +03:00 |
|
Musab Gültekin
|
88c4b1dd35
|
Prometheus metrics support added.
|
2019-06-21 20:05:28 +03:00 |
|
Musab Gültekin
|
a5ec28664d
|
Cookies support added.
|
2019-06-17 13:31:19 +03:00 |
|
Musab Gültekin
|
e50fa3b1dc
|
Response middlewares support implemented.
|
2019-06-16 18:29:07 +03:00 |
|
Musab Gültekin
|
80383ebd6f
|
Middlewares and some string util functions refactored. Added partial Documentation.
|
2019-06-16 10:38:03 +03:00 |
|
Musab Gültekin
|
40f673f2e2
|
Fixed README. More Go versions added for testing
|
2019-06-15 22:35:51 +03:00 |
|
Musab Gültekin
|
7b23596a2d
|
Middleware support added. HTML Parsing disable option added.
Goroutine leaks will be tested using leaktest lib.
|
2019-06-15 17:55:40 +03:00 |
|
Musab Gültekin
|
b2f32b8830
|
Merge branch 'master' into master
|
2019-06-14 15:32:36 +03:00 |
|
Musab Gültekin
|
6caf1effd6
|
Rendered field exported to support rendered requests on Do function. Data races fixed.
|
2019-06-14 15:23:56 +03:00 |
|
Ibrahim Serdar Acikgoz
|
7360ffa3c9
|
Update README.md
|
2019-06-14 14:57:53 +03:00 |
|
Musab Gültekin
|
1a7d480b36
|
JS Rendered requests with Chrome support added
|
2019-06-13 22:08:45 +03:00 |
|
Musab Gültekin
|
184081d3bf
|
README updated for more advanced usage. Updated tests.
|
2019-06-12 22:22:01 +03:00 |
|
Musab Gültekin
|
d56ea161a5
|
Making new requests on StartRequestsFunc is simplified by using channels
|
2019-06-12 21:54:57 +03:00 |
|
Musab Gültekin
|
2f6cb06982
|
Disabling charset detection implemented.
|
2019-06-12 11:44:31 +03:00 |
|
Musab Gültekin
|
3790295658
|
Multiple Exporters and custom Exporters support added.
|
2019-06-11 16:10:49 +03:00 |
|
Musab Gültekin
|
e4e8723426
|
Callbacks are now mandatory as almost all the scrapers use them.
|
2019-06-11 14:24:48 +03:00 |
|
Musab Gültekin
|
7abc7a370d
|
Disabling logs support added.
|
2019-06-09 19:14:46 +03:00 |
|
Musab Gültekin
|
b973c1c064
|
Request delays support added
|
2019-06-09 14:24:53 +03:00 |
|
Musab Gültekin
|
2263108838
|
User-Agent change support added.
|
2019-06-09 13:43:17 +03:00 |
|
Musab Gültekin
|
d967555b62
|
Global and Domain Concurrency limit implemented. Updated README
|
2019-06-09 11:53:40 +03:00 |
|
Musab Gültekin
|
2e3bd18430
|
Options refactored to its own file. Timeout increased to 60 sec
|
2019-06-08 20:36:43 +03:00 |
|
Musab Gültekin
|
815ae7eec5
|
Do request support added. Updated docs.
|
2019-06-08 19:45:48 +03:00 |
|
Musab Gültekin
|
54c7d3550f
|
Gezer renamed to Geziyor
|
2019-06-08 17:14:10 +03:00 |
|
Musab Gültekin
|
c525e0d7d0
|
Don't visit already visited URLs. Update README
|
2019-06-08 17:04:00 +03:00 |
|
Musab Gültekin
|
6358b87472
|
Use parse function to parse responses, instead of channels.
Parse response as HTML Document using goquery.
Added simple README.
|
2019-06-06 22:48:57 +03:00 |
|