Musab Gültekin
|
fc67cec165
|
Update chromedp
|
2021-08-08 21:29:08 +03:00 |
|
Musab Gültekin
|
f35d34bc02
|
chromedp library updated.
|
2021-05-23 23:14:47 +03:00 |
|
Musab Gültekin
|
9ea67b3554
|
Use fmt.Errorf instead of the errors package. This is a good convention since Go 1.13.
|
2021-04-17 11:11:29 +03:00 |
|
Musab Gültekin
|
fbee722a38
|
Rate limiting per second implemented
|
2021-04-16 15:31:31 +03:00 |
|
Musab Gültekin
|
129402d754
|
Updated chromedp
|
2021-01-28 20:50:25 +03:00 |
|
Musab Gültekin
|
7a76a9b95e
|
Allocators separated for transparency. Updated chrome library.
|
2020-09-05 16:14:41 +03:00 |
|
Musab Gültekin
|
cbca22fefb
|
Updated chrome protocol library
|
2019-11-16 20:34:57 +03:00 |
|
Musab Gültekin
|
9b8a3837bd
|
Added response joinURL test and updated chromedp.
|
2019-09-13 14:34:29 +03:00 |
|
Musab Gültekin
|
e07ef4d66d
|
Fixed an important rendering bug that caused a client request to be made as well. Updated chromedp dependency.
|
2019-07-26 16:07:09 +03:00 |
|
Musab Gültekin
|
90d2be2210
|
Caching policies added.
We used the httpcache library to implement this. As it was not possible to support different policies with it, I mostly copied and modified it.
|
2019-07-07 12:18:40 +03:00 |
|
Musab Gültekin
|
42faa92ece
|
Robots.txt support implemented
|
2019-07-06 16:18:03 +03:00 |
|
Musab Gültekin
|
71683ec6de
|
Chardet removed as it's not good enough at detection. Built-in library is good enough.
|
2019-07-03 20:54:17 +03:00 |
|
Musab Gültekin
|
33238bc875
|
Charset detection heuristics added with chardet lib.
|
2019-07-03 18:08:28 +03:00 |
|
Musab Gültekin
|
b355a566cf
|
Added more tests and refactored exporter tests. Added code coverage badge.
|
2019-07-02 14:53:06 +03:00 |
|
Musab Gültekin
|
7bc782400c
|
Expvar metrics support added. Metrics refactored to its own package.
|
2019-06-21 21:37:25 +03:00 |
|
Musab Gültekin
|
88c4b1dd35
|
Prometheus metrics support added.
|
2019-06-21 20:05:28 +03:00 |
|
Musab Gültekin
|
141bab0d05
|
Error handling improved
|
2019-06-20 10:14:36 +03:00 |
|
Musab Gültekin
|
f384fc2c13
|
Try parsing HTML even if content-type is empty.
|
2019-06-18 13:00:16 +03:00 |
|
Musab Gültekin
|
7b23596a2d
|
Middleware support added. HTML Parsing disable option added.
Goroutine leaks will be tested using leaktest lib.
|
2019-06-15 17:55:40 +03:00 |
|
Musab Gültekin
|
6caf1effd6
|
Rendered field exported to support rendered requests on Do function. Data races fixed.
|
2019-06-14 15:23:56 +03:00 |
|
Musab Gültekin
|
1a7d480b36
|
JS Rendered requests with Chrome support added
|
2019-06-13 22:08:45 +03:00 |
|
Musab Gültekin
|
e4e8723426
|
Callbacks are now mandatory as almost all the scrapers use them.
|
2019-06-11 14:24:48 +03:00 |
|
Musab Gültekin
|
ca2414c5c8
|
Request callbacks added.
Recover from all panics and continue scraping.
Only parse HTML if response is HTML.
|
2019-06-09 21:13:30 +03:00 |
|
Musab Gültekin
|
a9aaf86df3
|
Automatically determining the response encoding and decoding it.
|
2019-06-09 10:46:32 +03:00 |
|
Musab Gültekin
|
54c7d3550f
|
Gezer renamed to Geziyor
|
2019-06-08 17:14:10 +03:00 |
|
Musab Gültekin
|
ca197ff06a
|
Caching added.
JSON File export will append, not truncate.
|
2019-06-08 15:29:09 +03:00 |
|
Musab Gültekin
|
6358b87472
|
Use parse function to parse responses, instead of channels.
Parse response as HTML Document using goquery.
Added simple README.
|
2019-06-06 22:48:57 +03:00 |
|
Musab Gültekin
|
1c96048082
|
Initial commit
|
2019-06-06 17:11:19 +03:00 |
|