Musab Gültekin
|
32077d8433
|
Updated docs for rendered requests
|
2019-07-26 16:40:42 +03:00 |
|
Musab Gültekin
|
e07ef4d66d
|
Fixed an important rendering bug that caused a client request to be made as well. Updated chromedp dependency.
|
2019-07-26 16:07:09 +03:00 |
|
Musab Gültekin
|
762854e511
|
Go 1.10 and 1.11 support added by using different methods on reflect package.
|
2019-07-21 12:08:41 +03:00 |
|
Musab Gültekin
|
df37629d4d
|
Disabled indenting on JSON exporter as it looks so ugly on exported data.
JSONLine still supports indenting.
|
2019-07-14 03:37:52 +03:00 |
|
Musab Gültekin
|
dfabcb84fd
|
JSON renamed to JSONLine. JSON List support added.
|
2019-07-14 03:30:59 +03:00 |
|
Musab Gültekin
|
d19465c44a
|
Robotstxt metrics added.
|
2019-07-08 14:51:54 +03:00 |
|
Musab Gültekin
|
d3c4389c46
|
Retrying support added for chrome. Fixed robots.txt retry issue. Fixed Meta issue
|
2019-07-07 19:50:15 +03:00 |
|
Musab Gültekin
|
90d2be2210
|
Caching policies added.
We used httpcache library to implement this. As it was not possible to support different policies, I mostly copied and modified it.
|
2019-07-07 12:18:40 +03:00 |
|
Musab Gültekin
|
0d6c2a6864
|
Graceful shutdown system implemented
|
2019-07-06 18:32:13 +03:00 |
|
Musab Gültekin
|
42faa92ece
|
Robots.txt support implemented
|
2019-07-06 16:18:03 +03:00 |
|
Musab Gültekin
|
2cab68d2ce
|
Middlewares refactored to multiple files in middleware package.
Extractors removed as they add complexity to the scraper, both in learning and developing.
|
2019-07-04 21:04:29 +03:00 |
|
Musab Gültekin
|
9adff75509
|
Retry requests support implemented for client.
|
2019-07-04 13:36:10 +03:00 |
|
Musab Gültekin
|
da03567fae
|
Extractors refactored to support pass by value. Documentation added for request and response.
|
2019-07-04 02:13:29 +03:00 |
|
Musab Gültekin
|
71683ec6de
|
Chardet removed as it's not good enough at detection. The built-in library is good enough.
|
2019-07-03 20:54:17 +03:00 |
|
Musab Gültekin
|
33238bc875
|
Charset detection heuristics added with chardet lib.
|
2019-07-03 18:08:28 +03:00 |
|
Musab Gültekin
|
b355a566cf
|
Added more tests and refactored exporter tests. Added code coverage badge.
|
2019-07-02 14:53:06 +03:00 |
|
Musab Gültekin
|
4ab7cfd904
|
Exporter and Extractor interfaces moved to their own packages to simplify the main Geziyor package
|
2019-07-02 13:22:23 +03:00 |
|
Musab Gültekin
|
c0dd0393e6
|
Maximum redirection option added. Performance improvement on exports. Duplicate requests only checked on GET requests.
|
2019-07-01 15:44:28 +03:00 |
|
Musab Gültekin
|
80f3500a69
|
Fixed incorrect Chrome responses on some sites.
|
2019-07-01 12:32:15 +03:00 |
|
Musab Gültekin
|
fb5b4e3406
|
README updated according to new package names
|
2019-06-30 22:21:36 +03:00 |
|
Musab Gültekin
|
0eda056065
|
Attribute extractor added. HTML extractor added. Outer HTML Extractor added.
exporter package renamed to export, extractor package renamed to extract for simplicity.
|
2019-06-30 22:20:17 +03:00 |
|
Musab Gültekin
|
7c383b175f
|
Metrics Server support added for expvar. Refactored some methods.
|
2019-06-30 19:09:03 +03:00 |
|
Musab Gültekin
|
ec4551a8a0
|
Making Requests and reading responses refactored to client package.
|
2019-06-30 16:21:18 +03:00 |
|
Musab Gültekin
|
0eac5f5f40
|
Fixed an exporter bug that caused the last exported items not to be written to disk.
|
2019-06-29 16:11:52 +03:00 |
|
Musab Gültekin
|
bd6466a5f2
|
http package renamed to client to reduce confusion
|
2019-06-29 14:18:31 +03:00 |
|
Musab Gültekin
|
1e109c555d
|
Request and response moved to http package
|
2019-06-29 13:36:39 +03:00 |
|
Musab Gültekin
|
59757607eb
|
Pretty print exporter added. Panic counter added to metrics
|
2019-06-29 11:20:06 +03:00 |
|
Musab Gültekin
|
276b248ebb
|
Synchronized requests support added. Benchmarks added.
|
2019-06-28 17:28:16 +03:00 |
|
Musab Gültekin
|
b000581c3d
|
Extractors implemented. Exporters name simplified. README Updated for extracting data. Removed go 1.11 support
|
2019-06-28 13:00:30 +03:00 |
|
Musab Gültekin
|
679fd8ab7a
|
Map support added for CSV exporter
|
2019-06-27 22:39:06 +03:00 |
|
Musab Gültekin
|
8fe194bd10
|
Added options and tests for exporters.
|
2019-06-27 16:54:09 +03:00 |
|
Musab Gültekin
|
d20ea47390
|
Fix Header conversion bug. Map was not canonicalizing keys
|
2019-06-22 15:04:08 +03:00 |
|
Musab Gültekin
|
02df5aa4e8
|
Fixed issues with non-trailing URLs on rendered requests
|
2019-06-22 14:47:12 +03:00 |
|
Musab Gültekin
|
92e7cfefec
|
Fixed README Doc.
|
2019-06-22 13:13:33 +03:00 |
|
Musab Gültekin
|
a64a262554
|
HTTP Client can be changed now. Docs updated.
|
2019-06-22 13:12:05 +03:00 |
|
Musab Gültekin
|
7bc782400c
|
Expvar metrics support added. Metrics refactored to its own package.
|
2019-06-21 21:37:25 +03:00 |
|
Musab Gültekin
|
88c4b1dd35
|
Prometheus metrics support added.
|
2019-06-21 20:05:28 +03:00 |
|
Musab Gültekin
|
141bab0d05
|
Error handling improved
|
2019-06-20 10:14:36 +03:00 |
|
Musab Gültekin
|
f88b88986c
|
Delays and logs refactored as middlewares.
|
2019-06-20 09:54:30 +03:00 |
|
Musab Gültekin
|
514fe2e8d2
|
Recover system refactored as a middleware
|
2019-06-19 22:45:40 +03:00 |
|
Musab Gültekin
|
c28b228a12
|
Response header bug fixed for Chrome
|
2019-06-18 16:37:06 +03:00 |
|
Musab Gültekin
|
ec83a92eb3
|
Response header support added for Chrome Rendering
|
2019-06-18 16:26:40 +03:00 |
|
Musab Gültekin
|
217f3c96df
|
Header and native http.Response support added for Chrome rendering
|
2019-06-18 16:16:29 +03:00 |
|
Musab Gültekin
|
936d157785
|
Revert "Try parsing HTML even if content-type is empty."
This reverts commit f384fc2c
|
2019-06-18 13:03:00 +03:00 |
|
Musab Gültekin
|
f384fc2c13
|
Try parsing HTML even if content-type is empty.
|
2019-06-18 13:00:16 +03:00 |
|
Musab Gültekin
|
4177f10de9
|
Request creation simplified and basic auth test added.
|
2019-06-17 13:53:34 +03:00 |
|
Musab Gültekin
|
a5ec28664d
|
Cookies support added.
|
2019-06-17 13:31:19 +03:00 |
|
Musab Gültekin
|
dd6687f976
|
Fixed build issue
|
2019-06-17 12:21:40 +03:00 |
|
Musab Gültekin
|
e50fa3b1dc
|
Response middlewares support implemented.
|
2019-06-16 18:29:07 +03:00 |
|
Musab Gültekin
|
80383ebd6f
|
Middlewares and some string util functions refactored. Added partial Documentation.
|
2019-06-16 10:38:03 +03:00 |
|